Commit Graph

6872 Commits

Author SHA1 Message Date
Piotr Smaron
fbd75c5c06 Implement ALTER tablets KEYSPACE statement support
This commit adds support for executing ALTER KS for keyspaces with
tablets and utilizes all the previous commits.
The ALTER KS is handled in alter_keyspace_statement, where a global
topology request is generated with data attached to system.topology
table. Then, once topology state machine is ready, it starts to handle
this global topology event, which results in producing mutations
required to change the schema of the keyspace, delete the
system.topology's global req, produce tablets mutations and additional
mutations for a table tracking the lifetime of the whole req. Tracking
the lifetime is necessary to not return the control to the user too
early, so the query processor only returns the response once the
mutations have been sent.
2024-05-28 13:56:42 +02:00
Piotr Smaron
80ed442be2 Introduce TABLET_KEYSPACE event to differentiate processing path of a vnode vs tablets ks 2024-05-28 13:55:11 +02:00
Piotr Smaron
cb40f13831 Add storage service to query processor
Query processor needs to access storage service to check if global
topology request is still ongoing and to be able to wait until it
completes.
2024-05-27 12:48:44 +02:00
Paweł Zakrzewski
c888945354 tablets: tests for adding/removing replicas
Note we're suppressing a UBSanitizer overflow error in UTs. That's
because our linter complains about a possible overflow, which never
happens, but tests are still failing because of it.
2024-05-27 12:48:44 +02:00
Marcin Maliszkiewicz
2ab143fb40 db: auth: move auth tables to system keyspace
Separate keyspace which also behaves as system brings
little benefit while creating some compatibility problems
like schema digest mismatch during rollback. So we decided
to move auth tables into system keyspace.

Fixes https://github.com/scylladb/scylladb/issues/18098

Closes scylladb/scylladb#18769
2024-05-26 22:30:42 +03:00
Avi Kivity
56d523b071 Merge 'build, test: disable operator<< for vector and unordered_map' from Kefu Chai
this series disables operator<<:s for vector and unordered_map, and drops operator<< for mutation, because we don't have to keep it to work with these operator:s anymore. this change is a follow up of https://github.com/scylladb/seastar/issues/1544

this change is a cleanup. so no need to backport

Closes scylladb/scylladb#18866

* github.com:scylladb/scylladb:
  mutation,db: drop operator<< for mutation and seed_provider_type&
  build: disable operator<< for vector and unordered_map
  db/heat_load_balance: include used header
  test: define a more generic boost_test_print_type
  test/boost: define fmt::formatter for service_level_controller_test.cc
  test/boost: include test/lib/test_utils.hh
2024-05-26 19:19:20 +03:00
Pavel Emelyanov
9108952a52 test/cql-pytest: Add test for token() filter against mutation_fragments()
When selecting from mutation_fragments(table) one may want to apply
token() filtering against partition key. This doesn't work currently,
but used to crash. This patch adds a regression test for that.

refs: #18637
refs: #18768

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#18759
2024-05-26 15:31:20 +03:00
Kefu Chai
4c1b6f0476 test: define a more generic boost_test_print_type
fmt::is_formattable<T>::value is false, even if

* T is a container of U, and
* fmt::is_formattable<U>, and
* U can be formatted using fmt::formatter

so, we have to define a more generic boost_test_print_type()
for all the types supported by {fmt}. it will help us to ditch the
operator<< for vector and unordered_map in Seastar, and allow us
to use the fmt::formatter specialization of the element
types.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-05-26 12:32:43 +08:00
Kefu Chai
bfe918ac9e test/boost: define fmt::formatter for service_level_controller_test.cc
since we are moving away from operator<< based formatters, more and more
types now only have {fmt} based formatters. the same will apply to the
STL container types after ditching the generic homebrew formatter in
to_string.hh, so to be prepared for the change, let's add the
fmt::formatter for tests as well.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-05-26 12:32:43 +08:00
Kefu Chai
222dbf2ce4 test/boost: include test/lib/test_utils.hh
this change was created in the same spirit as 505900f18f. because
we are deprecating the operator<< for vector and unordered_map in
Seastar, some tests do not compile anymore if we disable these
operators. so to be prepared for the change disabling them, let's
include test/lib/test_utils.hh for accessing the printer dedicated
for Boost.test. and also '#include <fmt/ranges.h>' when necessary,
because, in order to format the ranges using {fmt}, we need to
use fmt/ranges.h.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-05-26 12:32:43 +08:00
Michał Chojnowski
de798775fd test: test_coordinator_queue_management: wait for logs properly
The modified lines of code intend to await the first appearance of a log
on one of the nodes.

But due to misplaced parentheses, instead of creating a list of log-awaiting
tasks with a list comprehension, they pass a generator expression to
asyncio.create_task().

This is nonsense, and it fails immediately with a type error.
But since they don't actually check the result of the await,
the test just assumes that the search completed successfully.

This was uncovered by an upgrade to Python 3.12, because its typing is stronger
and asyncio.create_task() screams when it's passed a regular generator.

This patch fixes the bad list comprehension, and also adds an error check
on the completed awaitables (by calling `await` on them).
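
The difference between the broken and the fixed form can be sketched as
follows. This is a minimal illustration, not the test's actual code:
`wait_for_log` and `first_node_to_log` are hypothetical names, and the log
wait is simulated with a sleep.

```python
import asyncio


async def wait_for_log(node: str, delay: float) -> str:
    """Hypothetical stand-in for awaiting a log line on one node."""
    await asyncio.sleep(delay)
    return node


async def first_node_to_log(nodes: dict) -> str:
    # Broken form (misplaced parentheses): a generator expression is passed
    # to create_task() instead of a coroutine:
    #   asyncio.create_task(wait_for_log(n, d) for n, d in nodes.items())
    # Fixed form: one task per node, collected with a list comprehension.
    tasks = [asyncio.create_task(wait_for_log(n, d)) for n, d in nodes.items()]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    # Awaiting the completed tasks re-raises any exception they hit,
    # so a failed search is no longer silently treated as success.
    results = [await task for task in done]
    return results[0]
```

Calling `asyncio.run(first_node_to_log({"node1": 0.2, "node2": 0.01}))`
returns the name of the node whose simulated log appears first.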

Fixes #18740

Closes scylladb/scylladb#18754
2024-05-25 10:54:44 +03:00
Nadav Har'El
dc80b5dafe test/alternator: do not write to auth tables
As part of the Alternator test suite, we check Alternator's support for
authentication. Alternator maps Scylla's existing CQL roles to AWS's
authentication:
  * AWS's access_key_id     <- the name of the CQL role
  * AWS's secret_access_key <- the salted hash of the password of the CQL role

Before this patch, the Alternator test suite created a new role with a
preset salted hash (role "alternator", salted hash "secret_pass")
and then used that in the tests. However, with the advent of Raft-based
metadata it is wrong to write directly to the roles table, and starting
with #17952 such writes will be outright forbidden.

But we don't actually need to create a new CQL role! We already have
a perfectly good CQL role called "cassandra", and our tests already use
it. So what this patch does is to have the Alternator tests (conftest.py)
read from the roles system-table the salted hash of the "cassandra" role,
and then use that - instead of the hard-coded pair alternator/secret_pass -
in the tests.

A couple more tests assumed that the role name that was used was
"alternator", but now it was changed to "cassandra" so those tests
needed minor fixes as well.

After this patch, the Alternator tests no longer *write* to the roles
system table. Moreover, after this patch, test/alternator/run and
test/alternator/suite.yaml (used when testing with test.py) no longer
need to do extra ugly CQL setup before starting the Alternator tests.

Fixes #18744

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#18771
2024-05-22 11:00:15 +03:00
Pavel Emelyanov
26eda88401 test/tablets: Check that after RF change data is replicated properly
There's a test that checks system.tablets contents to see that after
changing ks replication factor via ALTER KEYSPACE the tablet map is
updated properly. This patch extends this test so that it also validates that
mutations themselves are replicated according to the desired replication
factor.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#18644
2024-05-22 11:00:15 +03:00
Botond Dénes
5e41dd28c7 Merge 'Sanitize sl controller draining' from Pavel Emelyanov
The sl-controller is stopped in three steps. The first (and instantly the second) is unsubscribing from lifecycle notification and draining. The third is stop itself. First two steps are "out of order" as compared to the desired start-stop sequence of any service, this patch fixes these steps.

After this PR the drain_on_shutdown() (the call that drains the node upon stop) finally becomes clean and tidy and is no longer accompanied by ad-hoc fellow drains/stops/aborts/whatever.

refs: #2737

Closes scylladb/scylladb#18731

* github.com:scylladb/scylladb:
  sl_controller: Remove drain() method
  sl_controller: Move abort kicking into do_abort()
  main,sl_controller: Subscribe for early abort
  main: Unsubscribe sl controller next to subscribing
2024-05-21 17:16:23 +03:00
Kefu Chai
86b988a70b test/lib: do not use variable which could be moved away
C++ standard does not define the order in which the parameters
passed to a function are evaluated. so in theory, in
```c++
reusable_sst(sst->get_schema(), std::move(sst));
```
`std::move(sst)` could be evaluated before `sst->get_schema()`.
but please note, `std::move(sst)` does not move `sst`
away, it merely casts `sst` to an rvalue reference, it is
`reusable_sst()` which *could* move `sst` away by
consuming it. so the following call is much more dangerous
than the above one:
```c++
reusable_sst(sst->get_schema(), modify_sst(std::move(sst)))
```
nevertheless, this usage is still confusing. so pass a copy
of `sst` to `reusable_sst` instead.

this change is inspired by clang-tidy, it warns like:

```
Warning: /home/runner/work/scylladb/scylladb/test/lib/test_services.cc:397:25: warning: 'sst' used after it was moved [bugprone-use-after-move]
  397 |     return reusable_sst(sst->get_schema(), std::move(sst));
      |                         ^
/home/runner/work/scylladb/scylladb/test/lib/test_services.cc:397:44: note: move occurred here
  397 |     return reusable_sst(sst->get_schema(), std::move(sst));
      |                                            ^
/home/runner/work/scylladb/scylladb/test/lib/test_services.cc:397:25: note: the use and move are unsequenced, i.e. there is no guarantee about the order in which they are evaluated
  397 |     return reusable_sst(sst->get_schema(), std::move(sst));
      |
```

per the analysis above, this is a false alarm.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18775
2024-05-21 10:02:10 +03:00
Avi Kivity
33ec6ccea9 test: boost: chunked_vector_test: include <optional>
std::optional is used but not imported. This fails on libstdc++-14.

Closes scylladb/scylladb#18739
2024-05-21 07:37:11 +03:00
Pavel Emelyanov
8d4c8711fa main,sl_controller: Subscribe for early abort
There's stop-signal in main that fires an abort source on stop. Lots of
other services are subscribed in it, add the sl-controller too. For now
it's a no-op, but next patches will make use of it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-05-20 21:26:31 +03:00
Avi Kivity
61505d057e Merge 'Sort user-defined types in describe statements' from Michał Jadwiszczak
User-defined types can depend on each other, creating directed acyclic graph.

In order to support restoring schema from `DESC SCHEMA`, UDTs should be
ordered topologically, not alphabetically as it was till now.

This patch changes the way UDTs are ordered in `DESC SCHEMA`/`DESC KEYSPACE <ks>` statements, so the output can be safely copy-pasted to restore the schema.
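
Why alphabetical order is not enough can be sketched in a few lines. This is
only an illustration using Python's standard `graphlib`, not the generic
sorting utility added by this series; `order_udts` and the type names are
hypothetical.

```python
from graphlib import TopologicalSorter


def order_udts(deps: dict) -> list:
    """Return type names so that every type follows the types it references.

    `deps` maps a type name to the set of type names it depends on.
    """
    sorter = TopologicalSorter()
    for name in sorted(deps):
        # add(node, *predecessors): predecessors are emitted before node.
        sorter.add(name, *sorted(deps[name]))
    return list(sorter.static_order())


# "person" embeds "address", which embeds "street". Alphabetical order would
# emit "address" first, before the "street" type it needs.
udts = {"address": {"street"}, "person": {"address"}, "street": set()}
ordered = order_udts(udts)
```

With this input, `street` is placed before `address`, and `address` before
`person`, so the statements can be replayed in the emitted order.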

Fixes #18539

Closes scylladb/scylladb#18302

* github.com:scylladb/scylladb:
  test/cql-pytest/test_describe: add test for UDTs ordering
  cql3/statements/describe_statement: UDTs topological sorting
  cql3/statements/describe_statement: allow to skip alphabetical sorting
  types: add a method to get all referenced user types
  db/cql_type_parser: use generic topological sorting
  db/cql_type_parses: futurize raw_builder::build()
  test/boost: add test for topological sorting
  utils: introduce generic topological sorting algorithm
2024-05-20 16:58:17 +03:00
Botond Dénes
e1c4e6c151 Merge 'sstables_manager: use maintenance scheduling group to run components reload fiber' from Lakshmi Narayanan Sreethar
PR https://github.com/scylladb/scylladb/pull/18186 introduced a fiber that reloads reclaimed bloom filters when memory becomes available. Use maintenance scheduling group to run that fiber instead of running it in the main scheduling group.

Fixes #18675

Closes scylladb/scylladb#18721

* github.com:scylladb/scylladb:
  sstables_manager: use maintenance scheduling group to run components reload fiber
  sstables_manager: add member to store maintenance scheduling group
2024-05-20 16:38:42 +03:00
Andrei Chekun
bce53efd36 Enrich test results produced by test.py
This PR resolves an issue with double-counting of test results for topology tests. The double count will no longer appear in the consolidated report.
Another fix makes it easier to see which test failed: the test case name in the report is enriched with the mode and run id, which makes names unique across the run.

The scope of this change is:
1. Modify the test name to have run id in name
2. Add handlers to get logs of test.py and pytest in one file that are related to test, rather than to the full suite
3. Remove topology tests from aggregating them on a suite level in Junit results
4. Add a link to the logs related to the failed tests in Junit results, so it will be easier to navigate to all logs related to test
5. Gather logs related to the failed test to one directory for better logs investigation

Ref: scylladb/scylladb#17851

Closes scylladb/scylladb#18277
2024-05-20 15:33:57 +02:00
Avi Kivity
52fe351c31 Merge 'Balance tablets within nodes (intra-node migration)' from Tomasz Grabiec
This is needed to avoid severe imbalance between shards which can
happen when some table grows and is split. The inter-node balance can
be equal, so inter-node migration cannot fix the imbalance. Also, if RF=N
then there is not even a possibility of moving tablets around to fix the imbalance.
The only way to bring the system to balance is to move tablets within the nodes.

The system is not prepared for intra-node migration currently. Request coordination
is host-based, while for intra-node migration it should be (also) shard-based.
The solution employed here is to keep the coordination between nodes as-is,
and for intra-node migration storage_proxy-level coordinator is not aware of
the migration (no pending host). The replica-side request handler will be a
second-level coordinator which routes requests to shards, similar to how
the first-level coordinator routes them to hosts.

Tablet sharder is adjusted to handle intra-node migration where a tablet
can have two replicas on the same host. For reads, sharder uses the
read selector to resolve the conflict. For writes, the write selector
is used.

The old shard_of() API is kept to represent the shard for reads, and a new
method is introduced to query the shards for writing:
shard_for_writes(). All writers should be switched to that API, which
is not done in this patch yet.
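
The read/write split described above can be sketched as follows. This is a
simplified model and not the actual sharder code: the selector values and the
`IntraNodeMigration` type are assumptions made for illustration, and only the
names shard_for_reads() and shard_for_writes() come from this series.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class ReadSelector(Enum):      # assumed values
    PREVIOUS = "previous"
    NEXT = "next"


class WriteSelector(Enum):     # assumed values
    PREVIOUS = "previous"
    BOTH = "both"
    NEXT = "next"


@dataclass(frozen=True)
class IntraNodeMigration:
    """A tablet with two replicas on the same host during migration."""
    src_shard: int
    dst_shard: int
    reads: ReadSelector
    writes: WriteSelector


def shard_for_reads(m: IntraNodeMigration) -> int:
    # Reads always resolve to exactly one shard.
    return m.src_shard if m.reads is ReadSelector.PREVIOUS else m.dst_shard


def shard_for_writes(m: IntraNodeMigration) -> List[int]:
    # Writes may have to reach both shards while the tablet is in flight.
    if m.writes is WriteSelector.PREVIOUS:
        return [m.src_shard]
    if m.writes is WriteSelector.NEXT:
        return [m.dst_shard]
    return [m.src_shard, m.dst_shard]
```

In this model a reader gets a single shard at every stage, while a writer
gets one or two shards depending on the stage of the migration.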

The request handler on replica side acts as a second-level
coordinator, using sharder to determine routing to shards. A given
sharder has a scope of a single topology version, a single
effective_replication_map_ptr, which should be kept alive during
writes.

perf-simple-query test results show no signs of regression:

Command: perf-simple-query -c1 -m1G --write --tablets --duration=10

Before:

> 83294.81 tps ( 59.5 allocs/op,  14.3 tasks/op,   53725 insns/op,        0 errors)
> 87756.72 tps ( 59.5 allocs/op,  14.3 tasks/op,   54049 insns/op,        0 errors)
> 86428.47 tps ( 59.6 allocs/op,  14.3 tasks/op,   54208 insns/op,        0 errors)
> 86211.38 tps ( 59.7 allocs/op,  14.3 tasks/op,   54219 insns/op,        0 errors)
> 86559.89 tps ( 59.6 allocs/op,  14.3 tasks/op,   54188 insns/op,        0 errors)
> 86609.39 tps ( 59.6 allocs/op,  14.3 tasks/op,   54117 insns/op,        0 errors)
> 87464.06 tps ( 59.5 allocs/op,  14.3 tasks/op,   54039 insns/op,        0 errors)
> 86185.43 tps ( 59.6 allocs/op,  14.3 tasks/op,   54169 insns/op,        0 errors)
> 86254.71 tps ( 59.6 allocs/op,  14.3 tasks/op,   54139 insns/op,        0 errors)
> 83395.35 tps ( 60.2 allocs/op,  14.4 tasks/op,   54693 insns/op,        0 errors)
>
> median 86428.47 tps ( 59.6 allocs/op,  14.3 tasks/op,   54208 insns/op,        0 errors)
> median absolute deviation: 243.04
> maximum: 87756.72
> minimum: 83294.81
>

After:

> 85523.06 tps ( 59.5 allocs/op,  14.3 tasks/op,   53872 insns/op,        0 errors)
> 89362.47 tps ( 59.6 allocs/op,  14.3 tasks/op,   54226 insns/op,        0 errors)
> 88167.55 tps ( 59.7 allocs/op,  14.3 tasks/op,   54400 insns/op,        0 errors)
> 87044.40 tps ( 59.7 allocs/op,  14.3 tasks/op,   54310 insns/op,        0 errors)
> 88344.50 tps ( 59.6 allocs/op,  14.3 tasks/op,   54289 insns/op,        0 errors)
> 88355.06 tps ( 59.6 allocs/op,  14.3 tasks/op,   54242 insns/op,        0 errors)
> 88725.46 tps ( 59.6 allocs/op,  14.3 tasks/op,   54230 insns/op,        0 errors)
> 88640.08 tps ( 59.6 allocs/op,  14.3 tasks/op,   54210 insns/op,        0 errors)
> 90306.31 tps ( 59.4 allocs/op,  14.3 tasks/op,   54043 insns/op,        0 errors)
> 87343.62 tps ( 59.8 allocs/op,  14.3 tasks/op,   54496 insns/op,        0 errors)
>
> median 88355.06 tps ( 59.6 allocs/op,  14.3 tasks/op,   54242 insns/op,        0 errors)
> median absolute deviation: 1007.41
> maximum: 90306.31
> minimum: 85523.06

Command (reads): perf-simple-query -c1 -m1G  --tablets --duration=10

Before:

> 95860.18 tps ( 63.1 allocs/op,  14.1 tasks/op,   42476 insns/op,        0 errors)
> 97537.69 tps ( 63.1 allocs/op,  14.1 tasks/op,   42454 insns/op,        0 errors)
> 97549.23 tps ( 63.1 allocs/op,  14.1 tasks/op,   42470 insns/op,        0 errors)
> 97511.29 tps ( 63.1 allocs/op,  14.1 tasks/op,   42470 insns/op,        0 errors)
> 97227.32 tps ( 63.1 allocs/op,  14.1 tasks/op,   42471 insns/op,        0 errors)
> 94031.94 tps ( 63.1 allocs/op,  14.1 tasks/op,   42441 insns/op,        0 errors)
> 96978.04 tps ( 63.1 allocs/op,  14.1 tasks/op,   42462 insns/op,        0 errors)
> 96401.70 tps ( 63.1 allocs/op,  14.1 tasks/op,   42473 insns/op,        0 errors)
> 96573.77 tps ( 63.1 allocs/op,  14.1 tasks/op,   42440 insns/op,        0 errors)
> 96340.54 tps ( 63.1 allocs/op,  14.1 tasks/op,   42468 insns/op,        0 errors)
>
> median 96978.04 tps ( 63.1 allocs/op,  14.1 tasks/op,   42462 insns/op,        0 errors)
> median absolute deviation: 571.20
> maximum: 97549.23
> minimum: 94031.94
>

After:

> 99794.67 tps ( 63.1 allocs/op,  14.1 tasks/op,   42471 insns/op,        0 errors)
> 101244.99 tps ( 63.1 allocs/op,  14.1 tasks/op,   42472 insns/op,        0 errors)
> 101128.37 tps ( 63.1 allocs/op,  14.1 tasks/op,   42485 insns/op,        0 errors)
> 101065.27 tps ( 63.1 allocs/op,  14.1 tasks/op,   42465 insns/op,        0 errors)
> 101212.98 tps ( 63.1 allocs/op,  14.1 tasks/op,   42456 insns/op,        0 errors)
> 101413.31 tps ( 63.1 allocs/op,  14.1 tasks/op,   42463 insns/op,        0 errors)
> 101464.92 tps ( 63.1 allocs/op,  14.1 tasks/op,   42466 insns/op,        0 errors)
> 101086.74 tps ( 63.1 allocs/op,  14.1 tasks/op,   42488 insns/op,        0 errors)
> 101559.09 tps ( 63.1 allocs/op,  14.1 tasks/op,   42468 insns/op,        0 errors)
> 100742.58 tps ( 63.1 allocs/op,  14.1 tasks/op,   42491 insns/op,        0 errors)
>
> median 101212.98 tps ( 63.1 allocs/op,  14.1 tasks/op,   42456 insns/op,        0 errors)
> median absolute deviation: 200.33
> maximum: 101559.09
> minimum: 99794.67
>
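
The summary lines in the results above can be reproduced from the per-run
numbers. The sketch below assumes that the median of an even-sized sample is
taken as the upper middle element; that is inferred from the reported values,
not from the tool's source. `summarize` is a hypothetical helper.

```python
from statistics import median_high


def summarize(tps):
    """Return (median, median absolute deviation, maximum, minimum)."""
    med = median_high(tps)  # upper middle element for even-sized samples
    mad = median_high([abs(x - med) for x in tps])
    return med, round(mad, 2), max(tps), min(tps)


# The ten "Before" runs of the write workload, copied from the results above.
before_writes = [
    83294.81, 87756.72, 86428.47, 86211.38, 86559.89,
    86609.39, 87464.06, 86185.43, 86254.71, 83395.35,
]
summary = summarize(before_writes)
```

For these ten runs the helper yields a median of 86428.47 and a median
absolute deviation of 243.04, matching the reported summary.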

Fixes #16594

Closes scylladb/scylladb#18026

* github.com:scylladb/scylladb:
  Implement fast streaming for intra-node migration
  test: tablets_test: Test sharding during intra-node migration
  test: tablets_test: Check sharding also on the pending host
  test: py: tablets: Test writes concurrent with migration
  test: py: tablets: Test crash during intra-node migration
  api, storage_service: Introduce API to wait for topology to quiesce
  dht, replica: Remove deprecated sharder APIs
  test: Avoid using deprecated sharded API
  db: do_apply_many() avoid deprecated sharded API
  replica: mutation_dump: Avoid deprecated sharder API
  repair: Avoid deprecated sharder API
  table: Remove optimization which returns empty reader when key is not owned by the shard
  dht: is_single_shard: Avoid deprecated sharder API
  dht: split_range_to_single_shard: Work with static_sharder only
  dht: ring_position_range_sharder: Avoid deprecated sharder APIs
  dht: token: Avoid use of deprecated sharder API by switching to static_sharder
  selective_token_sharder: Avoid use of deprecated sharder API
  docs: Document tablet sharding vs tablet replica placement
  readers/multishard.cc: use shard_for_reads() instead of shard_of()
  multishard_mutation_query.cc: use shard_for_reads() instead of shard_of()
  storage_proxy: Extract common code to apply mutations on many shards according to sharder
  storage_proxy: Prepare per-partition rate-limiting for intra-node migration
  storage_proxy: Avoid shard_of() use in mutate_counter_on_leader_and_replicate()
  storage_proxy: Prepare mutate_hint() for intra-node tablet migration
  commitlog_replayer: Avoid deprecated sharder::shard_of()
  lwt: Avoid deprecated sharder::shard_of()
  compaction: Avoid deprecated sharder::shard_of()
  dht: Extract dht::static_sharder
  replica: Deprecate table::shard_of()
  locator: Deprecate effective_replication_map::shard_of()
  dht: Deprecate old sharder API: shard_of/next_shard/token_for_next_shard
  tests: tablets: py: Add intra-node migration test
  tests: tablets: Test that drained nodes are not balanced internally
  tests: tablets: Add checks of replica set validity to test_load_balancing_with_random_load
  tests: tablets: Verify that disabling balancing results in no intra-node migrations
  tests: tablets: Check that nodes are internally balanced
  tests: tablets: Improve debuggability by showing which rows are missing
  tablets, storage_service: Support intra-node migration in move_tablet() API
  tablet_allocator: Generate intra-node migration plan
  tablet_allocator: Extract make_internode_plan()
  tablet_allocator: Maintain candidate list and shard tablet count for target nodes
  tablet_allocator: Lift apply_load/can_accept_load lambdas to member functions
  tablets, streaming: Implement tablet streaming for intra-node migration
  dht, auto_refreshing_sharder: Allow overriding write selector
  multishard_writer: Handle intra-node migration
  storage_proxy: Handle intra-node tablet migration for writes
  tablets: Get rid of tablet_map::get_shard()
  tablets: Avoid tablet_map::get_shard in cleanup
  tablets: test: Use sharder instead of tablet_map::get_shard()
  tablets: tablet_sharder: Allow working with non-local host
  sharding: Prepare for intra-node-migration
  docs: Document sharder use for tablets
  tablets: Introduce tablet transition kind for intra-node migration
  tests: tablets: Fix use-after-move of skiplist in rebalance_tablets()
  sstables, gdb: Track readers in a linked list
  raft topology: Fix global token metadata barrier to not fence ahead of what is drained
2024-05-20 16:13:01 +03:00
Kefu Chai
40ce52c3cc test: use generic boost_test_print_type()
in this change, we trade the `boost_test_print_type()` overloads
for the generic template of `boost_test_print_type()`, except for
those in the very small tests, which presumably want to keep
themselves relatively self-contained.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18727
2024-05-20 12:56:20 +03:00
Lakshmi Narayanan Sreethar
79f6746298 sstables_manager: add member to store maintenance scheduling group
Store the maintenance scheduling group inside the sstables_manager. The
next patch will use this to run the components reloader fiber.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-05-19 15:23:45 +05:30
Avi Kivity
2fbd78c769 feature: grandfather DIGEST_FOR_NULL_VALUES
The DIGEST_FOR_NULL_VALUES feature was added in 21a77612b3 (2020; 4.4)
and can now be assumed to be always present. The hasher which it invoked
is removed.
2024-05-18 00:24:00 +03:00
Avi Kivity
3bead8cea0 feature: grandfather PER_TABLE_PARTITIONERS
The PER_TABLE_PARTITIONERS feature was added in 90df9a44ce (2020; 4.0)
and can now be assumed to be always present. We also remove the associated
schema_feature.
2024-05-18 00:15:07 +03:00
Avi Kivity
6b532fd40b test: schema_change_test: regenerate digest for PER_TABLE_PARTITIONERS
The first digest tested was generated without the PER_TABLE_PARTITIONERS
schema feature. We're about to make that feature mandatory, so we won't
be able (and won't need) to generate a digest without it.

Update the digest to include the feature. Note it wasn't untested before,
we have a test with schema_features::full().
2024-05-18 00:14:43 +03:00
Avi Kivity
c4d8b17f4c test: test_schema_change_digest: drop unneeded reference digests
digests[0] was used by the VIEW_VIRTUAL_COLUMNS feature, which
no longer exists.

digests[1] is the same as digests[2], so drop it.
2024-05-17 20:41:20 +03:00
Avi Kivity
b5f6021a6b feature: grandfather VIEW_VIRTUAL_COLUMNS
The VIEW_VIRTUAL_COLUMNS feature was added in a108df09f9 (2019; 3.1)
and can now be assumed to be always present.

The corresponding schema_feature is removed. Note schema_features are not sent
over the wire. A digest calculation without VIEW_VIRTUAL_COLUMNS is no longer tested.
2024-05-17 20:41:19 +03:00
Botond Dénes
db70e8dd5f test/cql-pytest: test_tombstone_limit.py: enable xfailing tests
These tests were marked as xfail because they used to fail with tablets.
They don't anymore, so remove the xfail.

Fixes: #16486

Closes scylladb/scylladb#18671
2024-05-16 20:14:47 +03:00
Nadav Har'El
c7aa47354a Merge 'mutation_fragment_stream_validating_filter: respect validating_level::none' from Botond Dénes
Even when configured to not do any validation at all, the validator still did some. This small series fixes this, and adds a test to check that validation levels in general are respected, and the validator doesn't validate more than it is asked to.

Fixes: #18662

Closes scylladb/scylladb#18667

* github.com:scylladb/scylladb:
  test/boost/mutation_fragment_test.cc: add test for validator validation levels
  mutation: mutation_fragment_stream_validating_filter: fix validation_level::none
  mutation: mutation_fragment_stream_validating_filter: add raises_error ctor parameter
2024-05-16 19:57:49 +03:00
Kamil Braun
734c5de314 Merge 'fix test teardown race with ongoing test operation' from Artsiom Mishuta
This commit brings several new features in scylla_cluster.py to fix runaway asyncio task problems in topology tests

- Start-Stop Lock and Stop Event in ScyllaServer
- Tasks History, Wait for tasks from Tasks History and Manager broken state in ScyllaClusterManager
- make ManagerClient object function scope
- test_finished_event in ManagerClient

Fixes: scylladb/scylladb#16472
Fixes: scylladb/scylladb#16651

Closes scylladb/scylladb#18236

* github.com:scylladb/scylladb:
  test/pylib: Introduce ManagerClient.test_finished_event
  test/topology: make ManagerClient object function scope
  test/pylib: Introduce Manager broken state:
  test/pylib: Wait for tasks from Tasks History:
  test/pylib: Introduce Tasks History:
  test/pylib: Introduce Stop Event
  test/pylib: Introduce Start-Stop Lock:
2024-05-16 17:42:00 +02:00
Kefu Chai
759156b56d test: perf: alternator: mark format string as constexpr
before this change, we use `update_item_suffix` as a format string
fed to `format(...)`, which is resolved to `seastar::format()`.
but with a patch which migrates the `seastar::format()` to the backend
with compile-time format check, the caller sites using `format()` would
fail to build, because `update_item_suffix` is not a `constexpr`:
```
/home/kefu/.local/bin/clang++ -DFMT_SHARED -DSCYLLA_BUILD_MODE=release -DSEASTAR_API_LEVEL=7 -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"RelWithDebInfo\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -isystem /home/kefu/dev/scylladb/abseil -isystem /home/kefu/dev/scylladb/build/rust -ffunction-sections -fdata-sections -O3 -g -gz -std=gnu++20 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. -march=westmere -mllvm -inline-threshold=2500 -fno-slp-vectorize -U_FORTIFY_SOURCE -Werror=unused-result -MD -MT test/perf/CMakeFiles/test-perf.dir/RelWithDebInfo/perf_alternator.cc.o -MF test/perf/CMakeFiles/test-perf.dir/RelWithDebInfo/perf_alternator.cc.o.d -o test/perf/CMakeFiles/test-perf.dir/RelWithDebInfo/perf_alternator.cc.o -c /home/kefu/dev/scylladb/test/perf/perf_alternator.cc
/home/kefu/dev/scylladb/test/perf/perf_alternator.cc:249:69: error: call to consteval function 'fmt::basic_format_string<char, const char (&)[1]>::basic_format_string<const char *, 0>' is not a constant expression
  249 |     return make_request(cli, "UpdateItem", prefix + seastar::format(update_item_suffix, ""));
      |                                                                     ^
/usr/include/fmt/core.h:2776:67: note: read of non-constexpr variable 'update_item_suffix' is not allowed in a constant expression
 2776 |   FMT_CONSTEVAL FMT_INLINE basic_format_string(const S& s) : str_(s) {
      |                                                                   ^
/home/kefu/dev/scylladb/test/perf/perf_alternator.cc:249:69: note: in call to 'basic_format_string<const char *, 0>(update_item_suffix)'
  249 |     return make_request(cli, "UpdateItem", prefix + seastar::format(update_item_suffix, ""));
      |                                                                     ^~~~~~~~~~~~~~~~~~
/home/kefu/dev/scylladb/test/perf/perf_alternator.cc:198:6: note: declared here
  198 | auto update_item_suffix = R"(
      |      ^
```

so, to prepare the change switching to compile-time format checking,
let's mark this variable `static constexpr`. this is also more correct,
as this variable is

* a compile time constant, and
* is not shared across different compilation units.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18685
2024-05-16 15:18:42 +03:00
Michał Jadwiszczak
b3e6a39604 test/cql-pytest/test_describe: add test for UDTs ordering 2024-05-16 13:30:03 +02:00
Michał Jadwiszczak
7f04c88395 test/boost: add test for topological sorting 2024-05-16 13:30:03 +02:00
Nadav Har'El
27ab560abd cql: fix hang during certain SELECT statements
The function intersection(r1,r2) in statement_restrictions.cc is used
when several WHERE restrictions were applied to the same column.
For example, for "WHERE b<1 AND b<2" the intersection of the two ranges
is calculated to be b<1.

As noted in issue #18690, Scylla is inconsistent in where it allows or
doesn't allow these intersecting restrictions. But where they are
allowed they must be implemented correctly. And it turns out the
function intersection() had a bug that caused it to sometimes enter
an infinite loop - when the intent was only to call itself once with
swapped parameters.
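
The "call itself once with swapped parameters" pattern can be sketched like
this. It is a simplified model and not the code in statement_restrictions.cc:
the `Interval` type, the use of `frozenset` for value lists, and the integer
bounds are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Interval:
    low: Optional[int] = None    # exclusive lower bound, None = unbounded
    high: Optional[int] = None   # exclusive upper bound, None = unbounded

    def contains(self, v: int) -> bool:
        return (self.low is None or v > self.low) and \
               (self.high is None or v < self.high)


def intersection(r1, r2):
    """Intersect two restrictions: an Interval or a frozenset of values."""
    if isinstance(r1, Interval) and isinstance(r2, Interval):
        lows = [b for b in (r1.low, r2.low) if b is not None]
        highs = [b for b in (r1.high, r2.high) if b is not None]
        return Interval(max(lows) if lows else None,
                        min(highs) if highs else None)
    if isinstance(r1, frozenset) and isinstance(r2, frozenset):
        return r1 & r2
    if isinstance(r1, frozenset):
        # Here r2 must be an Interval.
        return frozenset(v for v in r1 if r2.contains(v))
    # r1 is an Interval and r2 a value list: swap exactly once. The swapped
    # call is handled by the branch above, so it cannot recurse again.
    return intersection(r2, r1)
```

For "WHERE b<1 AND b<2" this model gives
`intersection(Interval(high=1), Interval(high=2)) == Interval(high=1)`. The
key property is that the swapped call always lands in a non-swapping branch.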

This patch includes a test reproducing this bug, and a fix for the
bug. The test hangs before the fix, and passes after the fix.

While at it, I carefully reviewed the entire code used to implement
the intersection() function to try to make sure that the bug we found
was the only one. I also added a few more comments where I thought they
were needed to understand complicated logic of the code.

The bug, the fix and the test were originally discovered by
Michał Chojnowski.

Fixes #18688
Refs #18690

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#18694
2024-05-16 11:25:44 +03:00
Piotr Dulikowski
68eca3778c Merge 'mv: throttle view update generation for large queries' from Wojciech Mitros
This series is a reupload of #13792 with a few modifications, namely a test is added and the conflicts with recent tablet related changes are fixed.

See https://github.com/scylladb/scylladb/issues/12379 and https://github.com/scylladb/scylladb/pull/13583 for a detailed description of the problem and discussions.

This PR aims to extend the existing throttling mechanism to work with requests that internally generate a large amount of view updates, as suggested by @nyh.

The existing mechanism works in the following way:

* Client sends a request, we generate the view updates corresponding to the request and spawn background tasks which will send these updates to remote nodes
* Each background task consumes some units from the `view_update_concurrency_semaphore`, but doesn't wait for these units; they are consumed only for tracking
* We keep track of the percentage of consumed units on each node; this is called the `view update backlog`.
* Before sending a response to the client we sleep for a short amount of time. The amount of time to sleep for is based on the fullness of this `view update backlog`. For a well behaved client with limited concurrency this will limit the amount of incoming requests to a manageable level.
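
The last step above, turning backlog fullness into a sleep time, might look roughly like the sketch below. The `backlog` struct, the `max_delay` parameter and the linear mapping are assumptions made for illustration; the actual delay curve used by the throttling code may differ.

```c++
#include <algorithm>
#include <chrono>
#include <cstddef>

// Hypothetical view of a node's view update backlog: consumed units vs. capacity.
struct backlog {
    std::size_t current;
    std::size_t max;
};

// Fraction of consumed units, clamped to [0, 1].
double fullness(const backlog& b) {
    if (b.max == 0) {
        return 0.0;
    }
    return std::min(1.0, static_cast<double>(b.current) / static_cast<double>(b.max));
}

// Assumed linear mapping: an empty backlog means no delay, a full one means max_delay.
std::chrono::microseconds calculate_delay(const backlog& b, std::chrono::microseconds max_delay) {
    const double scaled = fullness(b) * static_cast<double>(max_delay.count());
    return std::chrono::microseconds(static_cast<std::chrono::microseconds::rep>(scaled));
}
```

The caller would sleep for the returned duration before replying to the client, so that a client with limited concurrency slows down as the backlog fills.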

This mechanism doesn't handle large DELETE queries. Deleting a partition is fast for the base table, but it requires us to generate a view update for every single deleted row. The number of deleted rows per single client request can be in the millions. Delaying response to the request doesn't help when a single request can generate millions of updates.

To deal with this we could treat the view update generator just like any other client and force it to wait for a short time before sending the next batch of updates. The amount of time to wait is calculated just like in the existing throttling code: it's based on the fullness of the `view update backlogs`.

The new algorithm of view update generation looks something like this:
```c++
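for (;;) {
    // Generate the next bounded batch of view updates (at most 100 rows).
    auto updates = generate_updates_batch_with_max_100_rows();
    // Back off in proportion to how full the view update backlogs are.
    co_await seastar::sleep(calculate_sleep_time_from_backlogs());
    // Send the batch in the background.
    spawn_background_tasks_for_updates(updates);
}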
```
Fixes: https://github.com/scylladb/scylladb/issues/12379

Closes scylladb/scylladb#16819

* github.com:scylladb/scylladb:
  test: add test for bad_allocs during large mv queries
  mv: throttle view update generation for large queries
  exceptions: add read_write_timeout_exception, a subclass of request_timeout_exception
  db/view: extract view throttling delay calculation to a global function
  view_update_generator: add get_storage_proxy()
  storage_proxy: make view backlog getters public
2024-05-16 08:22:54 +02:00
Tomasz Grabiec
a179f37780 test: tablets_test: Test sharding during intra-node migration 2024-05-16 00:28:47 +02:00
Tomasz Grabiec
5f32d2ddb6 test: tablets_test: Check sharding also on the pending host 2024-05-16 00:28:47 +02:00
Tomasz Grabiec
6d809c75fb test: py: tablets: Test writes concurrent with migration 2024-05-16 00:28:47 +02:00
Tomasz Grabiec
ad02d85c16 test: py: tablets: Test crash during intra-node migration 2024-05-16 00:28:47 +02:00
Tomasz Grabiec
7956a2991e api, storage_service: Introduce API to wait for topology to quiesce 2024-05-16 00:28:47 +02:00
Tomasz Grabiec
32a191384a test: Avoid using deprecated sharded API
There is no tablet migration in unit tests, so shard_of() can be
safely replaced with shard_for_reads(), even where it's used for writes.
2024-05-16 00:28:47 +02:00
Tomasz Grabiec
c9e6b4dca7 dht: split_range_to_single_shard: Work with static_sharder only
In preparation for intra-node tablet migration, to avoid
using deprecated sharder APIs.

This function is used for generating sstable sharding metadata.
For tablets, it is not invoked, so we can safely work with the
static sharder. The call site already passes static_sharder only.
2024-05-16 00:28:47 +02:00
Tomasz Grabiec
9da3bd84c7 dht: Extract dht::static_sharder
Before the patch, dht::sharder could be instantiated and it would
behave like a static sharder. This is not safe with regard to
extensions of the API because if a derived implementation forgets to
override some method, it would incorrectly default to the
implementation from static sharder. Better to fail the compilation in
this case, so extract static sharder logic to dht::static_sharder
class and make all methods in dht::sharder pure virtual.

This also allows us to have algorithms indicate that they only work
with static sharder by accepting the type, and have compile-time
safety for this requirement.

schema::get_sharder() is changed to return the static_sharder&.
2024-05-16 00:28:47 +02:00
Tomasz Grabiec
10a4903d0c dht: Deprecate old sharder API: shard_of/next_shard/token_for_next_shard
Require users to specify whether they want the shard for reads or for
writes by switching to the appropriate non-deprecated variant.

For example, shard_of() can be replaced with shard_for_reads() or
shard_for_writes().

The next_shard/token_for_next_shard APIs have only a for-reads variant,
and the act of switching serves as a statement that the code
is valid for intra-node migration.
2024-05-16 00:28:47 +02:00
Tomasz Grabiec
b3cdf9a379 tests: tablets: py: Add intra-node migration test 2024-05-16 00:28:47 +02:00
Tomasz Grabiec
d26cd97633 tests: tablets: Test that drained nodes are not balanced internally
It would be a waste of effort to do so, since we migrate tablets away
anyway.
2024-05-16 00:28:47 +02:00
Tomasz Grabiec
04f0088679 tests: tablets: Add checks of replica set validity to test_load_balancing_with_random_load 2024-05-16 00:28:47 +02:00
Tomasz Grabiec
c76ba52c70 tests: tablets: Verify that disabling balancing results in no intra-node migrations 2024-05-16 00:28:47 +02:00
Tomasz Grabiec
0addca88b9 tests: tablets: Check that nodes are internally balanced
Existing tests are augmented with a check which verifies that
all nodes are internally balanced.
2024-05-16 00:28:47 +02:00