scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-01 21:55:50 +00:00

Author	SHA1	Message	Date
Piotr Smaron	fbd75c5c06	Implement ALTER tablets KEYSPACE statement support This commit adds support for executing ALTER KS for keyspaces with tablets and utilizes all the previous commits. The ALTER KS is handled in alter_keyspace_statement, where a global topology request is generated with data attached to the system.topology table. Then, once the topology state machine is ready, it starts to handle this global topology event, which results in producing mutations required to change the schema of the keyspace, delete the system.topology's global req, produce tablets mutations and additional mutations for a table tracking the lifetime of the whole req. Tracking the lifetime is necessary to not return the control to the user too early, so the query processor only returns the response while the mutations are sent.	2024-05-28 13:56:42 +02:00
Piotr Smaron	80ed442be2	Introduce TABLET_KEYSPACE event to differentiate processing path of a vnode vs tablets ks	2024-05-28 13:55:11 +02:00
Piotr Smaron	cb40f13831	Add storage service to query processor Query processor needs to access storage service to check if global topology request is still ongoing and to be able to wait until it completes.	2024-05-27 12:48:44 +02:00
Paweł Zakrzewski	c888945354	tablets: tests for adding/removing replicas Note we're suppressing a UBSanitizer overflow error in UTs. That's because our linter complains about a possible overflow, which never happens, but tests are still failing because of it.	2024-05-27 12:48:44 +02:00
Marcin Maliszkiewicz	2ab143fb40	db: auth: move auth tables to system keyspace Separate keyspace which also behaves as system brings little benefit while creating some compatibility problems like schema digest mismatch during rollback. So we decided to move auth tables into system keyspace. Fixes https://github.com/scylladb/scylladb/issues/18098 Closes scylladb/scylladb#18769	2024-05-26 22:30:42 +03:00
Avi Kivity	56d523b071	Merge 'build, test: disable operator<< for vector and unordered_map' from Kefu Chai this series disables operator<<:s for vector and unordered_map, and drop operator<< for mutation, because we don't have to keep it to work with these operator:s anymore. this change is a follow up of https://github.com/scylladb/seastar/issues/1544 this change is a cleanup. so no need to backport Closes scylladb/scylladb#18866 * github.com:scylladb/scylladb: mutation,db: drop operator<< for mutation and seed_provider_type& build: disable operator<< for vector and unordered_map db/heat_load_balance: include used header test: define a more generic boost_test_print_type test/boost: define fmt::formatter for service_level_controller_test.cc test/boost: include test/lib/test_utils.hh	2024-05-26 19:19:20 +03:00
Pavel Emelyanov	9108952a52	test/cql-pytest: Add test for token() filter against mutation_fragments() When selecting from mutation_fragments(table) one may want to apply token() filtering against partition key. This doesn't work currently, but used to crash. This patch adds a regression test for that refs: #18637 refs: #18768 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#18759	2024-05-26 15:31:20 +03:00
Kefu Chai	4c1b6f0476	test: define a more generic boost_test_print_type fmt::is_formattable<T>::value is false, even if * T is a container of U, and * fmt::is_formattable<U>, and * U can be formatted using fmt::formatter so, we have to define a more generic boost_test_print_type() for all the types supported by {fmt}. it will help us to ditch the operator<< for vector and unordered_map in Seastar, and allow us to use the fmt::formatter specialization of the element types. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-05-26 12:32:43 +08:00
Kefu Chai	bfe918ac9e	test/boost: define fmt::formatter for service_level_controller_test.cc since we are moving away from operator<< based formatter, more and more types now only have {fmt} based formatters. the same will apply to the STL container types after ditching the generic homebrew formatter in to_string.hh, so to be prepared for the change, let's add the fmt::formatter for tests as well. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-05-26 12:32:43 +08:00
Kefu Chai	222dbf2ce4	test/boost: include test/lib/test_utils.hh this change was created in the same spirit of 505900f18f. because we are deprecating the operator<< for vector and unordered_map in Seastar, some tests do not compile anymore if we disable these operators. so to be prepared for the change disabling them, let's include test/lib/test_utils.hh for accessing the printer dedicated for Boost.test. and also '#include <fmt/ranges.h>' when necessary, because, in order to format the ranges using {fmt}, we need to use fmt/ranges.h. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-05-26 12:32:43 +08:00
Michał Chojnowski	de798775fd	test: test_coordinator_queue_management: wait for logs properly The modified lines of code intend to await the first appearance of a log on one of the nodes. But due to misplaced parentheses, instead of creating a list of log-awaiting tasks with a list comprehension, they pass a generator expression to asyncio.create_task(). This is nonsense, and it fails immediately with a type error. But since they don't actually check the result of the await, the test just assumes that the search completed successfully. This was uncovered by an upgrade to Python 3.12, because its typing is stronger and asyncio.create_task() screams when it's passed a regular generator. This patch fixes the bad list comprehension, and also adds an error check on the completed awaitables (by calling `await` on them). Fixes #18740 Closes scylladb/scylladb#18754	2024-05-25 10:54:44 +03:00
Nadav Har'El	dc80b5dafe	test/alternator: do not write to auth tables As part of the Alternator test suite, we check Alternator's support for authentication. Alternator maps Scylla's existing CQL roles to AWS's authentication: * AWS's access_key_id <- the name of the CQL role * AWS's secret_access_key <- the salted hash of the password of the CQL role Before this patch, the Alternator test suite created a new role with a preset salted hash (role "alternator", salted hash "secret_pass") and then used that in the tests. However, with the advent of Raft-based metadata it is wrong to write directly to the roles table, and starting with #17952 such writes will be outright forbidden. But we don't actually need to create a new CQL role! We already have a perfectly good CQL role called "cassandra", and our tests already use it. So what this patch does is to have the Alternator tests (conftest.py) read from the roles system-table the salted hash of the "cassandra" role, and then use that - instead of the hard-coded pair alternator/secret_pass - in the tests. A couple more tests assumed that the role name that was used was "alternator", but now it was changed to "cassandra" so those tests needed minor fixes as well. After this patch, the Alternator tests no longer write to the roles system table. Moreover, after this patch, test/alternator/run and test/alternator/suite.yaml (used when testing with test.py) no longer need to do extra ugly CQL setup before starting the Alternator tests. Fixes #18744 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#18771	2024-05-22 11:00:15 +03:00
Pavel Emelyanov	26eda88401	test/tablets: Check that after RF change data is replicated properly There's a test that checks system.tablets contents to see that after changing ks replication factor via ALTER KEYSPACE the tablet map is updated properly. This patch extends this test that also validates that mutations themselves are replicated according to the desired replication factor. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#18644	2024-05-22 11:00:15 +03:00
Botond Dénes	5e41dd28c7	Merge 'Sanitize sl controller draining' from Pavel Emelyanov The sl-controller is stopped in three steps. The first (and instantly the second) is unsubscribing from lifecycle notification and draining. The third is stop itself. First two steps are "out of order" as compared to the desired start-stop sequence of any service, this patch fixes these steps. After this PR the drain_on_shutdown() (the call that drains the node upon stop) finally becomes clean and tidy and is no longer accompanied by ad-hoc fellow drains/stops/aborts/whatever. refs: #2737 Closes scylladb/scylladb#18731 * github.com:scylladb/scylladb: sl_controller: Remove drain() method sl_controller: Move abort kicking into do_abort() main,sl_controller: Subscribe for early abort main: Unsubscribe sl controller next to subscribing	2024-05-21 17:16:23 +03:00
Kefu Chai	86b988a70b	test/lib: do not use variable which could be moved away C++ standard does not define the order in which the parameters passed to a function are evaluated. so in theory, in ```c++ reusable_sst(sst->get_schema(), std::move(sst)); ``` `std::move(sst)` could be evaluated before `sst->get_schema`. but please note, `std::move(sst)` does not move `sst` away, it merely cast `sst` to a rvalue reference, it is `reusable_sst()` which could move `sst` away by consuming it. so following call is much more dangerous than the above one: ```c++ reusable_sst(sst->get_schema(), modify_sst(std::move(sst))) ``` nevertheless, this usage is still confusing. so instead of passing a copy of `sst` to `reusable_sst`. this change is inspired by clang-tidy, it warns like: ``` Warning: /home/runner/work/scylladb/scylladb/test/lib/test_services.cc:397:25: warning: 'sst' used after it was moved [bugprone-use-after-move] 397 \| return reusable_sst(sst->get_schema(), std::move(sst)); \| ^ /home/runner/work/scylladb/scylladb/test/lib/test_services.cc:397:44: note: move occurred here 397 \| return reusable_sst(sst->get_schema(), std::move(sst)); \| ^ /home/runner/work/scylladb/scylladb/test/lib/test_services.cc:397:25: note: the use and move are unsequenced, i.e. there is no guarantee about the order in which they are evaluated 397 \| return reusable_sst(sst->get_schema(), std::move(sst)); \| ``` per the analysis above, this is a false alarm. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#18775	2024-05-21 10:02:10 +03:00
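Note on 86b988a70b: the point that `std::move()` is only a cast, and that the move happens when something consumes the rvalue, can be shown with a minimal standalone sketch. `sst_stub`, `consume()` and `schema_then_move()` are hypothetical stand-ins, not the real `reusable_sst()` or sstable types, and the sketch assumes the callee takes its argument by rvalue reference.

```c++
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for an sstable handle; not ScyllaDB's real type.
struct sst_stub {
    std::string schema;
};
using sst_ptr = std::unique_ptr<sst_stub>;

// std::move() alone is only a cast to an rvalue reference:
// the source object stays intact until something consumes it.
bool cast_alone_keeps_source() {
    sst_ptr sst = std::make_unique<sst_stub>(sst_stub{"ks.tbl"});
    sst_ptr&& ref = std::move(sst);  // no move has happened yet
    (void)ref;
    return sst != nullptr;
}

// Move-constructing from the rvalue is what actually empties the source.
bool consuming_empties_source() {
    sst_ptr sst = std::make_unique<sst_stub>(sst_stub{"ks.tbl"});
    sst_ptr taken = std::move(sst);  // the move happens here
    return sst == nullptr && taken != nullptr;
}

// Hypothetical callee: ownership is taken inside the body, after all
// arguments have been evaluated.
std::string consume(const std::string& schema, sst_ptr&& sst) {
    sst_ptr owned = std::move(sst);
    return schema + (owned ? ":owned" : ":empty");
}

// Unambiguous form: read what is needed in its own statement, then hand
// the object over. No reader (or linter) has to reason about argument
// evaluation order.
std::string schema_then_move(sst_ptr sst) {
    std::string schema = sst->schema;
    return consume(schema, std::move(sst));
}
```

Reading the schema in a separate statement sequences it before the hand-over regardless of how the callee declares its parameters.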
Avi Kivity	33ec6ccea9	test: boost: chunked_vector_test: include <optional> std::optional is used but not imported. This fails on libstdc++-14. Closes scylladb/scylladb#18739	2024-05-21 07:37:11 +03:00
Pavel Emelyanov	8d4c8711fa	main,sl_controller: Subscribe for early abort There's stop-signal in main that fires an abort source on stop. Lots of other services are subscribed in it, add the sl-controller too. For now it's a no-op, but next patches will make use of it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-05-20 21:26:31 +03:00
Avi Kivity	61505d057e	Merge 'Sort user-defined types in describe statements' from Michał Jadwiszczak User-defined types can depend on each other, creating directed acyclic graph. In order to support restoring schema from `DESC SCHEMA`, UDTs should be ordered topologically, not alphabetically as it was till now. This patch changes the way UDTs are ordered in `DESC SCHEMA`/`DESC KEYSPACE <ks>` statements, so the output can be safely copy-pasted to restore the schema. Fixes #18539 Closes scylladb/scylladb#18302 * github.com:scylladb/scylladb: test/cql-pytest/test_describe: add test for UDTs ordering cql3/statements/describe_statement: UDTs topological sorting cql3/statements/describe_statement: allow to skip alphabetical sorting types: add a method to get all referenced user types db/cql_type_parser: use generic topological sorting db/cql_type_parses: futurize raw_builder::build() test/boost: add test for topological sorting utils: introduce generic topological sorting algorithm	2024-05-20 16:58:17 +03:00
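Note on 61505d057e: a rough illustration of why topological order matters for restoring a schema. This is a generic depth-first sketch, not the algorithm added to `utils`; `dependency_map`, `emit_after_dependencies()` and `topological_order()` are hypothetical names, and the input is assumed to be acyclic.

```c++
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps each type name to the names of the types it references.
using dependency_map = std::map<std::string, std::vector<std::string>>;

// Depth-first post-order visit: a type is appended only after every type
// it references has been appended. Assumes the graph has no cycles.
void emit_after_dependencies(const std::string& name,
                             const dependency_map& deps,
                             std::set<std::string>& seen,
                             std::vector<std::string>& out) {
    if (!seen.insert(name).second) {
        return;  // already handled
    }
    auto it = deps.find(name);
    if (it != deps.end()) {
        for (const auto& dep : it->second) {
            emit_after_dependencies(dep, deps, seen, out);
        }
    }
    out.push_back(name);
}

// Returns the type names so that each one appears after its dependencies.
std::vector<std::string> topological_order(const dependency_map& deps) {
    std::set<std::string> seen;
    std::vector<std::string> out;
    for (const auto& entry : deps) {
        emit_after_dependencies(entry.first, deps, seen, out);
    }
    return out;
}
```

With `address`, `person` (references `address`) and `company` (references `person` and `address`), alphabetical order would list `company` before `person`, while this ordering yields `address`, `person`, `company`, which can be replayed top to bottom.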
Botond Dénes	e1c4e6c151	Merge 'sstables_manager: use maintenance scheduling group to run components reload fiber' from Lakshmi Narayanan Sreethar PR https://github.com/scylladb/scylladb/pull/18186 introduced a fiber that reloads reclaimed bloom filters when memory becomes available. Use maintenance scheduling group to run that fiber instead of running it in the main scheduling group. Fixes #18675 Closes scylladb/scylladb#18721 * github.com:scylladb/scylladb: sstables_manager: use maintenance scheduling group to run components reload fiber sstables_manager: add member to store maintenance scheduling group	2024-05-20 16:38:42 +03:00
Andrei Chekun	bce53efd36	Enrich test results produced by test.py This PR resolves issue with double count of the test result for topology tests. It will not appear in the consolidated report anymore. Another fix is to provide a better view which test failed by modifying the test case name in the report enriching it with mode and run id, so making them unique across the run. The scope of this change is: 1. Modify the test name to have run id in name 2. Add handlers to get logs of test.py and pytest in one file that are related to test, rather than to the full suite 3. Remove topology tests from aggregating them on a suite level in Junit results 4. Add a link to the logs related to the failed tests in Junit results, so it will be easier to navigate to all logs related to test 5. Gather logs related to the failed test to one directory for better logs investigation Ref: scylladb/scylladb#17851 Closes scylladb/scylladb#18277	2024-05-20 15:33:57 +02:00
Avi Kivity	52fe351c31	Merge 'Balance tablets within nodes (intra-node migration)' from Tomasz Grabiec This is needed to avoid severe imbalance between shards which can happen when some table grows and is split. The inter-node balance can be equal, so inter-node migration cannot fix the imbalance. Also, if RF=N then there is not even a possibility of moving tablets around to fix the imbalance. The only way to bring the system to balance is to move tablets within the nodes. The system is not prepared for intra-node migration currently. Request coordination is host-based, while for intra-node migration it should be (also) shard-based. The solution employed here is to keep the coordination between nodes as-is, and for intra-node migration storage_proxy-level coordinator is not aware of the migration (no pending host). The replica-side request handler will be a second-level coordinator which routes requests to shards, similar to how the first-level coordinator routes them to hosts. Tablet sharder is adjusted to handle intra-migration where a tablet can have two replicas on the same host. For reads, sharder uses the read selector to resolve the conflict. For writes, the write selector is used. The old shard_of() API is kept to represent shard for reads, and new method is introduced to query the shards for writing: shard_for_writes(). All writers should be switched to that API, which is not done in this patch yet. The request handler on replica side acts as a second-level coordinator, using sharder to determine routing to shards. A given sharder has a scope of a single topology version, a single effective_replication_map_ptr, which should be kept alive during writes. 
perf-simple-query test results show no signs of regression: Command: perf-simple-query -c1 -m1G --write --tablets --duration=10 Before: > 83294.81 tps ( 59.5 allocs/op, 14.3 tasks/op, 53725 insns/op, 0 errors) > 87756.72 tps ( 59.5 allocs/op, 14.3 tasks/op, 54049 insns/op, 0 errors) > 86428.47 tps ( 59.6 allocs/op, 14.3 tasks/op, 54208 insns/op, 0 errors) > 86211.38 tps ( 59.7 allocs/op, 14.3 tasks/op, 54219 insns/op, 0 errors) > 86559.89 tps ( 59.6 allocs/op, 14.3 tasks/op, 54188 insns/op, 0 errors) > 86609.39 tps ( 59.6 allocs/op, 14.3 tasks/op, 54117 insns/op, 0 errors) > 87464.06 tps ( 59.5 allocs/op, 14.3 tasks/op, 54039 insns/op, 0 errors) > 86185.43 tps ( 59.6 allocs/op, 14.3 tasks/op, 54169 insns/op, 0 errors) > 86254.71 tps ( 59.6 allocs/op, 14.3 tasks/op, 54139 insns/op, 0 errors) > 83395.35 tps ( 60.2 allocs/op, 14.4 tasks/op, 54693 insns/op, 0 errors) > > median 86428.47 tps ( 59.6 allocs/op, 14.3 tasks/op, 54208 insns/op, 0 errors) > median absolute deviation: 243.04 > maximum: 87756.72 > minimum: 83294.81 > After: > 85523.06 tps ( 59.5 allocs/op, 14.3 tasks/op, 53872 insns/op, 0 errors) > 89362.47 tps ( 59.6 allocs/op, 14.3 tasks/op, 54226 insns/op, 0 errors) > 88167.55 tps ( 59.7 allocs/op, 14.3 tasks/op, 54400 insns/op, 0 errors) > 87044.40 tps ( 59.7 allocs/op, 14.3 tasks/op, 54310 insns/op, 0 errors) > 88344.50 tps ( 59.6 allocs/op, 14.3 tasks/op, 54289 insns/op, 0 errors) > 88355.06 tps ( 59.6 allocs/op, 14.3 tasks/op, 54242 insns/op, 0 errors) > 88725.46 tps ( 59.6 allocs/op, 14.3 tasks/op, 54230 insns/op, 0 errors) > 88640.08 tps ( 59.6 allocs/op, 14.3 tasks/op, 54210 insns/op, 0 errors) > 90306.31 tps ( 59.4 allocs/op, 14.3 tasks/op, 54043 insns/op, 0 errors) > 87343.62 tps ( 59.8 allocs/op, 14.3 tasks/op, 54496 insns/op, 0 errors) > > median 88355.06 tps ( 59.6 allocs/op, 14.3 tasks/op, 54242 insns/op, 0 errors) > median absolute deviation: 1007.41 > maximum: 90306.31 > minimum: 85523.06 Command (reads): perf-simple-query -c1 -m1G --tablets 
--duration=10 Before: > 95860.18 tps ( 63.1 allocs/op, 14.1 tasks/op, 42476 insns/op, 0 errors) > 97537.69 tps ( 63.1 allocs/op, 14.1 tasks/op, 42454 insns/op, 0 errors) > 97549.23 tps ( 63.1 allocs/op, 14.1 tasks/op, 42470 insns/op, 0 errors) > 97511.29 tps ( 63.1 allocs/op, 14.1 tasks/op, 42470 insns/op, 0 errors) > 97227.32 tps ( 63.1 allocs/op, 14.1 tasks/op, 42471 insns/op, 0 errors) > 94031.94 tps ( 63.1 allocs/op, 14.1 tasks/op, 42441 insns/op, 0 errors) > 96978.04 tps ( 63.1 allocs/op, 14.1 tasks/op, 42462 insns/op, 0 errors) > 96401.70 tps ( 63.1 allocs/op, 14.1 tasks/op, 42473 insns/op, 0 errors) > 96573.77 tps ( 63.1 allocs/op, 14.1 tasks/op, 42440 insns/op, 0 errors) > 96340.54 tps ( 63.1 allocs/op, 14.1 tasks/op, 42468 insns/op, 0 errors) > > median 96978.04 tps ( 63.1 allocs/op, 14.1 tasks/op, 42462 insns/op, 0 errors) > median absolute deviation: 571.20 > maximum: 97549.23 > minimum: 94031.94 > After: > 99794.67 tps ( 63.1 allocs/op, 14.1 tasks/op, 42471 insns/op, 0 errors) > 101244.99 tps ( 63.1 allocs/op, 14.1 tasks/op, 42472 insns/op, 0 errors) > 101128.37 tps ( 63.1 allocs/op, 14.1 tasks/op, 42485 insns/op, 0 errors) > 101065.27 tps ( 63.1 allocs/op, 14.1 tasks/op, 42465 insns/op, 0 errors) > 101212.98 tps ( 63.1 allocs/op, 14.1 tasks/op, 42456 insns/op, 0 errors) > 101413.31 tps ( 63.1 allocs/op, 14.1 tasks/op, 42463 insns/op, 0 errors) > 101464.92 tps ( 63.1 allocs/op, 14.1 tasks/op, 42466 insns/op, 0 errors) > 101086.74 tps ( 63.1 allocs/op, 14.1 tasks/op, 42488 insns/op, 0 errors) > 101559.09 tps ( 63.1 allocs/op, 14.1 tasks/op, 42468 insns/op, 0 errors) > 100742.58 tps ( 63.1 allocs/op, 14.1 tasks/op, 42491 insns/op, 0 errors) > > median 101212.98 tps ( 63.1 allocs/op, 14.1 tasks/op, 42456 insns/op, 0 errors) > median absolute deviation: 200.33 > maximum: 101559.09 > minimum: 99794.67 > Fixes #16594 Closes scylladb/scylladb#18026 * github.com:scylladb/scylladb: Implement fast streaming for intra-node migration test: tablets_test: Test 
sharding during intra-node migration test: tablets_test: Check sharding also on the pending host test: py: tablets: Test writes concurrent with migration test: py: tablets: Test crash during intra-node migration api, storage_service: Introduce API to wait for topology to quiesce dht, replica: Remove deprecated sharder APIs test: Avoid using deprecated sharded API db: do_apply_many() avoid deprecated sharded API replica: mutation_dump: Avoid deprecated sharder API repair: Avoid deprecated sharder API table: Remove optimization which returns empty reader when key is not owned by the shard dht: is_single_shard: Avoid deprecated sharder API dht: split_range_to_single_shard: Work with static_sharder only dht: ring_position_range_sharder: Avoid deprecated sharder APIs dht: token: Avoid use of deprecated sharder API by switching to static_sharder selective_token_sharder: Avoid use of deprecated sharder API docs: Document tablet sharding vs tablet replica placement readers/multishard.cc: use shard_for_reads() instead of shard_of() multishard_mutation_query.cc: use shard_for_reads() instead of shard_of() storage_proxy: Extract common code to apply mutations on many shards according to sharder storage_proxy: Prepare per-partition rate-limiting for intra-node migration storage_proxy: Avoid shard_of() use in mutate_counter_on_leader_and_replicate() storage_proxy: Prepare mutate_hint() for intra-node tablet migration commitlog_replayer: Avoid deprecated sharder::shard_of() lwt: Avoid deprecated sharder::shard_of() compaction: Avoid deprecated sharder::shard_of() dht: Extract dht::static_sharder replica: Deprecate table::shard_of() locator: Deprecate effective_replication_map::shard_of() dht: Deprecate old sharder API: shard_of/next_shard/token_for_next_shard tests: tablets: py: Add intra-node migration test tests: tablets: Test that drained nodes are not balanced internally tests: tablets: Add checks of replica set validity to test_load_balancing_with_random_load tests: 
tablets: Verify that disabling balancing results in no intra-node migrations tests: tablets: Check that nodes are internally balanced tests: tablets: Improve debuggability by showing which rows are missing tablets, storage_service: Support intra-node migration in move_tablet() API tablet_allocator: Generate intra-node migration plan tablet_allocator: Extract make_internode_plan() tablet_allocator: Maintain candidate list and shard tablet count for target nodes tablet_allocator: Lift apply_load/can_accept_load lambdas to member functions tablets, streaming: Implement tablet streaming for intra-node migration dht, auto_refreshing_sharder: Allow overriding write selector multishard_writer: Handle intra-node migration storage_proxy: Handle intra-node tablet migration for writes tablets: Get rid of tablet_map::get_shard() tablets: Avoid tablet_map::get_shard in cleanup tablets: test: Use sharder instead of tablet_map::get_shard() tablets: tablet_sharder: Allow working with non-local host sharding: Prepare for intra-node-migration docs: Document sharder use for tablets tablets: Introduce tablet transition kind for intra-node migration tests: tablets: Fix use-after-move of skiplist in rebalance_tablets() sstables, gdb: Track readers in a linked list raft topology: Fix global token metadata barrier to not fence ahead of what is drained	2024-05-20 16:13:01 +03:00
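Note on 52fe351c31: the read/write selector idea described in the merge message can be sketched as follows. This is a simplified model under stated assumptions: the enum names, the `intra_node_migration` struct and the free functions are illustrative only and do not reproduce the real tablet sharder, which works on tokens and an effective replication map.

```c++
#include <vector>

using shard_id = unsigned;

// Assumed selectors: which of the two same-host replicas serves requests
// at a given stage of the migration.
enum class read_selector { previous, next };
enum class write_selector { previous, both, next };

// Hypothetical view of one tablet that is moving between two shards of
// the same host.
struct intra_node_migration {
    shard_id leaving;   // shard that owned the tablet before the migration
    shard_id pending;   // shard that will own it afterwards
    read_selector reads;
    write_selector writes;
};

// Reads always resolve to exactly one shard.
shard_id shard_for_reads(const intra_node_migration& m) {
    return m.reads == read_selector::previous ? m.leaving : m.pending;
}

// Writes may have to reach both shards while the migration is in flight.
std::vector<shard_id> shard_for_writes(const intra_node_migration& m) {
    switch (m.writes) {
    case write_selector::previous:
        return {m.leaving};
    case write_selector::both:
        return {m.leaving, m.pending};
    case write_selector::next:
        return {m.pending};
    }
    return {};
}
```

This is why a single `shard_of()` answer is not enough during migration: the write side can legitimately name two shards.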
Kefu Chai	40ce52c3cc	test: use generic boost_test_print_type() in this change, we trade the `boost_test_print_type()` overloads for the generic template of `boost_test_print_type()`, except for those in the very small tests, which presumably want to keep themselves relative self-contained. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#18727	2024-05-20 12:56:20 +03:00
Lakshmi Narayanan Sreethar	79f6746298	sstables_manager: add member to store maintenance scheduling group Store that maintenance scheduling group inside the sstables_manager. The next patch will use this to run the components reloader fiber. Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2024-05-19 15:23:45 +05:30
Avi Kivity	2fbd78c769	feature: grandfather DIGEST_FOR_NULL_VALUES The DIGEST_FOR_NULL_VALUES feature was added in `21a77612b3` (2020; 4.4) and can now be assumed to be always present. The hasher which it invoked is removed.	2024-05-18 00:24:00 +03:00
Avi Kivity	3bead8cea0	feature: grandfather PER_TABLE_PARTITIONERS The PER_TABLE_PARTITIONERS feature was added in `90df9a44ce` (2020; 4.0) and can now be assumed to be always present. We also remove the associated schema_feature.	2024-05-18 00:15:07 +03:00
Avi Kivity	6b532fd40b	test: schema_change_test: regenerate digest for PER_TABLE_PARTITIONERS The first digest tested was generated without the PER_TABLE_PARTITIONERS schema feature. We're about to make that feature mandatory, so we won't be able (and won't need) to generate a digest without it. Update the digest to include the feature. Note it wasn't untested before, we have a test with schema_features::full().	2024-05-18 00:14:43 +03:00
Avi Kivity	c4d8b17f4c	test: test_schema_change_digest: drop unneeded reference digests digests[0] was used by the VIEW_VIRTUAL_COLUMNS feature, which no longer exists. digests[1] is the same as digests[2], so drop it.	2024-05-17 20:41:20 +03:00
Avi Kivity	b5f6021a6b	feature: grandfather VIEW_VIRTUAL_COLUMNS The VIEW_VIRTUAL_COLUMNS feature was added in `a108df09f9` (2019; 3.1) and can now be assumed to be always present. The corresponding schema_feature is removed. Note schema_features are not sent over the wire. A digest calculation without VIEW_VIRTUAL_COLUMNS is no longer tested.	2024-05-17 20:41:19 +03:00
Botond Dénes	db70e8dd5f	test/cql-pytest: test_tombstone_limit.py: enable xfailing tests These tests were marked as xfail because they used to fail with tablets. They don't anymore, so remove the xfail. Fixes: #16486 Closes scylladb/scylladb#18671	2024-05-16 20:14:47 +03:00
Nadav Har'El	c7aa47354a	Merge 'mutation_fragment_stream_validating_filter: respect validating_level::none' from Botond Dénes Even when configured to not do any validation at all, the validator still did some. This small series fixes this, and adds a test to check that validation levels in general are respected, and the validator doesn't validate more than it is asked to. Fixes: #18662 Closes scylladb/scylladb#18667 * github.com:scylladb/scylladb: test/boost/mutation_fragment_test.cc: add test for validator validation levels mutation: mutation_fragment_stream_validating_filter: fix validation_level::none mutation: mutation_fragment_stream_validating_filter: add raises_error ctor parameter	2024-05-16 19:57:49 +03:00
Kamil Braun	734c5de314	Merge 'fix test teardown race with ongoing test operation' from Artsiom Mishuta This commit brings several new features in scylla_cluster.py to fix runaway asyncio task problems in topology tests - Start-Stop Lock and Stop Event in ScyllaServer - Tasks History, Wait for tasks from Tasks History and Manager broken state in ScyllaClusterManager - make ManagerClient object function scope - test_finished_event in ManagerClient Fixes: scylladb/scylladb#16472 Fixes: scylladb/scylladb#16651 Closes scylladb/scylladb#18236 * github.com:scylladb/scylladb: test/pylib: Introduce ManagerClient.test_finished_event test/topology: make ManagerClient object function scope test/pylib: Introduce Manager broken state: test/pylib: Wait for tasks from Tasks History: test/pylib: Introduce Tasks History: test/pylib: Introduce Stop Event test/pylib: Introduce Start-Stop Lock:	2024-05-16 17:42:00 +02:00
Kefu Chai	759156b56d	test: perf: alternator: mark format string as `constexpr` before this change, we use `update_item_suffix` as a format string fed to `format(...)`, which is resolved to `seastar::format()`. but with a patch which migrates the `seastar::format()` to the backend with compile-time format check, the caller sites using `format()` would fail to build, because `update_item_suffix` is not a `constexpr`: ``` /home/kefu/.local/bin/clang++ -DFMT_SHARED -DSCYLLA_BUILD_MODE=release -DSEASTAR_API_LEVEL=7 -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"RelWithDebInfo\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -isystem /home/kefu/dev/scylladb/abseil -isystem /home/kefu/dev/scylladb/build/rust -ffunction-sections -fdata-sections -O3 -g -gz -std=gnu++20 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. 
-march=westmere -mllvm -inline-threshold=2500 -fno-slp-vectorize -U_FORTIFY_SOURCE -Werror=unused-result -MD -MT test/perf/CMakeFiles/test-perf.dir/RelWithDebInfo/perf_alternator.cc.o -MF test/perf/CMakeFiles/test-perf.dir/RelWithDebInfo/perf_alternator.cc.o.d -o test/perf/CMakeFiles/test-perf.dir/RelWithDebInfo/perf_alternator.cc.o -c /home/kefu/dev/scylladb/test/perf/perf_alternator.cc /home/kefu/dev/scylladb/test/perf/perf_alternator.cc:249:69: error: call to consteval function 'fmt::basic_format_string<char, const char (&)[1]>::basic_format_string<const char , 0>' is not a constant expression 249 \| return make_request(cli, "UpdateItem", prefix + seastar::format(update_item_suffix, "")); \| ^ /usr/include/fmt/core.h:2776:67: note: read of non-constexpr variable 'update_item_suffix' is not allowed in a constant expression 2776 \| FMT_CONSTEVAL FMT_INLINE basic_format_string(const S& s) : str_(s) { \| ^ /home/kefu/dev/scylladb/test/perf/perf_alternator.cc:249:69: note: in call to 'basic_format_string<const char , 0>(update_item_suffix)' 249 \| return make_request(cli, "UpdateItem", prefix + seastar::format(update_item_suffix, "")); \| ^~~~~~~~~~~~~~~~~~ /home/kefu/dev/scylladb/test/perf/perf_alternator.cc:198:6: note: declared here 198 \| auto update_item_suffix = R"( \| ^ ``` so, to prepare the change switching to compile-time format checking, let's mark this variable `static constexpr`. this is also more correct, as this variable is * a compile time constant, and * is not shared across different compilation units. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#18685	2024-05-16 15:18:42 +03:00
Michał Jadwiszczak	b3e6a39604	test/cql-pytest/test_describe: add test for UDTs ordering	2024-05-16 13:30:03 +02:00
Michał Jadwiszczak	7f04c88395	test/boost: add test for topological sorting	2024-05-16 13:30:03 +02:00
Nadav Har'El	27ab560abd	cql: fix hang during certain SELECT statements The function intersection(r1,r2) in statement_restrictions.cc is used when several WHERE restrictions were applied to the same column. For example, for "WHERE b<1 AND b<2" the intersection of the two ranges is calculated to be b<1. As noted in issue #18690, Scylla is inconsistent in where it allows or doesn't allow these intersecting restrictions. But where they are allowed they must be implemented correctly. And it turns out the function intersection() had a bug that caused it to sometimes enter an infinite loop - when the intent was only to call itself once with swapped parameters. This patch includes a test reproducing this bug, and a fix for the bug. The test hangs before the fix, and passes after the fix. While at it, I carefully reviewed the entire code used to implement the intersection() function to try to make sure that the bug we found was the only one. I also added a few more comments where I thought they were needed to understand complicated logic of the code. The bug, the fix and the test were originally discovered by Michał Chojnowski. Fixes #18688 Refs #18690 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#18694	2024-05-16 11:25:44 +03:00
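Note on 27ab560abd: the expected result for "WHERE b<1 AND b<2" can be worked through with a small sketch restricted to upper bounds. It is not the code in statement_restrictions.cc; `upper_bound`, `at_least_as_tight()` and this `intersection()` are hypothetical and only show the outcome the message describes, written so that no self-call with swapped arguments is needed.

```c++
#include <optional>

// Hypothetical upper bound on an integer column: b < value or b <= value.
struct upper_bound {
    int value;
    bool inclusive;
};

// An empty optional means "no upper bound".
using maybe_bound = std::optional<upper_bound>;

// True if a admits no value that b rejects.
bool at_least_as_tight(const upper_bound& a, const upper_bound& b) {
    if (a.value != b.value) {
        return a.value < b.value;
    }
    return !a.inclusive || b.inclusive;
}

// Intersection of two upper bounds: the tighter one wins. Both argument
// orders are handled directly, so the function never recurses.
maybe_bound intersection(const maybe_bound& a, const maybe_bound& b) {
    if (!a) {
        return b;
    }
    if (!b) {
        return a;
    }
    return at_least_as_tight(*a, *b) ? a : b;
}
```

For `b<1` and `b<2` the result is `b<1` in either argument order, matching the example in the commit message.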
Piotr Dulikowski	68eca3778c	Merge 'mv: throttle view update generation for large queries' from Wojciech Mitros

This series is a reupload of #13792 with a few modifications, namely a test is added and the conflicts with recent tablet related changes are fixed.

See https://github.com/scylladb/scylladb/issues/12379 and https://github.com/scylladb/scylladb/pull/13583 for a detailed description of the problem and discussions.

This PR aims to extend the existing throttling mechanism to work with requests that internally generate a large amount of view updates, as suggested by @nyh.

The existing mechanism works in the following way:
* Client sends a request, we generate the view updates corresponding to the request and spawn background tasks which will send these updates to remote nodes
* Each background task consumes some units from the `view_update_concurrency_semaphore`, but doesn't wait for these units, it's just for tracking
* We keep track of the percent of consumed units on each node, this is called `view update backlog`.
* Before sending a response to the client we sleep for a short amount of time. The amount of time to sleep for is based on the fullness of this `view update backlog`. For a well behaved client with limited concurrency this will limit the amount of incoming requests to a manageable level.

This mechanism doesn't handle large DELETE queries. Deleting a partition is fast for the base table, but it requires us to generate a view update for every single deleted row. The number of deleted rows per single client request can be in the millions. Delaying response to the request doesn't help when a single request can generate millions of updates.

To deal with this we could treat the view update generator just like any other client and force it to wait a bit of time before sending the next batch of updates. The amount of time to wait for is calculated just like in the existing throttling code, it's based on the fullness of `view update backlogs`.

The new algorithm of view update generation looks something like this:
```c++
for(;;) {
    auto updates = generate_updates_batch_with_max_100_rows();
    co_await seastar::sleep(calculate_sleep_time_from_backlogs());
    spawn_background_tasks_for_updates(updates);
}
```

Fixes: https://github.com/scylladb/scylladb/issues/12379

Closes scylladb/scylladb#16819

* github.com:scylladb/scylladb:
  test: add test for bad_allocs during large mv queries
  mv: throttle view update generation for large queries
  exceptions: add read_write_timeout_exception, a subclass of request_timeout_exception
  db/view: extract view throttling delay calculation to a global function
  view_update_generator: add get_storage_proxy()
  storage_proxy: make view backlog getters public	2024-05-16 08:22:54 +02:00
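The two ingredients of the throttling loop in the merge message above, batching the rows and deriving a delay from backlog fullness, can be sketched in plain C++ without Seastar. Everything here is an assumption made for illustration: the `backlog` struct, the helper names, the `max_delay` parameter, and above all the linear delay formula. The commit message says only that the delay is based on the fullness of the view update backlogs.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical view of one node's backlog: units consumed out of a maximum.
struct backlog {
    std::size_t used;
    std::size_t max;
};

// Assumed formula: the delay scales linearly with the fullest backlog,
// capped at max_delay. The real calculation may differ.
std::chrono::microseconds calculate_sleep_time(const std::vector<backlog>& backlogs,
                                               std::chrono::microseconds max_delay) {
    double fullest = 0.0;
    for (const auto& b : backlogs) {
        if (b.max > 0) {
            fullest = std::max(fullest,
                               static_cast<double>(b.used) / static_cast<double>(b.max));
        }
    }
    fullest = std::min(fullest, 1.0);
    return std::chrono::microseconds(static_cast<std::chrono::microseconds::rep>(
        fullest * static_cast<double>(max_delay.count())));
}

// Splits total_rows into batch sizes of at most max_batch rows each,
// mirroring the "max 100 rows" batching in the pseudocode.
std::vector<std::size_t> split_into_batches(std::size_t total_rows,
                                            std::size_t max_batch = 100) {
    std::vector<std::size_t> batches;
    if (max_batch == 0) {
        return batches;
    }
    while (total_rows > 0) {
        const std::size_t n = std::min(total_rows, max_batch);
        batches.push_back(n);
        total_rows -= n;
    }
    return batches;
}
```

A generator built on these helpers would sleep for `calculate_sleep_time(...)` before dispatching each batch, so a single large DELETE is paced in the same way as a stream of small client requests.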
Tomasz Grabiec	a179f37780	test: tablets_test: Test sharding during intra-node migration	2024-05-16 00:28:47 +02:00
Tomasz Grabiec	5f32d2ddb6	test: tablets_test: Check sharding also on the pending host	2024-05-16 00:28:47 +02:00
Tomasz Grabiec	6d809c75fb	test: py: tablets: Test writes concurrent with migration	2024-05-16 00:28:47 +02:00
Tomasz Grabiec	ad02d85c16	test: py: tablets: Test crash during intra-node migration	2024-05-16 00:28:47 +02:00
Tomasz Grabiec	7956a2991e	api, storage_service: Introduce API to wait for topology to quiesce	2024-05-16 00:28:47 +02:00
Tomasz Grabiec	32a191384a	test: Avoid using deprecated sharded API There is no tablet migration in unit tests, so shard_of() can be safely replaced with shard_for_reads(), even where it is used for writes.	2024-05-16 00:28:47 +02:00
Tomasz Grabiec	c9e6b4dca7	dht: split_range_to_single_shard: Work with static_sharder only In preparation for intra-node tablet migration, to avoid using deprecated sharder APIs. This function is used for generating sstable sharding metadata. For tablets, it is not invoked, so we can safely work with the static sharder. The call site already passes static_sharder only.	2024-05-16 00:28:47 +02:00
Tomasz Grabiec	9da3bd84c7	dht: Extract dht::static_sharder Before the patch, dht::sharder could be instantiated and it would behave like a static sharder. This is not safe with regards to extensions of the API because if a derived implementation forgets to override some method, it would incorrectly default to the implementation from static sharder. Better to fail the compilation in this case, so extract static sharder logic to dht::static_sharder class and make all methods in dht::sharder pure virtual. This also allows us to have algorithms indicate that they only work with static sharder by accepting the type, and have compile-time safety for this requirement. schema::get_sharder() is changed to return the static_sharder&.	2024-05-16 00:28:47 +02:00
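The compile-time safety argument in the commit above can be illustrated with a small sketch. The class names follow the commit message, but the method set, the `token` alias and the modulo mapping are hypothetical simplifications and not the real dht API. Because every method of the base class is pure virtual, a derived sharder that forgets an override fails to compile instead of silently inheriting static behaviour.

```cpp
#include <cstdint>

using token = std::uint64_t;  // hypothetical stand-in for a partition token
using shard_id = unsigned;

// Abstract interface: nothing here has a default implementation.
class sharder {
public:
    virtual ~sharder() = default;
    virtual shard_id shard_for_reads(token t) const = 0;
};

// Static mapping, assumed here to be token modulo shard count.
// Assumes shard_count > 0.
class static_sharder : public sharder {
    unsigned _shard_count;
public:
    explicit static_sharder(unsigned shard_count) : _shard_count(shard_count) {}
    shard_id shard_for_reads(token t) const override {
        return static_cast<shard_id>(t % _shard_count);
    }
    unsigned shard_count() const { return _shard_count; }
};

// An algorithm that is only valid for a static mapping states so in its
// signature; passing any other sharder is a compile error.
shard_id last_shard(const static_sharder& s) {
    return s.shard_count() - 1;
}
```

Under these assumptions, `last_shard()` accepts a `static_sharder` but rejects a reference to the abstract `sharder`, which is the requirement the commit describes expressing through the type.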
Tomasz Grabiec	10a4903d0c	dht: Deprecate old sharder API: shard_of/next_shard/token_for_next_shard Require users to specify whether they want the shard for reads or for writes by switching to the appropriate non-deprecated variant. For example, shard_of() can be replaced with shard_for_reads() or shard_for_writes(). The next_shard/token_for_next_shard APIs have only a for-reads variant, and the act of switching will be a testimony to the fact that the code is valid for intra-node migration.	2024-05-16 00:28:47 +02:00
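One way to steer callers from an ambiguous accessor to explicit read and write variants is the standard `[[deprecated]]` attribute. The sketch below is an illustration only: `migrating_sharder_sketch`, its fields and its return types are hypothetical. It also assumes, for the sake of the example, that writes may need to reach both the current and the pending shard while an intra-node migration is in progress.

```cpp
#include <vector>

// Hypothetical per-tablet view of shard placement on one node.
struct migrating_sharder_sketch {
    unsigned current_shard;
    unsigned pending_shard;
    bool migrating;

    // Reads are served by a single shard.
    unsigned shard_for_reads() const { return current_shard; }

    // Assumption: during migration a write goes to both shards.
    std::vector<unsigned> shard_for_writes() const {
        if (migrating) {
            return {current_shard, pending_shard};
        }
        return {current_shard};
    }

    // The old accessor stays for compatibility, but every remaining call
    // site now triggers a compiler warning naming the replacements.
    [[deprecated("use shard_for_reads() or shard_for_writes()")]]
    unsigned shard_of() const { return shard_for_reads(); }
};
```

Each remaining `shard_of()` call then shows up as a warning at build time, so the choice between the read and write variants has to be made explicitly at every call site.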
Tomasz Grabiec	b3cdf9a379	tests: tablets: py: Add intra-node migration test	2024-05-16 00:28:47 +02:00
Tomasz Grabiec	d26cd97633	tests: tablets: Test that drained nodes are not balanced internally It would be a waste of effort to do so, since we migrate tablets away anyway.	2024-05-16 00:28:47 +02:00
Tomasz Grabiec	04f0088679	tests: tablets: Add checks of replica set validity to test_load_balancing_with_random_load	2024-05-16 00:28:47 +02:00
Tomasz Grabiec	c76ba52c70	tests: tablets: Verify that disabling balancing results in no intra-node migrations	2024-05-16 00:28:47 +02:00
Tomasz Grabiec	0addca88b9	tests: tablets: Check that nodes are internally balanced Existing tests are augmented with a check which verifies that all nodes are internally balanced.	2024-05-16 00:28:47 +02:00


6872 Commits