scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-08 07:53:20 +00:00

Author	SHA1	Message	Date
Pavel Emelyanov	652153c291	Merge 'populate_keyspace: use datadir' from Benny Halevy Currently the datadir is ignored. Use it to construct the table's base path. Fixes scylladb/scylladb#15418 Closes scylladb/scylladb#15480 * github.com:scylladb/scylladb: distributed_loader: populate_keyspace: access cf by ref distributed_loader: table_populator: use datadir for base_path distributed_loader: populate_keyspace: issue table mark_ready_for_writes after all datadirs are processed distributed_loader: populate_keyspace: fixup indentation distributed_loader: populate_keyspace: iterate over datadirs in the inner loop test: sstable_directory_test: add test_multiple_data_dirs table: init_storage: create upload and staging subdirs on all datadirs	2023-09-25 13:40:50 +03:00
Nadav Har'El	1a5debac5c	test/cql-pytest: cleaner reproducer for spurious static row returned Issue #10357 is about a SELECT with a filter on a regular column which incorrectly returns a static row without regular columns set (so the filter would not have matched). We already have four tests reproducing this issue, but each of them is a small part of a large test translated from Cassandra, making it hard to understand the scope of this bug. So in this patch we add two new tests, one passing and one xfailing, which clarify the scope of this bug. It turns out that the bug only occurs when a partition has no clustering rows and only has a static row. If the partition does have clustering rows - even if those don't match the filter - the bug doesn't happen. The xfailing test is just two statements long - a single INSERT and a single SELECT. Refs #10357. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#15120	2023-09-25 11:01:22 +03:00
Nadav Har'El	be942c1bce	Merge 'treewide: rename s3 credentials related variable and option names' from Kefu Chai in this series, we rename s3 credential related variable and option names so they are more consistent with AWS's official document. this should help with the maintainability. Closes scylladb/scylladb#15529 * github.com:scylladb/scylladb: main.cc: rename aws option utils/s3/creds: rename aws_config member variables	2023-09-24 14:03:47 +03:00
Nadav Har'El	4e1e7568d8	Merge 'cql3:statements:describe_statement: include UDT/UDF/UDA in generic describe' from Michał Jadwiszczak So far generic describe (`DESC <name>`) followed Cassandra implementation and it only described keyspace/table/view/index. This commit adds UDT/UDF/UDA to generic describe. Fixes: #14170 Closes scylladb/scylladb#14334 * github.com:scylladb/scylladb: docs:cql: add information about generic describe cql-pytest:test_describe: add test for generic UDT/UDF/UDA desc cql3:statements:describe_statement: include UDT/UDF/UDA in generic describe	2023-09-24 13:03:04 +03:00
Kefu Chai	f3f31f0c65	main.cc: rename aws option - s/aws_key/aws_access_key_id/ - s/aws_secret/aws_secret_access_key/ - s/aws_token/aws_session_token/ rename them to more popular names, these names are also used by boto's API. this should improve the readability and consistency. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-09-23 14:31:32 +08:00
Kefu Chai	ac3406e537	utils/s3/creds: rename aws_config member variables - s/key/access_key_id/ - s/secret/secret_access_key/ - s/token/session_token/ so they are more aligned with the AWS document. for instance, in https://docs.aws.amazon.com/AmazonS3/latest/userguide/RESTAuthentication.html#ConstructingTheAuthenticationHeader AWSAccessKeyId is used in the "Authorization" header. this would help with the readability and maintainability. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-09-23 14:28:07 +08:00
Benny Halevy	14da3e4218	distributed_loader: populate_keyspace: issue table mark_ready_for_writes after all datadirs are processed Currently, mark_ready_for_writes is called too early, after the first data dir is processed, then the next datadir will hit an assert in `table::mark_ready_for_writes`. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-09-23 08:50:53 +03:00
Benny Halevy	2591f5f935	test: sstable_directory_test: add test_multiple_data_dirs Add a basic regression test that starts the cql test env with multiple data directories. It fails without the previous patch: table: init_storage: create upload and staging subdirs on all datadirs Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-09-23 08:24:54 +03:00
Kamil Braun	99d83808cc	Merge 'test/topology_custom/test_select_from_mutation_fragments.py: use async api and clean-up' from Botond Dénes Also, while at it, add copyright/license blurbs for tests that were missing it. Closes scylladb/scylladb#15495 * github.com:scylladb/scylladb: test/topology_custom: add copyright/license blurb to tests test/topology_custom: test_select_from_mutation_fragments.py: use async query api	2023-09-22 10:59:48 +02:00
Botond Dénes	4acde0fb4b	test/topology_custom: add test_read_repair.py	2023-09-22 02:53:15 -04:00
Botond Dénes	d007a0ec16	replica/mutation_dump: detect end-of-page in range-scans The current read-loop fails to detect end-of-page and if the query result builder cuts the page, it will just proceed to the next partition. This will result in distorted query results, as the result builder will request for the consumption to stop after each clustering row. To fix, check if the page was cut before moving on to the next partition. A unit test reproducing the bug was also added.	2023-09-22 02:53:15 -04:00
Botond Dénes	70e26e5a10	test/pylib: add REST methods to get node exe and workdir paths	2023-09-22 02:53:15 -04:00
Botond Dénes	8bd5f67039	test/pylib/rest_client: add load_new_sstables, keyspace_{flush,compaction} To support the equivalent (roughly) of the following nodetool commands: * nodetool refresh * nodetool flush * nodetool compact	2023-09-22 02:53:15 -04:00
Botond Dénes	91a8100b3f	Merge 'Validate compaction strategy options in prepare' from Aleksandra Martyniuk Table properties validation is performed on statement execution. Thus, when one attempts to create a table with invalid options, an incorrect command gets committed in Raft. But then its application fails, leading to a raft machine being stopped. Check table properties when create and alter statements are prepared. Fixes: #14710. Closes scylladb/scylladb#15091 * github.com:scylladb/scylladb: cql3: statements: delete execute override cql3: statements: call check_restricted_table_properties in prepare cql3: statements: pass data_dictionary::database to check_restricted_table_properties	2023-09-22 09:49:19 +03:00
Michael Huang	a684e51e4d	cql3: fix bad optional access when executing fromJson function Fix fromJson(null) to return null, not an error as it did before this patch. We use "null" as the default value when unwrapping optionals to avoid bad optional access errors. Fixes: scylladb#7912 Signed-off-by: Michael Huang <michaelhly@gmail.com> Closes scylladb/scylladb#15481	2023-09-21 20:18:49 +03:00
Avi Kivity	61440d20c3	Merge 'Enable incremental compaction on off-strategy' from Raphael "Raph" Carvalho Off-strategy suffers from a 100% space overhead, as it adopted a sort of all or nothing approach. Meaning all input sstables, living in maintenance set, are kept alive until they're all reshaped according to the strategy criteria. Input sstables in off-strategy are very likely to be mostly disjoint, so it can greatly benefit from incremental compaction. The incremental compaction approach is not only good for decreasing disk usage, but also memory usage (as metadata of input and output live in memory), and file desc count, which takes memory away from OS. Turns out that this approach also greatly simplifies the off-strategy impl in compaction manager, as it no longer has to maintain new unused sstables and mark them for deletion on failure, and also unlink intermediary sstables used between reshape rounds. Fixes https://github.com/scylladb/scylladb/issues/14992. Closes scylladb/scylladb#15400 * github.com:scylladb/scylladb: test: Verify that off-strategy can do incremental compaction compaction: Clear pending_replacement list when tombstone GC is disabled compaction: Enable incremental compaction on off-strategy compaction: Extend reshape type to allow for incremental compaction compaction: Move reshape_compaction in the source compaction: Enable incremental compaction only if replacer callback is engaged	2023-09-21 20:12:19 +03:00
Avi Kivity	1da6a939fe	Merge 'Track memory usage of S3 object uploads' from Pavel Emelyanov The S3 uploading sink needs to collect buffers internally before sending them out, because the minimal upload-able part size is 5Mb. When the necessary amount of bytes is accumulated, the part uploading fiber starts in the background. On flush the sink waits for all the fibers to complete and handles failure of any. Uploading parallelism is nowadays limited by the means of the http client max-connections parameter. However, when a part uploading fiber waits for its connection it keeps the 5Mb+ buffers on the request's body, so even though the number of uploading parts is limited, the number of _waiting_ parts is effectively not. This PR adds a shard-wide limiter on the number of background buffers S3 clients (and their http clients) may use. Closes scylladb/scylladb#15497 * github.com:scylladb/scylladb: s3::client: Track memory in client uploads code: Configure s3 clients' memory usage s3::client: Construct client with shared semaphore sstables::storage_manager: Introduce config	2023-09-21 18:24:42 +03:00
Raphael S. Carvalho	91efd878d7	test: Verify that off-strategy can do incremental compaction Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-09-21 11:15:46 -03:00
Botond Dénes	3b95f4f107	Merge 'Sanitize view-update-generator start-stop sequence' from Pavel Emelyanov The v.u.g. start stop is now spread over main() code heavily. 1. sharded<v.u.g.>.start() happens early enough to allow depending services register staging sstables on it 2. after the system is "more-or-less" alive the invoke_on_all(v.u.g.::start()) is called (conditionally) to activate the generator background fiber. Not 100% sure why it happens _that_ late, but somehow it's required that while scylla is joining the cluster the generation doesn't happen 3. early on stop the v.u.g. is fully stopped The 3rd step is pretty nasty. It may happen that v.u.g. is not stopped if scylla start aborts before the last action is defer-scheduled. Also, when it happens, it leaves stopping dependencies with non-initialized v.u.g.'s local instances, which is not symmetrical to how they start. That said, this PR fixes the stopping sequence to happen later, i.e. -- being defer-scheduled right after sharded<v.u.g.> is started. Also it makes sure that terminating the background fiber happens as early as it does now. This is done the compaction_manager-style -- the v.u.g. subscribes on stop signal abort source and kicks the fiber to stop when it fires. Closes scylladb/scylladb#15466 * github.com:scylladb/scylladb: view_update_generator: Stop for real later view_update_generator: Add logging to do_abort() view_update_generator: Move abort kicking to do_abort() view_update_generator: Add early abort subscription	2023-09-21 17:01:27 +03:00
Aleksandra Martyniuk	60fdc44bce	cql3: statements: call check_restricted_table_properties in prepare Table properties validation is performed on statement execution. Thus, when one attempts to create a table with invalid options, an incorrect command gets committed in Raft. But then its application fails, leading to a raft machine being stopped. Check table properties when create and alter statements are prepared. The error is no longer returned as an exceptional future, but it is thrown. Adjust the tests accordingly.	2023-09-21 13:21:51 +02:00
Pavel Emelyanov	e34220ebb7	view_update_generator: Add early abort subscription Subscribe v.u.g. to the main's stop_signal. For now a no-op callback. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-21 13:32:45 +03:00
Kefu Chai	c364efb998	utils/s3: auth using AWS_SESSION_TOKEN when accessing AWS resources, users are allowed to use long-term security credentials, they can also use temporary credentials. but if the latter are used, we have to pass a session token along with the keys. see also https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_use-resources.html so, if we want to programmatically get authenticated, we need to set the "x-amz-security-token" header, see https://docs.aws.amazon.com/AmazonS3/latest/userguide/RESTAuthentication.html#UsingTemporarySecurityCredentials so, in this change, we 1. add another member named `token` in `s3::endpoint_config::aws_config` for storing "AWS_SESSION_TOKEN". 2. populate the setting from "object_storage.yaml" and "$AWS_SESSION_TOKEN" environment variable. 3. set "x-amz-security-token" header if `s3::endpoint_config::aws_config::token` is not empty. this should allow us to test s3 client and s3 object store backend with S3 bucket, with the temporary credentials. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#15486	2023-09-21 13:26:11 +03:00
Botond Dénes	f6575344df	Merge 'Collect dangling object-store sstables' from Pavel Emelyanov Sstables in transitional states are marked with the respective 'status' in the registry. Currently there are two such statuses -- 'creating' and 'removing'. And the 'sealed' status for sstables in use. On boot the distributed loader tries to garbage collect the dangling sstables. For filesystem storage it's done with the help of temporary sstables' dirs and pending deletion logs. For s3-backed sstables, the garbage collection means fetching all non-sealed entries and removing the corresponding objects from the storage. Test included (last patch) fixes #13024 Closes scylladb/scylladb#15318 * github.com:scylladb/scylladb: test: Extend object_store test to validate GC works sstable_directory: Garbage collect S3 sstables on reboot sstable_directory: Pass storage to garbage_collect() sstable_directory: Create storage instance too	2023-09-21 09:15:00 +03:00
Pavel Emelyanov	182a5348d4	code: Configure s3 clients' memory usage This sets the real limits on the memory semaphore. - scylla sets it to 1% of total memory, 10Mb min, 100Mb max - tests set it to 16Mb - perf test sets it to all available memory Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-20 17:50:29 +03:00
Pavel Emelyanov	b299757884	s3::client: Construct client with shared semaphore The semaphore will be used to cap memory consumption by client. This patch makes sure the reference to a semaphore exists as an argument to client's constructor, not more than that. In scylla binary, the semaphore sits on storage_manager. In tests the semaphore is some local object. For now the semaphore is unused and is initialized locked as this patch just pushes the needed argument all the way around, next patches will make use of it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-20 17:50:07 +03:00
Pavel Emelyanov	f40b4e3e84	sstables::storage_manager: Introduce config Just an empty config that's fed to storage_manager when constructed as a preparation for further heavier patching Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-09-20 17:42:59 +03:00
Botond Dénes	f2df8cf484	test/topology_custom: add copyright/license blurb to tests Most tests were missing this, fix it.	2023-09-20 10:41:31 -04:00
Botond Dénes	3e5fe6e0a6	test/topology_custom: test_select_from_mutation_fragments.py: use async query api cql.execute_async() can now execute paged queries, use it instead of a blocking API. While at it, clean-up the test: * remove unneeded wait on ring0 settle * address flake8 concerns: - unused imports - unused variables - style	2023-09-20 10:41:31 -04:00
Tomasz Grabiec	3d4398d1b2	Merge 'Don't calculate hashes for schema versions in Raft mode' from Kamil Braun When performing a schema change through group 0, extend the schema mutations with a version that's persisted and then used by the nodes in the cluster in place of the old schema digest, which becomes horribly slow as we perform more and more schema changes (#7620). If the change is a table create or alter, also extend the mutations with a version for this table to be used for `schema::version()`s instead of having each node calculate a hash which is susceptible to bugs (#13957). When performing a schema change in Raft RECOVERY mode we also extend schema mutations which forces nodes to revert to the old way of calculating schema versions when necessary. We can only introduce these extensions if all of the cluster understands them, so protect this code by a new cluster/schema feature, `GROUP0_SCHEMA_VERSIONING`. Fixes: #7620 Fixes: #13957 Closes scylladb/scylladb#15331 * github.com:scylladb/scylladb: test: add test for group 0 schema versioning test/pylib: log_browsing: fix type hint feature_service: enable `GROUP0_SCHEMA_VERSIONING` in Raft mode schema_tables: don't delete `version` cell from `scylla_tables` mutations from group 0 migration_manager: add `committed_by_group0` flag to `system.scylla_tables` mutations schema_tables: use schema version from group 0 if present migration_manager: store `group0_schema_version` in `scylla_local` during schema changes migration_manager: migration_request handler: assume `canonical_mutation` support system_keyspace: make `get/set_scylla_local_param` public feature_service: add `GROUP0_SCHEMA_VERSIONING` feature schema_tables: refactor `scylla_tables(schema_features)` migration_manager: add `std::move` to avoid a copy schema_tables: remove default value for `reload` in `merge_schema` schema_tables: pass `reload` flag when calling `merge_schema` cross-shard system_keyspace: fix outdated comment	2023-09-20 10:43:40 +02:00
Botond Dénes	45dfce6632	Merge 'compaction: change behaviour of compaction task executors' from Aleksandra Martyniuk Compaction tasks executors serve two different purposes - as compaction manager related entity they execute compaction operation and as task manager related entity they track compaction status. When one role depends on the other, as it currently is for compaction_task_impl::done() and compaction_task_executor::compaction_done(), requirements of both roles need to be satisfied at the same time in each corner case. Such complexity leads to bugs. To prevent it, compaction_task_impl::done() of executors no longer depends on compaction_task_executor::compaction_done(). Fixes: #14912. Closes scylladb/scylladb#15140 * github.com:scylladb/scylladb: compaction: warn about compaction_done() compaction: do not run stopped compaction compaction: modify lowest compaction tasks' run method compaction: pass do_throw_if_stopping to compaction_task_executor	2023-09-19 15:15:14 +03:00
Kefu Chai	4b53a70d76	build: cmake: add `tests` target this target mirrors the target named `{mode}e-test` in the `build.ninja` build script created by `configure.py`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#15448	2023-09-19 11:20:02 +03:00
Michael Huang	62a8a31be7	cdc: use chunked_vector for topology_description entries Lists can grow very big. Let's use a chunked vector to prevent large contiguous allocations. Fixes: #15302. Closes scylladb/scylladb#15428	2023-09-18 23:17:01 +03:00
Avi Kivity	ab6988c52f	Merge "auth: do not grant permissions to creator without actually creating" from Wojciech Mitros Currently, when creating the table, permissions may be mistakenly granted to the user even if the table already exists. This can happen in two cases: The query has an IF NOT EXISTS clause - as a result no exception is thrown after encountering the existing table, and the permission granting is not prevented. The query is handled by a non-zero shard - as a result we accept the query with a bounce_to_shard result_message, again without preventing the granting of permissions. These two cases are now avoided by checking the result_message generated when handling the query - now we only grant permissions when the query resulted in a schema_change message. Additionally, a test is added that reproduces both of the mentioned cases. CVE-2023-33972 Fixes #15467. * 'no-grant-on-no-create' of github.com:scylladb/scylladb-ghsa-ww5v-p45p-3vhq: auth: do not grant permissions to creator without actually creating transport: add is_schema_change() method to result_message	2023-09-18 21:47:28 +03:00
Kefu Chai	ece45c9f70	build: cmake: use find_program(.. REQUIRED) when appropriate instead of checking the availability of a required program, let's use the `REQUIRED` argument introduced by CMake 3.18, simpler this way. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#15447	2023-09-18 16:35:46 +03:00
Kefu Chai	054beb6377	tests: tablets: do not compare signed integer with unsigned integer when compiling the tests with -Wsign-compare, the compiler complains like: ``` /home/kefu/.local/bin/clang++ -DBOOST_ALL_DYN_LINK -DBOOST_NO_CXX98_FUNCTION_BASE -DDEBUG -DDEBUG_LSA_SANITIZER -DFMT_DEPRECATED_OSTREAM -DFMT_SHARED -DSANITIZE -DSCYLLA_BUILD_MODE=debug -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=7 -DSEASTAR_BROKEN_SOURCE_LOCATION -DSEASTAR_DEBUG -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_TESTING_MAIN -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/cmake/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/cmake/seastar/gen/include -isystem /home/kefu/dev/scylladb/build/cmake/rust -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-mismatched-tags -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-unused-parameter -Wno-missing-field-initializers -Wno-deprecated-copy -Wno-ignored-qualifiers -march=westmere -Og -g -gz -std=gnu++20 -fvisibility=hidden -U_FORTIFY_SOURCE -DSEASTAR_SSTRING -Wno-error=unused-result "-Wno-error=#warnings" -fstack-clash-protection -fsanitize=address -fsanitize=undefined -fno-sanitize=vptr -MD -MT test/boost/CMakeFiles/tablets_test.dir/tablets_test.cc.o -MF test/boost/CMakeFiles/tablets_test.dir/tablets_test.cc.o.d -o test/boost/CMakeFiles/tablets_test.dir/tablets_test.cc.o -c /home/kefu/dev/scylladb/test/boost/tablets_test.cc /home/kefu/dev/scylladb/test/boost/tablets_test.cc:1335:53: error: comparison of integers of different signs: 'int' and 'size_t' (aka 'unsigned long') [-Werror,-Wsign-compare] for (int log2_tablets = 0; log2_tablets < tablet_count_bits; ++log2_tablets) { ~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~ ``` in this case, it should be safe to use an signed int as the loop variable 
to be compared with `tablet_count_bits`, but let's just appease the compiler so we can enable the warning option project-wide to prevent any potential issues caused by signed-unsigned comparison. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#15449	2023-09-18 13:17:16 +02:00
Kamil Braun	bc6f7d1b20	Merge 'raft topology: add garbage collection for internal CDC generations table' from Patryk Jędrzejczak We add garbage collection for the `CDC_GENERATIONS_V3` table to prevent it from endlessly growing. This mechanism is especially needed because we send the entire contents of `CDC_GENERATIONS_V3` as a part of the group 0 snapshot. The solution is to keep a clean-up candidate, which is one of the already published CDC generations. The CDC generation publisher introduced in #15281 continually uses this candidate to remove all generations with timestamps not exceeding the candidate's and sets a new candidate when needed. We also add `test_cdc_generation_clearing.py` that verifies this new mechanism. Fixes #15323 Closes scylladb/scylladb#15413 * github.com:scylladb/scylladb: test: add test_cdc_generation_clearing raft topology: remove obsolete CDC generations raft topology: set CDC generation clean-up candidate topology_coordinator: refactor publish_oldest_cdc_generation system_keyspace: introduce decode_cdc_generation_id system_keyspace: add cleanup_candidate to CDC_GENERATIONS_V3	2023-09-18 11:30:10 +02:00
Pavel Emelyanov	30959fc9b1	lsa, test: Extend memory footprint test with per-type total sizes When memory footprint test is over it prints total size taken by row cache, memtable and sstables as well as individual objects' sizes. It's also nice to know the details on the row-cache's individual objects. This patch extends the printing with total size of allocated object types according to migrator_fn types. Sample output: mutation footprint: - in cache: 11040928 - in memtable: 9142424 - in sstable: mc: 2160000 md: 2160000 me: 2160000 - frozen: 540 - canonical: 827 - query result: 342 sizeof(cache_entry) = 64 sizeof(memtable_entry) = 64 sizeof(bptree::node) = 288 sizeof(bptree::data) = 72 -- sizeof(decorated_key) = 32 -- sizeof(mutation_partition) = 96 -- -- sizeof(_static_row) = 8 -- -- sizeof(_rows) = 24 -- -- sizeof(_row_tombstones) = 40 sizeof(rows_entry) = 144 sizeof(evictable) = 24 sizeof(deletable_row) = 72 sizeof(row) = 16 radix_tree::inner_node::node_sizes = 48 80 144 272 528 1040 radix_tree::leaf_node::node_sizes = 120 216 416 816 3104 sizeof(atomic_cell_or_collection) = 16 btree::linear_node_size(1) = 24 btree::inner_node_size = 216 btree::leaf_node_size = 120 LSA stats: N18compact_radix_tree4treeI13cell_and_hashjE9leaf_nodeE: 360 N5bplus4dataIl15intrusive_arrayI11cache_entryEN3dht25raw_token_less_comparatorELm16ELNS_10key_searchE0ELNS_10with_debugE0EEE: 5040 N5bplus4nodeIl15intrusive_arrayI11cache_entryEN3dht25raw_token_less_comparatorELm16ELNS_10key_searchE0ELNS_10with_debugE0EEE: 19296 17partition_version: 952416 N11intrusive_b4nodeI10rows_entryXadL_ZNS1_5_linkEEENS1_11tri_compareELm12ELm20ELNS_10key_searchE0ELNS_10with_debugE0EEE: 317472 10rows_entry: 1429056 12blob_storage: 254 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#15434	2023-09-18 11:23:18 +02:00
Kamil Braun	5add0e1734	test: add test for group 0 schema versioning Perform schema changes while mixing nodes in RECOVERY mode with nodes in group 0 mode: - schema changes originating from RECOVERY node use digest-based schema versioning. - schema changes originating from group 0 nodes use persisted versions committed through group 0. Verify that schema versions are in sync after each schema change, and that each schema change results in a different version. Also add a simple upgrade test, performing a schema change before we enable Raft (which also enables the new versioning feature) in the entire cluster, then once upgrade is finished. One important upgrade test is missing, which we should add to dtest: create a cluster in Raft mode but in a Scylla version that doesn't understand GROUP0_SCHEMA_VERSIONING. Then start upgrading to a version that has this patchset. Perform schema changes while the cluster is mixed, both on non-upgraded and on upgraded nodes. Such test is especially important because we're adding a new column to the `system.scylla_local` table (which we then redact from the schema definition when we see that the feature is disabled).	2023-09-15 18:36:11 +02:00
Kamil Braun	52903ef456	test/pylib: log_browsing: fix type hint	2023-09-15 17:58:54 +02:00
Kamil Braun	c2beee348a	feature_service: enable `GROUP0_SCHEMA_VERSIONING` in Raft mode As promised in earlier commits: Fixes: #7620 Fixes: #13957 Also modify two test cases in `schema_change_test` which depend on the digest calculation method in their checks. Details are explained in the comments.	2023-09-15 17:54:36 +02:00
Kamil Braun	4376854473	schema_tables: remove default value for `reload` in `merge_schema` To avoid bugs like the one fixed in the previous commit.	2023-09-15 13:04:04 +02:00
Patryk Jędrzejczak	840e1c5185	test: add test_cdc_generation_clearing We add a test for the new CDC generation garbage collection mechanism.	2023-09-15 09:28:32 +02:00
Petr Gusev	6c3cc7d6e0	test_fence_hints: increase timeouts We saw failures on CI in debug mode, probably the machine running the test is shared, and we starved for some resources. Fix #15285 Closes #15388	2023-09-14 16:22:50 +02:00
Avi Kivity	d9a453e72e	Merge 'Introduce a scylla-native nodetool' from Botond Dénes This series introduces a scylla-native nodetool. It is invokable via the main scylla executable as the other native tools we have. It uses seastar's new `http::client` to connect to the specified node and execute the desired commands. For now a single command is implemented: `nodetool compact`, invokable as `scylla nodetool compact`. Once all the boilerplate is added to create a new tool, implementing a single command is not too bad, in terms of code-bloat. Certainly not as clean as a python implementation would be, but good enough. The advantages of a C++ implementation are that all of us in the core team know C++ and that it is shipped right as part of the scylla executable. Closes #14841 * github.com:scylladb/scylladb: test: add nodetool tests test.py: add ToolTestSuite and ToolTest tools/scylla-nodetool: implement compact operation tools/scylla-nodetool: implement basic scylla_rest_api_client tools: introduce scylla-nodetool utils: export dns_connection_factory from s3/client.cc to http.hh utils/s3/client: pass logger to dns_connection_factory in constructor tools/utils: tool_app_template::run_async(): also detect --help* as --help	2023-09-14 17:20:40 +03:00
Avi Kivity	a3d73bfba7	Merge 'Add support for decommission with tablets' from Tomasz Grabiec Load balancer will recognize decommissioning nodes and will move tablet replicas away from such nodes with highest priority. Topology changes have now an extra step called "tablet draining" which calls the load balancer. The step will execute tablet migration track as long as there are nodes which require draining. It will not do regular load balancing. If load balancer is unable to find new tablet replicas, because RF cannot be met or availability is at risk due to insufficient node distribution in racks, it will throw an exception. Currently, topology change will retry in a loop. We should make this error cause topology change to be aborted. There is no infrastructure for aborts yet, so this is not implemented. Closes #15197 * github.com:scylladb/scylladb: tablets, raft topology: Add support for decommission with tablets tablet_allocator: Compute load sketch lazily tablet_allocator: Set node id correctly tablet_allocator: Make migration_plan a class tablets: Implement cleanup step storage_service, tablets: Prevent stale RPCs from running beyond their stage locator: Introduce tablet_metadata_guard locator, replica: Add a way to wait for table's effective_replication_map change storage_service, tablets: Extract do_tablet_operation() from stream_tablet() raft topology: Add break in the final case clause raft topology: Fix SIGSEGV when trace-level logging is enabled raft topology: Set node state in topology raft topology: Always set host id in topology	2023-09-14 17:16:23 +03:00
Kamil Braun	0564d000c6	Merge 'Validate compaction strategy options' from Aleksandra Martyniuk When a column family's schema is changed, a new compaction strategy type may be applied. To make sure that it will behave as expected, the compaction strategy needs to contain only the allowed options and values. Methods throwing exception on invalid options are added. Fixes: #2336. Closes #13956 * github.com:scylladb/scylladb: test: add test for compaction strategy validation compaction: unify exception messages compaction: cql3: validate options in check_restricted_table_properties compaction: validate options used in different compaction strategies compaction: validate common compaction strategy options compaction: split compaction_strategy_impl constructor compaction: validate size_tiered_compaction_strategy specific options compaction: validate time_window_compaction_strategy specific options compaction: add method to validate min and max threshold compaction: split size_tiered_compaction_strategy_options constructor compaction: make compaction strategy keys static constexpr compaction: use helpers in validate_* functions compaction: split time_window_compaction_strategy_options construtor compaction: add validate method to compaction_strategy_options time_window_compaction_strategy_options: make copy and move-able size_tiered_compaction_strategy_options: make copy and move-able	2023-09-14 16:11:52 +02:00
Tomasz Grabiec	551cc0233d	tablets, raft topology: Add support for decommission with tablets Load balancer will recognize decommissioning nodes and will move tablet replicas away from such nodes with highest priority. Topology changes now have an extra step called "tablet draining" which calls the load balancer. The step will execute tablet migration track as long as there are nodes which require draining. It will not do regular load balancing. If load balancer is unable to find new tablet replicas, because RF cannot be met or availability is at risk due to insufficient node distribution in racks, it will throw an exception. Currently, topology change will retry in a loop. We should make this error cause topology change to be paused so that admin becomes aware of the problem and issues an abort on the topology change. There is no infrastructure for aborts yet, so this is not implemented.	2023-09-14 13:05:49 +02:00
Tomasz Grabiec	389573543e	tablet_allocator: Make migration_plan a class It will be extended with more fields so that load balancer can communicate more information to the coordinator.	2023-09-14 13:04:47 +02:00
Botond Dénes	3e2d8ca94d	test: add nodetool tests Testing the new scylla nodetool tool. The tests can be run against both implementations of nodetool: the scylla-native one and the cassandra one. They all pass with both implementations.	2023-09-14 05:25:14 -04:00
Kamil Braun	bff9cedef9	Merge 'system_keyspace: remove flushes when writing to system tables' from Petr Gusev There are several system tables with strict durability requirements. This means that if we have written to such a table, we want to be sure that the write won't be lost in case of node failure. We currently accomplish this by accompanying each write to these tables with `db.flush()` on all shards. This is expensive, since it causes all the memtables to be written to sstables, which causes a lot of disk writes. This overhead can become painful during node startup, when we write the current boot state to `system.local`/`system.scylla_local` or during topology change, when `update_peer_info`/`update_tokens` write to `system.peers`. In this series we remove flushes on writes to the `system.local`, `system.peers`, `system.scylla_local` and `system.cdc_local` tables and start using schema commitlog for durability. Fixes: #15133 Closes #15279 * github.com:scylladb/scylladb: system_keyspace: switch CDC_LOCAL to schema commitlog system_keyspace: scylla_local: use schema commitlog database.cc: make _uses_schema_commitlog optional system_keyspace: drop load phases database.hh: add_column_family: add readonly parameter schema_tables: merge_tables_and_views: delay events until tables/views are created on all shards system_keyspace: switch system.peers to schema commitlog system_keyspace: switch system.local to schema commitlog main.cc: move schema commitlog replay earlier sstables_format_selector: extract listener sstables_format_selector: wrap when_enabled with seastar::async main.cc: inline and split system_keyspace.setup system_keyspace: refactor save_system_schema function system_keyspace: move initialize_virtual_tables into virtual_tables.hh system_keyspace: remove unused parameter config.cc: drop db::config::host_id main.cc:: extract local_info initialization into function schema.cc: check static_props for sanity system_keyspace: set null sharder when configuring schema commitlog system_keyspace: rename static variables system_keyspace: remove redundant wait_for_sync_to_commitlog	2023-09-14 10:39:20 +02:00

11801 Commits