scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-23 01:50:35 +00:00

Author	SHA1	Message	Date
Pavel Emelyanov	f227f4332c	test: Remove unused path local variable Left after #20499 :( Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#20540	2024-09-11 23:10:25 +03:00
Avi Kivity	ed7d352e7d	Merge 'Validate checksums for uncompressed SSTables' from Nikos Dragazis This PR introduces a new file data source implementation for uncompressed SSTables that will be validating the checksum of each chunk that is being read. Unlike for compressed SSTables, checksum validation for uncompressed SSTables will be active for scrub/validate reads but not for normal user reads to ensure we will not have any performance regression. It consists of: * A new file data source for uncompressed SSTables. * Integration of checksums into SSTable's shareable components. The validation code loads the component on demand and manages its lifecycle with shared pointers. * A new `integrity_check` flag to enable the new file data source for uncompressed SSTables. The flag is currently enabled only through the validation path, i.e., it does not affect normal user reads. * New scrub tests for both compressed and uncompressed SSTables, as well as improvements in the existing ones. * A change in JSON response of `scylla validate-checksums` to report if an uncompressed SSTable cannot be validated due to lack of checksums (no `CRC.db` in `TOC.txt`). Refs #19058. New feature, no backport is needed. 
Closes scylladb/scylladb#20207 * github.com:scylladb/scylladb: test: Add test to validate SSTables with no checksums tools: Fix typo in help message of scylla validate-checksums sstables: Allow validate_checksums() to report missing checksums test: Add test for concurrent scrub/validate operations test: Add scrub/validate tests for uncompressed SSTables test/lib: Add option to create uncompressed random schemas test: Add test for scrub/validate with file-level corruption test: Check validation errors in scrub tests sstables: Enable checksum validation for uncompressed SSTables sstables: Expose integrity option via crawling mutation readers sstables: Expose integrity option via data_consume_rows() sstables: Add option for integrity check in data streams sstables: Remove unused variable sstables: Add checksum in the SSTable components sstables: Introduce checksummed file data source implementation sstables: Replace assert with on_internal_error	2024-09-11 23:09:45 +03:00
Nikos Dragazis	d1152a200f	test: Add test to validate SSTables with no checksums In a previous patch we extended the return status of `sstables::validate_checksums()` to report if an SSTable cannot be validated due to a missing CRC component (i.e., CRC.db does not appear in TOC.txt). Add a test case for this. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>	2024-09-11 13:12:40 +03:00
Nikos Dragazis	5c0a7f706b	sstables: Allow validate_checksums() to report missing checksums Change the return type of `sstable::validate_checksums()` from binary (valid/invalid) to a ternary (valid/invalid/no_checksums). The third status represents uncompressed SSTables without a CRC component (no entry for CRC.db in the TOC). Also, change the JSON response of `sstable validate-checksums` to expose the new status. Replace the boolean value for valid/invalid checksums with an object that contains two boolean keys: one that indicates if the SSTable has checksums, and one that indicates if the checksums are valid or not. The second key is optional and appears only if the SSTable has checksums. Finally, update the documentation to reflect the changes in the API. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>	2024-09-11 13:12:39 +03:00
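The ternary status and the two-key JSON object described in the commit above can be sketched as follows. This is a minimal illustration, not the ScyllaDB implementation: the enum name `checksum_status`, the helper `to_json()` and the key names `has_checksums` and `valid_checksums` are assumptions chosen for the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the valid/invalid/no_checksums status.
enum class checksum_status { valid, invalid, no_checksums };

// Render the per-SSTable result object. Key names are assumptions.
inline std::string to_json(checksum_status s) {
    if (s == checksum_status::no_checksums) {
        // The validity key is optional: it is omitted when the SSTable
        // has no CRC component to validate against.
        return R"({"has_checksums": false})";
    }
    const char* valid = (s == checksum_status::valid) ? "true" : "false";
    return std::string(R"({"has_checksums": true, "valid_checksums": )") + valid + "}";
}
```

A caller that previously tested a single boolean would now first check whether checksums exist and only then read the validity key.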
Nikos Dragazis	5a284f4a9d	test: Add test for concurrent scrub/validate operations Theoretically it is possible to launch more than one scrub instances simultaneously. Since the checksum component is a shared resource, accesses have to be synchronized. Add a test that launches two scrub operations in validate mode and ensures that the checksum component is loaded once, referenced by all scrub instances via shared pointers, and deleted once the scrub operations finish. Introduce an injection point to achieve concurrent execution of scrubs. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>	2024-09-11 13:12:39 +03:00
Nikos Dragazis	e2353f3b3e	test: Add scrub/validate tests for uncompressed SSTables Currently the unit tests check scrub in validate mode against compressed SSTables only. Mirror the tests for uncompressed SSTables as well. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>	2024-09-11 13:12:39 +03:00
Nikos Dragazis	2991b09c8e	test/lib: Add option to create uncompressed random schemas Extend the `random_schema_specification` to support creating both compressed and uncompressed schemas. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>	2024-09-11 13:12:32 +03:00
Nikos Dragazis	4f56c587f6	test: Add test for scrub/validate with file-level corruption Currently, we test scrub/validate only against a corrupted SSTable with content-level corruption (out-of-order partition key). Add a test for file-level corruption as well. This should trigger the checksum check in the underlying compressed file data source implementation. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>	2024-09-11 12:28:59 +03:00
Nikos Dragazis	cc10a5f287	test: Check validation errors in scrub tests Scrub was extended in PR #11074 to report validation errors but the unit tests were not updated. Update the tests to check the validation errors reported by scrub. Validation errors must be zero for valid SSTables and non-zero for invalid SSTables. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>	2024-09-11 12:28:59 +03:00
Botond Dénes	de81388edb	Merge 'commitlog: Handle oversized entries' from Calle Wilund Refs #18161 Yet another approach to dealing with large commitlog submissions. We handle oversize single mutation by adding yet another entry type: fragmented. In this case we only add a fragment (aha) of the data that needs storing into each entry, along with metadata to correlate and reconstruct the full entry on replay. Because these fragmented entries are spread over N segments, we also need to add references from the first segment in a chain to the subsequent ones. These are released once we clear the relevant cf_id count in the base. * This approach has the downside that due to how serialization etc works w.r.t. mutations, we need to create an intermediate buffer to hold the full serialized target entry. This is then incrementally written into entries of < max_mutation_size, successively requesting more segments. On replay, when encountering a fragment chain, the fragment is added to a "state", i.e. a mapping of currently processing frag chains. Once we've found all fragments and concatenated the buffers into a single fragmented one, we can issue a replay callback as usual. Note that a replay caller will need to create and provide such a state object. Old signature replay function remains for tests and such. This approach bumps the file format (docs to come). To ensure "atomicity" we both force synchronization, and should the whole op fail, we restore segment state (rewinding), thus discarding all the data we wrote. 
Closes scylladb/scylladb#19472 * github.com:scylladb/scylladb: commitlog/database: Make some commitlog options updatable + add feature listener features/config: Add feature for fragmented commitlog entries docs: Add entry on commitlog file format v4 commitlog_test: Add more oversized cases commitlog_replayer: Replay segments in order created commitlog_replayer: Use replay state to support fragmented entries commitlog_replayer: coroutinize partly commitlog: Handle oversized entries	2024-09-10 17:15:46 +03:00
Pavel Emelyanov	ac2127a640	test: Call table::make_sstable() directly in compaction test The test in question generates a bunch of table_for_tests objects and creates sstables for each. For that it calls test_env::make_sstable(), but it can be made shorter, by calling table method directly. The hidden goal of this change is to remove the explicit caller of table::dir() method. The latter is going away. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#20451	2024-09-10 10:19:20 +03:00
Botond Dénes	76bb22664a	Merge 'Sanitize open_sstables() helper in compaction test' from Pavel Emelyanov This includes - coroutinization - elimination of unused overload Closes scylladb/scylladb#20456 * github.com:scylladb/scylladb: test: Squash two open_sstables() helper together test: Coroutinize open_sstables() helper	2024-09-10 10:18:33 +03:00
Pavel Emelyanov	42f8d06a17	test: Use correct schema in directory tests with created table Some test cases in sstable_directory_test actually create a table with CQL and then try to manipulate its sstables with the help of sstable_directory. Those tests use an existing local helper that starts sharded<sstable_directory>, and this helper passes a test-local static schema to the sstable_directory constructor. As a result -- the schema of the table that the test case created and the schema that sstable_directory works with are different. They match in the columns layout, which helps the test cases pass, but otherwise are two different schema objects with different IDs. It's more correct to use the table schema for those runs. The fix introduces another helper to start sharded<sstable_directory>, and the older wrapper around cql_test_env becomes unused. Drop it too, so as not to encourage future tests to use it and re-introduce the schema mismatch. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#20499	2024-09-10 09:56:26 +03:00
Pavel Emelyanov	0f48847d02	test: Use shorter with_sstable_directory overload() In sstable directory test there are two of those -- one that works on path, state, env and callback, and the other one that just needs env and callback, getting path from env and assuming state is normal. Two test cases in this test can enjoy the shorter one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#20395	2024-09-09 14:25:24 +03:00
Pavel Emelyanov	2bfbbaffac	test: Use sstables::test_env to make sstables for schema loader test This test calls manager directly, but it's shorter to ask test_env for that Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#20431	2024-09-09 14:22:58 +03:00
Pavel Emelyanov	69a5ec69c4	test: Use table storage options in sstable_directory_test When creating sstables this test allocates temporary local options. That works, because this test doesn't run on object storage, but it's more correct to pick storage options from the table at hand. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#20440	2024-09-05 17:48:25 +03:00
Pavel Emelyanov	1f0db29ef6	test: Remove unused directory semaphore The with_sstable_dir() helper no longer needs one, it used to pass it as argument to sstable_directory constructor, but now the directory doesn't need it (takes semaphore via table object). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#20396	2024-09-05 13:11:35 +03:00
Pavel Emelyanov	a150a63259	test: Squash two open_sstables() helper together One accepts integer generations, another one accepts "generic" ones. The latter is only called by the former, so no sense in keeping it around. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-09-05 09:08:40 +03:00
Pavel Emelyanov	4184c688ea	test: Coroutinize open_sstables() helper Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-09-05 09:08:12 +03:00
Pavel Emelyanov	c03b1e2827	test: Remove unused database argument from make_sstable_for_all_shards() helper Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#20427	2024-09-03 21:36:28 +03:00
Calle Wilund	ad595e4d6a	commitlog_test: Add more oversized cases Also adds some randomization to the tests.	2024-09-03 16:38:28 +00:00
Calle Wilund	05bf2ae5d7	commitlog: Handle oversized entries Refs #18161 Yet another approach to dealing with large commitlog submissions. We handle oversize single mutation by adding yet another entry type: fragmented. In this case we only add a fragment (aha) of the data that needs storing into each entry, along with metadata to correlate and reconstruct the full entry on replay. Because these fragmented entries are spread over N segments, we also need to add references from the first segment in a chain to the subsequent ones. These are released once we clear the relevant cf_id count in the base. * This approach has the downside that due to how serialization etc works w.r.t. mutations, we need to create an intermediate buffer to hold the full serialized target entry. This is then incrementally written into entries of < max_mutation_size, successively requesting more segments. On replay, when encountering a fragment chain, the fragment is added to a "state", i.e. a mapping of currently processing frag chains. Once we've found all fragments and concatenated the buffers into a single fragmented one, we can issue a replay callback as usual. Note that a replay caller will need to create and provide such a state object. Old signature replay function remains for tests and such. This approach bumps the file format (docs to come). To ensure "atomicity" we both force synchronization, and should the whole op fail, we restore segment state (rewinding), thus discarding all the data we wrote. v2: * Improve some bookkeeping, ensure we keep track of segments and flush properly, to get counter correct	2024-09-03 16:38:27 +00:00
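The split-and-reassemble idea in the commit above can be sketched in a few lines. This is a simplified model under stated assumptions, not the commitlog code: `fragment`, `split_entry()` and `replay_state` are hypothetical names, fragments are assumed to arrive in order, and segment references, synchronization and rewinding on failure are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// One piece of an oversized entry plus the metadata needed to correlate it.
struct fragment {
    uint64_t chain_id;   // identifies the chain this fragment belongs to
    size_t total_size;   // size of the full serialized entry
    std::string data;    // this fragment's bytes
};

// Write path: cut the fully serialized entry into pieces of at most max_size bytes.
std::vector<fragment> split_entry(uint64_t chain_id, const std::string& serialized, size_t max_size) {
    if (max_size == 0) {
        throw std::invalid_argument("max_size must be positive");
    }
    std::vector<fragment> out;
    for (size_t pos = 0; pos < serialized.size(); pos += max_size) {
        out.push_back({chain_id, serialized.size(), serialized.substr(pos, max_size)});
    }
    return out;
}

// Replay path: a mapping of the fragment chains currently being processed.
class replay_state {
    std::map<uint64_t, std::string> _pending;
public:
    // Returns the full entry once the last fragment of its chain has been added.
    std::optional<std::string> add(const fragment& f) {
        std::string& buf = _pending[f.chain_id];
        buf += f.data;
        if (buf.size() < f.total_size) {
            return std::nullopt;   // chain still incomplete
        }
        std::string full = std::move(buf);
        _pending.erase(f.chain_id);
        return full;
    }
};
```

The caller owns the `replay_state` object and passes every fragment it encounters to `add()`, issuing the usual replay callback only when a value is returned.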
Pavel Emelyanov	e4bc5470cf	test: Call reusable sst from ka_sst() helper The sstable_mutation_test wants to load pre-existing sstables from resource/ subdir. For that there's reusable_sst() helper on env. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-09-03 14:01:28 +03:00
Botond Dénes	52bed81a1e	Merge 'cql3: add option to not unify bind variables with the same name' from Avi Kivity Bind variables in CQL have two formats: positional (`?`) where a variable is referred to by its relative position in the statement, and named (`:var`), where the user is expected to supply a name->value mapping. In `19a6e69001` we identified the case where a named bind variable appears twice in a query, and collapsed it to a single entry in the statement metadata. Without this, a driver using the named variable syntax cannot disambiguate which variable is referred to. However, it turns out that users can use the positional call form even with the named variable syntax, by using the positional API of the driver. To support this use case, we add a configuration variable to disable the same-variable detection. Because the detection has to happen when the entire statement is visible, we have to supply the configuration to the parser. We call it the `dialect` and pass it from all callers. The alternative would be to add a pre-prepare call similar to fill_prepare_context that rewrites all expressions in a statement to deduplicate variables. A unit test is added. Fixes #15559 This may be useful to users transitioning from Cassandra, so merits a backport. Closes scylladb/scylladb#19493 * github.com:scylladb/scylladb: cql3: add option to not unify bind variables with the same name cql3: introduce dialect infrastructure cql3: prepared_statement_cache: drop cache key default constructor	2024-09-02 08:34:24 +03:00
Pavel Emelyanov	7df43312ac	test: Remove sstable making helpers from table_for_tests All users of it have sstable_test_env at hand (in fact -- they call env method to get table_for_test). And since sstable_test_env already has a bunch of methods to create sstable, the table_for_test wrapper doesn't need to duplicate this code. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#20360	2024-09-01 19:58:15 +03:00
Kefu Chai	e431b90145	test/boost/view_build_test: include used header before this change, when building the test of `view_build_test` with clang-20, we can have following build failure: ``` FAILED: test/boost/CMakeFiles/view_build_test.dir/Debug/view_build_test.cc.o /home/kefu/.local/bin/clang++ -DBOOST_ALL_DYN_LINK -DDEBUG -DDEBUG_LSA_SANITIZER -DFMT_SHARED -DSANITIZE -DSCYLLA_BUILD_MODE=debug -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=7 -DSEASTAR_DEBUG -DSEASTAR_DEBUG_PROMISE -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_SSTRING -DSEASTAR_TESTING_MAIN -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"Debug\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -isystem /home/kefu/dev/scylladb/abseil -isystem /home/kefu/dev/scylladb/build/rust -g -Og -g -gz -std=gnu++23 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb/build=. 
-march=westmere -Xclang -fexperimental-assignment-tracking=disabled -Werror=unused-result -fstack-clash-protection -fsanitize=address -fsanitize=undefined -fno-sanitize=vptr -MD -MT test/boost/CMakeFiles/view_build_test.dir/Debug/view_build_test.cc.o -MF test/boost/CMakeFiles/view_build_test.dir/Debug/view_build_test.cc.o.d -o test/boost/CMakeFiles/view_build_test.dir/Debug/view_build_test.cc.o -c /home/kefu/dev/scylladb/test/boost/view_build_test.cc /home/kefu/dev/scylladb/test/boost/view_build_test.cc:998:5: error: unknown type name 'simple_schema' 998 \| simple_schema ss; \| ^ ``` apparently, `simple_schema`'s declaration is not available in this translation unit. in this change * we include the header where `simple_schema` is defined, so that the build passes with clang-20. * also take this opportunity to reorder the header a little bit, so the testing headers are grouped together. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#20367	2024-09-01 18:58:23 +03:00
Avi Kivity	ea8441dfa3	cql3: add option to not unify bind variables with the same name Bind variables in CQL have two formats: positional (`?`) where a variable is referred to by its relative position in the statement, and named (`:var`), where the user is expected to supply a name->value mapping. In `19a6e69001` we identified the case where a named bind variable appears twice in a query, and collapsed it to a single entry in the statement metadata. Without this, a driver using the named variable syntax cannot disambiguate which variable is referred to. However, it turns out that users can use the positional call form even with the named variable syntax, by using the positional API of the driver. To support this use case, we add a configuration variable to disable the same-variable detection. Because the detection has to happen when the entire statement is visible, we have to supply the configuration to the parser. We call it the `dialect` and pass it from all callers. The alternative would be to add a pre-prepare call similar to fill_prepare_context that rewrites all expressions in a statement to deduplicate variables. A unit test is added. Fixes #15559	2024-09-01 17:27:48 +03:00
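The effect of the option described above on statement metadata can be illustrated with a small model. This is only a sketch: `bind_metadata()` and its `unify_same_name` flag are hypothetical names for this example and do not correspond to the actual parser or configuration identifiers.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// names: the named bind variables of a statement, in order of appearance.
// Returns the bind-variable entries that would be exposed in the metadata.
std::vector<std::string> bind_metadata(const std::vector<std::string>& names, bool unify_same_name) {
    std::vector<std::string> out;
    for (const auto& n : names) {
        // With unification on, a repeated name collapses into its first entry.
        if (unify_same_name && std::find(out.begin(), out.end(), n) != out.end()) {
            continue;
        }
        out.push_back(n);
    }
    return out;
}
```

For a statement using `:a`, `:b`, `:a`, unification yields two entries, which suits name-based binding, while disabling it keeps three entries so that a driver's positional API can supply one value per occurrence.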
Avi Kivity	d69bf4f010	cql3: introduce dialect infrastructure A dialect is a different way to interpret the same CQL statement. Examples: - how duplicate bind variable names are handled (later in this series) - whether `column = NULL` in LWT can return true (as is now) or whether it always returns NULL (as in SQL) Currently, dialect is an empty structure and will be filled in later. It is passed to query_processor methods that also accept a CQL string, and from there to the parser. It is part of the prepared statement cache key, so that if the dialect is changed online, previous parses of the statement are ignored and the statement is prepared again. The patch is careful to pick up the dialect at the entry point (e.g. CQL protocol server) so that the dialect doesn't change while a statement is parsed, prepared, and cached.	2024-08-29 21:19:23 +03:00
Avi Kivity	67b24859bc	Merge 'generic_server: convert connection tracking to seastar::gate' from Laszlo Ersek ~~~ generic_server: convert connection tracking to seastar::gate If we call server::stop() right after "server" construction, it hangs: With the server never listening (never accepting connections and never serving connections), nothing ever calls server::maybe_stop(). Consequently, co_await _all_connections_stopped.get_future(); at the end of server::stop() deadlocks. Such a server::stop() call does occur in controller::do_start_server() [transport/controller.cc], when - cserver->start() (sharded<cql_server>::start()) constructs a "server"-derived object, - start_listening_on_tcp_sockets() throws an exception before reaching listen_on_all_shards() (for example because it fails to set up client encryption -- certificate file is inaccessible etc.), - the "deferred_action" cserver->stop().get(); is invoked during cleanup. (The cserver->stop() call exposing the connection tracking problem dates back to commit `ae4d5a60ca` ("transport::controller: Shut down distributed object on startup exception", 2020-11-25), and it's been triggerable through the above code path since commit `6b178f9a4a` ("transport/controller: split configuring sockets into separate functions", 2024-02-05).) Tracking live connections and connection acceptances seems like a good fit for "seastar::gate", so rewrite the tracking with that. "seastar::gate" can be closed (and the returned future can be waited for) without anyone ever having entered the gate. NOTE: this change makes it quite clear that neither server::stop() nor server::shutdown() must be called multiple times. The permitted sequences are: - server::shutdown() + server::stop() - or just server::stop(). Fixes #10305 Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com> ~~~ Fixes #10305. 
I think we might want to backport this -- it fixes a hang-on-misconfiguration which affects `scylla-6.1.0-0.20240804.abbf0b24a60c.x86_64` minimally. Basically every release that contains commit `ae4d5a60ca` has a theoretical chance for the hang, and every release that contains commit `6b178f9a4a` has a practical chance for the hang. Focusing on the more practical symptom (i.e., releases containing commit `6b178f9a4a`), `git tag --contains 6b178f9a4a90` gives us (ignoring candidates and release candidates): - scylla-6.0.0 - scylla-6.0.1 - scylla-6.0.2 - scylla-6.1.0 Closes scylladb/scylladb#20212 * github.com:scylladb/scylladb: generic_server: make server::stop() idempotent generic_server: coroutinize server::shutdown() generic_server: make server::shutdown() idempotent test/generic_server: add test case configure, cmake: sort the lists of boost unit tests generic_server: convert connection tracking to seastar::gate	2024-08-29 19:45:48 +03:00
Avi Kivity	7da3314deb	Merge 'Integrated restore' from Ernest Zaslavsky Handed over from https://github.com/scylladb/scylladb/pull/20149 This adds minimal implementation of the start-restore API call. The method starts a task that runs load-and-stream functionality against sstables from S3 bucket. Arguments are: ``` endpoint -- the ID in object_store.yaml config file bucket -- the target bucket to get objects from keyspace -- the keyspace to work on table -- the table to work on snapshot -- the name of the snapshot from which the backup was taken ``` The task runs in the background, its task_id is returned from the method once it's spawned and it should be used via /task_manager API to track the task execution and completion. Remote sstables components are scanned as if they were placed in local upload/ directory. Then the collected sstables are fed into load-and-stream. This branch has https://github.com/scylladb/scylladb/pull/19890 (Integrated backup), https://github.com/scylladb/scylladb/pull/20120 (S3 lister) and few more minor PRs merged in. The restore branch itself starts with [utils: Introduce abstract (directory) lister](`29c867b54d`) commit. refs: https://github.com/scylladb/scylladb/issues/18392 Closes scylladb/scylladb#20305 * github.com:scylladb/scylladb: tools/scylla-nodetool: add restore integration test/object_store: Add simple restore test test/object_store: Generalize prepare_snapshot_for_backup() code: Introduce restore API method sstable_loader: Add sstables::storage_manager dependency sstable_loader: Maintain task manager module sstable_loader: Out-line constructor distributed_loader: Split get_sstables_from_upload_dir() sstables/storage: Compose uploaded sstable path simpler sstable_directory: Prepare FS lister to scan files on S3 sstable_directory: Parse sstable component without full path s3-client: Add support for lister::filter utils: Introduce abstract (directory) lister	2024-08-29 18:25:30 +03:00
Patryk Jędrzejczak	ed55261650	treewide: distinguish all nodes from all token owners In one of the following patches, we introduce support for zero-token nodes. From that point, getting all nodes and getting all token owners isn't equivalent. In this patch, we ensure that we consider only token owners when we want to consider only token owners (for example, in the replication logic), and we consider all nodes when we want to consider all nodes (for example, in the topology logic). The main purpose of this patch is to make the PR introducing zero-token nodes easier to review. The patch that introduces zero-token nodes is already complicated. We don't want trivial changes from this patch to make noise there. This patch introduces changes needed for zero-token nodes only in the Raft-based topology and in the recovery mode. Zero-token nodes are unsupported in the gossip-based topology outside recovery. Some functions added to `token_metadata` and `topology` are inefficient because they compute a new data structure in every call. They are never called in the hot path, so it's not a serious problem. Nevertheless, we should improve it somehow. Note that it's not obvious how to do it because we don't want to make `token_metadata` store topology-related data. Similarly, we don't want to make `topology` store token-related data. We can think of an improvement in a follow-up. We don't remove unused `topology::get_datacenter_rack_nodes` and `topology::get_datacenter_nodes`. These function can be useful in the future. Also, `topology::_dc_nodes` is used internally in `topology`.	2024-08-29 10:37:07 +02:00
Patryk Jędrzejczak	c7016dedb3	locator: topology: add_or_update_endpoint: use none as the default node state In one of the following patches, we change the gossiper to work the same for zero-token nodes and token-owning nodes. We replace occurrences of `is_normal_token_owner` with topology-based conditions. We want to rely on the invariant that token-owning nodes own tokens if and only if they are in the normal or leaving state. However, this invariant can be broken in the gossip-based topology when a new node joins the cluster. When a bootstrapping node starts gossiping, other nodes add it to their topology in `storage_service::on_alive`. Surprisingly, the state of the new node is set to `normal`, as it's the default value used by `add_or_update_endpoint`. Later, the state will be set to `bootstrapping` or `replacing`, and finally it will be set again to `normal` when the join operation finishes. We fix this strange behavior by setting the node state to `none` in `storage_service::on_alive` for nodes not present in the topology. Note that we must add such nodes to the topology. Other code needs their Host ID, IP, and location. We change the default node state from `normal` to `none` in `add_or_update_endpoint` to prevent bugs like the one in `storage_service::on_alive`. Also, we ensure that nodes in the `none` state are ignored in the getters of `locator::topology`.	2024-08-29 10:37:07 +02:00
Patryk Jędrzejczak	6adaf85634	test: boost: tablets tests: ensure all nodes are normal token owners In one of the following patches, we make NetworkTopologyStrategy and the tablet load balancer consider only normal token owners to ensure they ignore zero-token nodes. Some unit tests would start failing after this change because they do not ensure that all nodes are normal token owners. This patch prevents it. Judging by the logic in the test cases in `network_topology_strategy_test`, `point++` was probably intended anyway.	2024-08-29 10:37:07 +02:00
Laszlo Ersek	dbc0ca6354	test/generic_server: add test case Check whether we can stop a generic server without first asking it to listen. The test fails currently; the failure mode is a hang, which triggers the 5 minute timeout set in the test: > unknown location(0): fatal error: in "stop_without_listening": > seastar::timed_out_error: timedout > seastar/src/testing/seastar_test.cc(43): last checkpoint > test/boost/generic_server_test.cc(34): Leaving test case > "stop_without_listening"; testing time: 300097447us Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>	2024-08-28 10:59:44 +02:00
Laszlo Ersek	931f2f8d73	configure, cmake: sort the lists of boost unit tests Both lists were obviously meant to be sorted originally, but by today we've introduced many instances of disorder -- thus, inserting a new test in the proper place leaves the developer scratching their head. Sort both lists. Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>	2024-08-28 10:59:44 +02:00
Pavel Emelyanov	86bc5b11fe	s3-client: Add support for lister::filter Directory lister comes with a filter function that tells lister which entries to skip by its .get() method. For uniformity, add the same to S3 bucket_lister. After this change the lister reports shorter name in the returned directory entry (with the prefix cut), so also need to tune up the unit test respectively. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-08-27 16:15:40 +03:00
Botond Dénes	5c0f6d4613	Merge 'Make Summary support histogram with infinite bucket values' from Amnon Heiman This series fixes an issue where histogram Summaries return an infinite value. It updates the quantile calculation logic to address cases where values fall into the infinite bucket of a histogram. Now, instead of returning infinite (max int), the calculation will return the last bucket limit, ensuring finite outputs in all cases. The series adds a test for summaries with a specific test case for this scenario. Fixes #20255 Need backport to 6.0, 6.1 and 2023.1 and above Closes scylladb/scylladb#20257 * github.com:scylladb/scylladb: test/estimated_histogram_test Add summary tests utils/histogram.hh: Make summary support infinite bucket.	2024-08-27 10:33:54 +03:00
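The clamping behaviour described above can be sketched with a plain bucket list. This is an illustrative model, not the code in utils/histogram.hh: `bucket` and `quantile()` are hypothetical names, counts are assumed to be per-bucket (not cumulative), and the last bucket is assumed to carry an infinite upper bound.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>
#include <vector>

struct bucket {
    double upper_bound;   // inclusive upper limit; +infinity for the overflow bucket
    uint64_t count;       // number of samples that fell into this bucket
};

// Returns the upper limit of the bucket that contains the q-quantile.
// If the quantile falls into the infinite bucket, the last finite limit
// is returned instead, so the result is always finite.
double quantile(const std::vector<bucket>& buckets, double q) {
    uint64_t total = 0;
    for (const auto& b : buckets) {
        total += b.count;
    }
    if (total == 0) {
        return 0.0;
    }
    const double target = q * static_cast<double>(total);
    uint64_t seen = 0;
    double last_finite = 0.0;
    for (const auto& b : buckets) {
        if (std::isinf(b.upper_bound)) {
            return last_finite;   // clamp instead of reporting infinity
        }
        last_finite = b.upper_bound;
        seen += b.count;
        if (static_cast<double>(seen) >= target) {
            return b.upper_bound;
        }
    }
    return last_finite;
}
```

With buckets of limits 1, 10 and infinity holding 1, 1 and 8 samples, the 0.99 quantile lands in the overflow bucket and the sketch reports 10 rather than infinity.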
Avi Kivity	0acfa4a00d	Merge 'abstract_replication_strategy: make get_ranges async' from Benny Halevy To prevent stalls due to large number of tokens. For example, large cluster with say 70 nodes can have more than 16K tokens. Fixes #19757 Closes scylladb/scylladb#19758 * github.com:scylladb/scylladb: abstract_replication_strategy: make get_ranges async database: get_keyspace_local_ranges: get vnode_effective_replication_map_ptr param compaction: task_manager_module: open code maybe_get_keyspace_local_ranges alternator: ttl: token_ranges_owned_by_this_shard: let caller make the ranges_holder alternator: ttl: can pass const gms::gossiper& to ranges_holder alternator: ttl: ranges_holder_primary: unconstify _token_ranges member alternator: ttl: refactor token_ranges_owned_by_this_shard	2024-08-26 16:56:18 +03:00
Avi Kivity	72a85e3812	Merge 'Integrated backup' from Pavel Emelyanov This adds minimal implementation of the start-backup API call. The method starts a task that uploads all files from the given keyspace's snapshot to the requested endpoint/bucket. Arguments are: - endpoint -- the ID in object_store.yaml config file - bucket -- the target bucket to put objects into - keyspace -- the keyspace to work on - snapshot -- the method assumes that the snapshot had been already taken and only copies sstables from it The task runs in the background, its task_id is returned from the method once it's spawned and it should be used via /task_manager API to track the task execution and completion (hint: it's good to have non-zero TTL value to make sure fast backups don't finish before the caller manages to call wait_task API). Sstables components are scanned for all tables in the keyspace and are uploaded into the /bucket/${cf_name}/${snapshot_name}/ path. refs: #18391 Closes scylladb/scylladb#19890 * github.com:scylladb/scylladb: tools/scylla-nodetool: add backup integration docs: Document the new backup method test/object_store: Test that backup task is abortable test/object_store: Add simple backup test test/object_store: Move format_tuples() test/pylib: Add more methods to rest client backup-task: Make it abortable (almost) code: Introduce backup API method database: Export parse_table_directory_name() helper database: Introduce format_table_directory_name() helper snapshot-ctl: Add config to snapshot_ctl snapshot-ctl: Add sstables::storage_manager dependency snapshot-ctl: Maintain task manager module snapshot-ctl: Add "snapshots" logger snapshot-ctl: Outline stop() method and constructor snapshot-ctl: Inline run_snapshot_list<> test/cql_test_env: Export task manager from cql test env task_manager: Print task ttl on start (for debugging) docs: Update object_storage.md with AWS_ environment docs: Restructure object_storage.md	2024-08-25 20:19:10 +03:00
Benny Halevy	686a8f2939	abstract_replication_strategy: make get_ranges async To prevent stalls due to a large number of tokens. For example, a large cluster with, say, 70 nodes can have more than 16K tokens. Fixes #19757 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-08-25 10:57:34 +03:00
Benny Halevy	2bbbe2a8bc	database: get_keyspace_local_ranges: get vnode_effective_replication_map_ptr param Prepare for making the function async. Then, it will need to hold on to the erm while getting the token_ranges asynchronously. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-08-25 10:55:33 +03:00
Botond Dénes	15fdc3f6cc	Merge 'Add ability to list S3 bucket contents' from Pavel Emelyanov This is a prerequisite for the "restore from object storage" feature. In order to collect the sstables in a bucket, one needs to list the bucket contents with the given prefix. ListObjectsV2 provides a way to do that, and here's the respective s3::client extension. Closes scylladb/scylladb#20120 * github.com:scylladb/scylladb: test: Add test for s3::client::bucket_lister s3_client: Add bucket lister s3_client: Encode query parameter value for query-string	2024-08-23 10:16:07 +03:00
Kefu Chai	ee19bbed05	test: do not define boost_test_print_type() for types with operator<< in `30e82a81`, we added a constraint to the template parameter of boost_test_print_type() to prevent it from being matched with types which can be formatted with operator<<. but it failed to work. we still have test failure reports like: ``` [Exception] - critical check ['s', 's', 't', '_', 'm', 'r', '.', 'i', 's', '_', 'e', 'n', 'd', '_', 'o', 'f', '_', 's', 't', 'r', 'e', 'a', 'm', '(', ')'] has failed ``` this is not what we expect. the reason is that we passed the template parameters to the `has_left_shift` trait in the wrong order, see https://live.boost.org/doc/libs/1_83_0/libs/type_traits/doc/html/boost_typetraits/reference/has_left_shift.html. we should have passed the lhs of the operator<< expression as the first parameter, and the rhs as the second. so, in this change, we correct the type constraint by passing the template parameters in the right order. now the error message looks better, like: ``` test/boost/mutation_query_test.cc(110): error: in "test_partition_query_is_full": check !partition_slice_builder(*s) .with_range({}) .build() .is_full() has failed ``` it turns out boost::transformed_range<> is formattable with operator<<, as it fulfills the constraints of `boost::has_left_shift<ostream, R>`, but when printing it, the compiler fails when it tries to insert the elements in the range to the output stream. so, in order to work around this issue, we add a specialization for `boost::transformed_range<F, R>`. also, to improve the readability, we reimplement `has_left_shift<>` as a concept, so that it's obvious that we need to put the output stream as the first parameter. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#20233	2024-08-23 09:26:22 +03:00
Amnon Heiman	644e6f0121	test/estimated_histogram_test Add summary tests This patch adds tests for summary calculation. It adds two tests. The first is a basic calculation of P50, P95, and P99 by adding 100 elements into 20 buckets. The second test checks that if elements are found in the infinite bucket, the result is the lower limit (33s) and not infinity. Relates to #20255 Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2024-08-22 23:34:24 +03:00
Kefu Chai	39dd088374	test: include used headers before this change, clang 20 fails to build the tree, like: ``` /home/kefu/.local/bin/clang++ -DBOOST_ALL_DYN_LINK -DDEBUG -DDEBUG_LSA_SANITIZER -DFMT_SHARED -DSANITIZE -DSCYLLA_BUILD_MODE=debug -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=7 -DSEASTAR_DEBUG -DSEASTAR_DEBUG_PROMISE -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_SSTRING -DSEASTAR_TESTING_MAIN -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"Debug\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -isystem /home/kefu/dev/scylladb/abseil -isystem /home/kefu/dev/scylladb/build/rust -g -Og -g -gz -std=gnu++23 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. 
-march=westmere -Xclang -fexperimental-assignment-tracking=disabled -U_FORTIFY_SOURCE -Werror=unused-result -fstack-clash-protection -fsanitize=address -fsanitize=undefined -fno-sanitize=vptr -MD -MT test/boost/CMakeFiles/database_test.dir/Debug/database_test.cc.o -MF test/boost/CMakeFiles/database_test.dir/Debug/database_test.cc.o.d -o test/boost/CMakeFiles/database_test.dir/Debug/database_test.cc.o -c /home/kefu/dev/scylladb/test/boost/database_test.cc /home/kefu/dev/scylladb/test/boost/database_test.cc:539:29: error: invalid use of incomplete type 'schema_builder' 539 \| return *schema_builder(ks_name, cf_name) \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /home/kefu/dev/scylladb/schema/schema.hh:115:7: note: forward declaration of 'schema_builder' 115 \| class schema_builder; \| ^ ``` and ``` /home/kefu/.local/bin/clang++ -DBOOST_ALL_DYN_LINK -DDEBUG -DDEBUG_LSA_SANITIZER -DFMT_SHARED -DSANITIZE -DSCYLLA_BUILD_MODE=debug -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=7 -DSEASTAR_DEBUG -DSEASTAR_DEBUG_PROMISE -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_SSTRING -DSEASTAR_TESTING_MAIN -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"Debug\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -isystem /home/kefu/dev/scylladb/abseil -isystem /home/kefu/dev/scylladb/build/rust -g -Og -g -gz -std=gnu++23 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. 
-march=westmere -Xclang -fexperimental-assignment-tracking=disabled -U_FORTIFY_SOURCE -Werror=unused-result -fstack-clash-protection -fsanitize=address -fsanitize=undefined -fno-sanitize=vptr -MD -MT test/boost/CMakeFiles/group0_cmd_merge_test.dir/Debug/group0_cmd_merge_test.cc.o -MF test/boost/CMakeFiles/group0_cmd_merge_test.dir/Debug/group0_cmd_merge_test.cc.o.d -o test/boost/CMakeFiles/group0_cmd_merge_test.dir/Debug/group0_cmd_merge_test.cc.o -c /home/kefu/dev/scylladb/test/boost/group0_cmd_merge_test.cc /home/kefu/dev/scylladb/test/boost/group0_cmd_merge_test.cc:78:18: error: member access into incomplete type 'db::config' 78 \| cfg.db_config->commitlog_segment_size_in_mb(1); \| ^ /home/kefu/dev/scylladb/data_dictionary/data_dictionary.hh:28:7: note: forward declaration of 'db::config' 28 \| class config; \| ^ 1 error generated. ``` and ``` `FAILED: test/boost/CMakeFiles/repair_test.dir/Debug/repair_test.cc.o /home/kefu/.local/bin/clang++ -DBOOST_ALL_DYN_LINK -DDEBUG -DDEBUG_LSA_SANITIZER -DFMT_SHARED -DSANITIZE -DSCYLLA_BUILD_MODE=debug -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=7 -DSEASTAR_DEBUG -DSEASTAR_DEBUG_PROMISE -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_SSTRING -DSEASTAR_TESTING_MAIN -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"Debug\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -isystem /home/kefu/dev/scylladb/abseil -isystem /home/kefu/dev/scylladb/build/rust -g -Og -g -gz -std=gnu++23 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual 
-Wno-unsupported-friend -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. -march=westmere -Xclang -fexperimental-assignment-tracking=disabled -U_FORTIFY_SOURCE -Werror=unused-result -fstack-clash-protection -fsanitize=address -fsanitize=undefined -fno-sanitize=vptr -MD -MT test/boost/CMakeFiles/repair_test.dir/Debug/repair_test.cc.o -MF test/boost/CMakeFiles/repair_test.dir/Debug/repair_test.cc.o.d -o test/boost/CMakeFiles/repair_test.dir/Debug/repair_test.cc.o -c /home/kefu/dev/scylladb/test/boost/repair_test.cc /home/kefu/dev/scylladb/test/boost/repair_test.cc:149:45: error: use of undeclared identifier 'global_schema_ptr' 149 \| co_await e.db().invoke_on_all([gs = global_schema_ptr(gen.schema())](replica::database& db) -> future<> { \| ^ /home/kefu/dev/scylladb/test/boost/repair_test.cc:150:62: error: use of undeclared identifier 'gs' 150 \| co_await db.add_column_family_and_make_directory(gs.get(), replica::database::is_new_cf::yes); \| ^ 2 errors generated. ``` because we are using incomplete types when their complete definitions are required. so, in this change, we include the headers for their complete definition. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#20239	2024-08-22 20:51:38 +03:00
Pavel Emelyanov	dff51fd58c	snapshot-ctl: Add config to snapshot_ctl Pretty much all services in Scylla have their own config. Add one to snapshot-ctl too, it will be populated later. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-08-22 14:57:20 +03:00
Pavel Emelyanov	f37857e20a	snapshot-ctl: Add sstables::storage_manager dependency The storage_manager maintains set of clients to configured object storage(s). The snapshot ctl is going to spawn tasks that will talk to those storages, thus it needs the storage manager to get the clients from. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-08-22 14:08:21 +03:00
Pavel Emelyanov	362331c89b	snapshot-ctl: Maintain task manager module This service is going to start tasks managed by task manager. For that, it should have its module set up and registered. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-08-22 14:08:21 +03:00
Benny Halevy	f40d06b766	table: calculate_tablet_count: use sg_manager storage_groups size Now, when each shard's storage_group_manager keeps only the storage_groups for the tablet replicas it owns, we can simply return the storage_group map size instead of counting the number of tablet replicas mapped to this shard. Add a unit test that sums the tablet count on all shards and tests that the sum is equal to the configured default `initial_tablets`. Fixes #18909 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#20223	2024-08-21 11:01:58 +02:00
Tomasz Grabiec	c1de4859d8	Merge 'tablets: Fix race between repair and split' from Raphael "Raph" Carvalho Consider the following: ``` T 0 split prepare starts 1 repair starts 2 split prepare finishes 3 repair adds unsplit sstables 4 repair ends 5 split executes ``` If repair produces an sstable after the split prepare phase, the replica will not split that sstable later, as the prepare phase is considered completed already. That causes split execution to fail, as the replicas weren't really prepared. This can also be triggered with load-and-stream, which shares the same write (consumer) path. The approach to fix this is the same one employed to prevent a race between split and migration. If migration happens during the prepare phase, the source may miss the split request, but the tablet will still be split on the destination (if needed). Similarly, the repair writer becomes responsible for splitting the data if the underlying table is in split mode. That's implemented in replica::table for correctness, so if the node crashes, the new sstable missing the split is still split before being added to the set. Fixes #19378. Fixes #19416. Closes scylladb/scylladb#19427 * github.com:scylladb/scylladb: tablets: Fix race between repair and split compaction: Allow "offline" sstable to be split	2024-08-19 14:44:28 +02:00

3380 Commits