scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-09 00:13:31 +00:00

Author	SHA1	Message	Date
Kamil Braun	cdc3cd2b79	Merge 'raft: add fencing tests' from Petr Gusev In this PR a simple test for fencing is added. It exercises the data plane, meaning if it somehow happens that the node has a stale topology version, then requests from this node will get an error 'stale topology'. The test just decrements the node version manually through CQL, so it's quite artificial. To test a more real-world scenario we need to allow the topology change fiber to sometimes skip unavailable nodes. Now the algorithm fails and retries indefinitely in this case. The PR also adds some logs, and removes one seemingly redundant topology version increment, see the commit messages for details. Closes #14901 * github.com:scylladb/scylladb: test_fencing: add test_fence_hints test.py: output the skipped tests test.py: add skip_mode decorator and fixture test.py: add mode fixture hints: add debug log for dropped hints hints: send_one_hint: extend the scope of file_send_gate holder pylib: add ScyllaMetrics hints manager: add send_errors counter token_metadata: add debug logs fencing: add simple data plane test random_tables.py: add counter column type raft topology: don't increment version when transitioning to node_state::normal	2023-08-22 16:28:21 +02:00
Piotr Grabowski	17e3e367ca	test: use more frequent reconnection policy The default reconnection policy in Python Driver is an exponential backoff (with jitter) policy, which starts at a 1-second reconnection interval and ramps up to 600 seconds. This is a problem in tests (refs #15104), especially in tests that restart or replace nodes. In such a scenario, a node can be unavailable for an extended period of time and the driver will try to reconnect to it multiple times, eventually reaching very long reconnection interval values, exceeding the timeout of a test. Fix the issue by using an exponential reconnection policy with a maximum interval of 4 seconds. A smaller value was not chosen, as each retry clutters the logs with a reconnection exception stack trace. Fixes #15104 Closes #15112	2023-08-22 15:40:39 +02:00
Avi Kivity	d944872d19	Merge 'Prevent reactor stalls in to_repair_rows_list' from Benny Halevy This short series deals with two stall sources in row-level repair `to_repair_rows_list`: 1. Freeing the input `repair_rows_on_wire` in one shot on return (as seen in https://github.com/scylladb/scylladb/issues/14537) 2. Freeing the result `row_list` in one shot on error. This hasn't been seen in testing but I have no reason to believe it is not susceptible to stalls exactly like repair_rows_on_wire with the same number of rows and mutations. Fixes https://github.com/scylladb/scylladb/issues/14537 Closes #15102 * github.com:scylladb/scylladb: repair: reindent to_repair_rows_list repair: to_repair_rows_list: clear_gently on error repair: to_repair_rows_list: consume frozen rows gently	2023-08-22 15:29:37 +03:00
Petr Gusev	1ddc76ffd1	test_fencing: add test_fence_hints The test makes a write through the first node with the third node down, this causes a hint to be stored on the first node for the second. We increment the version and fence_version on the third node, restart it, and expect to see a hint delivery failure because of a version mismatch. Then we update the versions of the first node and expect the hint to be successfully delivered.	2023-08-22 15:48:40 +04:00
Petr Gusev	3ccd2abad4	test.py: output the skipped tests The pytest option -rs forces it to print all the skipped tests along with the reasons. Without this option we can't tell why certain tests were skipped; maybe some of them no longer should be.	2023-08-22 15:48:40 +04:00
Petr Gusev	c434d26b36	test.py: add skip_mode decorator and fixture Syntactic sugar for marking tests to be skipped in a particular mode. There is skip_in_debug/skip_in_release in suite.yaml, but they can be applied only to the entire file, which is unnatural and inconvenient. Also, they don't allow specifying a reason why the test is skipped. A separate dictionary skipped_funcs is needed since we can't use pytest fixtures in decorators.	2023-08-22 15:48:40 +04:00
Petr Gusev	a639d161e6	test.py: add mode fixture Sometimes a test wants to know what mode it is running in so that e.g. it can skip itself in some of them.	2023-08-22 15:48:40 +04:00
Petr Gusev	439c91851f	hints: add debug log for dropped hints Dropping data is a rather important event, let's log it at least at the debug level. It'll help in debugging tests.	2023-08-22 15:48:40 +04:00
Petr Gusev	9fd3df13a2	hints: send_one_hint: extend the scope of file_send_gate holder The problem was that the holder in with_gate call was released too early. This happened before the possible call to on_hint_send_failure in then_wrapped. As a result, the effects of on_hint_send_failure (segment_replay_failed flag) were not visible in send_one_file after ctx_ptr->file_send_gate.close(), so we could decide that the segment was sent in full and delete it even if sending of some hints led to errors. Fixes #15110	2023-08-22 15:48:40 +04:00
Petr Gusev	0b7a90dff6	pylib: add ScyllaMetrics This patch adds facilities to work with Scylla metrics from test.py tests. The new metrics property was added to ManagerClient, its query method sends a request to the Scylla metrics endpoint and returns an object to conveniently access the result. ScyllaMetrics is copy-pasted from test_shedding.py. It's difficult to reuse code between 'new' and 'old' styles of tests, we can't just import pylib in 'old' tests because of some problems with python search directories. A past commit of mine that attempted to solve this problem was rejected on review.	2023-08-22 14:31:04 +04:00
Petr Gusev	1b7603af23	hints manager: add send_errors counter There was no indication of problems in the hints manager metrics before. We need this counter for fencing tests in a later commit, but it seems to be useful on its own.	2023-08-22 14:31:04 +04:00
Petr Gusev	fa25e6d63e	token_metadata: add debug logs We log the new version when the new token metadata is set. Also, the log for fence_version is moved in shared_token_metadata from storage_service for uniformity.	2023-08-22 14:31:04 +04:00
Petr Gusev	360453fd87	fencing: add simple data plane test The test starts a three node cluster and manually decrements the version on the last node. It then tries to write some data through the last node and expects to get 'stale topology' exception.	2023-08-22 14:31:01 +04:00
Benny Halevy	758dc252ff	repair: reindent to_repair_rows_list	2023-08-22 08:46:26 +03:00
Benny Halevy	7406e9f99b	repair: to_repair_rows_list: clear_gently on error Prevent destroying potentially large `rows` and `row_list` in one shot on error, as it might cause a reactor stall. Instead, use utils::clear_gently on the error return path. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-22 08:45:59 +03:00
Benny Halevy	e55143148f	repair: to_repair_rows_list: consume frozen rows gently Although to_repair_rows_list may yield if needed between rows and mutation fragments, the input `repair_rows_on_wire` is freed in one shot and that may cause stalls as seen in qa: ``` \| bytes_ostream::free_chain at ././bytes_ostream.hh:163 ++ - addr=0x4103be0: \| bytes_ostream::~bytes_ostream at ././bytes_ostream.hh:199 \| (inlined by) frozen_mutation_fragment::~frozen_mutation_fragment at ././mutation/frozen_mutation.hh:273 \| (inlined by) std::destroy_at<frozen_mutation_fragment> at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/stl_construct.h:88 \| (inlined by) ?? at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/alloc_traits.h:537 \| (inlined by) ?? at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/list.tcc:77 \| (inlined by) std::__cxx11::_List_base<frozen_mutation_fragment, std::allocator<frozen_mutation_fragment> >::~_List_base at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/stl_list.h:575 \| (inlined by) partition_key_and_mutation_fragments::~partition_key_and_mutation_fragments at ././repair/repair.hh:203 \| (inlined by) std::destroy_at<partition_key_and_mutation_fragments> at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/stl_construct.h:88 \| (inlined by) ?? at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/alloc_traits.h:537 \| (inlined by) ?? 
at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/list.tcc:77 \| (inlined by) std::__cxx11::_List_base<partition_key_and_mutation_fragments, std::allocator<partition_key_and_mutation_fragments> >::~_List_base at /usr/bin/../lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/stl_list.h:575 \| (inlined by) to_repair_rows_list at ./repair/row_level.cc:597 ``` This change consumes the rows and frozen mutation fragments incrementally, freeing each after being processed. Fixes scylladb/scylladb#14537 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-22 08:45:54 +03:00
Nadav Har'El	a963b59495	test/cql-pytest: add reproducer for IN not working with secondary index We already have a test for issue #13533, where an "IN" doesn't work with a secondary index (the secondary index isn't used in that case, and instead inefficient filtering is required). Recently a user noticed the same problem also exists for local secondary indexes - and this patch includes a reproducing test. The new test is marked xfail, as the issue is still unfixed. The new test is Scylla-only because local secondary index is a Scylla-only extension that doesn't exist in Cassandra. Refs #13533. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #15106	2023-08-22 07:25:32 +03:00
Avi Kivity	23be6f0336	tablets: change persistent type of replica set from set to list The system.tablets table stores replica sets as a CQL set type, which is sorted. This means that if, in a tablet replica set [n1, n2, n3] n2 is replaced with n4, then on reload we'll see [n1, n3, n4], changing the relative position of n3 from the third replica to the second. The relative position of replicas in a replica set is important for materialized views, as they use it to pair base replicas with view replicas. To prepare for materialized views using tablets, change the persistent data type to list, which preserves order. The code that generates new replica sets already preserves order: see locator::replace_replica(). While this changes the system schema, tablets are an experimental feature so we don't need to worry about upgrades. Closes #15111	2023-08-21 22:55:14 +02:00
Nadav Har'El	18e8e62798	cql-pytest: translate Cassandra's tests for SELECT with LIMIT This is a translation of Cassandra's CQL unit test source file validation/operations/SelectLimitTest.java into our cql-pytest framework. The tests reproduce two already-known bugs: Refs #9879: Using PER PARTITION LIMIT with aggregate functions should fail as Invalid query Refs #10357: Spurious static row returned from query with filtering, despite not matching filter And also helped discover two new issues: Refs #15099: Incorrect sort order when combining IN, and ORDER BY Refs #15109: PER PARTITION LIMIT should be rejected if SELECT DISTINCT is used Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #15114	2023-08-21 22:29:11 +03:00
Kefu Chai	63b32cbdb4	tasks: s/stoppping/stopping/ fix a typo Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15103	2023-08-21 22:28:38 +03:00
Eliran Sinvani	eb368f9f6e	internal_keyspace extension: enhance the semantics also to flushes Commit `7c8c020` introduced a new type of keyspace, an internal keyspace. It defined the semantics for this internal keyspace, which is something of a hybrid between a system and a user keyspace. Here we extend the semantics to also include flushes, meaning that flushes will be done using the system dirty_memory_manager. This is in order to allow interdependencies between internal tables and user tables and prevent deadlocks. One example of such a deadlock is our `replicated_key_provider` encryption on the enterprise version. The deadlock occurs because in some circumstances, an encrypted user table flush is dependent upon the `encrypted_keys` table being flushed, but since the requests are serialized, we get a deadlock. Tests: unit tests dev + debug The deadlock dtest reproducer: encryption_at_rest_test.py::TestEncryptionAtRest::test_reboot Fixes #14529 Signed-off-by: Eliran Sinvani <eliransin@scylladb.com> Closes #14547	2023-08-21 18:17:05 +03:00
Avi Kivity	ce43effc21	Merge "fix rebuild with consistent topology management" From Gleb Natapov " The series fixes a bogus assertion during topology state load and adds a test that runs rebuild to make sure the code will not regress again. Fixes #14958 " * 'gleb/rebuilding_fix_v1' of github.com:scylladb/scylla-dev: test: add rebuild test system_keyspace: fix assertion for missing transition_state	2023-08-21 16:00:42 +03:00
Kefu Chai	8cc215db96	test: randomized_nemesis_test: do not brace around scalars Clang and GCC's warning option of `-Wbraced-scalar-init` warns at seeing superfluous use of braces, like: ``` /home/kefu/dev/scylladb/test/raft/randomized_nemesis_test.cc:2187:32: error: braces around scalar initializer [-Werror,-Wbraced-scalar-init] .snapshot_threshold{1}, ^~~ ``` usually, this does not hurt. but by taking the braces out, we have a more readable piece of code, and less warnings. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15086	2023-08-21 15:57:06 +03:00
Aleksandra Martyniuk	e0ce711e4f	compaction: do not swallow compaction_stopped_exception for reshape The loop in shard_reshaping_compaction_task_impl::run relies on whether sstables::compaction_stopped_exception is thrown from run_custom_job. The exception is swallowed for each type of compaction in compaction_manager::perform_task. Rethrow the exception in perform_task for reshape compaction. Fixes: #15058. Closes #15067	2023-08-21 12:41:55 +03:00
Vlad Zolotarov	e13a2b687d	scylla_raid_setup: make --online-discard argument useful This argument was dead since its introduction and 'discard' was always configured regardless of its value. This patch allows actually configuring things using this argument. Fixes #14963 Closes #14964	2023-08-21 12:21:23 +03:00
Anna Stuchlik	b5c4d13e36	doc: update the Seastar Perftune page This commit updates the description of perftune.py. It is based on the information in the reported issue (below), the contents of help for perftune.py, and the input from @vladzcloudius. Fixes https://github.com/scylladb/scylladb/issues/14233 Closes #14879	2023-08-21 10:23:30 +03:00
Anna Stuchlik	57e86b05f1	doc: fix the outdated Networking section Fixes https://github.com/scylladb/scylla-docs/issues/2467 This commit updates the Networking section. The scope is: - Removing the outdated content, including the reference to the super outdated posix_net_conf.sh script. - Adding the guidelines provided by @vladzcloudius. - Adding the reference to the documentation for the perftune.py script. Closes #14859	2023-08-21 10:17:37 +03:00
Petr Gusev	9176a3341a	test_topology_smp: more logs for debug/aarch64 The test is flaky on CI in debug builds on aarch64 (#14752), here we sprinkle more logs for debug/aarch64 hoping it'll help to debug it. Ref #14752 Closes #14822	2023-08-21 10:03:09 +03:00
Kefu Chai	adfc139a74	tools/scylla-sstable: path::parent_path() when appropriate in load_sstables(), `sst_path` is already an instance of `std::filesystem::path`, so there is no need to cast it to `std::filesystem::path`. also, `path.remove_filename()` returns something like "system_schema/columns-24101c25a2ae3af787c1b40ee1aca33f/", with the trailing slash. when we get a component's path in `sstable::filename`, we always add a "/" in between the `dir` and the filename, so this'd end up with two slashes in the path like: "/var/scylla/data/system_schema/columns-24101c25a2ae3af787c1b40ee1aca33f//mc-2-big-Data.db" so, in order to remove the duplicated slash, let's just use `path.parent_path()` here. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15035	2023-08-21 09:28:03 +03:00
Benny Halevy	0f54e24519	migration_notifier: get schema_ptr by value To prevent use-after-free as seen in https://github.com/scylladb/scylladb/issues/15097 where a temp schema_ptr retrieved from a global_schema_ptr gets destroyed when the notification function yields. Capturing the schema_ptr on the coroutine frame is inexpensive since it's a shared ptr and it makes sure that the schema remains valid throughout the coroutine lifetime. Fixes scylladb/scylladb#15097 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #15098	2023-08-20 21:36:57 +03:00
David Garcia	e23d9cd7eb	docs: Autogenerate db/config.cc docs Update layout docs: remove output param docs: generate cc properties on build docs: track cc file on change rm: note dependency docs: clean _data Fixes #8424. Closes #14973	2023-08-20 21:27:37 +03:00
Kefu Chai	1aa01d63d4	test: randomized_nemesis_test: mark direct_fd_{pinger,clock} final `raft_server` in test/raft/randomized_nemesis_test.cc manages instances of direct_fd_pinger and direct_fd_clock with unique_ptr<>. this unique_ptr<> deletes these managed instances using delete. but since these two classes have virtual methods, the compiler feels nervous when deleting them. because these two classes have virtual functions, but they do not have virtual destructor. in other words, in theory, these pointers could be pointing derived classes of them, and deleting them could lead to leak. so to silence the warning and to prevent potential issues, let's just mark these two classes final. this should address the warning like: ``` In file included from /home/kefu/dev/scylladb/test/raft/randomized_nemesis_test.cc:9: In file included from /home/kefu/dev/scylladb/seastar/include/seastar/core/reactor.hh:24: In file included from /home/kefu/dev/scylladb/seastar/include/seastar/core/aligned_buffer.hh:24: In file included from /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/memory:78: /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:99:2: error: delete called on non-final 'direct_fd_pinger<int>' that has virtual functions but non-virtual destructor [-Werror,-Wdelete-non-abstract-non-virtual-dtor] delete __ptr; ^ /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:404:4: note: in instantiation of member function 'std::default_delete<direct_fd_pinger<int>>::operator()' requested here get_deleter()(std::move(__ptr)); ^ /home/kefu/dev/scylladb/test/raft/randomized_nemesis_test.cc:1400:5: note: in instantiation of member function 'std::unique_ptr<direct_fd_pinger<int>>::~unique_ptr' requested here ~raft_server() { ^ /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:99:2: note: in instantiation of member function 'raft_server<ExReg>::~raft_server' 
requested here delete __ptr; ^ /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:404:4: note: in instantiation of member function 'std::default_delete<raft_server<ExReg>>::operator()' requested here get_deleter()(std::move(__ptr)); ^ /home/kefu/dev/scylladb/test/raft/randomized_nemesis_test.cc:1704:24: note: in instantiation of member function 'std::unique_ptr<raft_server<ExReg>>::~unique_ptr' requested here ._server = nullptr, ^ /home/kefu/dev/scylladb/test/raft/randomized_nemesis_test.cc:1742:19: note: in instantiation of member function 'environment<ExReg>::new_node' requested here auto id = new_node(first, std::move(cfg)); ^ /home/kefu/dev/scylladb/test/raft/randomized_nemesis_test.cc:2113:39: note: in instantiation of member function 'environment<ExReg>::new_server' requested here auto leader_id = co_await env.new_server(true); ^ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15084	2023-08-20 21:26:08 +03:00
Avi Kivity	4db5d8dd56	Merge 'build: cmake: support Coverage and Sanitize build modes' from Kefu Chai to mirror the build modes supported by `configure.py`. Closes #15085 * github.com:scylladb/scylladb: build: cmake: support Coverage and Sanitize build modes build: cmake: error out if specified build type is unknown	2023-08-20 21:25:21 +03:00
Pavel Emelyanov	6bc30f1944	system_keyspace: De-bloat .setup() from messing with system.local On boot several manipulations with system.local are performed. 1. The host_id value is selected from it with key = local If not found, system_keyspace generates a new host_id, inserts the new value into the table and returns back 2. The cluster_name is selected from it with key = local Then it's system_keyspace that either checks that the name matches the one from db::config, or inserts the db::config value into the table 3. The row with key = local is updated with various info like versions, listen, rpc and bcast addresses, dc, rack, etc. Unconditionally All three steps are scattered over main, p.1 is called directly, p.2 and p.3 are executed via system_keyspace::setup() that happens rather late. Also there's some touch of this table from the cql_test_env startup code. The proposal is to collect this setup into one place and execute it early -- as soon as the system.local table is populated. This frees the system_keyspace code from the logic of selecting host id and cluster name leaving it to main and keeps it with only select/insert work. refs: #2795 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #15082	2023-08-20 21:24:31 +03:00
Tomasz Grabiec	1552044615	storage_service, tablets: Fix corrupting tablet metadata on migration concurrent with table drop Tablet migration may execute a global token metadata barrier before executing updates of system.tablets. If the table is dropped while the barrier is happening, the updates will bring back rows for migrated tablets in a table which is no longer there. This will cause tablet metadata loading to fail with error: missing_column (missing column: tablet_count) Like in this log line: storage_service - raft topology: topology change coordinator fiber got error raft::stopped_error (Raft instance is stopped, reason: "background error, std::_Nested_exception<raft::state_machine_error> (State machine error at raft/server.cc:1206): std::_Nested_exception<std::runtime_error> (Failed to read tablet metadata): missing_column (missing column: tablet_count)") The fix is to read and execute the updates in a single group0 guard scope, and move execution of the barrier later. We cannot now generate updates in the same handle_tablet_migration() step if the barrier needs to be executed, so we reuse the mechanism for two-step stage transition which we already have for handling of streaming. The next pass will notice that the barrier is not needed for a given tablet and will generate the stage update. Fixes #15061 Closes #15069	2023-08-20 21:17:57 +03:00
Avi Kivity	a4e7f9bed0	docs: cql: split DML page into one page per statement The DML page is quite long (21 screenfuls on my monitor); split it into one page per statement to make it more digestible. The sections that are common to multiple statement are kept in the main DML page, and references to them are added. Closes #15053	2023-08-20 17:14:32 +03:00
Kefu Chai	12d6ec5a18	config: respect --log-with-color 1 scylladb overrides some of seastar logging related options with its own options by applying them with `logging::apply_settings()`. but we fail to inherit `with_color` from Seastar as we are using the designated initializer, so the unspecified members are zero initialized. that's why we always have logging messages in black and white even if scylla is running in a tty and `--log-with-color 1` is specified. so, to make the debugging life more colorful, let's inherit the option from Seastar, and apply it when setting logging related options. see also `29e09a3292` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15076	2023-08-20 13:47:43 +03:00
Tomasz Grabiec	bd8bb5d4b1	Merge 'Wire tablet into compaction group' from Raphael "Raph" Carvalho Compaction group is the data plane for tablets, so this integration allows each tablet to have its own storage (memtable + sstables). A crucial step for dynamic tablets, where each tablet can be worked on independently. There are still some inefficiencies to be worked on, but as it is, it already unlocks further development. ``` INFO 2023-07-27 22:43:38,331 [shard 0] init - loading tablet metadata INFO 2023-07-27 22:43:38,333 [shard 0] init - loading non-system sstables INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 0 present for ks.cf INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 2 present for ks.cf INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 4 present for ks.cf INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 6 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 1 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 3 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 5 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 7 present for ks.cf ``` Closes #14863 * github.com:scylladb/scylladb: Kill scylla option to configure number of compaction groups replica: Wire tablet into compaction group token_metadata: Add this_host_id to topology config replica: Switch to chunked_vector for storing compaction groups replica: Generate group_id for compaction_group on demand	2023-08-18 15:17:17 +02:00
Kefu Chai	9fa0b9b75b	build: cmake: support Coverage and Sanitize build modes Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-08-18 14:17:12 +08:00
Kefu Chai	3c3fb03b01	build: cmake: error out if specified build type is unknown this should help the developer to understand what build types are supported if the specified one is unknown. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-08-18 14:17:12 +08:00
Avi Kivity	1901475598	Merge 'config: mark "experimental" option unused and cleanups' from Kefu Chai in this series, the "experimental" option is marked `Unused` as it has been marked deprecated for almost 2 years since scylla 4.6. and use `experimental_features` to specify the used experimental features explicitly. Closes #14948 * github.com:scylladb/scylladb: config: remove unused namespace alias config: use std::ranges when appropriate config: drop "experimental" option test: disable 'enable_user_defined_functions' if experimental_features does not include udf test: pylib: specify experimental_features explicitly	2023-08-17 20:42:02 +03:00
Kefu Chai	7275b8967c	docs: add sstablemetadata to operating-scylla/admin-tools to note that sstablemetadata is being deprecated and encourage user to switch over to the native tools. Fixes #15020 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15040	2023-08-17 18:48:46 +03:00
Avi Kivity	e91256a621	Merge 'build: cmake: fix the build of rpm/deb from submodules' from Kefu Chai in this series, the build of rpm and deb from submodules is fixed: 1. correct the path of reloc package 2. add the dependency of reloc package to deb/rpm build targets Closes #15062 * github.com:scylladb/scylladb: build: cmake: correct reloc_pkg's path build: cmake: build rpm/deb from reloc_pkg	2023-08-17 17:58:49 +03:00
Pavel Emelyanov	3ed5b00ba2	Merge 's3/client: generate config file for tests and cleanups' from Kefu Chai before this change, object_store/test_basic.py creates a config file for specifying the object storage settings, and passes the path of this file as the argument of `--object-storage-config-file` option when running scylla. we have the same requirement when testing scylla with minio server, where we launch a minio server and manually create the config file and feed it to scylla. to ease the preparation work, let's consolidate by creating the config file in `minio_server.py`, so it always creates the config file and puts it in its tempdir. since object_store/test_basic.py can also run against an S3 bucket, the fixture implemented in object_store/conftest.py is updated accordingly to reuse the helper exposed by MinioServer to create the config file when it is not available. Closes #15064 * github.com:scylladb/scylladb: s3/client: avoid hardwiring env variables names s3/client: generate config file for tests	2023-08-17 16:39:23 +03:00
Gleb Natapov	4ffc39d885	cql3: Extend the scope of group0_guard during DDL statement execution Currently we hold group0_guard only during a DDL statement's execute() function, but unfortunately some statements access underlying schema state also during check_access() and validate() calls, which are called by the query_processor before it calls execute. We need to cover those calls with group0_guard as well and also move the retry loop up. This patch does it by introducing a new function, take_guard(), in the cql_statement class. Schema altering statements return a group0 guard while others do not return any guard. The query processor takes this guard at the beginning of a statement execution and retries if service::group0_concurrent_modification is thrown. The guard is passed to execute in the query_state structure. Fixes: #13942 Message-ID: <ZNsynXayKim2XAFr@scylladb.com>	2023-08-17 15:52:48 +03:00
Kefu Chai	6788903fd6	db: config: mark config class final in `34c3688017`, we added a virtual function to `config_file`, and we new and delete pointer pointing to a `db::config` instance with `unique_ptr<>`. this makes the compiler nervous, as deleting a pointer pointing to an instance of non-final class with virtual function could lead to leak, if this pointer actually points to a derived class of this non-final class. so, in order to silence the warning and to prevent potential problem in future, let's mark `db::config` final. the warning from Clang 16 looks like: ``` In file included from /home/kefu/dev/scylladb/test/lib/test_services.cc:10: In file included from /home/kefu/dev/scylladb/test/lib/test_services.hh:25: In file included from /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/memory:78: /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:99:2: error: delete called on non-final 'db::config' that has virtual functions but non-virtual destructor [-Werror,-Wdelete-non-abstract-non-virtual-dtor] delete __ptr; ^ /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:404:4: note: in instantiation of member function 'std::default_delete<db::config>::operator()' requested here get_deleter()(std::move(__ptr)); ^ /home/kefu/dev/scylladb/test/lib/test_services.cc:189:16: note: in instantiation of member function 'std::unique_ptr<db::config>::~unique_ptr' requested here auto cfg = std::make_unique<db::config>(); ^ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15071	2023-08-17 13:43:16 +03:00
Kefu Chai	fc6b8d4040	s3/client: avoid hardwiring env variables names instead of hardwiring the names in multiple places, let's just keep them in a single place as variables, and reference them by these variables instead of their values. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-08-17 16:06:55 +08:00
Kefu Chai	ec7fa3628c	s3/client: generate config file for tests before this change, object_store/test_basic.py creates a config file for specifying the object storage settings, and passes the path of this file as the argument of `--object-storage-config-file` option when running scylla. we have the same requirement when testing scylla with minio server, where we launch a minio server and manually create the config file and feed it to scylla. to ease the preparation work, let's consolidate by creating the config file in `minio_server.py`, so it always creates the config file and puts it in its tempdir. since object_store/test_basic.py can also run against an S3 bucket, the fixture implemented in object_store/conftest.py is updated accordingly to reuse the helper exposed by MinioServer to create the config file when it is not available. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-08-17 16:06:55 +08:00
Raphael S. Carvalho	b578d6643f	Kill scylla option to configure number of compaction groups The option was introduced to bootstrap the project. It's still useful for testing, but that translates into maintaining an additional option and code that will not be really used outside of testing. A possible option is to later map the option in boost tests to initial_tablets, which may yield the same effect for testing. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-08-16 18:23:53 -03:00
Raphael S. Carvalho	cc60598368	replica: Wire tablet into compaction group Compaction group is the data plane for tablets, so this integration allows each tablet to have its own storage (memtable + sstables). A crucial step for dynamic tablets, where each tablet can be worked on independently. There are still some inefficiencies to be worked on, but as it is, it already unlocks further development. INFO 2023-07-27 22:43:38,331 [shard 0] init - loading tablet metadata INFO 2023-07-27 22:43:38,333 [shard 0] init - loading non-system sstables INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 0 present for ks.cf INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 2 present for ks.cf INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 4 present for ks.cf INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 6 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 1 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 3 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 5 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 7 present for ks.cf There's a need for compaction_group_manager, as table will still support "tabletless" mode, and we don't want to sprinkle ifs here and there, to support both modes. It's not really a manager (it's not even supposed to store a state), but I couldn't find a better name. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-08-16 18:23:53 -03:00
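
Note on commit 23be6f0336 (tablets: change persistent type of replica set from set to list): the ordering problem described there can be reproduced with a minimal Python sketch. The helper names `replace_in_list` and `replace_in_sorted_set` are hypothetical and exist only for this illustration; plain string sorting stands in for the CQL set ordering, and none of this is Scylla code.

```python
# Illustration only: replicas are plain strings, and sorted() stands in
# for the ordering a CQL set imposes when it is read back.

def replace_in_list(replicas, old, new):
    """Replace `old` with `new` in place, keeping every other position."""
    return [new if r == old else r for r in replicas]

def replace_in_sorted_set(replicas, old, new):
    """Replace `old` with `new`, then reload as a sorted set would."""
    return sorted((set(replicas) - {old}) | {new})

before = ["n1", "n2", "n3"]
as_list = replace_in_list(before, "n2", "n4")
as_set = replace_in_sorted_set(before, "n2", "n4")

# Position of n3 before and after the replacement in each representation.
n3_before = before.index("n3")
n3_in_list = as_list.index("n3")
n3_in_set = as_set.index("n3")
```

With the list representation n3 stays the third replica; with the sorted-set representation it moves to second place, which is the shift the commit message says would break pairing of base replicas with view replicas.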


38474 Commits
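
Note on commit 17e3e367ca (test: use more frequent reconnection policy): the effect of capping the reconnection interval can be sketched in plain Python without the driver. The generator `capped_backoff` is a hypothetical stand-in, not the Python Driver's implementation; it omits jitter, and the 1-second base delay for the capped case is an assumption, since the commit message only states the 4-second maximum.

```python
from itertools import islice

def capped_backoff(base_delay, max_delay):
    """Yield delays that double each attempt and never exceed max_delay.

    Hypothetical helper for illustration; no jitter is applied.
    """
    delay = base_delay
    while True:
        yield min(delay, max_delay)
        delay = min(delay * 2, max_delay)

# Default-like schedule from the commit message: starts at 1 s, ramps to 600 s.
default_like = list(islice(capped_backoff(1.0, 600.0), 12))

# Test-like schedule: same doubling, but capped at 4 s (base of 1 s assumed).
test_like = list(islice(capped_backoff(1.0, 4.0), 12))

# Total time spent waiting across the first 12 reconnection attempts.
total_default = sum(default_like)
total_test = sum(test_like)
```

Under these assumptions the capped schedule keeps every wait at 4 seconds or less, so a node that comes back after a restart is retried promptly instead of after a wait that may exceed the test timeout.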