scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-28 04:06:59 +00:00

Author	SHA1	Message	Date
Benny Halevy	672ec66769	test: rest_api: add test_gossiper Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-24 11:37:12 +03:00
Nadav Har'El	5530c529c2	test/cql-pytest: regression test for old bug with CAST(f AS TEXT) precision When casting a float or double column to a string with `CAST(f AS TEXT)`, Scylla is expected to print the number with enough digits so that reading that string back to a float or double restores the original number exactly. This expectation isn't documented anywhere, but makes sense, and is what Cassandra does. Before commit `71bbd7475c`, this wasn't the case in Scylla: `CAST(f AS TEXT)` always printed 6 digits of precision, which was a bit under enough for a float (which can have 7 decimal digits of precision), but very much not enough for a double (which can need 15 digits). The origin of this magic "6 digits" number was that Scylla uses seastar::to_sstring() to print the float and double values, and before the aforementioned commit those functions used sprintf with the "%g" format - which always prints 6 decimal digits of precision! After that commit, to_sstring() now uses a different approach (based on fmt) to print the float and double values, that prints all significant digits. This patch adds a regression test for this bug: We write float and double values to the database, cast them to text, and then recover the float or double number from that text - and check that we get back exactly the same float or double object. The test fails before the aforementioned commit, and passes after it. It also passes on Cassandra. Refs #15127 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #15131	2023-08-23 16:06:52 +03:00
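The precision loss described in this commit can be reproduced outside Scylla. The following is a minimal Python sketch, not the cql-pytest test itself: it contrasts printf-style `%g` formatting (6 significant digits by default) with Python's `repr()`, which prints the shortest string that parses back to the same double. The helper names are hypothetical.

```python
def format_six_digits(x: float) -> str:
    # printf-style "%g" defaults to 6 significant digits.
    return "%g" % x


def format_round_trip(x: float) -> str:
    # repr() yields the shortest text that parses back to the same double.
    return repr(x)


def survives_round_trip(x: float, fmt) -> bool:
    # True when text -> float restores exactly the original value.
    return float(fmt(x)) == x


value = 1.0 / 3.0
print(format_six_digits(value), survives_round_trip(value, format_six_digits))
print(format_round_trip(value), survives_round_trip(value, format_round_trip))
```

The first line shows the truncated text failing the round trip; the second shows the full-precision text passing it.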
Botond Dénes	139ba553b8	Merge 'sstable, test: log sstable name and pk when capping local_deletion_time ' from Kefu Chai in this series, we also print the sstable name and pk when writing a tombstone whose local_deletion_time (ldt for short) is greater than INT32_MAX which cannot be represented by an uint32_t. Fixes #15015 Closes #15107 * github.com:scylladb/scylladb: sstable/writer: log sstable name and pk when capping ldt test: sstable_compaction_test: add a test for capped tombstone ldt	2023-08-23 09:29:54 +03:00
Kamil Braun	169d19e5b0	Merge 'raft topology: support --ignore-dead-nodes in removenode and replace' from Patryk Jędrzejczak We add support for `--ignore-dead-nodes` in `raft_removenode` and `--ignore-dead-nodes-for-replace` in `raft_replace`. For now, we allow passing only host ids of the ignored nodes. Supporting IPs is currently impossible because `raft_address_map` doesn't provide a mapping from IP to a host id. The main steps of the implementation are as follows: - add the `ignore_nodes` column to `system.topology`, - set the `ignore_nodes` value of the topology mutation in `raft_removenode` and `raft_replace`, - extend `service::request_param` with alternative types that allow storing a set of ids of the ignored nodes, - load `ignore_nodes` from `system.topology` into `request_param` in `system_keyspace::load_topology_state`, - add `ignore_nodes` to `exclude_nodes` in `topology_coordinator::exec_global_command`, - pass `ignore_nodes` to `replace_with_repair` and `remove_with_repair` in `storage_service::raft_topology_cmd_handler`. Additionally, we add `test_raft_ignore_nodes.py` with two tests that verify the added changes. Fixes #15025 Closes #15113 * github.com:scylladb/scylladb: test: add test_raft_ignore_nodes test: ManagerClient.remove_node: allow List[HostId] for ignore_dead raft topology: pass ignore_nodes to {replace, remove}_with_repair raft topology: exec_global_command: add ignore_nodes to exclude_nodes raft topology: exec_global_command: change type of exclude_nodes topology_state_machine: extend request_param with a set of raft ids raft topology: set ignore_nodes in raft_removenode and raft_replace utils: introduce split_comma_separated_list raft topology: add the ignore_nodes column to system.topology	2023-08-22 18:04:59 +02:00
Kamil Braun	cdc3cd2b79	Merge 'raft: add fencing tests' from Petr Gusev In this PR a simple test for fencing is added. It exercises the data plane, meaning if it somehow happens that the node has a stale topology version, then requests from this node will get an error 'stale topology'. The test just decrements the node version manually through CQL, so it's quite artificial. To test a more real-world scenario we need to allow the topology change fiber to sometimes skip unavailable nodes. Now the algorithm fails and retries indefinitely in this case. The PR also adds some logs, and removes one seemingly redundant topology version increment, see the commit messages for details. Closes #14901 * github.com:scylladb/scylladb: test_fencing: add test_fence_hints test.py: output the skipped tests test.py: add skip_mode decorator and fixture test.py: add mode fixture hints: add debug log for dropped hints hints: send_one_hint: extend the scope of file_send_gate holder pylib: add ScyllaMetrics hints manager: add send_errors counter token_metadata: add debug logs fencing: add simple data plane test random_tables.py: add counter column type raft topology: don't increment version when transitioning to node_state::normal	2023-08-22 16:28:21 +02:00
Piotr Grabowski	17e3e367ca	test: use more frequent reconnection policy The default reconnection policy in the Python Driver is an exponential backoff (with jitter) policy, which starts at a 1-second reconnection interval and ramps up to 600 seconds. This is a problem in tests (refs #15104), especially in tests that restart or replace nodes. In such a scenario, a node can be unavailable for an extended period of time and the driver will try to reconnect to it multiple times, eventually reaching very long reconnection interval values, exceeding the timeout of a test. Fix the issue by using an exponential reconnection policy with a maximum interval of 4 seconds. A smaller value was not chosen, as each retry clutters the logs with a reconnection exception stack trace. Fixes #15104 Closes #15112	2023-08-22 15:40:39 +02:00
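To see why the cap matters, here is a small Python sketch of a capped exponential delay schedule. It is an illustration only: it omits the jitter the driver applies, it does not use the driver's API, and the function name is hypothetical.

```python
from itertools import islice


def capped_exponential_delays(base_delay: float, max_delay: float):
    """Yield reconnection delays that double each attempt, up to max_delay."""
    delay = base_delay
    while True:
        yield min(delay, max_delay)
        delay = min(delay * 2, max_delay)


# With a 600-second cap the delays grow to many minutes ...
default_like = list(islice(capped_exponential_delays(1.0, 600.0), 11))
# ... while a 4-second cap keeps retries frequent.
test_like = list(islice(capped_exponential_delays(1.0, 4.0), 6))
print(default_like)
print(test_like)
```

With the 4-second cap, a node that comes back after a restart is retried within a few seconds instead of possibly waiting minutes for the next attempt.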
Patryk Jędrzejczak	b044ee535f	test: add test_raft_ignore_nodes We add two tests verifying that --ignore-dead-nodes in raft_removenode and --ignore-dead-nodes-for-replace in raft_replace are handled correctly. We need a 7-node cluster to have a Raft majority. Therefore, these tests are quite slow, and we want to run them only in the dev mode.	2023-08-22 14:19:21 +02:00
Patryk Jędrzejczak	6818d13f7d	test: ManagerClient.remove_node: allow List[HostId] for ignore_dead ManagerClient.remove_node allows passing ignore_dead only as List[IPAddress]. However, raft_removenode currently supports only host ids. To write a test that passes ignore_dead to ManagerClient.remove_node in the Raft topology mode, we allow passing ignore_dead as List[HostId]. Note that we don't want to use List[IPAddress \| HostId] because mixing IP addresses and host ids fails anyway. See ss::remove_node.set(...) in api::set_storage_service.	2023-08-22 14:19:09 +02:00
Petr Gusev	1ddc76ffd1	test_fencing: add test_fence_hints The test makes a write through the first node with the third node down, this causes a hint to be stored on the first node for the second. We increment the version and fence_version on the third node, restart it, and expect to see a hint delivery failure because of versions mismatch. Then we update the versions of the first node and expect hint to be successfully delivered.	2023-08-22 15:48:40 +04:00
Petr Gusev	c434d26b36	test.py: add skip_mode decorator and fixture Syntactic sugar for marking tests to be skipped in a particular mode. There is skip_in_debug/skip_in_release in suite.yaml, but they can be applied only on the entire file, which is unnatural and inconvenient. Also, they don't allow to specify a reason why the test is skipped. Separate dictionary skipped_funcs is needed since we can't use pytest fixtures in decorators.	2023-08-22 15:48:40 +04:00
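The idea of a decorator that records skips in a separate dictionary can be sketched as follows. This is a hypothetical, framework-free illustration: only the names `skip_mode` and `skipped_funcs` come from the commit message, while the signature and the `skip_reason` helper are assumptions. In the real suite a pytest fixture would presumably consult the dictionary and call `pytest.skip`.

```python
# Registry filled in at import time by the decorator; a fixture can read it later.
skipped_funcs = {}


def skip_mode(mode: str, reason: str):
    """Mark a test function as skipped in the given build mode (assumed signature)."""
    def wrap(func):
        skipped_funcs.setdefault(func, []).append((mode, reason))
        return func
    return wrap


def skip_reason(func, current_mode: str):
    """Return the skip reason for func in current_mode, or None (hypothetical helper)."""
    for mode, reason in skipped_funcs.get(func, []):
        if mode == current_mode:
            return reason
    return None


@skip_mode("debug", "too slow in debug builds")
def test_example():
    pass


print(skip_reason(test_example, "debug"))
print(skip_reason(test_example, "dev"))
```

Keeping the registry outside the decorator is what lets the decision be deferred until the current mode is known at test run time.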
Petr Gusev	a639d161e6	test.py: add mode fixture Sometimes a test wants to know what mode it is running in so that e.g. it can skip itself in some of them.	2023-08-22 15:48:40 +04:00
Petr Gusev	0b7a90dff6	pylib: add ScyllaMetrics This patch adds facilities to work with Scylla metrics from test.py tests. The new metrics property was added to ManagerClient, its query method sends a request to the Scylla metrics endpoint and returns an object to conveniently access the result. ScyllaMetrics is copy-pasted from test_shedding.py. It's difficult to reuse code between 'new' and 'old' styles of tests, we can't just import pylib in 'old' tests because of some problems with python search directories. A past commit of mine that attempted to solve this problem was rejected on review.	2023-08-22 14:31:04 +04:00
Petr Gusev	360453fd87	fencing: add simple data plane test The test starts a three node cluster and manually decrements the version on the last node. It then tries to write some data through the last node and expects to get 'stale topology' exception.	2023-08-22 14:31:01 +04:00
Nadav Har'El	a963b59495	test/cql-pytest: add reproducer for IN not working with secondary index We already have a test for issue #13533, where an "IN" doesn't work with a secondary index (the secondary index isn't used in that case, and instead inefficient filtering is required). Recently a user noticed the same problem also exists for local secondary indexes - and this patch includes a reproducing test. The new test is marked xfail, as the issue is still unfixed. The new test is Scylla-only because local secondary index is a Scylla-only extension that doesn't exist in Cassandra. Refs #13533. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #15106	2023-08-22 07:25:32 +03:00
Nadav Har'El	18e8e62798	cql-pytest: translate Cassandra's tests for SELECT with LIMIT This is a translation of Cassandra's CQL unit test source file validation/operations/SelectLimitTest.java into our cql-pytest framework. The tests reproduce two already-known bugs: Refs #9879: Using PER PARTITION LIMIT with aggregate functions should fail as Invalid query Refs #10357: Spurious static row returned from query with filtering, despite not matching filter And also helped discover two new issues: Refs #15099: Incorrect sort order when combining IN, and ORDER BY Refs #15109: PER PARTITION LIMIT should be rejected if SELECT DISTINCT is used Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #15114	2023-08-21 22:29:11 +03:00
Avi Kivity	ce43effc21	Merge "fix rebuild with consistent topology management" From Gleb Natapov " The series fixes a bogus assertion during topology state load and adds a test that runs rebuild to make sure the code will not regress again. Fixes #14958 " * 'gleb/rebuilding_fix_v1' of github.com:scylladb/scylla-dev: test: add rebuild test system_keyspace: fix assertion for missing transition_state	2023-08-21 16:00:42 +03:00
Kefu Chai	8cc215db96	test: randomized_nemesis_test: do not brace around scalars Clang and GCC's warning option of `-Wbraced-scalar-init` warns at seeing superfluous use of braces, like: ``` /home/kefu/dev/scylladb/test/raft/randomized_nemesis_test.cc:2187:32: error: braces around scalar initializer [-Werror,-Wbraced-scalar-init] .snapshot_threshold{1}, ^~~ ``` usually, this does not hurt. but by taking the braces out, we have a more readable piece of code, and less warnings. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15086	2023-08-21 15:57:06 +03:00
Kefu Chai	0bc99c7f49	test: sstable_compaction_test: add a test for capped tombstone ldt local_deletion_time (ldt for short) is a timestamp used for the purpose of purging the tombstone after gc_grace_seconds. if its value is greater than INT32_MAX, it is capped when being written to sstable. this is very likely a signal of bad configuration or even a bug in scylla. so we keep track of it with a metric named "scylla_sstables_capped_tombstone_deletion_time". in this change, a test is added to verify that the metric is updated upon seeing a tombstone with this abnormal ldt. because we validate the consistency before and after compaction in tests, this change adds a parameter to disable this check, otherwise, because capping the ldt changes the mutation, the validation would fail the test. Refs #15015 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-08-21 19:25:32 +08:00
Petr Gusev	9176a3341a	test_topology_smp: more logs for debug/aarch64 The test is flaky on CI in debug builds on aarch64 (#14752), here we sprinkle more logs for debug/aarch64 hoping it'll help to debug it. Ref #14752 Closes #14822	2023-08-21 10:03:09 +03:00
Kefu Chai	1aa01d63d4	test: randomized_nemesis_test: mark direct_fd_{pinger,clock} final `raft_server` in test/raft/randomized_nemesis_test.cc manages instances of direct_fd_pinger and direct_fd_clock with unique_ptr<>. this unique_ptr<> deletes these managed instances using delete. but since these two classes have virtual methods, the compiler feels nervous when deleting them. because these two classes have virtual functions, but they do not have virtual destructor. in other words, in theory, these pointers could be pointing derived classes of them, and deleting them could lead to leak. so to silence the warning and to prevent potential issues, let's just mark these two classes final. this should address the warning like: ``` In file included from /home/kefu/dev/scylladb/test/raft/randomized_nemesis_test.cc:9: In file included from /home/kefu/dev/scylladb/seastar/include/seastar/core/reactor.hh:24: In file included from /home/kefu/dev/scylladb/seastar/include/seastar/core/aligned_buffer.hh:24: In file included from /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/memory:78: /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:99:2: error: delete called on non-final 'direct_fd_pinger<int>' that has virtual functions but non-virtual destructor [-Werror,-Wdelete-non-abstract-non-virtual-dtor] delete __ptr; ^ /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:404:4: note: in instantiation of member function 'std::default_delete<direct_fd_pinger<int>>::operator()' requested here get_deleter()(std::move(__ptr)); ^ /home/kefu/dev/scylladb/test/raft/randomized_nemesis_test.cc:1400:5: note: in instantiation of member function 'std::unique_ptr<direct_fd_pinger<int>>::~unique_ptr' requested here ~raft_server() { ^ /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:99:2: note: in instantiation of member function 'raft_server<ExReg>::~raft_server' 
requested here delete __ptr; ^ /usr/bin/../lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/bits/unique_ptr.h:404:4: note: in instantiation of member function 'std::default_delete<raft_server<ExReg>>::operator()' requested here get_deleter()(std::move(__ptr)); ^ /home/kefu/dev/scylladb/test/raft/randomized_nemesis_test.cc:1704:24: note: in instantiation of member function 'std::unique_ptr<raft_server<ExReg>>::~unique_ptr' requested here ._server = nullptr, ^ /home/kefu/dev/scylladb/test/raft/randomized_nemesis_test.cc:1742:19: note: in instantiation of member function 'environment<ExReg>::new_node' requested here auto id = new_node(first, std::move(cfg)); ^ /home/kefu/dev/scylladb/test/raft/randomized_nemesis_test.cc:2113:39: note: in instantiation of member function 'environment<ExReg>::new_server' requested here auto leader_id = co_await env.new_server(true); ^ ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15084	2023-08-20 21:26:08 +03:00
Pavel Emelyanov	6bc30f1944	system_keyspace: De-bloat .setup() from messing with system.local On boot several manipulations with system.local are performed. 1. The host_id value is selected from it with key = local If not found, system_keyspace generates a new host_id, inserts the new value into the table and returns back 2. The cluster_name is selected from it with key = local Then it's system_keyspace that either checks that the name matches the one from db::config, or inserts the db::config value into the table 3. The row with key = local is updated with various info like versions, listen, rpc and bcast addresses, dc, rack, etc. Unconditionally All three steps are scattered over main, p.1 is called directly, p.2 and p.3 are executed via system_keyspace::setup() that happens rather late. Also there's some touch of this table from the cql_test_env startup code. The proposal is to collect this setup into one place and execute it early -- as soon as the system.local table is populated. This frees the system_keyspace code from the logic of selecting host id and cluster name leaving it to main and keeps it with only select/insert work. refs: #2795 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #15082	2023-08-20 21:24:31 +03:00
Tomasz Grabiec	1552044615	storage_service, tablets: Fix corrupting tablet metadata on migration concurrent with table drop Tablet migration may execute a global token metadata barrier before executing updates of system.tablets. If the table is dropped while the barrier is happening, the updates will bring back rows for migrated tablets in a table which is no longer there. This will cause tablet metadata loading to fail with error: missing_column (missing column: tablet_count) Like in this log line: storage_service - raft topology: topology change coordinator fiber got error raft::stopped_error (Raft instance is stopped, reason: "background error, std::_Nested_exception<raft::state_machine_error> (State machine error at raft/server.cc:1206): std::_Nested_exception<std::runtime_error> (Failed to read tablet metadata): missing_column (missing column: tablet_count)") The fix is to read and execute the updates in a single group0 guard scope, and move execution of the barrier later. We cannot now generate updates in the same handle_tablet_migration() step if the barrier needs to be executed, so we reuse the mechanism for two-step stage transition which we already have for handling of streaming. The next pass will notice that the barrier is not needed for a given tablet and will generate the stage update. Fixes #15061 Closes #15069	2023-08-20 21:17:57 +03:00
Tomasz Grabiec	bd8bb5d4b1	Merge 'Wire tablet into compaction group' from Raphael "Raph" Carvalho Compaction group is the data plane for tablets, so this integration allows each tablet to have its own storage (memtable + sstables). A crucial step for dynamic tablets, where each tablet can be worked on independently. There are still some inefficiencies to be worked on, but as it is, it already unlocks further development. ``` INFO 2023-07-27 22:43:38,331 [shard 0] init - loading tablet metadata INFO 2023-07-27 22:43:38,333 [shard 0] init - loading non-system sstables INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 0 present for ks.cf INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 2 present for ks.cf INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 4 present for ks.cf INFO 2023-07-27 22:43:38,354 [shard 0] table - Tablet with id 6 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 1 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 3 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 5 present for ks.cf INFO 2023-07-27 22:43:38,428 [shard 1] table - Tablet with id 7 present for ks.cf ``` Closes #14863 * github.com:scylladb/scylladb: Kill scylla option to configure number of compaction groups replica: Wire tablet into compaction group token_metadata: Add this_host_id to topology config replica: Switch to chunked_vector for storing compaction groups replica: Generate group_id for compaction_group on demand	2023-08-18 15:17:17 +02:00
Avi Kivity	1901475598	Merge 'config: mark "experimental" option unused and cleanups' from Kefu Chai in this series, the "experimental" option is marked `Unused` as it has been marked deprecated for almost 2 years since scylla 4.6. and use `experimental_features` to specify the used experimental features explicitly. Closes #14948 * github.com:scylladb/scylladb: config: remove unused namespace alias config: use std::ranges when appropriate config: drop "experimental" option test: disable 'enable_user_defined_functions' if experimental_features does not include udf test: pylib: specify experimental_features explicitly	2023-08-17 20:42:02 +03:00
Pavel Emelyanov	3ed5b00ba2	Merge 's3/client: generate config file for tests and cleanups' from Kefu Chai before this change, object_store/test_basic.py creates a config file for specifying the object storage settings, and passes the path of this file as the argument of `--object-storage-config-file` option when running scylla. we have the same requirement when testing scylla with minio server, where we launch a minio server and manually create the config file and feed it to scylla. to ease the preparation work, let's consolidate by creating the config file in `minio_server.py`, so it always creates the config file and puts it in its tempdir. since object_store/test_basic.py can also run against an S3 bucket, the fixture implemented in object_store/conftest.py is updated accordingly to reuse the helper exposed by MinioServer to create the config file when it is not available. Closes #15064 * github.com:scylladb/scylladb: s3/client: avoid hardwiring env variables names s3/client: generate config file for tests	2023-08-17 16:39:23 +03:00
Gleb Natapov	4ffc39d885	cql3: Extend the scope of group0_guard during DDL statement execution Currently we hold group0_guard only during DDL statement's execute() function, but unfortunately some statements access underlying schema state also during check_access() and validate() calls which are called by the query_processor before it calls execute. We need to cover those calls with group0_guard as well and also move retry loop up. This patch does it by introducing new function to cql_statement class take_guard(). Schema altering statements return group0 guard while others do not return any guard. Query processor takes this guard at the beginning of a statement execution and retries if service::group0_concurrent_modification is thrown. The guard is passed to the execute in query_state structure. Fixes: #13942 Message-ID: <ZNsynXayKim2XAFr@scylladb.com>	2023-08-17 15:52:48 +03:00
Kefu Chai	fc6b8d4040	s3/client: avoid hardwiring env variables names instead of hardwiring the names in multiple places, let's just keep them in a single place as variables, and reference them by these variables instead of their values. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-08-17 16:06:55 +08:00
Kefu Chai	ec7fa3628c	s3/client: generate config file for tests before this change, object_store/test_basic.py creates a config file for specifying the object storage settings, and passes the path of this file as the argument of `--object-storage-config-file` option when running scylla. we have the same requirement when testing scylla with minio server, where we launch a minio server and manually create the config file and feed it to scylla. to ease the preparation work, let's consolidate by creating the config file in `minio_server.py`, so it always creates the config file and puts it in its tempdir. since object_store/test_basic.py can also run against an S3 bucket, the fixture implemented in object_store/conftest.py is updated accordingly to reuse the helper exposed by MinioServer to create the config file when it is not available. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-08-17 16:06:55 +08:00
Raphael S. Carvalho	b578d6643f	Kill scylla option to configure number of compaction groups The option was introduced to bootstrap the project. It's still useful for testing, but that translates into maintaining an additional option and code that will not be really used outside of testing. A possible option is to later map the option in boost tests to initial_tablets, which may yield the same effect for testing. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-08-16 18:23:53 -03:00
Raphael S. Carvalho	5d1f60439a	token_metadata: Add this_host_id to topology config The motivation is that token_metadata::get_my_id() is not available early in the bootstrap process, as raft topology is pulled later than new tables are registered and created, and this node is added to topology even later. To allow creation of compaction groups to retrieve "my id" from token metadata early, initialization will now feed local id into topology config which is immutable for each node anyway. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-08-16 18:23:44 -03:00
Kefu Chai	564522c4a8	s3/test: remove tempdir if log does not exist we should have used `ignore_errors=True` to ignore the error. this issue has not popped up, because we haven't run into the case where the log file does not exist. this was a regression introduced by `d4ee84ee1e` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15063	2023-08-16 15:11:00 +03:00
Avi Kivity	e8f3b073c3	Merge 'Maintain sstable state explicitly' from Pavel Emelyanov An sstable can be in one of several states -- normal, quarantined, staging, uploading. Right now this "state" is hard-wired into sstable's path, e.g. quarantined sstable would sit in e.g. /var/lib/data/ks-cf-012345/quarantine/ directory. Respectively, there's a bunch of directory names constexprs in sstables.hh defining each "state". Other than being confusing, this approach doesn't work well with S3 backend. Additionally, there's snapshot subdir that adds to the confusion, because snapshot is not quite a state. This PR converts "state" from constexpr char* directories names into a enum class and patches the sstable creation, opening and state-changing API to use that enum instead of parsing the path. refs: #13017 refs: #12707 Closes #14152 * github.com:scylladb/scylladb: sstable/storage: Make filesystem storage with initial state sstable: Maintain state sstable: Make .change_state() accept state, not directory string sstable: Construct it with state sstables_manager: Remove state-less make_sstable() table: Make sstables with required state test: Make sstables with upload state in some cases tools: Make sstables with normal state table: Open-code sstables making streaming helpers tests: Make sstables with normal state by default sstable_directory: Make sstable with required state sstable_directory: Construct with state distributed_loader: Make sstable with desired state when populating distributed_loader: Make sstable with upload state when uploading sstable: Introduce state enum sstable_directory: Merge verify and g.c. calls distributed_loader: Merge verify and gc invocations sstable/filesystem: Put underscores to dir members sstable/s3: Mark make_s3_object_name() const sstable: Remove filename(dir, ...) method	2023-08-15 17:44:06 +03:00
Avi Kivity	5949623e0d	Merge 'sstable_set: maintain bytes on disk' from Benny Halevy and use that in compaction_group, rather than respective accumulators of its own. This is part of a larger series to make cache updates exception safe. Refs #14043 Closes #15052 * github.com:scylladb/scylladb: sstable_set: maintain total bytes_on_disk sstable_set: insert, erase: return status	2023-08-15 17:32:12 +03:00
Kefu Chai	64ed0127d7	s3/client: retry if minio server fails to start there is a small time window after we find a free port and before the minio server listens on that port. if another server sneaks into that window and listens on that port, the minio server can still fail to start even though there might be a free port for it. so, in this change, we just retry with a random port for a fixed number of times until the minio server is able to serve. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15042	2023-08-15 16:17:47 +03:00
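The retry strategy described here can be sketched in a few lines of Python. This is an assumption-laden illustration, not the code in `minio_server.py`: `start_with_retries`, `try_start`, the attempt count, and the port range are all hypothetical names and values.

```python
import random


def start_with_retries(try_start, attempts: int = 5,
                       port_range=(49152, 65535), rng=random):
    """Pick a random port and try to start a server on it, up to `attempts` times.

    `try_start(port)` is a caller-supplied callable (hypothetical) that returns
    True when the server came up and is able to serve on that port.
    """
    for _ in range(attempts):
        port = rng.randint(*port_range)
        if try_start(port):
            return port
    raise RuntimeError(f"server failed to start after {attempts} attempts")


# Usage with a fake starter that loses the port race twice before succeeding.
calls = []

def fake_start(port: int) -> bool:
    calls.append(port)
    return len(calls) >= 3

chosen = start_with_retries(fake_start)
print(chosen, len(calls))
```

Retrying with a fresh port avoids depending on the gap between probing for a free port and actually binding it.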
Raphael S. Carvalho	2590eec352	replica: Generate group_id for compaction_group on demand There are a few good reasons for this change. 1) compaction_group doesn't have to be aware of # of groups 2) thinking forward to dynamic tablets, # of groups cannot be statically embedded in group id, otherwise it gets stale. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-08-15 09:04:05 -03:00
Avi Kivity	d57a951d48	Revert "cql3: Extend the scope of group0_guard during DDL statement execution" This reverts commit `70b5360a73`. It generates a failure in group0_test .test_concurrent_group0_modifications in debug mode with about 4% probability. Fixes #15050	2023-08-15 00:26:45 +03:00
Benny Halevy	f54ab48273	sstable_set: maintain total bytes_on_disk and use that in compaction_group, rather than respective accumulators of its own. bytes_on_disk is implemented by each sstable_set_impl and is update on insert and erase (whether directly into the sstable_set_impl or via the sstable_set). Although compound_sstable_set doesn't implement insert and erase, it override `bytes_on_disk()` to return the sum of all the underlying `sstable_set::bytes_on_disk()`. Also, added respective unit tests for `partitioned_sstable_set` and `time_series_sstable_set`, that test each type's bytes_on_disk, including cloning of the set, and the `compound_sstable_set` bytes_on_disk semantics. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-08-14 21:07:27 +03:00
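The accounting scheme can be modelled in Python as below. This is a simplified sketch of the idea, not the C++ classes: the class names, the dictionary-based storage, and the boolean return values (suggested by the "insert, erase: return status" commit) are assumptions.

```python
class SSTableSet:
    """Toy model of a set that keeps a running total of bytes on disk."""

    def __init__(self):
        self._sizes = {}          # sstable name -> size in bytes
        self._bytes_on_disk = 0

    def insert(self, name: str, size: int) -> bool:
        # Report whether the sstable was actually added, so the
        # total is only adjusted on a real insertion.
        if name in self._sizes:
            return False
        self._sizes[name] = size
        self._bytes_on_disk += size
        return True

    def erase(self, name: str) -> bool:
        size = self._sizes.pop(name, None)
        if size is None:
            return False
        self._bytes_on_disk -= size
        return True

    def bytes_on_disk(self) -> int:
        return self._bytes_on_disk


class CompoundSSTableSet:
    """Read-only view over several sets; its total is the sum of theirs."""

    def __init__(self, sets):
        self._sets = list(sets)

    def bytes_on_disk(self) -> int:
        return sum(s.bytes_on_disk() for s in self._sets)


main, maintenance = SSTableSet(), SSTableSet()
main.insert("a", 100)
main.insert("b", 50)
maintenance.insert("c", 25)
main.erase("a")
print(CompoundSSTableSet([main, maintenance]).bytes_on_disk())
```

Because each set updates its own total on insert and erase, the owner no longer needs a separate accumulator that could drift out of sync.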
Pavel Emelyanov	b06917f235	sstable: Make .change_state() accept state, not directory string Pretty cosmetic change, but it will allow S3 to finally support moving sstables between states (after this patch it still doesn't) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-14 15:40:44 +03:00
Pavel Emelyanov	855f2b4b86	test: Make sstables with upload state in some cases As was mentioned in the previous patch, there are a few places in tests that put sstables in upload/ subdir and they really mean it. Those need to use sstables manager/directory API directly (already) and specify the state explicitly (this patch) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-14 15:28:54 +03:00
Pavel Emelyanov	734c0820df	tests: Make sstables with normal state by default It's assumed that tests are not very specific about which subdirectory an sstable is in, so they can use normal state. Places that need to move sstables between states will use sstable manager API explicitly Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-14 14:56:02 +03:00
Pavel Emelyanov	c0b922a8af	sstable_directory: Construct with state This is to replace full path sitting on this object eventually. For now they have to co-exist, but state will be used to make_sstable()-s from manager with its new API Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-14 14:56:01 +03:00
Avi Kivity	b120d35c58	Merge 'Relax cql_test_env services maintenance' from Pavel Emelyanov To add a sharded service to the cql_test_env one needs to patch it in 5 or 6 places - add cql_test_env reference - add cql_test_env constructor argument - initialize the reference in initializer list - add service variable to do_with method - pass the variable to cql_test_env constructor - (optionally) export it via cql_test_env public method Steps 1 through 5 are annoying, things get much simpler if they look like - add cql_test_env variable - (optionally) export it via cql_test_env public method This is what this PR does refs: #2795 Closes #15028 * github.com:scylladb/scylladb: cql_test_env: Drop local *this reference cql_test_env: Drop local references cql_test_env: Move most of the stuff in run_in_thread() cql_test_env: Open-code env start/stop and remove both cql_test_env: Keep other services as class variables cql_test_env: Keep services as class variables cql_test_env: Construct env early cql_test_env: De-static fdpinger variable cql_test_env: Define all services' variables early cql_test_env: Keep group0_client pointer	2023-08-13 20:24:52 +03:00
Gleb Natapov	70b5360a73	cql3: Extend the scope of group0_guard during DDL statement execution Currently we hold group0_guard only during DDL statement's execute() function, but unfortunately some statements access underlying schema state also during check_access() and validate() calls which are called by the query_processor before it calls execute. We need to cover those calls with group0_guard as well and also move retry loop up. This patch does it by introducing new function to cql_statement class take_guard(). Schema altering statements return group0 guard while others do not return any guard. Query processor takes this guard at the beginning of a statement execution and retries if service::group0_concurrent_modification is thrown. The guard is passed to the execute in query_state structure. Fixes: #13942 Message-ID: <ZNSWF/cHuvcd+g1t@scylladb.com>	2023-08-13 14:19:39 +03:00
Pavel Emelyanov	64ddc9e4b4	cql_test_env: Drop local *this reference The auto& env = *this is also now excessive, so drop it Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-12 15:30:34 +03:00
Pavel Emelyanov	de679d7c36	cql_test_env: Drop local references The local auto& foo = env._foo references in run_in_thread() are no longer needed, the code that uses foo can be switched to use _foo (this->_foo) instead Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-12 15:29:42 +03:00
Pavel Emelyanov	487ecae517	cql_test_env: Move most of the stuff in run_in_thread() The do_with() method is static and cannot just access cql_test_env variable's fields, using local references instead. To simplify this, most of the method's content is moved to non-static run_in_thread() method Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-12 15:28:40 +03:00
Pavel Emelyanov	2c175660f2	cql_test_env: Open-code env start/stop and remove both These two just make more churn in next patch, so drop both Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-12 15:28:03 +03:00
Pavel Emelyanov	10f9292fe8	cql_test_env: Keep other services as class variables There are more services on do_with() stack that are not referenced from the cql_test_env. Move them to be class variables too Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-12 15:27:19 +03:00
Pavel Emelyanov	08a3be3b17	cql_test_env: Keep services as class variables Now they are duplicated -- variables exist on do_with() stack and the class references some of them. This patch makes it vice versa -- all the variables are on the cql_test_env and do_with() references them. The latter will change soon Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-12 15:26:21 +03:00
Pavel Emelyanov	b31d2097b8	cql_test_env: Construct env early Its constructor is _just_ assigning references and setting up rlimits. Both can happen early Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-08-12 15:25:49 +03:00


5469 Commits