scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-07 15:33:15 +00:00

Author	SHA1	Message	Date
Avi Kivity	4b1ef00dbb	Merge 'File stream for tablet preparation' from Asias He This series adds preparation patches for file stream tablet implementation in enterprise branch. It minimizes the differences between those two branches. Closes scylladb/scylladb#16297 * github.com:scylladb/scylladb: messaging_service: Introduce STREAM_BLOB and TABLET_STREAM_FILES verb compaction_group_for_token: Handle minimum_token and maximum_token token serializer: Add temporary_buffer support cql_test_env: Allow messaging_service to start listen	2023-12-07 16:26:22 +02:00
Avi Kivity	ed2a9b8750	Merge 'Commitlog: Fix reading/writing position calculations and allocation size checks' from Calle Wilund Fixes #16298 The adjusted buffer position calculation in buffer_position(), introduced in https://github.com/scylladb/scylladb/pull/15494 was in fact broken. It calculated (like previously) a "position" based on diff between underlying buffer size and ostream size() (i.e. avail), then adjusted this according to sector overhead rules. However, the underlying buffer size is in unadjusted terms, and the ostream is adjusted. The two cannot be compared as such, which means the "positions" we get here are borked. Luckily for us (sarcasm), the position calculation in replayer made a similar error, in that it adjusts up current position by one sector overhead too much, leading to us more or less getting the same, erroneous results in both ends. However, when/iff one needs to adjust the segment file format further, one might very quickly realize that this does not work well if, say, one needs to be able to safely read some extra bytes before first chunk in a segment. Conversely, trying to adjust this also exposes a latent potential error in the skip mechanism, manifesting here. Issue fixed by keeping track of the initial ostream capacity for segment buffer, and use this for position calculation, and in the case of replayer, move file pos adjustment from read_data() to subroutine (shared with skipping), that better takes data stream position vs. file position adjustment. In implementation terms, we first inc the "data stream" pos (i.e. pos in data without overhead), then adjust for overhead. Also fix replayer::skip, so that we handle the buffer/pos relation correctly now. Added test for initial entry position, as well as data replay consistency for single entry_writer paths. Fixes #16301 The calculation on whether data may be added is based on position vs. size of incoming data. 
However, it did not take sector overhead into account, which led us to write past the allowed segment end, which in turn also leads to metrics overflows. Closes scylladb/scylladb#16302 * github.com:scylladb/scylladb: commitlog: Fix allocation size check to take sector overhead into account. commitlog: Fix commitlog_segment::buffer_position() calculation and replay counterpart	2023-12-07 12:27:54 +02:00
Botond Dénes	fb9379edf1	test/cql-pytest: test_select_from_mutation_fragments: bump timeout for slow test The test test_many_partitions is very slow, as it tests a slow scan over a lot of partitions. This was observed to time out on the slower ARM machines, making the test flaky. To prevent this, create an extra-patient cql connection with a 10 minutes timeout for the scan itself. Fixes: #16145 Closes scylladb/scylladb#16303	2023-12-07 11:55:53 +02:00
Pavel Emelyanov	76705b6ba2	test/s3: Avoid object range overflow There's a test case that validates the uploading sink by getting random portions of the uploaded object. The portions are generated as `len = random % chunk_size` and `off = random % file_size - len`. The latter may apparently yield a negative value, which translates into a huge 64-bit range offset, which, in turn, results in an invalid HTTP range specifier, and getting the object part fails with status OK Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-12-07 10:54:54 +03:00
Calle Wilund	dba39b47bd	commitlog: Fix allocation size check to take sector overhead into account. Fixes #16301 The calculation on whether data may be added is based on position vs. size of incoming data. However, it did not take sector overhead into account, which led us to write past the allowed segment end, which in turn also leads to metrics overflows.	2023-12-07 07:36:27 +00:00
Calle Wilund	0d35c96ef4	commitlog: Fix commitlog_segment::buffer_position() calculation and replay counterpart Fixes #16298 The adjusted buffer position calculation in buffer_position(), introduced in #15494 was in fact broken. It calculated (like previously) a "position" based on diff between underlying buffer size and ostream size() (i.e. avail), then adjusted this according to sector overhead rules. However, the underlying buffer size is in unadjusted terms, and the ostream is adjusted. The two cannot be compared as such, which means the "positions" we get here are borked. Luckily for us (sarcasm), the position calculation in replayer made a similar error, in that it adjusts up current position by one sector overhead too much, leading to us more or less getting the same, erroneous results in both ends. However, when/iff one needs to adjust the segment file format further, one might very quickly realize that this does not work well if, say, one needs to be able to safely read some extra bytes before first chunk in a segment. Conversely, trying to adjust this also exposes a latent potential error in the skip mechanism, manifesting here. Issue fixed by keeping track of the initial ostream capacity for segment buffer, and use this for position calculation, and in the case of replayer, move file pos adjustment from read_data() to subroutine (shared with skipping), that better takes data stream position vs. file position adjustment. In implementation terms, we first inc the "data stream" pos (i.e. pos in data without overhead), then adjust for overhead. Also fix replayer::skip, so that we handle the buffer/pos relation correctly now. Added test for initial entry position, as well as data replay consistency for single entry_writer paths.	2023-12-07 07:36:27 +00:00
Asias He	faaf58f62c	cql_test_env: Allow messaging_service to start listen This is needed for rpc calls to work in the tests. With this patch, by default, messaging_service does not listen as it was before. This is useful for file stream for tablet test.	2023-12-07 09:46:36 +08:00
Tomasz Grabiec	7d0f4c10a2	test: tablets: Add test for failed streaming being fenced away	2023-12-06 18:37:01 +01:00
Tomasz Grabiec	733eb21601	api: Add API to kill connection to a particular host For testing failure scenarios.	2023-12-06 18:36:17 +01:00
Tomasz Grabiec	d1c1b59236	storage_service, api: Add API to disable tablet balancing Load balancing needs to be disabled before making a series of manual migrations so that we don't fight with the load balancer. Also will be used in tests to ensure tablets stick to expected locations.	2023-12-06 18:36:17 +01:00
Tomasz Grabiec	1f57d1ea28	storage_service, api: Add API to migrate a tablet Will be used in tests, or for hot fixes in production.	2023-12-06 18:36:17 +01:00
Tomasz Grabiec	5381792401	tablets: Add per-tablet session id field to tablet metadata range_streamer will pick it up when creating topology_guard. It's materialized in memory only for migrating tablets in tablet_transition_info.	2023-12-06 18:36:17 +01:00
Botond Dénes	d2a88cd8de	Merge 'Typos: fix typos in code' from Yaniv Kaul Fixes some more typos as found by codespell run on the code. In this commit, there are more user-visible errors. Refs: https://github.com/scylladb/scylladb/issues/16255 Closes scylladb/scylladb#16289 * github.com:scylladb/scylladb: Update unified/build_unified.sh Update main.cc Update dist/common/scripts/scylla-housekeeping Typos: fix typos in code	2023-12-06 07:36:41 +02:00
Avi Kivity	12f160045b	Merge 'Get rid of fb_utilities' from Benny Halevy utils::fb_utilities is a global in-memory registry for storing and retrieving broadcast_address and broadcast_rpc_address. As part of the effort to get rid of all global state, this series gets rid of fb_utilities. This will eventually allow e.g. cql_test_env to instantiate multiple scylla server nodes, each serving on its own address. Closes scylladb/scylladb#16250 * github.com:scylladb/scylladb: treewide: get rid of now unused fb_utilities tracing: use locator::topology rather than fb_utilities streaming: use locator::topology rather than fb_utilities raft: use locator::topology/messaging rather than fb_utilities storage_service: use locator::topology rather than fb_utilities storage_proxy: use locator::topology rather than fb_utilities service_level_controller: use locator::topology rather than fb_utilities misc_services: use locator::topology rather than fb_utilities migration_manager: use messaging rather than fb_utilities forward_service: use messaging rather than fb_utilities messaging_service: accept broadcast_addr in config rather than via fb_utilities messaging_service: move listen_address and port getters inline test: manual: modernize message test table: use gossiper rather than fb_utilities repair: use locator::topology rather than fb_utilities dht/range_streamer: use locator::topology rather than fb_utilities db/view: use locator::topology rather than fb_utilities database: use locator::topology rather than fb_utilities db/system_keyspace: use topology via db rather than fb_utilities db/system_keyspace: save_local_info: get broadcast addresses from caller db/hints/manager: use locator::topology rather than fb_utilities db/consistency_level: use locator::topology rather than fb_utilities api: use locator::topology rather than fb_utilities alternator: ttl: use locator::topology rather than fb_utilities gossiper: use locator::topology rather than fb_utilities gossiper: add 
get_this_endpoint_state_ptr test: lib: cql_test_env: pass broadcast_address in cql_test_config init: get_seeds_from_db_config: accept broadcast_address locator: replication strategies: use locator::topology rather than fb_utilities locator: topology: add helpers to retrieve this host_id and address snitch: pass broadcast_address in snitch_config snitch: add optional get_broadcast_address method locator: ec2_multi_region_snitch: keep local public address as member ec2_multi_region_snitch: reindent load_config ec2_multi_region_snitch: coroutinize load_config ec2_snitch: reindent load_config ec2_snitch: coroutinize load_config thrift: thrift_validation: use std::numeric_limits rather than fb_utilities	2023-12-05 19:40:14 +02:00
Benny Halevy	0bcce35abd	treewide: get rid of now unused fb_utilities Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-05 16:22:49 +02:00
Yaniv Kaul	ae2ab6000a	Typos: fix typos in code Fixes some more typos as found by codespell run on the code. In this commit, there are more user-visible errors. Refs: https://github.com/scylladb/scylladb/issues/16255	2023-12-05 15:18:11 +02:00
Tomasz Grabiec	d3d83869ce	storage_service: Introduce session concept	2023-12-05 14:09:34 +01:00
Botond Dénes	5fb0d667cb	tools/scylla-sstable: always read scylla.yaml Currently, scylla.yaml is read conditionally, if either the user provided `--scylla-yaml-file` command line parameter, or if deducing the data dir location from the sstable path failed. We want the scylla.yaml file to be always read, so that when working with encrypted files (enterprise), scylla-sstable can pick up the configuration for the encryption. This patch makes scylla-sstable always attempt to read the scylla.yaml file, whether the user provided a location for it or not. When not, the default location is used (also considering the `SCYLLA_CONF` and `SCYLLA_HOME` environment variables). Failing to find the scylla.yaml file is not considered an error. The rationale is that the user will discover this if they attempt to do an operation that requires this anyway. There is a debug-level log about whether it was successfully read or not. Fixes: #16132 Closes scylladb/scylladb#16174	2023-12-05 15:06:29 +02:00
Benny Halevy	6c00c9a45d	raft: use locator::topology/messaging rather than fb_utilities Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-05 13:26:46 +02:00
Kamil Braun	52ae6b8738	Merge 'fix shutdown order between group0 and storage service' from Gleb Storage service uses group0 internally, but group0 is created long after storage service is initialized and passed to it using ss::set_group0() function. What it means is that during shutdown group0 is destroyed before ss::stop() is called and thus storage service is left with a dangling reference. Fix it by introducing a function that cancels all group0 operations and waits for background fibers to complete. For that we need separate abort source for group0 operation which the patch series also introduces. * 'gleb/group0-ss-shutdown' of github.com:scylladb/scylla-dev: storage_service: topology coordinator: ignore abort_requested_exception in background fibers storage_service: fix de-initialization order between storage service and group0_service	2023-12-05 11:20:52 +01:00
Benny Halevy	984a576405	messaging_service: accept broadcast_addr in config rather than via fb_utilities Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-05 09:46:25 +02:00
Benny Halevy	eabd4570da	test: manual: modernize message test Basically, make it work (great) again. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-05 09:44:26 +02:00
Benny Halevy	4bb4d673c3	db/system_keyspace: save_local_info: get broadcast addresses from caller So not to rely on fb_utilities. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-05 08:42:49 +02:00
Benny Halevy	f3e0358563	gossiper: use locator::topology rather than fb_utilities And add `get_endpoint_state_ptr` for this_node. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-05 08:42:49 +02:00
Benny Halevy	21ace44f03	test: lib: cql_test_env: pass broadcast_address in cql_test_config For getting rid of fb_utilities. In the future, that could be used to instantiate multiple scylla node instances. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-05 08:42:49 +02:00
Benny Halevy	4d461fc788	locator: replication strategies: use locator::topology rather than fb_utilities Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-05 08:42:49 +02:00
Benny Halevy	86716b2048	locator: topology: add helpers to retrieve this host_id and address And respective `is_me()` predicates, to prepare for getting rid of fb_utilities. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-05 08:42:49 +02:00
Benny Halevy	52412087b7	snitch: pass broadcast_address in snitch_config To untangle snitch from fb_utilities. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-05 08:42:49 +02:00
Kefu Chai	a03be17da7	test/boost/sstable_generation_test: s/LE/LT/ when appropriate in `7a1fbb38`, a new test is added to an existing test for comparing the UUIDs with different time stamps, but we should tighten the test a little bit to reflect the intention of the test: the timestamp of "2023-11-24 23:41:56" should be less than "2023-11-24 23:41:57". in this change, we replace LE with LT to correct it. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16245	2023-12-05 08:25:04 +03:00
Patryk Jędrzejczak	c8ee7d4499	db: make schema commitlog feature mandatory Using consistent cluster management and not using schema commitlog ends with a bad configuration throw during bootstrap. Soon, we will make consistent cluster management mandatory. This forces us to also make schema commitlog mandatory, which we do in this patch. A booting node decides to use schema commitlog if at least one of the two statements below is true: - the node has `force_schema_commitlog=true` config, - the node knows that the cluster supports the `SCHEMA_COMMITLOG` cluster feature. The `SCHEMA_COMMITLOG` cluster feature has been added in version 5.1. This patch is supposed to be a part of version 6.0. We don't support a direct upgrade from 5.1 to 6.0 because it skips two versions - 5.2 and 5.4. So, in a supported upgrade we can assume that the version which we upgrade from has schema commitlog. This means that we don't need to check the `SCHEMA_COMMITLOG` feature during an upgrade. The reasoning above also applies to Scylla Enterprise. Version 2024.2 will be based on 6.0. Probably, we will only support an upgrade to 2024.2 from 2024.1, which is based on 5.4. But even if we support an upgrade from 2023.x, this patch won't break anything because 2023.1 is based on 5.2, which has schema commitlog. Upgrades from 2022.x definitely won't be supported. When we populate a new cluster, we can use the `force_schema_commitlog=true` config to use schema commitlog unconditionally. Then, the cluster feature check is irrelevant. This check could fail because we initiate schema commitlog before we learn about the features. The `force_schema_commitlog=true` config is especially useful when we want to use consistent cluster management. Failing feature checks would lead to crashes during initial bootstraps. Moreover, there is no point in creating a new cluster with `consistent_cluster_management=true` and `force_schema_commitlog=false`. 
It would just cause some initial bootstraps to fail, and after successful restarts, the result would be the same as if we used `force_schema_commitlog=true` from the start. In conclusion, we can unconditionally use schema commitlog without any checks in 6.0 because we can always safely upgrade a cluster and start a new cluster. Apart from making schema commitlog mandatory, this patch adds two changes that are its consequences: - making the unneeded `force_schema_commitlog` config unused, - deprecating the `SCHEMA_COMMITLOG` feature, which is always assumed to be true. Closes scylladb/scylladb#16254	2023-12-04 21:02:16 +02:00
Calle Wilund	e94070db64	commitlog_test: Add test for commit log replay skip past EOF Refs #15269 Unit test to check that trying to skip past EOF in a borked segment will not crash the process. file_data_input_impl asserts iff caller tries this.	2023-12-04 20:50:42 +02:00
Yaniv Kaul	21cce458d8	test: alternator: fix typo passs instead of pass in test_gsi.py Fix a typo. Refs: https://github.com/scylladb/scylladb/issues/16255 Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com> Closes scylladb/scylladb#16258	2023-12-04 18:58:31 +02:00
Nadav Har'El	4505a86f46	tablets, mv: fix base-view pairing to consider base replication map In the view update code, the function get_view_natural_endpoint() determines which view replica this base replica should send an update to. It currently gets the view table's replication map (i.e., the map from view tokens to lists of replicas holding the token), but assumes that this is also the base table's replication map. This assumption was true with vnodes, but is no longer true with tablets - the base table's replication map can be completely different from the view table's. By looking at the wrong mapping, get_view_natural_endpoint() can believe that this node isn't really a base-replica and drop the view update. Alternatively, it can think it is a base replica - but use the wrong base-view pairing and create base-view inconsistencies. This patch solves this bug - get_view_natural_endpoint() now gets two separate replication maps - the base's and the view's. The callers need to remember what the base table was (in some cases they didn't care at the point of the call), and pass it to the function call. This patch also includes a simple test that reproduces the bug, and confirms it is fixed: The test has a 6-node cluster using tablets and a base table with RF=1, and writes one row to it. Before this patch, the code usually gets confused, thinking the base replica isn't a replica and loses the view update. With this patch, the view update works. Fixes #16227. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#16228	2023-12-04 16:38:54 +02:00
Avi Kivity	60af2f3cb2	Merge 'New commitlog file format using tagged pages' from Calle Wilund Prototype implementation of format suggested/requested by @avikivity: Divides segments into disk-write-alignment sized pages, each tagged with segment ID + CRC of data content. When read, we both verify sector integrity (CRC) to detect corruption, as well as matching ID read with expected one. If the latter mismatches we have a prematurely terminated segment (read truncation), which, depending on whether the CL is written in batch or periodic mode, as well as explicit sync, can mean data loss. Note: all-zero pages are treated as kosher, both to align with newly allocated segments, as well as fully terminated (zero-page) ones. Note: This is a preview/RFC - the rest of the file format is not modified. At least parts of entry CRC could probably be removed, but I have not done so yet (needs some thinking). Note: Some slight abstraction breaks in impl. and probably less than maximal efficiency. v2: * Removed entry CRC:s in file format. * Added docs on format v3 * Added one more test for recycling-truncation v3: * Fixed typos in size calc and docs * Changed sect metadata order * Explicit iter type Closes scylladb/scylladb#15494 * github.com:scylladb/scylladb: commitlog_test: Add test for replaying large-ish mutation commitlog_test: Add additional test for segmnent truncation docs: Add docs on commitlog format 3 commitlog: Remove entry CRC from file format commitlog: Implement new format using CRC:ed sectors commitlog: Add iterator adaptor for doing buffer splitting into sub-page ranges fragmented_temporary_buffer: Add const iterator access to underlying buffers commitlog_replayer: differentiate between truncated file and corrupt entries	2023-12-04 13:31:13 +01:00
Avi Kivity	8fa2e3ad2a	Merge 'Remove sstables::remove_by_toc_name()' from Pavel Emelyanov The helper in question complicates the logic of sstable_directory::process() by making garbage collection differently for sstables deleted "atomically" and deleted "one-by-one". Also, the code that deletes sstables one-by-one and uses remove_by_toc_name() renders excessive TOC file reading, because there's sstable object at hand and it had all_components() ready for use. Surprisingly, there was no test for the deletion-log functionality. This PR adds one. The test passes before the g.c. and regular unlink fix, and (of course) continues passing after it. Closes scylladb/scylladb#16240 * github.com:scylladb/scylladb: sstables: Drop remove_by_name() sstables/fs_storage: Wipe by recognized+unrecognized components sstable_directory: Enlight deletion log replay sstables: Split remove_by_toc_name() test: Add test case to validate deletion log work sstable_directory: Close dir on exception sstable_directory: Fix indentation after previous patch sstable_directory: Coroutinize delete_with_pending_deletion_log() test: Sstable on_delete() is not necessarily in a thread sstable_directory: Split delete_with_pending_deletion_log()	2023-12-03 17:29:34 +02:00
Nadav Har'El	59ff27ea4a	Merge 'Typos: fix typos in comments' from Yaniv Kaul Fixes some typos as found by codespell run on the code. In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc. Follow-up commits will take care of them. Refs: https://github.com/scylladb/scylladb/issues/16255 Closes scylladb/scylladb#16257 * github.com:scylladb/scylladb: Update service/topology_state_machine.hh Update raft/tracker.hh Update db/view/view.cc Typos: fix typos in comments	2023-12-03 11:23:51 +02:00
Yaniv Kaul	c658bdb150	Typos: fix typos in comments Fixes some typos as found by codespell run on the code. In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc. Follow-up commits will take care of them. Refs: https://github.com/scylladb/scylladb/issues/16255 Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2023-12-02 22:37:22 +02:00
Kamil Braun	01e54f5b12	Merge 'test: delete topology_raft_disabled suite' from Patryk Jędrzejczak This PR is a necessary step to fix #15854 -- making consistent cluster management mandatory on master. Before making consistent cluster management mandatory, we have to get rid of all tests that depend on the `consistent_cluster_management=false` config. These are the tests in the `topology_raft_disabled` suite. There's the internal Raft upgrade procedure, which is the bulk of the upgrade logic. Then, there are two thin "layers" around it that invoke it underneath: recovery procedure and enable-raft-in-the-cluster procedure. We're getting rid of the second one by making Raft always enabled, so we naturally have to get rid of tests that depend on it. The idea is to replace every necessary enable-raft-in-the-cluster procedure in these tests with the recovery procedure. Then, we will still be testing the internal Raft upgrade procedure in the in-tree tests. The enable-raft-in-the-cluster procedure is already tested by QA tests, so we don't need to worry about these changes. Unfortunately, we cannot adapt `test_raft_upgrade_no_schema`. After making consistent cluster management mandatory on master, schema commitlog will also become mandatory because `consistent_cluster_management: True`, `force_schema_commit_log: False` is considered a bad configuration. These changes will make `test_raft_upgrade_no_schema` unimplementable in the Scylla repo. Therefore, we remove this test. If we want to keep it, we must rewrite it as an upgrade dtest. After making all tests in `topology_raft_disabled` use consistent cluster management, there is no point in keeping this suite. Therefore, we delete it and move all the tests to `topology_custom`. 
Closes scylladb/scylladb#16192 * github.com:scylladb/scylladb: test: delete topology_raft_disabled suite test: topology_raft_disabled: move tests to topology_custom suite test: topology_raft_disabled: move utils to topology suite test: topology_raft_disabled: use consistent cluster management test: topology_raft_disabled: add new util functions test: topology_raft_disabled: delete test_raft_upgrade_no_schema	2023-12-01 17:11:32 +01:00
Pavel Emelyanov	b10ca96e07	test: Add test case to validate deletion log work The test sequence is - create several sstables - create deletion log for a sub-set of them - partially unlink smaller sub-sub-set - make sstable directory do the processing with g.c. - check that the sstables loaded do NOT include the deleted ones The .throw_on_missing_toc bit set additionally validates that the directory doesn't contain garbage not attached to any other TOCs Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-12-01 18:20:20 +03:00
Pavel Emelyanov	92f0aa04d0	test: Sstable on_delete() is not necessarily in a thread One of the test cases injects an observer into sstable->unlink() method via its _on_delete() callback. The test's callback assumes that it runs in an async context, but it's a happy coincidence, because deletion via the deletion log runs so. Next patch is changing it and the test case will no longer work. But since it's a test case it can just directly call a libc function for its needs Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-12-01 15:00:38 +03:00
Nadav Har'El	49860952f9	Merge '`LIST EFFECTIVE SERVICE LEVEL` statement' from Michał Jadwiszczak Add `LIST EFFECTIVE SERVICE LEVEL` statement to display which service level each effective service level option comes from. Example: There are 2 roles: role1 and role2. Role1 is assigned with sl1 (timeout = 2s, workload_type = interactive) and role2 is assigned with sl2 (timeout = 10s, workload_type = batch). Then, if we grant role1 to role2, the user with role2 will have 2s timeout (from sl1) and batch workload type (from sl2). ``` > LIST EFFECTIVE SERVICE LEVEL OF role2; service_level_option | effective_service_level | value ----------------------+-------------------------+------------- workload_type | sl2 | batch timeout | sl1 | 2s ``` Fixes: https://github.com/scylladb/scylladb/issues/15604 Closes scylladb/scylladb#14431 * github.com:scylladb/scylladb: cql-pytest: add `LIST EFFECTIVE SERVICE LEVEL OF` test docs: add `LIST EFFECTIVE SERVICE LEVEL` statement docs cql3:statements: add `LIST EFFECTIVE SERVICE LEVEL` statement service:qos: add option to include effective names to SLO	2023-11-30 18:12:52 +02:00
Gleb Natapov	8ed8b151da	storage_service: fix de-initialization order between storage service and group0_service Storage service uses group0 internally, but group0 is created long after storage service is initialized and passed to it using ss::set_group0() function. But what it means is that during shutdown group0 is destroyed before ss::stop() is called and thus storage service is left with a dangling reference. Fix it by introducing a function that cancels all group0 operations and waits for background fibers to complete. For that we need separate abort source for group0 operation which the patch also introduces.	2023-11-30 17:52:38 +02:00
Patryk Jędrzejczak	77c4ee92e5	test: delete topology_raft_disabled suite After moving all tests out of topology_raft_disabled, we can safely remove this suite.	2023-11-30 15:50:22 +01:00
Patryk Jędrzejczak	ba990d90bb	test: topology_raft_disabled: move tests to topology_custom suite We move the remaining tests in topology_raft_disabled to topology_custom. We choose topology_custom because these tests cannot use consistent topology changes. We need to modify these tests a bit because we cannot pass RandomTables to a test case function if the initial cluster size equals 0. RandomTables.__init__ requires manager.cql to be present.	2023-11-30 15:50:22 +01:00
Patryk Jędrzejczak	659ac9c7f5	test: topology_raft_disabled: move utils to topology suite We move all used util functions from topology_raft_disabled to topology before we remove topology_raft_disabled. After this change, util.py in topology will be the single util file for all topology tests. Some util functions in topology_raft_disabled aren't used anymore. We don't move such functions and remove them instead.	2023-11-30 15:50:22 +01:00
Patryk Jędrzejczak	684b070b20	test: topology_raft_disabled: use consistent cluster management Soon, we will make consistent cluster management mandatory on master. Before this, we have to change all tests in the topology_raft_disabled suite so that they do not depend on the consistent_cluster_management=false config. Adapting test_raft_upgrade_majority_loss is simple. We only have to get rid of the initial upgrade. This initial upgrade didn't test anything. Every test in topology_raft_disabled had to do it at the beginning because of consistent_cluster_management=false. Adapting test_raft_upgrade_basic and test_raft_upgrade_stuck is more difficult. It requires changing the initial upgrade to clearing Raft data in RECOVERY mode on all servers and restarting them. Then, the servers will run the same upgrade procedure as before. After changing the tests, we also update their names appropriately. test_raft_upgrade_stuck becomes a bit slower, so we remove the comment about running time. Also, one TODO was fixed in the process of rewriting the test. This fix forced us to skip the test in the release mode since we cannot update the list of error injections through manager.server_update_config in this mode.	2023-11-30 15:50:22 +01:00
Patryk Jędrzejczak	1059fece19	test: topology_raft_disabled: add new util functions They are shorter and more readable than long CQL queries. We use them even more in the following commit.	2023-11-30 15:50:22 +01:00
Patryk Jędrzejczak	7e43ebf88e	test: topology_raft_disabled: delete test_raft_upgrade_no_schema After making consistent cluster management mandatory on master, schema commitlog will also become mandatory because consistent_cluster_management: True, force_schema_commit_log: False is considered a bad configuration. These changes will make test_raft_upgrade_no_schema unimplementable in the Scylla repo, so we remove it. If we want to keep this test, we must rewrite it as an upgrade dtest.	2023-11-30 15:50:21 +01:00
Kefu Chai	7a1fbb38f9	sstable: order uuid-based generation as timeuuid under most circumstances, we don't care about the ordering of the sstable identifiers, as they are just identifiers. so, as long as they can be compared, we are good. but we have tests which expect that the sstables can be ordered by the time they are created. for instance, sstable_run_based_compaction_test has this expectation. before this change, we compare two UUID-based generations by their (MSB, LSB) lexicographically. but UUID v1 puts the lower bits of the timestamp at the higher bits of MSB, so the ordering of the "time" in timeuuid is not preserved when comparing the UUID-based generations. this breaks the test of sstable_run_based_compaction_test, which feeds the sstables to be compacted in a set, and the set is ordered with the generation of the sstables. after this change, we consider the UUID-based generation as a timeuuid when comparing them. Fixes #16215 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16238	2023-11-30 14:50:44 +02:00
Michał Jadwiszczak	e3515cfc1b	cql-pytest: add `LIST EFFECTIVE SERVICE LEVEL OF` test	2023-11-30 13:07:20 +01:00
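
The ordering problem described in commit 7a1fbb38f9 (sstable: order uuid-based generation as timeuuid) can be illustrated with a small Python sketch. It is not the ScyllaDB implementation: the helper names `make_v1`, `msb_lsb_key` and `timeuuid_key` are hypothetical, and the tie-breaking on the low 64 bits is an assumption made only for this example. It builds two version-1 UUIDs whose timestamps differ by one tick and shows that an (MSB, LSB) comparison puts them in the wrong order, while comparing by the embedded timestamp does not.

```python
import uuid

LOW_64_MASK = (1 << 64) - 1


def make_v1(timestamp_100ns: int, clock_seq: int = 0,
            node: int = 0x0123456789AB) -> uuid.UUID:
    """Build a version-1 UUID from a raw 60-bit timestamp (hypothetical helper)."""
    time_low = timestamp_100ns & 0xFFFFFFFF            # lowest 32 timestamp bits
    time_mid = (timestamp_100ns >> 32) & 0xFFFF        # middle 16 timestamp bits
    time_hi_version = ((timestamp_100ns >> 48) & 0x0FFF) | (1 << 12)  # version 1
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80           # RFC 4122 variant
    clock_seq_low = clock_seq & 0xFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


def msb_lsb_key(u: uuid.UUID) -> tuple:
    """Sort key comparing the raw (MSB, LSB) halves lexicographically."""
    return (u.int >> 64, u.int & LOW_64_MASK)


def timeuuid_key(u: uuid.UUID) -> tuple:
    """Sort key comparing by embedded timestamp first.

    Using the low 64 bits as a tie-breaker is an assumption of this sketch.
    """
    return (u.time, u.int & LOW_64_MASK)


# Two timestamps one tick apart, chosen so that time_low wraps around to zero.
older = make_v1(0x1_FFFF_FFFF)
newer = make_v1(0x2_0000_0000)

by_raw_bits = sorted([newer, older], key=msb_lsb_key)
by_timestamp = sorted([newer, older], key=timeuuid_key)
```

Because `time_low` occupies the most significant bits of the MSB, `by_raw_bits` places `newer` before `older`, whereas `by_timestamp` keeps creation order.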


11801 Commits