scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-28 10:41:12 +00:00

Author	SHA1	Message	Date
Calle Wilund	78d9dda060	config: break out object_storage_endpoint_param preparing for multi storage Moves the config wrapper to own file (to reduce recompilation for modifying) and refactors to handle extending this parameter to non-s3 endpoint configs.	2025-10-13 08:53:24 +00:00
Piotr Wieczorek	8cd9f5d271	test/alternator: Add a Streams test reproducing #26382 This commit adds a test that reproduces an issue, wherein OldImage isn't included in the REMOVE events produced by Alternator Streams. Refs https://github.com/scylladb/scylladb/issues/26382 Closes scylladb/scylladb#26383	2025-10-12 11:09:57 +03:00
Piotr Wieczorek	a55c5e9ec7	alternator: Correct RCU undercount in BatchGetItem The `describe_multi_item` function treated the last reference-captured argument as the number of used RCU half units. The caller `batch_get_item`, however, expected this parameter to hold an item size. This RCU value was then passed to `rcu_consumed_capacity_counter::get_half_units`, treating the already-calculated RCU integer as if it were a size in bytes. This caused a second conversion that undercounted the true RCU. During conversion, the number of bytes is divided by `RCU_BLOCK_SIZE_LENGTH` (=4KB), so the double conversion divided the number of bytes by 16 MB. The fix removes the second conversion in `describe_multi_item` and changes the API of `describe_multi_item`. Fixes: https://github.com/scylladb/scylladb/pull/25847 Closes scylladb/scylladb#25842	2025-10-12 10:42:32 +03:00
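The double conversion described in the commit above can be illustrated with a small arithmetic sketch. The helper name `half_units_from_bytes` and its round-up rule are assumptions made for illustration only; the source states just that the conversion divides the byte count by `RCU_BLOCK_SIZE_LENGTH` (4 KB).

```python
import math

RCU_BLOCK_SIZE = 4 * 1024  # 4 KB, mirroring RCU_BLOCK_SIZE_LENGTH from the commit message


def half_units_from_bytes(size_bytes: int) -> int:
    """Hypothetical stand-in for the conversion: one unit per started 4 KB block."""
    return math.ceil(size_bytes / RCU_BLOCK_SIZE)


item_bytes = 40 * 1024 * 1024  # 40 MB of items, chosen only for illustration

# Intended behaviour: convert the byte count once.
correct = half_units_from_bytes(item_bytes)

# The bug: the already-converted value is fed back in as if it were bytes,
# so the original size is effectively divided by 4 KB twice.
double_converted = half_units_from_bytes(correct)

print(correct, double_converted)
```

Under these assumptions, the second conversion collapses a five-digit count to a single-digit one, which is the undercount the fix removes.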
Patryk Jędrzejczak	5f68b9dc6b	test: test_raft_no_quorum: test_can_restart: deflake the read barrier call Expecting the group 0 read barrier to succeed with a timeout of 1s, just after restarting 3 out of 5 voters, turned out to be flaky. In some unlikely scenarios, such as multiple vote splits, the Raft leader election could finish after the read barrier times out. To deflake the test, we increase the timeout of Raft operations back to 300s for read barriers we expect to succeed. Fixes #26457 Closes scylladb/scylladb#26489	2025-10-10 15:22:39 +03:00
Asias He	13dd88b010	repair: Rename incremental mode name Using the name regular as the incremental mode could be confusing, since regular might be interpreted as the non-incremental repair. It is better to use incremental directly. Before: - regular (standard incremental repair) - full (full incremental repair) - disabled (incremental repair disabled) After: - incremental (standard incremental repair) - full (full incremental repair) - disabled (incremental repair disabled) Fixes #26503 Closes scylladb/scylladb#26504	2025-10-10 15:21:54 +03:00
Michał Chojnowski	85fd4d23fa	test_sstable_compression_dictionaries_basic: reconnect robustly after node reboots Using `driver_connect()` after a cluster restart isn't enough to ensure full CQL availability, but the test assumes that it is. Fix that by making the test wait for CQL availability via `get_ready_cql()`. Also, replace some manual usages of wait_for_cql_and_get_hosts with `get_ready_cql()` too. Fixes scylladb/scylladb#25362 Closes scylladb/scylladb#25366	2025-10-10 14:27:02 +03:00
Avi Kivity	55d4d39ae3	Merge 'transport: service_level_controller: create and use driver service level' from Andrzej Jackowski This is a cherry-pick of https://github.com/scylladb/scylladb/pull/25412 commits, as the changes were reverted in 364316dd2f2212bbbb446eaa2a4b0bd53d125ad5 due to https://github.com/scylladb/scylladb/issues/26163. The underlying problem (https://github.com/scylladb/scylladb/issues/26190) was fixed in seastar (https://github.com/scylladb/seastar/pull/2994), so https://github.com/scylladb/scylladb/pull/25412 commits are restored without changes (only rebase conflicts were resolved). === This patch series: - Increases the number of allowed scheduling groups to allow creation of `sl:driver` - Implements `create_driver_service_level` that creates `sl:driver` with shares=200 if it wasn't already created - Implements creation of `sl:driver` for new systems and tests in `raft_initialize_discovery_leader` - Modifies `topology_coordinator` to create `sl:driver` after upgrades. - Implements using `sl:driver` for new connections in `transport/server` - Adds to `transport/server` recognition of driver's control connections and forcing them to keep using `sl:driver`. - Adds tests to verify the new functionality - Modifies existing tests to let them pass after `sl:driver` is added - Modifies the documentation to contain new `sl:driver` The changes were evaluated by a test with the following scenario ([test_connections-sl-driver.py](https://github.com/user-attachments/files/22021273/test_connections-sl-driver.py)): - Start ScyllaDB with one node - Create 1000 keyspaces, 1 table in each keyspace - Start `cassandra-stress` (`-rate threads=50 -mode native cql3`) - Run connection storm with 1000 sessions (100 python processes, 10 sessions each) The maximum latency during connection storm dropped from 224.94ms to 41.43ms (these numbers are averages over 20 test executions, where max latency was in [140ms, 361ms] before change and [31.4ms, 61.5ms] after).
The snippet of cassandra-stress output from the moment of connection storm: Before:
```
type total ops, op/s, pk/s, row/s, mean, med, .95, .99, .999, max, time, stderr, errors, gc: #, max ms, sum ms, sdv ms, mb
...
total, 789206, 85887, 85887, 85887, 0.6, 0.3, 2.0, 2.0, 2.5, 5.0, 9.0, 0.09679, 0, 0, 0, 0, 0, 0
total, 909322, 120116, 120116, 120116, 0.4, 0.2, 1.9, 2.0, 2.1, 3.1, 10.0, 0.09053, 0, 0, 0, 0, 0, 0
total, 964392, 55070, 55070, 55070, 0.9, 0.4, 2.0, 4.5, 7.7, 18.9, 11.0, 0.09203, 0, 0, 0, 0, 0, 0
total, 975705, 11313, 11313, 11313, 4.4, 3.5, 6.5, 24.5, 82.7, 83.0, 12.0, 0.11713, 0, 0, 0, 0, 0, 0
total, 987548, 11843, 11843, 11843, 4.2, 3.5, 6.5, 33.7, 48.6, 51.5, 13.0, 0.13366, 0, 0, 0, 0, 0, 0
total, 995422, 7874, 7874, 7874, 6.3, 4.0, 7.7, 85.6, 112.9, 113.5, 14.0, 0.14753, 0, 0, 0, 0, 0, 0
total, 1007228, 11806, 11806, 11806, 4.3, 3.5, 6.5, 29.1, 43.8, 87.1, 15.0, 0.15598, 0, 0, 0, 0, 0, 0
total, 1012840, 5612, 5612, 5612, 8.2, 5.0, 11.5, 121.8, 166.6, 170.1, 16.0, 0.16535, 0, 0, 0, 0, 0, 0
total, 1016186, 3346, 3346, 3346, 13.4, 7.4, 20.1, 204.9, 207.6, 210.4, 17.0, 0.17405, 0, 0, 0, 0, 0, 0
total, 1025462, 9276, 9276, 9276, 6.3, 3.9, 9.6, 74.6, 206.8, 210.0, 18.0, 0.17800, 0, 0, 0, 0, 0, 0
total, 1035979, 10517, 10517, 10517, 4.8, 3.5, 6.7, 38.5, 82.6, 83.0, 19.0, 0.18120, 0, 0, 0, 0, 0, 0
total, 1047488, 11509, 11509, 11509, 4.3, 3.5, 6.0, 32.6, 72.3, 74.0, 20.0, 0.18334, 0, 0, 0, 0, 0, 0
total, 1077456, 29968, 29968, 29968, 1.7, 1.6, 2.9, 3.6, 7.0, 8.2, 21.0, 0.17943, 0, 0, 0, 0, 0, 0
total, 1105490, 28034, 28034, 28034, 1.8, 1.8, 3.5, 4.6, 5.3, 13.8, 22.0, 0.17609, 0, 0, 0, 0, 0, 0
total, 1132221, 26731, 26731, 26731, 1.9, 1.8, 3.8, 5.2, 8.4, 11.1, 23.0, 0.17314, 0, 0, 0, 0, 0, 0
total, 1162149, 29928, 29928, 29928, 1.7, 1.7, 3.0, 4.5, 8.0, 9.1, 24.0, 0.16950, 0, 0, 0, 0, 0, 0
...
```
After:
```
type total ops, op/s, pk/s, row/s, mean, med, .95, .99, .999, max, time, stderr, errors, gc: #, max ms, sum ms, sdv ms, mb
...
total, 822863, 94379, 94379, 94379, 0.5, 0.3, 2.0, 2.0, 2.1, 3.7, 9.0, 0.06669, 0, 0, 0, 0, 0, 0
total, 937337, 114474, 114474, 114474, 0.4, 0.2, 2.0, 2.0, 2.1, 3.4, 10.0, 0.06301, 0, 0, 0, 0, 0, 0
total, 986630, 49293, 49293, 49293, 1.0, 1.0, 2.0, 2.1, 17.9, 19.0, 11.0, 0.07318, 0, 0, 0, 0, 0, 0
total, 1026734, 40104, 40104, 40104, 1.2, 1.0, 2.0, 2.2, 6.3, 7.1, 12.0, 0.08410, 0, 0, 0, 0, 0, 0
total, 1066124, 39390, 39390, 39390, 1.3, 1.0, 2.0, 2.2, 2.6, 3.4, 13.0, 0.09108, 0, 0, 0, 0, 0, 0
total, 1103082, 36958, 36958, 36958, 1.3, 1.1, 2.1, 2.5, 3.1, 4.2, 14.0, 0.09643, 0, 0, 0, 0, 0, 0
total, 1141987, 38905, 38905, 38905, 1.3, 1.0, 2.0, 2.4, 11.4, 12.7, 15.0, 0.09894, 0, 0, 0, 0, 0, 0
total, 1180023, 38036, 38036, 38036, 1.3, 1.0, 2.0, 3.7, 5.6, 7.1, 16.0, 0.10070, 0, 0, 0, 0, 0, 0
total, 1216481, 36458, 36458, 36458, 1.4, 1.0, 2.1, 3.6, 4.7, 5.0, 17.0, 0.10210, 0, 0, 0, 0, 0, 0
total, 1256819, 40338, 40338, 40338, 1.2, 1.0, 2.0, 2.2, 3.5, 5.4, 18.0, 0.10173, 0, 0, 0, 0, 0, 0
total, 1295122, 38303, 38303, 38303, 1.3, 1.0, 2.0, 2.4, 21.0, 21.1, 19.0, 0.10136, 0, 0, 0, 0, 0, 0
total, 1334743, 39621, 39621, 39621, 1.3, 1.0, 2.0, 2.3, 3.3, 4.0, 20.0, 0.10055, 0, 0, 0, 0, 0, 0
total, 1375579, 40836, 40836, 40836, 1.2, 1.0, 2.0, 2.1, 3.4, 5.7, 21.0, 0.09927, 0, 0, 0, 0, 0, 0
total, 1415576, 39997, 39997, 39997, 1.2, 1.0, 2.0, 2.3, 3.2, 4.1, 22.0, 0.09807, 0, 0, 0, 0, 0, 0
total, 1449268, 33692, 33692, 33692, 1.5, 1.4, 2.5, 3.2, 4.2, 5.6, 23.0, 0.09800, 0, 0, 0, 0, 0, 0
total, 1471873, 22605, 22605, 22605, 2.2, 2.0, 4.8, 5.9, 7.0, 7.9, 24.0, 0.10015, 0, 0, 0, 0, 0, 0
...
```
Fixes: https://github.com/scylladb/scylladb/issues/24411 This is a new feature, so no backport needed.
Closes scylladb/scylladb#26411 * github.com:scylladb/scylladb: docs: workload-prioritization: add driver service level test: add test to verify use of `sl:driver` transport: use `sl:driver` to handle driver's control connections transport: whitespace only change in update_scheduling_group transport: call update_scheduling_group for non-auth connections generic_server: transport: start using `sl:driver` for new connections test: add test_desc_* for driver service level test: service_levels: add tests for sl:driver creation and removal test: add reload_raft_topology_state() to ScyllaRESTAPIClient service_level_controller: automatically create `sl:driver` service_level_controller: methods to create driver service level service_level_controller: handle special sl:driver in DESC output topology_coordinator: add service_level_controller reference system_keyspace: add service_level_driver_created test: add MAX_USER_SERVICE_LEVELS	2025-10-09 17:28:39 +03:00
Michał Chojnowski	c35b82b860	test/cluster/test_bti_index.py: avoid a race with CQL tracing The test uses CQL tracing to check which files were read by a query. This is flaky if the coordinator and the replica are different shards, because the Python driver only waits for the coordinator, and not for replicas, to finish writing their traces. (So it might happen that the Python driver returns a result with only coordinator events and no replica events). Let's just dodge the issue by using --smp=1. Fixes scylladb/scylladb#26432 Closes scylladb/scylladb#26434	2025-10-09 13:22:06 +03:00
Piotr Dulikowski	fe7ffc5e5d	Merge 'service/qos: set long timeout for auth queries on SL cache update' from Michael Litvak Pass an appropriate query state for auth queries called from service level cache reload. We use the function qos_query_state to select a query_state based on caller context - for internal queries, we set a very long timeout. The service level cache reload is called from group0 reload. We want it to have a long timeout instead of the default 5 seconds for auth queries, because on the one hand we don't have a strict latency requirement, and on the other hand a timeout exception is undesired in the group0 reload logic and can break group0 on the node. Fixes https://github.com/scylladb/scylladb/issues/25290 Backport possible to improve stability. Closes scylladb/scylladb#26180 * github.com:scylladb/scylladb: service/qos: set long timeout for auth queries on SL cache update auth: add query_state parameter to query functions auth: refactor query_all_directly_granted	2025-10-08 12:37:01 +02:00
Andrzej Jackowski	f720ce0492	test: add test to verify use of `sl:driver` `sl:driver` is expected to be used for new and control connections, but other connections that run user load should not use it after the user is authenticated. Refs: scylladb/scylladb#24411	2025-10-08 08:25:33 +02:00
Andrzej Jackowski	14081d0727	generic_server: transport: start using `sl:driver` for new connections Before this change, new connections were handled in a default scheduling group (`main`), because before the user is authenticated we do not know which service level should be used. With the new `sl:driver` service level, creation of new connections can be moved to `sl:driver`. We switch the service level as early as possible, in `do_accepts`. It is possible that `sl:driver` does not exist yet, for instance in specific upgrade cases, or if it was removed. Therefore, we also switch to `sl:driver` after a connection is accepted. Refs: scylladb/scylladb#24411	2025-10-08 08:25:12 +02:00
Andrzej Jackowski	b62135f767	test: add test_desc_* for driver service level Driver service level is a special service level that is created automatically by the system. Therefore, it requires special handling in DESC SCHEMA WITH INTERNALS, and these tests verify the special behavior. Refs: scylladb/scylladb#24411	2025-10-08 08:25:07 +02:00
Andrzej Jackowski	0ddf46c7b4	test: service_levels: add tests for sl:driver creation and removal Refs: scylladb/scylladb#24411	2025-10-08 08:25:02 +02:00
Andrzej Jackowski	9e9bca9bdb	test: add reload_raft_topology_state() to ScyllaRESTAPIClient To encapsulate `/storage_service/raft_topology/reload` API call	2025-10-08 08:24:57 +02:00
Andrzej Jackowski	c59a7db1c9	service_level_controller: automatically create `sl:driver` This commit: - Increases the number of allowed scheduling groups to allow the creation of `sl:driver`. - Adds the `DRIVER_SERVICE_LEVEL` feature, which prevents creating `sl:driver` until all nodes have increased the number of scheduling groups. - Starts using `get_create_driver_service_level_mutations` to unconditionally create `sl:driver` on `raft_initialize_discovery_leader`. The purpose of this code path is ensuring existence of `sl:driver` in new system and tests. - Starts using `migrate_to_driver_service_level` to create `sl:driver` if it is not already present. The creation of `sl:driver` is managed by `topology_coordinator`, similar to other system keyspace updates, such as the `view_builder` migration. The purpose of this code path is handling upgrades. - Modifies related tests to pass after `sl:driver` is added. Later in this patch series, `sl:driver` will be used by `transport/server` to handle selected traffic, such as the driver's schema and topology fetches. Refs: scylladb/scylladb#24411	2025-10-08 08:24:43 +02:00
Andrzej Jackowski	7d2db37831	test: add MAX_USER_SERVICE_LEVELS Previously, tests used the hardcoded value 7 for the maximum number of user service levels. This commit introduces a named variable that can be shared across tests to avoid cases where this magic number goes out of sync.	2025-10-08 08:24:17 +02:00
Artsiom Mishuta	99455833bd	test.py: reintroducing sudo in resource_gather.py Conditionally reintroduce sudo for resource gathering when running under Docker. Related: https://github.com/scylladb/scylladb/pull/26294#issuecomment-3346968097 Fixes: https://github.com/scylladb/scylladb/issues/26312 Closes scylladb/scylladb#26401	2025-10-07 14:42:15 +02:00
Piotr Dulikowski	264cf12b66	Merge 'view building coordinator - add missing tests' from Michał Jadwiszczak This patch adds tests for: - tablet migration during view building - tablet merge during view building. Those tests were missing from the original testing plan. We want to backport it to 2025.4 to ensure the release is bug-free. Closes scylladb/scylladb#26414 * github.com:scylladb/scylladb: test/cluster/test_view_building_coordinator: add test for tablet merge test/cluster/test_view_building_coordinator: add test for tablet migration	2025-10-07 14:25:04 +02:00
Michał Jadwiszczak	279a8cbba3	test/cluster/test_view_building_coordinator: add test for tablet merge The test pauses processing of the view building task and triggers tablet merge.	2025-10-06 15:06:11 +02:00
Michał Jadwiszczak	fc7e5370a1	test/cluster/test_view_building_coordinator: add test for tablet migration The test pauses processing of the view building task and migrates it to another node.	2025-10-06 15:02:42 +02:00
Michał Chojnowski	dbddba0794	sstables/trie: actually apply BYPASS CACHE to index reads BYPASS CACHE is implemented for `bti_index_reader` by giving it its own private `cached_file` wrappers over Partitions.db and Rows.db, instead of passing it the shared `cached_file` owned by the sstable. But due to an oversight, the private `cached_file`s aren't constructed on top of the raw Partitions.db and Rows.db files, but on top of `cached_file_impl` wrappers around those files. Which means that BYPASS CACHE doesn't actually do its job. Tests based on `scylla_index_page_cache_*` metrics and on CQL tracing still see the reads from the private files as "cache misses", but those misses are served from the shared cached files anyway, so the tests don't see the problem. In this commit we extend `test_bti_index.py` with a check that looks at reactor's `io_queue` metrics instead, and catches the problem. Fixes scylladb/scylladb#26372 Closes scylladb/scylladb#26373	2025-10-06 15:32:05 +03:00
Andrzej Jackowski	c3dd383e9e	test: add reproduction of name reuse bug to service level tests This commit adds a reproduction test for scylladb/scylladb#26190 to the service levels test suite. Although the bug was fixed internally in Seastar, the corner-case service level name reuse scenario should be covered by tests to prevent regressions. Refs: https://github.com/scylladb/scylladb/issues/26190 Closes scylladb/scylladb#26379	2025-10-06 14:19:22 +02:00
Piotr Dulikowski	380f243986	Merge ' Support replication factor rack list for tablet-based keyspaces' from Tomasz Grabiec This change extends the CQL replication options syntax so the replication factor can be stated as a list of rack names. For example: { 'mydatacenter': [ 'myrack1', 'myrack2', 'myrack4' ] } Rack-list based RF can coexist with the old numerical RF, even in the same keyspace for different DCs. Specifying the rack list also allows adding replicas on the specified racks (increasing the replication factor), or removing replicas from certain racks (by omitting them from the datacenter's current rack list). This will allow us to keep the keyspace rf-rack-valid, maintaining guarantees, while allowing adding/removing racks. In particular, this will allow us to add a new DC, which happens by incrementally increasing RF in that DC to cover existing racks. Migration from numerical RF to rack-list is not supported yet. Migration from rack-list to numerical RF is not planned to be supported. New feature, no backport required.
Co-authored with @bhalevy Fixes https://github.com/scylladb/scylladb/issues/25269 Fixes https://github.com/scylladb/scylladb/issues/23525 Closes scylladb/scylladb#26358 * github.com:scylladb/scylladb: tablets: load_balancer: Recognize that tablets are confined to racks when computing desired tablet count locator: Make hasher for endpoint_dc_rack globally accessible test: tablets: Add test for replica allocation on rack list changes test: lib: topology_builder: generate unique rack names test: Add tests for rack list RF doc: Document rack-list replication factor topology_coordinator: Restore formatting topology_coordinator: Cancel keyspace alter on broader set of errors topology_coordinator: Make keyspace alter process options through as_ks_metadata_update() cql3: ks_prop_defs: Preserve old options cql3: ks_prop_defs: Introduce flattened() locator: Recognize rack list RF as valid in assert_rf_rack_valid_keyspace() tablet_allocator: Respect binding replicas to racks locator: network_topology_strategy: Respect rack list when reallocating tablets cql3: ks_prop_defs: Fail with more information when options are not in expected format locator, cql3: Support rack lists in replication options cql3: Fail early on vnode/tablet flavor alter cql3: Extract convert_property_map() out of Cql.g schema: Use definition from the header instead of open-coding it locator: Abstract obtaining the number of replicas from replication_strategy_config_option cql3, locator: Use type aliases for option maps locator: Add debug logging locator: Pass topology to replication strategy constructor abstract_replication_strategy, network_topology_strategy: add replication_factor_data class	2025-10-06 14:14:09 +02:00
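As a rough illustration of the option shape described in the merge above, the sketch below models a per-DC replication option as either a numeric RF string or a list of rack names. The function name `replica_count` and the rule that a rack list yields one replica per listed rack are assumptions inferred from the description, not taken from the ScyllaDB code.

```python
from typing import Dict, List, Union

# A per-DC option is either a numeric RF (stored as a string) or a rack list.
ReplicationOption = Union[str, List[str]]


def replica_count(option: ReplicationOption) -> int:
    """Hypothetical helper: number of replicas implied by one DC's option."""
    if isinstance(option, list):
        # Assumption: a rack list places one replica on each listed rack.
        return len(option)
    return int(option)


# Rack-list RF and numerical RF coexisting in one keyspace, for different DCs.
options: Dict[str, ReplicationOption] = {
    "mydatacenter": ["myrack1", "myrack2", "myrack4"],
    "otherdatacenter": "3",  # hypothetical second DC with the old numerical RF
}
counts = {dc: replica_count(opt) for dc, opt in options.items()}
print(counts)
```

Dropping a rack name from the list would lower that DC's count by one in this model, which matches the description of removing replicas by omitting racks.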
Piotr Dulikowski	e7907b173a	Merge 'db/view: Require rf_rack_valid_keyspaces when creating materialized view' from Dawid Mędrek Materialized views are currently in the experimental phase and using them in tablet-based keyspaces requires starting Scylla with an experimental feature, `views-with-tablets`. Any attempts to create a materialized view or secondary index when it's not enabled will fail with an appropriate error. After considerable effort, we're drawing close to bringing views out of the experimental phase, and the experimental feature will no longer be needed. However, materialized views in tablet-based keyspaces will still be restricted, and creating them will only be possible after enabling the configuration option `rf_rack_valid_keyspaces`. That's what we do in this PR. In this patch, we adjust existing tests in the tree to work with the new restriction. That shouldn't have been necessary because we've already seemingly adjusted all of them to work with the configuration option, but some tests hid well. We fix that mistake now. After that, we introduce the new restriction. What's more, when starting Scylla, we verify that there is no materialized view that would violate the contract. If there are some that do, we list them, notify the user, and refuse to start. High-level implementation strategy: 1. Name the restrictions in form of a function. 2. Adjust existing tests. 3. Restrict materialized views by both the experimental feature and the configuration option. Add validation test. 4. Drop the requirement for the experimental feature. Adjust the added test and add a new one. 5. Update the user documentation. Fixes scylladb/scylladb#23030 Backport: 2025.4, as we are aiming to support materialized views for tablets from that version. 
Closes scylladb/scylladb#25802 * github.com:scylladb/scylladb: view: Stop requiring experimental feature db/view: Verify valid configuration for tablet-based views db/view: Require rf_rack_valid_keyspaces when creating view test/cluster/random_failures: Skip creating secondary indexes test/cluster/mv: Mark test_mv_rf_change as skipped test/cluster: Adjust MV tests to RF-rack-validity test/boost/schema_loader_test.cc: Explicitly enable rf_rack_valid_keyspaces db/view: Name requirement for views with tablets	2025-10-06 12:46:46 +02:00
Pavel Emelyanov	6ad8dc4a44	Merge 'root,replica: mv querier to replica/' from Botond Dénes The querier object is a confusing one. Based on its name it should be in the query/ module and it is already in the query namespace. The query namespace is used for symbols which span the coordinator and replica, or that are mostly coordinator side. The querier is mainly in this namespace due to its similar name and because at the time it was introduced, namespace replica didn't exist yet. But this is a mistake which confuses people. The querier is actually a completely replica-side logic, implementing the caching of the readers on the replica. Move it to the replica module and namespace to make this more clear. Code cleanup, no backport. Closes scylladb/scylladb#26280 * github.com:scylladb/scylladb: replica: move querier code to replica namespace root,replica: mv querier to replica/	2025-10-06 08:26:05 +03:00
Michał Chojnowski	6efb807c1a	sstables/sstable_directory: don't forget to delete other components when deleting TemporaryHashes.db TemporaryHashes.db is a temporary sstable component used during ms sstable writes. It's different from other sstable components in that it's not included in the TOC. Because of this, it has a special case in the logic that deletes unfinished sstables on boot (after Scylla dies in the middle of an sstable write). But there's a bug in that special case, which causes Scylla to forget to delete other components from the same unfinished sstable. The code intends only to delete the TemporaryHashes.db file from the `_state->generations_found` multimap, but it accidentally also deletes the file's sibling components from the multimap. Fix that. Fixes scylladb/scylladb#26393	2025-10-04 00:45:55 +02:00
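The difference between erasing one entry and erasing every entry under a multimap key can be sketched as follows. This is an analogy in Python, not the actual C++ code: the multimap is modeled as a dict of lists, and the file names and function names are hypothetical.

```python
from collections import defaultdict


def build_generations_found():
    """Model a generation -> component-file multimap as a dict of lists."""
    found = defaultdict(list)
    for generation, path in [
        (7, "7-Data.db"),             # hypothetical file names
        (7, "7-Partitions.db"),
        (7, "7-TemporaryHashes.db"),
    ]:
        found[generation].append(path)
    return found


def forget_buggy(found, generation, path):
    # Erases every entry stored under the key, siblings included.
    found.pop(generation, None)


def forget_fixed(found, generation, path):
    # Erases only the one (key, value) pair that was meant to be removed.
    found[generation].remove(path)


buggy = build_generations_found()
forget_buggy(buggy, 7, "7-TemporaryHashes.db")

fixed = build_generations_found()
forget_fixed(fixed, 7, "7-TemporaryHashes.db")

print(dict(buggy), dict(fixed))
```

In the fixed variant the sibling components stay tracked, so later cleanup of the unfinished sstable can still find and delete them.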
Michał Chojnowski	16cb223d7f	test/boost/database_test: fix two no-op distributed loader tests There are two tests which effectively check nothing. They intend to check that distributed loader removes "leftover" sstable files. So they create some incomplete sstables, run the test env on the directory, and check that the files disappeared. But the test env completely clears the test directory before the distributed loader looks at the files, so the tests succeed trivially. Fix that by adding a config knob to the test env which instructs it not to clear the directory before the test.	2025-10-04 00:44:49 +02:00
Tomasz Grabiec	9ebdeb261f	tablets: load_balancer: Recognize that tablets are confined to racks when computing desired tablet count The old logic assumes that replicas are spread across the whole DC when determining how many tablets we need to have at least 10 tablets per shard. If replicas are actually confined to a subset of racks, that will come up with too high a count and overshoot the actual per-shard count in those racks. A similar problem happens when scaling down the tablet count, when we try to keep the per-shard tablet count below the goal. It should be tracked per-rack rather than per-DC, since racks can differ in how loaded they are by RF if it's a rack-list.	2025-10-02 19:45:00 +02:00
Tomasz Grabiec	85ddb832b4	test: tablets: Add test for replica allocation on rack list changes	2025-10-02 19:45:00 +02:00
Benny Halevy	4955ca3ddd	test: lib: topology_builder: generate unique rack names Encode the dc identifier into each rack name so each dc will have its own unique racks. Just for easier distinction in logs. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-10-02 19:45:00 +02:00
Tomasz Grabiec	5fc617ecf5	test: Add tests for rack list RF	2025-10-02 19:45:00 +02:00
Tomasz Grabiec	6de342ed3e	locator: network_topology_strategy: Respect rack list when reallocating tablets	2025-10-02 19:42:39 +02:00
Botond Dénes	9d08a380db	Merge 'Fix getendpoints command for compound keys containing ':'' from Taras Veretilnyk Before, the `nodetool getendpoints` expected the key as one string separated by : (for example 1:val:ue). This caused errors if any part of the key had a colon because it was unclear whether a colon was a separator or part of the key. This change adds a new API endpoint, `/storage_service/natural_endpoints/v2/{keyspace}`, which accepts composite partition keys as multiple key_component query parameters (e.g., ?key_component=1&key_component=val:ue). The `nodetool getendpoints` command was updated to support a new `--key-components` option, allowing users to pass key components as an array. The client and test infrastructure were extended to support multiple values for a query parameter, and tests were added to verify correct behavior with composite keys. The previous method of passing partition keys as colon-separated strings is preserved for backward compatibility. Backport is not required, since this change relies on recent Seastar updates Fixes #16596 Closes scylladb/scylladb#26169 * github.com:scylladb/scylladb: docs: document --key-components option for getendpoints test/nodetool/test_getendpoints: add coverage for --key-components param in getendpoints nodetool: Introduce new option --key-components to specify compound partition keys as array rest_api/test_storage_service: add v2 natural_endpoints test for composite key with multiple components api/storage_service: add GET 'natural_endpoints' v2 to support composite keys with ':' rest_api_mock: support duplicate query parameters test/rest_api: support multiple query values per key in RestApiSession.send() nodetool: add support of new seastar query_parameters_type to scylla_rest_client	2025-10-02 09:04:40 +03:00
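A short sketch of how a client might build the v2 request described in the merge above, with each key component sent as its own `key_component` query parameter. The base URL, the keyspace name and the function name are assumptions for illustration; any other parameters the endpoint takes are omitted because the source does not name them.

```python
from urllib.parse import quote, urlencode


def natural_endpoints_v2_url(base_url, keyspace, key_components):
    """Hypothetical helper: repeat key_component once per component."""
    query = urlencode([("key_component", c) for c in key_components])
    path = "/storage_service/natural_endpoints/v2/" + quote(keyspace, safe="")
    return f"{base_url}{path}?{query}"


# Base URL and keyspace are placeholders, not taken from the source.
url = natural_endpoints_v2_url("http://localhost:10000", "ks", ["1", "val:ue"])
print(url)
```

Because every component is a separate parameter and `urlencode` percent-encodes the colon, a colon inside a component can no longer be mistaken for a separator.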
Aleksandra Martyniuk	0e73ce202e	test: wait for cql in test_two_tablets_concurrent_repair_and_migration_repair_writer_level In test_two_tablets_concurrent_repair_and_migration_repair_writer_level safe_rolling_restart returns ready cql. However, get_all_tablet_replicas uses the cql reference from manager that isn't ready. Wait for cql. Fixes: #26328 Closes scylladb/scylladb#26349	2025-10-02 06:41:36 +03:00
Michał Jadwiszczak	d92628e3bd	test/cluster/test_view_building_coordinator: skip reproducer instead of xfail The reproducer for issue scylladb/scylladb#26244 takes some time and since the test is failing, there is no point in wasting resources on it. We can change the xfail mark to skip. Refs scylladb/scylladb#26244 Closes scylladb/scylladb#26350	2025-10-01 18:33:05 +02:00
Tomasz Grabiec	726548b835	locator: Abstract obtaining the number of replicas from replication_strategy_config_option It will become more complex once options contain rack lists. It's a good change regardless, as it reduces duplication and makes parsing uniform. We already diverged to use stoi / stol / stoul. The change in create_keyspace_statement.cc to add a catch clause is needed because get_replication_factor() now throws configuration_exception on parsing errors instead of std::invalid_argument, so the existing catch clause in the outer scope is not effective. That loop is trying to interpret all options as RF to run some validations. Not all options are RF, and those are supposed to be ignored.	2025-10-01 16:06:52 +02:00
Tomasz Grabiec	91e51a5dd1	cql3, locator: Use type aliases for option maps In preparation for changing their structure. 1) std::map<sstring, sstring> -> replication_strategy_config_options Parsed options. Values will become std::variant<sstring, rack_list> 2) std::map<sstring, sstring> -> property_definitions::map_type Flattened map of options, as stored in system tables.	2025-10-01 16:06:51 +02:00
Benny Halevy	da6e2fdb1b	locator: Pass topology to replication strategy constructor	2025-10-01 16:06:28 +02:00
Taras Veretilnyk	6381c63d65	test/nodetool/test_getendpoints: add coverage for --key-components param in getendpoints Adds a parameterized test to verify that multiple --key-components arguments are handled correctly by nodetool's getendpoints command. Ensures the constructed REST request includes all key_component values in the expected format.	2025-10-01 15:53:25 +02:00
Taras Veretilnyk	2456ebd7c2	rest_api/test_storage_service: add v2 natural_endpoints test for composite key with multiple components Adds a test case for the `/storage_service/natural_endpoints/v2/{keyspace}` endpoint, verifying that it correctly resolves natural endpoints for a composite partition key passed as multiple `key_component` query parameters.	2025-10-01 15:53:25 +02:00
Taras Veretilnyk	65ade28a9c	rest_api_mock: support duplicate query parameters Previously, only the last value of a repeated query parameter was captured, which could cause inaccurate request matching in tests. This update ensures that all values are preserved by storing duplicates as lists in the `params` dict.	2025-10-01 15:53:25 +02:00
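The behavior described in the commit above (keeping every value of a repeated query parameter by storing duplicates as lists) could look roughly like this. The function name `collect_params` is hypothetical and this is not the mock's actual code.

```python
from urllib.parse import parse_qsl


def collect_params(query_string):
    """Hypothetical sketch: single values stay scalars, repeats become lists."""
    params = {}
    for key, value in parse_qsl(query_string, keep_blank_values=True):
        if key not in params:
            params[key] = value
        elif isinstance(params[key], list):
            params[key].append(value)
        else:
            # Second occurrence: promote the stored scalar to a list.
            params[key] = [params[key], value]
    return params


params = collect_params("a=x&key_component=1&key_component=val%3Aue")
print(params)
```

Keeping single values as plain strings means existing request matchers that compare against scalars keep working, while repeated keys are compared as ordered lists.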
Taras Veretilnyk	b60afeaa46	test/rest_api: support multiple query values per key in RestApiSession.send() Previously, the send() method in RestApiSession only supported one value per query parameter key. This patch updates it to support passing lists of values, allowing the same key to appear multiple times in the query string (e.g. ?key=value1&key=value2).	2025-10-01 15:53:25 +02:00
Avi Kivity	15fa1c1c7e	Merge 'sstables/trie: translate all key cells in one go, not lazily' from Michał Chojnowski Applying lazy evaluation to the BTI encoding of clustering keys was probably a bad default. The possible benefits are dubious (because it's quite likely that the laziness won't allow us to avoid that much work), but the overhead needed to implement the laziness is large and immediate. In this patch we get rid of the laziness. We rewrite lazy_comparable_bytes_from_clustering_position and lazy_comparable_bytes_from_ring_position so that they perform the key translation eagerly, writing all components to a single bytes_ostream in one synchronous call. perf_bti_key_translation (microbenchmark added in this series, 1 iteration is 100 translations of a clustering key with 8 cells of int32_type):
```
Before:
test iterations median mad min max allocs tasks inst cycles
lcb_mismatch_test.lcb_mismatch 9233 109.930us 0.000ns 109.930us 109.930us 4356.000 0.000 2615394.3 614709.6
After:
test iterations median mad min max allocs tasks inst cycles
lcb_mismatch_test.lcb_mismatch 50952 19.487us 0.000ns 19.487us 19.487us 198.000 0.000 603120.1 109042.9
```
Enhancement, backport not required. Closes scylladb/scylladb#26302 * github.com:scylladb/scylladb: sstables/trie: BTI-translate the entire partition key at once sstables/trie: avoid an unnecessary allocation of std::generator in last_block_offset() sstables/trie: perform the BTI-encoding of position_in_partition eagerly types/comparable_bytes: add comparable_bytes_from_compound test/perf: add perf_bti_key_translation	2025-10-01 14:59:06 +03:00
Dawid Mędrek	b409e85c20	view: Stop requiring experimental feature We modify the requirements for using materialized views in tablet-based keyspaces. Before, it was necessary to enable the configuration option `rf_rack_valid_keyspaces`, have the cluster feature `VIEWS_WITH_TABLETS` enabled, and use the experimental feature `views-with-tablets`. We drop the last requirement. We adjust code to that change and provide a new validation test. We also update the user documentation to reflect the changes. Fixes scylladb/scylladb#23030	2025-10-01 09:01:53 +02:00
Dawid Mędrek	288be6c82d	db/view: Verify valid configuration for tablet-based views Creating a materialized view or a secondary index in a tablet-based keyspace requires that the user enable two options: * experimental feature `views-with-tablets`, * configuration option `rf_rack_valid_keyspaces`. Because the latter has only become a necessity recently (in this series), it's possible that there are already existing materialized views that violate it. We add a new check at start-up that iterates over existing views and makes sure that that is not the case. Otherwise, Scylla notifies the user of the problem.	2025-10-01 09:01:53 +02:00
Dawid Mędrek	00222070cd	db/view: Require rf_rack_valid_keyspaces when creating view We extend the requirements for being able to create materialized views and secondary indexes in tablet-based keyspaces. It's now necessary to enable the configuration option `rf_rack_valid_keyspaces`. This is a stepping stone towards bringing materialized views and secondary indexes with tablets out of the experimental phase. We add a validation test to verify the changes. Refs scylladb/scylladb#23030	2025-10-01 09:01:50 +02:00
Dawid Mędrek	71606ffdda	test/cluster/random_failures: Skip creating secondary indexes Materialized views are going to require the configuration option `rf_rack_valid_keyspaces` when being created in tablet-based keyspaces. Since random-failure tests still haven't been adjusted to work with it, and because it's not trivial, we skip the cases when we end up creating or dropping an index.	2025-10-01 09:01:38 +02:00
Dawid Mędrek	6322b5996d	test/cluster/mv: Mark test_mv_rf_change as skipped The test will not work with `rf_rack_valid_keyspaces`. Since the option is going to become a requirement for using views with tablets, the test will need to be rewritten to take that into consideration. Since that adjustment doesn't seem trivial, we mark the test as skipped for the time being.	2025-10-01 09:01:29 +02:00
Botond Dénes	bdca5600ef	Merge 'Prevent stalls due to large tablet mutations' from Benny Halevy Currently, replica::tablet_map_to_mutation generates a mutation having a row per tablet. With enough tablets (tens of thousands) in the table we observe reactor stalls when freezing / unfreezing such large mutations, as seen in https://github.com/scylladb/scylladb/pull/18095#issuecomment-2029246954, and I assume we would see similar stalls also when converting those mutations into canonical_mutation and back, as they are similar to frozen_mutation, and a bit more expensive since they also save the column mappings. This series takes a different approach than allowing freeze to yield. `tablet_map_to_mutation` is changed to `tablet_map_to_mutations`, able to generate multiple split mutations that, when squashed together, are equivalent to the previously large mutation. Those mutations are fed into a `process_mutation` callback function, provided by the caller, which may add those mutations to a vector for further processing, and/or process them inline by freezing or making a canonical mutation. In addition, splitting the large mutations would also prevent hitting the commitlog maximum mutation size. Closes scylladb/scylladb#18162 * github.com:scylladb/scylladb: schema_tables: convert_schema_to_mutations: simplify check for system keyspace tablets: read_tablet_mutations: use unfreeze_and_split_gently storage_service: merge_topology_snapshot: freeze snp.mutations gently mutation: async_utils: add unfreeze_and_split_gently mutation: add for_each_split_mutation tablets: tablet_map_to_mutations: maybe split tablets mutation tablets: tablet_map_to_mutations: accept process_func perf-tablets: change default tables and tablets-per-table perf-tablets: abort on unhandled exception	2025-10-01 07:04:09 +03:00
Dawid Mędrek	994f09530f	test/cluster: Adjust MV tests to RF-rack-validity Some of the new tests covering materialized views explicitly disabled the configuration option `rf_rack_valid_keyspaces`. It's going to become a new requirement for views with tablets, so we adjust those tests and enable the option. There is one exception, the test: `cluster/mv/test_mv_topology_change.py::test_mv_rf_change` We handle it separately in the following commit.	2025-09-30 20:01:25 +02:00
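
The two Taras Veretilnyk commits above (65ade28a9c and b60afeaa46) describe keeping every value of a repeated query parameter, storing duplicates as lists in a `params` dict and emitting them as `?key=value1&key=value2`. A minimal sketch of that idea follows, using only the Python standard library. The helper names `collect_params` and `build_query` are hypothetical and are not taken from the ScyllaDB test code; the actual implementation may differ.

```python
from urllib.parse import parse_qsl, urlencode


def collect_params(query: str) -> dict:
    """Hypothetical helper: parse a query string, keeping all values.

    A key seen once maps to a plain string; a key seen more than once
    maps to a list of its values in order of appearance.
    """
    params = {}
    for key, value in parse_qsl(query, keep_blank_values=True):
        if key not in params:
            params[key] = value
        elif isinstance(params[key], list):
            params[key].append(value)
        else:
            # Second occurrence: promote the single value to a list.
            params[key] = [params[key], value]
    return params


def build_query(params: dict) -> str:
    """Hypothetical helper: emit list values as repeated keys.

    With doseq=True, urlencode expands each list element into its own
    key=value pair and leaves plain strings untouched.
    """
    return urlencode(params, doseq=True)
```

For example, `build_query({"key": ["value1", "value2"]})` produces a query string in which `key` appears twice, and `collect_params` applied to that string recovers the list rather than only the last value.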


9801 Commits