scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-02 06:05:53 +00:00

Author	SHA1	Message	Date
Raphael S. Carvalho	0f59deffaa	replica: Fix truncate and drop table after tablet migration happens When running those operations after a tablet replica is migrated away from a shard, an assert can fail resulting in a crash. Status quo (around the assert in truncate procedure): 1) Highest RP seen by table is saved in low_mark, and the current time in low_mark_at. 2) Then compaction is disabled in order to not mix data written before truncate, and data written later. 3) Then memtable is flushed in order for the data written before truncate to be available in sstables and then removed. 4) Now, current time is saved in truncated_at, which is supposedly the time of truncate to decide which sstables to remove. Note: truncated_at is likely above low_mark_at due to steps 2 and 3. The interesting part of the assert is: (truncated_at <= low_mark_at ? rp <= low_mark : low_mark <= rp) Note: RP in the assert above is the highest RP among all sstables generated before truncated_at. RP is retrieved by table::discard_sstables(). If truncated_at > low_mark_at, maybe newer data was written during steps 2 and 3, and memtable's RP becomes greater than low_mark, resulting in a SSTable with RP > low_mark. So assert's 2nd condition is there to defend against the scenario above. truncated_at and low_mark_at uses millisecond granularity, so even if truncated_at == low_mark_at, data could have been written in steps 2 and 3 (during same MS window), failing the assert. This is fragile. Reproducer: To reproduce the problem, truncated_at must be > low_mark_at, which can easily happen with both drop table and truncate due to steps 2 and 3. If a shard has 2 or more tablets, the table's highest RP refer to just one tablet in that shard. If the tablet with the highest RP is migrated away, then the sstables in that shard will have lower RP than the recorded highest RP (it's a table wide state, which makes sense since CL is shared among tablets). So when either drop table or truncate runs, low_mark will be potentially bigger than highest RP retrieved from sstables. Proposed solution: The current assert is hacked to not fail if writes sneak in, during steps 2 and 3, but it's still fragile and seems not to serve its real purpose, since it's allowing for RP > low_mark. We should be able to say that low_mark >= RP, as a way of asserting we're not leaving data targeted by truncate behind (or that we're not removing the wrong data). But the problem is that we're saving low_mark in step 1, before preparation steps (2 and 3). When truncated_at is recorded in step 4, it's a way of saying all data written so far is targeted for removal. But as of today, low_mark refers to all data written up to step 1. So low_mark is now only one set before issuing flush, and also accounts for all potentially flushed data. Fixes #18059. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb/scylladb#23560	2025-04-08 07:32:58 +03:00
Avi Kivity	8d2a41db82	Merge "Fixes for gossiper conversion to host id" from Gleb " The series contains fixes to gossiper conversion to host id. There are two fixes where we could erroneously send outdated entry in a gossiper message and a fix for force_remove_endpoint which was not converted to work on host id and this caused it to not delete the entry in some cases (in replace with the same ip case). " * 'gleb/host-id-fixes' of github.com:scylladb/scylla-dev: gossiper: send newest entry in a digest message gossiper: change make_random_gossip_digest to return value instead of modifying passed parameter gossiper: move force_remove_endpoint to work on host id gossiper: do not send outdated endpoint in gossiper round	2025-04-07 17:04:28 +03:00
Pavel Emelyanov	10376b5b85	db: Re-use database::snapshot_table_on_all_shards() There are two snapshot-on-all-shards methods on the database -- the one that snapshots a keyspace and the one that snapshots a vector of tables. The latter snapshots a single table with a neat helper, while the former has the helper open-coded. Re-using the helper in keyspace snapshot is worth it, but needs to patch the helper to work on uuid, rather than ks:cf pair of strings. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#23532	2025-04-07 11:55:43 +02:00
Nadav Har'El	84fd52315f	alternator: in GetRecords, enforce Limit to be <= 1000 Alternator Streams' "GetRecords" operation has a "Limit" parameter on how many records to return. The DynamoDB documentations says that the upper limit on this Limit parameter is 1000 - but Alternator didn't enforce this. In this patch we begin enforcing this highest Limit, and also add a test for verifying this enforcement. As usual, the new test passes on DynamoDB, and after this patch - also on Alternator. The reason why it's useful to have some upper limit on Limit is that the existing executor::get_records() implementation does not really have preemption points in all the necessary places. In particular, we have a loop on all returned records without preemption points. We also store the returned records in a RapidJson vector, which requires a contiguous allocation. Even before this patch, GetRecords had a hard limit of 1 MB of results. But still, in some cases 1 MB of results may be a lot of results, and we can see stalls in the aforementioned places being O(number of results). Fixes #23534 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#23547	2025-04-07 12:52:03 +03:00
Gleb Natapov	6f53611337	gossiper: move force_remove_endpoint to work on host id Since the gossiper works on host ids now it is incorrect to leave this function to work on ip. It makes it impossible to delete outdated entry since the "gossiper.get_host_id(endpoint) != id" check will always be false for such entries (get_host_id() always returns most up -to-date mapping.	2025-04-06 18:39:24 +03:00
Marcin Maliszkiewicz	b94acfb37b	test: remove alternator code from perf-simple-query This kind of benchmark was superseded by perf-alternator which has more options, workflows and most importantly measures overhead of http server layer (including json parsing). There is no need to maintain additional code in perf-simple-query. Closes scylladb/scylladb#23474	2025-04-06 18:15:16 +03:00
Nadav Har'El	431de48df9	test/alternator: test for item with many attributes A user complained that he couldn't read or write an item with more than 16 attributes (!) in Alternator. This isn't true, but I realized that we don't have a simple test for this case - all test use just a few attributes. So let's add such a test, doing PutItem, UpdateItem and GetItem with 400 attributes. Unsurprisingly, the test passes. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#23568	2025-04-03 22:35:49 +03:00
Nadav Har'El	a9a6f9eecc	test/alternator: increase timeout in Alternator RBAC test On our testing infrastructure, tests often run a hundred times (!) slower than usual, for various reasons that we can't always avoid. This is why all our test frameworks drastically increase the default timeouts. We forgot to increase the timeout in one place - where Alternator tests use CQL. This is needed for the Alternator role-based access control (RBAC) tests, which is configured via CQL and therefore the Alternator test unusually uses CQL. So in this patch we increase the timeout of CQL driver used by Alternator tests to the same high timeouts (60-120 seconds) used by the regular CQL tests. As the famous saying goes, these timeouts should be enough for anyone. Fixes #23569. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#23578	2025-04-03 22:31:08 +03:00
Botond Dénes	1198213000	Merge 'tablets: Make tablet allocation equalize per-shard load ' from Tomasz Grabiec Before, it was equalizing per-node load (tablet count), which is wrong in heterogeneous clusters. Nodes with fewer shards will end up with overloaded shards. Refs #23378 Closes scylladb/scylladb#23478 * github.com:scylladb/scylladb: tablets: Make tablet allocation equalize per-shard load tablets: load_balancer: Fix reporting of total load per node	2025-04-03 16:32:53 +03:00
Botond Dénes	fcdae20fd1	Merge 'Add tablet enforcing option' from Benny Halevy This series add a new config option: `tablets_mode_for_new_keyspaces` that replaces the existing `enable_tablets` option. It can be set to the following values: disabled: New keyspaces use vnodes by default, unless enabled by the tablets={'enabled':true} option enabled: New keyspaces use tablets by default, unless disabled by the tablets={'disabled':true} option enforced: New keyspaces must use tablets. Tablets cannot be disabled using the CREATE KEYSPACE option `tablets_mode_for_new_keyspaces=disabled` or `tablets_mode_for_new_keyspaces=enabled` control whether tablets are disabled or enabled by default for new keyspaces, respectively. In either cases, tablets can be opted-in or out using the `tablets={'enabled':...}` keyspace option, when the keyspace is created. `tablets_mode_for_new_keyspaces=enforced` enables tablets by default for new keyspaces, like `tablets_mode_for_new_keyspaces=enabled`. However, it does not allow to opt-out when creating new keyspaces by setting `tablets = {'enabled': false}` Refs scylladb/scylla-enterprise#4355 * Requires backport to 2025.1 Closes scylladb/scylladb#22273 * github.com:scylladb/scylladb: boost/tablets_test: verify failure to create keyspace with tablets and non network replication strategy tablets: enforce tablets using tablets_mode_for_new_keyspaces=enforced config option db/config: add tablets_mode_for_new_keyspaces option	2025-04-03 16:32:19 +03:00
Tomasz Grabiec	fe8187e594	Merge 'repair: release erm in repair_writer_impl::create_writer when possible' from Aleksandra Martyniuk Currently, repair_writer_impl::create_writer keeps erm to ensure that a sharder is valid. If we repair a tablet, erm blocks the state machine and no operation on any tablet of this table might be performed. Use auto_refreshing_sharder and topology_guard to ensure that the operation is safe and that tablet operations on the whole table aren't blocked. Fixes: #23453. Needs backport to 2025.1 that introduces the tablet repair scheduler. Closes scylladb/scylladb#23455 * github.com:scylladb/scylladb: \test: add test to check concurrent migration and repair of two different tablets repair: release erm in repair_writer_impl::create_writer when possible	2025-04-03 11:15:08 +02:00
Botond Dénes	7bbfa5293f	test/cluster/test_read_repair.py: increase read request timeout This test enables trace-level logging for the mutation_data logger, which seems to be too much in debug mode and the test read times out. Increase timeout to 1minute to avoid this. Fixes: #23513 Closes scylladb/scylladb#23558	2025-04-03 10:42:11 +03:00
Evgeniy Naydanov	84a5037056	test.py: cluster/suite.yaml: update test filters After switching to subfolders the filter `run_in_debug` for random failures test was just copied as is, but need to include the subfolder, actually. Also, `test_old_ip_notification_repro` was deleted, so, we don't need it in the `skip_in_debug` list. Closes scylladb/scylladb#23492	2025-04-02 19:29:27 +03:00
Aleksandra Martyniuk	bae6711809	\test: add test to check concurrent migration and repair of two different tablets	2025-04-02 15:30:17 +02:00
Radosław Cybulski	c36614e16d	alternator: add size check to BatchItemWrite Add a size check for BatchItemWrite command - if the item count is bigger than configuration value `alternator_maximum_batch_write_size`, an error will be raised and no modification will happen. This is done to synchronize with DynamoDB, where maximum size of BatchItemWrite is 25. To avoid complaints from clients, who use our feature of BatchWriteItem being limitless we set default value to 100. Fixes #5057 Closes scylladb/scylladb#23232	2025-04-02 14:48:00 +03:00
Avi Kivity	882f405eed	Merge "Convert gossiper's endpoint state map to be host id based" from Gleb " The series makes endpoint state map in the gossiper addressable by host id instead of ips. The transition has implication outside of the gossiper as well. Gossiper based topology operations are affected by this change since they assume that the mapping is ip based. On wire protocol is not affected by the change as maps that are sent by the gossiper protocol remain ip based. If old node sends two different entries for the same host id the one with newer generation is applied. If new node has two ids that are mapped to the same ip the newer one is added to the outgoing map. Interoperability was verified manually by running mixed cluster. The series concludes the conversion of the system to be host id based. " * 'gleb/gossipper-endpoint-map-to-host-id-v2' of github.com:scylladb/scylla-dev: gossiper: make examine_gossiper private gossiper: rename get_nodes_with_host_id to get_node_ip treewide: drop id parameter from gossiper::for_each_endpoint_state treewide: move gossiper to index nodes by host id gossiper: drop ip from replicate function parameters gossiper: drop ip from apply_new_states parameters gossiper: drop address from handle_major_state_change parameter list gossiper: pass rpc::client_info to gossiper_shutdown verb handler gossiper: add try_get_host_id function gossiper: add ip to endpoint_state serialization: fix std::map de-serializer to not invoke value's default constructor gossiper: drop template from wait_alive_helper function gossiper: move get_supported_features and its users to host id storage_service: make candidates_for_removal host id based gossiper: use peers table to detect address change storage_service: use std::views::keys instead of std::views::transform that returns a key gossiper: move _pending_mark_alive_endpoints to host id gossiper: do not allow to assassinate endpoint in raft topology mode gossiper: fix indentation after previous patch gossiper: do not allow to assassinate non existing endpoint	2025-04-02 12:30:00 +03:00
Botond Dénes	a0d8102a1f	replica/memtable: s/make_flat_reader/make_mutation_reader/ Following the recent refactoring of removing "flat" and "v2" from reader names, replacing all the fully qualified names with simply "mutation_reader". Closes scylladb/scylladb#23346	2025-04-01 17:58:13 +03:00
Artsiom Mishuta	032b28d793	test.py: remove pylib_test from test.py/CI run pylib_test contains one pure Python test. This test does not test Scylla. This test is not deleted because it can be useful to run during pre-commit, for example, but it definitely should not be run in CI in modes with 3 repeats each. It does not make sense. It is a Unit test for test.py framework. Note: test still can be easily run by pytest via the command: ./tools/toolchain/dbuild pytest test/pylib_test Closes scylladb/scylladb#23181	2025-04-01 16:43:45 +03:00
Pavel Emelyanov	2ee9cec1d3	Merge 'Remove object_storage.yaml and move the endpoints to scylla.yaml' from Robert Bindar Move `object_storage.yaml` endpoints to `scylla.yaml` This change also removes the `object_storage.yaml` file altogether and adds tests for fetching the endpoints via the `v2/config/object_storage_endpoints` REST api. Also, `object_storage_config_file` options is moved to a deprecated state as it's no longer needed. This PR depends on #22951, the reviewers should review patch 393e1ac0ec066475ca94094265a5f88dbbdb1a1f Refs https://github.com/scylladb/scylladb/issues/22428 Closes scylladb/scylladb#22952 * github.com:scylladb/scylladb: Remove db::config::object_storage_config Move `object_storage.yaml` endpoints to `scylla.yaml`	2025-04-01 16:01:44 +03:00
Avi Kivity	69684e16d8	Merge 'sstables: add SSTable compression with shared dictionaries ' from Michał Chojnowski This PR extends Scylla's SSTable compression with the ability to use compression dictionaries shared across compression chunks. This involves several changes: - We refactor `compression_parameters` and friends (`compressor`, `sstables::local_compression`, `sstables::compression`) to prepare for making the construction of `compressor`s asynchronous, to enable sharing pieces of compressors (the dictionaries) across shards. - We introduce the notion of "hidden compression options" which are written to `CompressionInfo.db` and used to construct decompressors, like regular options, but don't appear in the schema. (We later stuff the SSTable's dictionary into `CompressionInfo.db` using a sequence of such options). - We add a cluster feature which guards the creation of dictionary-compressed SSTables. - We introduce a central "compressor factory" (one instance shared by all shards), which from this point onward is used to construct all `compressor` objects (one per SSTable) used to process the SSTables. When constructing a compressor for writing, it uses the "current"/"recommended" dictionary (which is passed to the factory from the actively-observed contents of the group0-managed `system.dicts`). When constructing a compressor for reading, it uses the dictionary written in the hidden compression options in CompressionInfo.db. And it keeps dictionaries deduplicated, so that each unique live dictionary blob has only one instance in memory, shared across shards. - We teach the relevant `lz4` and `zstd` compressor wrappers about the dictionaries. - We add a HTTP API call which samples pieces of the given table (i.e. the Data.db files) from across the cluster, trains a dictionary on it, and publishes it via `system.dicts` as the new current dictionary for that table. (And we add some RPC verbs to support that). - We add a HTTP API call which estimates the impact of various available compression configurations on the compression ratio. - We add an autotrainer fiber which periodically retrains dicts for dict-aware tables and publishes them if they seem to be a significant improvement. Known imperfections: - The factory currently keeps one dictionary instance on the entire node, but we probably want one copy per NUMA node. I didn't do that because exposing NUMA knowledge to Scylla seems to require some changes in Seastar first. New feature, no backporting involved. Closes scylladb/scylladb#23025 * github.com:scylladb/scylladb: docs: add user-facing documentation for SSTable compression with shared dicts docs/dev: add sstable-compression-dicts.md test: add test_sstable_compression_dictionaries_autotrain.py test: add test_sstable_compression_dictionaries_basic.py test/pylib/rest_client: add `keyspace_upgrade_sstables` helper main: run a sstable_dict_autotrainer api: add the estimate_compression_ratios API call dict_autotrainer: introduce sstable_dict_autotrainer db/system_keyspace: add query_dict_timestamp compress: add ZstdWithDictsCompressor and LZ4WithDictsCompressor main: clean up sstable compression dicts after table drops sstables/compress: discard hidden compression options after the decompressor is created compress: change compressor_ptr from shared_ptr to unique_ptr api: add the retrain_dict API call storage_service: add some dict-related routines main: in compression_dict_updated_callback, recognize and use SSTable compression dicts storage_service: add do_sample_sstables() messaging_service: add SAMPLE_SSTABLES and ESTIMATE_SSTABLE_VOLUME verbs db/system_keyspace: let `system.dicts` helpers be used for dicts other than the RPC compression dict raft/group0_state_machine: on `system.dicts` mutations, pass the affected partitition keys to the callback database: add sample_data_files() database: add take_sstable_set_snapshot() compress: teach `lz4_processor` about dictionaries compress: teach `zstd_processor` about dictionaries sstables: delegate compressor creation to the compressor factory sstables: plug an `sstable_compressor_factory` into `sstables_manager` sstables: introduce sstable_compressor_factory utils/hashers: add get_sha256() gms/feature_service: add the SSTABLE_COMPRESSION_DICTS cluster feature compress: add hidden dictionary options compress: remove `compression_parameters::get_compressor()` sstables/compress: remove get_sstable_compressor() sstables/compress: move ownership of `compressor` to `sstable::compression` compress: remove compressor::option_names() compress: clean up the constructor of zstd_processor compress: squash zstd.cc into compress.cc sstables/compress: break the dependency of `compression_parameters` on `compressor` compress.hh: switch compressor::name() from an instance member to a virtual call bytes: adapt fmt_hex to std::span<const std::byte>	2025-04-01 12:47:34 +03:00
Pavel Emelyanov	b5a124f60c	sstable_directory: Move highest_generation_seen() to distributed_loader.cc This method is only used by the loader code (and tests). Also, There's the highest_version_seen() peer that sits in the loader code either. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#23324	2025-04-01 09:15:14 +03:00
Botond Dénes	0fdf2a2090	Merge 'test/pylib: servers_add: support list of property_files' from Benny Halevy So that a multi-dc/multi-rack cluster can be populated in a single call. * Enhancement, no backport required Closes scylladb/scylladb#23341 * github.com:scylladb/scylladb: test/pylib: servers_add: add auto_rack_dc parameter test/pylib: servers_add: support list of property_files	2025-04-01 09:14:20 +03:00
Michał Chojnowski	f851efd4fa	test: add test_sstable_compression_dictionaries_autotrain.py Adds a test which checks that sstable compression dict autotraining does its job.	2025-04-01 00:07:31 +02:00
Michał Chojnowski	62da3d8363	test: add test_sstable_compression_dictionaries_basic.py Add a basic integration test for SSTable compression with shared dictionaries.	2025-04-01 00:07:30 +02:00
Michał Chojnowski	7b0eeefd79	test/pylib/rest_client: add `keyspace_upgrade_sstables` helper	2025-04-01 00:07:30 +02:00
Michał Chojnowski	a19d6d95f7	api: add the estimate_compression_ratios API call Add an API call which estimates the effectiveness of possible compression config changes. This can be used to make an informed decision about whether to change the compression method, without actually recompressing any SSTables.	2025-04-01 00:07:30 +02:00
Michał Chojnowski	10fa4abde7	compress: change compressor_ptr from shared_ptr to unique_ptr Cleanup patch. After we moved the ownership of compressors to sstables, compressor objects never have shared lifetime. `unique_ptr` is more appropriate for them than `shared_ptr` now. (And besides expressing the intent better, using `unique_ptr` prevents an accidental cross-shard `shared_ptr` copy).	2025-04-01 00:07:29 +02:00
Michał Chojnowski	58ae278d10	api: add the retrain_dict API call Add an API call which will retrain the SSTable compression dictionary for a given table. Currently, it needs all nodes to be alive to succeed. We can relax this later.	2025-04-01 00:07:29 +02:00
Michał Chojnowski	b77c611c00	raft/group0_state_machine: on `system.dicts` mutations, pass the affected partitition keys to the callback Before this patch, `system.dicts` contains only one dictionary, for RPC compression, with the fixed name "general". In later parts of this series, we will add more dictionaries to system.dicts, one per table, for SSTable compression. To enable that, this patch adjusts the callback mechanism for group0's `write_mutations` command, so that the mutation callbacks for group0-managed tables can see which partition keys were affected. This way, the callbacks can query only the modified partitions instead of doing a full scan. (This is necessary to prevent quadratic behaviours.) For now, only the `system.dicts` callback uses the partition keys.	2025-04-01 00:07:29 +02:00
Michał Chojnowski	30a9d471fa	sstables: plug an `sstable_compressor_factory` into `sstables_manager` Create a `sstable_compressor_factory_impl` in `scylla_main`, and pipe it through constructors into `sstables_manager`. In next commits, the factory available through the `sstables_manager` will be used to create compressors for SSTable readers and writers.	2025-04-01 00:07:28 +02:00
Michał Chojnowski	11be7c0704	compress: remove `compression_parameters::get_compressor()` Following up on the previous commits, we avoid constructing compressors where not necessary, by checking things directly on `compression_parameters` instead.	2025-04-01 00:07:28 +02:00
Michał Chojnowski	8e611536b0	sstables/compress: move ownership of `compressor` to `sstable::compression` SSTable readers and writers use `compressor` objects to compress and decompress chunks of SSTable data files. `compressor` objects are read-only, so only one of them is needed for each SSTable. Before this commit, each reader and writer has its own `compressor` object. This isn't necessary, but it's okay. But later in this series it will stop being okay, because the creation of a `compressor` will become an expensive cross-shard operation (because it might require sharing a compression dictionary from another shard). So we have to adjust the code so that there is only once `compressor` per sstable, not one per reader/writer. We stuff the ownership of this compressor into `sstable::compression`. To make the ownership clear, we remove `compression_ptr` shared pointers from readers and writers, and make them access the compressor via the `sstable::compression` instead.	2025-04-01 00:07:27 +02:00
Michał Chojnowski	cfe69e057f	sstables/compress: break the dependency of `compression_parameters` on `compressor` Note: this commit is meant to be a code refactoring only and is not intended to change the observable behaviour. Today `schema` contains a `compression_parameters`. `compression_parameters` contains an instance of `compressor`, and SSTable writers just share that instance. This is fine because `compressor` is a stateless object, functionally dependent on the schema. But in later parts of the series, we will break this functional dependency by adding dictionaries to compressors. Two writers for the same schema might have different dictionaries, so they won't be able to just share a single instance contained in the schema. And when that happens, having a `compressor` instance in the `schema`/`compression_parameters` will become awkward, since it won't be actually used. It will be only a container for options. In addition, for performance reasons, we will want to share some pieces of compressors across shards, which will require -- in the general case -- a construction of a compressor to be asynchronous, and therefore not possible inside the constructor of `compression_parameters`. This commit modifies `compression_parameters` so that it doesn't hold or construct instances of `compressor`. Before this patch, the `compressor` instance constructed in `compression_parameters` has an additional role of validating and holding compressor-specific options. (Today the only such option is the zstd compression level). This means that the pieces of logic responsible for compressor-specific options have to be rewritten. That ends up being the bulk of this commit.	2025-04-01 00:07:27 +02:00
Robert Bindar	b647196121	Remove db::config::object_storage_config That map became redundant once we added object_storage_endpoints in the config, this patch removes it and switches all the user code to use the new option. Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>	2025-03-31 17:15:12 +03:00
Gleb Natapov	5e06bf76e0	gossiper: pass rpc::client_info to gossiper_shutdown verb handler It will be needed later to obtain host id of the peer.	2025-03-31 16:50:50 +03:00
Tomasz Grabiec	29d1c2adc6	Merge 'Finalize tablet splits earlier' from Lakshmi Narayanan Sreethar Resize finalization is executed in a separate topology transition state, `tablet_resize_finalization`, to ensure it does not overlap with tablet transitions. The topology transitions into the `tablet_resize_finalization` state only when no tablet migrations are scheduled or being executed. If there is a large load-balancing backlog, split finalization might be delayed indefinitely, leaving the tables with large tablets. This PR fixes the issue by updating the load balancer to no schedule any migrations and to not make any repair plans when there a resize finalization is pending in any table. Also added a testcase to verify the fix. Fixes #21762 Improvement : No need to backport. Closes scylladb/scylladb#22148 * github.com:scylladb/scylladb: topology_coordinator: fix indentation in generate_migration_updates topology_coordinator: do not schedule migrations when there are pending resize finalizations load_balancer: make repair plans only when there is no pending resize finalization	2025-03-31 14:42:34 +02:00
Gleb Natapov	6999b474a1	gossiper: add ip to endpoint_state Store endpoint's IP in the endpoint state. Currently it is stored as a key in gossiper's endpoint map, but we are going to change that. The new filed is not serialized when endpoint state is sent over rpc, so it is set by the rpc handler from the value in the map that is in the rpc message. This map will not be changed to be host id based to not break interoperability.	2025-03-31 15:42:08 +03:00
Tomasz Grabiec	6bff596fce	tablets: Make tablet allocation equalize per-shard load Before, it was equalizing per-node load (tablet count), which is wrong in heterogenous clusters. Nodes with fewer shards will end up with overloaded shards. Refs #23378	2025-03-31 14:34:30 +02:00
Botond Dénes	90c20858ed	Merge 'test/database: Remove most of take_snapshot() helper overloads and re-use them more' from Pavel Emelyanov This helper facilitate snapshot creation by various test cases in database_test.cc. This PR generalizes all overloads into one that suits all callers and patches one more test case to use it as well. Closes scylladb/scylladb#23482 * github.com:scylladb/scylladb: test/database: Re-use take_snapshot() helper once more test/database: Remove most of take_snapshot() helper overloads	2025-03-31 15:20:51 +03:00
Benny Halevy	5f2ce0b022	loading_cache_test: test_loading_cache_reload_during_eviction: use manual_clock Rather than lowres_clock, as since `32b7cab917`, loading_cache_for_test uses manual_clock for timing and relying on lowres_clock to time the test might run out of memory on fast test machines. Fixes #23497 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#23498	2025-03-31 14:53:06 +03:00
Robert Bindar	e3a3508960	Move `object_storage.yaml` endpoints to `scylla.yaml` This change also removes the `object_storage.yaml` file altogether and adds tests for fetching the endpoints via the `v2/config/object_storage_endpoints` REST api. Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>	2025-03-31 13:39:39 +03:00
Pavel Emelyanov	ac582efb44	test/database: Re-use take_snapshot() helper once more There's a test case that can call the recently patched take_snapshot() helper as well. This changes nothing, but makes further patching a bit simpler (not in this branch). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-31 13:18:06 +03:00
Pavel Emelyanov	7e6380b6bd	test/database: Remove most of take_snapshot() helper overloads There are 3 of those that help tests (re)shuffle cql_test_env/database, skip_flush == true/false options and keyspace/table/snapshot names. There's little sense in having that many of those, just one overload with default arguments suits most of the callers. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-31 13:18:06 +03:00
Dawid Mędrek	b0b0c5905e	test/cluster/test_multidc: Clean up RF-rack-valid keyspaces tests There are some minor things we should fix that are a remnant of the original changes (scylladb/scylladb@7646e14). Closes scylladb/scylladb#23429	2025-03-31 09:38:42 +03:00
Pavel Emelyanov	693387bda6	Merge 'test.py: topology: allow to run tests with bare pytest command' from Evgeniy Naydanov Add possibility to run topology tests using bare pytest command. To achieve this goal the following changes were made: - Add fixtures `testpy_testsuite` and `testpy_test` to `test/conftest.py`. - To build `TestSuite` object we need to discover a corresponding `suite.xml` file. Do this by walking up thru the fs tree starting from the current test file. - Run ScyllaClusterManager using pytest fixture if `--manager-api` option is not provided. And made some refactoring: - Add path constants to `test` module and use them in different test suites instead of own dups of the same code: - TOP_SRC_DIR : ScyllaDB's source code root directory - TEST_DIR : the directory with test.py tests and libs - BUILD_DIR : directory with ScyllaDB's build artifacts - Add TestSuite.log_dir attribute as a ScyllaDB's build mode subdir of a path provided using `--tmpdir` CLI argument. Don't use `tmpdir` name because it mixed up with pytest's built-in fixture and `--tmpdir` option itself. - Change default value for `--tmdir` from `./testlog` to `TOP_SRC_DIR/testlog` - Refactor `ResourceGather` classes to use path from a `test` object instead of providing it separately. - Move modes constants (`all_modes`/`ALL_MODES` and `debug_modes`/`DEBUG_MODES`) to `test` module and remove duplication. - Move `prepare_dirs()` and `start_3rd_party_services()` from `pylib.util` to`pylib.suite.base` to avoid circular imports. - In some places refactor to use f-strings for formatting. Also minor changes related to running with pytest-xdist: - When run tests in parallel we need to ensure that filenames are unique by adding xdist worker ID to them. - Pass random seed across xdist workers using env variable. Closes scylladb/scylladb#22960 github.com:scylladb/scylladb: test.py: async_cql: remove unused event_loop fixture test.py: random_failures: make it play well with xdist test.py: add xdist worker ID to log filenames test.py: topology: run tests using bare pytest command test.py: add fixtures for current test suite and test test.py: refactor paths constants and options	2025-03-31 09:30:06 +03:00
Benny Halevy	a4aa4d74c1	test/pylib: servers_add: add auto_rack_dc parameter To quickly populate nodes in a single dc, each node in its own rack. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-03-30 19:23:40 +03:00
Benny Halevy	c4dbb11c87	test/pylib: servers_add: support list of property_files So that a multi-dc/multi-rack cluster can be populated in a single call. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-03-30 19:12:39 +03:00
Piotr Smaron	a2bbbc6904	auth: forbid modifying system ks by non-superusers Before this patch, granting a user MODIFY permissions on ALL KEYSPACES allowed the user to write to system tables, where the user could also set himself to "superuser" granting him all other permissions. After this patch, MODIFY permissions on ALL KEYSPACES is limited only to non-system keyspaces. Fixes: scylladb/scylladb#23218 Closes scylladb/scylladb#23219	2025-03-30 16:55:04 +03:00
Ferenc Szili	2c9b312b58	test: port of test and reproducer for resurrection during file based streaming This change ports test/cluster/test_resurrection.py from enterprise to master. Because the underlying issue deals with file based streaming, this test was a part of the enterprise repo. It contains the test and reproducer for the issue described below: When tablets are migrated with file-based streaming, we can have a situation where a tombstone is garbage collected before the data it shadows lands. For instance, if we have a tablet replica with 3 sstables: 1 sstable containing an expired tombstone 2 sstable with additional data 3 sstable containing data which is shadowed by the expired tombstone in sstable 1 If this tablet is migrated, and the sstables are streamed in the order listed above, the first two sstables can be compacted before the third sstable arrives. In that case, the expired tombstone will be garbage collected, and data in the third sstable will be resurrected after it arrives to the pending replica. The fix for the issue was merged in `b66479ea98` This patch only ports the missing test. Closes scylladb/scylladb#23466	2025-03-30 13:39:40 +03:00
Evgeniy Naydanov	1a0c14aa50	test.py: async_cql: remove unused event_loop fixture Newer version of pytest-asyncio (0.24.0) allows to control the scope of async loop per fixture. Don't need this workaround anymore.	2025-03-30 03:19:30 +00:00

1 2 3 4 5 ...

8586 Commits