scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-29 19:21:01 +00:00

Author	SHA1	Message	Date
Nadav Har'El	e97fbc2d65	test/alternator: fix compressed request test on non-us-east1 The test test_compressed_request.py::test_compressed_request coerces boto3 to send a compressed request, and wrongly used region_name=us-east-1 to set up the connection. Theoretically, this doesn't matter because we also set the correct URL (for either Alternator or the desired region in AWS). But in fact it does matter, because region name is part of the request's signature, and DynamoDB refuses the request if it comes to a different region than it is signed for. So this test fails when run on DynamoDB on any other region except us-east-1. The fix is simple - don't use the constant "us-east-1", but pick up the correct region name from the original connection. The functions new_dynamodb_session(), new_dynamodb() and new_dynamodb_stremas() had the same bug and we fix it too, but it didn't break any test because the only tests using these functions were Scylla-only so the AWS region problem didn't apply to them.	2026-01-07 13:33:46 +02:00
Patryk Jędrzejczak	f0d159abb0	Merge 'test/raft: use valid sentinel in liveness check to prevent digest errors' from Emil Maskovsky Replace -1 with 0 for the liveness check operation to avoid triggering digest validation failures. This prevents rare fatal errors when the cluster is recovering and ensures the test does not violate append_seq invariants. The value -1 was causing invalid digest results in the append_seq structure, leading to assertion failures. This could happen when the sentinel value was the first (or only) element being appended, resulting in a digest that did not match the expected value. By using 0 instead, we ensure that the digest calculations remain valid and consistent with the expected behavior of the test. The specific value of the sentinel is not important, as long as it is a valid elem_t that does not violate the invariants of the append_seq structure. In particular, the sentinel value is typically used only when no valid result is received from any server in the current loop iteration, in which case the loop will retry. Fixes: scylladb/scylladb#27307 Backporting to active branches - this is a test-only fix (low risk) for a flaky test that exists in older branches (thus affects the CI of active branches). Closes scylladb/scylladb#28010 * https://github.com/scylladb/scylladb: test/raft: use valid sentinel in liveness check to prevent digest errors test/raft: improve debugging in randomized_nemesis_test	2026-01-07 12:31:21 +01:00
Nadav Har'El	2c02e463ff	test/alternator: fix test's expected error message on DynamoDB The Alternator test test_tag.py::test_tag_lsi_gsi expects to see an error - it's not allowed to set a tag on a GSI or LSI - but the error message that DynamoDB prints recently changed - instead of saying "ResourceArn" the new error message says "resource arn". Change the test to allow both forms, so it will pass on both Alternator (which still uses the word ResourceArn - which is the name of the parameter) and on DynamoDB (which uses "resource arn"). Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2026-01-07 12:51:10 +02:00
Nadav Har'El	4f3150c282	test/alternator: mark Alternator-only test scylla_only The test test_batch.py::test_batch_write_item_large_broken_connection failed on DynamoDB (Refs #26079). It turns out this test has many problems: 1. This test wrongly assumes a batch write needs to complete in one attempt - and this fails on DynamoDB with low WCU capacity where the batch needs to be resumed in multiple requests. Using boto3's batch_writer() fixes this problem. 2. This test has NOTHING to do with batches - so is mis-named and mis-placed. The batch write is just a way to prepare some data in the table, and the real test is about Query'ing the data back and observing the long response and reproducing issue #14454. I did not rename or move the test, but left a comment explaining the situation. 3. This test is written to assume the Query's response uses HTTP chunked encoding. Which isn't actually true for DynamoDB, at least not at the time of this writing. So the test fails on DynamoDB. For the last reason, I made this test scylla_only. This test can't really be run on DynamoDB without rewriting it. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2026-01-07 12:51:10 +02:00
Nadav Har'El	df6b347911	test/alternator: fix test on DynamoDB The test test_batch.py::test_batch_write_item_large often fails when running on DynamoDB, and this patch fixes it. The test checks that a large but not over-the-limits large batch works. However, "works" only means that the batch is not an error - it doesn't guarantee that all the items in the batch are performed. If the WCU limits of the table are exceeded, DynamoDB may perform only part of the batch and return the remaining items as UnprocessedItems. This not only can happen, it usually does happen on DynamoDB - because a new on-demand-billing table always starts with a very low WCU capacity. So in this patch we update the test to recognize and perform the UnprocessedItems, instead of assuming it needs to be empty. The test continues to pass on Alternator, and finally passes on DynamoDB. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2026-01-07 12:51:10 +02:00
Nadav Har'El	9d6a463324	test/alternator: increase wait_for_gsi() timeout In Alternator tests, the wait_for_gsi() utility function is used in tests that add a GSI to an existing table, to wait for this new GSI to become ready. Although this takes a fraction of a second on Alternator, we noticed that this takes many minutes (!) on DynamoDB so we used an absurdly high 10 minute timeout to allow tests to also pass on DynamoDB. But it turns out that 10 minutes wasn't absurdly high enough, and tests using it in test_gsi_updatetable.py started to fail on DynamoDB. Empirically, 10 minutes was enough in the past but it seems that today adding a GSI to an empty table routinely takes as much as 20 minutes. So this patch increases the wait_for_gsi() timeout to a whopping 30 minutes. After this patch, the tests in test_gsi_updatetable.py which used to fail - test_gsi_backfill_with_lsi, test_gsi_backfill_with_real_column, test_gsi_creates_and_deletes and test_gsi_backfill_oversized_key now all pass on DynamoDB - but each takes more than 20 minutes to pass. To allow the test to fail much more quickly on Alternator (where creating a GSI takes a fraction of a second), we set a much lower but still very high timeout when running on Alternator - 60 seconds. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2026-01-07 12:50:54 +02:00
Łukasz Paszkowski	62313a6264	load_sketch: Allow populating load_sketch with normalized current load Currently, tablet allocation intentionally ignores current load (introduced by the commit #1e407ab), which could cause identical shard selection when allocating a small number of tablets in the same topology. When a tablet allocator is asked to allocate N tablets (where N is smaller than the number of shards on a node), it selects the first N lowest shards. If multiple such tables are created, each allocator run picks the same shards, leading to tablet imbalance across shards. This change initializes the load sketch with the current shard load, scaled into the [0,1] range, ensuring allocation still remains even while starting from globally least-loaded shards. Fixes https://github.com/scylladb/scylladb/issues/27620 Closes scylladb/scylladb#27802	2026-01-07 11:49:01 +01:00
Nadav Har'El	5f79d93102	Merge 'Alternator response compression' from Szymon Malewski This pull request introduces HTTP response compression to Alternator, allowing responses (both string and chunked) to be compressed using `gzip` or `deflate` when requested by clients and when the response size exceeds configurable thresholds. * Added new source files `http_compression.cc` and `http_compression.hh` implementing compression logic, including parsing client `Accept-Encoding` headers, selecting compression algorithms, and compressing response bodies using zlib. * Added two new configuration options to `db::config` (`alternator_response_gzip_compression_level` and `alternator_response_gzip_compression_threshold_in_bytes`) to control compression level (and optionally disable compression with level 0 - no compression) and minimum response size for compression. * Added tests showing compliance with DynamoDB behavior. Fixes #27246 New feature - no backporting Closes scylladb/scylladb#27454 * github.com:scylladb/scylladb: alternator/http_compression: Add compression of streamed response alternator/http_compression: Add implementation of gzip/deflate of string response alternator/http_compression: Add handling of Accept-Encoding header test/alternator: add tests for compressed responses	2026-01-06 16:47:11 +02:00
Emil Maskovsky	4ba3e90f33	test/raft: use valid sentinel in liveness check to prevent digest errors Replace -1 with 0 for the liveness check operation to avoid triggering digest validation failures. This prevents rare fatal errors when the cluster is recovering and ensures the test does not violate append_seq invariants. The value -1 was causing invalid digest results in the append_seq structure, leading to assertion failures. This could happen when the sentinel value was the first (or only) element being appended, resulting in a digest that did not match the expected value. By using 0 instead, we ensure that the digest calculations remain valid and consistent with the expected behavior of the test. The specific value of the sentinel is not important, as long as it is a valid elem_t that does not violate the invariants of the append_seq structure. In particular, the sentinel value is typically used only when no valid result is received from any server in the current loop iteration, in which case the loop will retry. Fixes: scylladb/scylladb#27307	2026-01-06 14:34:02 +01:00
Emil Maskovsky	3af5183633	test/raft: improve debugging in randomized_nemesis_test Move the post-condition check before the assertion to ensure it is always executed first. Before, the wrong value could be passed to the digest_remove assertion, making the pre-check trigger there instead of the post-check as expected. Also, add a check in the append_seq constructor to ensure that the digest value is valid when creating an append_seq object.	2026-01-06 14:32:46 +01:00
Ferenc Szili	a51cb3dad9	test: fix flaky test_update_load_stats_after_migration Disable load balancing to avoid the balancer moving the tablet from a node with less to a node with more available disk space. Otherwise, the move_tablet API can fail (if the tablet is already in transition) or be a no-op (in case the tablet has already been migrated). Fixes: #27980 Closes scylladb/scylladb#27993	2026-01-06 11:57:35 +02:00
Andrei Chekun	b546315edf	test.py: fix race condition in initialization of cqlpy tests Fix a race condition where a process finishes while the test is trying to check its descriptors. Now, instead of failing the whole loop, the test continues iterating over the remaining processes to find the needed one. Closes scylladb/scylladb#27994	2026-01-06 10:40:25 +02:00
Nadav Har'El	384e394ff0	Merge 'Add similarity functions to calculate similarity of given vectors' from Dawid Pawlik It should be possible to return the similarity of vectors in CQL statements following the [Cassandra compatible syntax](https://cassandra.apache.org/doc/latest/cassandra/getting-started/vector-search-quickstart.html#query-vector-data-with-cql): ``` SELECT comment, similarity_cosine(comment_vector, [0.1, 0.15, 0.3, 0.12, 0.05]) FROM cycling.comments_vs; ``` Although the calculations are slow, and we already have calculated results returned via Vector Store API, we need the functionality as it allows us to calculate similarity of vectors not stored in vector indexes. It will be needed for [quantization and rescoring](https://scylladb.atlassian.net/wiki/spaces/RND/pages/195985800/Quantization+and+Rescoring). The feature is also a nice-to-have in testing as requested many times by testing and CX teams. The optimized version utilizing already calculated distances from Vector Store without a need of rescoring will be coming soon after via https://github.com/scylladb/scylladb/pull/27991. --- The patch adds functions: - `similarity_cosine(<vector>, <vector>)`, - `similarity_euclidean(<vector>, <vector>)`, - `similarity_dot_product(<vector>, <vector>)` Where `<vector>` is either a column of type `VECTOR<FLOAT, N>` or a vector of floats literal. These functions can be called with every `SELECT` query, not only ANN vector queries as opposed to https://github.com/scylladb/scylladb/pull/25993. The similarity calculations are implemented inspired by [USearch's implementation]( `a2f1759910/include/usearch/index_plugins.hpp (L1304-L1385)`) and made compatible with [Cassandra's documentation](https://cassandra.apache.org/doc/5.0/cassandra/developing/cql/functions.html#vector-similarity-functions). That would guarantee the results in ScyllaDB are calculated using the exact same algorithms as used in Vector Store indexes. 
--- Fixes: SCYLLADB-88 Fixes: SCYLLADB-89 New feature, should land into 2026.1 Closes scylladb/scylladb#27524 * github.com:scylladb/scylladb: docs: add vector similarity functions documentation test/cqlpy: add similarity functions correctness tests test/cqlpy: add similarity functions invalid call tests cql3: introduce similarity functions syntax vector_similarity_fcts: introduce similarity functions vector_similarity_fcts: retrieve similarity function argument types vector_similarity_fcts: add calculating similarity between vectors	2026-01-05 18:28:10 +02:00
Tomasz Grabiec	ffa11d6a2d	test: Verify that repair doesn't block disabling of tablet load balancing Refs #27647	2026-01-05 13:22:15 +01:00
Nadav Har'El	5c2ca56adf	test/alternator: fix test passing a spurious parameter The test test_streams.py::test_streams_putitem_new_item_overrides_old_lsi failed on DynamoDB (Refs #26079) because we passed an unused parameter NonKeyAttributes to the Projection setting an LSI. NonKeyAttributes is only allowed when ProjectionType=INCLUDE, but we used ProjectionType=ALL. DynamoDB refuses to create an LSI with such inconsistent parameters, and we just need to remove this unnecessary parameter from this test. The reason why this test didn't fail on Alternator is that Alternator doesn't yet support or even parse the Projection parameter (Refs #5036). We also add an xfailing test (passes on DynamoDB, fails on Alternator) checking that a spurious NonKeyAttributes parameter is rejected. When we get around to implement the projection feature (#5036), this will be yet another acceptance test for this feature. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2026-01-05 13:51:01 +02:00
Botond Dénes	e4da0afb8d	reader_concurrency_semaphore: add protection against negative count resource leaks The semaphore has detection and protection against regular resource leaks, where some resources go unaccounted for and are not released by the time the semaphore is destroyed. There is no detection or protection against negative leaks: where resources are "made up" out of thin air. This kind of leak looks benign at first sight: a few extra resources won't hurt anyone so long as this is a small amount. But it turns out that even a single extra count resource can defeat a very important anti-deadlock protection in can_admit_read(): the special case which admits a new permit regardless of memory resources, when all original count resources are available. This check uses ==, so if resource > original, the protection is defeated indefinitely. Instead of just changing == to >=, we add detection of such negative leaks to signal(), via on_internal_error_noexcept(). At this time I still don't know how this negative leak happens (the code doesn't confess); with this detection, hopefully we'll get a clue from tests or the field. Note that on_internal_error_noexcept() will not generate a coredump, unless ScyllaDB is explicitly configured to do so. In production, it will just generate an error log with a backtrace. The detection also clamps the _resources to _initial_resources, to prevent any damage from the negative leak. I just noticed that there is no unit test for the deadlock protection described above, so one is added in this PR, even if only loosely related to the rest of the patch. Fixes: SCYLLADB-163 Closes scylladb/scylladb#27764	2026-01-05 12:45:15 +02:00
Szymon Malewski	1f658bb2e2	alternator/http_compression: Add compression of streamed response This patch adds compression of chunked responses. It adds intermediate stream to compress chunks of data that are provided to http sink. Fixes #27246	2026-01-05 10:14:42 +01:00
Szymon Malewski	b8afb173a6	alternator/http_compression: Add implementation of gzip/deflate of string response The previous commit added the means to decide whether the client asks for compression and with which algorithm. This patch adds actual compression of responses, based on the zlib library. For now only string (not chunked) responses are compressed. Several previously defined tests start to pass.	2026-01-05 10:14:42 +01:00
Szymon Malewski	08386ea959	test/alternator: add tests for compressed responses Adds a set of tests that: 1. Show how DynamoDB handles response compression. It supports 'gzip' and 'deflate' compression, which can be selected by providing the 'Accept-Encoding' header. It only encodes responses above 4096B. - `test_compressed_response`, `test_compressed_response_large` show compression for various response sizes. - `test_accept_encoding_header` focuses on testing various values of the Accept-Encoding header. - `test_multiple_accept_encoding_headers` verifies behaviour with repeated Accept-Encoding headers. 2. Will confirm implementation of response compression in Alternator (#27246) In addition to the above tests, we check Alternator-specific expectations: - `test_chunked_response_compression` makes sure that compression will work also for chunked responses. - `test_set_compression_options` checks config options to set response size threshold for compression and compression level 3. `test_signature_trims_accept_encoding_spaces` reveals Alternator's bug in signature verification (#27775)	2026-01-05 10:13:40 +01:00
Avi Kivity	0df85c8ae8	Revert "Merge 'Unify configuration of object storage endpoints' from Pavel Emelyanov" This reverts commit `1bb897c7ca`, reversing changes made to `954f2cbd2f`. It makes incompatible changes to the object storage configuration format, breaking tests [1]. It's likely that it doesn't break any production configuration, but we can't be sure. Fixes #27966 Closes scylladb/scylladb#27969	2026-01-05 08:53:41 +02:00
Benny Halevy	a8114f9bcc	test_backup: do_abort_restore: reduce data footprint To make the test fast, in particular in debug mode, insert fewer keys and do not rely on os.urandom, which is notoriously slow. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-05 08:04:08 +02:00
Benny Halevy	c0dd662144	test_backup: do_abort_restore: use error injection Currently the test depends on timing and enough inserted data to abort the restore tasks at exactly the right time. This is flaky in nature, so instead, use error injection to synchronize the abort with mutation streaming. Note that with that we no longer get the STREAM_MUTATION_FRAGMENTS log message, so waiting for it is dropped from the test. The most important thing is that some restore tasks must fail. (Unfortunately, we cannot guarantee that all would fail.) Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-05 08:03:53 +02:00
Benny Halevy	16dd07c7d4	test_backup: do_abort_restore: use asyncio for cql Use the more modern asyncio facility to run cql queries and a prepared statement to insert data into the table. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-05 08:03:53 +02:00
Benny Halevy	f1a583c39c	test_backup: do_abort_restore: use new_test_keyspace For creating a keyspace with a unique name and auto-deleting it on exit. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-05 08:03:53 +02:00
Benny Halevy	3e8431a3d9	test_backup: do_abort_restore: use logger rather than print Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-05 08:03:53 +02:00
Benny Halevy	acb2c9b045	test_backup: do_abort_restore: pass auto_rack_dc to servers_add To generate multi-rack cluster, otherwise we get the following error: ``` E cassandra.protocol.ConfigurationException: <Error from server: code=2300 [Query invalid because of configuration issue] message="Replication factor 3 exceeds the number of racks (1) in dc datacenter1"> ``` Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-05 08:03:53 +02:00
copilot-swe-agent[bot]	4e41b6f106	tools/scylla-nodetool: Increase precision of compression ratio from 1 to 2 decimal places In the tablestats (cfstats) command. Fixes: https://github.com/scylladb/scylladb/issues/27962 Closes scylladb/scylladb#27965	2026-01-05 07:07:06 +02:00
Nadav Har'El	c4a9d7eb3e	cql: fix DESC KEYSPACES when a "USE" is in effect If a CQL session USEs a keyspace and then calls DESC TABLES, the user expects to see only the tables in the chosen keyspace. However, calling DESC KEYSPACES should still list all the keyspaces - returning just the USEd one is not useful - and also not what Cassandra does. We had an xfailing test test_describe.py::test_keyspaces_with_use which reproduces this bug (and passes on Cassandra). In this patch we fix this bug. The fix is simple - USE should affect DESC statements, but be ignored for DESC KEYSPACES. We can then remove the xfail marker from the test. The patch also includes a new test for the DESC TABLES case, where the USE does have an effect, because I wanted to make sure the patch doesn't break this case. As usual, the new test passes on both Cassandra and ScyllaDB. Fixes #26334 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#27971	2026-01-04 22:01:12 +02:00
Dawid Pawlik	115bd51873	test/cqlpy: add similarity functions correctness tests Add `calculate_similarity` function for testing purposes. Add tests checking if CQL returned values match the calculated ones with the precision up to 5th decimal place. The tests should also be run on Cassandra to check compatibility with their responses.	2026-01-02 13:02:59 +01:00
Dawid Pawlik	12aa33106f	test/cqlpy: add similarity functions invalid call tests Add tests checking that calling similarity functions with: - non-vector columns - non-vector values - vectors with mismatching dimensions as arguments fails.	2026-01-02 12:49:22 +01:00
Nadav Har'El	6c8ddfc018	test/alternator: fix typo in test_returnvalues.py Different DynamoDB operations have different settings allowed for their "ReturnValues" argument. In particular, some operations allow ReturnValues=UPDATED_OLD but the DeleteItem operation does not. We have a test, test_delete_item_returnvalues, aimed to verify this but it had a typo and didn't actually check "UPDATED_OLD". This patch fixes this typo. The test still passes because the code itself (executor.cc, delete_item_operation's constructor) has the correct check - it was just the test that was wrong. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#27918	2026-01-01 19:33:23 +02:00
Łukasz Paszkowski	76b84b71d1	storage/test_out_of_space_prevention.py: Fix async/await bugs - Add missing await keywords for async operations on s2_log.wait_for() and coord_log.wait_for() - Fix incorrect regex: "compaction .* Split {cf}" → "compaction.*Split {cf}" - The commit https://github.com/scylladb/scylladb/commit/f7324a4 demoted compaction start/end log messages to debug level. Hence add compaction=debug log messages to the following tests: test_split_compaction_not_triggered test_node_restart_while_tablet_split test_repair_failure_on_split_rejection Fixes https://github.com/scylladb/scylladb/issues/27931 Closes scylladb/scylladb#27932	2026-01-01 14:24:30 +02:00
Nadav Har'El	e28df9b3d0	test: fix Python warnings in regular expressions Like C, Python supports some escape sequences in strings such as the familiar "\n" that converts to a newline character. Originally, when backslash was used before a random character, for example, "\.", Python used to just use these literal characters backslash and dot, in the string - and not make a fuss about it. This made it ok to use a string like "hi\.there" as a regular expression. We have a few instances of this in our Python tests. But recent releases of Python started to produce ugly warnings about these cases. The warning message looks like: SyntaxWarning: "\." is an invalid escape sequence. Such sequences will not work in the future. Did you mean "\\."? A raw string is also an option. Indeed in most cases the easiest solution is to use a "raw string", a string literal preceded with r. For example, r"hi\.there". In such strings Python doesn't replace escape sequences like \n in the string, and also leaves the \. unchanged for the regular expression to see. So in this patch we use raw strings in all places in test/ where Python warns about this problem. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#27856	2025-12-31 20:44:01 +02:00
Asias He	3abda7d15e	topology_coordinator: Ensure repair_update_compaction_ctrl is executed Consider this: - n1 is a coordinator and schedules tablet repair - n1 detects tablet repair failed, so it schedules tablet transition to end_repair state - n1 loses leadership and n2 becomes the new topology coordinator - n2 runs end_repair on the tablet with session_id=00000000-0000-0000-0000-000000000000 - when a new tablet repair is scheduled, it hangs since the lock is already taken because it was not removed in previous step To fix, we use the global_tablet_id to index the lock instead of the session id. In addition, we retry the repair_update_compaction_ctrl verb in case of error to ensure the verb is eventually executed. The verb handler is also updated to check if it is still in end_repair stage. Fixes #26346 Closes scylladb/scylladb#27740	2025-12-31 13:17:18 +01:00
Yaniv Kaul	0264ec3c1d	test: test_downgrade_after_partial_upgrade: check that feature is disabled on all nodes after partial upgrade We should check that the test feature is disabled on all nodes after a partial upgrade. This hardens the test a bit, although the old code wasn't that bad, since enabled features are a part of the group 0 state shared by all nodes. Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> Closes scylladb/scylladb#27654	2025-12-30 17:34:56 +01:00
Nadav Har'El	ffcce1ffc8	test/boost: fix flaky test node_view_update_backlog The boost test view_schema_test.cc::node_view_update_backlog can be flaky if the test machine has a hiccup of 100ms, and this patch fixes it: The test is a unit test for db::view::node_update_backlog, which is supposed to cache the backlog calculation for a given interval. The test asks to cache the backlog for 100ms, and then without sleeping at all tries to fetch a value again and expects the unchanged cached value to be returned. However, if the test run experiences a context switch of 100ms, it can fail, and it did once as reported in #27876. The fix is to change the interval in this test from 100ms to something much larger, like 10 seconds. We don't sleep this amount - we just need the second fetch to happen before 10 seconds has passed, so there's no harm in using a very large interval. However, the second half of this test wants to check that after the interval is over, we do get a new backlog calculation. So for the second half of this test we can and should use a shorter interval - e.g., 10ms. We don't care if the test machine is slow or context switched; for this half of the test we want to sleep more than 10ms, and that's easy. The fixed test is faster than the old one (10ms instead of 100ms) and more reliable on a shared test machine. Fixes #27876. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#27878	2025-12-30 10:10:42 +01:00
Benny Halevy	c9eab7fbd4	test: test_refresh: add test_refresh_deletes_uploaded_sstables The refresh api is expected to automatically delete the sstable files from the uploads/ dir. Verify that. The code that does that is currently called by sstables_loader::load_new_sstables: ```c++ if (load_and_stream) { ... co_await loader.load_and_stream(ks_name, cf_name, table_id, std::move(sstables_on_shards[this_shard_id()]), primary_replica_only(primary), true /* unlink */, scope, {}); ``` Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#27586	2025-12-30 10:51:24 +03:00
Patryk Jędrzejczak	0fed9f94f8	gossiper: add_saved_endpoint: make generations of excluded nodes negative The explanation is in the new comment in `gossiper::add_saved_endpoint`. We add a test for this change. It's "extremely white-box", but it's better than nothing.	2025-12-29 19:13:55 +01:00
Patryk Jędrzejczak	749b0278e5	test: introduce test_full_shutdown_during_replace	2025-12-29 19:13:55 +01:00
Tomasz Grabiec	bbf9ce18ef	Merge 'load_balancer: compute node load based on tablet sizes' from Ferenc Szili Currently, the tablet load balancer performs capacity based balancing by collecting the gross disk capacity of the nodes, and computes balance assuming that all tablet sizes are the same. This change introduces size-based load balancing. The load balancer does not assume identical tablet sizes any more, and computes load based on actual tablet sizes. The size-based load balancer computes the difference between the most and least loaded nodes in the balancing set (nodes in DC, or nodes in a rack in case of `rf-rack-valid-keyspaces`) and stops further balancing if this difference is below the config option `size_based_balance_threshold_percentage`. This config option does not apply to the absolute load, but instead to the percentage of how much the most loaded node is more loaded than the least loaded node: `delta = (most_loaded - least_loaded) / most_loaded` If this delta is smaller than the config threshold, the balancer will consider the nodes balanced. This PR is a part of a series of PRs which are based on top of each other. - First part for tablet size collection via load_stats: #26035 - Second part reconcile load_stats: #26152 - The third part for load_sketch changes: #26153 - The fourth part which performs tablet load balancing based on tablet size: #26254 - The fifth part changes the load balancing simulator: #26438 This is a new feature, backport is not needed. Fixes #26254 Closes scylladb/scylladb#26254 * github.com:scylladb/scylladb: test, load balancing: add test for table balance load_balancer: add cluster feature for size based balancing load_balancer: implement size-based load balancing config: add size based load balancing config params load_stats: use trinfo to decide how to reconcile tablet size load_sketch: use tablet sizes in load computation load_stats: add get_tablet_size_in_transition()	2025-12-29 15:01:38 +01:00
Gleb Natapov	4a5292e815	raft topology: Notify that a node was removed only once Raft topology goes over all nodes in a 'left' state and triggers 'remove node' notification in case id/ip mapping is available (meaning the node left recently), but the problem is that, since the mapping is not removed immediately, when multiple nodes are removed in succession a notification for the same node can be sent several times. Fix that by sending the notification only if the node still exists in the peers table. It will be removed by the first notification, and the following notifications will not be sent. Closes scylladb/scylladb#27743	2025-12-29 14:22:34 +01:00
Radosław Cybulski	a31c8762ca	Update tests	2025-12-29 08:33:09 +01:00
Radosław Cybulski	dfa600fb8f	Add simple_value_with_expiry util class Add a `simple_value_with_expiry` utility class, which functions like a `std::optional` with an added timeout. When emplacing a value, the user needs to provide a timeout, after which the value expires (in which case the `simple_value_with_expiry` object behaves as if it was never set at all). Add boost tests for the new class.	2025-12-29 08:32:52 +01:00
Avi Kivity	63e3a22f2e	Merge 'group0_state_machine: don't update in-memory state machine until start' from Piotr Dulikowski Group0 commands consist of one or more mutations and are supposed to be atomic - i.e. the data structures that reflect the group0 tables state are not supposed to be updated while only some mutations of a command are applied, and the logic responsible for that is not supposed to observe an inconsistent state of group0 tables. It turns out that this assumption can be broken if a node crashes in the middle of applying a multi-mutation group0 command. Because these mutations are, in general, applied separately, only some mutations might survive a crash and a restart, so the group0 tables might be in an inconsistent state. The current logic of group0_state_machine will attempt to read the group0 tables' state as it was left after restart, so it may observe inconsistent state. This can confuse the node as it may observe a state that it was not supposed to observe, or the state will just outright break some invariants and trigger some sanity checks. One of those was observed in https://github.com/scylladb/scylladb/issues/26945, where a command from the CDC generation publisher fiber was partially applied. The fiber, in addition to publishing generations, removes old, expired generations as well. Removal is done by removing data that describes the generation from cdc_generations_v3 and by removing the generation's ID from the committed generation list in the topology table. If only the first mutation gets through but not the other one, on reload the node will see a committed CDC generation without data, which will trigger an on_internal_error check. Fix this by delaying the moment when the in-memory data structures are first loaded. In `579dcf187a`, a mechanism was introduced which persists the commit index before applying commands that are considered committed. Starting a raft server waits until commands are replayed up to that point. 
The fix is to start the group0_state_machine in a mode which only applies mutations - the aforementioned mechanism will re-apply the commands which will, thanks to the mutation idempotency, bring the group0 to a consistent state. After the group0 is known to be in consistent state (so, after raft::server_impl::start) the in-memory data structures of group0 are loaded for the first time. There is an exception, however: schema tables. Information about schema is actually loaded into memory earlier than the moment when group0 is started. Applying changes to schema is done through the migration manager module which compares the persisted state before and after the schema mutations are applied and acts on that. Refactoring migration manager is out of scope of this PR. However, this is not a problem because the migration manager takes care to apply all of the mutations given in a command in a single commitlog segment, so the initial schema loading code should not see an inconsistent state due to the state being partially applied. The fix is accompanied by a reproducer of scylladb/scylladb#26945. Fixes: scylladb/scylladb#26945 This is not a regression, so no need to backport. Closes scylladb/scylladb#27528 * github.com:scylladb/scylladb: test: cluster: test for recovery after partial group0 command group0_state_machine: remove obsolete comment about group0 consistency group0_state_machine: don't update in-memory state machine until start group0_state_machine: move reloading out of std::visit service: raft: add state machine ref to raft_server_for_group	2025-12-28 13:59:26 +02:00
Ferenc Szili	6d3c720a08	test, load balancing: add test for table balance This change adds a boost test which validates the resulting table balance of size based load balancing. The threshold was set to a conservative 1.5 overcommit to avoid flakiness.	2025-12-27 11:39:08 +01:00
Ferenc Szili	10eb364821	load_balancer: implement size-based load balancing This change introduces tablet size based load balancing. It is an extension of capacity based balancing with the addition of actual tablet sizes. It computes the difference between the most and least loaded nodes in the DC and stops further balancing if this difference is below the config option size_based_balance_threshold_percentage. This config option does not apply to the absolute load, but instead to the percentage of how much the most loaded node is more loaded than the least loaded node: delta = (most_loaded - least_loaded) / most_loaded If this delta is smaller than the config threshold, the balancer will consider the nodes balanced.	2025-12-27 11:20:20 +01:00
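The stopping rule described in this commit message can be illustrated with a minimal Python sketch. This is not the load balancer's actual code: the function name `is_balanced`, the handling of a zero load, and the assumption that the threshold is expressed in percent (so that delta is multiplied by 100 before comparing) are all hypothetical and chosen only for illustration.

```python
def is_balanced(most_loaded: float, least_loaded: float,
                threshold_percentage: float) -> bool:
    """Return True if the relative load gap is below the threshold.

    Hypothetical helper; assumes the threshold is given in percent.
    """
    if most_loaded <= 0:
        # Assumption: with no load at all there is nothing to balance.
        return True
    # Relative difference, as given in the commit message.
    delta = (most_loaded - least_loaded) / most_loaded
    return delta * 100 < threshold_percentage


# A 5% gap is within a 10% threshold; a 50% gap is not.
print(is_balanced(100.0, 95.0, 10.0))  # True
print(is_balanced(100.0, 50.0, 10.0))  # False
```

Because delta is relative to the most loaded node, the same absolute gap counts for less as overall load grows, which matches the message's note that the option does not apply to the absolute load.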
Ferenc Szili	621cb19045	load_sketch: use tablet sizes in load computation This commit changes load_sketch so that it computes node and shard load based on tablet sizes instead of tablet count.	2025-12-27 10:37:23 +01:00
Pavel Emelyanov	bda1709734	Merge 'test: fix infinite loop in python log browsing code triggered from test_orphaned_sstables_on_startup' from Avi Kivity Recently, test/cluster/test_tablet.py::test_orphaned_sstables_on_startup started spinning in the log browsing code, part of the test library that looks into log files for expected or unexpected patterns. This reproduced somewhat in continuous integration, and very reliably for me locally. The test was introduced in `fa10b0b390`, a year ago. There are two bugs involved: first, we look for crashes in this test even though it is in fact expected to crash. The node expectedly fails with an on_internal_error. Second, the log browsing code contains an infinite loop if the crash backtrace happens to be the last thing in the log. The series fixes both bugs. Fixes #27860. While the bad code exists in release branches, it doesn't trigger there so far, so best to only backport it if it starts manifesting there. Closes scylladb/scylladb#27879 * github.com:scylladb/scylladb: test: pylib: log_browsing: fix infinite loop in find_backtraces() test: pylib/log_browsing, cluster/test_tablets: don't look for expected crashes	2025-12-26 10:45:56 +03:00
Nadav Har'El	9c50d29a00	test/boost: fix flaky test_inject_future_disabled The test boost/error_injection_test.cc::test_inject_future_disabled checks what happens when a sleep injection is disabled: The test has a 10-millisecond-sleep injection and measures how long it takes. The test expects it to take less than 10 milliseconds - in fact it should take almost zero. But this is not guaranteed - on a slow debug build and an overcommitted server this do-nothing injection can take some time, and in one run (#27798) it took 14 milliseconds - and the test failed. The solution is easy - make the sleep-that-doesn't-happen much longer - e.g., 10 whole seconds. Since this sleep still doesn't happen, we expect the injection to return in less - much less - than 10 seconds. This 10 seconds is so ridiculously high we don't expect the do-nothing injection to take 10 seconds, not even on a ridiculously busy test machine. Fixes #27798 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#27874	2025-12-25 20:46:31 +02:00
Avi Kivity	92996ce9fa	test: pylib: log_browsing: fix infinite loop in find_backtraces() The find_backtraces() function uses a very convoluted loop to read the log file. The loop fails to terminate if the last thing in the log file is the backtrace, since the loop termination condition (`not line`) continues to be true. It's not clear why this did not reliably hit before, but it now reliably reproduces for me on both x86 and aarch64. Perhaps timing changed, or perhaps previously we had more text on the log.	2025-12-25 20:22:17 +02:00
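The failure mode described in this commit message (a reader loop that never terminates when the backtrace is the last thing in the log) can be sketched as follows. This is not the real `find_backtraces()` from the test library. The marker string, the two-space frame indentation, and the return shape are hypothetical, and the sketch only shows one way to make end-of-file an explicit exit condition of both loops.

```python
import io


def find_backtraces(log, marker="Backtrace:"):
    """Collect the indented frame lines that follow each marker line.

    Hypothetical sketch: `marker` and the two-space frame prefix are
    assumptions, not the real log format.
    """
    backtraces = []
    line = log.readline()
    while line:  # readline() returns "" only at end of file
        if line.startswith(marker):
            frames = []
            line = log.readline()
            # Stop on EOF as well as on the first non-frame line, so a
            # backtrace at the very end of the log cannot spin forever.
            while line and line.startswith("  "):
                frames.append(line.strip())
                line = log.readline()
            backtraces.append(frames)
            continue  # `line` already holds the next unprocessed line
        line = log.readline()
    return backtraces


# The backtrace is the last thing in the log, and the loop still ends.
log = io.StringIO("INFO start\nBacktrace:\n  0x1\n  0x2\n")
print(find_backtraces(log))  # [['0x1', '0x2']]
```

Checking `line` for emptiness in the inner loop is what guarantees termination here, since `readline()` keeps returning an empty string once the file is exhausted.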


11801 Commits