scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-23 16:22:15 +00:00

Author	SHA1	Message	Date
Gleb Natapov	e88ce09372	raft_topology: log sub-step progress in local_topology_barrier When a node processes a barrier_and_drain topology command, it performs two potentially long-running operations inside local_topology_barrier(): waiting for stale token metadata versions to be released (stale_versions_in_use) and draining closing sessions (drain_closing_sessions). Either of these can hang indefinitely -- for example, stale_versions_in_use blocks until all references to previous token metadata versions are released, which depends on in-flight requests completing. Previously, the only logging was a single 'done' message at the end, making it impossible to determine which sub-step was blocking when a barrier_and_drain RPC appeared stuck on a node. In a recent CI failure, a node never responded to barrier_and_drain during a removenode operation, and the logs showed the RPC was received but nothing about what it was waiting on internally. Add info-level logging before each blocking sub-step, including the topology version for correlation. This allows diagnosing hangs by showing whether the node is stuck waiting for stale metadata versions, stuck draining sessions, or never reached these steps at all.	2026-05-04 15:58:45 +03:00
Piotr Szymaniak	a5d35d2b4c	test/cluster: Test deferred stream enablement on tablet tables Async cluster test exercising the deferred enablement lifecycle: ENABLING -> ENABLED -> disabled, verifying tablet merge blocking and unblocking at each stage. Uses delay_cdc_stream_finalization error injection and CQL ALTER TABLE with tablet count constraints. Also adds tablet scheduler config to test_config.yaml (fast refresh interval, scale factor 1) for reliable tablet count changes.	2026-04-19 03:54:33 +02:00
Avi Kivity	0ae22a09d4	LICENSE: Update to version 1.1 Updated terms of non-commercial use (must be a never-customer).	2026-04-12 19:46:33 +03:00
Piotr Szymaniak	c8e7e20c5c	test/cluster: retry create_table on transient schema agreement timeout In test_index_requires_rf_rack_valid_keyspace, the create_table call for a plain tablet-based table can fail with 'Unable to reach schema agreement' after the server's 10s timeout is exceeded. This happens when schema gossip propagation across the 4-node cluster takes longer than expected after a sequence of rapid schema changes earlier in the test. Add a retry (up to 2 attempts) on schema agreement errors for this specific create_table call rather than increasing the server-side timeout. Fixes: SCYLLADB-1135 Closes scylladb/scylladb#29132	2026-03-23 10:45:30 +02:00
Nadav Har'El	ad832c263e	test/cluster: mark test_alternator_concurrent_rmw_same_partition_different_server not strictly xfail A few days ago, in commit `7b30a39` we added to pytest.ini the option xfail_strict. This option causes every XPASS, i.e., an xfail test that actually passes, to be considered an error and fail the test. But some tests demonstrate a timing-related bug and do not reproduce the bug every single time. An example we noticed in one CI run is: test/cluster/test_alternator.py::test_alternator_concurrent_rmw_same_partition_different_server This test reproduces a timing-related bug (if you do an LWT write to one partition on two different coordinators "at the same time", you can get a failure), but only most of the time, not 100% of the time. The solution is to add "strict=False" for the xfail marker on this specific test. This undoes the xfail_strict for this specific test, accepting that this specific test can either pass or fail. Note that this does NOT make this test worthless - we still see this test failing most of the time, and when a developer finally fixes this issue, the test will begin to pass all the time. Fixes https://scylladb.atlassian.net/browse/SCYLLADB-941 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#29016	2026-03-12 23:46:23 +02:00
Nadav Har'El	0c7f499750	test, alternator: add test for TTL expiration with a node down We have many single-node functional tests for Alternator TTL in test/alternator/test_ttl.py. This patch adds a multi-node test in test/cluster/test_alternator.py. The new test verifies that: 1. Even though Alternator TTL splits the work of scanning and expiring items between nodes, all the items get correctly expired. 2. When one node is down, all the items still expire because the "secondary" owner of each token range takes over expiring the items in this range while the "primary" owner is down. This new test is actually a port of a test we already had in dtest (alternator_ttl_tests.py::test_multinode_expiration). This port is faster and smaller than the original (fewer nodes, fewer rows), but it still found a regression (SCYLLADB-777) that dtest missed - the new test failed when running with tablets and in release build mode. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2026-02-23 16:19:43 +02:00
Andrei Chekun	cc5ac75d73	test.py: remove deprecated skip_mode decorator Finishing the deprecation of the skip_mode function in favor of pytest.mark.skip_mode. This PR only cleans up and migrates the leftover tests that still use the old way of skip_mode. Closes scylladb/scylladb#28299	2026-01-25 18:17:27 +02:00
Michael Litvak	1f7a65904e	alternator: don't require rf_rack flag for indexes, validate instead In `8df61f6d99` we changed the requirements for creating materialized views and MV-based indexes - instead of requiring the rf_rack_valid_keyspaces flag to be set, we now require the keyspace to be RF-rack-valid at the time of creation, and it is enforced to remain RF-rack-valid while the MV exists. This validation is done in the cql create view/index statements. The same should be done also for alternator - when creating a table with GSI or LSI, or when adding a GSI to an existing table, previously we required the flag rf_rack_valid_keyspaces to be set. Now we change it to instead check if the keyspace is RF-rack-valid, and if not the operation fails with an appropriate error.	2026-01-22 16:11:35 +01:00
Michael Litvak	e7ec87382e	Revert "alternator: require rf_rack_valid_keyspaces when creating index" This reverts commit `4b26a86cb0`. The rf_rack_valid_keyspaces option is now not required for creating MVs.	2026-01-20 09:56:48 +01:00
Tomasz Grabiec	6936704677	test: Add missing calls to disable_tablet_balancing() in tests which use move_tablet() API If a test tries to move a tablet, it assumes the tablets are stable. This fixes flakiness exposed by size-based load-balancing and a later change to refresh stats sooner.	2026-01-13 00:38:00 +01:00
Andrei Chekun	c950c2e582	test.py: convert skip_mode function to pytest.mark Function skip_mode works only on functions and only in cluster tests. This is OK when we need to skip one test, but it's not possible to use it with pytestmark to automatically mark all tests in the file. The goal of this PR is to migrate skip_mode to be a dynamic pytest.mark that can be used as an ordinary mark. Closes scylladb/scylladb#27853 [avi: apply to test/cluster/test_tablets.py::test_table_creation_wakes_up_balancer]	2026-01-08 21:55:16 +02:00
Michael Litvak	b9ec1180f5	alternator: require rf_rack_valid_keyspaces when creating index When creating an alternator table with tablets, if it has an index, LSI or GSI, require the config option rf_rack_valid_keyspaces to be enabled. The option is required for materialized views in tablets keyspaces to function properly and avoid consistency issues that could happen due to cross-rack migrations and pairing switches when RF-rack validity is not enforced. Currently the option is validated when creating a materialized view via the CQL interface, but it's missing from the alternator interface. Since alternator indexes are based on materialized views, the same check should be added there as well. Fixes scylladb/scylladb#27612 Closes scylladb/scylladb#27622	2025-12-15 10:36:57 +02:00
Petr Gusev	3a865fe991	alternator/executor.cc: move shard check into cas_write This change ensures that if cas_shard points to a different shard, the executor will continue issuing shard jumps until cas_shard.this_shard() returns true. The commit simply moves the this_shard() check from the parallel_for_each lambda into cas_write, with minimal functional changes. We enable test_alternator_invalid_shard_for_lwt since now it should pass. Fixes scylladb/scylladb#27353	2025-12-09 10:21:01 +01:00
Petr Gusev	e60bcd0011	test_alternator: add test_alternator_invalid_shard_for_lwt This test reproduces scylladb/scylladb#27353 using two injection points. First, the test triggers an intra-node tablet migration and suspends it at the streaming stage using the intranode_migration_streaming_wait injection. Next, it enables the alternator_executor_batch_write_wait injection, which suspends a batch write after its cas_shard has already been created. The test then issues several batch writes and waits until one of them hits this injection on the destination shard. At this point, the cas_shard.erm for that write is still in the streaming state, meaning the executor would need to jump back to the source shard. The test then resumes the suspended tablet migration, allowing it to update the ERM on the source shard to write_both_read_new. After that, the test releases the suspended batch write and expects it to perform two shard jumps: first from the destination to the source shard, and then again back to the source shard. This commit adds the alternator_executor_batch_write_wait injection to alternator/executor.cc. Coroutines are intentionally avoided in the parallel_for_each lambda to prevent unnecessary coroutine-frame allocations.	2025-12-08 10:29:28 +01:00
Nadav Har'El	7dc04b033c	test/cluster: fix missing racks in xfailing Alternator test Since Alternator is now using tablets by default, it's no longer possible to create an Alternator table on a 3-node cluster with a single rack - you need to have 3 racks to support RF=3. Most of the multi-node Alternator tests in test/cluster/test_alternator.py were already fixed to use a 3-rack cluster, but one test was missed because it was marked "xfail" so its new failure to create the table was missed. This patch adds the missing 3-rack setup, so the xfailing test returns to failing on the real bug - not on the table creation. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#27382	2025-12-03 10:54:11 +03:00
Nadav Har'El	65ed678109	test,alternator: use 3-rack clusters in tests With tablets enabled, we can't create an Alternator table on a three- node cluster with a single rack, since Scylla refuses RF=3 with just one rack and we get the error: An error occurred (InternalServerError) when calling the CreateTable operation: ... Replication factor 3 exceeds the number of racks (1) in dc datacenter1 So in test/cluster/test_alternator.py we need to use the incantation "auto_rack_dc='dc1'" every time that we create a three-node cluster. Before this patch, several tests in test/cluster/test_alternator.py failed on this error, with this patch all of them pass. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2025-11-09 12:52:29 +02:00
Piotr Szymaniak	63897370cb	alternator: Fix tag name to request vnodes The tag was lately renamed from `experimental:initial_tablets` to `system::initial_tablets`. This commit fixes both the tests as well as the exceptions sent to the user instructing how to create table with vnodes.	2025-11-09 12:52:29 +02:00
Tomasz Grabiec	7f66f67d95	test: alternator: Adjust for rack lists To achieve RF=3 with tablets and rf_rack_valid_keyspaces, we need 3 racks. So change the test to create 3 racks. Alternator was bypassing the standard keyspace creation path, so it escaped validation. But this will change, and the test will stop working. Also, after auto-expansion of RF to rack list, not all of the 4 nodes will host replicas. So we need to adjust expectations.	2025-10-29 23:32:58 +01:00
Tomasz Grabiec	40e7543361	test: Create cluster with multiple racks in multi-dc setups To allow auto-expansion of numeric RF to rack list. Otherwise, keyspace creation will be rejected if rf-rack-valid keyspaces are enforced.	2025-10-29 23:32:57 +01:00
Nadav Har'El	265660d22f	test/cluster: greatly speed up test_localnodes_joining_nodes The test cluster/test_alternator::test_localnodes_joining_nodes takes a whopping 2 minutes and 9 seconds to run before this patch. After this patch, it takes just 7 seconds. The slowness of this test was caused by booting a second node that hangs during boot for 2 minutes, deliberately. We never intended for this boot to finish (the whole point of this test is to run before it finishes), but unfortunately had to wait for it to avoid all sorts of nasty problems with unwaited futures. As comments already explained in the code, the solution to this problem is to kill the server at the end of the test - after we kill it, we can wait for it - this wait will very quickly notice that the server addition failed, and not need to wait 2 minutes. But until the previous patch, we had no API to find the server which is starting (not yet running), or to kill it. After the previous patch, we do have such an API, and can now use it, and see this test finish in 7 seconds instead of 2 minutes and 9 seconds.	2025-09-25 14:00:16 +03:00
Artsiom Mishuta	4b975668f6	tiering (test.py): introduce tiering labels Introduce tiering marks: 1. “unstable” - for unstable tests that will continue running every night and generate up-to-date statistics with failures without failing the “Main” verification path (scylla-ci, Next). 2. “nightly” - for tests that are quite old, stable, and test functionality that is rather not changed or affected by other features, are partially covered in other tests, verify non-critical functionality, have not found any issues or regressions, are too long to run on every PR, and can be popped out from the CI run. Set 7 long tests (according to statistics in elastic) as nightly (these 8 tests took 20% of the CI run, about 4 hours without parallelization) and 1 test as unstable (as an example of marker usage). Closes scylladb/scylladb#24974	2025-08-04 15:38:16 +03:00
Nadav Har'El	2431f92967	alternator, test: add reproducer for issue about immediate LWT timeout This patch adds a reproducer for issue #16261, where it was reported that when Alternator read-modify-write (using LWT) operations to the same partition are sent to different nodes, sometimes the operation fails immediately, with an InternalServerError claiming to be a "timeout", although this happens almost immediately (after a few milliseconds), not after any real timeout. The test uses 3 nodes, and 3 threads which send RMW operations to different items in the same partition, and usually (though not with 100% certainty) it reaches the InternalServerError in around 100 writes by each thread. This InternalServerError looks like: Internal server error: exceptions::mutation_write_timeout_exception (Operation timed out for alternator_alternator_Test_1719157066704.alternator_Test_1719157066704 - received only 1 responses from 2 CL=LOCAL_SERIAL.) The test also prints how much time it took for the request to fail, for example: In incrementing 1,0 on node 1: error after 0.017074108123779297 This is 0.017 seconds - it's not the cas_contention_timeout_in_ms timeout (1 second) or any other timeout. If we enable trace logging, adding to topology_experimental_raft/suite.yaml extra_scylla_cmdline_options: ["--logger-log-level", "paxos=trace"] we get the following TRACE-level message in the log: paxos - CAS[0] accept_proposal: proposal is partially rejected This again shows the problem is "uncertainty" (partial rejection) and not a timeout. Refs #16261 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#19445	2025-08-01 11:58:52 +03:00
Nadav Har'El	16c1365332	test,alternator: test server-side load balancing with zero-token node In issue #6527 it was suggested that a zero-token node (a.k.a coordinator- only node, or data-less node) could serve as a topology-aware Alternator load balancer - requests could be sent to it and they will be forwarded to the right node. This feature was implemented, but we never tested that it actually works for Alternator requests. So this patch tests this by starting a 5-node cluster with 4 regular nodes and one zero-token node, and testing that requests to the zero-token node work as expected. It is important to know that this feature does indeed work as expected, and also to have a regression test for it so the feature doesn't break in the future. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#23114	2025-06-25 11:13:15 +03:00
Nadav Har'El	3ce7e250cc	alternator: fix schema "concurrent modification" errors In ScyllaDB, schema modification operations use "optimistic locking": A schema operation reads the current schema, decides what it wants to do and prepares changes to the schema, and then attempts to commit those changes - but only if the schema hasn't changed since the first read. If the schema has already been changed by some other node - we need to try again. In a loop. In Alternator, there are six operations that perform schema modification: CreateTable, DeleteTable, UpdateTable, TagResource, UntagResource and UpdateTimeToLive. All of them were missing this loop. We knew about this - and even had FIXME in all places. So all these operations, when facing contention of concurrent schema modifications on different nodes may fail one of these operations with an error like: Internal server error: service::group0_concurrent_modification (Failed to apply group 0 change due to concurrent modification). This problem had very minor effect, if any, on real users because the DynamoDB SDK automatically retries operations that fail with retryable errors - like this "Internal server error" - and most likely the schema operation will succeed upon retry. However, as shown in issue #13152 these failures were annoying in our CI, where tests - which disable request retries - failed on these errors. This patch fixes all six operations (the last three operations all use one common function, db::modify_tags(), so are fixed by one change) to add the missing loop. The patch also includes reproducing tests for all these operations - the new tests all fail before this patch, and pass with it. These new tests are much more reliable reproducers than the dtests we had that only sometimes - very rarely - reproduced the problem. Moreover, the new tests reproduce the bug separately for each of the six operations, so if we forget to fix one of the six operations, one of the tests would have continued to fail. Of course I checked this during development. The new tests are in the test/cluster framework, not test/alternator, because this problem can only be reproduced in a multi-node cluster: On a single node, it serializes its schema modifications on its own; The collisions only happen when more than one node attempts schema modifications at the same time. Fixes #13152 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#23827	2025-05-05 09:59:08 +03:00
Artsiom Mishuta	d1198f8318	test.py: rename topology_custom folder to cluster rename topology_custom folder to cluster as it contains not only topology test cases	2025-03-04 10:32:44 +01:00

25 Commits
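
The "optimistic locking" loop described in commit 3ce7e250cc (read the schema, prepare a change, commit only if nothing changed in between, otherwise start over) can be illustrated with a minimal Python sketch. This is not ScyllaDB's code: SchemaStore, ConcurrentModification and modify_schema_with_retry are hypothetical names invented for this illustration, and the bounded attempt count is an assumption rather than something the commit message states.

```python
class ConcurrentModification(Exception):
    """Hypothetical error: the schema changed between read and commit."""


class SchemaStore:
    """Toy in-memory stand-in for a shared, versioned schema (illustration only)."""

    def __init__(self):
        self.version = 0
        self.tables = set()

    def read(self):
        # Return a snapshot of the current version and table set.
        return self.version, set(self.tables)

    def commit(self, base_version, new_tables):
        # Apply the change only if nobody committed since base_version was read.
        if base_version != self.version:
            raise ConcurrentModification()
        self.tables = set(new_tables)
        self.version += 1


def modify_schema_with_retry(store, change, max_attempts=10):
    """Read, prepare, try to commit; start over on a concurrent change.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        base_version, tables = store.read()
        new_tables = change(tables)  # prepare the change from a fresh snapshot
        try:
            store.commit(base_version, new_tables)
            return attempt
        except ConcurrentModification:
            continue  # someone else won the race: re-read and retry
    raise ConcurrentModification(f"gave up after {max_attempts} attempts")
```

The key point of the sketch is that each retry re-reads the schema before preparing the change again, so a change committed by a competing writer is preserved rather than overwritten.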