scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-31 20:16:43 +00:00

Author	SHA1	Message	Date
Petr Gusev	37c575c104	test_alternator: add test_alternator_invalid_shard_for_lwt This test reproduces scylladb/scylladb#27353 using two injection points. First, the test triggers an intra-node tablet migration and suspends it at the streaming stage using the intranode_migration_streaming_wait injection. Next, it enables the alternator_executor_batch_write_wait injection, which suspends a batch write after its cas_shard has already been created. The test then issues several batch writes and waits until one of them hits this injection on the destination shard. At this point, the cas_shard.erm for that write is still in the streaming state, meaning the executor would need to jump back to the source shard. The test then resumes the suspended tablet migration, allowing it to update the ERM on the source shard to write_both_read_new. After that, the test releases the suspended batch write and expects it to perform two shard jumps: first from the destination to the source shard, and then again back to the source shard. This commit adds the alternator_executor_batch_write_wait injection to alternator/executor.cc. Coroutines are intentionally avoided in the parallel_for_each lambda to prevent unnecessary coroutine-frame allocations. (cherry picked from commit `e60bcd0011`)	2025-12-10 11:42:54 +01:00
Petr Gusev	b88ed6156b	alternator/executor.cc: avoid cross-shard free This commit is an optimization: avoiding destruction of foreign objects on the wrong shard. Releasing objects allocated on a different shard causes their ::free calls to be executed remotely, which adds unnecessary load to the SMP subsystem. Before this patch, a std::vector could be moved to another shard. When the vector was eventually destroyed, its ::free had to be marshalled back to the shard where the memory had originally been allocated. This change avoids that overhead by passing the vector by const reference instead. The referenced objects' lifetime correctness reasoning: * the put_or_delete_item refs usages in put_or_delete_item_cas_request are bound to its lifetime * cas_request lifetime is bound to storage_proxy::cas future * we don't release put_or_delete_item-s until all storage_proxy::cas calls are done. (cherry picked from commit `f00f7976c1`)	2025-12-10 11:41:55 +01:00
Jenkins Promoter	582b9f83db	Update ScyllaDB version to: 2025.4.0-rc7	2025-12-09 17:21:23 +02:00
Anna Stuchlik	d2c24bf42d	doc: update the upgrade policy to cover non-consecutive minor upgrades Fixes https://github.com/scylladb/scylladb/issues/27308 Closes scylladb/scylladb#27319 (cherry picked from commit `a5c971d21c`) Closes scylladb/scylladb#27457	2025-12-09 11:38:00 +03:00
Anna Stuchlik	72ee6396d2	doc: add the upgrade guide from 2025.x to 2025.4 Fixes https://github.com/scylladb/scylladb/issues/26451 Fixes https://github.com/scylladb/scylladb/issues/26452 Closes scylladb/scylladb#27310 (cherry picked from commit `48cf84064c`) Closes scylladb/scylladb#27407	2025-12-09 11:37:27 +03:00
Jenkins Promoter	f64b5d375e	Update ScyllaDB version to: 2025.4.0-rc6 scylla-2025.4.0-rc6 scylla-2025.4.0-rc6-candidate-20251208053234	2025-12-08 13:37:52 +02:00
Piotr Dulikowski	3040e7aedf	index: allow vector indexes without rf_rack_valid_keyspces The rf_rack_valid_keyspaces option needs to be turned on in order to allow creating materialized views in tablet keyspaces with numeric RF per DC. This is also necessary for secondary indexes because they use materialized views underneath. However, this option is _not_ necessary for vector store indexes because those use the external vector store service for querying the list of keys to fetch from the main table, they do not create a materialized view. The rf_rack_valid_keyspaces was, by accident, required for vector indexes, too. Remove the restriction for vector store indexes as it is completely unnecessary. Fixes: SCYLLADB-81 Closes scylladb/scylladb#27447 (cherry picked from commit `bb6e41f97a`) Closes scylladb/scylladb#27455	2025-12-05 20:13:02 +01:00
Karol Nowacki	71c47b8d18	vector_search: Fix requests hanging on unreachable nodes When a vector store node becomes unreachable, a client request sent before the keep-alive timer fires would hang until the CQL query timeout was reached. This occurred because the HTTP request writes to the TCP buffer and then waits for a response. While data is in the buffer, TCP retransmissions prevent the keep-alive timer from detecting the dead connection. This patch resolves the issue by setting the `TCP_USER_TIMEOUT` socket option, which applies an effective timeout to TCP retransmissions, allowing the connection to fail faster. Closes scylladb/scylladb#27388 (cherry picked from commit `a54bf50290`) Closes scylladb/scylladb#27423	2025-12-04 19:54:08 +01:00
Avi Kivity	4db6d3e924	database: fix overflow when computing data distribution over shards We store the per-shard chunk count in a uint64_t vector global_offset, and then convert the counts to offsets with a prefix sum: ```c++ // [1, 2, 3, 0] --> [0, 1, 3, 6] std::exclusive_scan(global_offset.begin(), global_offset.end(), global_offset.begin(), 0, std::plus()); ``` However, std::exclusive_scan takes the accumulator type from the initial value, 0, which is an int, instead of from the range being iterated, which is of uint64_t. As a result, the prefix sum is computed as a 32-bit integer value. If it exceeds 0x8000'0000, it becomes negative. It is then extended to 64 bits and stored. The result is a huge 64-bit number. Later on we try to find an sstable with this chunk and fail, crashing on an assertion. An example of the failure can be seen here: https://godbolt.org/z/6M8aEbo57 The fix is simple: the initial value is passed as uint64_t instead of int. Fixes https://github.com/scylladb/scylladb/issues/27417 Closes scylladb/scylladb#27418 (cherry picked from commit `9696ee64d0`)	2025-12-04 20:17:19 +02:00
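The accumulator-type pitfall described in the commit above can be reproduced in isolation. The sketch below is an illustration only: the function names `offsets_fixed` and `offsets_buggy` are hypothetical and do not appear in the patch, and the buggy variant relies on the wrap-around narrowing that mainstream compilers perform.

```cpp
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Converts per-shard chunk counts into starting offsets (exclusive prefix sum).
// std::exclusive_scan accumulates in the type of the initial value, so passing
// uint64_t{0} keeps the running sum in 64 bits.
std::vector<uint64_t> offsets_fixed(std::vector<uint64_t> counts) {
    std::exclusive_scan(counts.begin(), counts.end(), counts.begin(),
                        uint64_t{0}, std::plus<>());
    return counts;
}

// Same computation with a plain `0`: the running sum is stored in an int after
// every step, so a sum above 0x7fff'ffff wraps to a negative value, which is
// then sign-extended when written back into the uint64_t vector.
std::vector<uint64_t> offsets_buggy(std::vector<uint64_t> counts) {
    std::exclusive_scan(counts.begin(), counts.end(), counts.begin(),
                        0, std::plus<>());
    return counts;
}
```

For counts such as `{0x70000000, 0x70000000, 1}`, the fixed version yields a third offset of `0xE0000000`, while the buggy version produces a different, much larger value.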
Piotr Dulikowski	e4b1c1f38b	db/view/view_building_coordinator: skip work if no view is built Even though `view_building_coordinator::work_on_view_building` has an `if` at the very beginning which checks whether the currently processed base table is set, it only prints a message and continues executing the rest of the function regardless of the result of the check. However, some of the logic in the function assumes that the currently processed base table field is set and tries to access the value of the field. This can lead to the view building coordinator accessing a disengaged optional, which is undefined behavior. Fix the function by adding the clearly missing `co_await` to the check. A regression test is added which checks that the view building state observer - a different fiber which used to print a weird message due to erroneous view building coordinator behavior - does not print a warning. Fixes: scylladb/scylladb#27363 Closes scylladb/scylladb#27373 (cherry picked from commit `654ac9099b`) Closes scylladb/scylladb#27406 scylla-2025.4.0-rc5-candidate-20251204032609	2025-12-03 17:12:17 +01:00
Piotr Dulikowski	2787ac6cba	Merge '[Backport 2025.4] vector_search: Fix high availability during timeouts' from Scylladb[bot] This PR introduces two key improvements to the robustness and resource management of vector search: Proper Abort on CQL Timeout: Previously, when a CQL query involving a vector search timed out , the underlying ANN query to the vector store was not aborted and would continue to run. This has been fixed by ensuring the abort source is correctly signaled, terminating the ANN request when its parent CQL query expires and preventing unnecessary resource consumption. Faster Failure Detection: The connection and keep-alive timeouts for vector store nodes were excessively long (2 and 11 minutes, respectively), causing significant delays in detecting and recovering from unreachable nodes. These timeouts are now aligned with the request_timeout_in_ms setting, allowing for much faster failure detection and improving high availability by failing over from unresponsive nodes more quickly. Fixes: SCYLLADB-76 This issue affects the 2025.4 branch, where similar HA recovery delays have been observed. - (cherry picked from commit `b6afacfc1e`) - (cherry picked from commit `086c6992f5`) Parent PR: #27377 Closes scylladb/scylladb#27391 * github.com:scylladb/scylladb: vector_search: Fix ANN query abort on CQL timeout vector_search: Reduce connection and keep-alive timeouts	2025-12-03 07:20:11 +01:00
Karol Nowacki	26599e79f2	vector_search: Fix ANN query abort on CQL timeout When a CQL vector search request timed out, the underlying ANN query was not aborted and continued to run. This happened because the abort source was not being signaled upon request expiration. This commit ensures the ANN query is aborted when the CQL request times out preventing unnecessary resource consumption.	2025-12-02 16:58:55 +01:00
Karol Nowacki	d4c199a1ec	vector_search: Reduce connection and keep-alive timeouts The connection timeout was 2 minutes and the keep-alive timeout was 11 minutes. If a vector store node became unreachable, these long timeouts caused significant delays before the system could recover, negatively impacting high availability. This change aligns both timeouts with the `request_timeout` configuration, which defaults to 10 seconds. This allows for much faster failure detection and recovery, ensuring that unresponsive nodes are failed over from more quickly.	2025-12-02 16:52:53 +01:00
Asias He	4e7202ee32	repair: Fix deadlock when topology coordinator steps down in the middle Consider this: 1) n1 is the topology coordinator 2) n1 schedules and executes a tablet repair with session id s1 for a tablet on n3 and n4. 3) n3 and n4 take the lock and store it in _rs._repair_compaction_locks[s1] 4) n1 steps down before it executes locator::tablet_transition_stage::end_repair 5) n2 becomes the new topology coordinator 6) n2 runs locator::tablet_transition_stage::repair again 7) n3 and n4 try to take the lock again and hang since the lock is already taken. To avoid the deadlock, we can throw in step 7 so that n2 will proceed to end_repair stage and release the lock. After that, the scheduler could schedule the tablet repair request again. Fixes #26346 Closes scylladb/scylladb#27163 (cherry picked from commit `da5cc13e97`) Closes scylladb/scylladb#27337	2025-12-01 13:06:02 +01:00
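The "throw instead of wait" idea from step 7 of the commit above can be sketched with standard C++ containers. Everything here is an assumption made for illustration: the class `repair_locks` and its methods are hypothetical stand-ins, and the real code works with the repair service's own per-session lock map and asynchronous primitives.

```cpp
#include <stdexcept>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for the per-session repair compaction locks.
class repair_locks {
    std::unordered_set<std::string> _held;  // session ids currently holding the lock
public:
    // Fails fast when the session already holds the lock, so a repeated
    // repair stage reports an error instead of waiting forever.
    void take(const std::string& session_id) {
        if (!_held.insert(session_id).second) {
            throw std::runtime_error("repair lock already taken for session " + session_id);
        }
    }
    // Models the release performed when the end_repair stage runs.
    void release(const std::string& session_id) { _held.erase(session_id); }

    bool held(const std::string& session_id) const { return _held.count(session_id) != 0; }
};
```

After the error, the new coordinator can move on to the end_repair stage, which releases the lock, and the repair can be scheduled again.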
Jenkins Promoter	6b5d334be3	Update pgo profiles - aarch64	2025-12-01 04:47:40 +02:00
Jenkins Promoter	58f1597831	Update pgo profiles - x86_64	2025-12-01 03:56:10 +02:00
Anna Stuchlik	d9bfb8c607	doc: fix the info about object storage This commit fixes the information about object storage: - Object storage configuration is no longer marked as experimental. - Redundant information has been removed from the description. - Information related to object storage for SSTables has been removed as the feature is not working. Fixes https://github.com/scylladb/scylladb/issues/26985 Closes scylladb/scylladb#26987 (cherry picked from commit `724dc1e582`) Closes scylladb/scylladb#27211	2025-11-28 12:38:08 +01:00
Patryk Jędrzejczak	1dab04666c	Merge '[Backport 2025.4] doc: update Cloud Instance Recommendations for GCP' from Scylladb[bot] This PR: - Removes n1-highmem instances from Recommended Instances. - Adds missing support for n2-highmem-96. - Updates the reference to n2 instances in the Google Cloud docs (fixes a broken link to GCP). - Adds the missing information about processors for n2-highmem-instance - Ice Lake and Cascade Lake (requested by CX). Fixes https://github.com/scylladb/scylladb/issues/25946 Fixes https://github.com/scylladb/scylladb/issues/24223 Fixes https://github.com/scylladb/scylladb/issues/23976 No backport needed if this PR is merged before 2025.4 branching. - (cherry picked from commit `b18b052d26`) - (cherry picked from commit `dab74471cc`) Parent PR: #26182 Closes scylladb/scylladb#27168 * https://github.com/scylladb/scylladb: doc: update information for n2-highmem instances doc: remove n1-highmem instances from Recommended Instances	2025-11-28 12:31:33 +01:00
Asias He	fc54aedd8f	topology_coordinator: Send incremental repair rpc only when the feature is enabled Otherwise, in a mixed cluster, the handle_tablet_resize_finalization would fail because of the unknown rpc verb. Fixes #26309 Closes scylladb/scylladb#27218 (cherry picked from commit `ab4896dc70`) Closes scylladb/scylladb#27284	2025-11-27 18:42:14 +01:00
Patryk Jędrzejczak	6b3b05c10b	Merge '[Backport 2025.4] fix notification about expiring erm held for to long' from Scylladb[bot] Commit `6e4803a750` broke notification about expired erms held for too long since it resets the tracker without calling its destructor (where notification is triggered). Fix the assign operator to call the destructor like it should. Fixes https://github.com/scylladb/scylladb/issues/27141 - (cherry picked from commit `9f97c376f1`) - (cherry picked from commit `5dcdaa6f66`) Parent PR: #27140 Closes scylladb/scylladb#27276 * https://github.com/scylladb/scylladb: test: test that expired erm that held for too long triggers notification token_metadata: fix notification about expiring erm held for to long	2025-11-27 16:58:16 +01:00
Patryk Jędrzejczak	30e02b6658	Merge '[Backport 2025.4] locator/node: include _excluded in missing places' from Scylladb[bot] We currently ignore the `_excluded` field in `node::clone()` and the verbose formatter of `locator::node`. The first one is a bug that can have unpredictable consequences on the system. The second one can be a minor inconvenience during debugging. We fix both places in this PR. Fixes https://scylladb.atlassian.net/browse/SCYLLADB-72 This PR is a bugfix that should be backported to all supported branches. - (cherry picked from commit `4160ae94c1`) - (cherry picked from commit `287c9eea65`) Parent PR: #27265 Closes scylladb/scylladb#27291 * https://github.com/scylladb/scylladb: locator/node: include _excluded in verbose formatter locator/node: preserve _excluded in clone()	2025-11-27 12:27:06 +01:00
Nadav Har'El	a0916179a3	Merge '[Backport 2025.4] Alternator: enable tablets by default - depending on tablets_mode_for_new_keyspaces' from Scylladb[bot] Before this series, Alternator's CreateTable operation defaults to creating a table replicated with vnodes, not tablets. The reasons for this default included missing support for LWT, Materialized Views, Alternator TTL and Alternator Streams if tablets are used. But today, all of these (except the still-experimental Alternator Streams) are now fully available with tablets, so we are finally ready to switch Alternator to use tablets by default in new tables. We will use the same configuration parameter that CQL uses, tablets_mode_for_new_keyspaces, to determine whether new keyspaces use tablets by default. If set to `enabled`, tablets are used by default on new tables. If set to `disabled`, tablets will not be used by default (i.e., vnodes will be used, as before). A third value, `enforced` is similar to `enabled` but forbids overriding the default to vnodes when creating a table. As before, the user can set a tag during the CreateTable operation to override the default choice of tablets or vnodes (unless in `enforced` mode). This tag is now named `system:initial_tablets` - whereas before this patch it was called `experimental:initial_tablets`. The rules stay the same as with the earlier, experimental:initial_tablets tag: when supplied with a numeric value, the table will use tablets. When supplied with something else (like a string "none"), the table will use vnodes. Fixes https://github.com/scylladb/scylladb/issues/22463 Backport to 2025.4, it's important not to delay phasing out vnodes. 
- (cherry picked from commit `403068cb3d`) - (cherry picked from commit `af00b59930`) - (cherry picked from commit `376a2f2109`) - (cherry picked from commit `35216d2f01`) - (cherry picked from commit `7466325028`) - (cherry picked from commit `c7de7e76f4`) - (cherry picked from commit `63897370cb`) - (cherry picked from commit `274d0b6d62`) - (cherry picked from commit `345747775b`) - (cherry picked from commit `a659698c6d`) - (cherry picked from commit `eeb3a40afb`) - (cherry picked from commit `b34f28dae2`) - (cherry picked from commit `25439127c8`) - (cherry picked from commit `c03081eb12`) - (cherry picked from commit `65ed678109`) Parent PR: #26836 Closes scylladb/scylladb#26949 * github.com:scylladb/scylladb: test/cluster: modify test to not fail on 2025.4 branch Fix backport conflicts test,alternator: use 3-rack clusters in tests alternator: improve error in tablets_mode_for_new_keyspaces=enforced config: make tablets_mode_for_new_keyspaces live-updatable alternator: improve comment about non-hidden system tags alternator: Fix test_ttl_expiration_streams() alternator: Fix test_scan_paging_missing_limit() alternator: Don't require vnodes for TTL tests alternator: Remove obsolete test from test_table.py alternator: Fix tag name to request vnodes alternator: Fix test name clash in test_tablets.py alternator: test_tablets.py handles new policy reg. tablets alternator: Update doc regarding tablets support alternator: Support `tablets_mode_for_new_keyspaces` config flag Fix incorrect hint for tablets_mode_for_new_keyspaces Fix comment for tablets_mode_for_new_keyspaces	2025-11-27 09:05:18 +02:00
Piotr Dulikowski	863aae84fd	Merge '[Backport 2025.4] db/view/view_building_coordinator: get rid of task's state in group0' from Scylladb[bot] Previously, the view building coordinator relied on setting each task's state to STARTED and then explicitly removing these state entries once tasks finished, before scheduling new ones. This approach induced a significant number of group0 commits, particularly in large clusters with many nodes and tablets, negatively impacting performance and scalability. With the update, the coordinator and worker logic has been restructured to operate without maintaining per-task states. Instead, tasks are simply tracked with an aborted boolean flag, which is still essential for certain tablet operations. This change removes much of the coordination complexity, simplifies the view building code, and reduces operational overhead. In addition, the coordinator now batches reports of finished tasks before making commits. Rather than committing task completions individually, it aggregates them and reports in groups, significantly minimizing the frequency of group0 commits. This new approach is expected to improve efficiency and scalability during materialized view construction, especially in large deployments. Fixes https://github.com/scylladb/scylladb/issues/26311 This patch needs to be backported to 2025.4. 
- (cherry picked from commit `6d853c8f11`) - (cherry picked from commit `08974e1d50`) - (cherry picked from commit `eb04af5020`) - (cherry picked from commit `24d69b4005`) - (cherry picked from commit `fb8cbf1615`) - (cherry picked from commit `fe9581f54c`) Parent PR: #26897 Closes scylladb/scylladb#27266 * github.com:scylladb/scylladb: docs/dev/view-building-coordinator: update the docs after recent changes db/view/view_building: send coordinator's term in the RPC db/view/view_building_state: replace task's state with `aborted` flag db/view/view_building_coordinator: batch finished tasks reporting db/view/view_building_worker: change internal implementation db/view/view_building_coordinator: change `work_on_tasks` RPC return type	2025-11-27 01:47:53 +01:00
Patryk Jędrzejczak	3c635037df	locator/node: include _excluded in verbose formatter It can be helpful during debugging. (cherry picked from commit `287c9eea65`)	2025-11-26 23:05:25 +00:00
Patryk Jędrzejczak	30790b9af4	locator/node: preserve _excluded in clone() We currently ignore the `_excluded` field in `clone()`. Losing information about exclusion can have unpredictable consequences. One observed effect (that led to finding this issue) is that the `/storage_service/nodes/excluded` API endpoint sometimes misses excluded nodes. (cherry picked from commit `4160ae94c1`)	2025-11-26 23:05:25 +00:00
Nadav Har'El	b2c3b28617	test/cluster: modify test to not fail on 2025.4 branch The purpose of the test cluster/test_alternator::test_alternator_ttl_scheduling_group Is to verify that during TTL expiration scans and deletions, all of the CPU is used in the "streaming" scheduling group, not in the statement scheduling group ("sl:default") as we had in the past due to bugs. It appears that in branch 2025.4 we have a new bug - which doesn't exist in master - that causes some tablets-related work which I couldn't identify to be done in sl:default, and cause this test to fail. The simple fix is to sleep for 5 seconds after writing the data, and it seems that by that time, the sl:default work is done. This change doesn't make the Alternator TTL test any weaker, so we need to make this change to allow Alternator to go forward. Sadly, it does mean that the only test we have for this apparent bug (which has nothing to do with Alternator) will be gone. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2025-11-26 20:42:09 +02:00
Nadav Har'El	433bc4c17f	Fix backport conflicts	2025-11-26 20:42:08 +02:00
Nadav Har'El	5e48fb9601	test,alternator: use 3-rack clusters in tests With tablets enabled, we can't create an Alternator table on a three-node cluster with a single rack, since Scylla refuses RF=3 with just one rack and we get the error: An error occurred (InternalServerError) when calling the CreateTable operation: ... Replication factor 3 exceeds the number of racks (1) in dc datacenter1 So in test/cluster/test_alternator.py we need to use the incantation "auto_rack_dc='dc1'" every time that we create a three-node cluster. Before this patch, several tests in test/cluster/test_alternator.py failed on this error, with this patch all of them pass. Signed-off-by: Nadav Har'El <nyh@scylladb.com> (cherry picked from commit `65ed678109`)	2025-11-26 20:42:08 +02:00
Nadav Har'El	a5ba028d40	alternator: improve error in tablets_mode_for_new_keyspaces=enforced When in tablets_mode_for_new_keyspaces=enforced mode, Alternator is supposed to fail when CreateTable asks explicitly for vnodes. Before this patch, this error was an ugly "Internal Server Error" (an exception thrown from deep inside the implementation). This patch checks for this case in the right place, to generate a proper ValidationException with a proper error message. We also enable the test test_tablets_tag_vs_config which should have caught this error, but didn't because it was marked xfail because tablets_mode_for_new_keyspaces had not been live-updatable. Now that it is, we can enable the test. I also improved the test to be slightly faster (no need to change the configuration so many times) and also check the ordinary case - where the schema chooses neither vnodes nor tablets explicitly and we should just use the default. Signed-off-by: Nadav Har'El <nyh@scylladb.com> (cherry picked from commit `c03081eb12`)	2025-11-26 20:42:08 +02:00
Nadav Har'El	5972790a71	config: make tablets_mode_for_new_keyspaces live-updatable We have a configuration option "tablets_mode_for_new_keyspaces" which determines whether new keyspaces should use tablets or vnodes. For some reason, this configuration parameter was never marked live-updatable, so in this patch we add the flag. No other changes are needed - the existing code that uses this flag always uses it through the up-to-date configuration. In the previous patches we start to honor tablets_mode_for_new_keyspaces also in Alternator CreateTable, and we wanted to test this but couldn't do this in test/alternator because the option was not live-updatable. Now that it will be, we'll be able to test this feature in test/alternator. Signed-off-by: Nadav Har'El <nyh@scylladb.com> (cherry picked from commit `25439127c8`)	2025-11-26 20:42:08 +02:00
Nadav Har'El	58b89b4d28	alternator: improve comment about non-hidden system tags The previous patches added a somewhat misleading comment in front of system:initial_tablets, which this patch improves. That tag is NOT where Alternator "stores" table properties like the existing comment claimed. In fact, the whole point is that it's the opposite - Alternator never writes to this tag - it's a user-writable tag which Alternator reads, to configure the new table. And this is why it obviously can't be hidden from the user. Signed-off-by: Nadav Har'El <nyh@scylladb.com> (cherry picked from commit `b34f28dae2`)	2025-11-26 20:42:08 +02:00
Piotr Szymaniak	915fa6694b	alternator: Fix test_ttl_expiration_streams() The test is now aware of the new name of the `system:initial_tablets` tag. (cherry picked from commit `eeb3a40afb`)	2025-11-26 20:42:07 +02:00
Piotr Szymaniak	42dc583467	alternator: Fix test_scan_paging_missing_limit() With tablets, the test began failing. The failure was correlated with the number of initial tablets, which when kept at default, equals 4 tablets per shard in release build and 2 tablets per shard in dev build. In this patch we split the test into two - one with more data in the table to check the original purpose of this test - that Scan doesn't return the entire table in one page if "Limit" is missing. The other test reproduces issue #10327 - that when the table is small, Scan's page size isn't strictly limited to 1MB as it is in DynamoDB. Experimentally, 8000 KB of data (compared to 6000 KB before this patch) is enough when we have up to 4 initial tablets per shard (so 8 initial tablets on a two-shard node as we typically run in tests). Original patch by Piotr Szymaniak <piotr.szymaniak@scylladb.com> modified by Nadav Har'El <nyh@scylladb.com> (cherry picked from commit `a659698c6d`)	2025-11-26 20:42:07 +02:00
Piotr Szymaniak	80fc6a7951	alternator: Don't require vnodes for TTL tests Since #23662 Alternator supports TTL with tablets too. Let's clear some leftovers causing Alternator to test TTL with vnodes instead of with what is default for Alternator (tablets or vnodes). (cherry picked from commit `345747775b`)	2025-11-26 20:42:07 +02:00
Piotr Szymaniak	44f46099fd	alternator: Remove obsolete test from test_table.py Since Alternator is capable of running with tablets according to the flag in config, remove the obsolete test that is making sure that Alternator runs with vnodes. (cherry picked from commit `274d0b6d62`)	2025-11-26 20:42:07 +02:00
Piotr Szymaniak	7489db0097	alternator: Fix tag name to request vnodes The tag was recently renamed from `experimental:initial_tablets` to `system:initial_tablets`. This commit fixes both the tests and the exceptions sent to the user instructing how to create a table with vnodes. (cherry picked from commit `63897370cb`)	2025-11-26 20:42:07 +02:00
Piotr Szymaniak	2b3f921971	alternator: Fix test name clash in test_tablets.py (cherry picked from commit `c7de7e76f4`)	2025-11-26 20:42:07 +02:00
Piotr Szymaniak	e286a72fd8	alternator: test_tablets.py handles new policy reg. tablets Adjust the tests so they are in-line with the config flag 'tablets_mode_for_new_keyspaces` that the Alternator learned to honour. (cherry picked from commit `7466325028`)	2025-11-26 20:42:06 +02:00
Piotr Szymaniak	04363b86ea	alternator: Update doc regarding tablets support Reflect honouring by Alternator the value of the config flag `tablets_mode_for_new_keyspaces`, as well as renaming of the tag `experimental:initial_tablets` into `system:initial_tablets`. (cherry picked from commit `35216d2f01`)	2025-11-26 20:42:06 +02:00
Piotr Szymaniak	65ceb83f42	alternator: Support `tablets_mode_for_new_keyspaces` config flag Until now, tablets in Alternator were an experimental feature enabled only when a TAG "experimental:initial_tablets" was present when creating a table and associated with a numeric value. After this patch, Alternator honours the value of `tablets_mode_for_new_keyspaces` config flag. Each table can be overridden to use tablets or not by supplying a new TAG "system:initial_tablets". The rules stay the same as with the earlier, experimental tag: when supplied with a numeric value, the table will use tablets (as long as they are supported). When supplied with something else (like a string "none"), the table will use vnodes, provided that tablets are not `enforced` by the config flag. Fixes #22463 (cherry picked from commit `376a2f2109`)	2025-11-26 20:42:06 +02:00
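The selection rules described in the commit above can be written down as a small decision function. This is a sketch under stated assumptions, not Alternator's code: the names `tablets_mode`, `is_numeric` and `use_tablets` are hypothetical, "numeric" is assumed here to mean a non-empty string of decimal digits, and the exception type merely stands in for the ValidationException returned to the user.

```cpp
#include <cctype>
#include <optional>
#include <stdexcept>
#include <string>

// Models the three values of tablets_mode_for_new_keyspaces.
enum class tablets_mode { disabled, enabled, enforced };

// Assumption: a tag value counts as numeric when it is all decimal digits.
static bool is_numeric(const std::string& s) {
    if (s.empty()) {
        return false;
    }
    for (unsigned char c : s) {
        if (!std::isdigit(c)) {
            return false;
        }
    }
    return true;
}

// Returns true when a new table should use tablets, false for vnodes.
// `tag` models the optional value of the system:initial_tablets tag.
bool use_tablets(tablets_mode mode, const std::optional<std::string>& tag) {
    if (!tag) {
        // No tag: follow the configured default.
        return mode != tablets_mode::disabled;
    }
    if (is_numeric(*tag)) {
        return true;  // a numeric value requests tablets
    }
    if (mode == tablets_mode::enforced) {
        // Anything else requests vnodes, which enforced mode forbids.
        throw std::invalid_argument("vnodes requested but tablets are enforced");
    }
    return false;
}
```

For example, a tag value of `"none"` selects vnodes under `enabled` but is rejected under `enforced`.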
Piotr Szymaniak	8b9de27ff8	Fix incorrect hint for tablets_mode_for_new_keyspaces (cherry picked from commit `af00b59930`)	2025-11-26 20:42:06 +02:00
Piotr Szymaniak	b000904d0f	Fix comment for tablets_mode_for_new_keyspaces The comment was not listing all the 3 possible values correctly, despite an explanation just below covers all 3 values. (cherry picked from commit `403068cb3d`)	2025-11-26 20:42:06 +02:00
Michał Jadwiszczak	88d55e9236	docs/dev/view-building-coordinator: update the docs after recent changes Remove information about view building task state and explain the current lifetime of the task. (cherry picked from commit `fe9581f54c`)	2025-11-26 17:47:16 +01:00
Michał Jadwiszczak	64e0405ba2	db/view/view_building: send coordinator's term in the RPC To avoid case when an old coordinator (which hasn't been stopped yet) dictates what should be done, add raft term to the `work_on_view_building_tasks` RPC. The worker needs to check if the term matches the current term from raft server, and deny the request when the term is bad. (cherry picked from commit `fb8cbf1615`)	2025-11-26 17:47:16 +01:00
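The term check described in the commit above amounts to comparing the term carried by the request with the worker's current raft term. The sketch below only illustrates that comparison: `rpc_result` and `check_term` are hypothetical names, and how the worker obtains its current term from the raft server is not shown.

```cpp
#include <cstdint>

// Hypothetical outcome of validating a work_on_view_building_tasks request.
enum class rpc_result { accepted, stale_term };

// Accepts the request only when the coordinator's term equals the term the
// worker currently sees; a mismatch means the sender may be an old
// coordinator that has not been stopped yet.
rpc_result check_term(uint64_t request_term, uint64_t current_term) {
    return request_term == current_term ? rpc_result::accepted
                                        : rpc_result::stale_term;
}
```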
Michał Jadwiszczak	0ffc8c5987	db/view/view_building_state: replace task's state with `aborted` flag After previous commits, we can drop entire task's state and replace it with single boolean flag, which determines if a task was aborted. Once a task was aborted, it cannot get resurrected to a normal state. (cherry picked from commit `24d69b4005`)	2025-11-26 17:47:16 +01:00
Michał Jadwiszczak	098082a8d9	db/view/view_building_coordinator: batch finished tasks reporting In previous implementation to execute view building tasks, the coordinator needed to firstly set their states to `STARTED` and then it needed to remove them before it could start the next ones. This logic required a lot of group0 commits, especially in large clusters with higher number of nodes and big tablet count. After previous commit to the view building worker, the coordinator can start view building tasks without setting the `STARTED` state and deleting finished tasks. This patch adjusts the coordinator to save finished tasks locally, so it can continue to execute next ones and the finished tasks are periodically removed from the group0 by `finished_task_gc_fiber()`. (cherry picked from commit `eb04af5020`)	2025-11-26 17:47:12 +01:00
Gleb Natapov	26606c8801	test: test that expired erm that held for too long triggers notification (cherry picked from commit `5dcdaa6f66`)	2025-11-26 15:09:15 +00:00
Gleb Natapov	f29911cb73	token_metadata: fix notification about expiring erm held for to long Commit `6e4803a750` broke notification about expired erms held for too long since it resets the tracker without calling its destructor (where notification is triggered). Fix assign operator to call destructor. (cherry picked from commit `9f97c376f1`)	2025-11-26 15:09:15 +00:00
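The class of bug described in the commit above, an assignment operator that replaces state without running the destructor's side effect, can be shown with a small stand-alone type. The `tracker` class below is hypothetical and much simpler than the real one; it only demonstrates one common way (move-and-swap) to make assignment notify for the value being replaced.

```cpp
#include <functional>
#include <utility>

// Hypothetical tracker: the notification fires from the destructor.
class tracker {
    std::function<void()> _notify;  // empty when there is nothing to report
public:
    tracker() = default;
    explicit tracker(std::function<void()> notify) : _notify(std::move(notify)) {}
    tracker(tracker&& o) noexcept : _notify(std::exchange(o._notify, nullptr)) {}
    ~tracker() {
        if (_notify) {
            _notify();
        }
    }

    // Move-and-swap: the previous state ends up in `tmp`, whose destructor
    // runs the notification. Overwriting _notify directly would drop the old
    // state silently, which is the kind of bug the commit describes.
    tracker& operator=(tracker&& o) noexcept {
        tracker tmp(std::move(o));
        std::swap(_notify, tmp._notify);
        return *this;
    }
};
```

Assigning an empty `tracker` to one that holds a callback therefore triggers the callback exactly once, at the point of assignment.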
Wojciech Mitros	33eef5122c	alternator: use storage_proxy from the correct shard in executor::delete_table When we delete a table in alternator, the schema change is performed on shard 0. However, we actually use the storage_proxy from the shard that is handling the delete_table command. This can lead to problems because some information is stored only on shard 0 and using storage_proxy from another shard may make us miss it. In this patch we fix this by using the storage_proxy from shard 0 instead. Fixes https://github.com/scylladb/scylladb/issues/27223 Closes scylladb/scylladb#27224 (cherry picked from commit `3c376d1b64`) Closes scylladb/scylladb#27260	2025-11-26 14:33:21 +02:00
Michał Jadwiszczak	ab9878c2df	db/view/view_building_worker: change internal implementation This commit doesn't change the logic behind the view building worker but it changes how the worker is executing view building tasks. Previously, the worker had a state only on shard0 and it was reacting to changes in group0 state. When it noticed some tasks were moved to `STARTED` state, the worker was creating a batch for it on the shard0 state. The RPC call was used only to start the batch and to get its result. Now, the main logic of batch management was moved to the RPC call handler. The worker has a local state on each shard and the state contains: - unique ptr to the batch - set of completed tasks - information for which views the base table was flushed So currently, each batch lives on a shard where it has its work to do exclusively. This eliminates a need to do a synchronization between shard0 and work shard, which was a painful point in previous implementation. The worker still reacts to changes in group0 view building state, but currently it's only used to observe whether any view building tasks was aborted by setting `ABORTED` state. To prepare for further changes to drop the view building task state, the worker ignores `IDLE` and `STARTED` states completely. (cherry picked from commit `08974e1d50`)	2025-11-26 12:24:42 +00:00


50132 Commits