50155 Commits

Author SHA1 Message Date
Michael Litvak
fbca8a7644 docs: document restrictions of colocated tables
Currently some things are not supported for colocated tables: it's not
possible to repair a colocated table, and due to this it's also not
possible to use the tombstone_gc=repair mode on a colocated table.

Extend the documentation to explain what colocated tables are and
document these restrictions.

Fixes scylladb/scylladb#27261

Closes scylladb/scylladb#27516

(cherry picked from commit 33f7bc28da)

Closes scylladb/scylladb#27772
2025-12-19 12:26:44 +01:00
Emil Maskovsky
6d81dc8ba8 topology_coordinator: handle seastar::abort_requested_exception alongside raft::request_aborted
In several exception handlers, only raft::request_aborted was being
caught and rethrown, while seastar::abort_requested_exception was
falling through to the generic catch(...) block. This caused the
exception to be incorrectly treated as a failure that triggers
rollback, instead of being recognized as an abort signal.

For example, during tablet draining, the error log showed:
"tablets draining failed with seastar::abort_requested_exception
(abort requested). Aborting the topology operation"

This change adds seastar::abort_requested_exception handling
alongside raft::request_aborted in all places where it was missing.
When rethrown, these exceptions propagate up to the main run() loop
where handle_topology_coordinator_error() recognizes them as normal
abort signals and allows the coordinator to exit gracefully without
triggering unnecessary rollback operations.
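
A minimal sketch of the handler pattern described above. The two exception
types are hypothetical stand-ins for raft::request_aborted and
seastar::abort_requested_exception, and run_step() is an illustrative name,
not a function in the topology coordinator:

```c++
#include <functional>
#include <stdexcept>

// Hypothetical stand-ins for raft::request_aborted and
// seastar::abort_requested_exception.
struct request_aborted : std::runtime_error {
    request_aborted() : std::runtime_error("request aborted") {}
};
struct abort_requested_exception : std::runtime_error {
    abort_requested_exception() : std::runtime_error("abort requested") {}
};

enum class step_result { ok, rolled_back };

// Runs one step. Abort signals are rethrown so the caller's main loop can
// exit gracefully; any other exception is treated as a failure.
inline step_result run_step(const std::function<void()>& step) {
    try {
        step();
        return step_result::ok;
    } catch (const request_aborted&) {
        throw;
    } catch (const abort_requested_exception&) {
        throw; // without this handler the generic catch below would run
    } catch (...) {
        return step_result::rolled_back;
    }
}
```

With both handlers in place, only genuine failures reach the rollback path.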

Fixes: scylladb/scylladb#27255

(cherry picked from commit 37e3dacf33)

Closes scylladb/scylladb#27663
2025-12-19 11:50:02 +01:00
Patryk Jędrzejczak
1cadf057ce Merge '[Backport 2025.4] Make direct failure detector verb handler more efficient' from Scylladb[bot]
We saw that in large clusters the direct failure detector may cause large task queues to accumulate. The series addresses this issue and also moves the code into the correct scheduling group.

Fixes https://github.com/scylladb/scylladb/issues/27142

Backport to all versions where 60f1053087 was backported to, since it should improve performance in large clusters.

- (cherry picked from commit 82f80478b8)

- (cherry picked from commit 6a6bbbf1a6)

- (cherry picked from commit 86dde50c0d)

Parent PR: #27387

Closes scylladb/scylladb#27483

* https://github.com/scylladb/scylladb:
  direct_failure_detector: run direct failure detector in the gossiper scheduling group
  raft: drop invoke_on from the pinger verb handler
  direct_failure_detector: pass timeout to direct_fd_ping verb
2025-12-19 11:13:11 +01:00
Amnon Heiman
cbf6250021 scylla-node-exporter: Add ethtool to node exporter
AWS suggests following multiple network performance metrics:
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/monitoring-network-performance-ena.html#network-performance-metrics

This patch enables the ethtool collector with the specific list of
metrics.

After this patch the relevant metrics look like:

$ curl http://localhost:9100/metrics |& grep ethtool
node_ethtool_bw_in_allowance_exceeded{device="ens5"} 0
node_ethtool_bw_out_allowance_exceeded{device="ens5"} 0
node_ethtool_conntrack_allowance_available{device="ens5"} 51303
node_ethtool_conntrack_allowance_exceeded{device="ens5"} 0
node_ethtool_info{bus_info="0000:00:05.0",device="ens5",driver="ena",expansion_rom_version="",firmware_version="",version="6.14.0-1015-aws"} 1
node_ethtool_linklocal_allowance_exceeded{device="ens5"} 0
node_scrape_collector_duration_seconds{collector="ethtool"} 0.001091436
node_scrape_collector_success{collector="ethtool"} 1

Signed-off-by: Amnon Heiman <amnon@scylladb.com>

Closes scylladb/scylladb#27358

(cherry picked from commit a213e41250)

Closes scylladb/scylladb#27508
2025-12-19 09:15:51 +02:00
Pawel Pery
0d7a32cece unittest: fix vector_store_client_test_dns_refresh_aborted hangs
The root cause of the hanging test is a deadlock.
`vector_store_client` runs a dns refresh fiber that waits on a condition
variable. After aborting the dns request, the test signals the condition variable.
Stopping the vector_store_client takes long enough to trigger the next dns
refresh, and this time the condition variable is not signalled, so
vector_store_client waits forever for the dns refresh fiber to finish.

The commit fixes the problem by waiting for the condition variable only once.

Fixes: #27237
Fixes: VECTOR-370

Closes scylladb/scylladb#27239

(cherry picked from commit b5c85d08bb)

Closes scylladb/scylladb#27393
2025-12-19 09:14:49 +02:00
Ernest Zaslavsky
4b81530e8a s3_client: handle additional transient network errors
Add handling for a broader set of transient network-related `std::errc` values in `aws_error::from_system_error`. Treat these conditions as retryable when the client re-creates the socket for each request.
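
A minimal sketch of such a classification. The helper name and the specific
`std::errc` values are assumptions chosen for illustration; the commit message
does not list the values that `aws_error::from_system_error` actually handles:

```c++
#include <system_error>

// Hypothetical classifier. The set of std::errc values below is an
// assumption, not the list used by the real s3 client.
inline bool is_transient_network_error(const std::system_error& e) {
    const std::error_code& ec = e.code();
    return ec == std::errc::connection_reset
        || ec == std::errc::connection_aborted
        || ec == std::errc::broken_pipe
        || ec == std::errc::timed_out
        || ec == std::errc::network_unreachable
        || ec == std::errc::host_unreachable;
}
```

Retrying on these conditions is only reasonable because the client re-creates the socket for each request, so a retry does not reuse the broken connection.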

Fixes: https://github.com/scylladb/scylladb/issues/27349

Closes scylladb/scylladb#27350

(cherry picked from commit 605f71d074)

Closes scylladb/scylladb#27392
2025-12-19 09:14:00 +02:00
Michael Litvak
34ede10db9 tablet: scheduler: Do not emit conflicting migration in merge colocation
The tablet scheduler should not emit conflicting migrations for the same
tablet. This was addressed initially in scylladb/scylladb#26038 but the
check is missing in the merge colocation plan, so add it there as well.

Without this check, the merge colocation plan could generate a
conflicting migration for a tablet that is already scheduled for
migration, as the test demonstrates.

This can cause correctness problems, because if the load balancer
generates two migrations for a single tablet, both will be written as
mutations, and the resulting mutation could contain mixed cells from
both migrations.
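
The check can be sketched as follows. The types are a hypothetical, reduced
model of a migration plan and do not match the real scheduler classes:

```c++
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical, reduced model of a migration plan.
struct migration {
    int tablet_id;
    int dst_node;
};

class migration_plan {
    std::vector<migration> _migrations;
    std::set<int> _tablets_in_plan;
public:
    // Adds the migration unless the tablet already has one in this plan.
    // Returns false when the migration was skipped as conflicting.
    bool try_add(const migration& m) {
        if (!_tablets_in_plan.insert(m.tablet_id).second) {
            return false;
        }
        _migrations.push_back(m);
        return true;
    }
    std::size_t size() const { return _migrations.size(); }
};
```

Tracking the tablets already in the plan guarantees at most one migration per tablet, so no two sets of mutations can mix.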

Fixes scylladb/scylladb#27304

Closes scylladb/scylladb#27312

(cherry picked from commit 97b7c03709)

Closes scylladb/scylladb#27331
2025-12-19 09:13:29 +02:00
Amnon Heiman
2e0c41b32b vector_index: require tablets for vector indexes
This patch enforces that vector indexes can only be created on keyspaces
that use tablets. During index validation, `check_uses_tablets()` verifies
the base keyspace configuration and rejects creation otherwise.

To support this, the `custom_index::validate()` API now receives a
`const data_dictionary::database&` parameter, allowing index
implementations to access keyspace-level settings during DDL validation.

Fixes https://scylladb.atlassian.net/browse/VECTOR-322

Closes scylladb/scylladb#26786

(cherry picked from commit 68c7236acb)

Closes scylladb/scylladb#27272
2025-12-19 09:12:18 +02:00
Amnon Heiman
9ad7bd8070 index/vector_index.cc: Don't allow zero as an index option
This patch forces vector_index option values to be strictly positive
numbers, as zero would make no sense.

Fixes https://scylladb.atlassian.net/browse/VECTOR-249

Signed-off-by: Amnon Heiman <amnon@scylladb.com>

Closes scylladb/scylladb#27191

(cherry picked from commit b2c2a99741)

Closes scylladb/scylladb#27234
2025-12-19 09:11:34 +02:00
Jenkins Promoter
7a365e0973 Update ScyllaDB version to: 2025.4.1 2025-12-18 18:36:06 +02:00
Jenkins Promoter
fe8b2f1092 Update ScyllaDB version to: 2025.4.0 scylla-2025.4.0-candidate-20251217011910 scylla-2025.4.0 2025-12-17 09:46:48 +02:00
Nadav Har'El
55b78e56a9 test/cqlpy: fix flaky test test_view_in_system_tables
The cqlpy test test_materialized_view.py::test_view_in_system_tables
checks that the system table "system.built_views" can inform us that
a view has been built. This test was flaky, starting to fail quite
often recently, and this patch fixes the problem in the test.

For historic reasons, this test began by calling a utility function
wait_for_view_built() - which uses a different system table,
system_distributed.view_build_status, to wait until the view was built.
The test then immediately tries to verify that also system.built_views
lists this view.

But there is no real reason why we could assume - or want to assume -
that these two tables are updated in this order, or how much time
passed between the two tables being changed. The authors of this
test already acknowledged there is a problem - they included a hack
purporting to be a "read barrier" that claimed to solve this exact
problem - but it seems it doesn't, or at least no longer does after
recent changes to the view builder's implementation.

The solution is simple - just remove the call to wait_for_view_built()
and the "hack" after it. We should just wait in a loop (until a timeout)
for the system table that we really wanted to check - system.built_views.
It's as simple as that. No need for any other assumptions or hacks.

Fixes #27296

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#27626

(cherry picked from commit ccacea621f)

Closes scylladb/scylladb#27670
2025-12-16 15:45:59 +02:00
Michael Litvak
4b26a86cb0 alternator: require rf_rack_valid_keyspaces when creating index
When creating an alternator table with tablets, if it has an index, LSI
or GSI, require the config option rf_rack_valid_keyspaces to be enabled.

The option is required for materialized views in tablets keyspaces to
function properly and avoid consistency issues that could happen due to
cross-rack migrations and pairing switches when RF-rack validity is not
enforced.

Currently the option is validated when creating a materialized view via
the CQL interface, but it's missing from the alternator interface. Since
alternator indexes are based on materialized views, the same check
should be added there as well.

Fixes scylladb/scylladb#27612

Closes scylladb/scylladb#27622

(cherry picked from commit b9ec1180f5)

Closes scylladb/scylladb#27671
2025-12-16 10:13:31 +02:00
Jenkins Promoter
344f648703 Update pgo profiles - aarch64 scylla-2025.4.0-rc7-candidate-20251216080722 scylla-2025.4.0-rc7 2025-12-15 10:31:11 +02:00
Yaron Kaikov
b70794e6ed auto-backport.py: modify instruction for making PR ready for review
Update the comment sent when a PR has conflicts with clear instructions on how to make the PR ready for review

Fixes: https://scylladb.atlassian.net/browse/RELENG-152

Closes scylladb/scylladb#27547

(cherry picked from commit d3e199984e)

Closes scylladb/scylladb#27565
2025-12-15 09:59:44 +02:00
Yaron Kaikov
3c7ff856c3 workflows: trigger CI automatically when conflicts label is removed
Add pull_request_target event with unlabeled type to trigger-scylla-ci
workflow. This allows automatic CI triggering when the 'conflicts' label
is removed from a PR, in addition to the existing manual trigger via
comment.

The workflow now runs when:
- A user posts a comment with '@scylladbbot trigger-ci' (existing)
- The 'conflicts' label is removed from a PR (new)

Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-84

Closes scylladb/scylladb#27521

(cherry picked from commit f7ffa395a8)

Closes scylladb/scylladb#27602
2025-12-15 09:58:52 +02:00
Jenkins Promoter
99e9f4b07c Update pgo profiles - x86_64 2025-12-15 09:42:11 +02:00
Botond Dénes
ffc6953850 Merge '[Backport 2025.4] api: storage_service/tablets/repair: disable incremental repair by default' from Scylladb[bot]
Change the default incremental_mode to `disabled` due to https://github.com/scylladb/scylladb/issues/26041 and https://github.com/scylladb/scylladb/issues/27414

** Backport to 2025.4 where 611918056a was introduced **

- (cherry picked from commit 5fae4cdf80)

- (cherry picked from commit c8cff94a5a)

Parent PR: #27530

Closes scylladb/scylladb#27595

* github.com:scylladb/scylladb:
  api: storage_service/tablets/repair: disable incremental repair by default
  docs: nodetool-commands: cluster: repair: fix incremental-mode example
2025-12-15 08:47:49 +02:00
Yaron Kaikov
434956af0f Add JIRA issue validation to backport PR fixes check
Extend the Fixes validation pattern to also accept JIRA issue references
(format: [A-Z]+-\d+) in addition to GitHub issue references. This allows
backport PRs to reference JIRA issues in the format 'Fixes: PROJECT-123'.
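
A minimal sketch of such a check, written in C++ for illustration (the commit
message does not show the real check's code). The JIRA part uses the format
given above; the `#123` GitHub shorthand and the helper name are assumptions:

```c++
#include <regex>
#include <string>

// Hypothetical validator. Accepts "Fixes: PROJECT-123" (JIRA key) or
// "Fixes: #123" (GitHub shorthand, assumed here for illustration).
inline bool has_valid_fixes_ref(const std::string& line) {
    static const std::regex pattern(R"(^Fixes:?\s+(#\d+|[A-Z]+-\d+)$)");
    return std::regex_match(line, pattern);
}
```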

Fixes: https://github.com/scylladb/scylladb/issues/27571

Closes scylladb/scylladb#27572

(cherry picked from commit 3dfa5ebd7f)

Closes scylladb/scylladb#27601
2025-12-12 09:34:45 +02:00
Benny Halevy
2f4f3ff980 api: storage_service/tablets/repair: disable incremental repair by default
Change the default incremental_mode to `disabled` due to
https://github.com/scylladb/scylladb/issues/26041 and
https://github.com/scylladb/scylladb/issues/27414

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit c8cff94a5a)
2025-12-11 23:49:07 +00:00
Benny Halevy
20134b9ade docs: nodetool-commands: cluster: repair: fix incremental-mode example
There is no 'regular' incremental mode anymore.
The example seems to have meant 'disabled'.

Fixes #27587

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
(cherry picked from commit 5fae4cdf80)
2025-12-11 23:49:07 +00:00
Anna Stuchlik
cc885a3f35 replace the Driver pages with a link to the new Drivers pages
This commit removes the now redundant driver pages from
the ScyllaDB documentation. Instead, a link to the pages
where we moved the driver information is added.
Also, the links are updated across the ScyllaDB manual.

Redirections are added for all the removed pages.

Fixes https://github.com/scylladb/scylladb/issues/26871

Closes scylladb/scylladb#27277

(cherry picked from commit c5580399a8)

Closes scylladb/scylladb#27442
2025-12-10 09:18:24 +01:00
Jenkins Promoter
582b9f83db Update ScyllaDB version to: 2025.4.0-rc7 2025-12-09 17:21:23 +02:00
Gleb Natapov
726c1f5734 direct_failure_detector: run direct failure detector in the gossiper scheduling group
When the direct failure detector was introduced, the idea was that it would
run on the same connection that raft group0 verbs run on, but in
60f1053087 raft verbs were moved to run on the gossiper connection
while DIRECT_FD_PING was left where it was. This patch moves it to the
gossiper connection as well and fixes the pinger code to run in the gossiper
scheduling group.

(cherry picked from commit 86dde50c0d)
2025-12-09 17:19:31 +02:00
Anna Stuchlik
d2c24bf42d doc: update the upgrade policy to cover non-consecutive minor upgrades
Fixes https://github.com/scylladb/scylladb/issues/27308

Closes scylladb/scylladb#27319

(cherry picked from commit a5c971d21c)

Closes scylladb/scylladb#27457
2025-12-09 11:38:00 +03:00
Anna Stuchlik
72ee6396d2 doc: add the upgrade guide from 2025.x to 2025.4
Fixes https://github.com/scylladb/scylladb/issues/26451

Fixes https://github.com/scylladb/scylladb/issues/26452

Closes scylladb/scylladb#27310

(cherry picked from commit 48cf84064c)

Closes scylladb/scylladb#27407
2025-12-09 11:37:27 +03:00
Jenkins Promoter
f64b5d375e Update ScyllaDB version to: 2025.4.0-rc6 scylla-2025.4.0-rc6 scylla-2025.4.0-rc6-candidate-20251208053234 2025-12-08 13:37:52 +02:00
Gleb Natapov
6427044681 raft: drop invoke_on from the pinger verb handler
Currently the raft direct pinger verb jumps to shard 0 to check if group0 is
alive before replying. The verb runs relatively often, so this is not very
efficient. The patch distributes group0 liveness information (as it
changes) to all shards instead, so that the handler itself does not need
to jump to shard 0.

(cherry picked from commit 6a6bbbf1a6)
2025-12-07 14:57:50 +00:00
Gleb Natapov
000ac0227e direct_failure_detector: pass timeout to direct_fd_ping verb
Currently direct_fd_ping runs without a timeout. The caller does not wait
for the verb forever: the wait is canceled after a timeout, but this timeout
is simply not passed to the rpc. This can create a situation where the
rpc callback runs on the destination although nobody is waiting for it any more.
Change the code to pass the timeout to the rpc as well and return early from
the rpc handler if the timeout is reached by the time the callback is
called. This is backwards compatible since the timeout is passed as
optional.
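
A minimal sketch of the handler-side check, assuming a hypothetical
handle_ping() function and std::chrono types rather than the real rpc
signatures:

```c++
#include <chrono>
#include <optional>

using clock_type = std::chrono::steady_clock;

// Hypothetical handler for a ping-like verb. The deadline is optional so
// that callers which do not send one keep the old behavior.
// Returns true if the liveness check ran, false if the request was dropped
// because the caller has already stopped waiting.
inline bool handle_ping(std::optional<clock_type::time_point> deadline,
                        clock_type::time_point now = clock_type::now()) {
    if (deadline && now >= *deadline) {
        return false; // deadline passed: skip the work
    }
    // ... the actual liveness check would go here ...
    return true;
}
```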

(cherry picked from commit 82f80478b8)
2025-12-07 14:57:50 +00:00
Piotr Dulikowski
3040e7aedf index: allow vector indexes without rf_rack_valid_keyspces
The rf_rack_valid_keyspaces option needs to be turned on in order to
allow creating materialized views in tablet keyspaces with numeric RF
per DC. This is also necessary for secondary indexes because they use
materialized views underneath. However, this option is _not_ necessary
for vector store indexes because those use the external vector store
service for querying the list of keys to fetch from the main table, they
do not create a materialized view. The rf_rack_valid_keyspaces was, by
accident, required for vector indexes, too.

Remove the restriction for vector store indexes as it is completely
unnecessary.

Fixes: SCYLLADB-81

Closes scylladb/scylladb#27447

(cherry picked from commit bb6e41f97a)

Closes scylladb/scylladb#27455
2025-12-05 20:13:02 +01:00
Karol Nowacki
71c47b8d18 vector_search: Fix requests hanging on unreachable nodes
When a vector store node becomes unreachable, a client request sent
before the keep-alive timer fires would hang until the CQL query
timeout was reached.

This occurred because the HTTP request writes to the TCP buffer and then
waits for a response. While data is in the buffer, TCP retransmissions
prevent the keep-alive timer from detecting the dead connection.

This patch resolves the issue by setting the `TCP_USER_TIMEOUT` socket
option, which applies an effective timeout to TCP retransmissions,
allowing the connection to fail faster.
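
For reference, a minimal sketch of setting this option on a raw Linux socket
with the POSIX setsockopt() call. The helper name is hypothetical, and the
commit message does not show the actual code, which may set the option
through a different interface:

```c++
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

// Linux-only. Sets the maximum time, in milliseconds, that transmitted data
// may remain unacknowledged before the kernel closes the connection.
// Returns 0 on success, -1 on error (errno is set by setsockopt).
inline int set_tcp_user_timeout(int fd, unsigned int timeout_ms) {
    return ::setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
                        &timeout_ms, sizeof(timeout_ms));
}
```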

Closes scylladb/scylladb#27388

(cherry picked from commit a54bf50290)

Closes scylladb/scylladb#27423
2025-12-04 19:54:08 +01:00
Avi Kivity
4db6d3e924 database: fix overflow when computing data distribution over shards
We store the per-shard chunk count in a uint64_t vector
global_offset, and then convert the counts to offsets with
a prefix sum:

```c++
        // [1, 2, 3, 0] --> [0, 1, 3, 6]
        std::exclusive_scan(global_offset.begin(), global_offset.end(), global_offset.begin(), 0, std::plus());
```

However, std::exclusive_scan takes the accumulator type from the
initial value, 0, which is an int, instead of from the range being
iterated, which is of uint64_t.

As a result, the prefix sum is computed as a 32-bit integer value. If
it exceeds 0x8000'0000, it becomes negative. It is then extended to
64 bits and stored. The result is a huge 64-bit number. Later on
we try to find an sstable with this chunk and fail, crashing on
an assertion.

An example of the failure can be seen here: https://godbolt.org/z/6M8aEbo57

The fix is simple: the initial value is passed as uint64_t instead of int.
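
A self-contained sketch of the difference, using hypothetical helper names.
Both functions compute the same prefix sum and differ only in the type of the
initial value:

```c++
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Accumulator type is deduced from the literal 0, i.e. int: the running sum
// is narrowed to int after every addition.
inline std::vector<uint64_t> offsets_with_int_init(std::vector<uint64_t> counts) {
    std::exclusive_scan(counts.begin(), counts.end(), counts.begin(), 0, std::plus());
    return counts;
}

// Accumulator type is uint64_t, matching the element type.
inline std::vector<uint64_t> offsets_with_u64_init(std::vector<uint64_t> counts) {
    std::exclusive_scan(counts.begin(), counts.end(), counts.begin(), uint64_t(0), std::plus());
    return counts;
}
```

For small counts such as [1, 2, 3, 0] both variants agree. Once the running sum no longer fits in an int, for example with counts [0x70000000, 0x70000000, 1], only the uint64_t variant yields the expected third offset 0xE0000000.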

Fixes https://github.com/scylladb/scylladb/issues/27417

Closes scylladb/scylladb#27418

(cherry picked from commit 9696ee64d0)
2025-12-04 20:17:19 +02:00
Piotr Dulikowski
e4b1c1f38b db/view/view_building_coordinator: skip work if no view is built
Even though `view_building_coordinator::work_on_view_building` has
an `if` at the very beginning which checks whether the currently
processed base table is set, it only prints a message and continues
executing the rest of the function regardless of the result of the
check. However, some of the logic in the function assumes that the
currently processed base table field is set and tries to access the
value of the field. This can lead to the view building coordinator
accessing a disengaged optional, which is undefined behavior.

Fix the function by adding the clearly missing `co_return` to the check.
A regression test is added which checks that the view building state
observer - a different fiber which used to print a weird message due to
erroneous view building coordinator behavior - does not print a warning.
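
A minimal sketch of the guard, using a plain function with `return` instead
of a coroutine. The struct and the return value are hypothetical and only
mirror the shape of the check:

```c++
#include <optional>
#include <string>

// Hypothetical, reduced model of the coordinator state.
struct coordinator_state {
    std::optional<std::string> currently_processed_base_table;
};

// Returns the table that was worked on, or nullopt if there was nothing to do.
inline std::optional<std::string> work_on_view_building(const coordinator_state& st) {
    if (!st.currently_processed_base_table) {
        // Nothing to build: leave the function here. Only logging and
        // carrying on would dereference a disengaged optional below.
        return std::nullopt;
    }
    const std::string& table = *st.currently_processed_base_table;
    return table;
}
```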

Fixes: scylladb/scylladb#27363

Closes scylladb/scylladb#27373

(cherry picked from commit 654ac9099b)

Closes scylladb/scylladb#27406
scylla-2025.4.0-rc5-candidate-20251204032609
2025-12-03 17:12:17 +01:00
Piotr Dulikowski
2787ac6cba Merge '[Backport 2025.4] vector_search: Fix high availability during timeouts' from Scylladb[bot]
This PR introduces two key improvements to the robustness and resource management of vector search:

Proper Abort on CQL Timeout: Previously, when a CQL query involving a vector search timed out, the underlying ANN query to the vector store was not aborted and would continue to run. This has been fixed by ensuring the abort source is correctly signaled, terminating the ANN request when its parent CQL query expires and preventing unnecessary resource consumption.

Faster Failure Detection: The connection and keep-alive timeouts for vector store nodes were excessively long (2 and 11 minutes, respectively), causing significant delays in detecting and recovering from unreachable nodes. These timeouts are now aligned with the request_timeout_in_ms setting, allowing for much faster failure detection and improving high availability by failing over from unresponsive nodes more quickly.

Fixes: SCYLLADB-76

This issue affects the 2025.4 branch, where similar HA recovery delays have been observed.

- (cherry picked from commit b6afacfc1e)

- (cherry picked from commit 086c6992f5)

Parent PR: #27377

Closes scylladb/scylladb#27391

* github.com:scylladb/scylladb:
  vector_search: Fix ANN query abort on CQL timeout
  vector_search: Reduce connection and keep-alive timeouts
2025-12-03 07:20:11 +01:00
Karol Nowacki
26599e79f2 vector_search: Fix ANN query abort on CQL timeout
When a CQL vector search request timed out, the underlying ANN query was
not aborted and continued to run. This happened because the abort source
was not being signaled upon request expiration.
This commit ensures the ANN query is aborted when the CQL request times out,
preventing unnecessary resource consumption.
2025-12-02 16:58:55 +01:00
Karol Nowacki
d4c199a1ec vector_search: Reduce connection and keep-alive timeouts
The connection timeout was 2 minutes and the keep-alive
timeout was 11 minutes. If a vector store node became unreachable, these
long timeouts caused significant delays before the system could recover,
negatively impacting high availability.

This change aligns both timeouts with the `request_timeout`
configuration, which defaults to 10 seconds. This allows for much
faster failure detection and recovery, ensuring that unresponsive nodes
are failed over from more quickly.
2025-12-02 16:52:53 +01:00
Asias He
4e7202ee32 repair: Fix deadlock when topology coordinator steps down in the middle
Consider this:

1) n1 is the topology coordinator
2) n1 schedules and executes a tablet repair with session id s1 for a
tablet on n3 and n4.
3) n3 and n4 take the lock and store it in _rs._repair_compaction_locks[s1]
4) n1 steps down before it executes
locator::tablet_transition_stage::end_repair
5) n2 becomes the new topology coordinator
6) n2 runs locator::tablet_transition_stage::repair again
7) n3 and n4 try to take the lock again and hang since the lock is
already taken.

To avoid the deadlock, we can throw in step 7 so that n2 will
proceed to end_repair stage and release the lock. After that, the
scheduler could schedule the tablet repair request again.
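
The idea can be sketched with a hypothetical per-session lock table. The class
below is an illustration only and is far simpler than the real
_rs._repair_compaction_locks handling:

```c++
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical model of the per-session lock table on a replica.
class repair_lock_table {
    std::set<std::string> _held; // session ids whose lock is taken
public:
    // Steps 3 and 7: take the lock for a session. If the lock for this
    // session is already held, throw instead of waiting forever.
    void take(const std::string& session) {
        if (!_held.insert(session).second) {
            throw std::runtime_error("repair lock already held for session " + session);
        }
    }
    // end_repair stage: release the lock so the repair can be rescheduled.
    void release(const std::string& session) { _held.erase(session); }
    bool held(const std::string& session) const { return _held.count(session) != 0; }
};
```

Throwing on the second take() turns the hang into an error that the new coordinator can handle by moving on to end_repair.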

Fixes #26346

Closes scylladb/scylladb#27163

(cherry picked from commit da5cc13e97)

Closes scylladb/scylladb#27337
2025-12-01 13:06:02 +01:00
Jenkins Promoter
6b5d334be3 Update pgo profiles - aarch64 2025-12-01 04:47:40 +02:00
Jenkins Promoter
58f1597831 Update pgo profiles - x86_64 2025-12-01 03:56:10 +02:00
Anna Stuchlik
d9bfb8c607 doc: fix the info about object storage
This commit fixes the information about object storage:

- Object storage configuration is no longer marked as experimental.
- Redundant information has been removed from the description.
- Information related to object storage for SSTables has been removed
  as the feature is not working.

Fixes https://github.com/scylladb/scylladb/issues/26985

Closes scylladb/scylladb#26987

(cherry picked from commit 724dc1e582)

Closes scylladb/scylladb#27211
2025-11-28 12:38:08 +01:00
Patryk Jędrzejczak
1dab04666c Merge '[Backport 2025.4] doc: update Cloud Instance Recommendations for GCP' from Scylladb[bot]
This PR:
- Removes n1-highmem instances from Recommended Instances.
- Adds missing support for n2-highmem-96.
- Updates the reference to n2 instances in the Google Cloud docs (fixes a broken link to GCP).
- Adds the missing information about processors for n2-highmem-instance - Ice Lake and Cascade Lake (requested by CX).

Fixes https://github.com/scylladb/scylladb/issues/25946
Fixes https://github.com/scylladb/scylladb/issues/24223
Fixes https://github.com/scylladb/scylladb/issues/23976

No backport needed if this PR is merged before 2025.4 branching.

- (cherry picked from commit b18b052d26)

- (cherry picked from commit dab74471cc)

Parent PR: #26182

Closes scylladb/scylladb#27168

* https://github.com/scylladb/scylladb:
  doc: update information for n2-highmem instances
  doc: remove n1-highmem instances from Recommended Instances
2025-11-28 12:31:33 +01:00
Asias He
fc54aedd8f topology_coordinator: Send incremental repair rpc only when the feature is enabled
Otherwise, in a mixed cluster, the handle_tablet_resize_finalization
would fail because of the unknown rpc verb.

Fixes #26309

Closes scylladb/scylladb#27218

(cherry picked from commit ab4896dc70)

Closes scylladb/scylladb#27284
2025-11-27 18:42:14 +01:00
Patryk Jędrzejczak
6b3b05c10b Merge '[Backport 2025.4] fix notification about expiring erm held for to long' from Scylladb[bot]
Commit 6e4803a750 broke the notification about expired erms held for too long, since it resets the tracker without calling its destructor (where the notification is triggered). Fix the assignment operator to call the destructor like it should.
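
A minimal sketch of the general pattern, assuming a hypothetical tracker class
whose destructor fires a callback. Here the assignment operator and the
destructor share a notify() helper, which has the same effect as running the
destructor on the old value; the real fix may be structured differently:

```c++
#include <functional>
#include <utility>

// Hypothetical tracker that notifies when it is destroyed or overwritten.
class tracker {
    std::function<void()> _on_expire;
    void notify() {
        if (_on_expire) {
            auto cb = std::exchange(_on_expire, nullptr);
            cb();
        }
    }
public:
    tracker() = default;
    explicit tracker(std::function<void()> cb) : _on_expire(std::move(cb)) {}
    tracker(tracker&& o) : _on_expire(std::exchange(o._on_expire, nullptr)) {}
    tracker& operator=(tracker&& o) {
        if (this != &o) {
            notify(); // the old value goes away, so it must notify first
            _on_expire = std::exchange(o._on_expire, nullptr);
        }
        return *this;
    }
    ~tracker() { notify(); }
};
```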

Fixes https://github.com/scylladb/scylladb/issues/27141

- (cherry picked from commit 9f97c376f1)

- (cherry picked from commit 5dcdaa6f66)

Parent PR: #27140

Closes scylladb/scylladb#27276

* https://github.com/scylladb/scylladb:
  test: test that expired erm that held for too long triggers notification
  token_metadata: fix notification about expiring erm held for to long
2025-11-27 16:58:16 +01:00
Patryk Jędrzejczak
30e02b6658 Merge '[Backport 2025.4] locator/node: include _excluded in missing places' from Scylladb[bot]
We currently ignore the `_excluded` field in `node::clone()` and the verbose
formatter of `locator::node`. The first one is a bug that can have
unpredictable consequences on the system. The second one can be a minor
inconvenience during debugging.

We fix both places in this PR.

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-72

This PR is a bugfix that should be backported to all supported branches.

- (cherry picked from commit 4160ae94c1)

- (cherry picked from commit 287c9eea65)

Parent PR: #27265

Closes scylladb/scylladb#27291

* https://github.com/scylladb/scylladb:
  locator/node: include _excluded in verbose formatter
  locator/node: preserve _excluded in clone()
2025-11-27 12:27:06 +01:00
Nadav Har'El
a0916179a3 Merge '[Backport 2025.4] Alternator: enable tablets by default - depending on tablets_mode_for_new_keyspaces' from Scylladb[bot]
Before this series, Alternator's CreateTable operation defaults to creating a table replicated with vnodes, not tablets. The reasons for this default included missing support for LWT, Materialized Views, Alternator TTL and Alternator Streams if tablets are used. But today, all of these (except the still-experimental Alternator Streams) are now fully available with tablets, so we are finally ready to switch Alternator to use tablets by default in new tables.

We will use the same configuration parameter that CQL uses, tablets_mode_for_new_keyspaces, to determine whether new keyspaces use tablets by default. If set to `enabled`, tablets are used by default on new tables. If set to `disabled`, tablets will not be used by default (i.e., vnodes will be used, as before). A third value, `enforced` is similar to `enabled` but forbids overriding the default to vnodes when creating a table.

As before, the user can set a tag during the CreateTable operation to override the default choice of tablets or vnodes (unless in `enforced` mode). This tag is now named `system:initial_tablets` - whereas before this patch it was called `experimental:initial_tablets`. The rules stay the same as with the earlier, experimental:initial_tablets tag: when supplied with a numeric value, the table will use tablets. When supplied with something else (like a string "none"), the table will use vnodes.
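
The rules above can be sketched as a small decision function. The function
name, the enum, and the "all digits" test for a numeric value are assumptions
made for illustration and are not taken from the Alternator code:

```c++
#include <algorithm>
#include <cctype>
#include <optional>
#include <stdexcept>
#include <string>

enum class tablets_mode { disabled, enabled, enforced };

// Hypothetical helper. `tag` is the value of the system:initial_tablets tag,
// if the CreateTable request supplied one. Returns true for tablets,
// false for vnodes.
inline bool use_tablets(tablets_mode mode, const std::optional<std::string>& tag) {
    if (!tag) {
        return mode != tablets_mode::disabled; // no tag: follow the default
    }
    const bool numeric = !tag->empty() &&
        std::all_of(tag->begin(), tag->end(),
                    [](unsigned char c) { return std::isdigit(c) != 0; });
    if (!numeric && mode == tablets_mode::enforced) {
        throw std::invalid_argument("vnodes are not allowed when tablets are enforced");
    }
    return numeric;
}
```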

Fixes https://github.com/scylladb/scylladb/issues/22463

Backport to 2025.4, it's important not to delay phasing out vnodes.

- (cherry picked from commit 403068cb3d)

- (cherry picked from commit af00b59930)

- (cherry picked from commit 376a2f2109)

- (cherry picked from commit 35216d2f01)

- (cherry picked from commit 7466325028)

- (cherry picked from commit c7de7e76f4)

- (cherry picked from commit 63897370cb)

- (cherry picked from commit 274d0b6d62)

- (cherry picked from commit 345747775b)

- (cherry picked from commit a659698c6d)

- (cherry picked from commit eeb3a40afb)

- (cherry picked from commit b34f28dae2)

- (cherry picked from commit 25439127c8)

- (cherry picked from commit c03081eb12)

- (cherry picked from commit 65ed678109)

Parent PR: #26836

Closes scylladb/scylladb#26949

* github.com:scylladb/scylladb:
  test/cluster: modify test to not fail on 2025.4 branch
  Fix backport conflicts
  test,alternator: use 3-rack clusters in tests
  alternator: improve error in tablets_mode_for_new_keyspaces=enforced
  config: make tablets_mode_for_new_keyspaces live-updatable
  alternator: improve comment about non-hidden system tags
  alternator: Fix test_ttl_expiration_streams()
  alternator: Fix test_scan_paging_missing_limit()
  alternator: Don't require vnodes for TTL tests
  alternator: Remove obsolete test from test_table.py
  alternator: Fix tag name to request vnodes
  alternator: Fix test name clash in test_tablets.py
  alternator: test_tablets.py handles new policy reg. tablets
  alternator: Update doc regarding tablets support
  alternator: Support `tablets_mode_for_new_keyspaces` config flag
  Fix incorrect hint for tablets_mode_for_new_keyspaces
  Fix comment for tablets_mode_for_new_keyspaces
2025-11-27 09:05:18 +02:00
Piotr Dulikowski
863aae84fd Merge '[Backport 2025.4] db/view/view_building_coordinator: get rid of task's state in group0' from Scylladb[bot]
Previously, the view building coordinator relied on setting each task's state to STARTED and then explicitly removing these state entries once tasks finished, before scheduling new ones. This approach induced a significant number of group0 commits, particularly in large clusters with many nodes and tablets, negatively impacting performance and scalability.

With the update, the coordinator and worker logic has been restructured to operate without maintaining per-task states. Instead, tasks are simply tracked with an aborted boolean flag, which is still essential for certain tablet operations. This change removes much of the coordination complexity, simplifies the view building code, and reduces operational overhead.

In addition, the coordinator now batches reports of finished tasks before making commits. Rather than committing task completions individually, it aggregates them and reports in groups, significantly minimizing the frequency of group0 commits. This new approach is expected to improve efficiency and scalability during materialized view construction, especially in large deployments.

Fixes https://github.com/scylladb/scylladb/issues/26311

This patch needs to be backported to 2025.4.

- (cherry picked from commit 6d853c8f11)

- (cherry picked from commit 08974e1d50)

- (cherry picked from commit eb04af5020)

- (cherry picked from commit 24d69b4005)

- (cherry picked from commit fb8cbf1615)

- (cherry picked from commit fe9581f54c)

Parent PR: #26897

Closes scylladb/scylladb#27266

* github.com:scylladb/scylladb:
  docs/dev/view-building-coordinator: update the docs after recent changes
  db/view/view_building: send coordinator's term in the RPC
  db/view/view_building_state: replace task's state with `aborted` flag
  db/view/view_building_coordinator: batch finished tasks reporting
  db/view/view_building_worker: change internal implementation
  db/view/view_building_coordinator: change `work_on_tasks` RPC return type
2025-11-27 01:47:53 +01:00
Patryk Jędrzejczak
3c635037df locator/node: include _excluded in verbose formatter
It can be helpful during debugging.

(cherry picked from commit 287c9eea65)
2025-11-26 23:05:25 +00:00
Patryk Jędrzejczak
30790b9af4 locator/node: preserve _excluded in clone()
We currently ignore the `_excluded` field in `clone()`. Losing
information about exclusion can have unpredictable consequences. One
observed effect (that led to finding this issue) is that the
`/storage_service/nodes/excluded` API endpoint sometimes misses excluded
nodes.
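
A minimal sketch of the bug class, using a hypothetical and heavily reduced
model of `locator::node`:

```c++
#include <memory>
#include <string>
#include <utility>

// Hypothetical, heavily reduced model of locator::node.
class node {
    std::string _host_id;
    bool _excluded = false;
public:
    explicit node(std::string host_id, bool excluded = false)
        : _host_id(std::move(host_id)), _excluded(excluded) {}
    bool excluded() const { return _excluded; }
    const std::string& host_id() const { return _host_id; }
    // Every field must be forwarded. Dropping _excluded here would silently
    // reset it to the default value in the copy.
    std::unique_ptr<node> clone() const {
        return std::make_unique<node>(_host_id, _excluded);
    }
};
```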

(cherry picked from commit 4160ae94c1)
2025-11-26 23:05:25 +00:00
Nadav Har'El
b2c3b28617 test/cluster: modify test to not fail on 2025.4 branch
The purpose of the test

cluster/test_alternator::test_alternator_ttl_scheduling_group

is to verify that during TTL expiration scans and deletions, all of the
CPU is used in the "streaming" scheduling group, not in the statement
scheduling group ("sl:default") as we had in the past due to bugs.

It appears that in branch 2025.4 we have a new bug - which doesn't exist
in master - that causes some tablets-related work which I couldn't
identify to be done in sl:default, and cause this test to fail.

The simple fix is to sleep for 5 seconds after writing the data, and
it seems that by that time, the sl:default work is done.

This change doesn't make the Alternator TTL test any weaker, so we
need to make this change to allow Alternator to go forward.

Sadly, it does mean that the only test we have for this apparent
bug (which has nothing to do with Alternator) will be gone.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2025-11-26 20:42:09 +02:00
Nadav Har'El
433bc4c17f Fix backport conflicts 2025-11-26 20:42:08 +02:00