Commit Graph

50342 Commits

Author SHA1 Message Date
Jenkins Promoter
cb30eb2e21 Update pgo profiles - aarch64 2025-11-01 05:23:52 +02:00
Jenkins Promoter
e3a0935482 Update pgo profiles - x86_64 2025-11-01 04:54:49 +02:00
Petr Gusev
88765f627a paxos_state: get_replica_lock: remove shard check
This check is incorrect: the current shard may be looking at
the old version of tablets map:
* an accept RPC comes to replica shard 0, which is already at write_both_read_new
* the new shard is shard 1, so paxos_state::accept is called on shard 1
* shard 1 is still at "streaming" -> shards_ready_for_reads() returns old
shard 0

Fixes scylladb/scylladb#26801

Closes scylladb/scylladb#26809
2025-10-31 21:37:39 +01:00
Avi Kivity
7a72155374 Merge 'Introduce nodetool excludenode' from Tomasz Grabiec
If a node is dead and cannot be brought back, tablet migrations are
stuck, until the node is explicitly marked as "permanently dead" /
"ignored node" / "excluded" (name differs in different contexts).

Currently, this is done during removenode and replace operations but
it should be possible to only mark the node as dead, for the purpose
of unblocking migrations or other topology operations, without doing
the actual removenode, because full removal might be currently
impossible, or not desirable due to lack of capacity or priorities.

This patch introduces this kind of API:

```
  nodetool excludenode <host-id> [ ... <host-id> ]
```

Having this kind of API is an improvement in user experience in
several cases. For example, when we lose a rack, the only viable
option for recovery is to run removenode with an extra
--ignore-dead-nodes option. This removenode will fail in the tablet
draining phase, as there is no live node in the rack to rebuild
replicas in. This is confusing to the operator. But necessary before
ALTER KEYSPACE can proceed in order to change replication options to
drop the rack from RF.

Having this API allows operators to have more unified procedures,
where "nodetool excludenode" is always the first step of recovery,
which unblocks further topology operations, both those which restore
capacity, but also auto-scaling, tablet split/merge, load balancing,
etc.

Fixes #21281

The PR also changes "nodetool status" to show excluded nodes,
they have 'X' in their status instead of 'D'.

Closes scylladb/scylladb#26659

* github.com:scylladb/scylladb:
  nodetool: status: Show excluded nodes as having status 'X'
  test: py: Test scenario involving excludenode API
  nodetool: Introduce excludenode command
2025-10-31 22:14:57 +02:00
Avi Kivity
d458dd41c6 Merge 'Avoid input_/output_stream-s default initialization and move-assignment' from Pavel Emelyanov
Recent seastar update deprecated in/out streams usage pattern when a stream is default constructed early and them move-assigned with the proper one (see scylladb/seastar#3051). This PR fixes few places in Scylla that still use one.

Adopting newer seastar API, no need to backport

Closes scylladb/scylladb#26747

* github.com:scylladb/scylladb:
  commitlog: Remove unused work::r stream variable
  ec2_snitch: Fix indentation after previous patch
  ec2_snitch: Coroutinize the aws_api_call_once()
  sstable: Construct output_stream for data instantly
  test: Don't reuse on-stack input stream
2025-10-31 21:22:41 +02:00
Avi Kivity
adf9c426c2 Merge 'db/config: Change default SSTable compressor to LZ4WithDictsCompressor' from Nikos Dragazis
`sstable_compression_user_table_options` allows configuring a node-global SSTable compression algorithm for user tables via scylla.yaml. The current default is LZ4Compressor (inherited from Cassandra).

Make LZ4WithDictsCompressor the new default. Metrics from real datasets in the field have shown significant improvements in compression ratios.

If the dictionary compression feature is not enabled in the cluster (e.g., during an upgrade), fall back to the `LZ4Compressor`. Once the feature is enabled, flip the default back to the dictionary compressor using with a listener callback.

Fixes #26610.

Closes scylladb/scylladb#26697

* github.com:scylladb/scylladb:
  test/cluster: Add test for default SSTable compressor
  db/config: Change default SSTable compressor to LZ4WithDictsCompressor
  db/config: Deprecate sstable_compression_dictionaries_allow_in_ddl
  boost/cql_query_test: Get expected compressor from config
2025-10-31 21:15:18 +02:00
Lakshmi Narayanan Sreethar
3eb7193458 backlog_controller: compute backlog even when static shares are set
The compaction manager backlog is exposed via metrics, but if static
shares are set, the backlog is never calculated. As a result, there is
no way to determine the backlog and if the static shares need
adjustment. Fix that by calculating backlog even when static shares are
set.

Fixes #26287

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>

Closes scylladb/scylladb#26778
2025-10-31 18:18:36 +02:00
Michał Hudobski
fd521cee6f test: fix typo in vector_index test
Unfortunately in https://github.com/scylladb/scylladb/pull/26508
a typo than changes a behavior of a test was introduced.
This patch fixes the typo.

Closes scylladb/scylladb#26803
2025-10-31 18:35:02 +03:00
Tomasz Grabiec
284c73d466 scripts: pull_github_pr.sh: Fix auth problem detection
Before the patch, the script printed:

parse error: Invalid numeric literal at line 2, column 0

Closes scylladb/scylladb#26818
2025-10-31 18:32:58 +03:00
Michael Litvak
e7dbccd59e cdc: use chunked_vector instead of vector for stream ids
use utils::chunked_vector instead of std::vector to store cdc stream
sets for tablets.

a cdc stream set usually represents all streams for a specific table and
timestamp, and has a stream id per each tablet of the table. each stream
id is represented by 16 bytes. thus the vector could require quite large
contiguous allocations for a table that has many tablets. change it to
chunked_vector to avoid large contiguous allocations.

Fixes scylladb/scylladb#26791

Closes scylladb/scylladb#26792
2025-10-31 13:02:34 +01:00
Tomasz Grabiec
1c0d847281 Merge 'load_balancer: load_stats reconcile after tablet migration and table resize' from Ferenc Szili
This change adds the ability to move tablets sizes in load_stats after a tablet migration or table resize (split/merge). This is needed because the size based load balancer needs to have tablet size data which is as accurate as possible, in order to work on fresh tablet size distribution and issue correct tablet migrations.

This is the second part of the size based load balancing changes:

- First part for tablet size collection via load_stats: #26035
- Second part reconcile load_stats: #26152
- The third part for load_sketch changes: #26153
- The fourth part which performs tablet load balancing based on tablet size: #26254

This is a new feature and backport is not needed.

Closes scylladb/scylladb#26152

* github.com:scylladb/scylladb:
  load_balancer: load_stats reconcile after tablet migration and table resize
  load_stats: change data structure which contains tablet sizes
2025-10-31 09:58:25 +01:00
Tomasz Grabiec
2bd173da97 nodetool: status: Show excluded nodes as having status 'X'
Example:

$ build/dev/scylla nodetool status
Datacenter: dc1
===============
Status=Up/Down/eXcluded
|/ State=Normal/Leaving/Joining/Moving
-- Address   Load      Tokens Owns Host ID                              Rack
UN 127.0.0.1 783.42 KB 1      ?    753cb7b0-1b90-4614-ae17-2cfe470f5104 rack1
XN 127.0.0.2 785.10 KB 1      ?    92ccdd23-5526-4863-844a-5c8e8906fa55 rack2
UN 127.0.0.3 708.91 KB 1      ?    781646ad-c85b-4d77-b7e3-8d50c34f1f17 rack3
2025-10-31 09:03:20 +01:00
Tomasz Grabiec
87492d3073 test: py: Test scenario involving excludenode API 2025-10-31 09:03:20 +01:00
Tomasz Grabiec
55ecd92feb nodetool: Introduce excludenode command
If a node is dead and cannot be brought back, tablet migrations are
stuck, until the node is explicitly marked as "permanently dead" /
"ignored node" / "excluded" (name differs in different contexts).

Currently, this is done during removenode and replace operations but
it should be possible to only mark the node as dead, for the purpose
of unblocking migrations or other topology operations, without doing
the actual removenode, because full removal might be currently
impossible, or not desirable due to lack of capacity or priorities.

This patch introduces this kind of API:

  nodetool excludenode <host-id> [ ... <host-id> ]

Having this kind of API is an improvement in user experience in
several cases. For example, when we lose a rack, the only viable
option for recovery is to run removenode with an extra
--ignore-dead-nodes option. This removenode will fail in the tablet
draining phase, as there is no live node in the rack to rebuild
replicas in. This is confusing to the operator. But necessary before
ALTER KEYSPACE can proceed in order to change replication options to
drop the rack from RF.

Having this API allows operators to have more unified procedures,
where "nodetool excludenode" is always the first step of recovery,
which unblocks further topology operations, both those which restore
capacity, but also auto-scaling, tablet split/merge, load balancing,
etc.

Fixes #21281
2025-10-31 09:03:20 +01:00
Avi Kivity
04a289cae6 Merge 'Auto expand to rack list' from Tomasz Grabiec
We want to move towards rack-list based replication factor for tablets being the default mode, and in the future the only supported mode. This PR is a step towards that. We auto-expand numeric RF to rack list on keyspace creation and ALTER when rf_rack_valid_keyspaces option is enabled.

The PR is mostly about adjusting tests. The main logic change is in the last patch, which modifies option post-processing in ks_prop_defs.

Fixes #26397

Closes scylladb/scylladb#26692

* github.com:scylladb/scylladb:
  cql3: ks_prop_defs: Expand numeric RF to rack list
  locator: Move rack_list to topology.hh
  alternator: Do not set RF for zero-token DCs
  alternator: Switch keyspace creation to use ks_prop_defs
  test: alternator: Adjust for rack lists
  cql3: Move validation of invalid ALTER KEYSPACE earlier, to ks_prop_defs
  test: cqlpy: Mark tests using rack lists as scylla-only
  test: Switch to rack-list based RF
  test: Generalize tests to work with both numeric RF and rack lists
  test: cluster: test_zero_token_nodes_multidc: Adjust to rack list RF
  test: Prepare for handling errors specific to rack list path
  test: cluster: dtest: alternator: Force RF=1 in test_putitem_contention
  test: Create cluster with multiple racks in multi-dc setups
  test: boost: network_topology_strategy_test: Adjust to rack-list RF
  test: tablets: Adjust to rack list
  test: cluster: test_group0_schema_versioning: Use smaller RF to respect rf-rack-validness
  test: tablets_test: Convert test_per_shard_goal_mixed_dc_rf to be rack-valid
  test: object_store: test_backup: Adjust for rack lists
  test: cluster: tablets: Do not move tablet across racks in test_tablet_transition_sanity
  test: cluster: mv: Do not move tablets across racks
  test: cluster: util: Fix docstring for parse_replication_options()
  tablets, topology_coordinator: Skip tablet draining on replace
2025-10-30 21:54:08 +02:00
Avi Kivity
c0222e4d3c Merge 'replica/table: do not stop major compaction when disabling auto compaction' from Lakshmi Narayanan Sreethar
When auto compaction is disabled, all ongoing compactions, including
major compactions, are stopped. However, major compactions should not be
stopped, since the disable request applies only to regular auto
compactions.

This PR fixes the issue by tagging major compaction tasks with a newly
introduced `compaction_type::Major` enum. Since
`table::disable_auto_compaction()` already requests the compaction
manager to stop only tasks of type `compaction_type::Compaction`, major
compactions will no longer be stopped.

Fixes #24501

PR improves how the compactions are stopped when a disable auto compaction request is executed.
No need to backport

Closes scylladb/scylladb#26288

* github.com:scylladb/scylladb:
  replica/table: do not stop major compaction when disabling auto compaction
  compaction/compaction_descriptor: introduce compaction_type::Major
2025-10-30 21:45:57 +02:00
Nikos Dragazis
a0bf932caa test/cluster: Add test for default SSTable compressor
The previous patch made the default compressor dependent on the
SSTABLE_COMPRESSION_DICTS feature:
* LZ4Compressor if the feature is disabled
* LZ4WithDictsCompressor if the feature is enabled

Add a test to verify that the cluster uses the right default in every
case.

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
2025-10-30 15:53:54 +02:00
Nikos Dragazis
2fc812a1b9 db/config: Change default SSTable compressor to LZ4WithDictsCompressor
`sstable_compression_user_table_options` allows configuring a
node-global SSTable compression algorithm for user tables via
scylla.yaml. The current default is `LZ4Compressor` (inherited from
Cassandra).

Make `LZ4WithDictsCompressor` the new default. Metrics from real datasets
in the field have shown significant improvements in compression ratios.

If the dictionary compression feature is not enabled in the cluster
(e.g., during an upgrade), fall back to the `LZ4Compressor`. Once the
feature is enabled, flip the default back to the dictionary compressor
using with a listener callback.

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
2025-10-30 15:53:49 +02:00
Pavel Emelyanov
395e275e03 Merge 'test/cluster/random_failures: Adjust to RF-rack-validity' from Dawid Mędrek
We adjust the test to RF-rack-validity and then re-enable
index random events, which requires the configuration option
`rf_rack_valid_keyspaces` to be enabled.

Fixes scylladb/scylladb#26422

Backport: I'd rather not backport these changes. They're almost a hack and poses too much risk for little gain.

Closes scylladb/scylladb#26591

* github.com:scylladb/scylladb:
  test/cluster/random_failures: Re-enable index events
  test/cluster/random_failures: Enable rf_rack_valid_keyspaces
  test/cluster/random_failures: Adjust to RF-rack-validity
2025-10-30 15:39:38 +03:00
Tomasz Grabiec
6cb14c7793 Revert "tests(lwt): new test for LWT testing during tablet resize"
This reverts commit 99dc31e71a.

The test is not stable due to #26801
2025-10-30 08:50:40 +01:00
Tomasz Grabiec
28f6bdc99b cql3: ks_prop_defs: Expand numeric RF to rack list
Auto-exands numeric RF in CREATE/ALTER KEYSPACE statements for
new DCs specified in the statement.

Doesn't auto-expand existing options, as the rack choice may not be in
line with current replica placement. This requires co-locating tablet
replicas, and tracking of co-location state, which is not implemented yet.

Signed-off-by: Tomasz Grabiec <tgrabiec@scylladb.com>
2025-10-29 23:32:59 +01:00
Tomasz Grabiec
35166809cb locator: Move rack_list to topology.hh
So that we can use it in locator/tablets.hh and avoid circular dependency
between that header and abstract_replication_strategy.hh
2025-10-29 23:32:58 +01:00
Tomasz Grabiec
f6dfea2fb1 alternator: Do not set RF for zero-token DCs
That will fail with tablets because it won't be able to allocate
replicas.
2025-10-29 23:32:58 +01:00
Tomasz Grabiec
21db21af7e alternator: Switch keyspace creation to use ks_prop_defs
So that we get the same validation and option post-processing as
during regular keyspace creation.
RF auto-expansion logic happens in ks_prop_defs, and we want that
for tablets.
2025-10-29 23:32:58 +01:00
Tomasz Grabiec
7f66f67d95 test: alternator: Adjust for rack lists
To achieve RF=3 with tablets and rf_rack_valid_keyspaces, we need 3
racks. So change the test to create 3 racks. Alternator was bypassing
standard keyspace creation path, so it escaped validation. But this
will change, and the test will stop wroking.

Also, after auto-expansion of RF to rack list, not all of 4 nodes
will host replicas. So need to adjust expectations.
2025-10-29 23:32:58 +01:00
Tomasz Grabiec
a88f70ce2c cql3: Move validation of invalid ALTER KEYSPACE earlier, to ks_prop_defs
Tests expect this failure in some scenarios, but later changes make us
fail ealier due to topology constraints.

As a rule, more general validation should come before more specific
validation. So syntax validation before topology validation.
2025-10-29 23:32:58 +01:00
Tomasz Grabiec
8e69c65124 test: cqlpy: Mark tests using rack lists as scylla-only
Those tests are intended to be also run against Cassandra, which
doesn't support rack lists.
2025-10-29 23:32:58 +01:00
Tomasz Grabiec
ba53f41f59 test: Switch to rack-list based RF
Have to do that before we enable auto-expansion of numeric RF to
rack-lists, because those tests alter the replication factor, and
altering from rack-list to numeric will not be allowed.
2025-10-29 23:32:58 +01:00
Tomasz Grabiec
d2e7d6fad2 test: Generalize tests to work with both numeric RF and rack lists 2025-10-29 23:32:58 +01:00
Tomasz Grabiec
aa05f0fad0 test: cluster: test_zero_token_nodes_multidc: Adjust to rack list RF
Two changes here:

1) Allocate nodes in dc2 in separeate racks to make the test stronger
- it invites bugs where RF==nr_racks succeeds despite there being
zero-token nodes, and not simply fail due to rack count.

2) Due to auto-expansion to rack list, scylla throws in keyspace
creation rather than table creation.
2025-10-29 23:32:58 +01:00
Benny Halevy
e8b9f13061 test: Prepare for handling errors specific to rack list path 2025-10-29 23:32:58 +01:00
Tomasz Grabiec
255f429a80 test: cluster: dtest: alternator: Force RF=1 in test_putitem_contention
With rf_rack_valid_keyspaces enabled, RF of alternator tables will be
equal to the number of racks (in this test: nodes). Prior to that, if
number of nodes is smaller than 3, alternator creates the keyspace
with RF=1. Turns out, with RF=2 the test fails with write timeouts due
to contention. Enforce RF=1 by creating the table with one node before
adding the second node.
2025-10-29 23:32:58 +01:00
Tomasz Grabiec
40e7543361 test: Create cluster with multiple racks in multi-dc setups
To allow auto-expansion of numeric RF to rack list. Otherwise,
keyspace creation will be rejected if rf-rack-valid keyspaces are
enforced.
2025-10-29 23:32:57 +01:00
Tomasz Grabiec
723622cf70 test: boost: network_topology_strategy_test: Adjust to rack-list RF 2025-10-29 23:32:57 +01:00
Tomasz Grabiec
19d0beff38 test: tablets: Adjust to rack list
test_decommission_rack_load_failure expects some tablets to land in
the rack which only has the decommissioning node. Since the table uses
RF=1, auto-expansion may choose the other rack and put all tablets
there, and the expected failure will not happen. Force placement by
using rack-list RF.
2025-10-29 23:32:57 +01:00
Tomasz Grabiec
7ccc2a3560 test: cluster: test_group0_schema_versioning: Use smaller RF to respect rf-rack-validness 2025-10-29 23:32:57 +01:00
Tomasz Grabiec
0f38f7185c test: tablets_test: Convert test_per_shard_goal_mixed_dc_rf to be rack-valid 2025-10-29 23:32:57 +01:00
Tomasz Grabiec
5962498983 test: object_store: test_backup: Adjust for rack lists
With rack lists, not all nodes in a rack will receive streams if RF=1.
Adjust expectations.
2025-10-29 23:32:57 +01:00
Tomasz Grabiec
3b8a3823db test: cluster: tablets: Do not move tablet across racks in test_tablet_transition_sanity
Choose old_replica and new_replica so that they're both in rack r1.

After later changes (rack list auto expansion), it's no longer
guaranteed that the first replica will be on r1.
2025-10-29 23:32:57 +01:00
Tomasz Grabiec
5bf7112fe6 test: cluster: mv: Do not move tablets across racks
It's illegal with rf-rack-valid keyspaces.
2025-10-29 23:32:57 +01:00
Tomasz Grabiec
e34548ccdb test: cluster: util: Fix docstring for parse_replication_options()
rack lists are now in replication_v2, which is also parsed with this
function.
2025-10-29 23:32:57 +01:00
Tomasz Grabiec
288e75fe22 tablets, topology_coordinator: Skip tablet draining on replace
Replace doesn't drain (rebuild) tablets during topology change. They
are rebuilt afterwards when the replaced node is in "left" state and
replacing node is in normal state. So there is no point in attempting
to drain, as nothing will be drained.

Not only that, doing so has a risk, because the load balancer is
invoked on a transitional topology state in which we can end up with
no normal nodes in a rack. That's the case if the replaced node was
the last one in the rack. This tripped one of the algorithms which
computes rack's shard count for the purpose of determining ideal
tablet count, it was not prepared to find an empty rack to which a
table is still repliacated. That was fixed separately, but to avoid
this, we better skip tablet draining here.
2025-10-29 23:32:57 +01:00
Nikos Dragazis
96e727d7b9 db/config: Deprecate sstable_compression_dictionaries_allow_in_ddl
The option is a knob that allows to reject dictionary-aware compressors
in the validation stage of CREATE/ALTER statements, and in the
validation of `sstable_compression_user_table_options`. It was
introduced in 7d26d3c7cb to allow the admins of Scylla Cloud to
selectively enable it in certain clusters. For more details, check:
https://github.com/scylladb/scylla-enterprise/issues/5435

As of this series, we want to start offering dictionary compression as
the default option in all clusters, i.e., treat it as a generally
available feature. This makes the knob redundant.

Additionally, making dictionary compression the default choice in
`sstable_compression_user_table_options` creates an awkward dependency
with the knob (disabling the knob should cause
`sstable_compression_user_table_options` to fall back to a non-dict
compressor as default). That may not be very clear to the end user.

For these reasons, mark the option as "Deprecated", remove all relevant
tests, and adjust the business logic as if dictionary compression is
always available.

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
2025-10-29 20:13:08 +02:00
Nadav Har'El
aa34f0b875 alternator: fix CDC events for TTL expiration
In commit a3ec6c7d1d we supposedly
implemented the feature of telling TTL experation events from regular
user-sent deletions. However, that implementation did not actually work
at all... It had two bugs:

 1. It created an null rjson::value() instead of an empty dictionary
    rjson::empty_object(), so GetRecords failed every time such a
    TTL expiration event was generated.
 2. In the output, it used lowercase field names "type" and "principalId"
    instead of the uppercase "Type" and "PrincipalId". This is not the
    correct capitalization, and when boto3 recieves such incorrect
    fields it silently deletes them and never passes them to the user's
    get_records() call.

This patch fixes those two bugs, and importantly - enables a test for
this feature. We did already have such a test but it was marked as
"veryslow" so doesn't run in CI and apparently not even run once to
check the new feature. This test is not actually very long on Alternator
when the TTL period is set very low (as we do in our tests), so I replaced
the "veryslow" marker by "waits_for_expiration". The latter marker means
that the test is still very slow - as much as half an hour - on DynamoDB -
but runs quickly on Scylla in our test setup, and enabled in CI by
default.

The enabled test failed badly before this patch (a server error during
GetRecords), and passes with this patch.

Also, the aforementioned commit forgot to remove the paragraph in
Alternator's compatibility.md that claims we don't have that feature yet.
So we do it now.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#26633
2025-10-29 17:08:20 +01:00
Piotr Wieczorek
2812e67f47 cdc: Emit a preimage for non-clustered tables
Until this patch, CDC haven't fetched a preimage for mutations
containing only a partition tombstone. Therefore, single-row deletions
in a table witout a clustering key didn't include a preimage, which was
inconsistent with single-row clustered deletions. This commit addresses
this inconsistency.

Second reason is compatibility with DynamoDB Streams, which doesn't
support entire-partition deletes. Alternator uses partition tombstones
for single-row deletions, though, and in these cases the 'OldImage' was
missing from REMOVE records.

Fixes https://github.com/scylladb/scylladb/issues/26382

Closes scylladb/scylladb#26578
2025-10-29 17:54:58 +02:00
Nadav Har'El
29ed1f3de7 Merge 'cql3: Refactor vector search select impl into a dedicated class' from Karol Nowacki
cql3: Refactor vector search select impl into a dedicated class

The motivation for this change is crash fixed in https://github.com/scylladb/scylladb/pull/25500.

This commit refactors how ANN ordered select statements are handled to prevent a potential null pointer dereference and improve code organization.

Previously, vector search selects were managed by `indexed_table_select_statement`, which unconditionally dereferenced a `view_ptr`. This assumption is invalid for vector search indexes where no view exists, creating a risk of crashes.

To address this, the refactoring introduces the following changes:

- A new `vector_indexed_table_select_statement` class is created to specifically handle ANN-ordered selects. This class operates without a view_ptr, resolving the null pointer risk.
-  The `indexed_table_select_statement` is renamed to `view_indexed_table_select_statement` to more accurately reflect its function with view-based indexes.
- An assertion has been added to `indexed_table_select_statement` constructor to ensure view_ptr is not null, preventing similar issues in the future.

Fixes: VECTOR-162

No backport is needed, as this is refactoring.

Closes scylladb/scylladb#25798

* github.com:scylladb/scylladb:
  cql3: Rename indexed_table_select_statement
  cql3: Move vector search select to dedicated class
2025-10-29 17:49:24 +02:00
Lakshmi Narayanan Sreethar
7eac18229c replica/table: do not stop major compaction when disabling auto compaction
When auto compaction is disabled, all ongoing compactions, including
major compactions, are stopped. However, major compactions should not be
stopped, since the disable request applies only to regular auto
compactions.

This patch fixes the issue by tagging major compaction tasks with the
newly introduced `compaction_type::MajorCompaction`. Since
`table::disable_auto_compaction()` already requests the compaction
manager to stop only tasks of type `compaction_type::Compaction`, major
compactions will no longer be stopped.

Fixes #24501

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2025-10-29 19:22:07 +05:30
Lakshmi Narayanan Sreethar
4d442f48db compaction/compaction_descriptor: introduce compaction_type::Major
Introduce a new compaction_type enum : `Major`.
This type will be used by the next patches to differentiate between
major compaction and regular compaction (compaction_type::Compaction).

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2025-10-29 19:21:53 +05:30
Nikos Dragazis
d95ebe7058 boost/cql_query_test: Get expected compressor from config
Since 5b6570be52, the default SSTable compression algorithm for user
tables is no longer hardcoded; it can be configured via the
`sstable_compression_user_table_options.sstable_compression` option in
scylla.yaml.

Modify the `test_table_compression` test to get the expected value from
the configuration.

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
2025-10-29 14:52:43 +02:00
Piotr Dulikowski
aba922ea65 Merge 'cdc: improve cdc metadata loading' from Michael Litvak
when loading CDC streams metadata for tablets from the tables, read only
new entries from the history table instead of reading all entries. This
improves the CDC metadata reloading, making it more efficient and
predictable.

the CDC metadata is loaded as part of group0 reload whenever the
internal CDC tables are modified. on tablet split / merge, we create a
new CDC timestamp and streams by writing them to the cdc_streams_history
table by group0 operation, and when it's applied we reload the in-memory
CDC streams map by reading from the tables and constructing the updated map.

Previously, on every update, we would read the entire
cdc_streams_history entries for the changed table, constructing all its
streams and creating a new map from scratch.

We improve this now by reading only new entries from cdc_streams_history
and append them to the existing map. we can do this because we only
append new entries to cdc_streams_history with higher timestamp than all
previous entries.

This makes this reloading more efficient and predictable, because
previously we would read a number of entries that depends on the number
of tablets splits and merges, which increases over time and is
unbounded, whereas now we read only a single stream set on each update.

Fixes https://github.com/scylladb/scylladb/issues/26732

backport to 2025.4 where cdc with tablets is introduced

Closes scylladb/scylladb#26160

* github.com:scylladb/scylladb:
  test: cdc: extend cdc with tablets tests
  cdc: improve cdc metadata loading
2025-10-29 11:07:48 +01:00