Commit Graph

1157 Commits

Author SHA1 Message Date
Botond Dénes
9d08a380db Merge 'Fix getendpoints command for compound keys containing ':'' from Taras Veretilnyk
Previously, `nodetool getendpoints` expected the key as a single string with components separated by `:` (for example, 1:val:ue). This caused errors if any part of the key contained a colon, because it was unclear whether a colon was a separator or part of the key.

This change adds a new API endpoint, `/storage_service/natural_endpoints/v2/{keyspace}`, which accepts composite partition keys as multiple key_component query parameters (e.g., ?key_component=1&key_component=val:ue). The `nodetool getendpoints` command was updated to support a new `--key-components` option, allowing users to pass key components as an array. The client and test infrastructure were extended to support multiple values for a query parameter, and tests were added to verify correct behavior with composite keys.

The previous method of passing partition keys as colon-separated strings is preserved for backward compatibility.
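
As a rough illustration of the v2 request shape described above, the sketch below builds a query string that repeats `key_component` once per key part. The base URL (`http://localhost:10000`) and the helper name `natural_endpoints_v2_url` are assumptions made for illustration; only the path and the parameter name come from this description, and any other parameters the endpoint may need are left out.

```python
from urllib.parse import quote, urlencode


def natural_endpoints_v2_url(base_url, keyspace, key_components):
    """Build a v2 URL, repeating key_component once per key part (hypothetical helper)."""
    path = f"/storage_service/natural_endpoints/v2/{quote(keyspace, safe='')}"
    # A list of pairs keeps the component order and lets the same name repeat.
    query = urlencode([("key_component", c) for c in key_components])
    return f"{base_url}{path}?{query}"


# The colon inside "val:ue" is percent-encoded, so it cannot be mistaken for a separator.
url = natural_endpoints_v2_url("http://localhost:10000", "ks", ["1", "val:ue"])
print(url)
```

Because each component travels as its own parameter, no separator character needs to be reserved inside the key values.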

Backport is not required, since this change relies on recent Seastar updates

Fixes #16596

Closes scylladb/scylladb#26169

* github.com:scylladb/scylladb:
  docs: document --key-components option for getendpoints
  test/nodetool/test_getendpoints: add coverage for --key-components param in getendpoints
  nodetool: Introduce new option --key-components to specify compound partition keys as array
  rest_api/test_storage_service: add v2 natural_endpoints test for composite key with multiple components
  api/storage_service: add GET 'natural_endpoints' v2 to support composite keys with ':'
  rest_api_mock: support duplicate query parameters
  test/rest_api: support multiple query values per key in RestApiSession.send()
  nodetool: add support of new seastar query_parameters_type to scylla_rest_client
2025-10-02 09:04:40 +03:00
Taras Veretilnyk
78888dd76c nodetool: Introduce new option --key-components to specify compound partition keys as array
Allows getendpoints to accept components of partition key using the --key-components option.
Key components are passed as an array and sent to the new /natural_endpoints/v2/{keyspace} endpoint.
2025-10-01 15:53:25 +02:00
Taras Veretilnyk
53883958d6 nodetool: add support of new seastar query_parameters_type to scylla_rest_client 2025-10-01 15:52:18 +02:00
Emil Maskovsky
b0de054439 docs: fix typos and spelling errors
Corrected spelling mistakes, typos, and minor wording issues to improve
the developer documentation.

No backport: There is no functional change, and the doc is mostly
relevant to master, so it doesn't need to be backported.

Closes scylladb/scylladb#26332
2025-09-30 13:16:49 +02:00
Michał Chojnowski
aed1cb6f65 tools/scylla-sstable: add --sstable-version=? to scylla sstable write
A useful option in general, and I'll need it to test multiple versions
in `test_sstable_validation.py`.
2025-09-29 22:15:25 +02:00
Michał Chojnowski
ef11dc57c1 db/config: expose "ms" format to the users via database config
Extend the `sstable_format` config enum with a "ms" value,
and, if it's enabled (in the config and in cluster features),
use it for new sstables on the node.

(Before this commit, writing `ms` sstables should only be possible
in unit tests, via internal APIs. After this commit, the format
can be enabled in the config and the database will write it during
normal operation).

As of this commit, the new format is not the default yet.
(But it will become the default in a later commit in the same series).
2025-09-29 22:15:25 +02:00
Michał Chojnowski
db4283b542 sstables: introduce ms sstable format version
Introduce `ms` -- a new sstable format version which
is a hybrid of Cassandra's `me` and `da`.

It is based on `me`, but with the index components
(Summary.db and Index.db) replaced with the index
components of `da` (Partitions.db and Rows.db).

As of this patch, the version is never chosen
anywhere for writing sstables yet. It is only introduced.
We will add it to unit tests in a later commit,
and expose it to users in yet later commit.
2025-09-29 22:15:24 +02:00
Michał Chojnowski
17085dc1e4 tools/scylla-sstable: default to "preferred" sstable version, not "highest"
Later in this patch series we will introduce `ms` as the new highest
format, but we won't be able to make it the default within the same
series due to some dtest incompatibilities.

Until `ms` is the default, we don't want `scylla sstable` to default to
it, even though it's the highest. Let's choose the default
version in `scylla sstable` using the same method which is
used by Scylla in general: by letting the `sstable_manager` choose.
2025-09-29 22:13:59 +02:00
Botond Dénes
34cc7aafae tools/scylla-sstable: introduce the upgrade command
An offline, scylla-sstable variant of nodetool upgradesstables command.
Applies latest (or selected) sstable version and latest schema.

Closes scylladb/scylladb#26109
2025-09-27 16:53:14 +03:00
Avi Kivity
0f4363cc8d Merge 'sstable: add more complete schema to scylla component' from Botond Dénes
Sstables store a basic schema in the statistics component. The scylla-sstable tool uses this to be able to read and dump sstables in a self-contained manner, without requiring an external schema source.
The problem is that the schema stored in the statistics component is incomplete: it doesn't store column names for key columns, so these have placeholder names in dump outputs where column names are visible.
This is not a disaster but it is confusing and it can cause errors in scripts which want to check the content of sstables, while also knowing the schema and expecting the proper names for key columns.

To make sstables truly self-contained w.r.t. the schema, add a complete schema to the scylla component. This schema contains the names and types of all columns, as well as some basic information about the schema: keyspace name, table name, id and version.
When available, scylla-sstable's schema loader will use this new more complete schema and fall-back to the old method of loading the (incomplete) schema from the statistics component otherwise.
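
The fall-back order described above can be sketched as follows. This is a minimal illustration, not the tool's actual loader: the dictionary layout, the `key_types` field and the `pk0`, `pk1` placeholder names are all assumptions.

```python
def load_schema(scylla_component, statistics_component):
    """Return (schema, source), preferring the complete schema when present.

    Both arguments are plain dicts standing in for parsed sstable components
    (hypothetical layout).
    """
    complete = scylla_component.get("schema")
    if complete is not None:
        return complete, "scylla"
    # Fall back: the statistics schema has key column types but no names,
    # so placeholder names are generated (naming scheme is an assumption).
    key_types = statistics_component["key_types"]
    columns = [{"name": f"pk{i}", "type": t} for i, t in enumerate(key_types)]
    return {"columns": columns}, "statistics"


new_style = {"schema": {"columns": [{"name": "user_id", "type": "int"}]}}
old_style = {}
stats = {"key_types": ["int"]}

print(load_schema(new_style, stats))
print(load_schema(old_style, stats))
```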

New feature, no backport required.

Closes scylladb/scylladb#24187

* github.com:scylladb/scylladb:
  test/boost/schema_loader_test: add specific test with interesting types
  test/lib/random_schema: add random_schema(schema_ptr) constructor
  test/boost/schema_loader_test: test_load_schema_from_sstable: add fall-back test
  tools/schema_loader: add support for loading from scylla-metadata
  tools/schema_loader: extract code which loads schema from statistics
  sstables: scylla_metadata: add schema member
2025-09-26 00:21:17 +03:00
Botond Dénes
1999d8e3d3 compaction: remove using namespace {compaction,sstables}
Some files in compaction/ have using namespace {compaction,sstables}
clauses, some even in headers. This is considered bad practice and
muddies the namespace use. Remove them.
2025-09-25 15:03:57 +03:00
Botond Dénes
86ed627fc4 compaction: move code to namespace compaction
The namespace usage in this directory is very inconsistent, with files
and classes scattered in:
* global namespace
* namespace compaction
* namespace sstables

There are cases where all three are used in the same file. This code used to
live in sstables/ and some of it still retains namespace sstables as a
heritage of that time. The mismatch between the dir (future module) and
the namespace used is confusing, so finish the migration and move all
code in compaction/ to namespace compaction too.

This patch, although large, is mechanical, and only the following kinds of
changes are made:
* replace namespace sstable {} with namespace compaction {}
* add namespace compaction {}
* drop/add sstables::
* drop/add compaction::
* move around forward-declarations so they are in the correct namespace
  context

This refactoring revealed some awkward leftover coupling between
sstables and compaction, in sstables/sstable_set.cc, where the
make_sstable_set() methods of compaction strategies are implemented.
2025-09-25 15:03:56 +03:00
Botond Dénes
b85d858f6d tools/schema_loader: add support for loading from scylla-metadata
When available, load the schema from the Scylla component, where the
column names of keys are also available. Fall-back to loading the schema
from the Statistics component otherwise (previous behaviour).
2025-09-25 11:28:34 +03:00
Botond Dénes
ace2ba06c3 tools/schema_loader: extract code which loads schema from statistics
Soon there will be an alternative method too: load from scylla-metadata.
2025-09-25 11:28:34 +03:00
Botond Dénes
234f905fa4 sstables: scylla_metadata: add schema member
To store the most important schema fields, like id, version, keyspace
name, table name and the list of all columns, along with their kind,
name and type. This will serve as alternative schema source to the one
stored in statistics component. The latter doesn't store any of the
metadata, nor the primary key column names (just the types), so it
leads to confusion when it is used as schema source for scylla-sstable.

This new schema stored in the scylla-metadata component is not intended
to be a full-schema, equivalent to the one stored in the schema tables,
it is intended to be good enough for scylla-sstable to be able to parse
sstables in a self-sufficient manner.
2025-09-25 11:28:34 +03:00
Pavel Emelyanov
8f815de1e0 Merge 'treewide: move away from accessing httpd::request::query_parameters' from Botond Dénes
Accessing this member directly is deprecated, migrate code to use {get,set}_query_param() and friends instead.

Fixes: https://github.com/scylladb/scylladb/issues/26023

Preparation for seastar update, no backport required.

Closes scylladb/scylladb#26024

* github.com:scylladb/scylladb:
  treewide: move away from accessing httpd::request::query_parameters
  test/pylib/s3_server_mock.py: better handle empty query params
2025-09-25 11:05:50 +03:00
Botond Dénes
1ac7b4c35e treewide: move away from accessing httpd::request::query_parameters
Accessing this member directly is deprecated, migrate code to use
{get,set}_query_param() and friends instead.

Fixes: https://github.com/scylladb/scylladb/issues/26023
2025-09-24 11:52:15 +03:00
Michał Chojnowski
b76716c8aa tools/schema_loader: disable tablet-related restrictions in the placeholder keyspace
Passing `0` as the `initial_tablets` argument causes `schema_loader`'s
placeholder keyspace to be a tablet keyspace.

This causes `scylla sstable` to reject some table schemas which
are legitimate in this context. For example, `scylla sstable`
refuses to work with sstables which contain `counter` columns,
because tablets don't support counters.

This is undesirable. Let's make `schema_loader`'s keyspace
a non-tablet keyspace.

Closes scylladb/scylladb#26192
2025-09-24 06:55:28 +03:00
Łukasz Paszkowski
5089ffe06f tools: toolchain: add e2fsprogs, fuse3 to the dependencies
The packages contain filesystem utilities to create volumes such
that sudo/unshare are not required.

Closes #26135

[avi: regenerate frozen toolchain with optimized clang from
  https://devpkg.scylladb.com/clang/clang-20.1.8-Fedora-42-aarch64.tar.gz
  https://devpkg.scylladb.com/clang/clang-20.1.8-Fedora-42-x86_64.tar.gz
]

Closes scylladb/scylladb#26165
2025-09-23 18:49:37 +03:00
Pavel Emelyanov
ce8dd798a2 Merge 'tools/scylla-sstable-scripts: introduce purgeable.lua and writetime-histogram.lua' from Botond Dénes
`purgeable.lua` was written for a specific investigation a few years ago.
`writetime-histogram.lua` is an sstable script transcription of the former scylla-sstable writetime-histogram command. This was also written for an investigation (before script command existed) and is too specific to be a native command, so was removed by edaf67edcb.

Add both scripts to the sample script library, they can be useful, either for a future investigation, or as samples to copy+edit to write new scripts (and train AI).

New sstable scripts, no backport

Closes scylladb/scylladb#26137

* github.com:scylladb/scylladb:
  tools/scylla-sstable-scripts: introduce writetime-histogram.lua
  tools/scylla-sstable-scripts: introduce purgable.lua
2025-09-22 15:27:49 +03:00
Botond Dénes
92f614dc5a tools/scylla-sstable-scripts: introduce writetime-histogram.lua
Produces a histogram with the writetime (timestamp) of the data in the
sstable(s). The histogram is printed to the output, along with general
stats about the processed data.
2025-09-19 11:54:01 +03:00
Botond Dénes
d298da5410 tools/scylla-sstable-scripts: introduce purgable.lua
Collects and prints statistics about how much data is purgeable in an
sstable. Works only with tombstone_gc = {'mode': 'timeout'};
Can help diagnose the efficiency (or lack thereof) of tombstone-gc.
2025-09-19 11:53:30 +03:00
Botond Dénes
37e46f674d Merge 'nodetool: ignore repair request error of colocated tables' from Michael Litvak
When cluster repair is run for an entire keyspace, nodetool makes a
repair API request for each table.

If the keyspace contains colocated tables, then the API request for the
colocated tables will fail, because currently Scylla doesn't allow making
repair requests for specific colocated tables, but only for base tables.

If the request is to repair an entire keyspace then we can ignore this,
because we will make a repair request for all base tables, and this in
turn will also repair all the colocated tables in the keyspace.

However, if specific tables are requested and some of them are colocated
then we should propagate the error to let the user know the request is
invalid.
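
The decision described above (ignore the error for a whole-keyspace repair, propagate it when specific tables were requested) can be sketched as below. This is only an illustration: `ColocatedTableError`, `repair_tables` and the `request_repair` callback are hypothetical names, not nodetool's actual code.

```python
class ColocatedTableError(Exception):
    """Hypothetical stand-in for the API error returned for a colocated table."""


def repair_tables(request_repair, all_tables, requested_tables=None):
    """Issue one repair request per table and return the tables that were skipped."""
    whole_keyspace = not requested_tables
    tables = all_tables if whole_keyspace else requested_tables
    skipped = []
    for table in tables:
        try:
            request_repair(table)
        except ColocatedTableError:
            if not whole_keyspace:
                raise  # the user asked for this table explicitly: report the error
            skipped.append(table)  # repaired implicitly through its base table
    return skipped


def fake_request(table):
    # Pretend that "colocated_tbl" is colocated with "base_tbl".
    if table == "colocated_tbl":
        raise ColocatedTableError(table)


print(repair_tables(fake_request, ["base_tbl", "colocated_tbl"]))
```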

Refs https://github.com/scylladb/scylladb/issues/24816

no backport - no colocated tablets in previous releases

Closes scylladb/scylladb#26051

* github.com:scylladb/scylladb:
  nodetool: ignore repair request error of colocated tables
  storage_service: improve error message on repair of colocated tables
2025-09-19 06:44:23 +03:00
Karol Nowacki
b5f3f2f4c5 tools: Fix missing source file in CMake target
The `json_mutation_stream_parser.cc` file was not included in the
`scylla-tools` CMake target. This could lead to "undefined reference"
linker errors when building with CMake.

This commit adds the missing source file to the target's source list.

Closes scylladb/scylladb#26108
2025-09-18 19:44:53 +03:00
Botond Dénes
edaf67edcb tools/scylla-sstable: remove writetime-histogram command
This command was written for an investigation and was used exactly once.
This would have been a perfect candidate for the (also rarely used)
scylla-sstable script command, but it didn't exist yet.
Drop this command from the tool, such super-specific commands should be
written as sstable-scripts nowadays, which is what we will do if we ever
need this again.

Closes scylladb/scylladb#26062
2025-09-18 12:05:54 +03:00
Michael Litvak
aae91330b0 nodetool: ignore repair request error of colocated tables
When cluster repair is run for an entire keyspace, nodetool makes a
repair API request for each table.

If the keyspace contains colocated tables, then the API request for the
colocated tables will fail, because currently Scylla doesn't allow making
repair requests for specific colocated tables, but only for base tables.

If the request is to repair an entire keyspace then we can ignore this,
because we will make a repair request for all base tables, and this in
turn will also repair all the colocated tables in the keyspace.

However, if specific tables are requested and some of them are colocated
then we should propagate the error to let the user know the request is
invalid.

Refs scylladb/scylladb#24816
2025-09-18 09:35:53 +02:00
Avi Kivity
3acfc577d8 Merge 'tools/scylla-sstable: extract json mutation stream parser into own hh,cc' from Botond Dénes
tools/scylla-sstable.cc has 3.5k SLOC, out of which this class alone is 1K. Extract into own hh and cc. Since this class was already using pimpl, the header remains nice and small.

Code cleanup, no backport needed.

Closes scylladb/scylladb#26064

* github.com:scylladb/scylladb:
  tools: extract json_mutation_stream_parser to its own hh,cc files
  tools/scylla-sstable: fix indentation
  tools/scylla-sstable: prepare for extracting json_mutation_stream_parser
2025-09-17 18:30:30 +03:00
Botond Dénes
2fa0f82910 tools: extract json_mutation_stream_parser to its own hh,cc files
tools/scylla-sstable.cc has 3.5k SLOC, out of which this class alone is
1K. Extract into own hh and cc, just a copy-paste after the preparation
commit.
2025-09-17 12:18:07 +03:00
Botond Dénes
ffe8918522 tools/scylla-sstable: fix indentation
Left broken by previous patch.
2025-09-17 12:16:22 +03:00
Botond Dénes
8c36a983cc tools/scylla-sstable: prepare for extracting json_mutation_stream_parser
Make methods out-of-line, so class declaration stands on its own,
without definition of impl.
Move auxiliary structures, used only by impl, out of the class scope.
Move parser to tools namespace, and auxiliaries to anonymous namespace
within the tools one.
Pass down logger ref to parser impl and below, to prepare for sst_log
not being available in scope.
Add comment to parser class explaining what it does.
2025-09-17 12:16:21 +03:00
Asias He
54162a026f scylla-nodetool: Add --incremental-mode option to cluster repair
The `--incremental-mode` option specifies the incremental repair mode.
Can be 'disabled', 'regular', or 'full'.

'regular': The incremental repair logic is enabled. Unrepaired sstables
will be included for repair.  Repaired sstables will be skipped. The
incremental repair states will be updated after repair.

'full': The incremental repair logic is enabled. Both repaired and
unrepaired sstables will be included for repair. The incremental repair
states will be updated after repair.

'disabled': The incremental repair logic is disabled completely. The
incremental repair states, e.g., repaired_at in sstables and
sstables_repaired_at in the system.tablets table, will not be updated
after repair.

When the option is not provided, it defaults to regular.
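
The three modes can be summarized with a small sketch. It models each sstable as a `(name, is_repaired)` pair and returns which sstables take part in the repair and whether the incremental repair state is updated afterwards. The function name and data layout are hypothetical, and the assumption that 'disabled' includes every sstable (as a non-incremental repair would) is an inference from the description, not something stated in it.

```python
def select_for_repair(sstables, mode="regular"):
    """Return (names of sstables to repair, whether repair state is updated)."""
    if mode == "regular":
        # Only unrepaired sstables take part; state is updated afterwards.
        return [name for name, repaired in sstables if not repaired], True
    if mode == "full":
        # Repaired and unrepaired sstables take part; state is updated afterwards.
        return [name for name, _ in sstables], True
    if mode == "disabled":
        # Assumption: everything is repaired, and no repair state is touched.
        return [name for name, _ in sstables], False
    raise ValueError(f"unknown incremental mode: {mode}")


tables = [("sst-1", True), ("sst-2", False)]
for mode in ("regular", "full", "disabled"):
    print(mode, select_for_repair(tables, mode))
```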

Fixes #25931

Closes scylladb/scylladb#25969
2025-09-16 10:23:22 +03:00
Yaron Kaikov
902d139c80 tools: toolchain: dbuild: add setuptools_scm as dependency
This package was added as a dependency to `cqlsh` in 216d8b0658

Fixes: https://github.com/scylladb/scylladb/issues/25613

[Yaron: regenerate frozen toolchain with optimized clang from
	https://devpkg.scylladb.com/clang/clang-20.1.8-Fedora-42-aarch64.tar.gz
	https://devpkg.scylladb.com/clang/clang-20.1.8-Fedora-42-x86_64.tar.gz
]

Closes scylladb/scylladb#25932
2025-09-11 08:51:28 +03:00
Botond Dénes
514f59d157 tools/scylla-sstable: write: move to UUID generation
We are moving away from integer generations, so stop using them.
Also drop the --generation command-line parameter, UUID generations
don't have be provided by the caller, because random UUIDs will not
collide with each other. To help the caller still know what generation
the output sstable has (previously they provided it via --generation),
print the generation to stdout.

Closes scylladb/scylladb#25166
2025-09-10 13:47:26 +03:00
Andrei Chekun
ea4cd431c9 test.py: add pytest-sugar plugin to the dependencies
This plugin allows having better terminal output with progress bar for
the tests.

Closes scylladb/scylladb#25845

[avi: regenerate frozen toolchain]

Closes scylladb/scylladb#25860
2025-09-08 20:50:02 +03:00
Dawid Mędrek
bb0255b2fb tools/scylla-sstable: Enable rf_rack_valid_keyspaces
Enabling the configuration option should have no negative impact on how the tool
behaves. There is no topology and we do not create any keyspaces (except for
trivial ones using `SimpleStrategy` and RF=1), only their metadata. Thanks to
that, we don't go through validation logic that could fail in presence of an
RF-rack-invalid keyspace.

On the other hand, enabling `rf_rack_valid_keyspaces` lets the tool access code
hidden behind that option. While that might not be of any consequence right now,
in the future it might be crucial (for instance, see: scylladb/scylladb#23030).

Note that other tools don't need an adjustment:

* scylla-types: it uses schema_builder, but it doesn't reuse any other
  relevant part of Scylla.
* nodetool: it manages Scylla instances but is not an instance itself, and it
  does not reuse any codepaths.
* local-file-key-generator: it has nothing to do with Scylla's logic.

Other files in the `tools` directory are auxiliary and are instructed with an
already created instance of `db::config`. Hence, no need to modify them either.

Fixes scylladb/scylladb#25792

Closes scylladb/scylladb#25794
2025-09-08 11:52:43 +03:00
Pavel Emelyanov
b26816f80d s3: Export memory usage gauge (metrics)
The memory usage is tracked with the help of a semaphore, so just export
its "consumed" units.

One tricky place here is the need to skip metrics registration for
scylla-sstable tool. The thing is that the tool starts the storage
manager and sstables manager on start and then some of tool's operations
may want to start both managers again (via cql environment) causing
double metrics registration exception.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#25769
2025-09-05 18:25:34 +03:00
Radosław Cybulski
c242234552 Revert "build: add precompiled headers to CMakeLists.txt"
This reverts commit 01bb7b629a.

Closes scylladb/scylladb#25735
2025-09-03 09:46:00 +03:00
Radosław Cybulski
01bb7b629a build: add precompiled headers to CMakeLists.txt
Add precompiled header support to CMakeLists.txt and configure.py -
it improves compilation time by approximately 10%.

New header `stdafx.hh` is added, don't include it manually -
the compiler will include it for you. The header contains includes from
external libraries used by Scylla - seastar, standard library,
linux headers and zlib.

The feature is enabled by default, use CMake option `Scylla_USE_PRECOMPILED_HEADER`
or configure.py --disable-precompiled-header to disable.

The feature should be disabled, when trying to check headers - otherwise
you might get false negatives on missing includes from seastar / abseil and so on.

Note: following configuration needs to be added to ccache.conf:

    sloppiness = pch_defines,time_macros

Closes #25182
2025-08-27 21:37:54 +03:00
Avi Kivity
352cda4467 treewide: avoid including gms/feature_service.hh from headers
To avoid dependency proliferation, switch to forward declarations.

In one case, we introduce indirection via std::unique_ptr and
deinline the constructor and destructor.

Ref #1

Closes scylladb/scylladb#25584
2025-08-20 10:30:27 +03:00
Avi Kivity
611918056a Merge 'repair: Add tablet incremental repair support' from Asias He
The central idea of incremental repair is to allow repair participants
to select and repair only a portion of the dataset to speed up the
repair process. All repair participants must utilize an identical
selection method to repair and synchronize the same selected dataset.
There are two primary selection methods: time-based and file-based. The
time-based method selects data within a specified time frame. It is
versatile but it is less efficient because it requires reading all of
the dataset and omitting data beyond the time frame. The file-based
method selects data from unrepaired SSTables and is more efficient
because it allows the entire SSTable to be omitted. This patch
implements the file-based selection method.

Incremental repair will only be supported for tablet tables; it will not
be supported for vnode tables. On one hand, the legacy vnode is less
important to support. On the other hand, the incremental repair for
vnode is much harder to implement. With vnodes, an SSTable could contain
data for multiple vnode ranges. When a given vnode range is repaired,
only a portion of the SSTable is repaired. This complicates the
manipulation of SSTables significantly during both repair and
compaction. With tablets, an entire tablet is repaired so that a
sstable is either fully repaired or not repaired which is a huge
simplification.

This patch uses the repaired_at from sstables::statistics component to
mark a sstable as repaired. It uses a virtual clock as the repair
timestamp, i.e., using a monotonically increasing number for the
repaired_at field of a SSTable and sstables_repaired_at column in
system.tablets table. Notice that when a sstable is not repaired, the
repaired_at field will be set to the default value 0. The
being_repaired in memory field of a SSTable is used to explicitly mark
that a SSTable is being selected. The following variables are used for
incremental repair:

The repaired_at on disk field of a SSTable is used.
   - A 64-bit number increases sequentially

The sstables_repaired_at is added to the system.tablets table.
   - repaired_at <= sstables_repaired_at means the sstable is repaired

The being_repaired in memory field of a SSTable is added.
   - A repair UUID tells which sstable has participated in the repair
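
A minimal sketch of the repaired/unrepaired check built from the fields above. The rule `repaired_at <= sstables_repaired_at` comes from the description; treating the default value 0 as "never repaired" before applying that rule is an assumption added here so that fresh sstables are not classified as repaired. The helper names are hypothetical.

```python
def is_repaired(repaired_at, sstables_repaired_at):
    """Decide whether an sstable counts as repaired for its tablet."""
    # Assumption: 0 is the default for an sstable that was never repaired.
    if repaired_at == 0:
        return False
    return repaired_at <= sstables_repaired_at


def split_by_repair_state(sstables, sstables_repaired_at):
    """Split {name: repaired_at} into (repaired, unrepaired) name lists."""
    repaired, unrepaired = [], []
    for name, repaired_at in sstables.items():
        target = repaired if is_repaired(repaired_at, sstables_repaired_at) else unrepaired
        target.append(name)
    return repaired, unrepaired


# sst-c carries a newer repaired_at than the tablet has recorded, so it is
# still treated as unrepaired.
sstables = {"sst-a": 0, "sst-b": 3, "sst-c": 5}
print(split_by_repair_state(sstables, sstables_repaired_at=4))
```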

Initial test results:

    1) Medium dataset results
    Node amount: 3
    Instance type: i4i.2xlarge
    Disk usage per node: ~500GB
    Cluster pre-populated with ~500GB of data before starting repairs job.
    Results for Repair Timings:
    The regular repair run took 210 mins.
    Incremental repair 1st run took 183 mins, 2nd and 3rd runs took around 48s
    The speedup is: 183 mins  / 48s = 228X

    2) Small dataset results
    Node amount: 3
    Instance type: i4i.2xlarge
    Disk usage per node: ~167GB
    Cluster pre-populated with ~167GB of data before starting the repairs job.
    Regular repair 1st run took 110s,  2nd and 3rd runs took 110s.
    Incremental repair 1st run took 110 seconds, 2nd and 3rd run took 1.5 seconds.
    The speedup is: 110s / 1.5s = 73X

    3) Large dataset results
    Node amount: 6
    Instance type: i4i.2xlarge, 3 racks
    50% of base load, 50% read/write
    Dataset == Sum of data on each node

    Dataset     Non-incremental repair (minutes)
    1.3 TiB     31:07
    3.5 TiB     25:10
    5.0 TiB     19:03
    6.3 TiB     31:42

    Dataset     Incremental repair (minutes)
    1.3 TiB     24:32
    3.0 TiB     13:06
    4.0 TiB     5:23
    4.8 TiB     7:14
    5.6 TiB     3:58
    6.3 TiB     7:33
    7.0 TiB     6:55

Fixes #22472

Closes scylladb/scylladb#24291

* github.com:scylladb/scylladb:
  replica: Introduce get_compaction_reenablers_and_lock_holders_for_repair
  compaction: Move compaction_reenabler to compaction_reenabler.hh
  topology_coordinator: Make rpc::remote_verb_error to warning level
  repair: Add metrics for sstable bytes read and skipped from sstables
  test.py: Disable incremental for test_tombstone_gc_for_streaming_and_repair
  test.py: Add tests for tablet incremental repair
  repair: Add tablet incremental repair support
  compaction: Add tablet incremental repair support
  feature_service: Add TABLET_INCREMENTAL_REPAIR feature
  tablet_allocator: Add tablet_force_tablet_count_increase and decrease
  repair: Add incremental helpers
  sstable: Add being_repaired to sstable
  sstables: Add set_repaired_at to metadata_collector
  mutation_compactor: Introduce add operator to compaction_stats
  tablet: Add sstables_repaired_at to system.tablets table
  test: Fix drain api in task_manager_client.py
2025-08-19 13:13:22 +03:00
Asias He
f9021777d8 compaction: Add tablet incremental repair support
This patch adds incremental_repair support in compaction.

- The sstables are split into repaired and unrepaired sets.

- Repaired and unrepaired sets are compacted separately.

- The repaired_at from sstable and sstables_repaired_at from
  system.tablets table are used to decide if a sstable is repaired or
  not.

- Different compaction tasks, e.g., minor, major, scrub, split, are
  serialized with tablet repair.
Avi Kivity
66173c06a3 Merge 'Eradicate the ability to create new sstables with numerical sstable generation' from Benny Halevy
Remove support for generating numerical sstable generation for new sstables.
Loading such sstables is still supported but new sstables are always created with a uuid generation.
This is possible since:
* All live versions (since 5.4 / f014ccf369) now support uuid sstable generations.
* The `uuid_sstable_identifiers_enabled` config option (that is unused from version 2025.2 / 6da758d74c) controls only the use of uuid generations when creating new sstables. SSTables with uuid generations should still be properly loaded by older versions, even if `uuid_sstable_identifiers_enabled` is set to `false`.

Fixes #24248

* Enhancement, no backport needed

Closes scylladb/scylladb#24512

* github.com:scylladb/scylladb:
  streaming: stream_blob: use the table sstable_generation_generator
  replica: distributed_loader: process_upload_dir: use the table sstable_generation_generator
  sstables: sstable_generation_generator: stop tracking highest generation
  replica: table: get rid of update_sstables_known_generation
  sstables: sstable_directory: stop tracking highest_generation
  replica: distributed_loader: stop tracking highest_generation
  sstables: sstable_generation: get rid of uuid_identifiers bool class
  sstables_manager: drop uuid_sstable_identifiers
  feature_service: move UUID_SSTABLE_IDENTIFIERS to supported_feature_set
  test: cql_query_test: add test_sstable_load_mixed_generation_type
  test: sstable_datafile_test: move copy_directory helper to test/lib/test_utils
  test: database_test: move table_dir helper to test/lib/test_utils
2025-08-14 11:54:33 +03:00
Israel Fruchter
2da26d1fc1 Update tools/cqlsh submodule (v6.0.26)
* tools/cqlsh 02ec7c57...aa1a52c1 (6):
  > build-push.yaml: upgrade cibuildwheel to latest
  > build-push.yml: skip python 3.8 and PyPy builds
  > cqlshlib: make NetworkTopologyStrategy default for autocomplete
  > default to setuptools_scm based version when not packaged
  > chore(deps): update pypa/cibuildwheel action to v2.23.0

Closes scylladb/scylladb#25420
2025-08-11 13:07:47 +03:00
Avi Kivity
f49b63f696 tools: toolchain: dbuild: forward container registry credentials
Docker hub rate-limits unauthenticated image pulls, so forward
the host's credentials to the container. This prevents rate limit
errors when running nested containers.

Try the locations for the credentials in order and bind-mount the
first that exists to a location that gets picked up.
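
The "first location that exists" lookup can be sketched like this. The helper names, the container-side target path and the read-only (`:ro`) mount are assumptions for illustration; the candidate paths in the demo are temporary files, not the locations dbuild actually checks.

```python
import os
import tempfile


def first_existing(candidates):
    """Return the first path in `candidates` that exists, or None."""
    for path in candidates:
        if os.path.exists(path):
            return path
    return None


def credential_mount_args(candidates, target):
    """Build a bind-mount argument for the first credentials file found."""
    source = first_existing(candidates)
    if source is None:
        return []  # no credentials on the host: pulls stay unauthenticated
    return ["-v", f"{source}:{target}:ro"]


# Demo with temporary files standing in for real credential locations.
with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "missing.json")
    present = os.path.join(tmp, "auth.json")
    open(present, "w").close()
    args = credential_mount_args([missing, present], "/tmp/auth.json")
    expected = ["-v", f"{present}:/tmp/auth.json:ro"]
    print(args)
```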

Verified with `podman login --get-login docker.io` in the container.

Closes scylladb/scylladb#25354
2025-08-11 09:05:57 +03:00
Benny Halevy
6cc964ef16 sstables: sstable_generation: get rid of uuid_identifiers bool class
Now that all call sites enable uuid_identifiers.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-08 11:46:21 +03:00
Raphael S. Carvalho
9d3755f276 replica: Futurize retrieval of sstable sets in compaction_group_view
This will allow upcoming work to gently produce a sstable set for
each compaction group view. Example: repaired and unrepaired.

Locking strategy for compaction's sstable selection:
Since sstable retrieval path became futurized, tasks in compaction
manager will now hold the write lock (compaction_state::lock)
when retrieving the sstable list, feeding them into compaction
strategy, and finally registering selected sstables as compacting.
The last step prevents another concurrent task from picking the
same sstable. Previously, all those steps were atomic, but
we have seen stall in that area in large installations, so
futurization of that area would come sooner or later.
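
The locking strategy above can be illustrated with a small asyncio sketch: retrieval, selection and registration all happen under one lock, so two concurrent tasks cannot pick the same sstable even though retrieval may yield. This is a simplified model, not Scylla's code: `asyncio.Lock` stands in for the write side of `compaction_state::lock`, and the class and function names are hypothetical.

```python
import asyncio


class CompactionState:
    def __init__(self, sstables):
        self.lock = asyncio.Lock()
        self.sstables = list(sstables)
        self.compacting = set()

    async def get_sstables(self):
        # Retrieval may yield to other tasks, mimicking the futurized path.
        await asyncio.sleep(0)
        return [s for s in self.sstables if s not in self.compacting]


async def pick_for_compaction(state, strategy):
    async with state.lock:
        candidates = await state.get_sstables()
        selected = strategy(candidates)
        # Registering under the same lock keeps another task from selecting
        # these sstables between retrieval and registration.
        state.compacting.update(selected)
    return selected


async def main():
    state = CompactionState(["a", "b", "c", "d"])

    def take_two(candidates):
        return candidates[:2]

    return await asyncio.gather(
        pick_for_compaction(state, take_two),
        pick_for_compaction(state, take_two),
    )


first, second = asyncio.run(main())
print(first, second)
```

Without the lock, both tasks could observe the same candidate list while the other is suspended in retrieval, and then select overlapping sstables.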

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2025-08-08 06:58:00 +03:00
Raphael S. Carvalho
2c4a9ba70c treewide: Rename table_state to compaction_group_view
Since table_state is a view to a compaction group, it makes sense
to rename it as so.

With upcoming incremental repair, each replica::compaction_group
will be actually two compaction groups, so there will be two
views for each replica::compaction_group.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2025-08-08 06:51:28 +03:00
Benny Halevy
5e5e63af10 scylla-sstable: print_query_results_json: continue loop if row is disengaged
Otherwise it is accessed right when exiting the if block.
Add a unit test reproducing the issue and validating the fix.

Fixes #25325

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#25326
2025-08-06 16:44:51 +03:00
Pavel Emelyanov
0616407be5 Merge 'rest_api: add endpoint which drops all quarantined sstables' from Taras Veretilnyk
Added a new POST endpoint `/storage_service/drop_quarantined_sstables` to the REST API.
This endpoint allows dropping all quarantined SSTables either globally or
for a specific keyspace and tables.
Optional query parameters `keyspace` and `tables` (comma-separated table names) can be
provided to limit the scope of the operation.
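
A sketch of how a client might build the request URL for this endpoint, with both query parameters optional. The base URL and the helper name `drop_quarantined_url` are assumptions; the path, the parameter names and the comma-separated `tables` format come from the description above. The request itself would be sent with POST.

```python
from urllib.parse import urlencode


def drop_quarantined_url(base_url, keyspace=None, tables=None):
    """Build the URL for the POST request; both filters are optional."""
    params = {}
    if keyspace:
        params["keyspace"] = keyspace
    if tables:
        # Table names are passed as a single comma-separated value.
        params["tables"] = ",".join(tables)
    url = f"{base_url}/storage_service/drop_quarantined_sstables"
    return f"{url}?{urlencode(params)}" if params else url


print(drop_quarantined_url("http://localhost:10000"))
print(drop_quarantined_url("http://localhost:10000", "ks", ["t1", "t2"]))
```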

Fixes scylladb/scylladb#19061

Backport is not required, it is new functionality

Closes scylladb/scylladb#25063

* github.com:scylladb/scylladb:
  docs: Add documentation for the nodetool dropquarantinedsstables command
  nodetool: add command for dropping quarantine sstables
  rest_api: add endpoint which drops all quarantined sstables
2025-08-06 11:55:15 +03:00
Andrei Chekun
4c33ff791b build: add pytest-timeout to the toolchain
Adding this plugin allows using timeout for a test or timeout for the whole
session. This can be useful for Unit Test Custom task in the pipeline to avoid
running tests in batches, which would mess with the test names later in Jenkins.

Closes #25210

[avi: regenerate frozen toolchain with optimized clang from

  https://devpkg.scylladb.com/clang/clang-20.1.8-Fedora-42-aarch64.tar.gz
  https://devpkg.scylladb.com/clang/clang-20.1.8-Fedora-42-x86_64.tar.gz
]

Closes scylladb/scylladb#25243
2025-07-30 12:53:10 +03:00