This PR introduces support for a new scrub option: `--drop-unfixable-sstables`, which enables the dropping of corrupted SSTables during scrub only in segregate mode. The patch includes implementation, validation, and set of tests to ensure correct behavior and error handling.
Fixes#19060
Backport is not required, it is a new feature
Closesscylladb/scylladb#26579
* github.com:scylladb/scylladb:
sstable_compaction_test: add segregate mode tests for drop-unfixable-sstables option
test/nodetool: add scrub drop-unfixable-sstables option testcase
scrub: add support for dropping unfixable sstables in segregate mode
This patch adds a new flag `drop-unfixable-sstables` to the scrub operation
in segregate mode, allowing to automatically drop SSTables that
cannot be fixed during scrub. It also includes API support of the 'drop_unfixable_sstables'
paramater and validation to ensure this flag is not enabled in other modes rather than segragate.
This patch removes the dependence of vector search module
on the cql3 module by moving the contents of cql3/type_json.hh
to types/json_utils.hh and removing the usage of cql3 primary_key
object in vector_store_client. We also make the needed adjustments
to files that were previously using the afformentioned type_json.hh
file.
This fixes the circular dependency cql3 <-> vector_search.
Closesscylladb/scylladb#26482
Integrates GCP object storage as a working storage backend for scylla sstables as well as backup storage.
Adds an abstraction layer (atm very heavily designed around the s3 client interface and usage) to allow the "storage" etc layers of sstable management to pick transparently between "s3" and "gs" providers.
This modifies the scylla config such that endpoints can optionally (through a "type" param) ref a GS backend.
Similarly with storage_options.
Also adds some IO wrapping primitives to make it more feasible to place some logic at a mid level of the implementation stack (such as making networked storage files, ranged reading etc).
Test s3 fixture is replaced (where appropriate) with an `object_storage` fixture that multiplexes the test across both backends.
Unit tests are duplicated and for the GS versions use a boost test fixture for GCS, default local fake.
Fixes#25359Fixes#26453Closesscylladb/scylladb#26186
* github.com:scylladb/scylladb:
docs::dev::object_storage: Add some initial info on GS storage
docs/dev: Add mention of (nested) docker usage in testing.md
sstables::object_storage_client: Forward memory limit semaphore to GS instance
utils::gcp::object_storage: Add optional memory limits to up/download
sstables::object_storage_client: Add multi-upload support for GS
utils::gcp::storage: Add merge objects operation
test_backup/test_basic: Make tests multiplex both s3 and gs backends
test::cluster::conftest: Add support for multiple object storage backends
boost::gcs_storage_test: reindent
boost::gcs_storage_test: Convert to use fixture
tests::boost: Add GS object storage cases to mirror S3 ones
tests::lib::gcs_fixture: Add a reusable test fixture for real/fake GS/GCS
tests::lib::test_utils: Add overloads/helpers for reading and (temp) writing env
sstables::object_storage_client: Add google storage implementation
test_services: Allow testing with GS object storage parameters
utils::gcp::gcp_credentials: Add option to create uninitialized credentials
utils::gcp::object_storage: Make create_download_source return seekable_data_source
utils::gcp::object_storage: Add defensive copies of string_view params
utils::gcp::object_storage: Add missing retry backoff increate
utils::gcp::object_storage: Add timestamp to object listing
utils::gcp::object_storage: Add paging support to list_objects
object_storage_client: Add object_name wrapper type
utils::gcp::object_storage: Add optional abort_source
utils::rest::client: Add abort_source support
sstables: Use object_storage_client for remote storage
sstables::object_storage_client: Add abstraction layer for OS cliens (s3 initial)
s3::upload_progress: Promote to general util type
storage_options: Abstract s3 to "object_storage" and add gs as option
sstables::file_io_extension: Change "creator" callback to just data_source
utils::io-wrappers: Add ranged data_source
utils::io-wrappers: Add file wrapper type for seekable_source
utils::seekable_source: Add a seekable IO source type
object_storage_endpoint_param: Add gs storage as option
config: break out object_storage_endpoint_param preparing for multi storage
* tools/cqlsh ff3f572...f852b1f5 (2):
> Add LZ4 as a required package - so ScyllaDB Python driver could use LZ4 compression
> github actions: replace macos-13 with macos-15-intel
Closesscylladb/scylladb#26608
We want to add strongly consistent tables as an option. We will have
two kind of strongly consistent tables: globally consistent and locally
consistent. The former means that requests from all DCs will be globally
linearisable while the later - only requests to the same DCs will be
linearisable. To allow configuring all the possibilities the patch
adds new parameter to a keyspace definition "consistency" that can be
configured to be `eventual`, `global` or `local`. Non eventual setting
is supported for tablets enabled keyspaces only. Since we want to start
with implementing local consistency configuring global consistency will
result in an error for now.
Since both are bucket+prefix oriented, we can basically use same
options for both, only distinguished by actual protocol.
Abstract the types and the helper parse etc routines to handle either.
Use "gs" as term for gcs (google compute storage), since this is the
URL scheme used.
Moves the config wrapper to own file (to reduce recompilation for modifying)
and refactors to handle extending this parameter to non-s3 endpoint configs.
Using the name regular as the incremental mode could be confusing, since
regular might be interpreted as the non-incremental repair. It is better
to use incremental directly.
Before:
- regular (standard incremental repair)
- full (full incremental repair)
- disabled (incremental repair disabled)
After:
- incremental (standard incremental repair)
- full (full incremental repair)
- disabled (incremental repair disabled)
Fixes#26503Closesscylladb/scylladb#26504
This change extends the CQL replication options syntax so the replication factor can be stated as a list of rack names.
For example: { 'mydatacenter': [ 'myrack1', 'myrack2', 'myrack4' ] }
Rack-list based RF can coexist with the old numerical RF, even in the same keyspace for different DCs.
Specifying the rack list also allows to add replicas on the specified racks (increasing the replication factor), or decommissioning certain racks from their replicas (by omitting them from the current datacenter rack-list). This will allow us to keep the keyspace rf-rack-valid, maintaining guarantees, while allowing adding/removing racks. In particular, this will allow us to add a new DC, which happens by incrementally increasing RF in that DC to cover existing racks.
Migration from numerical RF to rack-list is not supported yet. Migration from rack-list to numerical RF is not planned to be supported.
New feature, no backport required.
Co-authored with @bhalevy
Fixes https://github.com/scylladb/scylladb/issues/25269
Fixes https://github.com/scylladb/scylladb/issues/23525Closesscylladb/scylladb#26358
* github.com:scylladb/scylladb:
tablets: load_balancer: Recognize that tablets are confined to racks when computing desired tablet count
locator: Make hasher for endpoint_dc_rack globally accessible
test: tablets: Add test for replica allocation on rack list changes
test: lib: topology_builder: generate unique rack names
test: Add tests for rack list RF
doc: Document rack-list replication factor
topology_coordinator: Restore formatting
topology_coordinator: Cancel keyspace alter on broader set of errors
topology_coordinator: Make keyspace alter process options through as_ks_metadata_update()
cql3: ks_prop_defs: Preserve old options
cql3: ks_prop_defs: Introduce flattened()
locator: Recognize rack list RF as valid in assert_rf_rack_valid_keyspace()
tablet_allocator: Respect binding replicas to racks
locator: network_topology_strategy: Respect rack list when reallocating tablets
cql3: ks_prop_defs: Fail with more information when options are not in expected format
locator, cql3: Support rack lists in replication options
cql3: Fail early on vnode/tablet flavor alter
cql3: Extract convert_property_map() out of Cql.g
schema: Use definition from the header instead of open-coding it
locator: Abstract obtaining the number of replicas from replication_strategy_config_option
cql3, locator: Use type aliases for option maps
locator: Add debug logging
locator: Pass topology to replication strategy constructor
abstract_replication_strategy, network_topology_strategy: add replication_factor_data class
Some tools commands have links to online documentation in their help output. These links were left behind in the source-available change, they still point to the old opensource docs. Furthermore, the links in the scylla-sstable help output always point to the latest stable release's documentation, instead of the appropriate one for the branch the tool was built from. Fix both of these.
Fixes: scylladb/scylladb#26320
Broken documentation link fix for the tool help output, needs backport to all live source-available versions.
Closesscylladb/scylladb#26322
* github.com:scylladb/scylladb:
tools/scylla-sstable: fix doc links
release: adjust doc_link() for the post source-available world
tools/scylla-nodetool: remove trailing " from doc urls
Before, the `nodetool getendpoints` expected the key as one string separated by : (for example 1:val:ue). This caused errors if any part of the key had a colon because it was unclear whether a colon was a separator or part of the key.
This change adds a new API endpoint, `/storage_service/natural_endpoints/v2/{keyspace}`, which accepts composite partition keys as multiple key_component query parameters (e.g., ?key_component=1&key_component=val:ue). The `nodetool getendpoints` command was updated to support a new `--key-components` option, allowing users to pass key components as an array. The client and test infrastructure were extended to support multiple values for a query parameter, and tests were added to verify correct behavior with composite keys.
The previous method of passing partition keys as colon-separated strings is preserved for backward compatibility.
Backport is not required, since this change relies on recent Seastar updates
Fixes#16596Closesscylladb/scylladb#26169
* github.com:scylladb/scylladb:
docs: document --key-components option for getendpoints
test/nodetool/test_getendpoints: add coverage for --key-components param in getendpoints
nodetool: Introduce new option --key-components to specify compound partition keys as array
rest_api/test_storage_service: add v2 natural_endpoints test for composite key with multiple components
api/storage_service: add GET 'natural_endpoints' v2 to support composite keys with ':'
rest_api_mock: support duplicate query parameters
test/rest_api: support multiple query values per key in RestApiSession.send()
nodetool: add support of new seastar query_parameters_type to scylla_rest_client
In preparation for changing their structure.
1) std::map<sstring, sstring> -> replication_strategy_config_options
Parsed options. Values will become std::variant<sstring, rack_list>
2) std::map<sstring, sstring> -> property_definitions::map_type
Flattened map of options, as stored system tables.
Allows getendpoints to accept components of partition key using the --key-components option.
Key components are passed as an array and sent to the new /natural_endpoints/v2/{keyspace} endpoint.
Corrected spelling mistakes, typos, and minor wording issues to improve
the developer documentation.
No backport: There is no functional change, and the doc is mostly
relevant to master, so it doesn't need to be backported.
Closesscylladb/scylladb#26332
Extend the `sstable_format` config enum with a "ms" value,
and, if it's enabled (in the config and in cluster features),
use it for new sstables on the node.
(Before this commit, writing `ms` sstables should only be possible
in unit tests, via internal APIs. After this commit, the format
can be enabled in the config and the database will write it during
normal operation).
As of this commit, the new format is not the default yet.
(But it will become the default in a later commit in the same series).
Introduce `ms` -- a new sstable format version which
is a hybrid of Cassandra's `me` and `da`.
It is based on `me`, but with the index components
(Summary.db and Index.db) replaced with the index
components of `da` (Partitions.db and Rows.db).
As of this patch, the version is never chosen
anywhere for writing sstables yet. It is only introduced.
We will add it to unit tests in a later commit,
and expose it to users in yet later commit.
Later in this patch series we will introduce `ms` as the new highest
format, but we won't be able to make it the default within the same
series due to some dtest incompatibilities.
Until `ms` is the default, we don't `scylla sstable` to default to
it, even though it's the highest. Let's choose the default
version in `scylla sstable` using the same method which is
used by Scylla in general: by letting the `sstable_manager` choose.
The doc links in scylla-sstable help output are static, so they always
point to the documentation of the latest stable release, not to the
documentation of the release the tool binary is from. On top of that,
the links point to old open-source documentation, which is now EOL.
Fix both problems: point link at the new source-available documentation
pages and make them version aware.
An offline, scylla-sstable variant of nodetool upgradesstables command.
Applies latest (or selected) sstable version and latest schema.
Closesscylladb/scylladb#26109
Sstables store a basic schema in the statistics component. The scylla-sstable tool uses this to be able to read and dump sstables in a self-contained manner, without requiring an external schema source.
The problem is that the schema stored int he statistics component is incomplete: it doesn't store column names for key columns, so these have placeholder names in dump outputs where column names are visible.
This is not a disaster but it is confusing and it can cause errors in scripts which want to check the content of sstables, while also knowing the schema and expecting the proper names for key columns.
To make sstables truly self-contained w.r.t. the schema, add a complete schema to the scylla component. This schema contains the names and types of all columns, as well as some basic information about the schema: keyspace name, table name, id and version.
When available, scylla-sstable's schema loader will use this new more complete schema and fall-back to the old method of loading the (incomplete) schema from the statistics component otherwise.
New feature, no backport required.
Closesscylladb/scylladb#24187
* github.com:scylladb/scylladb:
test/boost/schema_loader_test: add specific test with interesting types
test/lib/random_schema: add random_schema(schema_ptr) constructor
test/boost/schema_loader_test: test_load_schema_from_sstable: add fall-back test
tools/schema_loader: add support for loading from scylla-metadata
tools/schema_loader: extract code which load schema from statistics
sstables: scylla_metadata: add schema member
Some files in compaction/ have using namespace {compaction,sstables}
clauses, some even in headers. This is considered bad practice and
muddies the namespace use. Remove them.
The namespace usage in this directory is very inconsistent, with files
and classes scattered in:
* global namespace
* namespace compaction
* namespace sstables
With cases, where all three used in the same file. This code used to
live in sstables/ and some of it still retains namespace sstables as a
heritage of that time. The mismatch between the dir (future module) and
the namespace used is confusing, so finish the migration and move all
code in compaction/ to namespace compaction too.
This patch, although large, is mechanic and only the following kind of
changes are made:
* replace namespace sstable {} with namespace compaction {}
* add namespace compaction {}
* drop/add sstables::
* drop/add compaction::
* move around forward-declarations so they are in the correct namespace
context
This refactoring revealed some awkward leftover coupling between
sstables and compaction, in sstables/sstable_set.cc, where the
make_sstable_set() methods of compaction strategies are implemented.
When available, load the schema from the Scylla component, where the
column names of keys are also available. Fall-back to loading the schema
from the Statistics component otherwise (previous behaviour).
To store the most important schema fields, like id, version, keyspace
name, table name and the list of all columns, along with their kind,
name and type. This will serve as alternative schema source to the one
stored in statistics component. This latter one doesn't store any of the
metatada and neither does primary key names (just the types), so it is
leads to confusion when it is used as schema source for scylla-sstable.
This new schema stored in the scylla-metadata component is not intended
to be a full-schema, equivalent to the one stored in the schema tables,
it is intended to be good enough for scylla-sstable being able to parse
sstables in a self-sufficient manner.
Acecssing this member directly is deprecated, migrate code to use {get,set}_query_param() and friends instead.
Fixes: https://github.com/scylladb/scylladb/issues/26023
Preparation for seastar update, no backport required.
Closesscylladb/scylladb#26024
* github.com:scylladb/scylladb:
treewide: move away from accessing httpd::request::query_parameters
test/pylib/s3_server_mock.py: better handle empty query params
Passing `0` as the `initial_tablets` argument causes `schema_loaders`'s
placeholder keyspace to be a tablet keyspace.
This causes `scylla sstable` to reject some table schemas which
are legitimate in this context. For example, `scylla sstable`
refuses to work with sstables which contains `counter` columns,
because tablets don't support counters.
This is undesirable. Let's make `schema_loader`'s keyspace
a non-tablet keyspace.
Closesscylladb/scylladb#26192
`purgeable.lua` was written for a specific investigation a few years ago.
`writetime-histogram.lua` is an sstable script transcription of the former scylla-sstable writetime-histogram command. This was also written for an investigation (before script command existed) and is too specific to be a native command, so was removed by edaf67edcb.
Add both scripts to the sample script library, they can be useful, either for a future investigation, or as samples to copy+edit to write new scripts (and train AI).
New sstable scripts, no backport
Closesscylladb/scylladb#26137
* github.com:scylladb/scylladb:
tools/scylla-sstable-scripts: introduce writetime-histogram.lua
tools/scylla-sstable-scripts: introduce purgable.lua
Produces a histogram with the writetime (timestamp) of the data in the
sstable(s). The histogram is printed to the output, along with general
stats about the processed data.
Collects and prints statistics about how much data is purgeable in an
sstable. Works only with tombstone_gc = {'mode': 'timeout'};
Can help diagnosing the efficiency (or lack of) tombstone-gc.
when cluster repair is run for an entire keyspace, nodetool makes a
repair api request for each table.
if the keyspace contains colocated tables, then the api request for the
colocated tables will fail, because currently scylla doesn't allow making
repair requests for specific colocated tables, but only for base tables.
if the request is to repair an entire keyspace then we can ignore this,
because we will make a repair request for all base tables, and this in
turn will repair also all the colocated tables in the keyspace.
however if specific tables are requested and some of them are colocated
then we should propagate the error to let the user know the request is
invalid.
Refs https://github.com/scylladb/scylladb/issues/24816
no backport - no colocated tablets in previous releases
Closesscylladb/scylladb#26051
* github.com:scylladb/scylladb:
nodetool: ignore repair request error of colocated tables
storage_service: improve error message on repair of colocated tables
The `json_mutation_stream_parser.cc` file was not included in the
`scylla-tools` CMake target. This could lead to "undefined reference"
linker errors when building with CMake.
This commit adds the missing source file to the target's source list.
Closesscylladb/scylladb#26108
This command was written for an investigation and was used exactly once.
This would have been a perfect candidate for the (also rarely used)
scylla-sstable script command, but it didn't exist yet.
Drop this command from the tool, such super-specific commands should be
written as sstable-scripts nowadays, which is what we will do if we ever
need this again.
Closesscylladb/scylladb#26062
when cluster repair is run for an entire keyspace, nodetool makes a
repair api request for each table.
if the keyspace contains colocated tables, then the api request for the
colocated tables will fail, because currently scylla doesn't allow making
repair requests for specific colocated tables, but only for base tables.
if the request is to repair an entire keyspace then we can ignore this,
because we will make a repair request for all base tables, and this in
turn will repair also all the colocated tables in the keyspace.
however if specific tables are requested and some of them are colocated
then we should propagate the error to let the user know the request is
invalid.
Refs scylladb/scylladb#24816
tools/scylla-sstable.cc has 3.5k SLOC, out of which this class alone is 1K. Extract into own hh and cc. Since this class was already using pimpl, the header remains nice and small.
Code cleanup, no backport needed.
Closesscylladb/scylladb#26064
* github.com:scylladb/scylladb:
tools: extract json_mtuation_stream_parser to its own hh,cc files
tools/scylla-sstable: fix indentation
tools/scylla-sstable: prepare for extracting json_mutation_stream_parser
tools/scylla-sstable.cc has 3.5k SLOC, out of which this class alone is
1K. Extract into own hh and cc, just a copy-paste after the preparation
commit.
Make methods out-of-line, so class declaration stands on its own,
without definition of impl.
Move auxiliary structures, used only by impl, out of the class scope.
Move parser to tools namespace, and auxiliaries to anonymous namespace
within the tools one.
Pass down logger ref to parser impl and below, to prepare for sst_log
not being available in scope.
Add comment to parser class explaining what it does.
The `--incremental-mode` option specifies the incremental repair mode.
Can be 'disabled', 'regular', or 'full'.
'regular': The incremental repair logic is enabled. Unrepaired sstables
will be included for repair. Repaired sstables will be skipped. The
incremental repair states will be updated after repair.
'full': The incremental repair logic is enabled. Both repaired and
unrepaired sstables will be included for repair. The incremental repair
states will be updated after repair.
'disabled': The incremental repair logic is disabled completely. The
incremental repair states, e.g., repaired_at in sstables and
sstables_repaired_at in the system.tablets table, will not be updated
after repair.
When the option is not provided, it defaults to regular.
Fixes#25931Closesscylladb/scylladb#25969
We are moving away from integer generations, so stop using them.
Also drop the --generation command-line parameter, UUID generations
don't have be provided by the caller, because random UUIDs will not
collide with each other. To help the caller still know what generation
the output sstable has (previously they provided it via --generation),
print the generation to stdout.
Closesscylladb/scylladb#25166