Although valid for compact tables, non-full (or empty) clustering key prefixes are not handled for row keys when writing sstables. Only the present components are written, consequently if the key is empty, it is omitted entirely.
When parsing sstables, the parsing code unconditionally parses a full prefix.
This mis-match results in parsing failures, as the parser parses part of the row content as a key resulting in a garbage key and subsequent mis-parsing of the row content and maybe even subsequent partitions.
Introduce a new system table: `system.corrupt_data` and infrastructure similar to `large_data_handler`: `corrupt_data_handler` which abstracts how corrupt data is handled. The sstable writer now passes rows such corrupt keys to the corrupt data handler. This way, we avoid corrupting the sstables beyond parsing and the rows are also kept around in system.corrupt_data for later inspection and possible recovery.
Add a full-stack test which checks that rows with bad keys are correctly handled.
Fixes: https://github.com/scylladb/scylladb/issues/24489
The bug is present in all versions, has to be backported to all supported versions.
Closesscylladb/scylladb#24492
* github.com:scylladb/scylladb:
test/boost/sstable_datafile_test: add test for corrupt data
sstables/mx/writer: handler rows with empty keys
test/lib/cql_assertions: introduce columns_assertions
sstables: add corrupt_data_handler to sstables::sstables
tools/scylla-sstable: make large_data_handler a local
db: introduce corrupt_data_handler
mutation: introduce frozen_mutation_fragment_v2
mutation/mutation_partition_view: read_{clustering,static}_row(): return row type
mutation/mutation_partition_view: extract de-ser of {clustering,static} row
idl-compiler.py: generate skip() definition for enums serializers
idl: extract full_position.idl from position_in_partition.idl
db/system_keyspace: add apply_mutation()
db/system_keyspace: introduce the corrupt_data table
This commit fixes incorrect headings in the Admin Guide and the files
that are included in that guide.
The purpose is to properly organize the content and improve the search,
as well as prevent potential build problems caused by a poor heading organization.
Fixes https://github.com/scylladb/scylladb/issues/24441Closesscylladb/scylladb#24700
We replace the documentation of the old recovery procedure with the
documentation of the new recovery procedure.
The new recovery procedure requires the Raft-based topology to be
enabled, so to remove the old procedure from the documentation,
we must assume users have the Raft-based topology enabled.
We can do it in 2025.2 because the upgrade guides to 2025.1 state that
enabling the Raft-based topology is a mandatory step of the upgrade.
Another reminder is the upgrade guides to 2025.2.
Since we rely on the Raft-based topology being enabled, we remove the
obsolete parts of the documentation.
We will make the Raft-based topology mandatory in the code in the
future, hopefully in 2025.3. For this reason, we also don't touch the
dev docs in this PR.
Fixesscylladb/scylladb#24530
Requires backport to 2025.2 because 2025.2 contains the new recovery
procedure.
Closesscylladb/scylladb#24583
* github.com:scylladb/scylladb:
docs: rely on the Raft-based topology being enabled
docs: handling-node-failures: document the new recovery procedure
In 2025.2, we don't force enabling the Raft-based topology in the code,
but we stated in the upgrade guides that it's a mandatory step of the
upgrade to 2025.1. We also remind users to enable the Raft-based
topology in the upgrade guides to 2025.2. Hence, we can rely in the
the documentation on the Raft-based topology being enabled. If it is
still disabled, we can just send the user to the upgrade guides. Hence:
- we remove all documentation related to enabling the Raft-based
topology, enabling the Raft-based schema (enabled Raft-based topology
implies enabled Raft-based schema), and the gossip-based topology,
- we can replace the documentation of the old manual recovery procedure
with the documentation of the new manual recovery procedure (done in
the previous commit).
We replace the documentation of the old recovery procedure with the
documentation of the new recovery procedure.
We can get rid of the old procedure from the documentation because
we requested users to enable the Raft-based topology during upgrades to
2025.1 and 2025.2.
We leave the note that enabling the Raft-based topology is required to
use the new recovery procedure just in case, since we didn't force
enabling the Raft-based topology in the code.
This change adds an md file which gives a high
level overview of the scylladb repository, the
components each path contains and a basic description
for each one of them. This is mainly intended for
onboarding engineers to help get a mental picture when
starting ramping up on Scylla concepts.
Refs #22908
Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>
Closesscylladb/scylladb#23010
This commit removes the references to ScyllaDB Open Source from the README file for documentation.
In addition, it updates the link where the documentation is currently published.
We've removed Open Source from all the documentation, but the README was missed.
This commit fixes that.
Closesscylladb/scylladb#24477
To serve as a place to store corrupt mutation fragments. These fragments
cannot be written to sstables, as they would be spread around by
compaction and/or repair. They even might make parsing the sstable
impossible. So they are stored in this special table instead, kept
around to be inspected later and possibly restored if possible.
Currently only one global topology request (such as truncate, cdc repair, cleanup and alter table) can be pending. If one is already pending others will be rejected with an error. This is not very user friendly, so this series introduces a queue of global requests which allows queuing many global topology requests simultaneously.
Fixes: #16822
No need to backport since this is a new feature.
Closesscylladb/scylladb#24293
* https://github.com/scylladb/scylladb:
topology coordinator: simplify truncate handling in case request queue feature is disable
topology coordinator: fix indentation after the previous patch
topology coordinator: allow running multiple global commands in parallel
topology coordinator: Implement global topology request queue
topology coordinator: Do not cancel global requests in cancel_all_requests
topology coordinator: store request type for each global command
topology request: make it possible to hold global request types in request_type field
topology coordinator: move alter table global request parameters into topology_request table
topology coordinator: move cleanup global command to report completion through topology_request table
topology coordinator: no need to create updates vector explicitly
topology coordinator: use topology_request_tracking_mutation_builder::done() instead of open code it
topology coordinator: handle error during new_cdc_generation command processing
topology coordinator: remove unneeded semicolon
topology coordinator: fix indentation after the last commit
topology coordinator: move new_cdc_generation topology request to use topology_request table for completion
gms/feature_service: add TOPOLOGY_GLOBAL_REQUEST_QUEUE feature flag
cql, schema: Extend name length limit from 48 to 192 bytes
This commit increases the maximum length of names for keyspaces, tables, materialized views, and indexes from 48 to 192 bytes.
The previous 48-bytes limit was inherited from Cassandra 3 for compatibility. However, this validation was removed in Cassandra 4 and 5 (see CASSANDRA-20389)
and some usage scenarios (such as some feature store workflows generating long table names) now depend on this relaxed constraint.
This change brings ScyllaDB's behavior in line with modern Cassandra versions and better supports these use cases.
The new limit of 192 bytes is derived from underlying filesystem limitations to prevent runtime errors when creating directories for table data.
When a new table is created, ScyllaDB generates a directory for its SSTables. The directory name is constructed from the table name, a dash, and a 32-character UUID.
For a CDC-enabled table, an associated log table is also created, which has the suffix `_scylla_cdc_log` appended to its name.
The directory name for this log table becomes the longest possible representation.
Additionally we reserve 15 bytes for future use, allowing for potential future extensions without breaking existing schemas.
To guarantee that directory creation never fails due to exceeding filesystem name limits, the maximum name length is calculated as follows:
255 bytes (common filesystem limit for a path component)
- 32 bytes (for the 32-character UUID string)
- 1 byte (for the '-' separator)
- 15 bytes (for the '_scylla_cdc_log' suffix)
- 15 bytes (reserved for future use)
----------
= 192 bytes (Maximum allowed name length)
This calculation is similar in principle to the one proposed for Cassandra to fix related directory creation failures (see apache/cassandra/pull/4038).
This patch also updates/adds all associated tests to validate the new 192-byte limit.
The documentation has been updated accordingly.
Fixes#4480
Backport 2025.2: The significantly shorter maximum table name length in Scylla compared to Cassandra is becoming a more common issue for users in the latest release.
Closesscylladb/scylladb#24500
* github.com:scylladb/scylladb:
cql, schema: Extend name length limit from 48 to 192 bytes
replica: Remove unused keyspace::init_storage()
This commit increases the maximum length of names for keyspaces, tables, materialized views, and indexes from 48 to 192 bytes.
The previous 48-bytes limit was inherited from Cassandra 3 for compatibility. However, this validation was removed in Cassandra 4 and 5 (see CASSANDRA-20389)
and some usage scenarios (such as some feature store workflows generating long table names) now depend on this relaxed constraint.
This change brings ScyllaDB's behavior in line with modern Cassandra versions and better supports these use cases.
The new limit of 192 bytes is derived from underlying filesystem limitations to prevent runtime errors when creating directories for table data.
When a new table is created, ScyllaDB generates a directory for its SSTables. The directory name is constructed from the table name, a dash, and a 32-character UUID.
For a CDC-enabled table, an associated log table is also created, which has the suffix `_scylla_cdc_log` appended to its name.
The directory name for this log table becomes the longest possible representation.
Additionally we reserve 15 bytes for future use, allowing for potential future extensions without breaking existing schemas.
To guarantee that directory creation never fails due to exceeding filesystem name limits, the maximum name length is calculated as follows:
255 bytes (common filesystem limit for a path component)
- 32 bytes (for the 32-character UUID string)
- 1 byte (for the '-' separator)
- 15 bytes (for the '_scylla_cdc_log' suffix)
- 15 bytes (reserved for future use)
----------
= 192 bytes (Maximum allowed name length)
This calculation is similar in principle to the one proposed for Cassandra to fix related directory creation failures (see apache/cassandra/pull/4038).
This patch also updates/adds all associated tests to validate the new 192-byte limit.
The documentation has been updated accordingly.
This patch intends to give an overview of where, when and how we store
data in S3 and provide a quick set of commands
which help gain local access to the data in case there is a need for
manual intervention.
The patch also collects in the same place links/descriptions for all
formats we use in S3.
Fixes#22438
Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>
Closesscylladb/scylladb#24323
Our documentation docs/alternator/new-apis.md claims that Alternator TTL
does not work with tablets, due to issue #16567. However, we fixed that
issue in commit de96c28625. So let's drop
the outdated statement that it doesn't work.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#24427
Requests, together with their parameters, are added to the
topology_request tables and the queue of active global requests is
kept in topology state. Thy are processed one by one by the topology
state machine.
Fixes: #16822
Add system:table_creation_time tag with value - timestamp in milliseconds of creation table.
If the tag is present, it will used to fill creation timestamp value (when CreateTable or DescribeTable is called).
If the tag is missing, value 0 for timestamp will be substituted (in other words table was created on 1th january of 1970).
Update test to change how we make sure timestamp is actually used - we create two tables one after another and make sure their creation timestamp is in correct order.
Update tests, that work with tags to filter system tags out.
Fixes#5013Closesscylladb/scylladb#24007
This patch adds the new option in nodetool, patches the
load_new_ss_tables REST request with a new parameter and
skips the reshape step in refresh if this flag is passed.
Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>
Closesscylladb/scylladb#24409Fixes: #24365
This commit adds the upgrade guide from version 2025.1 to 2025.2.
Also, it removes the upgrade guides existing for the previous version
that are irrelevant in 2025.2 (upgrade from OSS 6.2 and Enterprise 2024.x).
Note that the new guide does not include the "Enable Consistent Topology Updates" page,
as users upgrading to 2025.2 have consistent topology updates already enabled.
Fixes https://github.com/scylladb/scylladb/issues/24133
Fixes https://github.com/scylladb/scylladb/issues/24265Closesscylladb/scylladb#24266
Create a custom pytest test collector for .cql files and
move CQL test execution logic from `CQLApprovalTest` class
and `pylib/cql_repl/cql_repl.py` file to `CqlTest.runtest()`
method.
In result, the only difference between CQLApproval and Python
suite types is suffixes of test files.
This change adds the --scope option to nodetool refresh.
Like in the case of nodetool restore, you can pass either of:
* node - On the local node.
* rack - On the local rack.
* dc - In the datacenter (DC) where the local node lives.
* all (default) - Everywhere across the cluster.
as scope.
The feature is based on the existing load_and_stream paths, so it
requires passing --load-and-stream to the refresh command.
Also, it is not compatible with the --primary-replica-only option.
Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>
Closesscylladb/scylladb#23861
Metadata id was introduced in CQLv5 to make metadata of prepared
statement metadata consistent between driver and database.
This commit introduces a protocol extension that allows to use the same
mechanism in CQLv4. As CQLv5 is currently unsupported in ScyllaDb (as well
as in some of the drivers), the motivation is to allow fixing https://github.com/scylladb/scylladb/issues/20860.
This change:
- Implement metadata::calculate_metadata_id()
- Implement SCYLLA_USE_METADATA_ID protocol extension for CQLv4
- Added description of SCYLLA_USE_METADATA_ID in documentation
- Add boost tests to confirm correctness of the function
- Add python tests for table metadata change corner-cases
Fixesscylladb/scylladb#20860
Also see related https://scylladb.atlassian.net/wiki/spaces/RND/pages/42238631/MetadataId+extension+in+CQLv4+Requirement+Document
No backport needed (unless specifically requested by a customer), because there are existing workarounds for the issue
Closesscylladb/scylladb#23292
* github.com:scylladb/scylladb:
test: add tests for prepared statement metadata consistency corner cases
transport: implement SCYLLA_USE_METADATA_ID support
cql3: implement metadata::calculate_metadata_id()
Currently, the `system.compaction_history` table miss information like the type of compaction (cleanup, major, resharding, etc), the sstable generations involved (in and out), shard's id the compaction was triggered on and statistics on purged tombstones to be collected during compaction.
The series extends the table with the following columns:
- "compaction_type" (text)
- "shard_id" (int)
- "sstables_in" (list<sstableinfo_type>)
- "sstables_out" (list<sstableinfo_type>)
- "total_tombstone_purge_attempt" (long)
- "total_tombstone_purge_failure_due_to_overlapping_with_memtable" (long)
- "total_tombstone_purge_failure_due_to_overlapping_with_uncompacting_sstable" (long)
with a user defined type `sstableinfo_type` that holds the information about sstable file
- generation (uuid)
- origin (text)
- size (long)
Additional statistics stored in the compaction_history have been incorporated in the API `/compaction_manager/compaction_history` and the `nodetool compactionhistory` command.
No backport is required. It extends the existing compaction history output.
Fixes https://github.com/scylladb/scylladb/issues/3791Closesscylladb/scylladb#21288
* github.com:scylladb/scylladb:
nodetool: Refactor of compactionhistory_operation
nodetool: Add more stats into compactionhistory output
api/compaction_manager: Extend compaction_history api
compaction: Collect tombstone purge stats during compaction
compacting_reader: Extend to accept tombstone purge statistics
mutation_compactor: Collect tombstone purge attempts
compaction_garbage_collector: Extend return type of max_purgeable_fn
compaction: Extend compaction_result to collect more information
system_keyspace: Upgrade compaction_history table
system_keyspace: Create UDT: sstableinfo_type
system_keyspace: Extract compaction_history struct
system_keyspace: Squeeze update_compaction_history parameters
compaction/compaction_manager: update_history accepts compaction_result as rvalue
Fix for https://github.com/scylladb/scylladb/pull/24097
The stable branch does not contain the split API reference yet. This change fixes the 404 error raised when accessing the API reference on the stable branch due to the redirect.
Closesscylladb/scylladb#24259
* github.com:scylladb/scylladb:
docs: fix typo
docs: remove API reference redirect
There is a difference how ScyllaDB and Cassandra handle conditional
batches with different IF statements (such as "IF EXISTS" and "IF NOT
EXISTS"). Cassandra tries to detect condition conflicts, and prints
an error instead of silently failing the batch, but in ScyllaDB
we considered this check to be inconsistent and unhelpful, and
decided not to implement it.
In this series, we extend the documentation of the ScyllaDB behaviour
by extending the documents and improving relevant LWT tests.
Fixes: https://github.com/scylladb/scylladb/issues/13011
Backport not needed, only docs and minor tests changes.
Closesscylladb/scylladb#24086
* github.com:scylladb/scylladb:
test: mark difference in handling IFs in LWT as scylla_only
docs: cql: add explicit explanation how mixing IFs works in LWT
docs: lwt: add two missing spaces
The stable branch does not contain the split API reference yet.
This change fixes the 404 error raised when accessing the API reference on the stable branch.
There is a difference how ScyllaDB and Cassandra handle conditional
batches with different IF statements (such as "IF EXISTS" and "IF NOT
EXISTS").
This commit explicitly documents the differences in the behavior.
Refs: #13011
Incorporate additional statistics stored in the compaction_history
system table. Depending on the requested format type, the output has
different form.
Remove unnecessary duplicated history_entry struct and instead use
extracted db::compaction_history_entry structure.
Running the cql command: select * from system.compaction_history;
prints sstable's generation type as UUID (e.g. 5a5cf800-b617-11ef-a97d-8438c36f0e31),
see generation_type::data_value() which is different than its fmt
format (e.g. 3glx_0srx_1pasg2ksepk902v8dt). Therefore, to unify
the outputs, generation_type is converted to data_value before
it is printed.
Starting with 2025.1, ScyllaDB versions are no longer called "Enterprise",
but the OS support page still uses that label.
This commit fixes that by replacing "Enterprise" with "ScyllaDB".
This update is required since we've removed "Enterprise" from everywhere else,
including the commands, so having it here is confusing.
Fixes https://github.com/scylladb/scylladb/issues/24179Closesscylladb/scylladb#24181
The non-streaming loading of sstables performs cleanup since recently [1]. For vnodes, unfortunately, cleanup is almost unavoidable, because of the nature of vnodes sharding, even if sstable is already clean. This leads to waste of IO and CPU for nothing. Skipping the cleanup in a smart way is possible, but requires too many changes in the code and in the on-disk data. However, the effort will not help existing SSTables and it's going to be obsoleted by tablets some time soon.
Said that, the easiest way to skip cleanup is the explicit --skip-cleanup option for nodetool and respective skip_cleanup parameter for API handler.
New feature, no backport
fixes#24136
refs #12422 [1]
Closesscylladb/scylladb#24139
* github.com:scylladb/scylladb:
nodetool: Add refresh --skip-cleanup option
api: Introduce skip_cleanup query parameter
distributed_loader: Don't create owned ranges if skip-cleanup is true
code: Push bool skip_cleanup flag around
Metadata id was introduced in CQLv5 to make metadata of prepared
statement consistent between driver and database. This commit introduces
a protocol extension that allows to use the same mechanism in CQLv4.
This change:
- Introduce SCYLLA_USE_METADATA_ID protocol extension for CQLv4
- Introduce METADATA_CHANGED flag in RESULT. The flag cames directly
from CQLv5 binary protocol. In CQLv4, the bit was never used, so we
assume it is safe to reuse it.
- Implement handling of metadata_id and METADATA_CHANGED in RESULT rows
- Implement returning metadata_id in RESULT prepared
- Implement reading metadata_id from EXECUTE
- Added description of SCYLLA_USE_METADATA_ID in documentation
Metadata_id is wrapped in cql_metadata_id_wrapper because we need to
distinguish the following situations:
- Metadata_id is not supported by the protocol (e.g. CQLv4 without the
extension is used)
- Metadata_id is supported by the protocol but not set - e.g. PREPARE
query is being handled: it doesn't contain metadata_id in the
request but the reply (RESULT prepared) must contain metadata_id
- Metadata_id is supported by the protocol and set, any number of
bytes >= 0 is allowed, according to the CQLv5 protocol specification
Fixesscylladb/scylladb#20860