Avi Kivity 5e4941a74b Merge '[Backport 2025.2] sstables/mx/writer: handle non-full prefix row keys' from Scylladb[bot]
Although valid for compact tables, non-full (or empty) clustering key prefixes are not handled for row keys when writing sstables. Only the present components are written; consequently, if the key is empty, it is omitted entirely.
When parsing sstables, the parsing code unconditionally parses a full prefix.
This mismatch results in parsing failures: the parser reads part of the row content as a key, which produces a garbage key and causes mis-parsing of the rest of the row content and possibly of subsequent partitions.

Introduce a new system table, `system.corrupt_data`, and infrastructure similar to `large_data_handler`: `corrupt_data_handler`, which abstracts how corrupt data is handled. The sstable writer now passes rows with such corrupt keys to the corrupt data handler. This way, we avoid corrupting the sstables beyond parsing, and the rows are also kept in `system.corrupt_data` for later inspection and possible recovery.

Add a full-stack test which checks that rows with bad keys are correctly handled.

Fixes: https://github.com/scylladb/scylladb/issues/24489

The bug is present in all versions, so the fix has to be backported to all supported versions.

- (cherry picked from commit 92b5fe8983)

- (cherry picked from commit 0753643606)

- (cherry picked from commit b0d5462440)

- (cherry picked from commit 093d4f8d69)

- (cherry picked from commit 678deece88)

- (cherry picked from commit 64f8500367)

- (cherry picked from commit b931145a26)

- (cherry picked from commit 3e1c50e9a7)

- (cherry picked from commit 46ff7f9c12)

- (cherry picked from commit ebd9420687)

- (cherry picked from commit aae212a87c)

- (cherry picked from commit 592ca789e2)

- (cherry picked from commit edc2906892)

Parent PR: #24492

Closes scylladb/scylladb#24744

* github.com:scylladb/scylladb:
  test/boost/sstable_datafile_test: add test for corrupt data
  sstables/mx/writer: handler rows with empty keys
  test/lib/cql_assertions: introduce columns_assertions
  sstables: add corrupt_data_handler to sstables::sstables
  tools/scylla-sstable: make large_data_handler a local
  db: introduce corrupt_data_handler
  mutation: introduce frozen_mutation_fragment_v2
  mutation/mutation_partition_view: read_{clustering,static}_row(): return row type
  mutation/mutation_partition_view: extract de-ser of {clustering,static} row
  idl-compiler.py: generate skip() definition for enums serializers
  idl: extract full_position.idl from position_in_partition.idl
  db/system_keyspace: add apply_mutation()
  db/system_keyspace: introduce the corrupt_data table
2025-07-01 12:27:01 +03:00

ScyllaDB Documentation

This repository contains the source files for ScyllaDB Open Source documentation.

  • The dev folder contains developer-oriented documentation related to the ScyllaDB code base. It is not published and is only available via GitHub.
  • All other folders and files contain user-oriented documentation related to ScyllaDB Open Source and are sources for opensource.docs.scylladb.com.

To report a documentation bug or suggest an improvement, open an issue in GitHub issues for this project.

To contribute to the documentation, open a GitHub pull request.

Key Guidelines for Contributors

To prevent the build from failing:

  • If you add a new file, ensure it's added to an appropriate toctree, for example:

     .. toctree::
        :maxdepth: 2
        :hidden:
    
        Page X </folder1/article1>
        Page Y </folder1/article2>
        Your New Page </folder1/your-new-article>
    
  • Make sure the link syntax is correct. See the guidelines on creating links

  • Make sure the section headings are correct. See the guidelines on creating headings. Note that the markup must be at least as long as the text in the heading. For example:

    ----------------------
    Prerequisites
    ----------------------
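
The first and third guidelines above can be checked mechanically before opening a pull request. The following is a minimal sketch, not part of the ScyllaDB docs tooling: the helper names `toctree_entries` and `short_heading_markup` are hypothetical, and the heading check makes simplifying assumptions (it only looks at the line below a title, and only recognizes markup made of at least three repetitions of one of a few common characters).

```python
import re

# Hypothetical helpers for illustration; they are not part of the docs build.

# A toctree entry is either "Title <target>" or a bare target without spaces.
ENTRY_RE = re.compile(r"^(?:.*<(?P<target>[^<>]+)>|(?P<bare>\S+))$")
# Assumption: heading markup is three or more repetitions of one of these characters.
ADORNMENT_RE = re.compile(r"^([=\-~^#*+])\1{2,}$")


def toctree_entries(text):
    """Return the document targets listed in every toctree directive in text."""
    entries = []
    in_toctree = False
    directive_indent = 0
    for line in text.splitlines():
        stripped = line.strip()
        indent = len(line) - len(line.lstrip())
        if stripped.startswith(".. toctree::"):
            in_toctree = True
            directive_indent = indent
            continue
        if not in_toctree or not stripped:
            continue  # outside a toctree, or a blank line inside one
        if indent <= directive_indent:
            in_toctree = False  # a dedented line ends the directive body
            continue
        if stripped.startswith(":"):
            continue  # directive option such as :maxdepth:
        match = ENTRY_RE.match(stripped)
        if match:
            entries.append(match.group("target") or match.group("bare"))
    return entries


def short_heading_markup(text):
    """Return (line_number, title) for titles whose underline is too short."""
    lines = text.splitlines()
    problems = []
    for i, line in enumerate(lines):
        title = line.strip()
        if not title or ADORNMENT_RE.match(title):
            continue  # skip blank lines and the markup lines themselves
        below = lines[i + 1].strip() if i + 1 < len(lines) else ""
        if ADORNMENT_RE.match(below) and len(below) < len(title):
            problems.append((i + 1, title))
    return problems
```

To use it, read an `.rst` file into a string, check that the path of a newly added page appears in the list returned by `toctree_entries` for the parent page, and check that `short_heading_markup` returns an empty list for the new page.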
    

Building User Documentation

Prerequisites

  • Python
  • poetry
  • make

See the ScyllaDB Sphinx Theme prerequisites to check which versions of the above are currently required.
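
As a quick preflight, the presence of these tools on `PATH` can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the function name `missing_tools` is hypothetical, the executable name `python3` is an assumption (on some systems the interpreter is only available as `python`), and the check does not verify versions, so the theme prerequisites still need to be consulted for those.

```python
import shutil

# Executable names as they are commonly invoked. "python3" is an assumption;
# adjust it if the interpreter has a different name on your system.
REQUIRED_TOOLS = ("python3", "poetry", "make")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("All prerequisites found on PATH.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is enough.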

Mac OS X

You must have a working Homebrew in order to install the needed tools.

You also need the standard utility make.

Check if you have these two items with the following commands:

brew help
make -h

Linux Distributions

Building the user docs should work out of the box on most Linux distributions.

Windows

Use "Bash on Ubuntu on Windows" for the same tools and capabilities as on Linux distributions.

Building the Docs

  1. Run make preview to build the documentation.
  2. Preview the built documentation locally at http://127.0.0.1:5500/.

Cleanup

You can clean up all the build products and the automatically installed Python dependencies with:

make pristine

Information for Contributors

If you are interested in contributing to Scylla docs, please read the Scylla open source page at http://www.scylladb.com/opensource/ and complete a Scylla contributor agreement if needed. We can only accept documentation pull requests if we have a contributor agreement on file for you.

Third-party Documentation

  • Do any copying as a separate commit. Always commit an unmodified version first and then do any editing in a separate commit.

  • We already have a copy of the Apache license in our tree, so you do not need to commit a copy of the license.

  • Include the copyright header from the source file in the edited version. If you are copying an Apache Cassandra document with no copyright header, use:

This document includes material from Apache Cassandra.
Apache Cassandra is Copyright 2009-2014 The Apache Software Foundation.