Files
scylladb/docs/dev
Botond Dénes 88a8324e68 erge 'db: store large data records in SSTable metadata and serve via virtual tables' from Benny Halevy
`system.large_partitions`, `system.large_rows`, and `system.large_cells` store records keyed by SSTable name. When SSTables are migrated between shards or nodes (resharding, streaming, decommission), the records are lost because the destination never writes entries for the migrated SSTables.

This patch series moves the source of truth for large data records into the SSTable's scylla metadata component (new `LargeDataRecords` tag 13) and reimplements the three `system.large_*` tables as virtual tables that query live SSTables on demand. A cluster feature flag (`LARGE_DATA_VIRTUAL_TABLES`) gates the transition for safe rolling upgrades.

When the cluster feature is enabled, each node drops the old system large_* tables and starts serving the corresponding tables using virtual tables that represent the large data records now stored on the sstables.
Note that the virtual tables will be empty after upgrade until the sstables that contained large data are rewritten, therefore it is recommended to run upgrade sstables compaction or major compaction to repopulate the sstables scylla-metadata with large data records.

1. **keys: move key_to_str() to keys/keys.hh** — make the helper reusable across large_data_handler, virtual tables, and scylla-sstable
2. **sstables: add LargeDataRecords metadata type (tag 13)** — new struct with binary-serialized key fields, scylla-sstable JSON support, format documentation
3. **large_data_handler: rename partition_above_threshold to above_threshold_result** — generalize the struct for reuse
4. **large_data_handler: return above_threshold_result from maybe_record_large_cells** — separate booleans for cell size vs collection elements thresholds
5. **sstables: populate LargeDataRecords from writer** — bounded min-heaps (one per large_data_type), configurable top-N via `compaction_large_data_records_per_sstable`
6. **test: add LargeDataRecords round-trip unit tests** — verify write/read, top-N bounding, below-threshold behavior
7. **db: call initialize_virtual_tables from shard 0 only** — preparatory refactoring to enable cross-shard coordination
8. **db: implement large_data virtual tables with feature flag gating** — three virtual table classes, feature flag activation, legacy SSTable fallback, dual-threshold dedup, cross-shard collection

Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1276

* Although this fixes a bug where large data entries are effectively lost when sstables are renamed or migrated, the changes are intrusive and do not warrant a backport

Closes scylladb/scylladb#29257

* github.com:scylladb/scylladb:
  db: implement large_data virtual tables with feature flag gating
  db: call initialize_virtual_tables from shard 0 only
  test: add LargeDataRecords round-trip unit tests
  sstables: populate LargeDataRecords from writer
  large_data_handler: return above_threshold_result from maybe_record_large_cells
  large_data_handler: rename partition_above_threshold to above_threshold_result
  sstables: add LargeDataRecords metadata type (tag 13)
  sstables: add fmt::formatter for large_data_type
  keys: move key_to_str() to keys/keys.hh
2026-04-16 14:03:31 +03:00
..
2025-02-11 00:17:43 +02:00
2025-02-11 00:17:43 +02:00
2026-04-09 13:08:02 +02:00
2025-02-11 00:17:43 +02:00
2025-01-09 10:40:47 +00:00
2024-02-13 17:16:15 +02:00
2025-02-13 01:54:08 +02:00
2025-02-11 00:17:43 +02:00
2025-02-11 00:17:43 +02:00
2025-09-12 15:58:19 +03:00

Scylla developer documentation

This folder contains developer-oriented documentation concerning the ScyllaDB codebase. We also have a wiki, which contains additional developer-oriented documentation. There is currently no clear definition of what goes where, so when looking for something be sure to check both.

Seastar documentation can be found here.

User documentation can be found on docs.scylladb.com

For information on how to build Scylla and how to contribute visit HACKING.md and CONTRIBUTING.md.

Index

Module list and dependencies

Repository layout and short summary of components