Replace the physical system.large_partitions, system.large_rows, and
system.large_cells CQL tables with virtual tables that read from
LargeDataRecords stored in SSTable scylla metadata (tag 13).
The transition is gated by a new LARGE_DATA_VIRTUAL_TABLES cluster
feature flag:
- Before the feature is enabled: the old physical tables remain in
all_tables(), CQL writes are active, no virtual tables are registered.
This ensures safe rollback during rolling upgrades.
- After the feature is enabled: old physical tables are dropped from
disk via legacy_drop_table_on_all_shards(), virtual tables are
registered on all shards, and CQL writes are skipped via
skip_cql_writes() in cql_table_large_data_handler.
Key implementation details:
- Three virtual table classes (large_partitions_virtual_table,
large_rows_virtual_table, large_cells_virtual_table) extend
streaming_virtual_table with cross-shard record collection.
- generate_legacy_id() gains a version parameter; virtual tables
use version 1 to get different UUIDs than the old physical tables.
- compaction_time is derived from SSTable generation UUID at display
time via UUID_gen::unix_timestamp().
- Legacy SSTables without LargeDataRecords emit synthetic summary
rows based on above_threshold > 0 in LargeDataStats.
- The activation logic uses two paths: when the feature is already
enabled (test env, restart), it runs as a coroutine; when not yet
enabled, it registers a when_enabled callback that runs inside
seastar::async from feature_service::enable().
- sstable_3_x_test updated to use a simplified large_data_test_handler
and validate LargeDataRecords in SSTable metadata directly.
The latter is recommended in seastar, and the former was left as
compatibility alias. Latest seastar explicitly marks it as deprecated so
once the submodule is updated, compilation logs will explode.
Most of the patch is generated with
for f in $(git grep -l '\<distributed<[A-Za-z0-9:_]*>') ; do sed -e 's/\<distributed<\([A-Za-z0-9:_]*\)>/sharded<\1>/g' -i $f; done
for f in $(git grep -l distributed.hh); do sed -e 's/distributed.hh/sharded.hh/' -i $f ; done
and a small manual change in test/perf/perf.hh
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#26136
Can be used to query per-node stats about load as seen by the load
balancer.
In particular, node's capacity will be used by tablet-mon.py to
scale tablet columns so that equal height is equal node utilization.
Virtual tables are kept in a thread_local registry for deduplication
purposes. The problem is that thread_local variables are destroyed late,
possibly after the schema registry and the reactor are destroyed.
Currently this isn't a problem, but after a seastar change to
destroy the reactor after termination [1], things break.
Fix by moving the registry to system_keyspace. system_keyspace was chosen
since it was the birthplace of virtual tables.
Pimpl is used to avoid increasing dependencies.
[1] 101b245ed7
This is a readability refactoring commit without observable changes
in behaviour.
initialize_virtual_tables logically belongs to virtual_tables module,
and it allows to make other functions in virtual_tables.cc
(register_virtual_tables, install_virtual_readers)
local to the module, which simplifies the matters a bit.
all_virtual_tables() is not needed anymore, all the references to
registered virtual tables are now local to virtual_tables module
and can just use virtual_tables variable directly.
The definitions of virtual tables make up approximately a quarter of the
huge system_keyspace.cc file (almost 4K lines), pulling in a lot of
headers only used by them.
Move them to a separate source file to make system_keyspace.cc easier
for humans and compilers to digest.
This patch also moves the `register_virtual_tables()`,
`install_virtual_readers()` as well as the `virtual_tables` global.
Closes#14308