System Keyspace Layout
This section describes the layouts and usage of system.* tables.
Table of Contents
- System Table Categories
- system.IndexInfo
- system.batchlog
- system.batchlog_v2
- system.built_views
- system.cdc_generations_v3
- system.cdc_local
- system.cdc_streams
- system.cdc_streams_history
- system.cdc_streams_state
- system.client_routes
- system.clients
- system.cluster_status
- system.commitlog_cleanups
- system.compaction_history
- system.config
- system.corrupt_data
- system.dicts
- system.discovery
- system.group0_history
- The system.large_* tables
- system.large_cells
- system.large_partitions
- system.large_rows
- system.load_per_node
- system.local
- system.paxos
- system.peers
- system.protocol_servers
- system.raft
- system.raft_snapshot_config
- system.raft_snapshots
- system.raft_state
- system.repair_history
- system.repair_tasks
- system.role_attributes
- system.role_members
- system.role_permissions
- system.roles
- system.runtime_info
- system.scylla_local
- system.scylla_views_builds_in_progress
- system.service_levels_v2
- system.size_estimates
- system.snapshots
- system.sstables
- system.sstables_registry
- system.tablet_sizes
- system.tablets
- system.token_ring
- system.topology
- system.topology_requests
- system.truncated
- system.versions
- system.view_build_status_v2
- system.view_building_tasks
- system.views_builds_in_progress
System Table Categories
system.IndexInfo
Legacy table used for secondary index metadata. This table exists for Cassandra compatibility but is largely unused in modern ScyllaDB versions.
Schema:
CREATE TABLE system."IndexInfo" (
table_name text,
index_name text,
PRIMARY KEY (table_name, index_name)
) WITH CLUSTERING ORDER BY (index_name ASC);
Note: This table name must be quoted in CQL because it contains uppercase letters.
system.batchlog
Stores batches that have been written locally but not yet replicated to other nodes. Used to ensure batch atomicity across replicas.
Schema:
CREATE TABLE system.batchlog (
id uuid PRIMARY KEY,
data blob,
version int,
written_at timestamp
);
Columns:
- id: Unique identifier for the batch
- data: Serialized batch mutations
- version: Protocol version used to serialize the batch
- written_at: Timestamp when the batch was written locally
system.batchlog_v2
Stores batches that have been written locally but not yet replicated to other nodes. This is the newer version of batchlog with improved clustering and sharding. Used to ensure batch atomicity across replicas.
Schema:
CREATE TABLE system.batchlog_v2 (
version int,
stage tinyint,
shard int,
written_at timestamp,
id uuid,
data blob,
PRIMARY KEY ((version, stage, shard), written_at, id)
) WITH CLUSTERING ORDER BY (written_at ASC, id ASC);
Columns:
- version: Protocol version (part of composite partition key)
- stage: Batch processing stage (part of composite partition key)
- shard: Shard number (part of composite partition key)
- written_at: Timestamp when the batch was written locally (part of composite clustering key)
- id: Unique identifier for the batch (part of composite clustering key)
- data: Serialized batch mutations
Related tables: batchlog
system.built_views
Tracks which materialized views have completed their initial build process.
Schema:
CREATE TABLE system.built_views (
keyspace_name text,
view_name text,
PRIMARY KEY (keyspace_name, view_name)
) WITH CLUSTERING ORDER BY (view_name ASC);
Columns:
- keyspace_name: Name of the keyspace containing the view
- view_name: Name of the materialized view
Related tables: views_builds_in_progress, scylla_views_builds_in_progress, view_build_status_v2
system.cdc_generations_v3
Stores CDC generation data. Each CDC generation defines which streams are active during a particular time period. This is the current version of CDC generation storage, replacing earlier versions.
Schema:
CREATE TABLE system.cdc_generations_v3 (
key text,
id timeuuid,
range_end bigint,
streams frozen<set<blob>>,
ignore_msb tinyint,
PRIMARY KEY (key, id, range_end)
) WITH CLUSTERING ORDER BY (id ASC, range_end ASC);
Columns:
- key: Partition key (typically "cdc_generations")
- id: Unique identifier for this CDC generation
- range_end: End of the token range described by this row
- streams: Set of stream IDs mapped to this token range
- ignore_msb: Number of most significant bits to ignore in token-to-shard mapping (sharding parameter of the node that owned this range when the generation was created)
How it works:
This is a single-partition table. Each generation describes a mapping from all tokens in the token ring to stream IDs. The mapping is built from smaller range-local mappings, where each row describes how tokens in a subrange map to stream IDs. The range_end is the end of the range for this row; the start is the range_end of the previous row (or wraps around for the first row with a given id).
Related tables: cdc_local, cdc_streams, system_distributed.cdc_timestamps
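Example usage. The mapping can be inspected directly; this illustrative query assumes the conventional "cdc_generations" partition key noted above:

```cql
-- List each generation's token-range rows in clustering order
-- (id ASC, range_end ASC); the start of each range is the
-- range_end of the previous row with the same id.
SELECT id, range_end, ignore_msb
FROM system.cdc_generations_v3
WHERE key = 'cdc_generations';
```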
system.cdc_local
Stores local CDC state information for this node.
Schema:
CREATE TABLE system.cdc_local (
key text PRIMARY KEY,
streams_timestamp timestamp,
uuid uuid
);
Columns:
- key: Always set to "cdc_local"
- streams_timestamp: Timestamp of the currently active streams generation on this node
- uuid: UUID of the CDC generation (may be null for older generations that only have a timestamp)
Related tables: cdc_generations_v3, cdc_streams
system.cdc_streams
Virtual table containing CDC stream mappings for tablet-based tables. Each stream corresponds to a tablet and is used for distributing CDC log entries.
Schema:
CREATE TABLE system.cdc_streams (
keyspace_name text,
table_name text,
timestamp timestamp,
stream_state tinyint,
stream_id blob,
PRIMARY KEY ((keyspace_name, table_name), timestamp, stream_state, stream_id)
) WITH CLUSTERING ORDER BY (timestamp ASC, stream_state ASC, stream_id ASC);
Columns:
- keyspace_name: Name of the keyspace
- table_name: Name of the table
- timestamp: Timestamp identifying the CDC generation
- stream_state: State of the stream (0=current, 1=closed, 2=opened)
- stream_id: CDC stream identifier
Related tables: cdc_streams_history, cdc_streams_state, cdc_generations_v3
system.cdc_streams_history
Historical record of CDC stream configurations for tablet-based tables, described as differences from the previous state.
Schema:
CREATE TABLE system.cdc_streams_history (
table_id uuid,
timestamp timestamp,
stream_state tinyint,
last_token bigint,
stream_id blob,
PRIMARY KEY (table_id, timestamp, stream_state, last_token)
) WITH CLUSTERING ORDER BY (timestamp ASC, stream_state ASC, last_token ASC);
Columns:
- table_id: UUID of the table
- timestamp: Timestamp of this stream configuration change
- stream_state: State of the stream change (e.g., added, removed)
- last_token: Last token of the tablet
- stream_id: CDC stream identifier for this tablet
Related tables: cdc_streams
system.cdc_streams_state
Tracks the state of CDC streams for tablet-based tables. This is the oldest CDC stream set for tablets.
Schema:
CREATE TABLE system.cdc_streams_state (
table_id uuid,
last_token bigint,
stream_id blob,
timestamp timestamp STATIC,
PRIMARY KEY (table_id, last_token)
) WITH CLUSTERING ORDER BY (last_token ASC);
Columns:
- table_id: UUID of the table
- last_token: Last token of the tablet
- stream_id: CDC stream identifier for this tablet
- timestamp: (static) Timestamp of the CDC stream set
Related tables: cdc_streams, cdc_generations_v3
Note: CDC timestamp tracking is handled by the system_distributed.cdc_timestamps table, not in the system keyspace.
system.client_routes
Stores client connection routing information for each connection ID.
Schema:
CREATE TABLE system.client_routes (
connection_id text,
host_id uuid,
address text,
port int,
tls_port int,
alternator_port int,
alternator_https_port int,
PRIMARY KEY (connection_id, host_id)
) WITH CLUSTERING ORDER BY (host_id ASC);
Columns:
- connection_id: Unique identifier for the connection
- host_id: UUID of the host node (clustering key)
- address: IP address of the host
- port: CQL port number
- tls_port: TLS-enabled CQL port number
- alternator_port: Alternator (DynamoDB-compatible) port number
- alternator_https_port: HTTPS port for Alternator
system.clients
Virtual table that provides information about currently connected CQL clients.
Schema:
CREATE TABLE system.clients (
address inet,
port int,
client_type text,
connection_stage text,
driver_name text,
driver_version text,
hostname text,
protocol_version int,
shard_id int,
ssl_cipher_suite text,
ssl_enabled boolean,
ssl_protocol text,
username text,
scheduling_group text,
client_options map<text, text>,
PRIMARY KEY (address, port, client_type)
) WITH CLUSTERING ORDER BY (port ASC, client_type ASC);
Columns:
- address: Client IP address
- port: Client connection port
- client_type: Type of client connection (currently only CQL clients are tracked)
- connection_stage: Current stage of the connection
- driver_name: Name of the client driver being used
- driver_version: Version of the client driver
- hostname: Client hostname if available
- protocol_version: CQL native protocol version used
- shard_id: ScyllaDB shard handling this connection
- ssl_cipher_suite: SSL cipher suite if SSL is enabled
- ssl_enabled: Whether SSL/TLS is enabled for this connection
- ssl_protocol: SSL/TLS protocol version
- username: Authenticated username
- scheduling_group: Workload prioritization group
- client_options: Map of additional client options and configuration parameters. Stores all data sent by the client in the STARTUP request. Useful for debugging and monitoring purposes
Note: This is a virtual table. In version 4.5 and earlier it was persisted to disk (in the data directory).
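Example usage. An illustrative query for inspecting live connections:

```cql
-- Show who is connected, with which driver, and on which shard
SELECT address, port, driver_name, driver_version, username, shard_id
FROM system.clients;
```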
system.cluster_status
Virtual table containing information about the status of each endpoint in the cluster. Equivalent to the nodetool status command.
Schema:
CREATE TABLE system.cluster_status (
peer inet PRIMARY KEY,
dc text,
up boolean,
status text,
load text,
tokens int,
owns float,
host_id uuid
);
Columns:
- peer: IP address of the node
- dc: Data center name
- up: Whether the node is currently up
- status: Node status (e.g., UN for Up/Normal)
- load: Amount of data stored on the node
- tokens: Number of tokens owned
- owns: Percentage of the ring owned by this node
- host_id: Unique identifier for the node
Implemented by cluster_status_table in db/system_keyspace.cc.
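Example usage. An illustrative query mirroring nodetool status output:

```cql
-- Per-node status, data center, load, and ring ownership
SELECT peer, dc, up, status, load, owns, host_id
FROM system.cluster_status;
```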
system.commitlog_cleanups
Tracks commitlog cleanup operations, used to ensure durability guarantees are met.
Schema:
CREATE TABLE system.commitlog_cleanups (
shard int,
position frozen<tuple<bigint, int>>,
table_uuid uuid,
start_token_exclusive bigint,
end_token_inclusive bigint,
PRIMARY KEY (shard, position, table_uuid, start_token_exclusive, end_token_inclusive)
) WITH CLUSTERING ORDER BY (position ASC, table_uuid ASC, start_token_exclusive ASC, end_token_inclusive ASC);
Columns:
- shard: The shard (CPU) that performed the cleanup
- position: Position in the commitlog (tuple of segment ID and position within segment)
- table_uuid: UUID of the table being cleaned up
- start_token_exclusive: Start of the token range (exclusive)
- end_token_inclusive: End of the token range (inclusive)
system.compaction_history
Records historical information about compaction operations.
Schema:
CREATE TABLE system.compaction_history (
id uuid PRIMARY KEY,
bytes_in bigint,
bytes_out bigint,
columnfamily_name text,
started_at timestamp,
compacted_at timestamp,
compaction_type text,
keyspace_name text,
rows_merged map<int, bigint>,
shard_id int,
sstables_in list<frozen<sstableinfo>>,
sstables_out list<frozen<sstableinfo>>,
total_tombstone_purge_attempt bigint,
total_tombstone_purge_failure_due_to_overlapping_with_memtable bigint,
total_tombstone_purge_failure_due_to_overlapping_with_uncompacting_sstable bigint
) WITH default_time_to_live = 604800;
User-Defined Type:
CREATE TYPE system.sstableinfo (
generation int,
level int,
size bigint,
version text
);
Columns:
- id: Unique identifier for the compaction
- bytes_in: Total bytes read during compaction
- bytes_out: Total bytes written during compaction
- columnfamily_name: Table that was compacted
- started_at: Timestamp when compaction started
- compacted_at: Timestamp when compaction completed
- compaction_type: Type of compaction (e.g., COMPACTION, MAJOR, RESHAPE)
- keyspace_name: Keyspace containing the compacted table
- rows_merged: Map showing how many rows were merged at each level
- shard_id: Shard that performed the compaction
- sstables_in: List of input SSTables
- sstables_out: List of output SSTables
- total_tombstone_purge_attempt: Number of tombstones attempted to be purged
- total_tombstone_purge_failure_due_to_overlapping_with_memtable: Tombstones not purged due to memtable overlap
- total_tombstone_purge_failure_due_to_overlapping_with_uncompacting_sstable: Tombstones not purged due to SSTable overlap
Note: Entries in this table have a TTL of 7 days (604800 seconds).
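Example usage. An illustrative query; note that filtering on keyspace_name requires ALLOW FILTERING because it is not part of the primary key:

```cql
-- Recent compaction activity for one keyspace
SELECT keyspace_name, columnfamily_name, compaction_type, compacted_at,
       bytes_in, bytes_out
FROM system.compaction_history
WHERE keyspace_name = 'ks1' ALLOW FILTERING;
```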
system.config
Virtual table holding all configuration variables currently in use. Values can be modified at runtime using the UPDATE statement for configuration options that support live updates. Updates are not durable. Restarting the node will reset options to their previous values unless the physical configuration file has also been updated.
Schema:
CREATE TABLE system.config (
name text PRIMARY KEY,
source text,
type text,
value text
);
Columns:
- name: Name of the configuration parameter
- source: Origin of the configuration value, one of:
  - default: The default value (not explicitly configured)
  - config: Set via configuration file
  - cli: Set via command-line argument
  - cql: Set via updating this table
  - internal: Deliberately configured by ScyllaDB internals
- type: Data type of the configuration parameter (e.g., string, bool, integer)
- value: Current value in JSON format (as it would appear in the config file)
Usage:
-- View a specific setting
SELECT * FROM system.config WHERE name = 'compaction_throughput_mb_per_sec';
-- Update a live-updateable setting
UPDATE system.config SET value = '64' WHERE name = 'compaction_throughput_mb_per_sec';
Related tables: versions
system.corrupt_data
Stores data found to be corrupt during internal operations. This data cannot be written to SSTables because it would be spread around by repair and compaction and might cause failures in SSTable parsing. The table preserves corrupt data so it can be inspected and possibly restored by the database operator.
Schema:
CREATE TABLE system.corrupt_data (
keyspace_name text,
table_name text,
id timeuuid,
partition_key blob,
clustering_key text,
mutation_fragment_kind text,
frozen_mutation_fragment blob,
origin text,
sstable_name text,
PRIMARY KEY ((keyspace_name, table_name), id)
) WITH CLUSTERING ORDER BY (id ASC)
AND gc_grace_seconds = 0;
Columns:
- keyspace_name: Keyspace name of the source table
- table_name: Table name of the source table
- id: Unique identifier assigned when the corrupt data entry is created
- partition_key: Partition key of the partition in the source table (can be incomplete or null due to corruption)
- clustering_key: Clustering key of the mutation fragment (can be null for some mutation fragment kinds, or incomplete due to corruption)
- mutation_fragment_kind: Kind of the mutation fragment, one of: partition start, partition end, static row, clustering row, range tombstone change. Only clustering row and range tombstone change can have clustering_key set
- frozen_mutation_fragment: The serialized mutation fragment itself
- origin: The name of the process that found the corruption (e.g., sstable-writer)
- sstable_name: The name of the SSTable that contains the corrupt data, if known (the SSTable may have been compacted or deleted)
system.dicts
Stores shared compression dictionaries used for inter-node traffic compression when internode_compression_enable_advanced is enabled.
Schema:
CREATE TABLE system.dicts (
name text PRIMARY KEY,
timestamp timestamp,
origin uuid,
data blob
);
Columns:
- name: Unique name/identifier for the dictionary
- timestamp: When the dictionary was created
- origin: UUID of the node that created the dictionary
- data: Compressed dictionary data
Note: Dictionary training can leak unencrypted data to disk. See internode_compression_train_dictionaries configuration option.
system.discovery
Used during cluster discovery to find peer nodes. This table is part of the Raft-based topology management.
Schema:
CREATE TABLE system.discovery (
key text,
ip_addr inet,
raft_server_id uuid,
PRIMARY KEY (key, ip_addr)
) WITH CLUSTERING ORDER BY (ip_addr ASC);
Columns:
- key: Partition key (typically "peers")
- ip_addr: IP address of a discovered peer (clustering column)
- raft_server_id: The ID of the group 0 Raft server on that peer (may be UUID 0 if unknown during discovery)
Related tables: topology
system.group0_history
Records the history of Raft Group 0 state changes. Group 0 is the central Raft group used for topology and schema coordination.
Schema:
CREATE TABLE system.group0_history (
key text,
state_id timeuuid,
description text,
PRIMARY KEY (key, state_id)
) WITH CLUSTERING ORDER BY (state_id DESC);
Columns:
- key: Partition key (typically "history")
- state_id: Identifier for this state change (in descending order)
- description: Human-readable description of the state change
Related tables: raft, raft_state
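Example usage. Because state_id is clustered in descending order, a LIMIT returns the most recent changes first (query is illustrative):

```cql
-- Ten most recent group 0 state changes
SELECT state_id, description
FROM system.group0_history
WHERE key = 'history'
LIMIT 10;
```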
The system.large_* tables
ScyllaDB performs better when partitions, rows, and cells are not too large. To help diagnose cases where these grow too large, ScyllaDB keeps three tables that record large partitions (including those with too many rows), rows, and cells, respectively.
The meaning of an entry in each of these tables is similar. It means that there is a particular SSTable with a large partition, row, cell, or a partition with too many rows. In particular, this implies that:
- There is no entry until compaction aggregates enough data in a single SSTable.
- The entry stays around until the SSTable is deleted.
In addition, the entries also have a TTL of 30 days.
system.large_cells
Tracks large cells detected in SSTables. Used to help diagnose performance issues caused by oversized cells.
Schema:
CREATE TABLE system.large_cells (
keyspace_name text,
table_name text,
sstable_name text,
cell_size bigint,
partition_key text,
clustering_key text,
column_name text,
compaction_time timestamp,
collection_elements bigint,
PRIMARY KEY ((keyspace_name, table_name), sstable_name, cell_size, partition_key, clustering_key, column_name)
) WITH CLUSTERING ORDER BY (sstable_name ASC, cell_size DESC, partition_key ASC, clustering_key ASC, column_name ASC);
Columns:
- keyspace_name: Keyspace containing the large cell
- table_name: Table containing the large cell
- sstable_name: SSTable where the cell was detected
- cell_size: Size of the cell in bytes
- partition_key: Partition key of the row
- clustering_key: Clustering key of the row
- column_name: Name of the column containing the large cell
- compaction_time: When the large cell was detected during compaction
- collection_elements: Number of elements in the collection (if the cell contains a collection type)
Note: A collection is just one cell. There is no information about the size of each collection element. This table is currently only used with the MC SSTable format (issue #4868). Entries have a TTL of 30 days.
Related tables: large_partitions, large_rows
Example usage
-- View all large cells
SELECT * FROM system.large_cells;
-- View large cells for a specific table
SELECT * FROM system.large_cells WHERE keyspace_name = 'ks1' AND table_name = 'standard1';
system.large_partitions
Tracks large partitions detected in SSTables, including partitions with too many rows. ScyllaDB performs better if partitions are not too large.
Schema:
CREATE TABLE system.large_partitions (
keyspace_name text,
table_name text,
sstable_name text,
partition_size bigint,
partition_key text,
range_tombstones bigint,
dead_rows bigint,
rows bigint,
compaction_time timestamp,
PRIMARY KEY ((keyspace_name, table_name), sstable_name, partition_size, partition_key)
) WITH CLUSTERING ORDER BY (sstable_name ASC, partition_size DESC, partition_key ASC);
Columns:
- keyspace_name: Keyspace containing the large partition
- table_name: Table containing the large partition
- sstable_name: SSTable where the partition was detected
- partition_size: Size of the partition in bytes
- partition_key: The partition key value
- range_tombstones: Number of range tombstones in the partition
- dead_rows: Number of tombstoned (deleted) rows
- rows: Total number of rows in the partition
- compaction_time: When the large partition was detected during compaction
Important notes:
- There is no entry until compaction aggregates enough data in a single SSTable
- The entry stays around until the SSTable is deleted
- Entries have a TTL of 30 days
Related tables: large_rows, large_cells
Example usage
-- View all large partitions
SELECT * FROM system.large_partitions;
-- View large partitions for a specific table
SELECT * FROM system.large_partitions WHERE keyspace_name = 'ks1' AND table_name = 'standard1';
system.large_rows
Tracks large clustering rows and static rows detected in SSTables.
Schema:
CREATE TABLE system.large_rows (
keyspace_name text,
table_name text,
sstable_name text,
row_size bigint,
partition_key text,
clustering_key text,
compaction_time timestamp,
PRIMARY KEY ((keyspace_name, table_name), sstable_name, row_size, partition_key, clustering_key)
) WITH CLUSTERING ORDER BY (sstable_name ASC, row_size DESC, partition_key ASC, clustering_key ASC);
Columns:
- keyspace_name: Keyspace containing the large row
- table_name: Table containing the large row
- sstable_name: SSTable where the row was detected
- row_size: Size of the row in bytes
- partition_key: Partition key of the row
- clustering_key: Clustering key of the row (empty for static rows)
- compaction_time: When the large row was detected during compaction
Note: This table is currently only used with the MC SSTable format (issue #4868). Entries have a TTL of 30 days.
Related tables: large_partitions, large_cells
Example usage
-- View all large rows
SELECT * FROM system.large_rows;
-- View large rows for a specific table
SELECT * FROM system.large_rows WHERE keyspace_name = 'ks1' AND table_name = 'standard1';
system.load_per_node
Virtual table containing information about the current tablet load with node granularity. It can be queried on any node, but the data comes from the group0 leader. Reads wait for a group0 leader to be elected and for load balancer stats to become available.
Schema:
CREATE TABLE system.load_per_node (
node uuid PRIMARY KEY,
dc text,
effective_capacity bigint,
rack text,
ip inet,
up boolean,
tablets_allocated bigint,
tablets_allocated_per_shard double,
storage_capacity bigint,
storage_allocated_load bigint,
storage_allocated_utilization double,
storage_load bigint,
storage_utilization double
);
Columns:
- node: Host ID of the node
- dc: Data center name
- effective_capacity: Sum of available disk space and tablet sizes on a node. Used to compute load on a node for size-based balancing
- rack: Rack name
- ip: IP address of the node
- up: Whether the node is alive according to the gossiper
- tablets_allocated: Number of tablet replicas on the node (migrating tablets are counted as if the migration had already finished)
- tablets_allocated_per_shard: tablets_allocated divided by the shard count on the node
- storage_capacity: Total disk capacity in bytes (by default equals file system capacity)
- storage_allocated_load: Estimated disk space for tablets based on the configured target_tablet_size (not actual current size)
- storage_allocated_utilization: Fraction of the node's disk capacity used for storage_allocated_load (1.0 means full utilization)
- storage_load: Disk space allocated for tablets, computed with actual tablet sizes. Can be null if some tablet sizes are not known
- storage_utilization: Fraction of the node's disk capacity taken for storage_load (with actual tablet sizes), where 1.0 means full utilization. Can be null if some tablet sizes are not known
Related tables: tablets, tablet_sizes
system.local
Stores information about the local node. This is one of the most important system tables, containing essential cluster and node metadata.
Schema:
CREATE TABLE system.local (
key text PRIMARY KEY,
bootstrapped text,
broadcast_address inet,
cluster_name text,
cql_version text,
data_center text,
gossip_generation int,
host_id uuid,
listen_address inet,
native_protocol_version text,
partitioner text,
rack text,
release_version text,
rpc_address inet,
schema_version uuid,
supported_features text,
tokens set<text>,
truncated_at map<uuid, blob>
);
Key Columns:
- key: Always set to "local"
- bootstrapped: Bootstrap status (COMPLETED, IN_PROGRESS, NEEDS_BOOTSTRAP)
- broadcast_address: Address this node broadcasts to other nodes
- cluster_name: Name of the cluster
- cql_version: CQL version supported
- data_center: Data center this node belongs to
- host_id: Unique identifier for this node
- native_protocol_version: Native CQL protocol version
- partitioner: Partitioner class being used
- rack: Rack this node belongs to
- release_version: ScyllaDB release version
- schema_version: Current schema version UUID
- tokens: Set of tokens owned by this node
- truncated_at: Legacy truncation records (see truncated for current storage)
Related tables: peers, scylla_local
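Example usage. An illustrative query for the node's basic identity:

```cql
-- Cluster membership details of the local node
SELECT cluster_name, host_id, data_center, rack, release_version
FROM system.local
WHERE key = 'local';
```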
system.paxos
Stores state for Paxos-based lightweight transactions (LWT). Each row represents the Paxos state for a particular partition in a user table.
Schema:
CREATE TABLE system.paxos (
row_key blob,
cf_id uuid,
promise timeuuid,
most_recent_commit blob,
most_recent_commit_at timeuuid,
proposal blob,
proposal_ballot timeuuid,
PRIMARY KEY (row_key, cf_id)
) WITH CLUSTERING ORDER BY (cf_id ASC);
Columns:
- row_key: Partition key of the user table row
- cf_id: UUID of the user table
- promise: Ballot of the most recent promise (previously named in_progress_ballot in older versions)
- most_recent_commit: The most recently committed value (serialized frozen_mutation)
- most_recent_commit_at: Ballot of the most recent commit
- proposal: The current proposal value (serialized frozen_mutation)
- proposal_ballot: Ballot of the current proposal
system.peers
Stores information about other nodes in the cluster as seen by this node.
Schema:
CREATE TABLE system.peers (
peer inet PRIMARY KEY,
data_center text,
host_id uuid,
preferred_ip inet,
rack text,
release_version text,
rpc_address inet,
schema_version uuid,
supported_features text,
tokens set<text>
);
Columns:
- peer: IP address of the peer node (primary key)
- data_center: Data center of the peer
- host_id: Unique identifier for the peer node
- preferred_ip: Preferred IP address for inter-node communication
- rack: Rack of the peer
- release_version: ScyllaDB version running on the peer
- rpc_address: Address for client RPC connections
- schema_version: Schema version on the peer
- supported_features: Features supported by this peer
- tokens: Set of tokens owned by this peer
Related tables: local
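Example usage. An illustrative query summarizing the rest of the cluster as seen from this node:

```cql
-- One row per peer node known to this node
SELECT peer, data_center, rack, host_id, release_version
FROM system.peers;
```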
system.protocol_servers
Virtual table listing all client-facing data-plane protocol servers and their listen addresses. Equivalent of nodetool statusbinary plus the "Native Transport active" fields from nodetool info.
Schema:
CREATE TABLE system.protocol_servers (
name text PRIMARY KEY,
listen_addresses frozen<list<text>>,
protocol text,
protocol_version text
);
Columns:
- name: Name/alias of the server (e.g., "native" for the CQL server)
- listen_addresses: List of addresses this server listens on
- protocol: Name of the protocol this server serves (e.g., "cql")
- protocol_version: Version of the protocol this server understands
Implemented by protocol_servers_table in db/system_keyspace.cc.
system.raft
Core Raft consensus state table. Holds information about Raft groups including the log entries.
Schema:
CREATE TABLE system.raft (
group_id timeuuid,
index bigint,
term bigint,
data blob,
vote_term bigint STATIC,
vote uuid STATIC,
snapshot_id uuid STATIC,
commit_idx bigint STATIC,
PRIMARY KEY (group_id, index)
) WITH CLUSTERING ORDER BY (index ASC);
Columns:
- group_id: Identifier for the Raft group
- index: Log entry index
- term: Raft term for this log entry
- data: Serialized log entry data
- vote_term: (static) Term of the last vote
- vote: (static) Server ID that received the vote
- snapshot_id: (static) ID of the current snapshot
- commit_idx: (static) Index of the last committed entry
Related tables: raft_state, raft_snapshots, raft_snapshot_config, group0_history
system.raft_snapshot_config
Stores configuration for Raft snapshots.
Schema:
CREATE TABLE system.raft_snapshot_config (
group_id timeuuid,
disposition text,
server_id uuid,
can_vote boolean,
PRIMARY KEY (group_id, disposition, server_id)
) WITH CLUSTERING ORDER BY (disposition ASC, server_id ASC);
Columns:
- group_id: Identifier for the Raft group
- disposition: Configuration disposition (can be CURRENT or PREVIOUS)
- server_id: ID of the server in the snapshot configuration
- can_vote: Whether this server can vote in the Raft group
Related tables: raft, raft_snapshots
system.raft_snapshots
Stores Raft snapshot data.
Schema:
CREATE TABLE system.raft_snapshots (
group_id timeuuid PRIMARY KEY,
snapshot_id uuid,
idx bigint,
term bigint
);
Columns:
- group_id: Identifier for the Raft group
- snapshot_id: Snapshot identifier
- idx: Log index of the snapshot
- term: Raft term of the snapshot
Related tables: raft, raft_snapshot_config
system.raft_state
Tracks the state of Raft groups.
Schema:
CREATE TABLE system.raft_state (
group_id timeuuid,
disposition text,
server_id uuid,
can_vote boolean,
PRIMARY KEY (group_id, disposition, server_id)
) WITH CLUSTERING ORDER BY (disposition ASC, server_id ASC);
Columns:
- group_id: Identifier for the Raft group
- disposition: Current disposition/state of the configuration ("CURRENT" or "PREVIOUS")
- server_id: ID of the server in this Raft group
- can_vote: Whether this server can vote in this configuration
Related tables: raft, group0_history
system.repair_history
Records the history of repair operations.
Schema:
CREATE TABLE system.repair_history (
table_uuid uuid,
repair_time timestamp,
repair_uuid uuid,
range_start bigint,
range_end bigint,
keyspace_name text STATIC,
table_name text STATIC,
PRIMARY KEY (table_uuid, repair_time, repair_uuid, range_start, range_end)
) WITH CLUSTERING ORDER BY (repair_time DESC, repair_uuid ASC, range_start ASC, range_end ASC);
Columns:
- table_uuid: UUID of the repaired table
- repair_time: When the repair was completed
- repair_uuid: Unique identifier for the repair operation
- range_start: Start of the repaired token range (as bigint)
- range_end: End of the repaired token range (as bigint)
- keyspace_name: (static) Keyspace containing the repaired table
- table_name: (static) Name of the repaired table
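Example usage. Since repair_time is clustered in descending order, the newest repair of a table comes first; <table-uuid> below is a placeholder for an actual table UUID:

```cql
-- Most recent repair recorded for a given table
SELECT repair_time, repair_uuid, range_start, range_end
FROM system.repair_history
WHERE table_uuid = <table-uuid>
LIMIT 1;
```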
system.repair_tasks
Records tablet repair tasks and their progress.
Schema:
CREATE TABLE system.repair_tasks (
task_uuid uuid,
operation text,
first_token bigint,
last_token bigint,
timestamp timestamp,
table_uuid uuid STATIC,
PRIMARY KEY (task_uuid, operation, first_token, last_token)
) WITH CLUSTERING ORDER BY (operation ASC, first_token ASC, last_token ASC);
Columns:
- task_uuid: Unique identifier for the repair task
- operation: Type of repair operation (clustering key)
- first_token: First token of the tablet (clustering key)
- last_token: Last token of the tablet (clustering key)
- timestamp: When the repair task was recorded
- table_uuid: (static) UUID of the table being repaired
system.role_attributes
Stores additional attributes for roles (e.g., service level associations).
Schema:
CREATE TABLE system.role_attributes (
role text,
name text,
value text,
PRIMARY KEY (role, name)
) WITH CLUSTERING ORDER BY (name ASC);
Columns:
- role: Name of the role
- name: Attribute name
- value: Attribute value
Related tables: roles, role_members, role_permissions
system.role_members
Stores role membership information (which roles are members of other roles).
Schema:
CREATE TABLE system.role_members (
role text,
member text,
PRIMARY KEY (role, member)
) WITH CLUSTERING ORDER BY (member ASC);
Columns:
- role: Name of the parent role
- member: Name of the member role
Related tables: roles, role_attributes, role_permissions
system.role_permissions
Stores permissions granted to roles.
Schema:
CREATE TABLE system.role_permissions (
role text,
resource text,
permissions set<text>,
PRIMARY KEY (role, resource)
) WITH CLUSTERING ORDER BY (resource ASC);
Columns:
- `role`: Name of the role
- `resource`: Resource the permissions apply to (e.g., `data/keyspace/table`)
- `permissions`: Set of permissions granted (e.g., `SELECT`, `MODIFY`, `CREATE`)
Related tables: roles, role_members, role_attributes
system.roles
Stores role definitions for authentication and authorization.
Schema:
CREATE TABLE system.roles (
role text PRIMARY KEY,
can_login boolean,
is_superuser boolean,
member_of set<text>,
salted_hash text
);
Columns:
- `role`: Name of the role (primary key)
- `can_login`: Whether this role can be used for authentication
- `is_superuser`: Whether this role has superuser privileges
- `member_of`: Set of roles this role is a member of
- `salted_hash`: Hashed password for authentication
Related tables: role_members, role_permissions, role_attributes
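For example, a read-only query against these tables can show which roles exist and what they are allowed to do (the role name `analyst` below is purely illustrative):

```cql
-- List all roles and their login/superuser flags
SELECT role, can_login, is_superuser FROM system.roles;

-- Permissions granted to a hypothetical role named 'analyst'
SELECT resource, permissions FROM system.role_permissions WHERE role = 'analyst';
```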
system.runtime_info
Virtual table containing runtime-specific information such as memory stats, memtable stats, cache stats, and more. Data is grouped so that related items stay together and are easily queried. Roughly equivalent to nodetool info, nodetool gettraceprobability, and nodetool statusgossip commands.
Schema:
CREATE TABLE system.runtime_info (
group text,
item text,
value text,
PRIMARY KEY (group, item)
) WITH CLUSTERING ORDER BY (item ASC);
Columns:
- `group`: Category/group of the runtime information
- `item`: Name of the metric or setting
- `value`: Current value
Implemented by runtime_info_table in db/system_keyspace.cc.
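A quick way to explore this table is to read it whole and scan the groups; since it is a virtual table, the data is served from in-memory state. Note that `group` is quoted below to avoid clashing with the `GROUP BY` keyword:

```cql
-- Dump all runtime information, grouped by category
SELECT "group", item, value FROM system.runtime_info;
```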
system.scylla_local
Stores ScyllaDB-specific local node information that extends beyond the Cassandra-compatible system.local table.
Schema:
CREATE TABLE system.scylla_local (
key text PRIMARY KEY,
value text
);
Columns:
- `key`: Setting or property name
- `value`: The value (typically JSON or plain text)
Common keys:
- `cluster_supports_upgrading_from_version`: Minimum version the cluster supports upgrade from
- `enabled_features`: Features enabled on this node
- `raft_group0_id`: ID of the Raft Group 0
- `topology_state_machine_version`: Version of the topology state machine
Related tables: local
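For example, assuming the `enabled_features` key listed above is present on the node:

```cql
-- Features enabled on this node
SELECT value FROM system.scylla_local WHERE key = 'enabled_features';
```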
system.scylla_views_builds_in_progress
ScyllaDB-specific table tracking materialized view build progress, extending the Cassandra-compatible version.
Schema:
CREATE TABLE system.scylla_views_builds_in_progress (
keyspace_name text,
view_name text,
cpu_id int,
next_token text,
generation_number int,
first_token text,
PRIMARY KEY (keyspace_name, view_name, cpu_id)
) WITH CLUSTERING ORDER BY (view_name ASC, cpu_id ASC);
Columns:
- `keyspace_name`: Keyspace containing the view
- `view_name`: Name of the materialized view
- `cpu_id`: Shard (CPU) that owns this portion of the build
- `next_token`: Next token to process
- `generation_number`: Build generation number
- `first_token`: First token in the range being built
Related tables: views_builds_in_progress, built_views
system.service_levels_v2
Stores service level definitions for Quality of Service (QoS) workload prioritization.
Schema:
CREATE TABLE system.service_levels_v2 (
service_level text PRIMARY KEY,
timeout duration,
workload_type text,
shares int
);
Columns:
- `service_level`: Name of the service level
- `timeout`: Default timeout for operations in this service level
- `workload_type`: Type of workload (e.g., `batch`, `interactive`)
- `shares`: CPU shares allocated to this service level for scheduling priority
system.size_estimates
Virtual table containing size estimates for individual token ranges of each keyspace/table. Useful for capacity planning and understanding data distribution.
Schema:
CREATE TABLE system.size_estimates (
keyspace_name text,
table_name text,
range_start text,
range_end text,
mean_partition_size bigint,
partitions_count bigint,
PRIMARY KEY (keyspace_name, table_name, range_start, range_end)
) WITH CLUSTERING ORDER BY (table_name ASC, range_start ASC, range_end ASC);
Columns:
- `keyspace_name`: Name of the keyspace
- `table_name`: Name of the table
- `range_start`: Start of the token range
- `range_end`: End of the token range
- `mean_partition_size`: Average partition size in bytes for this range (bigint)
- `partitions_count`: Estimated number of partitions in this range (bigint)
Implemented by size_estimates_mutation_reader in db/size_estimates_virtual_reader.{hh,cc}.
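A rough, client-side size estimate for a table can be built from these ranges; the keyspace and table names below are placeholders:

```cql
-- Per-range estimates; summing partitions_count * mean_partition_size over all
-- ranges approximates the locally owned data size for the table
SELECT range_start, range_end, partitions_count, mean_partition_size
FROM system.size_estimates
WHERE keyspace_name = 'ks1' AND table_name = 'tbl1';
```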
system.snapshots
Virtual table listing snapshots on the node. Equivalent to the nodetool listsnapshots command.
Schema:
CREATE TABLE system.snapshots (
keyspace_name text,
table_name text,
snapshot_name text,
live bigint,
total bigint,
PRIMARY KEY (keyspace_name, table_name, snapshot_name)
) WITH CLUSTERING ORDER BY (table_name ASC, snapshot_name ASC);
Columns:
- `keyspace_name`: Keyspace of the snapshot
- `table_name`: Table of the snapshot
- `snapshot_name`: Name of the snapshot
- `live`: Live data size in bytes (data that isn't also in the current data files)
- `total`: Total size of snapshot files in bytes
Implemented by snapshots_table in db/system_keyspace.cc.
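For example, the `nodetool listsnapshots` output can be reproduced with a plain CQL read:

```cql
-- All snapshots known to this node, with live and total sizes in bytes
SELECT keyspace_name, table_name, snapshot_name, live, total
FROM system.snapshots;
```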
system.sstables
The "ownership" table for non-local SSTables when using object storage (S3).
Schema:
CREATE TABLE system.sstables (
owner uuid,
generation timeuuid,
status text,
state text,
version text,
format text,
PRIMARY KEY (owner, generation)
) WITH CLUSTERING ORDER BY (generation ASC);
Columns:
- `owner`: UUID of the owning table
- `generation`: SSTable generation identifier
- `status`: Current status of the SSTable
- `state`: State of the SSTable
- `version`: SSTable format version
- `format`: SSTable format
Note: When a user keyspace is created with S3 storage options, SSTables are put on remote object storage and the information about them is kept in this table.
system.sstables_registry
Tracks SSTable ownership information.
Schema:
CREATE TABLE system.sstables_registry (
owner uuid,
generation timeuuid,
status text,
state text,
version text,
format text,
PRIMARY KEY (owner, generation)
) WITH CLUSTERING ORDER BY (generation ASC);
Columns:
- `owner`: UUID of the SSTable owner
- `generation`: SSTable generation identifier (clustering key)
- `status`: Current status of the SSTable
- `state`: State of the SSTable
- `version`: SSTable format version
- `format`: SSTable format
system.tablet_sizes
Virtual table containing information about current tablet disk sizes. The table can contain incomplete data, in which case missing_replicas will contain the host IDs of replicas for which the tablet size is not known. Can be queried on any node, but the data comes from the group0 leader. Reads wait for group0 leader to be elected and load balancer stats to become available.
Schema:
CREATE TABLE system.tablet_sizes (
table_id uuid,
last_token bigint,
missing_replicas frozen<set<uuid>>,
replicas frozen<map<uuid, bigint>>,
PRIMARY KEY (table_id, last_token)
) WITH CLUSTERING ORDER BY (last_token ASC);
Columns:
- `table_id`: UUID of the table
- `last_token`: Last token owned by the tablet
- `missing_replicas`: Set of host IDs for replicas where tablet size is unknown
- `replicas`: Map of replica host IDs to disk size of the tablet replica in bytes
Related tables: tablets, load_per_node
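For example, to inspect per-replica disk usage of one table's tablets (the UUID is a placeholder for a real `table_id`):

```cql
-- Tablet sizes per replica for a single table
SELECT last_token, replicas, missing_replicas
FROM system.tablet_sizes
WHERE table_id = 871f1d30-2a20-11ee-be56-0242ac120002;
```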
system.tablets
Holds information about all tablets in the cluster. Only tables that use tablet-based replication strategy have entries here.
Each partition (table_id) represents a tablet map of a given table.
Schema:
CREATE TABLE system.tablets (
table_id uuid,
last_token bigint,
base_table uuid STATIC,
keyspace_name text STATIC,
repair_scheduler_config frozen<repair_scheduler_config> STATIC,
resize_seq_number bigint STATIC,
resize_task_info frozen<tablet_task_info> STATIC,
resize_type text STATIC,
table_name text STATIC,
tablet_count int STATIC,
migration_task_info frozen<tablet_task_info>,
new_replicas frozen<list<frozen<tuple<uuid, int>>>>,
repair_task_info frozen<tablet_task_info>,
repair_time timestamp,
replicas frozen<list<frozen<tuple<uuid, int>>>>,
session uuid,
stage text,
transition text,
sstables_repaired_at bigint,
repair_incremental_mode text,
PRIMARY KEY (table_id, last_token)
) WITH CLUSTERING ORDER BY (last_token ASC);
User-Defined Types:
CREATE TYPE system.repair_scheduler_config (
auto_repair_enabled boolean,
auto_repair_threshold bigint
);
CREATE TYPE system.tablet_task_info (
request_type text,
tablet_task_id uuid,
request_time timestamp,
sched_nr bigint,
sched_time timestamp,
repair_hosts_filter text,
repair_dcs_filter text
);
Static Columns (per table):
- `tablet_count`: Number of tablets in the map
- `table_name`: Name of the table (provided for convenience)
- `keyspace_name`: Name of the keyspace
- `base_table`: Optionally set with the `table_id` of another table that this table is co-located with, meaning they always have the same tablet count and tablet replicas, and are migrated and resized together as a group. When `base_table` is set, the rest of the tablet map is empty, and the tablet map of `base_table` should be read instead. When `base_table` is NULL, this table has its own independent tablet map stored in the remaining columns.
- `resize_type`: Resize decision type that spans all tablets of a given table (`merge`, `split`, or `none`)
- `resize_seq_number`: Sequence number (>= 0) of the resize decision that globally identifies it. It's monotonically increasing, incremented by one for every new decision, so a higher value means it came later in time.
- `repair_scheduler_config`: Configuration for the repair scheduler containing:
  - `auto_repair_enabled`: When set to true, auto repair is enabled. Disabled by default.
  - `auto_repair_threshold`: If the time since last repair is longer than `auto_repair_threshold` seconds, the tablet is eligible for auto repair.
Per-Tablet Columns:
last_token is the last token owned by the tablet. The i-th tablet, where i = 0, 1, ..., tablet_count-1, owns the token range:
(-inf, last_token(0)] for i = 0
(last_token(i-1), last_token(i)] for i > 0
- `replicas`: List of tuples `(host_id, shard_id)` representing the shard-replicas of the tablet
- `new_replicas`: What will be put in `replicas` after transition is done
- `stage`: Current stage of tablet transition
- `transition`: Type of transition (see below)
- `repair_time`: Last time the tablet was repaired
- `sstables_repaired_at`: Tablet-level reference `repaired_at` value. When an SSTable's on-disk `repaired_at` value is less than or equal to `sstables_repaired_at`, the SSTable is considered repaired.
- `repair_incremental_mode`: Mode for incremental repair (see below)
- `repair_task_info`: Metadata for the task manager containing:
  - `request_type`: Type of the request (`user_repair` or `auto_repair`)
  - `tablet_task_id`: UUID of the task
  - `request_time`: Time the request was created
  - `sched_nr`: Number of times the request has been scheduled by the repair scheduler
  - `sched_time`: Time the request was scheduled by the repair scheduler
  - `repair_hosts_filter`: Repair replicas listed in the comma-separated host_id list
  - `repair_dcs_filter`: Repair replicas listed in the comma-separated DC list
Transition types:
- `migration`: One tablet replica is moving from one shard to another
- `rebuild`: New tablet replica is created from the remaining replicas
- `repair`: Tablet replicas are being repaired
Incremental repair modes (repair_incremental_mode):
- `regular`: The incremental repair logic is enabled. Unrepaired SSTables will be included for repair. Repaired SSTables will be skipped. The incremental repair states will be updated after repair.
- `full`: The incremental repair logic is enabled. Both repaired and unrepaired SSTables will be included for repair. The incremental repair states will be updated after repair.
- `disabled`: The incremental repair logic is disabled completely. The incremental repair states (e.g., `repaired_at` in SSTables and `sstables_repaired_at` in this table) will not be updated after repair.
Related tables: tablet_sizes, load_per_node, topology
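For example, the tablet map of a single table can be inspected directly (the UUID is a placeholder; look up the real `table_id` in `system_schema.tables`):

```cql
-- Per-tablet replicas and any in-flight transition for one table
SELECT last_token, replicas, stage, transition
FROM system.tablets
WHERE table_id = 871f1d30-2a20-11ee-be56-0242ac120002;
```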
system.token_ring
Virtual table containing the ring description for each keyspace. Equivalent of the nodetool describe_ring $KEYSPACE command (when filtered for a specific keyspace). Overlaps with the output of nodetool ring.
Schema:
CREATE TABLE system.token_ring (
keyspace_name text,
table_name text,
start_token text,
endpoint inet,
end_token text,
dc text,
rack text,
PRIMARY KEY (keyspace_name, table_name, start_token, endpoint)
) WITH CLUSTERING ORDER BY (table_name ASC, start_token ASC, endpoint ASC);
Columns:
- `keyspace_name`: Name of the keyspace
- `table_name`: Name of the table
- `start_token`: Start of the token range
- `endpoint`: IP address of the node owning this range
- `end_token`: End of the token range
- `dc`: Data center of the endpoint
- `rack`: Rack of the endpoint
Implemented by token_ring_table in db/system_keyspace.cc.
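For example, to get the per-keyspace ring description that `nodetool describe_ring ks1` would show (the keyspace name is a placeholder):

```cql
SELECT start_token, end_token, endpoint, dc, rack
FROM system.token_ring
WHERE keyspace_name = 'ks1';
```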
system.topology
Stores cluster topology state managed by Raft. This is the central table for Raft-based topology management.
Schema:
CREATE TABLE system.topology (
key text,
host_id uuid,
cleanup_status text,
datacenter text,
ignore_msb int,
node_state text,
num_tokens int,
rack text,
rebuild_option text,
release_version text,
replaced_id uuid,
request_id timeuuid,
shard_count int,
supported_features frozen<set<text>>,
tokens frozen<set<text>>,
tokens_string text,
topology_request text,
committed_cdc_generations frozen<set<tuple<bigint, timeuuid>>> STATIC,
enabled_features frozen<set<text>> STATIC,
fence_version bigint STATIC,
global_requests frozen<set<timeuuid>> STATIC,
global_topology_request text STATIC,
global_topology_request_id timeuuid STATIC,
ignore_nodes frozen<set<uuid>> STATIC,
new_cdc_generation_data_uuid timeuuid STATIC,
new_keyspace_rf_change_data frozen<map<text, text>> STATIC,
new_keyspace_rf_change_ks_name text STATIC,
paused_rf_change_requests frozen<set<timeuuid>> STATIC,
session uuid STATIC,
tablet_balancing_enabled boolean STATIC,
transition_state text STATIC,
unpublished_cdc_generations frozen<set<tuple<bigint, timeuuid>>> STATIC,
upgrade_state text STATIC,
version bigint STATIC,
PRIMARY KEY (key, host_id)
) WITH CLUSTERING ORDER BY (host_id ASC);
Partition Key:
- `key`: Partition key (typically `"topology"`)
Clustering Column:
- `host_id`: Host ID of the node (each row represents one node)
Per-Node Regular Columns:
- `datacenter`: Data center of the node
- `rack`: Rack of the node
- `tokens`: Set of token strings owned by the node
- `node_state`: Current state of the node in topology
- `release_version`: ScyllaDB version running on the node
- `topology_request`: Current topology request for this node
- `replaced_id`: If this node is replacing another, the replaced node's host ID
- `rebuild_option`: Rebuild configuration for the node
- `num_tokens`: Number of tokens for the node
- `tokens_string`: Alternative representation of tokens
- `shard_count`: Number of shards on the node
- `ignore_msb`: MSB bits to ignore for token calculation
- `cleanup_status`: Status of cleanup operations
- `supported_features`: Features supported by this node
- `request_id`: ID of the current topology request for this node
Static Columns (cluster-wide state):
- `ignore_nodes`: Set of node IDs to ignore
- `new_cdc_generation_data_uuid`: UUID for new CDC generation data
- `new_keyspace_rf_change_ks_name`: (deprecated) Keyspace name for RF change
- `new_keyspace_rf_change_data`: (deprecated) RF change data
- `version`: Topology version number
- `fence_version`: Version used for fencing operations
- `transition_state`: Current state of topology transition
- `committed_cdc_generations`: Set of committed CDC generations (generation timestamp, ID)
- `unpublished_cdc_generations`: Set of unpublished CDC generations
- `global_topology_request`: Current global topology request
- `global_topology_request_id`: ID of the global topology request
- `enabled_features`: Features enabled across the cluster
- `session`: Current topology session ID
- `tablet_balancing_enabled`: Whether tablet load balancing is enabled
- `upgrade_state`: State of cluster upgrade process
- `global_requests`: Set of global request IDs
- `paused_rf_change_requests`: Set of paused replication factor change requests
Related tables: topology_requests, tablets, discovery
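For example, a per-node view of the topology state machine can be read using the partition key mentioned above:

```cql
-- One row per node, keyed by host_id
SELECT host_id, datacenter, rack, node_state, release_version
FROM system.topology
WHERE key = 'topology';
```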
system.topology_requests
Queue of pending topology change requests. The coordinator processes this queue to handle topology operations.
Schema:
CREATE TABLE system.topology_requests (
id timeuuid PRIMARY KEY,
initiating_host uuid,
request_type text,
start_time timestamp,
done boolean,
error text,
end_time timestamp,
truncate_table_id uuid,
new_keyspace_rf_change_ks_name text,
new_keyspace_rf_change_data map<text, text>
);
Columns:
- `id`: Unique identifier for the request
- `initiating_host`: Host ID of the node that initiated the request
- `request_type`: Type of topology request (e.g., `join`, `leave`, `remove`, `replace`, `remove_node`)
- `start_time`: When the request was started
- `done`: Whether the request has been completed
- `error`: Error message if the request failed
- `end_time`: When the request completed
- `truncate_table_id`: Table ID for truncate operations
- `new_keyspace_rf_change_ks_name`: Keyspace name for replication factor changes
- `new_keyspace_rf_change_data`: Data for replication factor changes
Related tables: topology
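For example, recent topology operations and their outcomes can be listed directly:

```cql
-- Pending and completed topology requests
SELECT id, request_type, start_time, done, error
FROM system.topology_requests;
```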
system.truncated
Holds truncation replay positions per table and shard. When a table is truncated, SSTables are removed and the current replay position for each shard is recorded here. During commitlog replay (in case of a crash), this data is used to filter mutations and ensure truncated data is not resurrected.
Schema:
CREATE TABLE system.truncated (
table_uuid uuid,
shard int,
position int,
segment_id bigint,
truncated_at timestamp STATIC,
PRIMARY KEY (table_uuid, shard)
) WITH CLUSTERING ORDER BY (shard ASC);
Columns:
- `table_uuid`: ID of the truncated table
- `shard`: Shard number
- `position`: Replay position within the segment
- `segment_id`: Commitlog segment ID
- `truncated_at`: (static) Timestamp when the truncation occurred
Note: Prior to this table being added, truncation records were kept in the truncated_at map column in system.local. When booting up, ScyllaDB merges data from the legacy store with data from this table. Until the whole cluster agrees on the feature TRUNCATION_TABLE, truncation will write both new and legacy records. When the feature is agreed upon, the legacy map is removed.
system.versions
Virtual table containing all version-related information. Equivalent of the nodetool version command, but contains more detailed version information.
Schema:
CREATE TABLE system.versions (
key text PRIMARY KEY,
version text,
build_mode text,
build_id text
);
Columns:
- `key`: Version category identifier (typically `"local"`)
- `version`: Full version string
- `build_mode`: Build mode (e.g., `release`, `debug`, `dev`)
- `build_id`: Build identifier/hash
Implemented by versions_table in db/system_keyspace.cc.
Related tables: config
system.view_build_status_v2
Tracks the build status of materialized views across the cluster.
Schema:
CREATE TABLE system.view_build_status_v2 (
keyspace_name text,
view_name text,
host_id uuid,
status text,
PRIMARY KEY ((keyspace_name, view_name), host_id)
) WITH CLUSTERING ORDER BY (host_id ASC);
Columns:
- `keyspace_name`: Keyspace containing the view
- `view_name`: Name of the materialized view
- `host_id`: Host ID of the node reporting status
- `status`: Build status (e.g., `STARTED`, `SUCCESS`)
Related tables: built_views, views_builds_in_progress
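For example, to check whether every node has finished building a given view (the keyspace and view names are placeholders):

```cql
SELECT host_id, status
FROM system.view_build_status_v2
WHERE keyspace_name = 'ks1' AND view_name = 'my_view';
```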
system.view_building_tasks
Tracks tasks related to building materialized views.
Schema:
CREATE TABLE system.view_building_tasks (
key text,
id timeuuid,
type text,
aborted boolean,
base_id uuid,
view_id uuid,
last_token bigint,
host_id uuid,
shard int,
PRIMARY KEY (key, id)
) WITH CLUSTERING ORDER BY (id ASC);
Columns:
- `key`: Partition key for grouping tasks
- `id`: Unique identifier for the build task (timeuuid)
- `type`: Type of the build task
- `aborted`: Whether the task has been aborted
- `base_id`: UUID of the base table
- `view_id`: UUID of the materialized view
- `last_token`: Last token processed
- `host_id`: Host ID of the node handling this task
- `shard`: Shard ID handling this task
Related tables: views_builds_in_progress
system.views_builds_in_progress
Tracks materialized view builds that are currently in progress. This is the Cassandra-compatible version.
Schema:
CREATE TABLE system.views_builds_in_progress (
keyspace_name text,
view_name text,
generation_number int,
last_token text,
PRIMARY KEY (keyspace_name, view_name)
) WITH CLUSTERING ORDER BY (view_name ASC);
Columns:
- `keyspace_name`: Keyspace containing the view
- `view_name`: Name of the materialized view
- `generation_number`: Build generation number
- `last_token`: Last token that was processed
Related tables: scylla_views_builds_in_progress, built_views, view_build_status_v2
Virtual Tables
Virtual tables behave just like regular tables from the user's point of view. The difference between them and regular tables comes down to how they are implemented. While regular tables have memtables/commitlog/SSTables and all you would expect from CQL tables, virtual tables translate some in-memory structure to CQL result format.
The following system tables are virtual tables:
- cluster_status
- clients
- config
- load_per_node
- protocol_servers
- runtime_info
- size_estimates
- snapshots
- tablet_sizes
- token_ring
- versions
For more details see virtual-tables.md.
Notes
- **Read-Only Tables**: Most system tables should be considered read-only. Direct modification can cause cluster instability.
- **Internal Use**: Many tables (especially Raft, CDC, and topology tables) are for internal ScyllaDB use. Modifying them directly is not recommended.
- **Monitoring**: Tables like `large_partitions`, `large_rows`, and `large_cells` are useful for performance monitoring and troubleshooting.
- **Version Suffixes**: Tables with `_v2` or `_v3` suffixes represent schema evolution across ScyllaDB versions.
- **TTL on Diagnostic Tables**: The `large_*` tables have entries with a 30-day TTL.
Verification
To verify table schemas in your cluster:
-- Describe a specific table
DESCRIBE TABLE system.<table_name>;
-- List all tables in the system keyspace
SELECT * FROM system_schema.tables WHERE keyspace_name = 'system';
-- Get column details for a specific table
SELECT * FROM system_schema.columns
WHERE keyspace_name = 'system' AND table_name = '<table_name>';