Mirror of https://github.com/scylladb/scylladb.git (synced 2026-06-04 14:03:06 +00:00)
doc: document the trie-based SSTable index (ms format)

The ms SSTable format, which introduces a trie-based partition index
replacing the previous Cassandra 3.0 index format used by me and md,
became the default value in the default scylla.yaml starting with
ScyllaDB 2026.2.

Changes:
- sstable_what_is.rst: add format links above the version table; add a
  short ms format section with a link to the dedicated page.
- sstable/index.rst: remove format links (now in sstable_what_is.rst).
- sstable3/index.rst: add sstable-ms-index to the toctree and link list.
- sstable-format.rst: add SSTable Format Variants section documenting
  the ms, me, and md options, ordered newest to oldest.
- sstable-ms-index.rst (new): dedicated page covering the trie-based
  index benefits, and a Configuring ms Format section with subsections
  for new clusters (2026.2 and later) and upgraded clusters, including
  existing SSTable conversion behavior and nodetool upgradesstables -a.

Fixes SCYLLADB-1994
Related: https://github.com/scylladb/scylladb/issues/29442

Closes scylladb/scylladb#30148
Committed by: Avi Kivity
Parent: 02539bc4f4
Commit: a0e160db8a
@@ -4,6 +4,11 @@ The location of ScyllaDB SSTables is specified in scylla.yaml ``data_file_direct
 
 SSTable 3.x is more efficient and requires less disk space than the SSTable 2.x.
 
+For more information on each of the SSTable formats, see below:
+
+* :doc:`SSTable 2.x </architecture/sstable/sstable2/index>`
+* :doc:`SSTable 3.x </architecture/sstable/sstable3/index>`
+
 SSTable Version Support
 ------------------------
 
@@ -13,11 +18,11 @@ SSTable Version Support
 
    * - SSTable Version
      - ScyllaDB Version
-   * - 3.x ('ms')
+   * - 3.x (``ms``)
      - 2025.4 and above
-   * - 3.x ('me')
+   * - 3.x (``me``)
      - 2022.2 and above
-   * - 3.x ('md')
+   * - 3.x (``md``)
      - 2021.1
 
 * The supported formats are ``me`` and ``ms``.
@@ -27,3 +32,10 @@ SSTable Version Support
 **writes**. The legacy SSTable formats (``ka``, ``la``, ``mc``) remain
 supported for reads, which is essential for restoring clusters from existing
 backups.
+
+The ms Format: Trie-Based SSTable Index
+-----------------------------------------
+
+The ``ms`` format introduces a *trie-based* SSTable index.
+
+For a detailed description of the trie index format, see :doc:`SSTable ms Index </architecture/sstable/sstable3/sstable-ms-index>`.
@@ -7,9 +7,4 @@ ScyllaDB SSTable Format
    sstable2/index
    sstable3/index
 
-.. include:: _common/sstable_what_is.rst
-
-For more information on each of the SSTable formats, see below:
-
-* :doc:`SSTable 2.x <sstable2/index>`
-* :doc:`SSTable 3.x <sstable3/index>`
+.. include:: _common/sstable_what_is.rst
@@ -8,6 +8,7 @@ ScyllaDB SSTable - 3.x
    sstables-3-statistics
    sstables-3-summary
    sstables-3-index
+   sstable-ms-index
    sstable-format
 
 .. include:: ../_common/sstable_what_is.rst
@@ -20,5 +21,6 @@ For more information on ScyllaDB 3.x SSTable formats, see below:
 * :doc:`SSTable 3.0 Data File Format <sstables-3-data-file-format>`
 * :doc:`SSTable 3.0 Statistics <sstables-3-statistics>`
 * :doc:`SSTable 3.0 Summary <sstables-3-summary>`
-* :doc:`SSTable 3.0 Index <sstables-3-index>`
-* :doc:`SSTable 3.0 Format in ScyllaDB <sstable-format>`
+* :doc:`SSTable 3.0 Index (me/md format) <sstables-3-index>`
+* :doc:`SSTable ms Index (Trie-Based) <sstable-ms-index>`
+* :doc:`SSTable 3.0 Format in ScyllaDB <sstable-format>`
@@ -9,8 +9,28 @@ Looking more carefully, you will see that ScyllaDB maintains more,
 smaller, SSTables than Cassandra does. On ScyllaDB, each core manages its
 own subset of SSTables. This internal sharding allows each core (shard)
 to work more efficiently, avoiding the complexity and delays of multiple
-cores competing for the same data
+cores competing for the same data.
+
+SSTable Format Variants
+------------------------
+
+SSTable 3.x in ScyllaDB comes in three format variants, selected via the ``sstable_format``
+parameter in ``scylla.yaml``:
+
+``ms``
+   Introduces a trie-based SSTable index.
+   For details, see :doc:`SSTable ms Index (Trie-Based) <sstable-ms-index>`.
+
+``me``
+   The baseline 3.x format, default from ScyllaDB 2022.2 through 2026.1.
+
+``md``
+   An earlier 3.x variant, used only when upgrading from an existing ``md`` cluster.
+   The ``sstable_format`` parameter is ignored if set to ``md``.
+
+Existing SSTables are not rewritten automatically when upgrading to 2026.2.
+Once ``ms`` is the configured format, they are rewritten in ``ms`` on the next compaction.
+
 .. include:: /rst_include/architecture-index.rst
 
-.. include:: /rst_include/apache-copyrights.rst
+.. include:: /rst_include/apache-copyrights.rst
47
docs/architecture/sstable/sstable3/sstable-ms-index.rst
Normal file
@@ -0,0 +1,47 @@
+.. _sstable-ms-index:
+
+SSTable ms Index: Trie-Based Format
+=====================================
+
+The ``ms`` format introduces a trie-based partition index, replacing the previous
+Cassandra 3.0 index format used by ``me`` and ``md``.
+
+Benefits
+---------
+
+Compared to the previous format, the trie-based index provides:
+
+* **Reduced memory footprint** - less in-memory index data per SSTable.
+* **Faster partition lookups** - more efficient index traversal.
+* **Smaller on-disk index files** - shared key prefixes reduce redundant storage.
+
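The prefix-sharing benefit listed above can be illustrated with a toy in-memory trie. This is only a sketch of the general data structure, not the on-disk layout of the ``ms`` index; the keys and the helper names (``build_trie``, ``count_edges``, ``contains``) are hypothetical and chosen for illustration.

```python
def build_trie(keys):
    """Insert each key into a nested-dict trie; a None entry marks the end of a key."""
    root = {}
    for key in keys:
        node = root
        for ch in key:
            node = node.setdefault(ch, {})
        node[None] = True  # end-of-key marker
    return root


def count_edges(node):
    """Count character edges in the trie (end-of-key markers are not counted)."""
    return sum(1 + count_edges(child) for ch, child in node.items() if ch is not None)


def contains(trie, key):
    """Walk the trie one character at a time; succeed only on a complete key."""
    node = trie
    for ch in key:
        if ch not in node:
            return False
        node = node[ch]
    return None in node


# Hypothetical keys that share long prefixes.
keys = ["user:1001", "user:1002", "user:1003", "user:2001"]
trie = build_trie(keys)

flat_chars = sum(len(k) for k in keys)  # every key stored whole
trie_edges = count_edges(trie)          # each shared prefix stored once
print(flat_chars, trie_edges)           # 36 15
```

In this toy case the four keys take 36 characters when stored whole but only 15 trie edges, because the ``user:`` and ``user:100`` prefixes are stored once. A lookup walks one edge per character of the key, regardless of how many keys the trie holds.
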
+Configuring ms Format
+----------------------
+
+New Clusters (2026.2 and Later)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Starting with ScyllaDB 2026.2, ``ms`` is the default value for ``sstable_format`` in the
+default ``scylla.yaml``. Clusters created with ScyllaDB 2026.2 or later use the ``ms`` format
+unless the configuration is changed.
+
+Upgraded Clusters
+~~~~~~~~~~~~~~~~~~
+
+Clusters upgraded from an earlier version continue to use the format specified in their
+``scylla.yaml``. To switch to the ``ms`` format, set the following in ``scylla.yaml``:
+
+.. code-block:: yaml
+
+   sstable_format: ms
+
+After the node configuration is updated, newly created SSTables use the ``ms`` format.
+
+Existing SSTables are not converted automatically. They are rewritten in the ``ms`` format
+during their next compaction. To convert them without waiting for compaction, run
+``nodetool upgradesstables -a`` after updating the node configuration.
+
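Before switching an upgraded node, it can help to check which format its ``scylla.yaml`` currently names. The following is a minimal sketch that scans the file text for a top-level ``sstable_format`` key using only the standard library. The function name and the sample file content are hypothetical, and the line-based scan is a simplification that assumes the key appears on a single unindented line rather than being parsed as full YAML.

```python
def configured_sstable_format(yaml_text):
    """Return the value of a top-level sstable_format key, or None if it is absent."""
    for raw in yaml_text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # drop trailing comments
        if line.startswith("sstable_format:"):
            value = line.split(":", 1)[1].strip().strip("'\"")
            return value or None
    return None


# Hypothetical excerpt of a scylla.yaml carried over from an older release.
sample = """\
# hypothetical excerpt of scylla.yaml
cluster_name: 'demo'
sstable_format: me  # written by an older release
"""

print(configured_sstable_format(sample))  # me
```

A result other than ``ms`` means the node keeps writing the named format until the setting is changed as described above. A result of ``None`` only says the key is not set in the file; this sketch makes no assumption about which format the node uses in that case.
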
+Related Topics
+--------------
+
+* :doc:`SSTable 3.0 Format in ScyllaDB <sstable-format>` - overview of 3.x format variants