mirror of
https://github.com/scylladb/scylladb.git
synced 2026-05-13 11:22:01 +00:00
SELECT MUTATION FRAGMENTS is a new select statement sub-type, which allows dumping the underling mutations making up the data of a given table. The output of this statement is mutation-fragments presented as CQL rows. Each row corresponds to a mutation-fragment. Subsequently, the output of this statement has a schema that is different than that of the underlying table. The output schema is derived from the table's schema, as following:
* The table's partition key is copied over as-is
* The clustering key is formed from the following columns:
- mutation_source (text): the kind of the mutation source, one of: memtable, row-cache or sstable; and the identifier of the individual mutation source.
- partition_region (int): represents the enum with the same name.
- the copy of the table's clustering columns
- position_weight (int): -1, 0 or 1, has the same meaning as that in position_in_partition, used to disambiguate range tombstone changes with the same clustering key, from rows and from each other.
* The following regular columns:
- metadata (text): the JSON representation of the mutation-fragment's metadata.
- value (text): the JSON representation of the mutation-fragment's value.
Data is always read from the local replica, on which the query is executed. Migrating queries between coordinators is frobidden.
More details in the documentation commit (last commit).
Example:
```cql
cqlsh> CREATE TABLE ks.tbl (pk int, ck int, v int, PRIMARY KEY (pk, ck));
cqlsh> DELETE FROM ks.tbl WHERE pk = 0;
cqlsh> DELETE FROM ks.tbl WHERE pk = 0 AND ck > 0 AND ck < 2;
cqlsh> INSERT INTO ks.tbl (pk, ck, v) VALUES (0, 0, 0);
cqlsh> INSERT INTO ks.tbl (pk, ck, v) VALUES (0, 1, 0);
cqlsh> INSERT INTO ks.tbl (pk, ck, v) VALUES (0, 2, 0);
cqlsh> INSERT INTO ks.tbl (pk, ck, v) VALUES (1, 0, 0);
cqlsh> SELECT * FROM ks.tbl;
pk | ck | v
----+----+---
1 | 0 | 0
0 | 0 | 0
0 | 1 | 0
0 | 2 | 0
(4 rows)
cqlsh> SELECT * FROM MUTATION_FRAGMENTS(ks.tbl);
pk | mutation_source | partition_region | ck | position_weight | metadata | mutation_fragment_kind | value
----+-----------------+------------------+----+-----------------+--------------------------------------------------------------------------------------------------------------------------+------------------------+-----------
1 | memtable:0 | 0 | | | {"tombstone":{}} | partition start | null
1 | memtable:0 | 2 | 0 | 0 | {"marker":{"timestamp":1688122873341627},"columns":{"v":{"is_live":true,"type":"regular","timestamp":1688122873341627}}} | clustering row | {"v":"0"}
1 | memtable:0 | 3 | | | null | partition end | null
0 | memtable:0 | 0 | | | {"tombstone":{"timestamp":1688122848686316,"deletion_time":"2023-06-30 11:00:48z"}} | partition start | null
0 | memtable:0 | 2 | 0 | 0 | {"marker":{"timestamp":1688122860037077},"columns":{"v":{"is_live":true,"type":"regular","timestamp":1688122860037077}}} | clustering row | {"v":"0"}
0 | memtable:0 | 2 | 0 | 1 | {"tombstone":{"timestamp":1688122853571709,"deletion_time":"2023-06-30 11:00:53z"}} | range tombstone change | null
0 | memtable:0 | 2 | 1 | 0 | {"marker":{"timestamp":1688122864641920},"columns":{"v":{"is_live":true,"type":"regular","timestamp":1688122864641920}}} | clustering row | {"v":"0"}
0 | memtable:0 | 2 | 2 | -1 | {"tombstone":{}} | range tombstone change | null
0 | memtable:0 | 2 | 2 | 0 | {"marker":{"timestamp":1688122868706989},"columns":{"v":{"is_live":true,"type":"regular","timestamp":1688122868706989}}} | clustering row | {"v":"0"}
0 | memtable:0 | 3 | | | null | partition end | null
(10 rows)
```
Perf simple query:
```
/build/release/scylla perf-simple-query -c1 -m2G --duration=60
```
Before:
```
median 141596.39 tps ( 62.1 allocs/op, 13.1 tasks/op, 43688 insns/op, 0 errors)
median absolute deviation: 137.15
maximum: 142173.32
minimum: 140492.37
```
After:
```
median 141889.95 tps ( 62.1 allocs/op, 13.1 tasks/op, 43692 insns/op, 0 errors)
median absolute deviation: 167.04
maximum: 142380.26
minimum: 141025.51
```
Fixes: https://github.com/scylladb/scylladb/issues/11130
Closes #14347
* github.com:scylladb/scylladb:
docs/operating-scylla/admin-tools: add documentation for the SELECT * FROM MUTATION_FRAGMENTS() statement
test/topology_custom: add test_select_from_mutation_fragments.py
test/boost/database_test: add test for mutation_dump/generate_output_schema_from_underlying_schema
test/cql-pytest: add test_select_mutation_fragments.py
test/cql-pytest: move scylla_data_dir fixture to conftest.py
cql3/statements: wire-in mutation_fragments_select_statement
cql3/restrictions/statement_restrictions: fix indentation
cql3/restrictions/statement_restrictions: add check_indexes flag
cql3/statments/select_statement: add mutation_fragments_select_statement
cql3: add SELECT MUTATION FRAGMENTS select statement sub-type
service/pager: allow passing a query functor override
service/storage_proxy: un-embed coordinator_query_options
replica: add mutation_dump
replica: extract query_state into own header
replica/table: add make_nonpopulating_cache_reader()
replica/table: add select_memtables_as_mutation_sources()
tools,mutation: extract the low-level json utilities into mutation/json.hh
tools/json_writer: fold SstableKey() overloads into callers
tools/json_writer: allow writing metadata and value separately
tools/json_writer: split mutation_fragment_json_writer in two classes
tools/json_writer: allow passing custom std::ostream to json_writer