mirror of https://github.com/scylladb/scylladb.git synced 2026-06-07 15:33:15 +00:00

Avi Kivity 318278ff92 Merge 'tablets: reload only changed metadata' from Botond Dénes

Currently, each change to tablet metadata triggers a full metadata reload from disk. This is very wasteful, especially if the metadata change affects only a single row in the `system.tablets` table. This is the case when the tablet load balancer triggers a migration: it affects a single row in the table, but today it triggers a full reload.
We expect the tablet count to potentially grow to thousands and beyond, and the overhead of this full reload can become significant.
This PR makes tablet metadata reload partial: instead of reloading all metadata on topology or schema changes, it reloads only the partitions that are affected by the change and copies the rest from the in-memory state.
This is done in two passes: first, the change mutations are scanned and a hint is produced. This hint is then passed down to the reload code, which uses it to reload only the parts (rows/partitions) of the metadata that have actually changed.
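
The two-pass idea can be illustrated with a minimal, self-contained C++ sketch. It is not the code from this PR: the types `table_id`, `tablet_map`, `tablet_metadata`, `change_hint`, `mutation_key` and `loader`, and the functions `make_hint()` and `partial_reload()`, are hypothetical simplifications invented for illustration. The sketch also assumes the hint works at table granularity only, whereas the description above mentions rows as well as partitions.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins; these are not the real ScyllaDB types.
using table_id = int;

struct tablet_map {
    std::vector<std::string> replicas; // one entry per tablet
};

// The metadata holds shared pointers to immutable maps, so copying it is
// cheap: only the pointers are copied, not the tablet maps themselves.
struct tablet_metadata {
    std::map<table_id, std::shared_ptr<const tablet_map>> tables;
};

// Output of pass 1: the set of tables touched by the change mutations.
struct change_hint {
    std::set<table_id> changed_tables;
};

// A change mutation, reduced here to the key of the partition it touches.
struct mutation_key {
    table_id table;
};

// Pass 1: scan the change mutations and record which tables they affect.
change_hint make_hint(const std::vector<mutation_key>& mutations) {
    change_hint hint;
    for (const auto& m : mutations) {
        hint.changed_tables.insert(m.table);
    }
    return hint;
}

// Reads one table's tablet map from storage; returns nullptr if it is gone.
using loader = std::function<std::shared_ptr<const tablet_map>(table_id)>;

// Pass 2: copy the in-memory state, then reload only the hinted tables.
tablet_metadata partial_reload(const tablet_metadata& current,
                               const change_hint& hint,
                               const loader& load) {
    assert(static_cast<bool>(load)); // the caller must supply a loader
    tablet_metadata next = current;  // cheap: unchanged maps stay shared
    for (table_id id : hint.changed_tables) {
        if (auto fresh = load(id)) {
            next.tables[id] = std::move(fresh);
        } else {
            next.tables.erase(id);
        }
    }
    return next;
}
```

In this sketch the cost of a reload is proportional to the number of hinted tables rather than to the total number of tables, and the previous `tablet_metadata` value is left untouched, so readers holding it are unaffected.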

The performance difference between full reload and partial reload is quite drastic:
```
INFO  2024-07-25 05:06:27,347 [shard 0:stat] testlog - Tablet metadata reload:
full      616.39ms
partial     0.18ms
```
This was measured with the modified (by this PR) `perf_tablets`, which creates 100 tables, each with 2K tablets. The test was modified to change a single tablet, then do a full and a partial reload respectively, measuring the time it takes for each.

Fixes: #15294

New feature, no backport needed.

Closes scylladb/scylladb#15541

* github.com:scylladb/scylladb:
  test/perf/perf_tablets: add tablet metadata reload perf measurement
  test/boost/tablets_test: add test for partial tablet metadata updates
  db/schema_tables: pass tablet hint to update_tablet_metadata()
  service/storage_service: load_tablet_metadata(): add hint parameter
  service/migration_listener: update_tablet_metadata(): add hint parameter
  service/raft/group0_state_machine: provide tablet change hint on topology change
  service/storage_service: topology_state_load(): allow providing change hint
  replica/tablets: add update_tablet_metadata()
  replica/tablets: fix indentation
  replica/tablets: extract tablet_metadata builder logic
  replica/tablets: add get_tablet_metadata_change_hint() and update_tablet_metadata_change_hint()
  locator/tablets: add tablet_map::clear_tablet_transition_info()
  locator/tablets: make tablet_metadata cheap to copy
  mutation/canonical_mutation: add key()

2024-08-11 21:27:18 +03:00

.github

Merge 'github: disable scheduled workflow on forks' from Kefu Chai

2024-07-24 07:50:39 +03:00

abseil @ d7aaad83b4

build: bring abseil submodule back

2024-05-05 23:31:09 +03:00

alternator

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

api

api/cql_server_test: add CQL server testing API

2024-08-08 10:42:09 +02:00

auth

service/migration_listener: update_tablet_metadata(): add hint parameter

2024-08-11 09:53:19 -04:00

bin

install.sh: use the native nodetool directly

2024-04-25 22:52:00 +03:00

cdc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

cmake

build: cmake: define SCYLLA_ENABLE_PREEMPTION_SOURCE for dev build

2024-08-04 11:46:28 +03:00

compaction

Merge 'compaction: drop compaction executors' possibility to bypass task manager' from Aleksandra Martyniuk

2024-08-11 10:26:43 +03:00

conf

conf: scylla.yaml: enable_tablets: expand documentation

2024-06-27 14:41:43 +03:00

cql3

service/migration_listener: update_tablet_metadata(): add hint parameter

2024-08-11 09:53:19 -04:00

data_dictionary

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

Merge 'tablets: reload only changed metadata' from Botond Dénes

2024-08-11 21:27:18 +03:00

debug

…

dht

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

direct_failure_detector

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

dist

Revert "dist: support nonroot and offline mode for scylla-housekeeping"

2024-08-04 10:55:26 +03:00

docs

doc: update Raft info in 6.1

2024-08-08 11:25:50 +02:00

exceptions

exceptions/exceptions.hh: Wrap #include <concepts> within an #ifdef

2024-07-17 22:09:41 +03:00

gms

gms/feature_service: allow to suppress features

2024-08-09 19:15:19 +02:00

idl

forward_service: rename to mapreduce_service

2024-07-03 19:29:47 +03:00

index

code-cleanup: add missing header guards

2024-07-09 18:31:35 +03:00

lang

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

licenses

…

locator

replica/tablets: add get_tablet_metadata_change_hint() and update_tablet_metadata_change_hint()

2024-08-11 09:52:37 -04:00

message

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

mutation

mutation/canonical_mutation: add key()

2024-08-11 09:52:37 -04:00

mutation_writer

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

node_ops

db: node_ops: filter topology request entries

2024-07-23 13:35:02 +02:00

raft

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

readers

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

redis

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

reloc

reloc: create $BUILDDIR for getting its path

2024-05-01 09:52:17 +03:00

repair

repair: row_level: coroutinize working_row_hashes()

2024-08-05 08:55:34 +03:00

replica

Merge 'tablets: reload only changed metadata' from Botond Dénes

2024-08-11 21:27:18 +03:00

rust

rust: disable incremental build for release build

2024-06-20 12:01:14 +03:00

schema

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

scripts

scripts/open-coredump.sh: allow complete bypass of S3 server

2024-07-18 21:43:53 +03:00

seastar @ a7d81328fb

Update seastar submodule

2024-07-28 21:04:45 +03:00

service

service/storage_service: load_tablet_metadata(): add hint parameter

2024-08-11 09:53:19 -04:00

sstables

sstable_directory: use return_exception_ptr() when appropriate

2024-08-05 12:54:27 +03:00

streaming

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

swagger-ui @ 12f1da1082

…

tasks

tasks: fix task handler

2024-08-06 13:15:13 +02:00

test

Merge 'tablets: reload only changed metadata' from Botond Dénes

2024-08-11 21:27:18 +03:00

tools

tool/scylla-nodetool: refresh: improve error-message on missing ks/tbl args

2024-08-05 22:36:05 +03:00

tracing

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

transport

service/migration_listener: update_tablet_metadata(): add hint parameter

2024-08-11 09:53:19 -04:00

types

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

unified

cqlsh: update cqlsh submodule

2024-06-26 12:07:21 +03:00

utils

extensions: Add exception types for IO extensions and handle in memtable write path

2024-08-11 13:52:35 +03:00

.dockerignore

…

.gitattributes

gitattributes: Mark swagger .js files as binary

2024-06-19 15:07:56 +03:00

.gitignore

git: add build.ninja.new to .gitignore

2024-06-24 16:48:50 +03:00

.gitmodules

build: bring abseil submodule back

2024-05-05 23:31:09 +03:00

.gitorderfile

…

.mailmap

…

absl-flat_hash_map.cc

…

absl-flat_hash_map.hh

…

amplify.yml

…

backlog_controller.hh

…

build_mode.hh

…

bytes_ostream.hh

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

bytes.cc

bytes: drop unused operator<<

2024-06-25 12:11:28 +03:00

bytes.hh

bytes: drop unused operator<<

2024-06-25 12:11:28 +03:00

cache_mutation_reader.hh

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

cache_temperature.hh

…

cartesian_product.hh

…

cell_locking.hh

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

checked-file-impl.hh

…

client_data.cc

…

client_data.hh

transport: do not return client_type from cql_server::connection::make_client_key()

2024-06-07 09:23:06 +08:00

clocks-impl.cc

…

clocks-impl.hh

…

clustering_bounds_comparator.hh

clustering_bounds_comparator: drop operator<< for bound_kind

2024-06-11 18:01:06 +02:00

clustering_interval_set.hh

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

clustering_key_filter.hh

…

clustering_ranges_walker.hh

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

CMakeLists.txt

node_ops: add task manager module and node_ops_virtual_task

2024-07-23 13:35:01 +02:00

collection_mutation.cc

collection_mutation: compact_and_expire(): use compact_and_expire_result

2024-08-06 08:56:11 -04:00

collection_mutation.hh

collection_mutation: compact_and_expire(): use compact_and_expire_result

2024-08-06 08:56:11 -04:00

column_computation.hh

…

combine.hh

…

compound_compat.hh

treewide: replace formatter<std::string_view> with formatter<string_view>

2024-04-19 07:44:07 +03:00

compound.hh

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

compress.cc

…

compress.hh

compress, auth: include used headers

2024-05-30 09:16:23 +03:00

concrete_types.hh

…

configure.py

api/cql_server_test: add CQL server testing API

2024-08-08 10:42:09 +02:00

CONTRIBUTING.md

…

converting_mutation_partition_applier.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

converting_mutation_partition_applier.hh

…

counters.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

counters.hh

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

coverage_excludes.txt

…

coverage_sources.list

…

cql_serialization_format.hh

…

db_clock.hh

treewide: replace formatter<std::string_view> with formatter<string_view>

2024-04-19 07:44:07 +03:00

debug.cc

…

debug.hh

…

default.nix

treewide: drop thrift support

2024-06-07 06:44:59 +08:00

Doxyfile

…

duration.cc

…

duration.hh

…

encoding_stats.hh

…

enum_set.hh

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

fix_system_distributed_tables.py

…

flake.lock

…

flake.nix

…

frozen_schema.cc

…

frozen_schema.hh

…

full_position.hh

…

gc_clock.hh

…

gdbinit

…

gen_segmented_compress_params.py

…

generic_server.cc

generic_server: use async function in for_each_gently()

2024-08-08 10:42:09 +02:00

generic_server.hh

generic_server: use async function in for_each_gently()

2024-08-08 10:42:09 +02:00

HACKING.md

HACKING.md: fix typo in "--overprovisioned" option name

2024-06-25 12:11:28 +03:00

hashing_partition_visitor.hh

…

idl-compiler.py

idl-compiler: generate async serialization functions for stub members

2024-05-02 19:27:56 +03:00

inet_address_vectors.hh

…

init.cc

…

init.hh

…

install-dependencies.sh

toolchain: change optimized clang install method to standard one

2024-07-09 14:22:42 +03:00

install.sh

Revert "dist: support nonroot and offline mode for scylla-housekeeping"

2024-08-04 10:55:26 +03:00

interval.hh

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

keys.cc

clustering_bounds_comparator: drop operator<< for bound_kind

2024-06-11 18:01:06 +02:00

keys.hh

treewide: replace formatter<std::string_view> with formatter<string_view>

2024-04-19 07:44:07 +03:00

LICENSE.AGPL

…

log.hh

…

main.cc

service/storage_service: load_tablet_metadata(): add hint parameter

2024-08-11 09:53:19 -04:00

map_difference.hh

…

marshal_exception.hh

./: not include unused headers

2024-03-20 09:16:46 +02:00

multishard_mutation_query.cc

treewide: rename flat_mutation_reader_v2 to mutation_reader

2024-06-21 07:12:06 +03:00

multishard_mutation_query.hh

…

mutation_query.cc

mutation_query: reconcilable_result: add merge_disjoint()

2024-02-21 02:08:48 -05:00

mutation_query.hh

treewide: Use partition_slice::is_reversed()

2024-03-13 08:52:46 +02:00

noexcept_traits.hh

…

NOTICE.txt

…

ORIGIN

…

partition_builder.hh

…

partition_range_compat.hh

interval: rename nonwrapping_interval to interval

2024-02-21 19:43:17 +02:00

partition_slice_builder.cc

…

partition_slice_builder.hh

…

partition_snapshot_reader.hh

treewide: rename flat_mutation_reader_v2 to mutation_reader

2024-06-21 07:12:06 +03:00

partition_snapshot_row_cursor.hh

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

protocol_server.hh

protocol_server: Keep scheduling group on board

2024-05-24 17:54:29 +03:00

querier.cc

querier: consume_page(): add rate-limiting to tombstone warnings

2024-08-06 08:56:11 -04:00

querier.hh

querier: consume_page(): add rate-limiting to tombstone warnings

2024-08-06 08:56:11 -04:00

query_id.hh

…

query_ranges_to_vnodes.cc

./: not include unused headers

2024-03-20 09:16:46 +02:00

query_ranges_to_vnodes.hh

./: not include unused headers

2024-03-20 09:16:46 +02:00

query_result_merger.hh

…

query-request.hh

forward_service: rename to mapreduce_service

2024-07-03 19:29:47 +03:00

query-result-reader.hh

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

query-result-set.cc

…

query-result-set.hh

query-result-set: add formatter for query-result-set.hh types

2024-02-21 17:54:48 +08:00

query-result-writer.hh

./: not include unused headers

2024-03-20 09:16:46 +02:00

query-result.hh

query-result.hh: add formatter for query::result::printer

2024-02-21 17:57:18 +08:00

query.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

read_context.hh

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

reader_concurrency_semaphore.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

reader_concurrency_semaphore.hh

reader_concurrency_semaphore: test constructor: don't ignore metrics param

2024-08-04 21:14:42 +03:00

reader_permit.hh

treewide: replace formatter<std::string_view> with formatter<string_view>

2024-04-19 07:44:07 +03:00

README.md

README.md: add badges for cron jobs

2024-06-23 19:24:40 +03:00

real_dirty_memory_accounter.hh

…

release.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

release.hh

release: introduce doc_link()

2024-05-08 09:41:17 -04:00

reversibly_mergeable.hh

…

row_cache.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

row_cache.hh

treewide: rename flat_mutation_reader_v2 to mutation_reader

2024-06-21 07:12:06 +03:00

schema_mutations.cc

schema_mutations: add fmt::formatter for schema_mutations

2024-03-15 09:49:56 +02:00

schema_mutations.hh

treewide: replace formatter<std::string_view> with formatter<string_view>

2024-04-19 07:44:07 +03:00

schema_upgrader.hh

…

scylla_post_install.sh

…

scylla-gdb.py

Merge 'replica: remove rwlock for protecting iteration over storage group map' from Raphael "Raph" Carvalho

2024-07-12 15:45:36 +03:00

SCYLLA-VERSION-GEN

Update ScyllaDB version to: 6.2.0-dev

2024-07-18 16:07:07 +03:00

seastarx.hh

…

serialization_visitors.hh

…

serializer_impl.hh

serializer_impl, sstables: fix build failure due to missing includes

2024-04-23 12:03:51 +03:00

serializer.cc

…

serializer.hh

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

service_permit.hh

…

setup.py

…

shell.nix

…

sstables_loader.cc

sstables-loader: Run loading in its scheduling group

2024-05-28 11:07:58 +03:00

sstables_loader.hh

sstables-loader: Add scheduling group to constructor

2024-05-28 11:07:22 +03:00

supervisor.hh

./: not include unused headers

2024-03-20 09:16:46 +02:00

table_helper.cc

treewide: change assert() to SCYLLA_ASSERT()

2024-08-05 08:23:35 +03:00

table_helper.hh

…

test.py

[test.py] Increase pool size for CI

2024-08-06 11:20:36 +03:00

timeout_config.cc

…

timeout_config.hh

treewide: drop thrift support

2024-06-07 06:44:59 +08:00

timestamp.hh

…

tombstone_gc_extension.hh

./: not include unused headers

2024-03-20 09:16:46 +02:00

tombstone_gc_options.cc

treewide: replace formatter<std::string_view> with formatter<string_view>

2024-04-19 07:44:07 +03:00

tombstone_gc_options.hh

treewide: replace formatter<std::string_view> with formatter<string_view>

2024-04-19 07:44:07 +03:00

tombstone_gc.cc

token: move ordering operator inline

2024-07-20 21:21:42 +03:00

tombstone_gc.hh

cql3: statements: change default tombstone_gc mode for tablets

2024-04-24 10:42:10 +02:00

tox.ini

…

ubsan-suppressions.supp

…

unimplemented.cc

treewide: drop thrift support

2024-06-07 06:44:59 +08:00

unimplemented.hh

treewide: drop thrift support

2024-06-07 06:44:59 +08:00

validation.cc

…

validation.hh

…

version.hh

…

view_info.hh

mv: delete a partition in a single operation when applicable

2024-07-25 11:12:58 +03:00

vint-serialization.cc

…

vint-serialization.hh

…

zstd.cc

zstd: include external header with brackets

2024-07-04 10:42:29 +03:00

README.md

Scylla

What is Scylla?

Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB. Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.

For more information, please see the ScyllaDB web site.

Build Prerequisites

Scylla is fairly fussy about its build environment, requiring very recent versions of the C++20 compiler and of many libraries to build. The document HACKING.md includes detailed information on building and developing Scylla, but to get Scylla building quickly on (almost) any build machine, Scylla offers a frozen toolchain. This is a pre-configured Docker image which includes recent versions of all the required compilers, libraries and build tools. Using the frozen toolchain allows you to avoid changing anything in your build machine to meet Scylla's requirements - you just need to meet the frozen toolchain's prerequisites (mostly, Docker or Podman being available).

Building Scylla

Building Scylla with the frozen toolchain dbuild is as easy as:

$ git submodule update --init --force --recursive
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla

For further information, please see:

Developer documentation for more information on building Scylla.
Build documentation on how to build Scylla binaries, tests, and packages.
Docker image build documentation for information on how to build Docker images.

Running Scylla

To start the Scylla server, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1

This will start a Scylla node with one CPU core allocated to it and data files stored in the tmp directory. The --developer-mode option is needed to disable the various checks Scylla performs at startup to ensure the machine is configured for maximum performance (not relevant on development workstations). Please note that you need to run Scylla with dbuild if you built it with the frozen toolchain.

For more run options, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --help

Testing

See test.py manual.

Scylla APIs and compatibility

By default, Scylla is compatible with Apache Cassandra and its API - CQL. There is also support for the API of Amazon DynamoDB™, which needs to be enabled and configured in order to be used. For more information on how to enable the DynamoDB™ API in Scylla, and the current compatibility of this feature as well as Scylla-specific extensions, see Alternator and Getting started with Alternator.

Documentation

Documentation can be found here. Seastar documentation can be found here. User documentation can be found here.

Training

Training material and online courses can be found at Scylla University. The courses are free, self-paced and include hands-on examples. They cover a variety of topics including Scylla data modeling, administration, architecture, basic NoSQL concepts, using drivers for application development, Scylla setup, failover, compactions, multi-datacenters and how Scylla integrates with third-party applications.

Contributing to Scylla

If you want to report a bug or submit a pull request or a patch, please read the contribution guidelines.

If you are a developer working on Scylla, please read the developer guidelines.

Contact

The community forum and Slack channel are for users to discuss configuration, management, and operations of the ScyllaDB open source.
The developers mailing list is for developers and people interested in following the development of ScyllaDB to discuss technical topics.

Languages

C++ 72.1%

Python 26.7%

CMake 0.3%

GAP 0.3%

Shell 0.3%