mirror of https://github.com/scylladb/scylladb.git synced 2026-05-12 19:02:12 +00:00

Michał Chojnowski 7d551f99be replica/memtable: move region_listener handlers from dirty_memory_manager to memtable

The memtable wants to listen for changes in its `total_memory` in order
to decrease its `_flushed_memory` in case some of the freed memory has already
been accounted as flushed. (This can happen because the flush reader sees
and accounts for even outdated MVCC versions, which can be deleted and freed
during the flush.)

Today, the memtable doesn't listen to those changes directly. Instead,
some calls which can affect `total_memory` (in particular, the mutation cleaner)
manually check the value of `total_memory` before and after they run, and they
pass the difference to the memtable.

But that's not good enough, because `total_memory` can also change outside
of those manually-checked calls -- for example, during LSA compaction, which
can occur anytime. This makes memtable's accounting inaccurate and can lead
to unexpected states.

But we already have an interface for actively listening to `total_memory`
changes, and `dirty_memory_manager`, which also needs to know about them,
does just that. So when, for example, `mutation_cleaner` runs, it first
checks the value of `total_memory`, then does its work, causing several
changes to `total_memory` which are picked up by `dirty_memory_manager`.
Finally, `mutation_cleaner` checks the end value of `total_memory` and
passes the difference to `memtable`, which corrects whatever was observed
by `dirty_memory_manager`.

To allow memtable to modify its `_flushed_memory` correctly, we need
to make `memtable` itself a `region_listener`. Also, instead of
the situation where `dirty_memory_manager` receives `total_memory`
change notifications from `logalloc` directly, and `memtable` fixes
the manager's state later, we want only the memtable to listen
for the notifications, and to pass them, already modified accordingly,
to the manager, so that there are no intermediate wrong states.

This patch moves the `region_listener` callbacks from the
`dirty_memory_manager` to the `memtable`. It's not intended to be
a functional change, just a source code refactoring.
The next patch will be a functional change enabled by this.
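
The listener arrangement described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not ScyllaDB's actual code: `toy_listener`, `toy_manager`, `toy_memtable`, `on_total_memory_change` and `mark_flushed` are hypothetical names, and the rule of clamping the flushed amount to the current total is an assumed policy chosen for illustration. It shows a memtable-like object receiving every `total_memory` change itself and forwarding an already corrected delta to the manager, so the manager never sees a state that has to be fixed up later.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a listener interface; not ScyllaDB's real API.
struct toy_listener {
    virtual ~toy_listener() = default;
    virtual void on_total_memory_change(int64_t delta) = 0;
};

// Toy manager: it is no longer a listener, it only receives corrected deltas.
struct toy_manager {
    int64_t real_dirty = 0;       // total memory held by memtables
    int64_t unspooled_dirty = 0;  // total memory minus already-flushed memory
    void apply(int64_t real_delta, int64_t unspooled_delta) {
        real_dirty += real_delta;
        unspooled_dirty += unspooled_delta;
    }
};

// Toy memtable: the only listener for changes of its own total memory.
class toy_memtable : public toy_listener {
    toy_manager& _mgr;
    int64_t _total = 0;
    int64_t _flushed = 0;
public:
    explicit toy_memtable(toy_manager& mgr) : _mgr(mgr) {}

    void on_total_memory_change(int64_t delta) override {
        const int64_t old_unspooled = _total - _flushed;
        _total += delta;
        assert(_total >= 0);
        // Assumed policy: memory that was freed cannot stay counted as flushed.
        _flushed = std::min(_flushed, _total);
        const int64_t new_unspooled = _total - _flushed;
        // Forward one consistent update instead of a raw delta plus a later fix.
        _mgr.apply(delta, new_unspooled - old_unspooled);
    }

    // Called by a (hypothetical) flush reader as it writes data out.
    void mark_flushed(int64_t bytes) {
        const int64_t n = std::min(bytes, _total - _flushed);
        _flushed += n;
        _mgr.apply(0, -n);
    }

    int64_t total() const { return _total; }
    int64_t flushed() const { return _flushed; }
};
```

In this model, growing the memtable by 100 bytes, marking 60 as flushed, and then freeing 70 leaves the manager with 30 bytes of real dirty memory and 0 bytes of unspooled dirty memory. Forwarding the raw delta of -70 to the manager instead would have driven the unspooled figure to -30 until a later correction arrived.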

2025-06-20 11:42:30 +02:00

.github

.github/workflows/conflict_reminder: reduce the amount of conflict reminder for every push event

2025-05-28 11:01:44 +03:00

abseil @ d7aaad83b4

…

alternator

alternator/stats.cc, metrics-config.yml: docs fix per-table metrics

2025-06-15 18:06:36 +03:00

api

api: Shorten get_simple_states() handler

2025-06-16 15:21:27 +03:00

audit

audit: add semaphore to audit_syslog_storage_helper

2025-04-08 16:24:42 +02:00

auth

Revert "Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz"

2025-06-16 22:38:12 +03:00

bin

…

cdc

cdc: add sanity check for generating an empty generation

2025-04-25 11:25:07 +02:00

cmake

build: when compiling without -g, don't leave debugging information

2025-05-12 15:42:17 +03:00

compaction

Merge 'Extend compaction_history table with additional compaction statistics' from Łukasz Paszkowski

2025-05-27 14:12:13 +03:00

conf

Merge 'Add tablet enforcing option' from Benny Halevy

2025-04-03 16:32:19 +03:00

cql3

Revert "Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz"

2025-06-16 22:38:12 +03:00

data_dictionary

Merge 'scylla-sstable: add native S3 support' from Ernest Zaslavsky

2025-03-14 15:05:52 +02:00

Revert "Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz"

2025-06-16 22:38:12 +03:00

debug

…

dht

interval: reduce sizeof

2025-06-14 21:29:43 +03:00

dist

dist: rpm: override %_sbindir for Fedora 42

2025-05-16 12:05:29 +02:00

docs

doc: add support for z3 GCP

2025-06-17 13:50:46 +03:00

ent

encryption/utils: Move encryption httpclient to "general" REST client

2025-05-30 12:21:51 +03:00

exceptions

transport: storage_proxy: release ERM when waiting for query timeout

2025-04-23 09:29:47 +02:00

gms

load_and_stream: Add abortion flow to mutation streaming

2025-05-27 14:21:58 +03:00

idl

interval: change start()/end() not to return references to data members

2025-06-14 21:26:17 +03:00

index

interval: rename start() to start_ref() (and end() etc).

2025-06-14 21:26:16 +03:00

lang

treewide: Reduce db/config.hh header fanout

2025-02-25 15:16:40 +01:00

licenses

…

locator

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

message

dht: fragment token_range_vector

2025-05-27 14:47:24 +03:00

mutation

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

mutation_writer

readers/mutation_reader: s/reader_consumer_v2/mutation_reader_consumer/

2025-05-09 07:53:29 -04:00

node_ops

tasks: check whether a node is alive before rpc

2025-04-17 12:51:22 +02:00

pgo

Update pgo profiles - aarch64

2025-06-15 04:57:59 +03:00

raft

raft: server_impl: use named gate

2025-04-12 11:28:48 +03:00

readers

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

redis

transport: generic_server: remove no longer used connection advertising code

2025-05-27 19:31:09 +02:00

reloc

…

repair

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

replica

replica/memtable: move region_listener handlers from dirty_memory_manager to memtable

2025-06-20 11:42:30 +02:00

rust

rust: update dependencies

2025-03-04 09:45:23 +02:00

schema

Revert "Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz"

2025-06-16 22:38:12 +03:00

scripts

alternator/stats.cc, metrics-config.yml: docs fix per-table metrics

2025-06-15 18:06:36 +03:00

seastar @ 26badcb14c

Update seastar submodule

2025-06-03 13:47:05 +03:00

service

Revert "Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz"

2025-06-16 22:38:12 +03:00

sstables

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

streaming

Merge 'token_range_vector: fragment' from Avi Kivity

2025-05-29 18:45:13 +02:00

swagger-ui @ 12f1da1082

…

tasks

test: add test for getting tasks children

2025-04-17 13:48:44 +02:00

test

replica/memtable: move region_listener handlers from dirty_memory_manager to memtable

2025-06-20 11:42:30 +02:00

tools

Merge 'test: introduce upgrade tests to test.py, add a SSTable dict compression upgrade test' from Michał Chojnowski

2025-06-18 12:21:21 +03:00

tracing

tracing: trace_keyspace_helper: use named gate

2025-04-12 11:29:48 +03:00

transport

Revert "Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz"

2025-06-16 22:38:12 +03:00

types

…

unified

…

utils

s3: Mark claimed_buffer constructor noexcept

2025-06-18 20:36:45 +03:00

.clang-format

…

.dockerignore

…

.gitattributes

…

.gitignore

…

.gitmodules

build: replace tools/java submodule with packaged cassandra-stress

2025-04-15 10:11:28 +03:00

.gitorderfile

…

.mailmap

…

absl-flat_hash_map.cc

…

absl-flat_hash_map.hh

…

amplify.yml

…

backlog_controller.hh

…

build_mode.hh

…

bytes_fwd.hh

…

bytes_ostream.hh

…

bytes.cc

…

bytes.hh

bytes: adapt fmt_hex to std::span<const std::byte>

2025-04-01 00:07:27 +02:00

cache_temperature.hh

…

cartesian_product.hh

…

cell_locking.hh

…

client_data.cc

…

client_data.hh

…

clocks-impl.cc

…

clocks-impl.hh

…

clustering_bounds_comparator.hh

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

clustering_interval_set.hh

…

clustering_key_filter.hh

…

clustering_ranges_walker.hh

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

CMakeLists.txt

Move direct_failure_detector from root to service/

2025-04-08 13:03:24 +03:00

collection_mutation.cc

…

collection_mutation.hh

…

column_computation.hh

…

combine.hh

…

compound_compat.hh

…

compound.hh

…

compress.cc

transport/server: silence the oversized allocation warning in snappy_compress

2025-06-10 19:13:26 +03:00

compress.hh

db/config: add an option that disables dict-aware sstable compressors in DDL statements

2025-06-09 13:30:40 +03:00

concrete_types.hh

…

configure.py

Revert "Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz"

2025-06-16 22:38:12 +03:00

CONTRIBUTING.md

…

converting_mutation_partition_applier.cc

…

converting_mutation_partition_applier.hh

…

counters.cc

…

counters.hh

…

coverage_excludes.txt

…

coverage_sources.list

…

cql_serialization_format.hh

…

db_clock.hh

…

debug.cc

…

debug.hh

…

default.nix

…

Doxyfile

…

duration.cc

…

duration.hh

…

encoding_stats.hh

…

enum_set.hh

…

fix_system_distributed_tables.py

…

flake.lock

…

flake.nix

…

frozen_schema.cc

Revert "Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz"

2025-06-16 22:38:12 +03:00

frozen_schema.hh

Revert "Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz"

2025-06-16 22:38:12 +03:00

full_position.hh

…

gc_clock.hh

…

gdbinit

…

gen_segmented_compress_params.py

…

generic_server.cc

generic_server: make shutdown() return void

2025-05-27 19:31:09 +02:00

generic_server.hh

generic_server: make shutdown() return void

2025-05-27 19:31:09 +02:00

HACKING.md

build: replace tools/java submodule with packaged cassandra-stress

2025-04-15 10:11:28 +03:00

hashing_partition_visitor.hh

…

idl-compiler.py

idl, message: make with_timeout and cancellable verb attributes composable

2025-04-30 11:45:51 +03:00

inet_address_vectors.hh

…

init.cc

compress: distribute compression dictionaries over shards

2025-05-07 14:43:18 +02:00

init.hh

Move object_storage.yaml endpoints to scylla.yaml

2025-03-31 13:39:39 +03:00

install-dependencies.sh

toolchain: set scylla-driver release based on tools/cqlsh

2025-05-15 06:08:14 +03:00

install.sh

install.sh: simplify check_usermode_support()

2025-02-24 11:29:30 +03:00

interval.hh

interval: reduce sizeof

2025-06-14 21:29:43 +03:00

keys.cc

…

keys.hh

…

LICENSE-ScyllaDB-Source-Available.md

…

main.cc

main: don't start maintenance auth service if not enabled

2025-06-18 11:27:08 +02:00

map_difference.hh

…

marshal_exception.hh

…

multishard_mutation_query.cc

readers/mutation_source: s/make_reader_v2/make_mutation_reader/

2025-05-09 07:53:29 -04:00

multishard_mutation_query.hh

…

mutation_query.cc

…

mutation_query.hh

…

NOTICE.txt

…

ORIGIN

…

partition_builder.hh

…

partition_range_compat.hh

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

partition_slice_builder.cc

tree: Remove unused boost headers

2025-02-25 10:32:32 +03:00

partition_slice_builder.hh

…

partition_snapshot_reader.hh

…

protocol_server.hh

…

querier.cc

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

querier.hh

readers/mutation_source: s/make_reader_v2/make_mutation_reader/

2025-05-09 07:53:29 -04:00

query_id.hh

…

query_ranges_to_vnodes.cc

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

query_ranges_to_vnodes.hh

…

query_result_merger.hh

…

query-request.hh

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

query-result-reader.hh

…

query-result-set.cc

…

query-result-set.hh

…

query-result-writer.hh

…

query-result.hh

…

query.cc

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

reader_concurrency_semaphore_group.cc

…

reader_concurrency_semaphore_group.hh

…

reader_concurrency_semaphore.cc

reader_concurrency_semaphore: use named gate

2025-04-12 11:28:48 +03:00

reader_concurrency_semaphore.hh

reader_concurrency_semaphore: use named gate

2025-04-12 11:28:48 +03:00

reader_permit.hh

…

README.md

…

real_dirty_memory_accounter.hh

…

release.cc

…

release.hh

…

reversibly_mergeable.hh

…

schema_mutations.cc

…

schema_mutations.hh

…

schema_upgrader.hh

…

scylla_post_install.sh

…

scylla-gdb.py

gdb: adjust unordered container accessors for libstdc++15

2025-06-18 09:15:03 +03:00

SCYLLA-VERSION-GEN

Update ScyllaDB version to: 2025.3.0-dev

2025-05-07 11:43:11 +03:00

seastarx.hh

…

serialization_visitors.hh

…

serializer_impl.hh

serializer_impl.hh: add as_input_stream(managed_bytes_view) overload

2025-05-13 10:32:32 +02:00

serializer.cc

…

serializer.hh

…

service_permit.hh

…

shell.nix

…

sstable_dict_autotrainer.cc

compress: distribute compression dictionaries over shards

2025-05-07 14:43:18 +02:00

sstable_dict_autotrainer.hh

dict_autotrainer: introduce sstable_dict_autotrainer

2025-04-01 00:07:30 +02:00

sstables_loader.cc

Add support for nodetool refresh --skip-reshape

2025-06-10 12:52:13 +03:00

sstables_loader.hh

Add support for nodetool refresh --skip-reshape

2025-06-10 12:52:13 +03:00

supervisor.hh

…

table_helper.cc

…

table_helper.hh

…

test.py

Merge 'test: introduce upgrade tests to test.py, add a SSTable dict compression upgrade test' from Michał Chojnowski

2025-06-18 12:21:21 +03:00

timeout_config.cc

…

timeout_config.hh

…

timestamp.hh

…

tombstone_gc_extension.hh

schema: deprecate schema_extension

2025-03-19 20:36:16 +02:00

tombstone_gc_options.cc

…

tombstone_gc_options.hh

…

tombstone_gc-internals.hh

…

tombstone_gc.cc

Merge 'scylla sstable: Add standard extensions and propagate to schema load ' from Calle Wilund

2025-02-26 13:52:47 +02:00

tombstone_gc.hh

…

ubsan-suppressions.supp

…

unimplemented.cc

…

unimplemented.hh

…

validation.cc

tree: Make values mutable to enable move semantics

2025-03-03 13:53:02 +03:00

validation.hh

…

version.hh

…

view_info.hh

base_info: remove the lw_shared_ptr variant

2025-04-24 01:08:40 +02:00

vint-serialization.cc

…

vint-serialization.hh

…

README.md

Scylla

What is Scylla?

Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB. Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.

For more information, please see the ScyllaDB web site.

Build Prerequisites

Scylla is fairly fussy about its build environment, requiring very recent versions of the C++23 compiler and of many libraries to build. The document HACKING.md includes detailed information on building and developing Scylla, but to get Scylla building quickly on (almost) any build machine, Scylla offers a frozen toolchain. This is a pre-configured Docker image which includes recent versions of all the required compilers, libraries and build tools. Using the frozen toolchain allows you to avoid changing anything on your build machine to meet Scylla's requirements - you just need to meet the frozen toolchain's prerequisites (mostly, Docker or Podman being available).

Building Scylla

Building Scylla with the frozen toolchain dbuild is as easy as:

$ git submodule update --init --force --recursive
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla

For further information, please see:

Developer documentation for more information on building Scylla.
Build documentation on how to build Scylla binaries, tests, and packages.
Docker image build documentation for information on how to build Docker images.

Running Scylla

To start the Scylla server, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1

This will start a Scylla node with one CPU core allocated to it and data files stored in the tmp directory. The --developer-mode option is needed to disable the various checks Scylla performs at startup to ensure the machine is configured for maximum performance (not relevant on development workstations). Please note that you need to run Scylla with dbuild if you built it with the frozen toolchain.

For more run options, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --help

Testing

See test.py manual.

Scylla APIs and compatibility

By default, Scylla is compatible with Apache Cassandra and its API - CQL. There is also support for the API of Amazon DynamoDB™, which needs to be enabled and configured in order to be used. For more information on how to enable the DynamoDB™ API in Scylla, and the current compatibility of this feature as well as Scylla-specific extensions, see Alternator and Getting started with Alternator.

Documentation

Documentation can be found here. Seastar documentation can be found here. User documentation can be found here.

Training

Training material and online courses can be found at Scylla University. The courses are free, self-paced and include hands-on examples. They cover a variety of topics including Scylla data modeling, administration, architecture, basic NoSQL concepts, using drivers for application development, Scylla setup, failover, compactions, multi-datacenters and how Scylla integrates with third-party applications.

Contributing to Scylla

If you want to report a bug or submit a pull request or a patch, please read the contribution guidelines.

If you are a developer working on Scylla, please read the developer guidelines.

Contact

The community forum and Slack channel are for users to discuss configuration, management, and operations of ScyllaDB.
The developers mailing list is for developers and people interested in following the development of ScyllaDB to discuss technical topics.

Languages

C++ 72.5%

Python 26.2%

CMake 0.4%

GAP 0.3%

Shell 0.3%