
Michał Chojnowski 975e7e405a memtable: ensure _flushed_memory doesn't grow above total memory usage

dirty_memory_manager tracks two quantities about memtable memory usage:
"real" and "unspooled" memory usage.

"real" is the total memory usage (sum of `occupancy().total_space()`)
by all memtable LSA regions, plus an upper-bound estimate of the size of
memtable data which has already moved to the cache region but isn't
evictable (merged into the cache) yet.

"unspooled" is the difference between total memory usage by all memtable
LSA regions, and the total flushed memory (sum of `_flushed_memory`)
of memtables.

dirty_memory_manager controls the shares of compaction and/or blocks
writes when these quantities cross various thresholds.

"Total flushed memory" isn't a well defined notion,
since the actual consumption of memory by the same data can vary over
time due to LSA compactions, and even the data present in memtable can
change over the course of the flush due to removals of outdated MVCC versions.
So `_flushed_memory` is merely an approximation computed by `flush_reader`
based on the data passing through it.

This approximation is supposed to be a conservative lower bound.
In particular, `_flushed_memory` should not be greater than
`occupancy().total_space()`. Otherwise, for example, "unspooled" memory
could become negative (and/or wrap around) and weird things could happen.
There is an assertion in ~flush_memory_accounter which checks that
`_flushed_memory < occupancy().total_space()` at the end of flush.
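
To illustrate why the lower bound matters, here is a minimal sketch of the wrap-around hazard. It assumes the quantities are held in unsigned 64-bit counters, which is an assumption made for illustration; `naive_unspooled` and `lower_bound_holds` are hypothetical helpers, not functions from the ScyllaDB source.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: "unspooled" computed as a plain unsigned difference.
// If flushed_memory exceeds total_space, the result wraps modulo 2^64
// instead of going negative.
inline uint64_t naive_unspooled(uint64_t total_space, uint64_t flushed_memory) {
    return total_space - flushed_memory;
}

// Hypothetical helper: the property the flush accounting is expected to keep.
inline bool lower_bound_holds(uint64_t total_space, uint64_t flushed_memory) {
    return flushed_memory <= total_space;
}
```

With `total_space == 100` and `flushed_memory == 101`, `naive_unspooled` yields `UINT64_MAX`, so a threshold check against it would see an enormous amount of unspooled memory.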

But it can fail. Without additional treatment, the memtable reader sometimes emits
data which is already deleted. (In particular, it emits rows covered by
a partition tombstone in a newer MVCC version.)
This data is seen by `flush_reader` and accounted in `_flushed_memory`.
But this data can be garbage-collected by the mutation_cleaner later during the
flush and decrease `total_memory` below `_flushed_memory`.

There is a piece of code in mutation_cleaner intended to prevent that.
If `total_memory` decreases during a `mutation_cleaner` run,
`_flushed_memory` is lowered by the same amount, just to preserve the
asserted property. (This could also make `_flushed_memory` quite inaccurate,
but that's considered acceptable).

But that only works if `total_memory` is decreased during that run. It doesn't
work if the `total_memory` decrease (enabled by the new allocator holes made
by `mutation_cleaner`'s garbage collection work) happens asynchronously
(due to memory reclaim for whatever reason) after the run.
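
The gap described above can be modelled in a few lines. This is a toy model under stated assumptions, not the actual `mutation_cleaner` code: `region_model` and `cleaner_run_with_compensation` are hypothetical names, and the "work" callback stands in for one garbage-collection run.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified state of one memtable region during a flush.
struct region_model {
    int64_t total_memory;    // stands in for occupancy().total_space()
    int64_t flushed_memory;  // stands in for _flushed_memory
};

// Toy version of the pre-patch scheme: only the decrease of total_memory
// observed *during* the run is subtracted from flushed_memory.
template <typename Work>
void cleaner_run_with_compensation(region_model& r, Work&& work) {
    const int64_t before = r.total_memory;
    work(r);
    const int64_t decrease = before - r.total_memory;
    if (decrease > 0) {
        r.flushed_memory -= decrease;
    }
}
```

If the run itself shrinks `total_memory` from 100 to 80, `flushed_memory` drops by 20 as well and the bound survives. If the run only leaves holes and a later reclaim shrinks `total_memory` by 20, nothing adjusts `flushed_memory`, and a value of 90 ends up above the new total of 80.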

This patch fixes that by tracking the decreases of `total_memory` closer to the
source. Instead of relying on `mutation_cleaner` to notify the memtable if it
lowers `total_memory`, the memtable itself listens for notifications about
LSA segment deallocations. It keeps `_flushed_memory` equal to the reader's
estimate of flushed memory decreased by the change in `total_memory` since the
beginning of flush (if it was positive), and it keeps the amount of "spooled"
memory reported to the `dirty_memory_manager` at `max(0, _flushed_memory)`.
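
The resulting bookkeeping can be sketched as follows. This is one reading of the description above, not the patch itself: `flush_accounting_sketch` and its members are hypothetical names, signed 64-bit counters are an assumption, and the sketch further assumes the reader's estimate never exceeds the total memory measured at the start of the flush.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of the accounting for one memtable flush.
struct flush_accounting_sketch {
    int64_t total_at_flush_start;  // total_memory when the flush began
    int64_t total_now;             // current total_memory of the region
    int64_t reader_estimate = 0;   // bytes accounted by the flush reader so far

    explicit flush_accounting_sketch(int64_t total)
        : total_at_flush_start(total), total_now(total) {}

    // The flush reader has passed `bytes` of data downstream.
    void on_data_flushed(int64_t bytes) { reader_estimate += bytes; }

    // An LSA segment was allocated (delta > 0) or deallocated (delta < 0).
    void on_total_memory_change(int64_t delta) { total_now += delta; }

    // Reader's estimate, lowered by the net decrease of total memory since
    // the start of the flush (only when there was a net decrease).
    int64_t flushed_memory() const {
        const int64_t decrease =
            std::max<int64_t>(0, total_at_flush_start - total_now);
        return reader_estimate - decrease;
    }

    // Amount reported as "spooled": never negative.
    int64_t spooled() const { return std::max<int64_t>(0, flushed_memory()); }

    // "Unspooled" stays within [0, total_now] under the stated assumption.
    int64_t unspooled() const {
        assert(spooled() <= total_now);
        return total_now - spooled();
    }
};
```

For example, starting from a total of 100 with 80 bytes accounted by the reader, a deallocation of 50 lowers the flushed figure to 30 and leaves 20 unspooled; a further deallocation of 40 pushes the flushed figure to -10, which is reported as 0 spooled.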

2025-06-20 11:42:30 +02:00

.github

.github/workflows/conflict_reminder: reduce the amount of conflict reminder for every push event

2025-05-28 11:01:44 +03:00

abseil @ d7aaad83b4

…

alternator

alternator/stats.cc, metrics-config.yml: docs fix per-table metrics

2025-06-15 18:06:36 +03:00

api

api: Shorten get_simple_states() handler

2025-06-16 15:21:27 +03:00

audit

audit: add semaphore to audit_syslog_storage_helper

2025-04-08 16:24:42 +02:00

auth

Revert "Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz"

2025-06-16 22:38:12 +03:00

bin

treewide: improve bash error reporting

2025-02-10 18:28:52 +03:00

cdc

cdc: add sanity check for generating an empty generation

2025-04-25 11:25:07 +02:00

cmake

build: when compiling without -g, don't leave debugging information

2025-05-12 15:42:17 +03:00

compaction

Merge 'Extend compaction_history table with additional compaction statistics' from Łukasz Paszkowski

2025-05-27 14:12:13 +03:00

conf

Merge 'Add tablet enforcing option' from Benny Halevy

2025-04-03 16:32:19 +03:00

cql3

Revert "Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz"

2025-06-16 22:38:12 +03:00

data_dictionary

Merge 'scylla-sstable: add native S3 support' from Ernest Zaslavsky

2025-03-14 15:05:52 +02:00

Revert "Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz"

2025-06-16 22:38:12 +03:00

debug

…

dht

interval: reduce sizeof

2025-06-14 21:29:43 +03:00

dist

dist: rpm: override %_sbindir for Fedora 42

2025-05-16 12:05:29 +02:00

docs

doc: add support for z3 GCP

2025-06-17 13:50:46 +03:00

ent

encryption/utils: Move encryption httpclient to "general" REST client

2025-05-30 12:21:51 +03:00

exceptions

transport: storage_proxy: release ERM when waiting for query timeout

2025-04-23 09:29:47 +02:00

gms

load_and_stream: Add abortion flow to mutation streaming

2025-05-27 14:21:58 +03:00

idl

interval: change start()/end() not to return references to data members

2025-06-14 21:26:17 +03:00

index

interval: rename start() to start_ref() (and end() etc).

2025-06-14 21:26:16 +03:00

lang

treewide: Reduce db/config.hh header fanout

2025-02-25 15:16:40 +01:00

licenses

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

locator

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

message

dht: fragment token_range_vector

2025-05-27 14:47:24 +03:00

mutation

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

mutation_writer

readers/mutation_reader: s/reader_consumer_v2/mutation_reader_consumer/

2025-05-09 07:53:29 -04:00

node_ops

tasks: check whether a node is alive before rpc

2025-04-17 12:51:22 +02:00

pgo

Update pgo profiles - aarch64

2025-06-15 04:57:59 +03:00

raft

raft: server_impl: use named gate

2025-04-12 11:28:48 +03:00

readers

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

redis

transport: generic_server: remove no longer used connection advertising code

2025-05-27 19:31:09 +02:00

reloc

treewide: improve bash error reporting

2025-02-10 18:28:52 +03:00

repair

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

replica

memtable: ensure _flushed_memory doesn't grow above total memory usage

2025-06-20 11:42:30 +02:00

rust

rust: update dependencies

2025-03-04 09:45:23 +02:00

schema

Revert "Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz"

2025-06-16 22:38:12 +03:00

scripts

alternator/stats.cc, metrics-config.yml: docs fix per-table metrics

2025-06-15 18:06:36 +03:00

seastar @ 26badcb14c

Update seastar submodule

2025-06-03 13:47:05 +03:00

service

Revert "Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz"

2025-06-16 22:38:12 +03:00

sstables

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

streaming

Merge 'token_range_vector: fragment' from Avi Kivity

2025-05-29 18:45:13 +02:00

swagger-ui @ 12f1da1082

…

tasks

test: add test for getting tasks children

2025-04-17 13:48:44 +02:00

test

replica/memtable: move region_listener handlers from dirty_memory_manager to memtable

2025-06-20 11:42:30 +02:00

tools

Merge 'test: introduce upgrade tests to test.py, add a SSTable dict compression upgrade test' from Michał Chojnowski

2025-06-18 12:21:21 +03:00

tracing

tracing: trace_keyspace_helper: use named gate

2025-04-12 11:29:48 +03:00

transport

Revert "Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz"

2025-06-16 22:38:12 +03:00

types

allow "UTC" and "GMT" in string format of timestamp

2025-02-12 09:38:28 +02:00

unified

treewide: improve bash error reporting

2025-02-10 18:28:52 +03:00

utils

s3: Mark claimed_buffer constructor noexcept

2025-06-18 20:36:45 +03:00

.clang-format

…

.dockerignore

…

.gitattributes

configure.py: prepare the build for a default PGO profile in version control

2024-12-27 16:16:04 +08:00

.gitignore

…

.gitmodules

build: replace tools/java submodule with packaged cassandra-stress

2025-04-15 10:11:28 +03:00

.gitorderfile

…

.mailmap

…

absl-flat_hash_map.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

absl-flat_hash_map.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

amplify.yml

…

backlog_controller.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

build_mode.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

bytes_fwd.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

bytes_ostream.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

bytes.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

bytes.hh

bytes: adapt fmt_hex to std::span<const std::byte>

2025-04-01 00:07:27 +02:00

cache_temperature.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

cartesian_product.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

cell_locking.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

client_data.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

client_data.hh

transport/server: use scheduling group assigned to current user

2025-01-02 07:13:34 +01:00

clocks-impl.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

clocks-impl.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

clustering_bounds_comparator.hh

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

clustering_interval_set.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

clustering_key_filter.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

clustering_ranges_walker.hh

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

CMakeLists.txt

Move direct_failure_detector from root to service/

2025-04-08 13:03:24 +03:00

collection_mutation.cc

tree: Remove unused boost headers

2025-02-15 20:32:22 +02:00

collection_mutation.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

column_computation.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

combine.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

compound_compat.hh

utils: do not include unused headers

2025-01-14 07:56:39 -05:00

compound.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

compress.cc

transport/server: silence the oversized allocation warning in snappy_compress

2025-06-10 19:13:26 +03:00

compress.hh

db/config: add an option that disables dict-aware sstable compressors in DDL statements

2025-06-09 13:30:40 +03:00

concrete_types.hh

types: implement vector_type_impl

2025-01-26 19:36:41 +01:00

configure.py

Revert "Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz"

2025-06-16 22:38:12 +03:00

CONTRIBUTING.md

Fix typos

2025-02-11 00:17:43 +02:00

converting_mutation_partition_applier.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

converting_mutation_partition_applier.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

counters.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

counters.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

coverage_excludes.txt

…

coverage_sources.list

…

cql_serialization_format.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

db_clock.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

debug.cc

gdb: protect debug::the_database from lto

2025-01-23 22:26:04 +02:00

debug.hh

gdb: protect debug::the_database from lto

2025-01-23 22:26:04 +02:00

default.nix

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

Doxyfile

…

duration.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

duration.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

encoding_stats.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

enum_set.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

fix_system_distributed_tables.py

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

flake.lock

…

flake.nix

…

frozen_schema.cc

Revert "Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz"

2025-06-16 22:38:12 +03:00

frozen_schema.hh

Revert "Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz"

2025-06-16 22:38:12 +03:00

full_position.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

gc_clock.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

gdbinit

…

gen_segmented_compress_params.py

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

generic_server.cc

generic_server: make shutdown() return void

2025-05-27 19:31:09 +02:00

generic_server.hh

generic_server: make shutdown() return void

2025-05-27 19:31:09 +02:00

HACKING.md

build: replace tools/java submodule with packaged cassandra-stress

2025-04-15 10:11:28 +03:00

hashing_partition_visitor.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

idl-compiler.py

idl, message: make with_timeout and cancellable verb attributes composable

2025-04-30 11:45:51 +03:00

inet_address_vectors.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

init.cc

compress: distribute compression dictionaries over shards

2025-05-07 14:43:18 +02:00

init.hh

Move object_storage.yaml endpoints to scylla.yaml

2025-03-31 13:39:39 +03:00

install-dependencies.sh

toolchain: set scylla-driver release based on tools/cqlsh

2025-05-15 06:08:14 +03:00

install.sh

install.sh: simplify check_usermode_support()

2025-02-24 11:29:30 +03:00

interval.hh

interval: reduce sizeof

2025-06-14 21:29:43 +03:00

keys.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

keys.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

LICENSE-ScyllaDB-Source-Available.md

Fix typos

2025-02-13 01:54:08 +02:00

main.cc

main: don't start maintenance auth service if not enabled

2025-06-18 11:27:08 +02:00

map_difference.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

marshal_exception.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

multishard_mutation_query.cc

readers/mutation_source: s/make_reader_v2/make_mutation_reader/

2025-05-09 07:53:29 -04:00

multishard_mutation_query.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

mutation_query.cc

schema: deinline some speculative_retry methods

2025-01-02 12:28:33 +01:00

mutation_query.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

NOTICE.txt

…

ORIGIN

…

partition_builder.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

partition_range_compat.hh

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

partition_slice_builder.cc

tree: Remove unused boost headers

2025-02-25 10:32:32 +03:00

partition_slice_builder.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

partition_snapshot_reader.hh

moved cache files to db

2025-02-04 12:21:31 +03:00

protocol_server.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

querier.cc

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

querier.hh

readers/mutation_source: s/make_reader_v2/make_mutation_reader/

2025-05-09 07:53:29 -04:00

query_id.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

query_ranges_to_vnodes.cc

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

query_ranges_to_vnodes.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

query_result_merger.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

query-request.hh

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

query-result-reader.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

query-result-set.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

query-result-set.hh

db: prevent accidental copies of result_set_row by making it move-only

2025-02-17 09:48:08 +02:00

query-result-writer.hh

query-result-writer: reorder initialization to prevent use-after-move

2025-02-17 13:45:35 +03:00

query-result.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

query.cc

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

reader_concurrency_semaphore_group.cc

treewide: fix misspellings

2025-01-05 16:13:09 +02:00

reader_concurrency_semaphore_group.hh

tree: Remove unused boost headers

2025-02-15 20:32:22 +02:00

reader_concurrency_semaphore.cc

reader_concurrency_semaphore: use named gate

2025-04-12 11:28:48 +03:00

reader_concurrency_semaphore.hh

reader_concurrency_semaphore: use named gate

2025-04-12 11:28:48 +03:00

reader_permit.hh

reader_permit: mark check_abort() as const

2025-02-07 01:32:35 -05:00

README.md

README: adjust to reflect license change

2025-01-30 10:28:32 +03:00

real_dirty_memory_accounter.hh

moved cache files to db

2025-02-04 12:21:31 +03:00

release.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

release.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

reversibly_mergeable.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

schema_mutations.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

schema_mutations.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

schema_upgrader.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

scylla_post_install.sh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

scylla-gdb.py

gdb: adjust unordered container accessors for libstdc++15

2025-06-18 09:15:03 +03:00

SCYLLA-VERSION-GEN

Update ScyllaDB version to: 2025.3.0-dev

2025-05-07 11:43:11 +03:00

seastarx.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

serialization_visitors.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

serializer_impl.hh

serializer_impl.hh: add as_input_stream(managed_bytes_view) overload

2025-05-13 10:32:32 +02:00

serializer.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

serializer.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

service_permit.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

shell.nix

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

sstable_dict_autotrainer.cc

compress: distribute compression dictionaries over shards

2025-05-07 14:43:18 +02:00

sstable_dict_autotrainer.hh

dict_autotrainer: introduce sstable_dict_autotrainer

2025-04-01 00:07:30 +02:00

sstables_loader.cc

Add support for nodetool refresh --skip-reshape

2025-06-10 12:52:13 +03:00

sstables_loader.hh

Add support for nodetool refresh --skip-reshape

2025-06-10 12:52:13 +03:00

supervisor.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

table_helper.cc

audit: Add service level support to CQL login process

2025-01-15 11:10:36 +01:00

table_helper.hh

audit: Add the audit subsystem

2025-01-15 11:10:35 +01:00

test.py

Merge 'test: introduce upgrade tests to test.py, add a SSTable dict compression upgrade test' from Michał Chojnowski

2025-06-18 12:21:21 +03:00

timeout_config.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

timeout_config.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

timestamp.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

tombstone_gc_extension.hh

schema: deprecate schema_extension

2025-03-19 20:36:16 +02:00

tombstone_gc_options.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

tombstone_gc_options.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

tombstone_gc-internals.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

tombstone_gc.cc

Merge 'scylla sstable: Add standard extensions and propagate to schema load ' from Calle Wilund

2025-02-26 13:52:47 +02:00

tombstone_gc.hh

repair: Wire repair_time in system.tablets for tombstone gc

2025-01-17 16:12:05 +08:00

ubsan-suppressions.supp

…

unimplemented.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

unimplemented.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

validation.cc

tree: Make values mutable to enable move semantics

2025-03-03 13:53:02 +03:00

validation.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

version.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

view_info.hh

base_info: remove the lw_shared_ptr variant

2025-04-24 01:08:40 +02:00

vint-serialization.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

vint-serialization.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

README.md

Scylla

What is Scylla?

Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB. Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.

For more information, please see the ScyllaDB web site.

Build Prerequisites

Scylla is fairly fussy about its build environment, requiring a very recent C++23 compiler and recent versions of many libraries to build. The document HACKING.md includes detailed information on building and developing Scylla, but to get Scylla building quickly on (almost) any build machine, Scylla offers a frozen toolchain. This is a pre-configured Docker image which includes recent versions of all the required compilers, libraries and build tools. Using the frozen toolchain allows you to avoid changing anything in your build machine to meet Scylla's requirements - you just need to meet the frozen toolchain's prerequisites (mostly, Docker or Podman being available).

Building Scylla

Building Scylla with the frozen toolchain dbuild is as easy as:

$ git submodule update --init --force --recursive
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla

For further information, please see:

Developer documentation for more information on building Scylla.
Build documentation on how to build Scylla binaries, tests, and packages.
Docker image build documentation for information on how to build Docker images.

Running Scylla

To start the Scylla server, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1

This will start a Scylla node with one CPU core allocated to it and data files stored in the tmp directory. The --developer-mode option is needed to disable the various checks Scylla performs at startup to ensure the machine is configured for maximum performance (these checks are not relevant on development workstations). Please note that you need to run Scylla with dbuild if you built it with the frozen toolchain.

For more run options, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --help

Testing

See test.py manual.

Scylla APIs and compatibility

By default, Scylla is compatible with Apache Cassandra and its API - CQL. There is also support for the API of Amazon DynamoDB™, which needs to be enabled and configured in order to be used. For more information on how to enable the DynamoDB™ API in Scylla, and the current compatibility of this feature as well as Scylla-specific extensions, see Alternator and Getting started with Alternator.

Documentation

Documentation can be found here. Seastar documentation can be found here. User documentation can be found here.

Training

Training material and online courses can be found at Scylla University. The courses are free, self-paced and include hands-on examples. They cover a variety of topics including Scylla data modeling, administration, architecture, basic NoSQL concepts, using drivers for application development, Scylla setup, failover, compactions, multi-datacenters and how Scylla integrates with third-party applications.

Contributing to Scylla

If you want to report a bug or submit a pull request or a patch, please read the contribution guidelines.

If you are a developer working on Scylla, please read the developer guidelines.

Contact

The community forum and Slack channel are for users to discuss configuration, management, and operations of ScyllaDB.
The developers mailing list is for developers and people interested in following the development of ScyllaDB to discuss technical topics.

Languages

C++ 73.5%

Python 25.3%

CMake 0.3%

GAP 0.3%

Shell 0.3%