mirror of https://github.com/scylladb/scylladb.git synced 2026-06-01 20:46:56 +00:00

Avi Kivity f73e2c992f Merge 'Keep range tombstones with rows in memtables and cache' from Tomasz Grabiec

This series switches memtable and cache to use a new representation for mutation data,
called `mutation_partition_v2`. In this representation, range tombstone information is stored
in the same tree as rows, attached to row entries. Each entry has a new tombstone field,
which holds the part of the range tombstone that applies to the interval between this entry and
the previous one. See docs/dev/mvcc.md for more details about the format.
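
To make the layout concrete, here is a minimal sketch of the idea, not the real `mutation_partition_v2` code. The names `toy_tombstone`, `toy_entry`, `toy_partition` and `covering_tombstone` are hypothetical, keys are plain integers, and the sketch assumes the interval is open at the previous entry and closed at this entry; the text above only says "between this entry and the previous one".

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical, simplified stand-ins; not ScyllaDB's real types.
struct toy_tombstone {
    int64_t timestamp = INT64_MIN;  // INT64_MIN means "no tombstone"
    explicit operator bool() const { return timestamp != INT64_MIN; }
};

struct toy_entry {
    std::optional<std::string> row;  // empty for a boundary-only entry
    toy_tombstone range_tombstone;   // assumed to cover (previous key, this key]
};

// A single ordered tree holds rows and range tombstone information together.
using toy_partition = std::map<int, toy_entry>;

// One tree lookup yields the range tombstone that covers `key`.
toy_tombstone covering_tombstone(const toy_partition& p, int key) {
    auto it = p.lower_bound(key);  // first entry at or after `key`
    return it == p.end() ? toy_tombstone{} : it->second.range_tombstone;
}

toy_partition make_example() {
    toy_partition p;
    p[10] = {std::string("a"), {}};
    p[20] = {std::nullopt, {5}};  // boundary entry: (10, 20] deleted at timestamp 5
    p[30] = {std::string("b"), {}};
    return p;
}
```

In this toy model a reader positioned anywhere in (10, 20] finds the tombstone on the entry at 20 with the same lookup it uses to find rows, with no second tree involved.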

The transient mutation object still uses the old model in order to avoid work needed to adapt
old code to the new model. It may also be a good idea to live with two models, since the
transient mutation has different requirements and thus different trade-offs can be made.
Transient mutation doesn't need to support eviction and strong exception guarantees,
so its algorithms and in-memory representation can be simpler.

The new representation allows us to incrementally evict range tombstone information. Before this series,
range tombstones were accumulated and evicted only when the whole partition entry was evicted. This
could lead to inefficient use of cache memory.
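
A rough sketch of what entry-level eviction could look like on a toy model follows. Everything in it is an assumption made for illustration: the names `sketch_entry`, `sketch_cache` and `evict_one` are hypothetical, and so is the `continuous` flag used to signal that an interval is no longer fully cached. It is not the eviction code from this series.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical cache entry; names and layout are illustrative only.
struct sketch_entry {
    int64_t rt_timestamp = 0;  // tombstone for (previous key, this key]; 0 = none
    bool continuous = true;    // assumed flag: is (previous key, this key] fully cached?
};

using sketch_cache = std::map<int, sketch_entry>;

// Evict a single entry. Its tombstone information is released together with
// it, and the successor's interval is marked incomplete because it now spans
// the gap left behind.
bool evict_one(sketch_cache& rows, int key) {
    auto it = rows.find(key);
    if (it == rows.end()) {
        return false;
    }
    it = rows.erase(it);  // erase() returns the successor
    if (it != rows.end()) {
        it->second.continuous = false;
    }
    return true;
}

sketch_cache eviction_example() {
    sketch_cache rows;
    rows[10] = {0, true};
    rows[20] = {5, true};  // (10, 20] deleted at timestamp 5
    rows[30] = {7, true};  // (20, 30] deleted at timestamp 7
    evict_one(rows, 20);
    return rows;
}
```

The point of the sketch is only that one entry and the tombstone attached to it can be dropped while the rest of the partition stays cached. With a separate tombstone container, that memory would be held until the whole partition entry is evicted.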

Another advantage of the new representation is that reads don't have to look up
range tombstone information in a different tree while reading. This leads to simpler
and more efficient readers.

There are several disadvantages too. Firstly, rows_entry is now larger by 16 bytes.
Secondly, update algorithms are more complex because they need to deoverlap range tombstone
information. Also, to handle preemption and provide strong exception guarantees, update
algorithms may need to allocate sentinel entries, which adds complexity and reduces performance.
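
As an illustration of what deoverlapping means here, the following sketch applies overlapping range tombstones to a toy tree that maps each key to the tombstone timestamp covering the interval (previous key, key]. The names `toy_rows`, `ensure_boundary` and `apply_range_tombstone` are hypothetical, a timestamp of 0 stands for "no tombstone", and the sketch ignores preemption and exception safety entirely, so it needs no sentinel entries.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>

// Toy model: key -> timestamp of the tombstone covering (previous key, key].
// 0 means "no tombstone". Real entries would also carry row data.
using toy_rows = std::map<int, int64_t>;

// Make sure an entry exists at `key`. A new entry splits an existing interval,
// so it inherits the tombstone of the interval it splits.
void ensure_boundary(toy_rows& rows, int key) {
    auto next = rows.lower_bound(key);
    if (next != rows.end() && next->first == key) {
        return;
    }
    int64_t inherited = next == rows.end() ? 0 : next->second;
    rows.emplace_hint(next, key, inherited);
}

// Apply a tombstone covering (start, end] with timestamp `ts`.
void apply_range_tombstone(toy_rows& rows, int start, int end, int64_t ts) {
    ensure_boundary(rows, start);
    ensure_boundary(rows, end);
    // Every interval inside (start, end] keeps the newer of the two tombstones.
    for (auto it = rows.upper_bound(start); it != rows.end() && it->first <= end; ++it) {
        it->second = std::max(it->second, ts);
    }
}

toy_rows overlap_example() {
    toy_rows rows;
    apply_range_tombstone(rows, 10, 30, 5);  // (10, 30] deleted at timestamp 5
    apply_range_tombstone(rows, 20, 40, 7);  // overlaps the first tombstone
    return rows;
}
```

After both calls the tree holds entries at 10, 20, 30 and 40, and the overlapping part (20, 30] carries only the newer timestamp. Each interval stores exactly one tombstone, which is the extra bookkeeping an update has to do compared with appending to a separate list.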

The memtable reader was changed to use the same cursor implementation
that the cache uses, to improve code reuse and reduce the risk of bugs
caused by discrepancies between algorithms that deal with MVCC.

Remaining work:
  - performance optimizations to apply_monotonically() to avoid regressions
  - performance testing
  - preemption support in apply_to_incomplete (cache update from memtable)

Fixes #2578
Fixes #3288
Fixes #10587

Closes #12048

* github.com:scylladb/scylladb:
  test: mvcc: Extend some scenarios with exhaustive consistency checks on eviction
  test: mvcc: Extract mvcc_container::allocate_in_region()
  row_cache, lru: Introduce evict_shallow()
  test: mvcc: Avoid copies of mutation under failure injection
  test: mvcc: Add missing logalloc::reclaim_lock to test_apply_is_atomic
  mutation_partition_v2: Avoid full scan when applying mutation to non-evictable
  Pass is_evictable to apply()
  tests: mutation_partition_v2: Introduce test_external_memory_usage_v2 mirroring the test for v1
  tests: mutation: Fix test_external_memory_usage() to not measure mutation object footprint
  tests: mutation_partition_v2: Add test for exception safety of mutation merging
  tests: Add tests for the mutation_partition_v2 model
  mutation_partition_v2: Implement compact()
  cache_tracker: Extract insert(mutation_partition_v2&)
  mvcc, mutation_partition: Document guarantees in case merging succeeds
  mutation_partition_v2: Accept arbitrary preemption source in apply_monotonically()
  mutation_partition_v2: Simplify get_continuity()
  row_cache: Distinguish dummy insertion site in trace log
  db: Use mutation_partition_v2 in mvcc
  range_tombstone_change_merger: Introduce peek()
  readers: Extract range_tombstone_change_merger
  mvcc: partition_snapshot_row_cursor: Handle non-evictable snapshots
  mvcc: partition_snapshot_row_cursor: Support digest calculation
  mutation_partition_v2: Store range tombstones together with rows
  db: Introduce mutation_partition_v2
  doc: Introduce docs/dev/mvcc.md
  db: cache_tracker: Introduce insert() variant which positions before existing entry in the LRU
  db: Print range_tombstone bounds as position_in_partition
  test: memtable_test: Relax test_segment_migration_during_flush
  test: cache_flat_mutation_reader: Avoid timestamp clash
  test: cache_flat_mutation_reader_test: Use monotonic timestamps when inserting rows
  test: mvcc: Fix sporadic failures due to compact_for_compaction()
  test: lib: random_mutation_generator: Produce partition tombstone less often
  test: lib: random_utils: Introduce with_probability()
  test: lib: Improve error message in has_same_continuity()
  test: mvcc: mvcc_container: Avoid UB in tracker() getter when there is no tracker
  test: mvcc: Insert entries in the tracker
  test: mvcc_test: Do not set dummy::no on non-clustering rows
  mutation_partition: Print full position in error report in append_clustered_row()
  db: mutation_cleaner: Extract make_region_space_guard()
  position_in_partition: Optimize equality check
  mvcc: Fix version merging state resetting
  mutation_partition: apply_resume: Mark operator bool() as explicit

2023-02-05 22:33:10 +02:00

.github

Update CODEOWNERS file

2022-12-06 19:26:03 +02:00

alternator

alternator::streams: Special case single table in list_streams

2023-01-24 09:14:33 +00:00

api

api: get task statuses recursively

2023-01-11 12:34:06 +01:00

auth

api: Add API for resetting authorization cache

2022-06-28 19:58:06 -03:00

cdc

test: relax NULL check test predicate

2023-01-18 10:38:24 +02:00

compaction

compaction: Fix inefficiency when updating LCS backlog tracker

2023-02-01 15:19:07 +02:00

conf

conf: enable consistent_cluster_management by default

2023-01-20 13:29:06 +01:00

cql3

cql/query_options: add a check for missing bind marker name

2023-02-04 02:13:34 +02:00

data_dictionary

data_dictonary: add get_all_keyspaces() and get_user_keyspaces()

2022-12-10 12:51:05 +01:00

Merge 'Keep range tombstones with rows in memtables and cache' from Tomasz Grabiec

2023-02-05 22:33:10 +02:00

debug

…

dht

Merge 'table: trim ranges for compaction group cleanup' from Benny Halevy

2023-01-30 13:11:28 +02:00

direct_failure_detector

direct_failure_detector: don't change meaning of endpoint_liveness

2022-11-28 21:58:30 +02:00

dist

dist/docker: support --replace-node-first-boot

2023-01-13 18:36:09 +02:00

docs

Merge 'Keep range tombstones with rows in memtables and cache' from Tomasz Grabiec

2023-02-05 22:33:10 +02:00

exceptions

exception: fix the error code used for rate_limit_exception

2022-09-13 11:46:15 +02:00

gms

raft: replace experimental raft option with dedicated flag

2023-01-03 11:15:11 +02:00

idl

forward_service: fix timeout support in parallel aggregates

2023-01-16 12:08:13 +02:00

index

rjson: avoid copy constructors in from_string calls when possible

2023-01-16 15:15:26 +01:00

interface

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

lang

Merge 'tools/scylla-sstable: add lua scripting support' from Botond Dénes

2023-01-09 20:54:42 +02:00

licenses

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

locator

cql3, locator: call fmt::format_to() explicitly

2023-01-30 21:50:11 +08:00

message

messaging: check that a node knows its own topology before accessing it

2023-01-02 11:53:14 +02:00

mutation_writer

position_in_partition: Make after_key() work with non-full keys

2022-12-14 14:47:33 +01:00

raft

raft: Fix non-existing state_machine::apply_entry in docs

2023-01-17 12:53:05 +01:00

readers

range_tombstone_change_merger: Introduce peek()

2023-01-27 19:15:39 +01:00

redis

storage_proxy.hh: Remove unused headers

2022-10-02 20:48:50 +03:00

reloc

SCYLLA-VERSION-GEN: use semver-compatible version

2022-07-25 18:06:28 +03:00

repair

Merge 'Abort repair tasks' from Aleksandra Martyniuk

2023-01-05 15:21:35 +01:00

replica

Merge 'Keep range tombstones with rows in memtables and cache' from Tomasz Grabiec

2023-02-05 22:33:10 +02:00

rust

test: assert that WASM allocations can fail without crashing

2023-01-06 14:07:29 +01:00

scripts

open-coredump.sh: handle dev versions

2023-01-12 19:28:58 +02:00

seastar @ ef24279f03

Update seastar submodule

2023-02-01 17:19:49 +02:00

service

Merge 'Remove qctx from system_keyspace::increment_and_get_generation()' from Pavel Emelyanov

2023-01-24 12:17:12 +02:00

sstables

sstables/sstable: validate_checksums(): force-check EOF

2023-02-01 20:52:46 +02:00

streaming

db/view/view_update_check: check_needs_view_update_path(): filter out non-member hosts

2023-01-27 17:12:45 +03:00

swagger-ui @ 12f1da1082

…

tasks

Merge 'Abort repair tasks' from Aleksandra Martyniuk

2023-01-05 15:21:35 +01:00

test

Merge 'Keep range tombstones with rows in memtables and cache' from Tomasz Grabiec

2023-02-05 22:33:10 +02:00

thrift

treewide: drop cql_serialization_format

2023-01-03 19:54:13 +02:00

tools

tools: toolchain: dbuild: pass NOFILE limit from host to container

2023-01-27 13:56:35 +02:00

tracing

cql3, transport, tests: remove "unset" from value type system

2023-01-16 21:10:56 +02:00

transport

cql3, transport, tests: remove "unset" from value type system

2023-01-16 21:10:56 +02:00

types

types: allow lists with NULL

2023-01-18 10:38:24 +02:00

unified

install.sh: Skip systemd existance check when --without-systemd

2022-11-14 14:07:46 +02:00

utils

row_cache, lru: Introduce evict_shallow()

2023-01-27 21:56:31 +01:00

.dockerignore

…

.gitattributes

gitattributes: Mark *.svg as binary

2022-07-31 15:25:24 +03:00

.gitignore

Add rust/Cargo.lock to .gitignore

2022-10-14 13:54:50 +03:00

.gitmodules

build: drop abseil submodule, replace with distribution abseil

2022-12-28 19:02:23 +02:00

.gitorderfile

…

.mailmap

Add .mailmap

2022-07-04 13:44:28 +03:00

absl-flat_hash_map.cc

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

absl-flat_hash_map.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

amplify.yml

docs: automatic previews configuration

2022-11-04 15:44:22 +02:00

atomic_cell_hash.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

atomic_cell_or_collection.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

atomic_cell.cc

atomic_cell: compare_atomic_cell_for_merge: compare ttl if expiry is equal

2022-03-07 11:05:30 +02:00

atomic_cell.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

backlog_controller.hh

backlog_controller: keep scheduling_group by value

2022-08-02 07:38:40 +03:00

build_mode.hh

release: properly evaluate SCYLLA_BUILD_MODE_* macros

2022-08-29 10:20:19 +03:00

bytes_ostream.hh

bytes_ostream: don't take reference to packed variable

2022-11-28 21:40:18 +02:00

bytes.cc

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

bytes.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

cache_flat_mutation_reader.hh

row_cache: Distinguish dummy insertion site in trace log

2023-01-27 21:56:31 +01:00

cache_temperature.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

caching_options.cc

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

caching_options.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

canonical_mutation.cc

mutation_partition_view: make mutation_partition_view_virtual_visitor stoppable

2022-08-22 20:12:58 +03:00

canonical_mutation.hh

schema, everywhere: define and use table_id as a strong type

2022-08-08 08:09:41 +03:00

cartesian_product.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

cell_locking.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

checked-file-impl.hh

treewide: use system-#include (angle brackets) for seastar

2022-04-26 14:46:42 +03:00

client_data.cc

client_data: Sanitize connection_notifier

2022-02-18 15:02:26 +03:00

client_data.hh

client_data: Sanitize connection_notifier

2022-02-18 15:02:26 +03:00

clocks-impl.cc

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

clocks-impl.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

clustering_bounds_comparator.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

clustering_interval_set.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

clustering_key_filter.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

clustering_ranges_walker.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

CMakeLists.txt

build: drop abseil submodule, replace with distribution abseil

2022-12-28 19:02:23 +02:00

collection_mutation.cc

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

collection_mutation.hh

treewide: drop cql_serialization_format

2023-01-03 19:54:13 +02:00

column_computation.hh

column_computation: adjust to use clustering_or_static_row

2022-12-06 11:21:16 +01:00

combine.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

compatible_ring_position.hh

compatible_ring_position_or_view: make it cheap to copy

2022-10-04 12:00:21 +03:00

compound_compat.hh

compound_compat.hh: add missing methods of iterator

2022-03-08 15:37:03 +02:00

compound.hh

treewide: drop cql_serialization_format

2023-01-03 19:54:13 +02:00

compress.cc

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

compress.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

concrete_types.hh

types: make empty type deserialize to non-null value

2023-01-18 10:38:24 +02:00

configure.py

Merge 'Keep range tombstones with rows in memtables and cache' from Tomasz Grabiec

2023-02-05 22:33:10 +02:00

CONTRIBUTING.md

Add redirections

2022-06-28 09:39:14 +01:00

converting_mutation_partition_applier.cc

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

converting_mutation_partition_applier.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

counters.cc

everywhere: define locator::host_id as a strong tagged_uuid type

2022-08-12 06:01:44 +03:00

counters.hh

everywhere: define locator::host_id as a strong tagged_uuid type

2022-08-12 06:01:44 +03:00

cql_serialization_format.hh

treewide: drop cql_serialization_format

2023-01-03 19:54:13 +02:00

db_clock.hh

gc_clock, db_clock: mark functions noexcept

2022-07-27 13:17:01 +03:00

debug.cc

test: extract debug::the_database out

2023-01-19 17:42:23 +08:00

debug.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

default.nix

build: explicitly add rustc to Nix devenv

2023-01-19 15:53:49 +01:00

digest_algorithm.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

digester.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

Doxyfile

…

duration.cc

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

duration.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

encoding_stats.hh

encoding_state: mark functions noexcept

2022-07-27 13:43:17 +03:00

enum_set.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

fix_system_distributed_tables.py

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

flake.lock

build: bump Lua version (5.3 -> 5.4) in Nix devenv

2023-01-19 15:53:49 +01:00

flake.nix

build: fix Nix devenv

2022-12-19 20:53:07 +02:00

frozen_mutation.cc

schema, everywhere: define and use table_schema_version as a strong type

2022-08-08 08:09:45 +03:00

frozen_mutation.hh

mutation_partition_view: make mutation_partition_view_virtual_visitor stoppable

2022-08-22 20:12:58 +03:00

frozen_schema.cc

idl: make idl headers self-sufficient

2022-08-08 08:02:27 +03:00

frozen_schema.hh

frozen_schema: avoid allocating contiguous memory

2022-02-21 01:39:02 +01:00

full_position.hh

service/storage_proxy: set smallest continue pos as query's continue pos

2022-08-10 06:03:38 +03:00

gc_clock.hh

gc_clock, db_clock: mark functions noexcept

2022-07-27 13:17:01 +03:00

gdbinit

gdbinit: add ignore clause for SIG35

2023-01-12 12:13:04 +02:00

gen_segmented_compress_params.py

treewide: clean up stray license blurbs

2022-02-13 14:16:16 +02:00

generic_server.cc

generic_server: Gentle iterator

2022-02-18 14:25:08 +03:00

generic_server.hh

generic_server.hh: add missing include

2022-04-04 17:31:55 +03:00

HACKING.md

Move dev docs to docs/dev

2022-06-24 18:07:08 +01:00

hashers.cc

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

hashers.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

hashing_partition_visitor.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

hashing.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

idl-compiler.py

idl-compiler: introduce cancellable verbs

2022-08-19 19:15:18 +02:00

inet_address_vectors.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

init.cc

init: do not allow cfg.replace_node_first_boot of seed node

2023-01-13 18:30:48 +02:00

init.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

install-dependencies.sh

build: remove references to unused c bindings of wasmtime

2023-01-06 14:07:29 +01:00

install.sh

install.sh: drop locale workaround from python3 thunk

2022-11-28 13:07:03 +02:00

interval.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

intrusive_set_external_comparator.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

keys.cc

add utf8:validate to operator<< partition_key with_schema.

2022-09-22 16:42:31 +03:00

keys.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

LICENSE.AGPL

…

log.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

main.cc

db: system_keyspace: take the reserved_memory into account

2023-01-24 14:07:44 +02:00

map_difference.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

marshal_exception.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

multishard_mutation_query.cc

database.hh: Remove unused headers

2022-10-04 09:01:38 +03:00

multishard_mutation_query.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

mutation_cleaner.hh

db: mutation_cleaner: Extract make_region_space_guard()

2023-01-27 19:15:38 +01:00

mutation_compactor.hh

mutation_compactor: only pass consumed range-tombstone-change to validator

2023-01-27 14:03:45 +01:00

mutation_consumer_concepts.hh

introduce the MutationConsumer concept

2022-02-28 17:11:54 +02:00

mutation_consumer.hh

mutation{,_consumer,_partition}: remove consume_in_reverse::legacy_half_reverse

2023-01-05 18:48:55 +01:00

mutation_fragment_fwd.hh

flat_mutation_reader: Split readers by file and remove unnecessary includes.

2022-03-14 13:20:25 +02:00

mutation_fragment_stream_validator.hh

mutation_fragment_stream_validator: avoid allocation when stream is correct

2022-11-22 19:19:18 +02:00

mutation_fragment_v2.hh

mutation_fragment_v2: range_tombstone_change: add minimal_memory_usage()

2022-04-28 14:11:51 +03:00

mutation_fragment.cc

position_in_partition: Make after_key() work with non-full keys

2022-12-14 14:47:33 +01:00

mutation_fragment.hh

Merge 'Fix handling of non-full clustering keys in the read path' from Tomasz Grabiec

2022-12-15 10:47:12 +02:00

mutation_partition_serializer.cc

idl: make idl headers self-sufficient

2022-08-08 08:02:27 +03:00

mutation_partition_serializer.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

mutation_partition_v2.cc

mutation_partition_v2: Avoid full scan when applying mutation to non-evictable

2023-01-27 21:56:31 +01:00

mutation_partition_v2.hh

Pass is_evictable to apply()

2023-01-27 21:56:31 +01:00

mutation_partition_view.cc

mutation_partition_view: make mutation_partition_view_virtual_visitor stoppable

2022-08-22 20:12:58 +03:00

mutation_partition_view.hh

mutation_partition_view: make mutation_partition_view_virtual_visitor stoppable

2022-08-22 20:12:58 +03:00

mutation_partition_visitor.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

mutation_partition.cc

mutation_partition_v2: Store range tombstones together with rows

2023-01-27 19:15:39 +01:00

mutation_partition.hh

row_cache, lru: Introduce evict_shallow()

2023-01-27 21:56:31 +01:00

mutation_query.cc

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

mutation_query.hh

query: coroutinize to_data_query_result

2022-05-05 13:32:25 +03:00

mutation_rebuilder.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

mutation_source_metadata.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

mutation.cc

mutation_partition: compact_for_compaction: get tombstone_gc_state

2022-09-07 07:43:15 +03:00

mutation.hh

mutation{,_consumer,_partition}: remove consume_in_reverse::legacy_half_reverse

2023-01-05 18:48:55 +01:00

noexcept_traits.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

NOTICE.txt

…

ORIGIN

…

partition_builder.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

partition_range_compat.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

partition_slice_builder.cc

treewide: drop cql_serialization_format

2023-01-03 19:54:13 +02:00

partition_slice_builder.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

partition_snapshot_reader.hh

db: Use mutation_partition_v2 in mvcc

2023-01-27 21:56:28 +01:00

partition_snapshot_row_cursor.hh

db: Use mutation_partition_v2 in mvcc

2023-01-27 21:56:28 +01:00

partition_version_list.hh

row_cache: Fix undefined behavior during eviction under some conditions

2022-08-01 23:53:15 +02:00

partition_version.cc

Pass is_evictable to apply()

2023-01-27 21:56:31 +01:00

partition_version.hh

Pass is_evictable to apply()

2023-01-27 21:56:31 +01:00

position_in_partition.hh

position_in_partition: Optimize equality check

2023-01-27 19:15:38 +01:00

protocol_server.hh

compile: Fix headers so that *-headers targets compile cleanly.

2022-03-25 16:19:26 +02:00

querier.cc

Show warn message if tombstone_warn_threshold reached on querier.

2022-09-22 16:42:31 +03:00

querier.hh

querier: consume_page(): use partition_start as the sentinel value

2022-11-11 09:58:18 +02:00

query_class_config.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

query_ranges_to_vnodes.cc

storage_proxy: extract query_ranges_to_vnodes_generator to a separate file

2022-02-01 21:14:41 +01:00

query_ranges_to_vnodes.hh

storage_proxy: extract query_ranges_to_vnodes_generator to a separate file

2022-02-01 21:14:41 +01:00

query_result_merger.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

query-request.hh

forward_service: fix timeout support in parallel aggregates

2023-01-16 12:08:13 +02:00

query-result-reader.hh

treewide: use ::for_partition_start() instead of ::partition_start_tag_t{}

2022-11-11 09:58:18 +02:00

query-result-set.cc

treewide: drop cql_serialization_format

2023-01-03 19:54:13 +02:00

query-result-set.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

query-result-writer.hh

query-result-writer: stop when tombstone-limit is reached

2022-08-10 06:03:38 +03:00

query-result.hh

service/storage_proxy: set smallest continue pos as query's continue pos

2022-08-10 06:03:38 +03:00

query.cc

treewide: drop cql_serialization_format

2023-01-03 19:54:13 +02:00

range_tombstone_assembler.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

range_tombstone_change_generator.hh

range_tombstone_change_generator: flush(): add end_of_range

2022-04-21 14:37:10 +03:00

range_tombstone_list.cc

range_tombstone_list: Avoid amortized_reserve()

2022-08-09 11:34:16 +03:00

range_tombstone_list.hh

db: range_tombstone_list: Avoid quadratic behavior when applying

2022-08-05 20:34:07 +03:00

range_tombstone_splitter.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

range_tombstone.cc

db: Print range_tombstone bounds as position_in_partition

2023-01-27 19:15:39 +01:00

range_tombstone.hh

Move dev docs to docs/dev

2022-06-24 18:07:08 +01:00

range.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

read_context.hh

row_cache: update reader implementations to v2

2022-04-21 14:57:04 +03:00

reader_concurrency_semaphore.cc

reader_concurrency_semaphore: un-bless permits when they become inactive

2023-02-01 21:02:17 +02:00

reader_concurrency_semaphore.hh

reader_concurrency_semaphore: un-bless permits when they become inactive

2023-02-01 21:02:17 +02:00

reader_permit.hh

reader_permit: expose operator<<(reader_permit::state)

2023-01-17 05:27:04 -05:00

README.md

Fix broken links

2022-06-28 15:19:36 +01:00

real_dirty_memory_accounter.hh

dirty_memory_manager: move to replica module

2022-12-06 22:24:17 +02:00

release.cc

release: define SCYLLA_BUILD_MODE_STR by stringifying SCYLLA_BUILD_MODE

2022-08-25 16:50:42 +02:00

release.hh

release: define SCYLLA_BUILD_MODE_STR by stringifying SCYLLA_BUILD_MODE

2022-08-25 16:50:42 +02:00

reversibly_mergeable.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

row_cache.cc

row_cache, lru: Introduce evict_shallow()

2023-01-27 21:56:31 +01:00

row_cache.hh

row_cache: update reader implementations to v2

2022-04-21 14:57:04 +03:00

schema_builder.hh

schema, everywhere: define and use table_id as a strong type

2022-08-08 08:09:41 +03:00

schema_fwd.hh

schema, everywhere: define and use table_schema_version as a strong type

2022-08-08 08:09:45 +03:00

schema_mutations.cc

db: schema_mutations: Make operator<<() print all mutations

2022-08-26 16:48:15 +02:00

schema_mutations.hh

schema_mutations: Make it a monoid by defining appropriate += operator

2022-08-26 16:48:15 +02:00

schema_registry.cc

schema_registry: fix abandoned feature warning

2022-08-11 15:11:21 +03:00

schema_registry.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

schema_upgrader.hh

compile: Fix headers so that *-headers targets compile cleanly.

2022-03-25 16:19:26 +02:00

schema.cc

rjson: avoid copy constructors in from_string calls when possible

2023-01-16 15:15:26 +01:00

schema.hh

implement keyspace_element interface

2022-12-10 12:34:09 +01:00

scylla_post_install.sh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

scylla-gdb.py

scylla-gdb.py: scylla memory: remove 'sstable reads' from semaphore names

2023-01-25 21:55:27 +02:00

SCYLLA-VERSION-GEN

release: prepare for 5.3.0-dev

2023-01-18 16:22:41 +02:00

seastarx.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

serialization_visitors.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

serializer_impl.hh

serializer_impl.hh: add reverse vector serializer

2022-11-14 16:06:24 +01:00

serializer.cc

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

serializer.hh

code: Convert is_integral assertions to concepts

2022-02-24 19:44:29 +03:00

service_permit.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

setup.py

…

shell.nix

build: improvements & upgrades to Nix dev environment

2022-10-02 11:47:16 +03:00

sstables_loader.cc

streaming: define plan_id as a strong tagged_uuid type

2022-08-22 19:45:30 +03:00

sstables_loader.hh

schema, everywhere: define and use table_id as a strong type

2022-08-08 08:09:41 +03:00

supervisor.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

table_helper.cc

treewide: replace parallel_for_each with coroutine::parallel_for_each in coroutines

2022-05-31 09:06:24 +03:00

table_helper.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

test.py

test.py: Add option to run scylla tests with multiple compaction groups

2023-02-01 20:17:16 -03:00

timeout_config.cc

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

timeout_config.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

timestamp.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

to_string.hh

to_string: generalize operator<< for unordered_set

2022-07-18 18:20:33 +02:00

tombstone_gc_extension.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

tombstone_gc_options.cc

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

tombstone_gc_options.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

tombstone_gc.cc

tombstone_gc: deglobalize repair_history_maps

2022-09-07 07:43:15 +03:00

tombstone_gc.hh

tombstone_gc: deglobalize repair_history_maps

2022-09-07 07:43:15 +03:00

tombstone.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

tox.ini

…

types.cc

Merge 'Make regexes in types.cc static and remove unnecessary tolower transform' from Marcin Maliszkiewicz

2023-01-24 16:13:59 +02:00

types.hh

test: relax NULL check test predicate

2023-01-18 10:38:24 +02:00

ubsan-suppressions.supp

…

unimplemented.cc

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

unimplemented.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

validation.cc

treewide: remove empty comments in top-of-files

2022-05-13 07:11:58 +02:00

validation.hh

treewide: remove empty comments in top-of-files

2022-05-13 07:11:58 +02:00

version.hh

version: Reverse version increase

2022-12-12 18:45:32 +02:00

view_info.hh

view_info: adjust view_column to accept column_kind

2022-12-06 11:21:16 +01:00

vint-serialization.cc

treewide: remove empty comments in top-of-files

2022-05-13 07:11:58 +02:00

vint-serialization.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

xx_hasher.hh

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

zstd.cc

treewide: use Software Package Data Exchange (SPDX) license identifiers

2022-01-18 12:15:18 +01:00

README.md

Scylla

What is Scylla?

Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB. Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.

For more information, please see the ScyllaDB web site.

Build Prerequisites

Scylla is fairly fussy about its build environment, requiring a very recent C++20 compiler and recent versions of many libraries to build. The document HACKING.md includes detailed information on building and developing Scylla, but to get Scylla building quickly on (almost) any build machine, Scylla offers a frozen toolchain. This is a pre-configured Docker image which includes recent versions of all the required compilers, libraries and build tools. Using the frozen toolchain allows you to avoid changing anything in your build machine to meet Scylla's requirements - you just need to meet the frozen toolchain's prerequisites (mostly, Docker or Podman being available).

Building Scylla

Building Scylla with the frozen toolchain dbuild is as easy as:

$ git submodule update --init --force --recursive
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla
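
The three steps above can also be wrapped in a small script so that a failure in one step stops the build. This is only a sketch, not part of the Scylla tree: the `run` and `build_scylla` helpers and the `DRY_RUN` variable are hypothetical names introduced for illustration, while the commands themselves are the ones listed above. It assumes it is run from the top of the source tree.

```shell
#!/bin/sh
# Sketch: run the documented frozen-toolchain build steps in order.
# `run`, `build_scylla` and DRY_RUN are hypothetical helpers for this
# example; only the three wrapped commands come from the README.
set -eu

DBUILD=./tools/toolchain/dbuild

run() {
    # Echo the command, then execute it unless DRY_RUN=1 is set.
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

build_scylla() {
    run git submodule update --init --force --recursive
    run "$DBUILD" ./configure.py
    run "$DBUILD" ninja build/release/scylla
}
```

Calling `build_scylla` runs the build. Calling it with `DRY_RUN=1` set only prints the commands, which is a cheap way to check the sequence before starting a long build.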

For further information, please see:

Developer documentation for more information on building Scylla.
Build documentation on how to build Scylla binaries, tests, and packages.
Docker image build documentation for information on how to build Docker images.

Running Scylla

To start the Scylla server, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1

This will start a Scylla node with one CPU core allocated to it and data files stored in the tmp directory. The --developer-mode option is needed to disable the various checks Scylla performs at startup to ensure the machine is configured for maximum performance (these checks are not relevant on development workstations). Please note that you need to run Scylla with dbuild if you built it with the frozen toolchain.

For more run options, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --help

Testing

See test.py manual.

Scylla APIs and compatibility

By default, Scylla is compatible with Apache Cassandra and its APIs - CQL and Thrift. There is also support for the API of Amazon DynamoDB™, which needs to be enabled and configured in order to be used. For more information on how to enable the DynamoDB™ API in Scylla, and the current compatibility of this feature as well as Scylla-specific extensions, see Alternator and Getting started with Alternator.

Documentation

Documentation can be found here. Seastar documentation can be found here. User documentation can be found here.

Training

Training material and online courses can be found at Scylla University. The courses are free, self-paced and include hands-on examples. They cover a variety of topics including Scylla data modeling, administration, architecture, basic NoSQL concepts, using drivers for application development, Scylla setup, failover, compactions, multi-datacenters and how Scylla integrates with third-party applications.

Contributing to Scylla

If you want to report a bug or submit a pull request or a patch, please read the contribution guidelines.

If you are a developer working on Scylla, please read the developer guidelines.

Contact

The users mailing list and Slack channel are for users to discuss configuration, management, and operation of open-source ScyllaDB.
The developers mailing list is for developers and people interested in following the development of ScyllaDB to discuss technical topics.
