mirror of https://github.com/scylladb/scylladb.git synced 2026-05-22 15:52:13 +00:00

Raphael S. Carvalho 986491447b table: Optimize creation of reader excluding staging for view building

View building from staging creates a reader from scratch (memtable
+ sstables - staging) for every partition, in order to calculate
the diff between new staging data and data in base sstable set,
and then pushes the result into the view replicas.

perf shows that the reader creation is very expensive:
+   12.15%    10.75%  reactor-3        scylla             [.] lexicographical_tri_compare<compound_type<(allow_prefixes)0>::iterator, compound_type<(allow_prefixes)0>::iterator, legacy_compound_view<compound_type<(allow_prefixes)0> >::tri_comparator::operator()(managed_bytes_basic_view<(mutable_view)0>, managed_bytes
+   10.01%     9.99%  reactor-3        scylla             [.] boost::icl::is_empty<boost::icl::continuous_interval<compatible_ring_position_or_view, std::less> >
+    8.95%     8.94%  reactor-3        scylla             [.] legacy_compound_view<compound_type<(allow_prefixes)0> >::tri_comparator::operator()
+    7.29%     7.28%  reactor-3        scylla             [.] dht::ring_position_tri_compare
+    6.28%     6.27%  reactor-3        scylla             [.] dht::tri_compare
+    4.11%     3.52%  reactor-3        scylla             [.] boost::icl::interval_base_map<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::sst
+    4.09%     4.07%  reactor-3        scylla             [.] sstables::index_consume_entry_context<sstables::index_consumer>::process_state
+    3.46%     0.93%  reactor-3        scylla             [.] sstables::sstable_run::will_introduce_overlapping
+    2.53%     2.53%  reactor-3        libstdc++.so.6     [.] std::_Rb_tree_increment
+    2.45%     2.45%  reactor-3        scylla             [.] boost::icl::non_empty::exclusive_less<boost::icl::continuous_interval<compatible_ring_position_or_view, std::less> >
+    2.14%     2.13%  reactor-3        scylla             [.] boost::icl::exclusive_less<boost::icl::continuous_interval<compatible_ring_position_or_view, std::less> >
+    2.07%     2.07%  reactor-3        scylla             [.] logalloc::region_impl::free
+    2.06%     1.91%  reactor-3        scylla             [.] sstables::index_consumer::consume_entry(sstables::parsed_partition_index_entry&&)::{lambda()#1}::operator()() const::{lambda()#1}::operator()
+    2.04%     2.04%  reactor-3        scylla             [.] boost::icl::interval_base_map<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::sst
+    1.87%     0.00%  reactor-3        [kernel.kallsyms]  [k] entry_SYSCALL_64_after_hwframe
+    1.86%     0.00%  reactor-3        [kernel.kallsyms]  [k] do_syscall_64
+    1.39%     1.38%  reactor-3        libc.so.6          [.] __memcmp_avx2_movbe
+    1.37%     0.92%  reactor-3        scylla             [.] boost::icl::segmental::join_left<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::
+    1.34%     1.33%  reactor-3        scylla             [.] logalloc::region_impl::alloc_small
+    1.33%     1.33%  reactor-3        scylla             [.] seastar::memory::small_pool::add_more_objects
+    1.30%     0.35%  reactor-3        scylla             [.] seastar::reactor::do_run
+    1.29%     1.29%  reactor-3        scylla             [.] seastar::memory::allocate
+    1.19%     0.05%  reactor-3        libc.so.6          [.] syscall
+    1.16%     1.04%  reactor-3        scylla             [.] boost::icl::interval_base_map<boost::icl::interval_map<compatible_ring_position_or_view, std::unordered_set<seastar::lw_shared_ptr<sstables::sstable>, std::hash<seastar::lw_shared_ptr<sstables::sstable> >, std::equal_to<seastar::lw_shared_ptr<sstables::sst
+    1.07%     0.79%  reactor-3        scylla             [.] sstables::partitioned_sstable_set::insert

This shows a significant amount of work spent inserting sstables
into the interval map and maintaining the sstable run (which sorts
fragments by first key and checks for overlapping).

The interval map is known to have issues with L0 sstables: each
one has to be replicated to almost every interval stored by the
map, causing terrible space and time complexity. With enough L0
sstables, it can fall into quadratic behavior.
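
The quadratic growth can be illustrated with a simplified model. The
sketch below is not boost::icl and not Scylla code: `token_range`,
`interval_map_entries` and `staggered_l0` are hypothetical names, and
tokens are plain integers. It splits the key space at every range
boundary and counts how many (segment, sstable) pairs a segment-based
map would have to store.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>
#include <vector>

// Hypothetical, simplified token range of one sstable (inclusive bounds).
struct token_range {
    int first;
    int last;
};

// Counts the (segment, sstable) pairs held by a map that cuts the key
// space at every range boundary and stores, per elementary segment,
// the set of sstables overlapping it.
inline std::size_t interval_map_entries(const std::vector<token_range>& ranges) {
    std::set<int> cuts;
    for (const auto& r : ranges) {
        assert(r.first <= r.last);
        cuts.insert(r.first);
        cuts.insert(r.last + 1);  // first token past the range
    }
    if (cuts.size() < 2) {
        return 0;
    }
    std::size_t entries = 0;
    for (auto it = cuts.begin(); std::next(it) != cuts.end(); ++it) {
        const int lo = *it;
        const int hi = *std::next(it) - 1;
        for (const auto& r : ranges) {
            // Segments are elementary, so a range covers one fully or not at all.
            if (r.first <= lo && hi <= r.last) {
                ++entries;
            }
        }
    }
    return entries;
}

// n wide, mutually overlapping ranges with slightly shifted bounds,
// a rough stand-in for L0 sstables that each span most of the ring.
inline std::vector<token_range> staggered_l0(int n) {
    std::vector<token_range> out;
    for (int i = 0; i < n; ++i) {
        out.push_back(token_range{i, 1000 + i});
    }
    return out;
}
```

In this model, n staggered ranges produce 2n - 1 segments and n * n
stored pairs, so doubling the number of overlapping sstables
quadruples the stored entries.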

This overhead is fixed by not building a new fresh sstable set
when recreating the reader, but rather supplying a predicate
to sstable set that will filter out staging sstables when
creating either a single-key or range scan reader.
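
A minimal sketch of the predicate idea, under stated assumptions: the
`sstable`, `sstable_set` and `excludes_staging` names below are
hypothetical stand-ins written for illustration, not the real Scylla
classes or signatures. The point is only that the set is built once
and filtered at selection time, instead of being rebuilt without the
staging sstables for every reader.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an sstable; only what the sketch needs.
struct sstable {
    std::string name;
    bool staging;  // true if the sstable still lives in the staging directory
};
using shared_sstable = std::shared_ptr<sstable>;
using sstable_predicate = std::function<bool(const sstable&)>;

// Hypothetical stand-in for an sstable set that is built once and reused.
class sstable_set {
    std::vector<shared_sstable> _all;
public:
    void insert(shared_sstable sst) {
        assert(sst != nullptr);
        _all.push_back(std::move(sst));
    }

    // Returns the sstables a reader would open, skipping those rejected
    // by the predicate. The set itself is neither copied nor rebuilt.
    std::vector<shared_sstable> select(const sstable_predicate& pred) const {
        std::vector<shared_sstable> out;
        for (const auto& sst : _all) {
            if (pred(*sst)) {
                out.push_back(sst);
            }
        }
        return out;
    }
};

// Predicate that filters out staging sstables.
inline bool excludes_staging(const sstable& sst) {
    return !sst.staging;
}
```

Because the predicate inspects each sstable's own state at selection
time, the result does not depend on which sstables happened to be
listed in a particular batch.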

This may have another benefit over today's approach, which can
incorrectly treat a staging sstable as non-staging if the staging
sstable wasn't included in the current batch for view building.

With this improvement, view building was measured to be 3x faster.

from
INFO  2023-06-16 12:36:40,014 [shard 0] view_update_generator - Processed keyspace1.standard1: 5 sstables in 963957ms = 50kB/s

to
INFO  2023-06-16 14:47:12,129 [shard 0] view_update_generator - Processed keyspace1.standard1: 5 sstables in 319899ms = 150kB/s
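
The speedup can be checked from the elapsed times in the two log lines
above. The helper below is only an illustrative sketch; `speedup` is a
hypothetical name.

```cpp
#include <cassert>

// Ratio of elapsed times; a value above 1 means the second run was faster.
inline double speedup(double before_ms, double after_ms) {
    assert(before_ms > 0 && after_ms > 0);
    return before_ms / after_ms;
}
```

With 963957 ms before and 319899 ms after, the ratio is roughly 3.01,
which matches the reported throughput going from 50kB/s to 150kB/s.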

Refs #14089.
Fixes #14244.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
(cherry picked from commit 1d8cb32a5d)
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes #14764

2023-07-20 16:46:15 +03:00

.github

Update CODEOWNERS file

2022-12-06 19:26:03 +02:00

alternator

alternator: close output_stream when exception is thrown during response streaming

2023-07-13 23:27:46 +03:00

api

api: get task statuses recursively

2023-01-11 12:34:06 +01:00

auth

auth: don't use infinite timeout in default_role_row_satisfies query

2023-06-06 19:39:29 +03:00

cdc

treewide: drop cql_serialization_format

2023-01-03 19:54:13 +02:00

compaction

compaction: avoid excessive reallocation and during input list formatting

2023-07-09 23:54:18 +03:00

conf

Merge 'Enable Raft by default in new clusters' from Kamil Braun

2023-01-26 12:21:55 +01:00

cql3

Merge 'Backport bugfixes regarding UDT, UDF, UDA interactions to branch-5.2' from Wojciech Mitros

2023-04-19 01:38:08 -04:00

data_dictionary

data_dictionary: fix forgetting of UDTs on ALTER KEYSPACE

2023-06-06 21:52:47 +03:00

view_updating_consumer: make buffer limit a variable

2023-07-11 09:44:00 +02:00

debug

…

dht

dht/i_partitioner.hh: ring_position_ext: add weight() accessor

2023-01-09 09:46:57 -05:00

direct_failure_detector

direct_failure_detector: Avoid throwing exceptions in the success path

2023-04-27 19:14:31 +03:00

dist

scylla_fstrim_setup: start scylla-fstrim.timer on setup

2023-07-18 16:05:09 +03:00

docs

Merge 'atomic_cell: compare value last' from Benny Halevy

2023-07-12 10:09:56 +03:00

exceptions

exception: fix the error code used for rate_limit_exception

2022-09-13 11:46:15 +02:00

gms

Merge 'Do not mask node operation errors' from Benny Halevy

2023-04-30 18:58:28 +03:00

idl

forward_service: fix timeout support in parallel aggregates

2023-01-16 12:08:13 +02:00

index

index/built_indexes_virtual_reader.hh: Fix use-after-move

2023-05-15 20:24:35 +03:00

interface

…

lang

Merge 'tools/scylla-sstable: add lua scripting support' from Botond Dénes

2023-01-09 20:54:42 +02:00

licenses

…

locator

locator: token_metadata: get rid of a quadratic behaviour in get_address_ranges()

2023-04-16 21:59:14 +03:00

message

Merge ' message: match unknown tenants to the default tenant' from Botond Dénes

2023-07-12 15:31:48 +03:00

mutation_writer

position_in_partition: Make after_key() work with non-full keys

2022-12-14 14:47:33 +01:00

raft

raft: abort applier fiber when a state machine aborts

2023-02-23 14:12:12 +02:00

readers

combined: mergers: remove recursion in operator()()

2023-07-11 11:09:25 +03:00

redis

storage_proxy.hh: Remove unused headers

2022-10-02 20:48:50 +03:00

reloc

SCYLLA-VERSION-GEN: use semver-compatible version

2022-07-25 18:06:28 +03:00

repair

repair: Release permit earlier when the repair_reader is done

2023-07-14 18:18:43 +03:00

replica

table: Optimize creation of reader excluding staging for view building

2023-07-20 16:46:15 +03:00

rust

rust: update dependencies

2023-04-27 22:01:44 +03:00

scripts

open-coredump.sh: handle dev versions

2023-01-12 19:28:58 +02:00

seastar @ 29a0e64513

Update seastar submodule (default priority class shares)

2023-06-21 21:23:14 +03:00

service

storage_proxy: Make split_stats resilient to being called from different scheduling group

2023-07-12 09:24:56 +03:00

sstables

table: Optimize creation of reader excluding staging for view building

2023-07-20 16:46:15 +03:00

streaming

db/view/view_update_check: check_needs_view_update_path(): filter out non-member hosts

2023-03-22 09:03:50 +02:00

swagger-ui @ 12f1da1082

…

tasks

Merge 'Abort repair tasks' from Aleksandra Martyniuk

2023-01-05 15:21:35 +01:00

test

table: Optimize creation of reader excluding staging for view building

2023-07-20 16:46:15 +03:00

thrift

treewide: drop cql_serialization_format

2023-01-03 19:54:13 +02:00

tools

tools: toolchain: regenerate

2023-05-02 13:16:59 +03:00

tracing

cql3, transport, tests: remove "unset" from value type system

2023-01-16 21:10:56 +02:00

transport

transport server: fix unexpected server errors handling

2023-03-21 20:23:09 +02:00

types

treewide: drop cql_serialization_format

2023-01-03 19:54:13 +02:00

unified

install.sh: Skip systemd existance check when --without-systemd

2022-11-14 14:07:46 +02:00

utils

Merge 'Yield while building large results in Alternator - rjson::print, executor::batch_get_item' from Marcin Maliszkiewicz

2023-07-13 23:27:38 +03:00

.dockerignore

…

.gitattributes

gitattributes: Mark *.svg as binary

2022-07-31 15:25:24 +03:00

.gitignore

Add rust/Cargo.lock to .gitignore

2022-10-14 13:54:50 +03:00

.gitmodules

.gitmodules: point seastar submodule at scylla-seastar.git

2023-03-23 17:11:43 +02:00

.gitorderfile

…

.mailmap

Add .mailmap

2022-07-04 13:44:28 +03:00

absl-flat_hash_map.cc

…

absl-flat_hash_map.hh

…

amplify.yml

docs: automatic previews configuration

2022-11-04 15:44:22 +02:00

atomic_cell_hash.hh

…

atomic_cell_or_collection.hh

…

atomic_cell.cc

Merge 'atomic_cell: compare value last' from Benny Halevy

2023-07-12 10:09:56 +03:00

atomic_cell.hh

…

backlog_controller.hh

backlog_controller: keep scheduling_group by value

2022-08-02 07:38:40 +03:00

build_mode.hh

release: properly evaluate SCYLLA_BUILD_MODE_* macros

2022-08-29 10:20:19 +03:00

bytes_ostream.hh

bytes_ostream: don't take reference to packed variable

2022-11-28 21:40:18 +02:00

bytes.cc

…

bytes.hh

…

cache_flat_mutation_reader.hh

position_in_partition: Make after_key() work with non-full keys

2022-12-14 14:47:33 +01:00

cache_temperature.hh

…

caching_options.cc

…

caching_options.hh

…

canonical_mutation.cc

mutation_partition_view: make mutation_partition_view_virtual_visitor stoppable

2022-08-22 20:12:58 +03:00

canonical_mutation.hh

schema, everywhere: define and use table_id as a strong type

2022-08-08 08:09:41 +03:00

cartesian_product.hh

…

cell_locking.hh

…

checked-file-impl.hh

treewide: use system-#include (angle brackets) for seastar

2022-04-26 14:46:42 +03:00

client_data.cc

…

client_data.hh

…

clocks-impl.cc

…

clocks-impl.hh

…

clustering_bounds_comparator.hh

…

clustering_interval_set.hh

…

clustering_key_filter.hh

…

clustering_ranges_walker.hh

…

CMakeLists.txt

build: drop abseil submodule, replace with distribution abseil

2022-12-28 19:02:23 +02:00

collection_mutation.cc

…

collection_mutation.hh

treewide: drop cql_serialization_format

2023-01-03 19:54:13 +02:00

column_computation.hh

column_computation: adjust to use clustering_or_static_row

2022-12-06 11:21:16 +01:00

combine.hh

…

compatible_ring_position.hh

compatible_ring_position_or_view: make it cheap to copy

2022-10-04 12:00:21 +03:00

compound_compat.hh

…

compound.hh

treewide: drop cql_serialization_format

2023-01-03 19:54:13 +02:00

compress.cc

…

compress.hh

…

concrete_types.hh

…

configure.py

Merge 'configure.py: a bunch of clean-up changes' from Michał Chojnowski

2023-01-12 16:40:02 +02:00

CONTRIBUTING.md

Add redirections

2022-06-28 09:39:14 +01:00

converting_mutation_partition_applier.cc

…

converting_mutation_partition_applier.hh

…

counters.cc

everywhere: define locator::host_id as a strong tagged_uuid type

2022-08-12 06:01:44 +03:00

counters.hh

everywhere: define locator::host_id as a strong tagged_uuid type

2022-08-12 06:01:44 +03:00

cql_serialization_format.hh

treewide: drop cql_serialization_format

2023-01-03 19:54:13 +02:00

db_clock.hh

gc_clock, db_clock: mark functions noexcept

2022-07-27 13:17:01 +03:00

debug.hh

…

default.nix

build: remove references to unused c bindings of wasmtime

2023-01-06 14:07:29 +01:00

digest_algorithm.hh

…

digester.hh

…

Doxyfile

…

duration.cc

…

duration.hh

…

encoding_stats.hh

encoding_state: mark functions noexcept

2022-07-27 13:43:17 +03:00

enum_set.hh

…

fix_system_distributed_tables.py

…

flake.lock

build: fix Nix devenv

2022-12-19 20:53:07 +02:00

flake.nix

build: fix Nix devenv

2022-12-19 20:53:07 +02:00

frozen_mutation.cc

schema, everywhere: define and use table_schema_version as a strong type

2022-08-08 08:09:45 +03:00

frozen_mutation.hh

mutation_partition_view: make mutation_partition_view_virtual_visitor stoppable

2022-08-22 20:12:58 +03:00

frozen_schema.cc

idl: make idl headers self-sufficient

2022-08-08 08:02:27 +03:00

frozen_schema.hh

…

full_position.hh

service/storage_proxy: set smallest continue pos as query's continue pos

2022-08-10 06:03:38 +03:00

gc_clock.hh

gc_clock, db_clock: mark functions noexcept

2022-07-27 13:17:01 +03:00

gdbinit

gdbinit: add ignore clause for SIG35

2023-01-12 12:13:04 +02:00

gen_segmented_compress_params.py

…

generic_server.cc

…

generic_server.hh

generic_server.hh: add missing include

2022-04-04 17:31:55 +03:00

HACKING.md

Move dev docs to docs/dev

2022-06-24 18:07:08 +01:00

hashers.cc

…

hashers.hh

…

hashing_partition_visitor.hh

…

hashing.hh

…

idl-compiler.py

idl-compiler: introduce cancellable verbs

2022-08-19 19:15:18 +02:00

inet_address_vectors.hh

…

init.cc

init: do not allow cfg.replace_node_first_boot of seed node

2023-01-13 18:30:48 +02:00

init.hh

…

install-dependencies.sh

build: remove references to unused c bindings of wasmtime

2023-01-06 14:07:29 +01:00

install.sh

install.sh: drop locale workaround from python3 thunk

2022-11-28 13:07:03 +02:00

interval.hh

…

intrusive_set_external_comparator.hh

…

keys.cc

add utf8:validate to operator<< partition_key with_schema.

2022-09-22 16:42:31 +03:00

keys.hh

…

LICENSE.AGPL

…

log.hh

…

main.cc

db: system_keyspace: take the reserved_memory into account

2023-02-05 18:30:05 +02:00

map_difference.hh

…

marshal_exception.hh

…

multishard_mutation_query.cc

Merge 'multishard_mutation_query: make reader_context::lookup_readers() exception safe' from Botond Dénes

2023-06-08 04:29:51 -04:00

multishard_mutation_query.hh

…

mutation_cleaner.hh

db: mutation_cleaner: Enqueue new snapshots at the back

2022-06-28 18:29:29 +03:00

mutation_compactor.hh

mutation/mutation_compactor: consume_partition_end(): reset _stop

2023-04-18 02:32:24 -04:00

mutation_consumer_concepts.hh

…

mutation_consumer.hh

mutation{,_consumer,_partition}: remove consume_in_reverse::legacy_half_reverse

2023-01-05 18:48:55 +01:00

mutation_fragment_fwd.hh

…

mutation_fragment_stream_validator.hh

mutation_fragment_stream_validator: avoid allocation when stream is correct

2022-11-22 19:19:18 +02:00

mutation_fragment_v2.hh

mutation_fragment_v2: range_tombstone_change: add minimal_memory_usage()

2022-04-28 14:11:51 +03:00

mutation_fragment.cc

position_in_partition: Make after_key() work with non-full keys

2022-12-14 14:47:33 +01:00

mutation_fragment.hh

Merge 'Fix handling of non-full clustering keys in the read path' from Tomasz Grabiec

2022-12-15 10:47:12 +02:00

mutation_partition_serializer.cc

idl: make idl headers self-sufficient

2022-08-08 08:02:27 +03:00

mutation_partition_serializer.hh

…

mutation_partition_view.cc

mutation_partition_view: make mutation_partition_view_virtual_visitor stoppable

2022-08-22 20:12:58 +03:00

mutation_partition_view.hh

mutation_partition_view: make mutation_partition_view_virtual_visitor stoppable

2022-08-22 20:12:58 +03:00

mutation_partition_visitor.hh

…

mutation_partition.cc

Merge 'atomic_cell: compare value last' from Benny Halevy

2023-07-12 10:09:56 +03:00

mutation_partition.hh

deletable_row: add column_kind parameter to is_live

2022-12-06 11:21:16 +01:00

mutation_query.cc

…

mutation_query.hh

query: coroutinize to_data_query_result

2022-05-05 13:32:25 +03:00

mutation_rebuilder.hh

view: fix range tombstone handling on flushes in view_updating_consumer

2023-07-11 09:44:00 +02:00

mutation_source_metadata.hh

…

mutation.cc

mutation_partition: compact_for_compaction: get tombstone_gc_state

2022-09-07 07:43:15 +03:00

mutation.hh

mutation{,_consumer,_partition}: remove consume_in_reverse::legacy_half_reverse

2023-01-05 18:48:55 +01:00

noexcept_traits.hh

…

NOTICE.txt

…

ORIGIN

…

partition_builder.hh

…

partition_range_compat.hh

…

partition_slice_builder.cc

Merge 'multishard_mutation_query: make reader_context::lookup_readers() exception safe' from Botond Dénes

2023-06-08 04:29:51 -04:00

partition_slice_builder.hh

Merge 'multishard_mutation_query: make reader_context::lookup_readers() exception safe' from Botond Dénes

2023-06-08 04:29:51 -04:00

partition_snapshot_reader.hh

partition_snapshot_reader.hh: fix iterator invalidation in do_refresh_state

2023-07-17 14:20:37 +02:00

partition_snapshot_row_cursor.hh

row_cache: Fix missing row if upper bound of population range is evicted and has adjacent dummy

2022-08-09 02:28:56 +02:00

partition_version_list.hh

row_cache: Fix undefined behavior during eviction under some conditions

2022-08-01 23:53:15 +02:00

partition_version.cc

mvcc: Add snapshot details to the printout of partition_entry

2022-10-16 14:22:14 +03:00

partition_version.hh

mvcc: Add snapshot details to the printout of partition_entry

2022-10-16 14:22:14 +03:00

position_in_partition.hh

cache: Fix undefined behavior when populating with non-full keys

2023-01-10 12:51:54 +02:00

protocol_server.hh

compile: Fix headers so that *-headers targets compile cleanly.

2022-03-25 16:19:26 +02:00

querier.cc

Show warn message if tombstone_warn_threshold reached on querier.

2022-09-22 16:42:31 +03:00

querier.hh

querier: consume_page(): use partition_start as the sentinel value

2022-11-11 09:58:18 +02:00

query_class_config.hh

…

query_ranges_to_vnodes.cc

…

query_ranges_to_vnodes.hh

…

query_result_merger.hh

…

query-request.hh

forward_service: fix timeout support in parallel aggregates

2023-01-16 12:08:13 +02:00

query-result-reader.hh

treewide: use ::for_partition_start() instead of ::partition_start_tag_t{}

2022-11-11 09:58:18 +02:00

query-result-set.cc

treewide: drop cql_serialization_format

2023-01-03 19:54:13 +02:00

query-result-set.hh

…

query-result-writer.hh

query-result-writer: stop when tombstone-limit is reached

2022-08-10 06:03:38 +03:00

query-result.hh

service/storage_proxy: set smallest continue pos as query's continue pos

2022-08-10 06:03:38 +03:00

query.cc

treewide: drop cql_serialization_format

2023-01-03 19:54:13 +02:00

range_tombstone_assembler.hh

…

range_tombstone_change_generator.hh

range_tombstone_change_generator: fix an edge case in flush()

2023-05-15 17:48:24 +02:00

range_tombstone_list.cc

range_tombstone_list: Avoid amortized_reserve()

2022-08-09 11:34:16 +03:00

range_tombstone_list.hh

db: range_tombstone_list: Avoid quadratic behavior when applying

2022-08-05 20:34:07 +03:00

range_tombstone_splitter.hh

…

range_tombstone.cc

…

range_tombstone.hh

Move dev docs to docs/dev

2022-06-24 18:07:08 +01:00

range.hh

…

read_context.hh

row_cache: update reader implementations to v2

2022-04-21 14:57:04 +03:00

reader_concurrency_semaphore.cc

reader_concurrency_semaphore: don't evict inactive readers needlessly

2023-04-14 10:37:30 +03:00

reader_concurrency_semaphore.hh

reader_concurrency_semaphore: don't evict inactive readers needlessly

2023-04-14 10:37:30 +03:00

reader_permit.hh

reader_permit: expose operator<<(reader_permit::state)

2023-03-14 09:50:16 +02:00

README.md

Fix broken links

2022-06-28 15:19:36 +01:00

real_dirty_memory_accounter.hh

dirty_memory_manager: move to replica module

2022-12-06 22:24:17 +02:00

release.cc

release: define SCYLLA_BUILD_MODE_STR by stringifying SCYLLA_BUILD_MODE

2022-08-25 16:50:42 +02:00

release.hh

release: define SCYLLA_BUILD_MODE_STR by stringifying SCYLLA_BUILD_MODE

2022-08-25 16:50:42 +02:00

reversibly_mergeable.hh

…

row_cache.cc

Fix use-after-move when initializing row cache with dummy entry

2023-05-14 21:02:24 +03:00

row_cache.hh

row_cache: update reader implementations to v2

2022-04-21 14:57:04 +03:00

schema_builder.hh

schema, everywhere: define and use table_id as a strong type

2022-08-08 08:09:41 +03:00

schema_fwd.hh

schema, everywhere: define and use table_schema_version as a strong type

2022-08-08 08:09:45 +03:00

schema_mutations.cc

db: schema_mutations: Make operator<<() print all mutations

2022-08-26 16:48:15 +02:00

schema_mutations.hh

schema_mutations: Make it a monoid by defining appropriate += operator

2022-08-26 16:48:15 +02:00

schema_registry.cc

schema_registry: fix abandoned feature warning

2022-08-11 15:11:21 +03:00

schema_registry.hh

…

schema_upgrader.hh

compile: Fix headers so that *-headers targets compile cleanly.

2022-03-25 16:19:26 +02:00

schema.cc

schema: operator<<: print also tombstone_gc_options

2022-12-22 16:40:18 +02:00

schema.hh

implement keyspace_element interface

2022-12-10 12:34:09 +01:00

scylla_post_install.sh

scylla_coredump_setup: fix coredump timeout settings

2023-02-19 21:13:36 +02:00

scylla-gdb.py

scylla-gdb: Parse and eval _all_threads without quotes

2023-05-02 13:16:59 +03:00

SCYLLA-VERSION-GEN

release: prepare for 5.2.5

2023-07-13 14:23:18 +03:00

seastarx.hh

…

serialization_visitors.hh

…

serializer_impl.hh

serializer_impl.hh: add reverse vector serializer

2022-11-14 16:06:24 +01:00

serializer.cc

…

serializer.hh

…

service_permit.hh

…

setup.py

…

shell.nix

build: improvements & upgrades to Nix dev environment

2022-10-02 11:47:16 +03:00

sstables_loader.cc

streaming: define plan_id as a strong tagged_uuid type

2022-08-22 19:45:30 +03:00

sstables_loader.hh

schema, everywhere: define and use table_id as a strong type

2022-08-08 08:09:41 +03:00

supervisor.hh

…

table_helper.cc

treewide: replace parallel_for_each with coroutine::parallel_for_each in coroutines

2022-05-31 09:06:24 +03:00

table_helper.hh

…

test.py

Merge 'test.py: improve test failure handling' from Kamil Braun

2023-06-12 11:47:54 +02:00

timeout_config.cc

…

timeout_config.hh

…

timestamp.hh

…

to_string.hh

to_string: generalize operator<< for unordered_set

2022-07-18 18:20:33 +02:00

tombstone_gc_extension.hh

…

tombstone_gc_options.cc

…

tombstone_gc_options.hh

…

tombstone_gc.cc

tombstone_gc: Fix gc_before for immediate mode

2023-05-15 10:33:29 +03:00

tombstone_gc.hh

tombstone_gc: deglobalize repair_history_maps

2022-09-07 07:43:15 +03:00

tombstone.hh

…

tox.ini

…

types.cc

types: unserialize_value for multiprecision_int,bool: don't read uninitialized memory

2023-02-23 22:38:03 +02:00

types.hh

treewide: drop cql_serialization_format

2023-01-03 19:54:13 +02:00

ubsan-suppressions.supp

…

unimplemented.cc

…

unimplemented.hh

…

validation.cc

treewide: remove empty comments in top-of-files

2022-05-13 07:11:58 +02:00

validation.hh

treewide: remove empty comments in top-of-files

2022-05-13 07:11:58 +02:00

version.hh

version: Reverse version increase

2022-12-12 18:45:32 +02:00

view_info.hh

view_info: adjust view_column to accept column_kind

2022-12-06 11:21:16 +01:00

vint-serialization.cc

treewide: remove empty comments in top-of-files

2022-05-13 07:11:58 +02:00

vint-serialization.hh

…

xx_hasher.hh

…

zstd.cc

…

README.md

Scylla

What is Scylla?

Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB. Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.

For more information, please see the ScyllaDB web site.

Build Prerequisites

Scylla is fairly fussy about its build environment, requiring very recent versions of the C++20 compiler and of many libraries to build. The document HACKING.md includes detailed information on building and developing Scylla, but to get Scylla building quickly on (almost) any build machine, Scylla offers a frozen toolchain. This is a pre-configured Docker image which includes recent versions of all the required compilers, libraries and build tools. Using the frozen toolchain allows you to avoid changing anything in your build machine to meet Scylla's requirements - you just need to meet the frozen toolchain's prerequisites (mostly, Docker or Podman being available).

Building Scylla

Building Scylla with the frozen toolchain dbuild is as easy as:

$ git submodule update --init --force --recursive
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla

For further information, please see:

Developer documentation for more information on building Scylla.
Build documentation on how to build Scylla binaries, tests, and packages.
Docker image build documentation for information on how to build Docker images.

Running Scylla

To start Scylla server, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1

This will start a Scylla node with one CPU core allocated to it and data files stored in the tmp directory. The --developer-mode option is needed to disable the various checks Scylla performs at startup to ensure the machine is configured for maximum performance (these checks are not relevant on development workstations). Please note that you need to run Scylla with dbuild if you built it with the frozen toolchain.

For more run options, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --help

Testing

See test.py manual.

Scylla APIs and compatibility

By default, Scylla is compatible with Apache Cassandra and its APIs - CQL and Thrift. There is also support for the API of Amazon DynamoDB™, which needs to be enabled and configured in order to be used. For more information on how to enable the DynamoDB™ API in Scylla, and the current compatibility of this feature as well as Scylla-specific extensions, see Alternator and Getting started with Alternator.

Documentation

Documentation can be found here. Seastar documentation can be found here. User documentation can be found here.

Training

Training material and online courses can be found at Scylla University. The courses are free, self-paced and include hands-on examples. They cover a variety of topics including Scylla data modeling, administration, architecture, basic NoSQL concepts, using drivers for application development, Scylla setup, failover, compactions, multi-datacenters and how Scylla integrates with third-party applications.

Contributing to Scylla

If you want to report a bug or submit a pull request or a patch, please read the contribution guidelines.

If you are a developer working on Scylla, please read the developer guidelines.

Contact

The users mailing list and Slack channel are for users to discuss configuration, management, and operations of the ScyllaDB open source.
The developers mailing list is for developers and people interested in following the development of ScyllaDB to discuss technical topics.

Languages

C++ 72.3%

Python 26.5%

CMake 0.3%

GAP 0.3%

Shell 0.3%