mirror of https://github.com/scylladb/scylladb.git synced 2026-06-07 07:23:15 +00:00


Tomasz Grabiec cb0b8d1903 row_cache: Zap dummy entries when populating or reading a range

This will prevent accumulation of unnecessary dummy entries.

A single-partition populating scan with clustering key restrictions
will insert dummy entries positioned at the boundaries of the
clustering query range to mark the newly populated range as
continuous.

Those dummy entries may accumulate with time, increasing the cost of
the scan, which needs to walk over them.

In some workloads we could prevent this. If a populating query
overlaps with dummy entries, we could erase the old dummy entry since
it will not be needed: it will fall inside a broader continuous
range. This will be the case for time series workloads which scan with
a decreasing (newest) lower bound.
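
The idea above can be illustrated with a toy model. This is only a sketch under loud assumptions: it is not ScyllaDB's row cache, it ignores MVCC versions entirely, integer keys stand in for clustering positions, and the names `entry`, `toy_cache`, `populate_range` and `count_dummies` are hypothetical. It shows how a populating scan could insert boundary dummies and erase older dummies that end up strictly inside the newly continuous range.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>

// Toy model only; all names are hypothetical and not taken from ScyllaDB.
struct entry {
    bool dummy;       // true if the entry holds no row data and only marks a boundary
    bool continuous;  // true if the interval (previous entry, this entry] is fully cached
};

// Ordered by key; int keys stand in for clustering positions.
using toy_cache = std::map<int, entry>;

// Mark [lo, hi] as continuous after a populating scan. Boundary dummies are
// inserted if no entry exists there yet, and dummies strictly inside (lo, hi)
// are erased because they now fall inside a broader continuous range.
// Returns the number of dummies erased.
inline std::size_t populate_range(toy_cache& c, int lo, int hi) {
    assert(lo < hi);
    c.try_emplace(lo, entry{true, false});  // keeps an existing entry untouched
    c.try_emplace(hi, entry{true, false});
    std::size_t zapped = 0;
    auto it = std::next(c.find(lo));
    while (it->first != hi) {
        if (it->second.dummy) {
            it = c.erase(it);               // old boundary marker, no longer needed
            ++zapped;
        } else {
            it->second.continuous = true;
            ++it;
        }
    }
    it->second.continuous = true;           // the upper boundary closes the range
    return zapped;
}

// Count the dummy entries a later scan would have to walk over.
inline std::size_t count_dummies(const toy_cache& c) {
    std::size_t n = 0;
    for (const auto& [key, e] : c) {
        (void)key;
        n += e.dummy ? 1 : 0;
    }
    return n;
}
```

In this model, a time series pattern that scans [90, 100], then [80, 100], then [70, 100] leaves only two dummies behind (at 70 and 100), because each scan erases the previous lower-bound dummy instead of keeping it.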

Refs #8153.

_last_row is now updated atomically with _next_row. Before, _last_row
was moved first. If an exception was thrown and the section was retried,
this could cause the wrong entry to be removed (new next instead of
old last) by the new algorithm. I don't think this was causing
problems before this patch.
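
A minimal sketch of the retry-safety concern, under stated assumptions: the `cursor` type and `advance` helper are hypothetical stand-ins, not the actual reader code. The step that may throw runs before either position is modified, and both positions are then committed together, so a retried section still sees the old pair.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical two-position cursor; not ScyllaDB's cache reader.
struct cursor {
    std::size_t last = 0;
    std::size_t next = 1;
};

// Advance both positions. `step` computes the new next position and may throw.
// Nothing in the cursor is modified until `step` has returned, so a retry
// after an exception observes the unchanged (last, next) pair.
template <typename MayThrow>
void advance(cursor& c, MayThrow&& step) {
    std::size_t new_next = step(c.next);       // may throw; cursor untouched so far
    assert(new_next > c.next);                 // this toy cursor only moves forward
    c.last = std::exchange(c.next, new_next);  // commit both together; cannot throw
}
```

Had `last` been moved before the throwing step, a retry would start from a cursor whose `last` no longer refers to the old last position, which is the kind of mismatch the paragraph above describes.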

The problem is not solved for all the cases. After this patch, we
remove dummies only when there is a single MVCC version. We could
patch apply_monotonically() to also do it, so that dummies which are
inside continuous ranges are eventually removed, but this is left for
later.

perf_row_cache_reads output after that patch shows that the second
scan touches no dummies:

$ build/release/test/perf/perf_row_cache_reads_g -c1 -m200M
Rows in cache: 0
Populating with dummy rows
Rows in cache: 265320
Scanning
read: 142.621613 [ms], preemption: {count: 639, 99%: 0.545791 [ms], max: 0.526929 [ms]}, cache: 0/0 [MB]
read: 0.023197 [ms], preemption: {count: 1, 99%: 0.035425 [ms], max: 0.032736 [ms]}, cache: 0/0 [MB]

Message-Id: <20210226172801.800264-1-tgrabiec@scylladb.com>

2021-03-01 20:34:35 +02:00

.github

docs: added multiversion_regex_builder

2021-01-13 11:07:29 +02:00

abseil @ 9c6a50fdd8

Update abseil submodule

2021-02-08 15:41:46 +02:00

alternator

Merge ' cdc: move (most of) CDC generation management to a new service' from Kamil Braun

2021-02-26 12:42:27 +01:00

api

api: Introduce system/drop_sstable_caches RESTful API

2021-03-01 16:13:04 +02:00

auth

migration_manager: drop announce_locally flag

2021-01-03 13:58:09 +02:00

cdc

cdc: move (most of) CDC generation management code to the new service

2021-02-26 12:06:12 +01:00

conf

transport: Fix abort on certain configurations of native_transport_port(_ssl)

2021-02-02 11:32:31 +02:00

cql3

cql3: query_processor: improve internal paged query API

2021-02-18 11:44:59 +01:00

Merge 'sstables: add versioning to the sstable_set ' from Wojciech Mitros

2021-03-01 14:16:36 +02:00

debug

…

dht

range-streamer: Remove global storage service reference

2021-02-12 15:50:30 +01:00

dist

dist: tune fs.aio-max-nr based on the number of cpus

2021-03-01 14:18:24 +02:00

docs

cdc: rewrite streams to the new description table

2021-02-18 11:44:59 +01:00

exceptions

cql: fix error return from execution of fromJson() and other functions

2021-01-21 15:21:13 +01:00

gms

schema: recalculate digest when computed_columns feature is enabled

2021-02-11 13:48:58 +02:00

idl

raft: joint consensus, use unordered_set for server_address list

2021-01-29 22:07:07 +03:00

index

flat_mutation_reader: return future from next_partition

2021-01-13 17:35:07 +02:00

interface

thrift: switch csharp backend to netstd

2020-06-23 19:40:18 +03:00

libdeflate @ e7e54eab42

…

licenses

Add abseil as a submodule

2020-06-14 08:18:37 -07:00

locator

locator: Check DC names in NTS

2021-02-09 07:04:17 +01:00

message

messaging_service: Move gossip ack message verb to gossip group

2021-02-23 10:10:00 +02:00

mutation_writer

mutation_writer: bucket_writer: add close

2021-01-19 19:03:58 +02:00

raft

Merge "raft: add unit tests for log, tracker, votes and fix found bugs" from Kostja

2021-02-18 10:55:59 +01:00

redis

redis: rename _args_size/_size_left

2021-01-25 10:26:37 +09:00

reloc

reloc: Remove "build_reloc.sh" script as obsolete

2020-11-20 22:41:26 +02:00

repair

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

scripts

dist: add node_exporter to scylla-server package

2020-12-24 11:44:13 +02:00

seastar @ 803e790598

Update seastar submodule

2021-02-25 16:58:06 +02:00

service

Merge 'sstables: add versioning to the sstable_set ' from Wojciech Mitros

2021-03-01 14:16:36 +02:00

sstables

Merge 'sstables: add versioning to the sstable_set ' from Wojciech Mitros

2021-03-01 14:16:36 +02:00

streaming

stream_session: prepare: fix missing string format argument

2021-02-11 12:05:32 +02:00

swagger-ui @ 12f1da1082

…

test

row_cache: Zap dummy entries when populating or reading a range

2021-03-01 20:34:35 +02:00

thrift

database: drop duplicated function

2021-02-01 18:52:04 +02:00

tools

Update tools/python3 submodule

2021-03-01 10:10:13 +02:00

tracing

uuid: reduce code dependency on UUID_gen.hh

2021-01-27 20:08:29 +02:00

transport

transport: fix an outdated comment

2021-02-24 11:14:01 +02:00

types

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

unified

install.sh: add systemd existance check

2021-01-13 19:32:45 +09:00

utils

Merge 'lsa: background reclaim' from Avi Kivity

2021-02-24 13:23:30 +01:00

.dockerignore

.dockerignore: add testlog

2020-02-07 08:59:39 +01:00

.gitattributes

…

.gitignore

docs: added theme

2020-12-03 17:37:18 +01:00

.gitmodules

scylla-python3: move scylla-python3 to separated repository

2020-08-18 09:34:08 +03:00

.gitorderfile

…

absl-flat_hash_map.cc

Add absl wrapper headers

2020-06-14 08:18:39 -07:00

absl-flat_hash_map.hh

Add absl wrapper headers

2020-06-14 08:18:39 -07:00

atomic_cell_hash.hh

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

atomic_cell_or_collection.hh

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

atomic_cell.cc

atomic_cell: fix operator<< for atomic_cell_or_collection

2021-02-22 14:45:34 +02:00

atomic_cell.hh

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

backlog_controller.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

bytes_ostream.hh

bytes_ostream: Remove std::iterator from fragment_iterator

2020-11-17 16:53:20 +01:00

bytes.cc

mp_row_consumer: Provide hex-formatting wrapper for bytes_view

2020-08-26 20:44:11 +03:00

bytes.hh

bytes: implement std::hash using appending_hash

2021-01-08 13:17:46 +01:00

cache_flat_mutation_reader.hh

row_cache: Zap dummy entries when populating or reading a range

2021-03-01 20:34:35 +02:00

cache_temperature.hh

…

caching_options.hh

treewide: replace libjsoncpp usage with rjson

2020-07-03 10:27:23 +02:00

canonical_mutation.cc

canonical_mutation: make the data type non-contiguous

2021-02-15 10:24:47 +01:00

canonical_mutation.hh

canonical_mutation: make the data type non-contiguous

2021-02-15 10:24:47 +01:00

cartesian_product.hh

cartesian_product: Remove std::iterator from iterator

2020-11-17 16:53:20 +01:00

cell_locking.hh

…

checked-file-impl.hh

treewide: replace calls to engine().some_api() with some_api()

2020-04-05 12:46:04 +03:00

clocks-impl.cc

clocks-impl: switch to thread-safe time conversion

2020-05-04 14:11:38 +03:00

clocks-impl.hh

…

clustering_bounds_comparator.hh

clustering_bounds_comparator: do not depend on implicit conversion of keys to bytes_view

2020-12-20 15:14:44 +01:00

clustering_interval_set.hh

clustering_interval_set: Remove std::iterator from position_range_iterator

2020-11-17 16:53:20 +01:00

clustering_key_filter.hh

header: De-bloat schema.hh

2020-03-03 11:34:00 +01:00

clustering_ranges_walker.hh

clustering_range_walker: fix false discontiguity detected after a static row

2021-02-01 19:32:07 +02:00

CMakeLists.txt

mutation_writer/feed_writers: refactor bucket/shard writers

2021-01-19 18:48:01 +02:00

collection_mutation.cc

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

collection_mutation.hh

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

column_computation.hh

column_computation: add token_column_computation

2020-11-04 12:02:42 +01:00

combine.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

compaction_garbage_collector.hh

header: De-bloat schema.hh

2020-03-03 11:34:00 +01:00

compaction_strategy_type.hh

compaction_strategy: add method to reshape SSTables

2020-06-18 09:37:18 -04:00

compaction_strategy.hh

distributed_loader: reshard before the node is made online

2020-06-18 09:37:18 -04:00

compatible_ring_position.hh

…

compound_compat.hh

utils: fragment_range: add a fragment iterator for FragmentedView

2021-01-15 14:05:44 +01:00

compound.hh

keys, compound: switch from bytes_view to managed_bytes_view

2021-01-08 14:16:08 +01:00

compress.cc

codebase wide: replace count with contains

2020-08-15 20:26:02 +03:00

compress.hh

…

concrete_types.hh

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

configure.py

Merge 'sstables: add versioning to the sstable_set ' from Wojciech Mitros

2021-03-01 14:16:36 +02:00

connection_notifier.cc

code: Use qctx::evecute_cql methods, not global ones

2020-11-19 18:39:05 +03:00

connection_notifier.hh

code: Use qctx::evecute_cql methods, not global ones

2020-11-19 18:39:05 +03:00

CONTRIBUTING.md

docs: improve CONTRIBUTING.md

2021-02-14 22:09:24 +02:00

converting_mutation_partition_applier.cc

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

converting_mutation_partition_applier.hh

converting_mutation_partition_applier: move to .cc file

2020-03-04 12:42:57 +02:00

counters.cc

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

counters.hh

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

cql_serialization_format.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

database_fwd.hh

…

database.cc

api: Introduce system/drop_sstable_caches RESTful API

2021-03-01 16:13:04 +02:00

database.hh

api: Introduce system/drop_sstable_caches RESTful API

2021-03-01 16:13:04 +02:00

db_clock.hh

clocks: add printing functions

2020-01-30 11:10:08 +01:00

debug.hh

…

digest_algorithm.hh

digest: add null values to row digest

2020-09-10 13:16:44 +02:00

digester.hh

Merge "Don't expose exact collection from range_tombstone_list" from Pavel E

2020-09-15 10:09:15 +02:00

dirty_memory_manager.hh

…

distributed_loader.cc

Consolidate system and non system keyspace creation

2021-02-09 17:18:04 +01:00

distributed_loader.hh

distributed_loader: Add get_sstables_from_upload_dir

2021-01-16 20:03:17 +08:00

Doxyfile

…

duration.cc

duration: adjust for C++20 char8_t type

2020-05-12 20:40:30 +02:00

duration.hh

…

encoding_stats.hh

…

enum_set.hh

…

fix_system_distributed_tables.py

tracing: add username to the session table

2020-10-01 04:46:40 +02:00

flat_mutation_reader.cc

mutation_fragment_stream_validator: add token validation level

2021-03-01 07:49:23 +02:00

flat_mutation_reader.hh

flat_mutation_reader: move mutation consumer concepts to separate header

2021-01-22 15:27:48 +02:00

frozen_mutation.cc

frozen_mutation: add partition context to errors coming from deserializing

2020-12-02 15:08:49 +02:00

frozen_mutation.hh

Merge "lwt: store column_mapping's for each table schema version upon a DDL change" from Pavel Solodovnikov

2020-10-15 20:48:29 +02:00

frozen_schema.cc

frozen_schema: order idl implementations correctly

2020-10-03 19:56:28 +03:00

frozen_schema.hh

header: De-bloat schema.hh

2020-03-03 11:34:00 +01:00

gc_clock.hh

…

gen_segmented_compress_params.py

…

HACKING.md

README: better explanation of dependencies and build

2020-06-16 13:26:04 +02:00

hashers.cc

hashers: convert illegal contraint to static_assert

2020-09-21 16:32:10 +03:00

hashers.hh

hashers: Mark hash updates noexcept

2020-09-07 23:17:41 +03:00

hashing_partition_visitor.hh

…

hashing.hh

hashers: Mark hash updates noexcept

2020-09-07 23:17:41 +03:00

idl-compiler.py

idl-compiler: allow fields of type utils::chunked_vector

2021-01-13 04:09:18 +01:00

init.cc

messaging_service: Move initialization to messaging/

2020-08-19 13:08:12 +03:00

init.hh

messaging_service: Move initialization to messaging/

2020-08-19 13:08:12 +03:00

install-dependencies.sh

tools: toolchain: add node_exporter

2020-12-14 20:34:17 +02:00

install.sh

dist: drop /etc/security/limits.d/scylla.conf

2021-01-24 11:43:39 +02:00

interval.hh

range: rename range template family to interval

2020-06-16 13:36:20 +03:00

intrusive_set_external_comparator.hh

…

keys.cc

keys, compound: switch from bytes_view to managed_bytes_view

2021-01-08 14:16:08 +01:00

keys.hh

memtable: fix accounting of managed_bytes in partition_snapshot_accounter

2021-01-15 18:21:13 +01:00

LICENSE.AGPL

…

lister.cc

codebase wide: replace count with contains

2020-08-15 20:26:02 +03:00

lister.hh

Update seastar submodule

2020-08-19 17:18:57 +03:00

log.hh

…

lua.cc

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

lua.hh

lua: Handle nil returns correctly

2020-01-29 14:05:01 -08:00

main.cc

Merge ' cdc: move (most of) CDC generation management to a new service' from Kamil Braun

2021-02-26 12:42:27 +01:00

map_difference.hh

…

marshal_exception.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

memtable-sstable.hh

table: Add write_memtable_to_sstable variant which accepts flat_mutation_reader

2021-01-04 16:23:00 -03:00

memtable.cc

treewide: explicitly use flat_mutation_reader_opt

2021-02-17 17:57:34 +02:00

memtable.hh

memtable: Track min timestamp

2021-01-04 13:24:43 -03:00

multishard_mutation_query.cc

reader_lifecycle_policy: retire low level try_resume method

2021-02-08 20:32:40 +02:00

multishard_mutation_query.hh

storage_proxy: use read_command::max_result_size to pass max result size around

2020-07-28 18:00:29 +03:00

mutation_cleaner.hh

…

mutation_compactor.hh

mutation compactor: query compaction: ignore purgeable tombstones

2021-01-22 15:27:48 +02:00

mutation_consumer_concepts.hh

flat_mutation_reader: move mutation consumer concepts to separate header

2021-01-22 15:27:48 +02:00

mutation_fragment_stream_validator.hh

mutation_fragment_stream_validator: add token validation level

2021-03-01 07:49:23 +02:00

mutation_fragment.cc

range_tombstone: Remove unused trim-front arg from .apply()

2020-11-06 15:13:05 +03:00

mutation_fragment.hh

Merge 'managed_bytes: switch to explicit linearization' from Michał Chojnowski

2021-01-18 11:01:28 +02:00

mutation_partition_serializer.cc

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

mutation_partition_serializer.hh

…

mutation_partition_view.cc

mutation_fragment: add schema and permit

2020-09-28 11:27:23 +03:00

mutation_partition_view.hh

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

mutation_partition_visitor.hh

…

mutation_partition.cc

Merge "Use radix tree to store cells on a row" from Pavel E

2021-02-18 21:19:14 +02:00

mutation_partition.hh

test/memory_footpring: Print radix tree node sizes

2021-02-15 20:41:09 +03:00

mutation_query.cc

mutation_query: move to_data_query_result() to mutation_partition.cc

2021-01-22 15:27:48 +02:00

mutation_query.hh

mutation_query: mark reconcilable_result_builder constructor noexcept

2021-02-17 18:56:12 +02:00

mutation_reader.cc

multishard_combining_reader: only read from needed shards

2021-02-26 23:29:20 +02:00

mutation_reader.hh

reader_lifecycle_policy: retire low level try_resume method

2021-02-08 20:32:40 +02:00

mutation_rebuilder.hh

…

mutation_source_metadata.hh

…

mutation.cc

mutation: remove now unused query() and query_compacted()

2021-01-22 15:36:37 +02:00

mutation.hh

mutation: consume(): add reverse mode

2021-02-03 11:00:47 +02:00

noexcept_traits.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

NOTICE.txt

raft: etcd unit tests: initial boost tests

2021-01-18 12:33:12 -04:00

ORIGIN

…

partition_builder.hh

partition_builder: accept_row(): use append_clustering_row()

2020-12-02 15:08:49 +02:00

partition_range_compat.hh

…

partition_slice_builder.cc

…

partition_slice_builder.hh

partition_slice_builder: add with_option()

2020-07-28 18:00:29 +03:00

partition_snapshot_reader.hh

mutation_partition: Switch cache of rows onto B-tree

2021-02-02 09:30:30 +03:00

partition_snapshot_row_cursor.hh

row_cache: Zap dummy entries when populating or reading a range

2021-03-01 20:34:35 +02:00

partition_version_list.hh

…

partition_version.cc

misc: fix indentation

2021-01-08 14:16:08 +01:00

partition_version.hh

row_cache: Zap dummy entries when populating or reading a range

2021-03-01 20:34:35 +02:00

position_in_partition.hh

keys, compound: switch from bytes_view to managed_bytes_view

2021-01-08 14:16:08 +01:00

querier.cc

querier_cache: insert_querier: ignore errors to register inactive reader

2021-02-08 22:31:01 +02:00

querier.hh

Merge "Unify inactive readers" from Botond

2021-02-03 10:59:04 +02:00

query_class_config.hh

query: query_class_config: use max_result_size for the max_memory_for_unlimited_query field

2020-07-28 18:00:29 +03:00

query_result_merger.hh

increase the maximum size of query results to 2^64

2020-08-03 17:32:49 +02:00

query-request.hh

increase the maximum size of query results to 2^64

2020-08-03 17:32:49 +02:00

query-result-reader.hh

query-result-reader: order idl implementations correctly

2020-10-03 19:56:29 +03:00

query-result-set.cc

treewide: use query_mutations() instead of mutation::query()

2021-01-22 15:36:37 +02:00

query-result-set.hh

mutation_partition: Debloat header form others

2020-03-18 11:53:36 +02:00

query-result-writer.hh

query-result-writer: fix idl definition order related failures with clang

2020-10-11 17:57:12 +03:00

query-result.hh

result_memory_accounter: abort unpaged queries hitting the global limit

2021-02-26 23:43:16 +02:00

query.cc

increase the maximum size of query results to 2^64

2020-08-03 17:32:49 +02:00

range_tombstone_list.cc

tombstone: use comparison operator instead of ad-hoc compare() function and with_relational_operators

2020-06-02 09:28:52 +03:00

range_tombstone_list.hh

mutation: consume(): add reverse mode

2021-02-03 11:00:47 +02:00

range_tombstone.cc

…

range_tombstone.hh

memtable: fix accounting of managed_bytes in partition_snapshot_accounter

2021-01-15 18:21:13 +01:00

range.hh

range: rename range template family to interval

2020-06-16 13:36:20 +03:00

read_context.hh

row_cache: read_context: use query-request is_single_partition helper

2021-02-17 18:29:39 +02:00

reader_concurrency_semaphore.cc

reader_concurrency_semaphore: register_inactive_read: make noexcept

2021-02-08 22:31:01 +02:00

reader_concurrency_semaphore.hh

reader_concurrency_semaphore: register_inactive_read: make noexcept

2021-02-08 22:31:01 +02:00

reader_permit.hh

reader_concurrency_semaphore: dump permit diagnostics on timeout or queue overflow

2020-10-13 12:32:14 +03:00

README.md

docs: fix invalid path in README.mds

2021-02-21 13:49:12 +02:00

real_dirty_memory_accounter.hh

…

release.cc

scylla: Add "--build-mode" command line option

2021-01-20 16:07:29 +02:00

release.hh

scylla: Add "--build-mode" command line option

2021-01-20 16:07:29 +02:00

reversibly_mergeable.hh

…

row_cache.cc

row_cache: Zap dummy entries when populating or reading a range

2021-03-01 20:34:35 +02:00

row_cache.hh

row_cache: Add metric for dummy row hits

2021-02-25 18:26:01 +01:00

schema_builder.hh

schema_tables: put schema tables on shard 0

2021-01-28 13:28:22 +02:00

schema_fwd.hh

…

schema_mutations.cc

uuid: reduce code dependency on UUID_gen.hh

2021-01-27 20:08:29 +02:00

schema_mutations.hh

schema: include partitioner name in scylla tables mutation

2020-03-15 10:25:20 +01:00

schema_registry.cc

schema_registry: make grace period configurable

2020-09-15 17:53:27 +02:00

schema_registry.hh

schema_registry: make grace period configurable

2020-09-15 17:53:27 +02:00

schema_upgrader.hh

mutation_fragment: add schema and permit

2020-09-28 11:27:23 +03:00

schema.cc

schema_tables: put schema tables on shard 0

2021-01-28 13:28:22 +02:00

schema.hh

column_mapping_entry: extract == and != operators

2020-10-16 14:59:50 +02:00

scylla_post_install.sh

scylla_post_install.sh: generate memory.conf for CentOS7

2020-07-29 14:10:16 +03:00

scylla-gdb.py

scylla-gdb.py: nonwrapping_interval_printer: fix compatibility with 4.2+

2021-02-15 18:14:03 +02:00

SCYLLA-VERSION-GEN

release: prepare for 4.5.dev

2021-01-18 16:05:25 +02:00

seastarx.hh

Everywhere: Explicitly instantiate make_shared

2020-07-21 10:33:49 -07:00

serialization_visitors.hh

…

serializer_impl.hh

serializer: add serializer<lw_shared_ptr<T>> specialization

2021-01-29 01:58:46 +03:00

serializer.cc

serializer: add serializer<lw_shared_ptr<T>> specialization

2021-01-29 01:58:46 +03:00

serializer.hh

serializer: implement FragmentedView for buffer_view

2020-11-27 15:26:13 +01:00

service_permit.hh

Everywhere: Explicitly instantiate make_lw_shared

2020-07-21 10:33:49 -07:00

setup.py

…

supervisor.hh

supervisor: drop unused Upstart code, always use libsystemd

2020-06-10 08:17:35 +03:00

table_helper.cc

migration_manager: drop announce_locally flag

2021-01-03 13:58:09 +02:00

table_helper.hh

table_helper: Require local query processor in calls

2020-10-06 15:44:20 +03:00

table.cc

Merge 'sstables: add versioning to the sstable_set ' from Wojciech Mitros

2021-03-01 14:16:36 +02:00

test.py

test.py: enable back CQL based tests

2020-11-20 11:45:15 +02:00

timeout_config.cc

config: Place timeout_config() into own .cc file

2020-03-08 17:57:58 +02:00

timeout_config.hh

config: Place timeout_config() into own .cc file

2020-03-08 17:57:58 +02:00

timestamp.hh

add missing include to timestamp.hh

2020-02-05 19:42:18 +02:00

to_string.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

tombstone.hh

tombstone: use comparison operator instead of ad-hoc compare() function and with_relational_operators

2020-06-02 09:28:52 +03:00

tox.ini

…

types.cc

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

types.hh

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

ubsan-suppressions.supp

suppress ubsan error in boost::deque::clear()

2020-11-09 11:25:19 +02:00

unimplemented.cc

everywhere: Insert space after switch

2020-08-18 14:31:04 +03:00

unimplemented.hh

…

user_types_metadata.hh

user_types_metadata: don't implement enable_lw_shared_from_this

2019-12-11 10:44:40 -08:00

validation.cc

validation: Remove get_local_storage_proxy call

2020-12-11 18:52:42 +03:00

validation.hh

validation: Remove get_local_storage_proxy call

2020-12-11 18:52:42 +03:00

version.hh

…

view_info.hh

db: view: Refactor view_info::initialize_base_dependent_fields()

2020-08-20 14:53:07 +02:00

vint-serialization.cc

…

vint-serialization.hh

vint-serialization: Reference the correct spec

2021-01-05 18:54:09 +02:00

xx_hasher.hh

Merge "Don't expose exact collection from range_tombstone_list" from Pavel E

2020-09-15 10:09:15 +02:00

zstd.cc

build: remove zstd submodule

2020-06-11 17:12:49 +03:00

README.md

Scylla

What is Scylla?

Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB. Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.

For more information, please see the ScyllaDB web site.

Build Prerequisites

Scylla is fairly fussy about its build environment, requiring very recent versions of the C++20 compiler and of many libraries to build. The document HACKING.md includes detailed information on building and developing Scylla, but to get Scylla building quickly on (almost) any build machine, Scylla offers a frozen toolchain. This is a pre-configured Docker image which includes recent versions of all the required compilers, libraries and build tools. Using the frozen toolchain allows you to avoid changing anything in your build machine to meet Scylla's requirements: you just need to meet the frozen toolchain's prerequisites (mostly, Docker or Podman being available).

Building Scylla

Building Scylla with the frozen toolchain dbuild is as easy as:

$ git submodule update --init --force --recursive
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla

For further information, please see:

Developer documentation for more information on building Scylla.
Build documentation on how to build Scylla binaries, tests, and packages.
Docker image build documentation for information on how to build Docker images.

Running Scylla

To start the Scylla server, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1

This will start a Scylla node with one CPU core allocated to it and data files stored in the tmp directory. The --developer-mode option is needed to disable the various checks Scylla performs at startup to ensure the machine is configured for maximum performance (not relevant on development workstations). Please note that you need to run Scylla with dbuild if you built it with the frozen toolchain.

For more run options, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --help

Testing

See test.py manual.

Scylla APIs and compatibility

By default, Scylla is compatible with Apache Cassandra and its APIs - CQL and Thrift. There is also support for the API of Amazon DynamoDB™, which needs to be enabled and configured in order to be used. For more information on how to enable the DynamoDB™ API in Scylla, and the current compatibility of this feature as well as Scylla-specific extensions, see Alternator and Getting started with Alternator.

Documentation

Documentation can be found here. Seastar documentation can be found here. User documentation can be found here.

Training

Training material and online courses can be found at Scylla University. The courses are free, self-paced and include hands-on examples. They cover a variety of topics including Scylla data modeling, administration, architecture, basic NoSQL concepts, using drivers for application development, Scylla setup, failover, compactions, multi-datacenters and how Scylla integrates with third-party applications.

Contributing to Scylla

If you want to report a bug or submit a pull request or a patch, please read the contribution guidelines.

If you are a developer working on Scylla, please read the developer guidelines.

Contact

The users mailing list and Slack channel are for users to discuss configuration, management, and operations of the ScyllaDB open source.
The developers mailing list is for developers and people interested in following the development of ScyllaDB to discuss technical topics.

Languages

C++ 72.1%

Python 26.7%

CMake 0.3%

GAP 0.3%

Shell 0.3%