mirror of https://github.com/scylladb/scylladb.git synced 2026-06-03 05:26:58 +00:00

Avi Kivity f0950e023d Merge 'Split CDC streams table partitions into clustered rows ' from Kamil Braun

Until now, the lists of streams in the `cdc_streams_descriptions` table
for a given generation were stored in a single collection. This solution
has multiple problems when dealing with large clusters (which produce
large lists of streams):
1. large allocations
2. reactor stalls
3. mutations too large to even fit in commitlog segments

This commit changes the schema of the table as described in issue #7993.
The streams are grouped according to token ranges, each token range
being represented by a separate clustering row. Rows are inserted in
reasonably large batches for efficiency.
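
As a rough illustration of the grouping described above (a sketch, not the code from this series), the snippet below buckets stream IDs by the token range that owns them, so that each range becomes one row, and then splits the rows into fixed-size batches. The helper names `group_streams_by_range` and `batched`, the integer tokens, and the batch size are all hypothetical.

```python
from bisect import bisect_left
from typing import Dict, Iterator, List, Sequence, Tuple


def group_streams_by_range(
    range_ends: Sequence[int], streams: Sequence[Tuple[int, str]]
) -> Dict[int, List[str]]:
    """Group (token, stream_id) pairs by the token range containing the token.

    range_ends must be sorted. Range i covers (range_ends[i-1], range_ends[i]];
    tokens above the last end wrap around to the first range.
    """
    rows: Dict[int, List[str]] = {end: [] for end in range_ends}
    for token, stream_id in streams:
        idx = bisect_left(range_ends, token)
        if idx == len(range_ends):
            idx = 0  # wrap-around range
        rows[range_ends[idx]].append(stream_id)
    return rows


def batched(items: Sequence, size: int) -> Iterator[list]:
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


if __name__ == "__main__":
    ends = [-100, 0, 100]
    streams = [(-150, "s0"), (-100, "s1"), (-5, "s2"), (42, "s3"), (250, "s4")]
    rows = group_streams_by_range(ends, streams)
    # Each batch would be written as one mutation covering several rows.
    for batch in batched(list(rows.items()), 2):
        print(batch)
```

Keeping one row per token range bounds the size of any single cell, while batching several rows per write avoids paying a round trip for every row.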

The table is renamed to enable easy upgrade. On upgrade, the latest CDC
generation's list of streams will be (re-)inserted into the new table.

Yet another table is added: one that contains only the generation
timestamps clustered in a single partition. This makes it easy for CDC
clients to learn about new generations. It also enables an elegant
two-phase insertion procedure of the generation description: first we
insert the streams; only after ensuring that a quorum of replicas
contains them, we insert the timestamp. Thus, if any client observes a
timestamp in the timestamps table (even using a ONE query),
it means that a quorum of replicas must contain the list of streams.
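
The two-phase procedure can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: the `Replica` class, `publish_generation`, and the way acknowledgements are counted are hypothetical and do not mirror the actual implementation. The timestamp is written only after a quorum of replicas has acknowledged the streams.

```python
from typing import Dict, List, Sequence, Set


class Replica:
    """Toy replica holding the two tables as plain Python containers."""

    def __init__(self) -> None:
        self.up = True
        self.streams: Dict[int, List[str]] = {}  # generation ts -> streams
        self.timestamps: Set[int] = set()        # known generation timestamps


def quorum(n: int) -> int:
    return n // 2 + 1


def publish_generation(
    replicas: Sequence[Replica], ts: int, streams: Sequence[str]
) -> bool:
    # Phase 1: insert the streams and count acknowledgements.
    acks = 0
    for r in replicas:
        if r.up:
            r.streams[ts] = list(streams)
            acks += 1
    if acks < quorum(len(replicas)):
        return False  # never expose the timestamp without a quorum of streams

    # Phase 2: only now make the generation visible via its timestamp.
    for r in replicas:
        if r.up:
            r.timestamps.add(ts)
    return True


if __name__ == "__main__":
    cluster = [Replica(), Replica(), Replica()]
    cluster[2].up = False
    print(publish_generation(cluster, 1613645383, ["s0", "s1"]))
```

In this model, a reader that finds the timestamp on any single replica can then read the streams at quorum and be sure to reach at least one replica that holds them.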

---

Nodes automatically ensure that the latest CDC generation's list of
streams is present in the streams description table. When a new
generation appears, we only need to update the table for this
generation; old generations are already inserted.

However, we've changed the description table (from
`cdc_streams_descriptions` to `cdc_streams_descriptions_v2`). The
existing mechanism only ensures that the latest generation appears in
the new description table. We add an additional procedure that
rewrites the older generations as well, if we find that it is necessary
to do so (i.e. when some CDC log tables may contain data in these
generations).

Closes #8116

* github.com:scylladb/scylla:
  tests: add a simple CDC cql pytest
  cdc: add config option to disable streams rewriting
  cdc: rewrite streams to the new description table
  cql3: query_processor: improve internal paged query API
  cdc: introduce no_generation_data_exception exception type
  docs: cdc: mention system.cdc_local table
  cdc: coroutinize do_update_streams_description
  sys_dist_ks: split CDC streams table partitions into clustered rows
  cdc: use chunked_vector for streams in streams_version
  cdc: remove `streams_version::expired` field
  system_distributed_keyspace: use mutation API to insert CDC streams
  storage_service: don't use `sys_dist_ks` before it is started

2021-02-18 12:49:43 +02:00

.github

docs: added multiversion_regex_builder

2021-01-13 11:07:29 +02:00

abseil @ 9c6a50fdd8

Update abseil submodule

2021-02-08 15:41:46 +02:00

alternator

sys_dist_ks: split CDC streams table partitions into clustered rows

2021-02-18 11:44:59 +01:00

api

API: Fix aggregation in column_familiy

2021-02-08 12:11:30 +02:00

auth

migration_manager: drop announce_locally flag

2021-01-03 13:58:09 +02:00

cdc

cdc: add config option to disable streams rewriting

2021-02-18 11:44:59 +01:00

conf

transport: Fix abort on certain configurations of native_transport_port(_ssl)

2021-02-02 11:32:31 +02:00

cql3

cql3: query_processor: improve internal paged query API

2021-02-18 11:44:59 +01:00

cdc: add config option to disable streams rewriting

2021-02-18 11:44:59 +01:00

debug

…

dht

range-streamer: Remove global storage service reference

2021-02-12 15:50:30 +01:00

dist

dist/debian: fix renaming debian/scylla-* files rule

2021-02-18 10:35:19 +02:00

docs

cdc: rewrite streams to the new description table

2021-02-18 11:44:59 +01:00

exceptions

cql: fix error return from execution of fromJson() and other functions

2021-01-21 15:21:13 +01:00

gms

schema: recalculate digest when computed_columns feature is enabled

2021-02-11 13:48:58 +02:00

idl

raft: joint consensus, use unordered_set for server_address list

2021-01-29 22:07:07 +03:00

index

flat_mutation_reader: return future from next_partition

2021-01-13 17:35:07 +02:00

interface

thrift: switch csharp backend to netstd

2020-06-23 19:40:18 +03:00

libdeflate @ e7e54eab42

…

licenses

Add abseil as a submodule

2020-06-14 08:18:37 -07:00

locator

locator: Check DC names in NTS

2021-02-09 07:04:17 +01:00

message

messaging: don't inherit from seastar::rpc::protocol

2021-02-16 16:04:44 +02:00

mutation_writer

mutation_writer: bucket_writer: add close

2021-01-19 19:03:58 +02:00

raft

Merge "raft: add unit tests for log, tracker, votes and fix found bugs" from Kostja

2021-02-18 10:55:59 +01:00

redis

redis: rename _args_size/_size_left

2021-01-25 10:26:37 +09:00

reloc

reloc: Remove "build_reloc.sh" script as obsolete

2020-11-20 22:41:26 +02:00

repair

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

scripts

dist: add node_exporter to scylla-server package

2020-12-24 11:44:13 +02:00

seastar @ e53a1059f9

Update seastar submodule

2021-02-16 16:19:26 +02:00

service

cdc: rewrite streams to the new description table

2021-02-18 11:44:59 +01:00

sstables

compaction: Fix leak of expired sstable in the backlog tracker

2021-02-18 11:12:00 +02:00

streaming

stream_session: prepare: fix missing string format argument

2021-02-11 12:05:32 +02:00

swagger-ui @ 12f1da1082

…

test

Merge 'Split CDC streams table partitions into clustered rows ' from Kamil Braun

2021-02-18 12:49:43 +02:00

thrift

database: drop duplicated function

2021-02-01 18:52:04 +02:00

tools

Update tools/jmx submodule

2021-02-18 10:35:00 +02:00

tracing

uuid: reduce code dependency on UUID_gen.hh

2021-01-27 20:08:29 +02:00

transport

transport: Remove global storage service reference

2021-02-08 12:58:49 +01:00

types

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

unified

install.sh: add systemd existance check

2021-01-13 19:32:45 +09:00

utils

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

.dockerignore

.dockerignore: add testlog

2020-02-07 08:59:39 +01:00

.gitattributes

…

.gitignore

docs: added theme

2020-12-03 17:37:18 +01:00

.gitmodules

scylla-python3: move scylla-python3 to separated repository

2020-08-18 09:34:08 +03:00

.gitorderfile

…

absl-flat_hash_map.cc

Add absl wrapper headers

2020-06-14 08:18:39 -07:00

absl-flat_hash_map.hh

Add absl wrapper headers

2020-06-14 08:18:39 -07:00

atomic_cell_hash.hh

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

atomic_cell_or_collection.hh

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

atomic_cell.cc

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

atomic_cell.hh

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

backlog_controller.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

bytes_ostream.hh

bytes_ostream: Remove std::iterator from fragment_iterator

2020-11-17 16:53:20 +01:00

bytes.cc

mp_row_consumer: Provide hex-formatting wrapper for bytes_view

2020-08-26 20:44:11 +03:00

bytes.hh

bytes: implement std::hash using appending_hash

2021-01-08 13:17:46 +01:00

cache_flat_mutation_reader.hh

treewide: explicitly use flat_mutation_reader_opt

2021-02-17 17:57:34 +02:00

cache_temperature.hh

…

caching_options.hh

treewide: replace libjsoncpp usage with rjson

2020-07-03 10:27:23 +02:00

canonical_mutation.cc

canonical_mutation: make the data type non-contiguous

2021-02-15 10:24:47 +01:00

canonical_mutation.hh

canonical_mutation: make the data type non-contiguous

2021-02-15 10:24:47 +01:00

cartesian_product.hh

cartesian_product: Remove std::iterator from iterator

2020-11-17 16:53:20 +01:00

cell_locking.hh

…

checked-file-impl.hh

treewide: replace calls to engine().some_api() with some_api()

2020-04-05 12:46:04 +03:00

clocks-impl.cc

clocks-impl: switch to thread-safe time conversion

2020-05-04 14:11:38 +03:00

clocks-impl.hh

…

clustering_bounds_comparator.hh

clustering_bounds_comparator: do not depend on implicit conversion of keys to bytes_view

2020-12-20 15:14:44 +01:00

clustering_interval_set.hh

clustering_interval_set: Remove std::iterator from position_range_iterator

2020-11-17 16:53:20 +01:00

clustering_key_filter.hh

header: De-bloat schema.hh

2020-03-03 11:34:00 +01:00

clustering_ranges_walker.hh

clustering_range_walker: fix false discontiguity detected after a static row

2021-02-01 19:32:07 +02:00

CMakeLists.txt

mutation_writer/feed_writers: refactor bucket/shard writers

2021-01-19 18:48:01 +02:00

collection_mutation.cc

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

collection_mutation.hh

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

column_computation.hh

column_computation: add token_column_computation

2020-11-04 12:02:42 +01:00

combine.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

compaction_garbage_collector.hh

header: De-bloat schema.hh

2020-03-03 11:34:00 +01:00

compaction_strategy_type.hh

compaction_strategy: add method to reshape SSTables

2020-06-18 09:37:18 -04:00

compaction_strategy.hh

distributed_loader: reshard before the node is made online

2020-06-18 09:37:18 -04:00

compatible_ring_position.hh

…

compound_compat.hh

utils: fragment_range: add a fragment iterator for FragmentedView

2021-01-15 14:05:44 +01:00

compound.hh

keys, compound: switch from bytes_view to managed_bytes_view

2021-01-08 14:16:08 +01:00

compress.cc

codebase wide: replace count with contains

2020-08-15 20:26:02 +03:00

compress.hh

…

concrete_types.hh

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

configure.py

Merge "raft: add unit tests for log, tracker, votes and fix found bugs" from Kostja

2021-02-18 10:55:59 +01:00

connection_notifier.cc

code: Use qctx::evecute_cql methods, not global ones

2020-11-19 18:39:05 +03:00

connection_notifier.hh

code: Use qctx::evecute_cql methods, not global ones

2020-11-19 18:39:05 +03:00

CONTRIBUTING.md

docs: improve CONTRIBUTING.md

2021-02-14 22:09:24 +02:00

converting_mutation_partition_applier.cc

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

converting_mutation_partition_applier.hh

converting_mutation_partition_applier: move to .cc file

2020-03-04 12:42:57 +02:00

counters.cc

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

counters.hh

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

cql_serialization_format.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

database_fwd.hh

…

database.cc

database: futurize remove

2021-02-17 18:52:53 +02:00

database.hh

database: futurize remove

2021-02-17 18:52:53 +02:00

db_clock.hh

…

debug.hh

…

digest_algorithm.hh

digest: add null values to row digest

2020-09-10 13:16:44 +02:00

digester.hh

Merge "Don't expose exact collection from range_tombstone_list" from Pavel E

2020-09-15 10:09:15 +02:00

dirty_memory_manager.hh

…

distributed_loader.cc

Consolidate system and non system keyspace creation

2021-02-09 17:18:04 +01:00

distributed_loader.hh

distributed_loader: Add get_sstables_from_upload_dir

2021-01-16 20:03:17 +08:00

Doxyfile

…

duration.cc

duration: adjust for C++20 char8_t type

2020-05-12 20:40:30 +02:00

duration.hh

…

encoding_stats.hh

…

enum_set.hh

…

fix_system_distributed_tables.py

tracing: add username to the session table

2020-10-01 04:46:40 +02:00

flat_mutation_reader.cc

flat_mutation_reader: Use clear() in destroy_current_mutation()

2021-02-02 09:30:30 +03:00

flat_mutation_reader.hh

flat_mutation_reader: move mutation consumer concepts to separate header

2021-01-22 15:27:48 +02:00

frozen_mutation.cc

frozen_mutation: add partition context to errors coming from deserializing

2020-12-02 15:08:49 +02:00

frozen_mutation.hh

Merge "lwt: store column_mapping's for each table schema version upon a DDL change" from Pavel Solodovnikov

2020-10-15 20:48:29 +02:00

frozen_schema.cc

frozen_schema: order idl implementations correctly

2020-10-03 19:56:28 +03:00

frozen_schema.hh

header: De-bloat schema.hh

2020-03-03 11:34:00 +01:00

gc_clock.hh

…

gen_segmented_compress_params.py

…

HACKING.md

README: better explanation of dependencies and build

2020-06-16 13:26:04 +02:00

hashers.cc

hashers: convert illegal contraint to static_assert

2020-09-21 16:32:10 +03:00

hashers.hh

hashers: Mark hash updates noexcept

2020-09-07 23:17:41 +03:00

hashing_partition_visitor.hh

…

hashing.hh

hashers: Mark hash updates noexcept

2020-09-07 23:17:41 +03:00

idl-compiler.py

idl-compiler: allow fields of type utils::chunked_vector

2021-01-13 04:09:18 +01:00

init.cc

messaging_service: Move initialization to messaging/

2020-08-19 13:08:12 +03:00

init.hh

messaging_service: Move initialization to messaging/

2020-08-19 13:08:12 +03:00

install-dependencies.sh

tools: toolchain: add node_exporter

2020-12-14 20:34:17 +02:00

install.sh

dist: drop /etc/security/limits.d/scylla.conf

2021-01-24 11:43:39 +02:00

interval.hh

range: rename range template family to interval

2020-06-16 13:36:20 +03:00

intrusive_set_external_comparator.hh

…

keys.cc

keys, compound: switch from bytes_view to managed_bytes_view

2021-01-08 14:16:08 +01:00

keys.hh

memtable: fix accounting of managed_bytes in partition_snapshot_accounter

2021-01-15 18:21:13 +01:00

LICENSE.AGPL

…

lister.cc

codebase wide: replace count with contains

2020-08-15 20:26:02 +03:00

lister.hh

Update seastar submodule

2020-08-19 17:18:57 +03:00

log.hh

…

lua.cc

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

lua.hh

…

main.cc

system_distributed_keyspace: use mutation API to insert CDC streams

2021-02-18 11:44:59 +01:00

map_difference.hh

…

marshal_exception.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

memtable-sstable.hh

table: Add write_memtable_to_sstable variant which accepts flat_mutation_reader

2021-01-04 16:23:00 -03:00

memtable.cc

treewide: explicitly use flat_mutation_reader_opt

2021-02-17 17:57:34 +02:00

memtable.hh

memtable: Track min timestamp

2021-01-04 13:24:43 -03:00

multishard_mutation_query.cc

reader_lifecycle_policy: retire low level try_resume method

2021-02-08 20:32:40 +02:00

multishard_mutation_query.hh

storage_proxy: use read_command::max_result_size to pass max result size around

2020-07-28 18:00:29 +03:00

mutation_cleaner.hh

…

mutation_compactor.hh

mutation compactor: query compaction: ignore purgeable tombstones

2021-01-22 15:27:48 +02:00

mutation_consumer_concepts.hh

flat_mutation_reader: move mutation consumer concepts to separate header

2021-01-22 15:27:48 +02:00

mutation_fragment_stream_validator.hh

mutation_fragment_stream_validator: make it easier to validate concrete fragment types

2021-01-11 08:07:42 +02:00

mutation_fragment.cc

range_tombstone: Remove unused trim-front arg from .apply()

2020-11-06 15:13:05 +03:00

mutation_fragment.hh

Merge 'managed_bytes: switch to explicit linearization' from Michał Chojnowski

2021-01-18 11:01:28 +02:00

mutation_partition_serializer.cc

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

mutation_partition_serializer.hh

…

mutation_partition_view.cc

mutation_fragment: add schema and permit

2020-09-28 11:27:23 +03:00

mutation_partition_view.hh

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

mutation_partition_visitor.hh

…

mutation_partition.cc

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

mutation_partition.hh

mutation_partition: Switch cache of rows onto B-tree

2021-02-02 09:30:30 +03:00

mutation_query.cc

mutation_query: move to_data_query_result() to mutation_partition.cc

2021-01-22 15:27:48 +02:00

mutation_query.hh

mutation_query: mark reconcilable_result_builder constructor noexcept

2021-02-17 18:56:12 +02:00

mutation_reader.cc

evictable_reader: reset _range_override after fast-forwarding

2021-02-17 19:11:00 +02:00

mutation_reader.hh

reader_lifecycle_policy: retire low level try_resume method

2021-02-08 20:32:40 +02:00

mutation_rebuilder.hh

…

mutation_source_metadata.hh

…

mutation.cc

mutation: remove now unused query() and query_compacted()

2021-01-22 15:36:37 +02:00

mutation.hh

mutation: consume(): add reverse mode

2021-02-03 11:00:47 +02:00

noexcept_traits.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

NOTICE.txt

raft: etcd unit tests: initial boost tests

2021-01-18 12:33:12 -04:00

ORIGIN

…

partition_builder.hh

partition_builder: accept_row(): use append_clustering_row()

2020-12-02 15:08:49 +02:00

partition_range_compat.hh

…

partition_slice_builder.cc

…

partition_slice_builder.hh

partition_slice_builder: add with_option()

2020-07-28 18:00:29 +03:00

partition_snapshot_reader.hh

mutation_partition: Switch cache of rows onto B-tree

2021-02-02 09:30:30 +03:00

partition_snapshot_row_cursor.hh

partition_snapshot_row_cursor: Remove rows pointer

2021-02-02 09:30:30 +03:00

partition_version_list.hh

…

partition_version.cc

misc: fix indentation

2021-01-08 14:16:08 +01:00

partition_version.hh

partition_version: Change range_tombstones() to return chunked_vector

2020-10-26 11:54:42 +02:00

position_in_partition.hh

keys, compound: switch from bytes_view to managed_bytes_view

2021-01-08 14:16:08 +01:00

querier.cc

querier_cache: insert_querier: ignore errors to register inactive reader

2021-02-08 22:31:01 +02:00

querier.hh

Merge "Unify inactive readers" from Botond

2021-02-03 10:59:04 +02:00

query_class_config.hh

query: query_class_config: use max_result_size for the max_memory_for_unlimited_query field

2020-07-28 18:00:29 +03:00

query_result_merger.hh

increase the maximum size of query results to 2^64

2020-08-03 17:32:49 +02:00

query-request.hh

increase the maximum size of query results to 2^64

2020-08-03 17:32:49 +02:00

query-result-reader.hh

query-result-reader: order idl implementations correctly

2020-10-03 19:56:29 +03:00

query-result-set.cc

treewide: use query_mutations() instead of mutation::query()

2021-01-22 15:36:37 +02:00

query-result-set.hh

mutation_partition: Debloat header form others

2020-03-18 11:53:36 +02:00

query-result-writer.hh

query-result-writer: fix idl definition order related failures with clang

2020-10-11 17:57:12 +03:00

query-result.hh

increase the maximum size of query results to 2^64

2020-08-03 17:32:49 +02:00

query.cc

increase the maximum size of query results to 2^64

2020-08-03 17:32:49 +02:00

range_tombstone_list.cc

tombstone: use comparison operator instead of ad-hoc compare() function and with_relational_operators

2020-06-02 09:28:52 +03:00

range_tombstone_list.hh

mutation: consume(): add reverse mode

2021-02-03 11:00:47 +02:00

range_tombstone.cc

…

range_tombstone.hh

memtable: fix accounting of managed_bytes in partition_snapshot_accounter

2021-01-15 18:21:13 +01:00

range.hh

range: rename range template family to interval

2020-06-16 13:36:20 +03:00

read_context.hh

row_cache: read_context: use query-request is_single_partition helper

2021-02-17 18:29:39 +02:00

reader_concurrency_semaphore.cc

reader_concurrency_semaphore: register_inactive_read: make noexcept

2021-02-08 22:31:01 +02:00

reader_concurrency_semaphore.hh

reader_concurrency_semaphore: register_inactive_read: make noexcept

2021-02-08 22:31:01 +02:00

reader_permit.hh

reader_concurrency_semaphore: dump permit diagnostics on timeout or queue overflow

2020-10-13 12:32:14 +03:00

README.md

README: fix a dead link for building instructions

2021-02-03 10:59:50 +02:00

real_dirty_memory_accounter.hh

…

release.cc

scylla: Add "--build-mode" command line option

2021-01-20 16:07:29 +02:00

release.hh

scylla: Add "--build-mode" command line option

2021-01-20 16:07:29 +02:00

reversibly_mergeable.hh

…

row_cache.cc

row_cache: scanning_and_populating_reader: add _read_next_partition flag

2021-02-17 19:06:21 +02:00

row_cache.hh

Merge 'managed_bytes: switch to explicit linearization' from Michał Chojnowski

2021-01-18 11:01:28 +02:00

schema_builder.hh

schema_tables: put schema tables on shard 0

2021-01-28 13:28:22 +02:00

schema_fwd.hh

…

schema_mutations.cc

uuid: reduce code dependency on UUID_gen.hh

2021-01-27 20:08:29 +02:00

schema_mutations.hh

schema: include partitioner name in scylla tables mutation

2020-03-15 10:25:20 +01:00

schema_registry.cc

schema_registry: make grace period configurable

2020-09-15 17:53:27 +02:00

schema_registry.hh

schema_registry: make grace period configurable

2020-09-15 17:53:27 +02:00

schema_upgrader.hh

mutation_fragment: add schema and permit

2020-09-28 11:27:23 +03:00

schema.cc

schema_tables: put schema tables on shard 0

2021-01-28 13:28:22 +02:00

schema.hh

column_mapping_entry: extract == and != operators

2020-10-16 14:59:50 +02:00

scylla_post_install.sh

scylla_post_install.sh: generate memory.conf for CentOS7

2020-07-29 14:10:16 +03:00

scylla-gdb.py

scylla-gdb.py: nonwrapping_interval_printer: fix compatibility with 4.2+

2021-02-15 18:14:03 +02:00

SCYLLA-VERSION-GEN

release: prepare for 4.5.dev

2021-01-18 16:05:25 +02:00

seastarx.hh

Everywhere: Explicitly instantiate make_shared

2020-07-21 10:33:49 -07:00

serialization_visitors.hh

…

serializer_impl.hh

serializer: add serializer<lw_shared_ptr<T>> specialization

2021-01-29 01:58:46 +03:00

serializer.cc

serializer: add serializer<lw_shared_ptr<T>> specialization

2021-01-29 01:58:46 +03:00

serializer.hh

serializer: implement FragmentedView for buffer_view

2020-11-27 15:26:13 +01:00

service_permit.hh

Everywhere: Explicitly instantiate make_lw_shared

2020-07-21 10:33:49 -07:00

setup.py

…

supervisor.hh

supervisor: drop unused Upstart code, always use libsystemd

2020-06-10 08:17:35 +03:00

table_helper.cc

migration_manager: drop announce_locally flag

2021-01-03 13:58:09 +02:00

table_helper.hh

table_helper: Require local query processor in calls

2020-10-06 15:44:20 +03:00

table.cc

Merge 'Make commitlog disk limit a hard limit.' from Calle Wilund

2021-02-08 16:44:05 +02:00

test.py

test.py: enable back CQL based tests

2020-11-20 11:45:15 +02:00

timeout_config.cc

config: Place timeout_config() into own .cc file

2020-03-08 17:57:58 +02:00

timeout_config.hh

config: Place timeout_config() into own .cc file

2020-03-08 17:57:58 +02:00

timestamp.hh

add missing include to timestamp.hh

2020-02-05 19:42:18 +02:00

to_string.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

tombstone.hh

tombstone: use comparison operator instead of ad-hoc compare() function and with_relational_operators

2020-06-02 09:28:52 +03:00

tox.ini

…

types.cc

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

types.hh

imr: switch back to open-coded description of structures

2021-02-16 23:43:07 +01:00

ubsan-suppressions.supp

suppress ubsan error in boost::deque::clear()

2020-11-09 11:25:19 +02:00

unimplemented.cc

everywhere: Insert space after switch

2020-08-18 14:31:04 +03:00

unimplemented.hh

…

user_types_metadata.hh

…

validation.cc

validation: Remove get_local_storage_proxy call

2020-12-11 18:52:42 +03:00

validation.hh

validation: Remove get_local_storage_proxy call

2020-12-11 18:52:42 +03:00

version.hh

…

view_info.hh

db: view: Refactor view_info::initialize_base_dependent_fields()

2020-08-20 14:53:07 +02:00

vint-serialization.cc

…

vint-serialization.hh

vint-serialization: Reference the correct spec

2021-01-05 18:54:09 +02:00

xx_hasher.hh

Merge "Don't expose exact collection from range_tombstone_list" from Pavel E

2020-09-15 10:09:15 +02:00

zstd.cc

build: remove zstd submodule

2020-06-11 17:12:49 +03:00

README.md

Scylla

What is Scylla?

Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB. Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.

For more information, please see the ScyllaDB web site.

Build Prerequisites

Scylla is fairly fussy about its build environment, requiring a very recent C++20 compiler and recent versions of many libraries to build. The document HACKING.md includes detailed information on building and developing Scylla, but to get Scylla building quickly on (almost) any build machine, Scylla offers a frozen toolchain. This is a pre-configured Docker image which includes recent versions of all the required compilers, libraries and build tools. Using the frozen toolchain allows you to avoid changing anything on your build machine to meet Scylla's requirements: you just need to meet the frozen toolchain's prerequisites (mostly, Docker or Podman being available).

Building Scylla

Building Scylla with the frozen toolchain dbuild is as easy as:

$ git submodule update --init --force --recursive
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla

For further information, please see:

Developer documentation for more information on building Scylla.
Build documentation on how to build Scylla binaries, tests, and packages.
Docker image build documentation for information on how to build Docker images.

Running Scylla

To start Scylla server, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1

This will start a Scylla node with one CPU core allocated to it and data files stored in the tmp directory. The --developer-mode option is needed to disable the various checks Scylla performs at startup to ensure the machine is configured for maximum performance (these checks are not relevant on development workstations). Please note that you need to run Scylla with dbuild if you built it with the frozen toolchain.

For more run options, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --help

Testing

See test.py manual.

Scylla APIs and compatibility

By default, Scylla is compatible with Apache Cassandra and its APIs - CQL and Thrift. There is also support for the API of Amazon DynamoDB™, which needs to be enabled and configured in order to be used. For more information on how to enable the DynamoDB™ API in Scylla, and the current compatibility of this feature as well as Scylla-specific extensions, see Alternator and Getting started with Alternator.

Documentation

Documentation can be found here. Seastar documentation can be found here. User documentation can be found here.

Training

Training material and online courses can be found at Scylla University. The courses are free, self-paced and include hands-on examples. They cover a variety of topics including Scylla data modeling, administration, architecture, basic NoSQL concepts, using drivers for application development, Scylla setup, failover, compactions, multi-datacenters and how Scylla integrates with third-party applications.

Contributing to Scylla

If you want to report a bug or submit a pull request or a patch, please read the contribution guidelines.

If you are a developer working on Scylla, please read the developer guidelines.

Contact

The users mailing list and Slack channel are for users to discuss configuration, management, and operations of open-source ScyllaDB.
The developers mailing list is for developers and people interested in following the development of ScyllaDB to discuss technical topics.

Languages

C++ 72.2%

Python 26.6%

CMake 0.3%

GAP 0.3%

Shell 0.3%