mirror of https://github.com/scylladb/scylladb.git synced 2026-05-30 19:46:48 +00:00

Konstantin Osipov 2b8ce83eea lists: use query timestamp for list cell values during append

Scylla list cells are represented internally as a map of
timeuuid => value. To append a new value to a list
the coordinator generates a timeuuid reflecting the current time as key
and adds a value to the map using this key.

Before this patch, Scylla always generated a timeuuid for a new
value, even if the query had a user supplied or LWT timestamp.
This could break LWT linearizability. User supplied timestamps were
ignored.

This is reported as https://github.com/scylladb/scylla/issues/7611

A statement that appended multiple values to a list, or a BATCH,
generated its own microsecond-resolution timeuuid for each value:

BEGIN BATCH
  UPDATE ... SET a = a + [3]
  UPDATE ... SET a = a + [4]
APPLY BATCH

UPDATE ... SET a = a + [3, 4]

To fix the bug, it's necessary to preserve monotonicity of
timeuuids within a batch or multi-value append, while making sure
they all use the microsecond time set by LWT or by the user.

To explain the fix, it's first necessary to recall the structure
of time-based UUIDs:

60 bits: time since the start of the Gregorian epoch (15 October 1582),
         represented in 100-nanosecond units
4 bits:  version
14 bits: clock sequence, a random number to avoid duplicates
         in case system clock is adjusted
2 bits:  type
48 bits: MAC address (or other hardware address)
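
As a rough illustration of the layout above, the following sketch splits a version 1 UUID, held as two 64-bit halves, into its fields. The type and function names (`uuid_v1_fields`, `unpack_v1`, `check_unpack_example`) are hypothetical and are not Scylla's own UUID classes; the bit positions follow RFC 4122, where the 60-bit time is stored as time_low, time_mid and time_hi.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type for illustration; not Scylla's utils::UUID.
struct uuid_v1_fields {
    uint64_t timestamp;  // 60 bits, 100 ns units since 15 October 1582
    uint8_t  version;    // 4 bits
    uint16_t clock_seq;  // 14 bits
    uint8_t  variant;    // 2 bits (the "type" field in the list above)
    uint64_t node;       // 48 bits
};

// msb holds time_low (32 bits), time_mid (16), version (4), time_hi (12).
// lsb holds variant (2), clock_seq (14), node (48).
inline uuid_v1_fields unpack_v1(uint64_t msb, uint64_t lsb) {
    const uint64_t time_low = msb >> 32;
    const uint64_t time_mid = (msb >> 16) & 0xFFFF;
    const uint64_t time_hi  = msb & 0x0FFF;
    uuid_v1_fields f;
    f.timestamp = (time_hi << 48) | (time_mid << 32) | time_low;
    f.version   = static_cast<uint8_t>((msb >> 12) & 0xF);
    f.clock_seq = static_cast<uint16_t>((lsb >> 48) & 0x3FFF);
    f.variant   = static_cast<uint8_t>(lsb >> 62);
    f.node      = lsb & 0xFFFFFFFFFFFFULL;
    return f;
}

// Usage check on a made-up UUID: 12345678-90ab-1cde-8abc-0123456789ab.
inline void check_unpack_example() {
    const uuid_v1_fields f = unpack_v1(0x1234567890ab1cdeULL, 0x8abc0123456789abULL);
    assert(f.timestamp == 0x0cde90ab12345678ULL);
    assert(f.version == 1);
    assert(f.clock_seq == 0x0abc);
    assert(f.variant == 2);
    assert(f.node == 0x0123456789abULL);
}
```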

The purpose of the clockseq bits, as defined in
https://tools.ietf.org/html/rfc4122#section-4.1.5,
is to reduce the probability of UUID collision in case the clock
goes back in time or the node id changes. The implementation should
reset it whenever one of these events may occur.

Since LWT microsecond time is guaranteed to be
unique by Paxos, the RFC's provisions for the clockseq and MAC
slots become excessive.

The fix thus changes timeuuid slot content in the following way:
- time component now contains the same microsecond time for all
  values of a statement or a batch. The time is unique and monotonic in
  case of LWT. Otherwise it's almost always monotonic, but may not be
  unique if two timestamps are created on different coordinators.
- clockseq component is used to store a sequence number which is
  unique and monotonic for all values within the statement/batch.
- to protect against time back-adjustments and duplicates
  if time is auto-generated, MAC component contains a random (spoof)
  MAC address, re-created on each restart. The address is different
  at each shard.

The change is made for all sources of time: user, generated, LWT.
Conditioning the list key generation algorithm on the source of
time would unnecessarily complicate the code without increasing
the quality (uniqueness) of the created list keys.

Since 14 bits of clockseq provide us with only 16383 distinct slots
per statement or batch, 3 extra bits in nanosecond part of the time
are used to extend the range to 131071 values per statement/batch.
If the range is exceeded, an exception is produced.
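
The range extension can be pictured with a small sketch. It assumes, purely for illustration, that the upper 3 bits of a per-statement sequence number go into the sub-microsecond part of the time and the lower 14 bits go into clockseq; the actual split in the patch may differ, and all names here (`make_key_parts`, `list_key_parts`, `max_seq`, `check_key_parts_example`) are hypothetical. One microsecond is 10 units of 100 ns, so the borrowed values 0 to 7 never spill into the next microsecond.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

constexpr unsigned clockseq_bits = 14;  // clock sequence field of the UUID
constexpr unsigned extra_bits = 3;      // borrowed from the sub-microsecond part
constexpr uint32_t max_seq = (1u << (clockseq_bits + extra_bits)) - 1;  // 131071

struct list_key_parts {
    uint64_t time_100ns;  // statement time in 100 ns units plus the borrowed bits
    uint16_t clock_seq;   // low 14 bits of the sequence number
};

// micros is the single statement/batch timestamp; seq numbers the values in it.
inline list_key_parts make_key_parts(uint64_t micros, uint32_t seq) {
    if (seq > max_seq) {
        throw std::out_of_range("too many list values in one statement or batch");
    }
    const uint64_t sub_usec = seq >> clockseq_bits;  // 0..7, below the 10 units in 1 us
    const uint32_t low = seq & ((1u << clockseq_bits) - 1);
    return list_key_parts{micros * 10 + sub_usec, static_cast<uint16_t>(low)};
}

// Usage check: sequence 16384 rolls over into the first borrowed time bit.
inline void check_key_parts_example() {
    const list_key_parts first = make_key_parts(1000, 0);
    const list_key_parts wrapped = make_key_parts(1000, 16384);
    assert(first.time_100ns == 10000 && first.clock_seq == 0);
    assert(wrapped.time_100ns == 10001 && wrapped.clock_seq == 0);
}
```

Because the high bits land in the time field, which is compared first, keys produced this way stay ordered by sequence number within one statement.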

A twist on the use of clockseq to extend timeuuid uniqueness is
that Scylla, like Cassandra, uses a signed-byte (int8) comparison on
the lower bits of timeuuid for ordering. The patch takes this into
account and sign-complements the clockseq value to make it monotonic
according to the legacy compare function.
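
To see why a signed-byte comparison needs this adjustment, consider the sketch below. It is a simplified model, not the patch's code: it compares a bare 2-byte counter and ignores the variant bits that share the high byte in a real UUID, and it assumes the adjustment amounts to flipping the sign bit of each byte. The names `signed_byte_compare`, `encode_plain`, `encode_flipped` and `check_ordering_example` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using seq_bytes = std::array<uint8_t, 2>;

// Lexicographic compare that treats every byte as a signed int8 value.
inline int signed_byte_compare(const seq_bytes& a, const seq_bytes& b) {
    for (std::size_t i = 0; i < a.size(); ++i) {
        const int x = static_cast<int8_t>(a[i]);
        const int y = static_cast<int8_t>(b[i]);
        if (x != y) {
            return x < y ? -1 : 1;
        }
    }
    return 0;
}

// Big-endian encoding of the counter with no adjustment.
inline seq_bytes encode_plain(uint16_t seq) {
    return seq_bytes{{static_cast<uint8_t>(seq >> 8), static_cast<uint8_t>(seq & 0xFF)}};
}

// Same encoding with the sign bit of each byte flipped.
inline seq_bytes encode_flipped(uint16_t seq) {
    return seq_bytes{{static_cast<uint8_t>((seq >> 8) ^ 0x80),
                      static_cast<uint8_t>((seq & 0xFF) ^ 0x80)}};
}

inline void check_ordering_example() {
    // Unadjusted: 127 (0x7F) sorts after 128 (0x80, read as -128).
    assert(signed_byte_compare(encode_plain(127), encode_plain(128)) > 0);
    // Flipped: every 14-bit value sorts before its successor.
    for (uint16_t s = 0; s < 16383; ++s) {
        const uint16_t next = static_cast<uint16_t>(s + 1);
        assert(signed_byte_compare(encode_flipped(s), encode_flipped(next)) < 0);
    }
}
```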

Fixes #7611

test: unit (dev)

2021-01-21 13:03:59 +03:00

.github

docs: added multiversion_regex_builder

2021-01-13 11:07:29 +02:00

abseil @ 1e3d25b265

Update abseil submodule from upstream

2020-10-25 12:51:40 +02:00

alternator

alternator: drop unneeded sstring creation

2021-01-04 09:47:01 +01:00

api

repair: Make removenode safe by default

2020-12-10 10:14:39 +02:00

auth

migration_manager: drop announce_locally flag

2021-01-03 13:58:09 +02:00

cdc

cdc: remove _token_metadata from db_context

2020-12-13 18:32:17 +02:00

conf

db: add TransitionalAuthorizer and TransitionalAuthenticator...

2020-11-09 10:51:54 +01:00

cql3

lists: use query timestamp for list cell values during append

2021-01-21 13:03:59 +03:00

data

data/cell: fix value_writer use before definition

2020-10-12 13:41:09 +03:00

Merge "Remove proxy from size-estimates reader" from Pavel E

2021-01-05 11:28:09 +02:00

debug

…

dht

token_metadata: add clear_gently

2020-12-22 11:22:21 +02:00

dist

Revert "dist/docker: Remove 'epel-release' from Docker image"

2021-01-02 12:49:12 +02:00

docs

docs: update url

2021-01-13 11:07:29 +02:00

exceptions

cql_metrics: Add metrics for CQL errors

2020-11-30 12:18:37 +02:00

gms

gossip: Added SNITCH_NAME to application_state

2020-12-09 15:45:25 +02:00

idl

idl: change the type of mutation_partition_view::rows() to a chunked_vector

2021-01-13 04:25:53 +01:00

imr

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

index

index-manager: Move feature evaluation one level up

2020-12-11 21:14:12 +03:00

interface

thrift: switch csharp backend to netstd

2020-06-23 19:40:18 +03:00

libdeflate @ e7e54eab42

Update libdeflate submodule

2018-12-03 11:18:02 +02:00

licenses

Add abseil as a submodule

2020-06-14 08:18:37 -07:00

locator

token_metadata: add clear_gently

2020-12-22 11:22:21 +02:00

message

repair: Make removenode safe by default

2020-12-10 10:14:39 +02:00

mutation_writer

mutation_writer: pass exceptions through feed_writer

2020-12-16 13:18:19 +02:00

raft

Revert "Revert "Merge "raft: fix replication if existing log on leader" from Gleb""

2020-12-08 19:19:55 +02:00

redis

redis: implement parse error, reply error message correctly

2021-01-07 13:22:20 +02:00

reloc

reloc: Remove "build_reloc.sh" script as obsolete

2020-11-20 22:41:26 +02:00

repair

token_metadata: add clear_gently

2020-12-22 11:22:21 +02:00

scripts

dist: add node_exporter to scylla-server package

2020-12-24 11:44:13 +02:00

seastar @ a287bb1a39

Update seastar submodule

2021-01-11 20:38:59 +02:00

service

migration_manager: drop announce_locally flag

2021-01-03 13:58:09 +02:00

sstables

sstable_writer: add validation

2021-01-11 09:12:56 +02:00

streaming

flat_mutation_reader: extract fragment stream validator into its own header

2021-01-11 08:07:42 +02:00

swagger-ui @ 12f1da1082

…

test

test: add a CQL test for list append/prepend operations

2021-01-18 17:32:00 +03:00

thrift

migration_manager: drop announce_locally flag

2021-01-03 13:58:09 +02:00

tools

Update tools/jmx submodule

2020-12-28 21:19:04 +02:00

tracing

tracing: Keep qp anchor on backend

2020-10-06 15:45:19 +03:00

transport

transport/server: Code cleanups

2020-12-14 12:48:05 +02:00

types

types: collection: add an optimization for single-fragment buffers in deserialize

2020-12-04 09:21:05 +01:00

unified

build: compress unified package faster

2020-11-23 00:31:04 +02:00

utils

lists: use query timestamp for list cell values during append

2021-01-21 13:03:59 +03:00

.dockerignore

.dockerignore: add testlog

2020-02-07 08:59:39 +01:00

.gitattributes

…

.gitignore

docs: added theme

2020-12-03 17:37:18 +01:00

.gitmodules

scylla-python3: move scylla-python3 to separated repository

2020-08-18 09:34:08 +03:00

.gitorderfile

…

absl-flat_hash_map.cc

Add absl wrapper headers

2020-06-14 08:18:39 -07:00

absl-flat_hash_map.hh

Add absl wrapper headers

2020-06-14 08:18:39 -07:00

atomic_cell_hash.hh

collection_mutation: easier (de)serialization of collection_mutation(s).

2019-10-25 10:42:58 +02:00

atomic_cell_or_collection.hh

atomic_cell: move collection_mutation(_view) to a new file.

2019-10-25 10:19:45 +02:00

atomic_cell.cc

data/cell: don't overshoot target allocation sizes

2020-09-14 14:21:46 +03:00

atomic_cell.hh

atomic_cell.hh: forward-declare atomic_cell_or_collection

2020-09-21 16:32:53 +03:00

backlog_controller.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

bytes_ostream.hh

bytes_ostream: Remove std::iterator from fragment_iterator

2020-11-17 16:53:20 +01:00

bytes.cc

mp_row_consumer: Provide hex-formatting wrapper for bytes_view

2020-08-26 20:44:11 +03:00

bytes.hh

bytes: define contructor for fmt_hex

2020-09-21 16:32:53 +03:00

cache_flat_mutation_reader.hh

mutation-partition: Construct rows_entry directly from clustering_row

2020-12-24 18:13:44 +02:00

cache_temperature.hh

Move cache_temperature into its own header

2018-12-12 16:03:45 +02:00

caching_options.hh

treewide: replace libjsoncpp usage with rjson

2020-07-03 10:27:23 +02:00

canonical_mutation.cc

everywhere: Use uninitialized_string instead of sstring::initialized_later

2020-03-10 13:17:49 -07:00

canonical_mutation.hh

header: De-bloat schema.hh

2020-03-03 11:34:00 +01:00

cartesian_product.hh

cartesian_product: Remove std::iterator from iterator

2020-11-17 16:53:20 +01:00

cell_locking.hh

mutation_partition: make static_row optional to reduce memory footprint

2019-10-15 15:42:05 +03:00

checked-file-impl.hh

treewide: replace calls to engine().some_api() with some_api()

2020-04-05 12:46:04 +03:00

clocks-impl.cc

clocks-impl: switch to thread-safe time conversion

2020-05-04 14:11:38 +03:00

clocks-impl.hh

…

clustering_bounds_comparator.hh

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

clustering_interval_set.hh

clustering_interval_set: Remove std::iterator from position_range_iterator

2020-11-17 16:53:20 +01:00

clustering_key_filter.hh

header: De-bloat schema.hh

2020-03-03 11:34:00 +01:00

clustering_ranges_walker.hh

…

CMakeLists.txt

cmake: redesign scylla's CMakeLists.txt to finally allow full-fledged building

2020-11-10 10:34:27 +02:00

collection_mutation.cc

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

collection_mutation.hh

types: switch serialize_for_cql from bytes to bytes_ostream

2020-12-07 17:55:36 +01:00

column_computation.hh

column_computation: add token_column_computation

2020-11-04 12:02:42 +01:00

combine.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

compaction_garbage_collector.hh

header: De-bloat schema.hh

2020-03-03 11:34:00 +01:00

compaction_strategy_type.hh

compaction_strategy: add method to reshape SSTables

2020-06-18 09:37:18 -04:00

compaction_strategy.hh

distributed_loader: reshard before the node is made online

2020-06-18 09:37:18 -04:00

compatible_ring_position.hh

Introduce compatible_ring_position and compatible_ring_position_or_view

2019-06-23 16:29:12 +03:00

compound_compat.hh

Merge ' types: add constraint on lexicographical_tri_compare()' from Avi Kivity

2020-12-09 18:48:01 +01:00

compound.hh

Merge ' types: add constraint on lexicographical_tri_compare()' from Avi Kivity

2020-12-09 18:48:01 +01:00

compress.cc

codebase wide: replace count with contains

2020-08-15 20:26:02 +03:00

compress.hh

Replace std::experimental types with C++17 std version.

2019-01-08 13:16:36 +02:00

concrete_types.hh

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

configure.py

configure: add utf8_test to pure_boost_tests

2021-01-13 11:07:29 +02:00

connection_notifier.cc

code: Use qctx::evecute_cql methods, not global ones

2020-11-19 18:39:05 +03:00

connection_notifier.hh

code: Use qctx::evecute_cql methods, not global ones

2020-11-19 18:39:05 +03:00

CONTRIBUTING.md

docs: add organization

2020-12-22 15:33:31 +02:00

converting_mutation_partition_applier.cc

converting_mutation_partition_applier: move to .cc file

2020-03-04 12:42:57 +02:00

converting_mutation_partition_applier.hh

converting_mutation_partition_applier: move to .cc file

2020-03-04 12:42:57 +02:00

counters.cc

counters: Drop call to get_local_storage_service and related

2020-12-04 16:31:12 +03:00

counters.hh

counters: Drop call to get_local_storage_service and related

2020-12-04 16:31:12 +03:00

cql_serialization_format.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

database_fwd.hh

database_fwd.hh: add keyspace fwd declaration

2018-12-14 08:03:57 +02:00

database.cc

database: migrate find_keyspace to string views

2021-01-04 09:47:01 +01:00

database.hh

Merge "Wire interposer consumer for memtable flush" from Raphael

2021-01-13 11:07:29 +02:00

db_clock.hh

clocks: add printing functions

2020-01-30 11:10:08 +01:00

debug.hh

…

digest_algorithm.hh

digest: add null values to row digest

2020-09-10 13:16:44 +02:00

digester.hh

Merge "Don't expose exact collection from range_tombstone_list" from Pavel E

2020-09-15 10:09:15 +02:00

dirty_memory_manager.hh

commitlog+region_group: timeout exceptions with names

2019-12-03 19:07:19 +01:00

distributed_loader.cc

system_keyspace: migrate helper functions to string_view

2021-01-04 09:47:01 +01:00

distributed_loader.hh

distributed_loader: remove declaration of inexistent do_populate_column_family()

2020-06-29 14:23:42 -03:00

Doxyfile

…

duration.cc

duration: adjust for C++20 char8_t type

2020-05-12 20:40:30 +02:00

duration.hh

Replace std::experimental types with C++17 std version.

2019-01-08 13:16:36 +02:00

encoding_stats.hh

encoding_stats.hh: add missing include

2019-05-14 13:27:30 +03:00

enum_set.hh

lwt: ensure enum_set::of is constexpr.

2019-10-01 19:45:56 +02:00

fix_system_distributed_tables.py

tracing: add username to the session table

2020-10-01 04:46:40 +02:00

flat_mutation_reader.cc

mutation_fragment_stream_validator: make it easier to validate concrete fragment types

2021-01-11 08:07:42 +02:00

flat_mutation_reader.hh

flat_mutation_reader: extract fragment stream validator into its own header

2021-01-11 08:07:42 +02:00

frozen_mutation.cc

frozen_mutation: add partition context to errors coming from deserializing

2020-12-02 15:08:49 +02:00

frozen_mutation.hh

Merge "lwt: store column_mapping's for each table schema version upon a DDL change" from Pavel Solodovnikov

2020-10-15 20:48:29 +02:00

frozen_schema.cc

frozen_schema: order idl implementations correctly

2020-10-03 19:56:28 +03:00

frozen_schema.hh

header: De-bloat schema.hh

2020-03-03 11:34:00 +01:00

gc_clock.hh

gc_clock, serialization: define new serialization for gc_clock::duration (aka TTLs)

2019-10-23 18:36:33 +03:00

gen_segmented_compress_params.py

gen_segmented_compress_params.py: add encoding comment

2018-11-28 23:59:18 +01:00

HACKING.md

README: better explanation of dependencies and build

2020-06-16 13:26:04 +02:00

hashers.cc

hashers: convert illegal contraint to static_assert

2020-09-21 16:32:10 +03:00

hashers.hh

hashers: Mark hash updates noexcept

2020-09-07 23:17:41 +03:00

hashing_partition_visitor.hh

…

hashing.hh

hashers: Mark hash updates noexcept

2020-09-07 23:17:41 +03:00

idl-compiler.py

idl-compiler: allow fields of type utils::chunked_vector

2021-01-13 04:09:18 +01:00

init.cc

messaging_service: Move initialization to messaging/

2020-08-19 13:08:12 +03:00

init.hh

messaging_service: Move initialization to messaging/

2020-08-19 13:08:12 +03:00

install-dependencies.sh

tools: toolchain: add node_exporter

2020-12-14 20:34:17 +02:00

install.sh

install.sh: switch to use realpath for EnvironmentFile

2021-01-04 12:45:17 +02:00

interval.hh

range: rename range template family to interval

2020-06-16 13:36:20 +03:00

intrusive_set_external_comparator.hh

intrusive_set_external_comparator: make iterator nothrow move constructible

2018-12-05 20:07:29 +00:00

keys.cc

everywhere: Insert space after switch

2020-08-18 14:31:04 +03:00

keys.hh

partition_key_view: add validate method

2020-05-12 12:07:00 +03:00

LICENSE.AGPL

…

lister.cc

codebase wide: replace count with contains

2020-08-15 20:26:02 +03:00

lister.hh

Update seastar submodule

2020-08-19 17:18:57 +03:00

log.hh

…

lua.cc

lua: expect overflow when selecting lua types

2020-10-11 15:38:07 +03:00

lua.hh

lua: Handle nil returns correctly

2020-01-29 14:05:01 -08:00

main.cc

alternator: make default timeout configurable

2020-12-09 14:30:43 +01:00

map_difference.hh

…

marshal_exception.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

memtable-sstable.hh

table: Add write_memtable_to_sstable variant which accepts flat_mutation_reader

2021-01-04 16:23:00 -03:00

memtable.cc

memtable: Track min timestamp

2021-01-04 13:24:43 -03:00

memtable.hh

memtable: Track min timestamp

2021-01-04 13:24:43 -03:00

multishard_mutation_query.cc

multishard_mutation_query: Propagate mutation_reader::forwarding flag

2020-11-02 15:24:36 +02:00

multishard_mutation_query.hh

storage_proxy: use read_command::max_result_size to pass max result size around

2020-07-28 18:00:29 +03:00

mutation_cleaner.hh

memtables: add partition/row hit/miss counters

2019-11-12 13:35:41 +01:00

mutation_compactor.hh

increase the maximum size of query results to 2^64

2020-08-03 17:32:49 +02:00

mutation_fragment_stream_validator.hh

mutation_fragment_stream_validator: make it easier to validate concrete fragment types

2021-01-11 08:07:42 +02:00

mutation_fragment.cc

range_tombstone: Remove unused trim-front arg from .apply()

2020-11-06 15:13:05 +03:00

mutation_fragment.hh

mutation-partition: Construct rows_entry directly from clustering_row

2020-12-24 18:13:44 +02:00

mutation_partition_serializer.cc

sstables: drop checks for correct counter order support

2020-09-14 12:05:11 +02:00

mutation_partition_serializer.hh

Replace std::experimental types with C++17 std version.

2019-01-08 13:16:36 +02:00

mutation_partition_view.cc

mutation_fragment: add schema and permit

2020-09-28 11:27:23 +03:00

mutation_partition_view.hh

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

mutation_partition_visitor.hh

atomic_cell: move collection_mutation(_view) to a new file.

2019-10-25 10:19:45 +02:00

mutation_partition.cc

Merge "frozen_mutation: better diagnostics for out-of-order and duplicate rows" from Botond

2021-01-10 19:30:12 +02:00

mutation_partition.hh

Merge "frozen_mutation: better diagnostics for out-of-order and duplicate rows" from Botond

2021-01-10 19:30:12 +02:00

mutation_query.cc

increase the maximum size of query results to 2^64

2020-08-03 17:32:49 +02:00

mutation_query.hh

mutation_query: mutation_query_stage: add get_stats()

2020-11-17 15:13:21 +02:00

mutation_reader.cc

clustering_order_reader_merger: fix the 0 readers case

2020-12-18 12:30:40 +01:00

mutation_reader.hh

Revert "Revert "Merge "raft: fix replication if existing log on leader" from Gleb""

2020-12-08 19:19:55 +02:00

mutation_rebuilder.hh

…

mutation_source_metadata.hh

Add mutation_source_metadata

2019-06-26 15:45:59 +03:00

mutation.cc

mutation: Improve log print of mutations

2020-09-04 16:33:25 +02:00

mutation.hh

increase the maximum size of query results to 2^64

2020-08-03 17:32:49 +02:00

noexcept_traits.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

NOTICE.txt

tests: port Cassandra CQL tests to cql repl

2020-03-26 15:19:38 +02:00

ORIGIN

…

partition_builder.hh

partition_builder: accept_row(): use append_clustering_row()

2020-12-02 15:08:49 +02:00

partition_range_compat.hh

Replace std::experimental types with C++17 std version.

2019-01-08 13:16:36 +02:00

partition_slice_builder.cc

partition_slice: use small_vector for column_ids

2018-12-06 14:21:04 +00:00

partition_slice_builder.hh

partition_slice_builder: add with_option()

2020-07-28 18:00:29 +03:00

partition_snapshot_reader.hh

range_tombstone: Remove unused trim-front arg from .apply()

2020-11-06 15:13:05 +03:00

partition_snapshot_row_cursor.hh

partition_snapshot_row_cursor: row(): return clustering_row instead of mutation_fragment

2020-09-28 10:53:56 +03:00

partition_version_list.hh

…

partition_version.cc

partition_version: Change range_tombstones() to return chunked_vector

2020-10-26 11:54:42 +02:00

partition_version.hh

partition_version: Change range_tombstones() to return chunked_vector

2020-10-26 11:54:42 +02:00

position_in_partition.hh

Revert "Revert "Merge "raft: fix replication if existing log on leader" from Gleb""

2020-12-08 19:19:55 +02:00

querier.cc

querier: move common stuff into querier_base

2020-06-03 18:45:33 +03:00

querier.hh

querier_cache: use the reader permit for memory accounting

2020-10-06 08:22:56 +03:00

query_class_config.hh

query: query_class_config: use max_result_size for the max_memory_for_unlimited_query field

2020-07-28 18:00:29 +03:00

query_result_merger.hh

increase the maximum size of query results to 2^64

2020-08-03 17:32:49 +02:00

query-request.hh

increase the maximum size of query results to 2^64

2020-08-03 17:32:49 +02:00

query-result-reader.hh

query-result-reader: order idl implementations correctly

2020-10-03 19:56:29 +03:00

query-result-set.cc

query-result-set: don't linearize in result_set_builder::deserialize

2020-12-04 09:19:39 +01:00

query-result-set.hh

mutation_partition: Debloat header form others

2020-03-18 11:53:36 +02:00

query-result-writer.hh

query-result-writer: fix idl definition order related failures with clang

2020-10-11 17:57:12 +03:00

query-result.hh

increase the maximum size of query results to 2^64

2020-08-03 17:32:49 +02:00

query.cc

increase the maximum size of query results to 2^64

2020-08-03 17:32:49 +02:00

range_tombstone_list.cc

tombstone: use comparison operator instead of ad-hoc compare() function and with_relational_operators

2020-06-02 09:28:52 +03:00

range_tombstone_list.hh

range_tombstone_list: Do not expose internal collection

2020-09-07 23:17:41 +03:00

range_tombstone.cc

Replace std::experimental types with C++17 std version.

2019-01-08 13:16:36 +02:00

range_tombstone.hh

range_tombstone: Remove unused schema arg from .set_start

2020-11-06 15:13:05 +03:00

range.hh

range: rename range template family to interval

2020-06-16 13:36:20 +03:00

read_context.hh

row_cache: pass a valid permit to underlying read

2020-05-28 11:34:35 +03:00

reader_concurrency_semaphore.cc

reader_concurrency_semaphore: rate-limit diagnostics messages

2020-11-17 11:57:51 +02:00

reader_concurrency_semaphore.hh

reader_concurrency_semaphore: add is_unlimited()

2020-11-17 15:13:21 +02:00

reader_permit.hh

reader_concurrency_semaphore: dump permit diagnostics on timeout or queue overflow

2020-10-13 12:32:14 +03:00

README.md

docs: update url

2021-01-13 11:07:29 +02:00

real_dirty_memory_accounter.hh

…

release.cc

…

release.hh

…

reversibly_mergeable.hh

…

row_cache.cc

row_cache: linearize key in cache_entry::do_read()

2021-01-13 11:07:29 +02:00

row_cache.hh

row_cache: allow external updater to decouple preparation from execution

2020-12-28 13:17:45 -03:00

schema_builder.hh

schema: Pass an rvalue to set_compaction_strategy_options

2020-08-19 14:02:35 -07:00

schema_fwd.hh

collection_type_impl::mutation: compact_and_expire() add collector parameter

2019-07-15 17:37:55 +03:00

schema_mutations.cc

schema: include partitioner name in scylla tables mutation

2020-03-15 10:25:20 +01:00

schema_mutations.hh

schema: include partitioner name in scylla tables mutation

2020-03-15 10:25:20 +01:00

schema_registry.cc

schema_registry: make grace period configurable

2020-09-15 17:53:27 +02:00

schema_registry.hh

schema_registry: make grace period configurable

2020-09-15 17:53:27 +02:00

schema_upgrader.hh

mutation_fragment: add schema and permit

2020-09-28 11:27:23 +03:00

schema.cc

schema.cc/describe: fix invalid compaction options in schema

2020-12-06 17:40:05 +02:00

schema.hh

column_mapping_entry: extract == and != operators

2020-10-16 14:59:50 +02:00

scylla_post_install.sh

scylla_post_install.sh: generate memory.conf for CentOS7

2020-07-29 14:10:16 +03:00

scylla-gdb.py

database, streaming: remove remnants of memtable-base streaming

2020-11-16 14:32:19 +01:00

SCYLLA-VERSION-GEN

SCYLLA-VERSION-GEN: change master version to 4.4.dev

2020-11-03 13:42:54 +02:00

seastarx.hh

Everywhere: Explicitly instantiate make_shared

2020-07-21 10:33:49 -07:00

serialization_visitors.hh

…

serializer_impl.hh

Merge 'Reinstate [[nodiscard]] support' from Avi Kivity

2020-12-12 09:54:05 +02:00

serializer.hh

serializer: implement FragmentedView for buffer_view

2020-11-27 15:26:13 +01:00

service_permit.hh

Everywhere: Explicitly instantiate make_lw_shared

2020-07-21 10:33:49 -07:00

setup.py

setup.py: add python3 classifiers

2018-11-28 23:54:03 +01:00

supervisor.hh

supervisor: drop unused Upstart code, always use libsystemd

2020-06-10 08:17:35 +03:00

table_helper.cc

migration_manager: drop announce_locally flag

2021-01-03 13:58:09 +02:00

table_helper.hh

table_helper: Require local query processor in calls

2020-10-06 15:44:20 +03:00

table.cc

table: Wire interposer consumer for memtable flush

2021-01-04 16:26:07 -03:00

test.py

test.py: enable back CQL based tests

2020-11-20 11:45:15 +02:00

timeout_config.cc

config: Place timeout_config() into own .cc file

2020-03-08 17:57:58 +02:00

timeout_config.hh

config: Place timeout_config() into own .cc file

2020-03-08 17:57:58 +02:00

timestamp.hh

add missing include to timestamp.hh

2020-02-05 19:42:18 +02:00

to_string.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

tombstone.hh

tombstone: use comparison operator instead of ad-hoc compare() function and with_relational_operators

2020-06-02 09:28:52 +03:00

tox.ini

…

types.cc

Merge 'types: don't linearize in validate()' from Michał Chojnowski

2020-12-11 17:33:59 +02:00

types.hh

cql3: Use correct comparator in timeuuid min/max

2021-01-13 11:07:29 +02:00

ubsan-suppressions.supp

suppress ubsan error in boost::deque::clear()

2020-11-09 11:25:19 +02:00

unimplemented.cc

everywhere: Insert space after switch

2020-08-18 14:31:04 +03:00

unimplemented.hh

…

user_types_metadata.hh

user_types_metadata: don't implement enable_lw_shared_from_this

2019-12-11 10:44:40 -08:00

validation.cc

validation: Remove get_local_storage_proxy call

2020-12-11 18:52:42 +03:00

validation.hh

validation: Remove get_local_storage_proxy call

2020-12-11 18:52:42 +03:00

version.hh

…

view_info.hh

db: view: Refactor view_info::initialize_base_dependent_fields()

2020-08-20 14:53:07 +02:00

vint-serialization.cc

vint: optimise deserialisation routine

2019-03-14 13:37:06 +00:00

vint-serialization.hh

vint-serialization: Reference the correct spec

2021-01-05 18:54:09 +02:00

xx_hasher.hh

Merge "Don't expose exact collection from range_tombstone_list" from Pavel E

2020-09-15 10:09:15 +02:00

zstd.cc

build: remove zstd submodule

2020-06-11 17:12:49 +03:00

README.md

Scylla

What is Scylla?

Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB. Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.

For more information, please see the ScyllaDB web site.

Build Prerequisites

Scylla is fairly fussy about its build environment, requiring a very recent C++20 compiler and recent versions of many libraries to build. The document HACKING.md includes detailed information on building and developing Scylla, but to get Scylla building quickly on (almost) any build machine, Scylla offers a frozen toolchain. This is a pre-configured Docker image that includes recent versions of all the required compilers, libraries and build tools. Using the frozen toolchain lets you avoid changing anything on your build machine to meet Scylla's requirements: you just need to meet the frozen toolchain's prerequisites (mostly, Docker or Podman being available).

Building Scylla

Building Scylla with the frozen toolchain dbuild is as easy as:

$ git submodule update --init --force --recursive
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla

For further information, please see:

Developer documentation for more information on building Scylla.
Build documentation on how to build Scylla binaries, tests, and packages.
Docker image build documentation for information on how to build Docker images.

Running Scylla

To start Scylla server, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1

This will start a Scylla node with one CPU core allocated to it and data files stored in the tmp directory. The --developer-mode option is needed to disable the various checks Scylla performs at startup to ensure the machine is configured for maximum performance (these checks are not relevant on development workstations). Please note that you need to run Scylla with dbuild if you built it with the frozen toolchain.

For more run options, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --help

Testing

See test.py manual.

Scylla APIs and compatibility

By default, Scylla is compatible with Apache Cassandra and its APIs - CQL and Thrift. There is also support for the API of Amazon DynamoDB™, which needs to be enabled and configured in order to be used. For more information on how to enable the DynamoDB™ API in Scylla, and the current compatibility of this feature as well as Scylla-specific extensions, see Alternator and Getting started with Alternator.

Documentation

Documentation can be found here. Seastar documentation can be found here. User documentation can be found here.

Training

Training material and online courses can be found at Scylla University. The courses are free, self-paced and include hands-on examples. They cover a variety of topics including Scylla data modeling, administration, architecture, basic NoSQL concepts, using drivers for application development, Scylla setup, failover, compactions, multi-datacenters and how Scylla integrates with third-party applications.

Contributing to Scylla

If you want to report a bug or submit a pull request or a patch, please read the contribution guidelines.

If you are a developer working on Scylla, please read the developer guidelines.

Contact

The users mailing list and Slack channel are for users to discuss configuration, management, and operations of the ScyllaDB open source.
The developers mailing list is for developers and people interested in following the development of ScyllaDB to discuss technical topics.

Languages

C++ 72.3%

Python 26.5%

CMake 0.3%

GAP 0.3%

Shell 0.3%