mirror of https://github.com/scylladb/scylladb.git synced 2026-06-02 13:06:57 +00:00

Avi Kivity fea5067dfa Merge "Limit non-paged query memory consumption" from Botond

"
Non-paged queries completely ignore the query result size limiter
mechanism. They consume all the memory they want. With sufficiently
large datasets this can easily lead to a handful or even a single
unpaged query producing an OOM.

This series continues the work started by 134d5a5f7, by introducing a
configurable pair of soft/hard limits (defaulting to 1MB/100MB) that is
applied to otherwise unlimited queries, like reverse and unpaged ones.
When an unlimited query reaches the soft limit a warning is logged. This
should give users some heads-up to adjust their application. When the
hard limit is reached the query is aborted. The idea is to not greet
users with failing queries after an upgrade while at the same time
protect the database from the really bad queries. The hard limit should
be decreased from time to time gradually approaching the desired goal of
1MB.
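
The soft/hard behaviour described above can be sketched as follows. This is a minimal illustration, not the code from the series: `limit_pair_sketch` and `unlimited_query_accounter` are hypothetical names, and treating "reached" as "strictly exceeded" is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for the soft/hard limit pair (not Scylla's real type).
struct limit_pair_sketch {
    uint64_t soft_limit;
    uint64_t hard_limit;
};

// Hypothetical accounter tracking the memory consumed by one unlimited query.
class unlimited_query_accounter {
    limit_pair_sketch _limits;
    uint64_t _used = 0;
    bool _warned = false;
public:
    explicit unlimited_query_accounter(limit_pair_sketch limits) : _limits(limits) {}

    // Accounts for `bytes` more result data.
    // Returns true exactly once, the first time the soft limit is exceeded
    // (the point where a warning would be logged).
    // Throws once the hard limit is exceeded, which aborts the query.
    bool update(uint64_t bytes) {
        _used += bytes;
        if (_used > _limits.hard_limit) {
            throw std::runtime_error("unlimited query exceeded the hard memory limit");
        }
        if (_used > _limits.soft_limit && !_warned) {
            _warned = true;
            return true;
        }
        return false;
    }

    uint64_t used() const { return _used; }
};
```

Warning only once per query keeps the log readable; whether the real implementation rate-limits the warning this way is not stated in this message.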

We don't want to limit internal queries; we trust ourselves to either
use another form of memory usage control, or read only small datasets.
So the limit is selected according to the query class. User reads use
the `max_memory_for_unlimited_query_{soft,hard}_limit` configuration
items, while internal reads are not limited. The limit is obtained by
the coordinator, which passes it down to replicas using the existing
`max_result_size` parameter (which is now a special type containing the
two limits). This parameter is now passed on every verb, instead of once
per connection, which ensures that all replicas work with the same limits.
For normal paged queries `max_result_size` is set to the usual
`query::result_memory_limiter::maximum_result_size`. For queries that can
consume an unlimited amount of memory -- unpaged and reverse queries --
it is set to the value of the aforementioned
`max_memory_for_unlimited_query_{soft,hard}_limit` configuration items,
but only for user reads; internal reads are not limited.
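
The selection rule above can be condensed into one function. This is a sketch under assumptions: `select_max_result_size`, `limit_config`, `result_size_limits` and `query_class` are illustrative names rather than the identifiers used in the series, and "not limited" is modelled as the largest representable value.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical pair of limits, standing in for the new max_result_size type.
struct result_size_limits {
    uint64_t soft_limit;
    uint64_t hard_limit;
};

enum class query_class { user, internal };

// Hypothetical view of the relevant configuration values.
struct limit_config {
    uint64_t paged_maximum_result_size; // the usual per-page maximum
    uint64_t unlimited_soft_limit;      // max_memory_for_unlimited_query_soft_limit
    uint64_t unlimited_hard_limit;      // max_memory_for_unlimited_query_hard_limit
};

// Assumption: "no limit" is represented by the maximum uint64_t value.
constexpr uint64_t no_limit = std::numeric_limits<uint64_t>::max();

// Picks the limits the coordinator would send to replicas with a read.
result_size_limits select_max_result_size(const limit_config& cfg,
                                          query_class cls,
                                          bool paged,
                                          bool reversed) {
    // Unpaged and reverse queries can consume unlimited memory.
    const bool unlimited_query = !paged || reversed;
    if (!unlimited_query) {
        // Normal paged query: the usual maximum applies.
        return {cfg.paged_maximum_result_size, cfg.paged_maximum_result_size};
    }
    if (cls == query_class::internal) {
        // Internal reads are trusted and not limited.
        return {no_limit, no_limit};
    }
    // User reads get the configured soft/hard pair.
    return {cfg.unlimited_soft_limit, cfg.unlimited_hard_limit};
}
```

Because the coordinator computes the pair once and ships it with the read, every replica enforces the same values regardless of its local configuration.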

This has the side-effect that reverse reads now send entire
partitions in a single page, but this is not that bad: the data was
already read and its size was below the limit, so the replica might as
well send it all.

Fixes: #5870
"

* 'nonpaged-query-limit/v5' of https://github.com/denesb/scylla: (26 commits)
  test: database_test: add test for enforced max result limit
  mutation_partition: abort read when hard limit is exceeded for non-paged reads
  query-result.hh: move the definition of short_read to the top
  test: cql_test_env: set the max_memory_unlimited_query_{soft,hard}_limit
  test: set the allow_short_read slice option for paged queries
  partition_slice_builder: add with_option()
  result_memory_accounter: remove default constructor
  query_*(): use the coordinator specified memory limit for unlimited queries
  storage_proxy: use read_command::max_result_size to pass max result size around
  query: result_memory_limiter: use the new max_result_size type
  query: read_command: add max_result_size
  query: read_command: use tagged ints for limit ctor params
  query: read_command: add separate convenience constructor
  service: query_pager: set the allow_short_read flag
  result_memory_accounter: check(): use _maximum_result_size instead of hardcoded limit
  storage_proxy: add get_max_result_size()
  result_memory_limiter: add unlimited_result_size constant
  database: add get_statement_scheduling_group()
  database: query_mutations(): obtain the memory accounter inside
  query: query_class_config: use max_result_size for the max_memory_for_unlimited_query field
  ...

2020-07-29 13:41:53 +03:00

.github

…

abseil @ 2069dc796a

Add abseil as a submodule

2020-06-14 08:18:37 -07:00

alternator

query: read_command: add max_result_size

2020-07-28 18:00:29 +03:00

api

repair: Add synchronous API to query repair status

2020-07-14 11:20:15 +03:00

auth

auth: Convert sstring variables in common.hh to constexpr std::string_view

2020-07-03 12:35:58 -07:00

cdc

query: read_command: add max_result_size

2020-07-28 18:00:29 +03:00

conf

partitioners: Make it impossible to use RandomPartitioner

2020-01-24 09:09:13 +01:00

cql3

Merge "Limit non-paged query memory consumption" from Botond

2020-07-29 13:41:53 +03:00

data

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

Merge "Limit non-paged query memory consumption" from Botond

2020-07-29 13:41:53 +03:00

debug

…

dht

migration_manager: Remove db/schema_tables.hh inclustion into header

2020-07-17 17:54:43 +03:00

dist

scylla_setup: skip boot partition

2020-07-28 12:19:55 +03:00

docs

docs: add paragraph to tracing.md

2020-07-27 13:38:57 +03:00

exceptions

cql3: avoid using shared_ptr's in unrecognized_entity_exception

2020-05-06 19:02:36 +03:00

gms

gossiper: Drop replacement_quarantine

2020-07-06 11:27:55 +03:00

idl

query: read_command: add max_result_size

2020-07-28 18:00:29 +03:00

imr

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

index

treewide: replace libjsoncpp usage with rjson

2020-07-03 10:27:23 +02:00

interface

thrift: switch csharp backend to netstd

2020-06-23 19:40:18 +03:00

libdeflate @ e7e54eab42

…

licenses

Add abseil as a submodule

2020-06-14 08:18:37 -07:00

locator

storage_service: Improve log on removing pending replacing node

2020-07-28 11:51:22 +03:00

message

Merge "messaging: make verb handler registering independent of current scheduling group" from Botond

2020-07-27 13:56:52 +03:00

mutation_writer

everywhere: Prepare for seastar api v4 (when_all_succeed return value)

2020-06-18 15:13:56 +03:00

python3

…

redis

query: read_command: add max_result_size

2020-07-28 18:00:29 +03:00

redis-test

redis: add strlen command

2020-07-14 10:56:23 +03:00

reloc

build-deb.sh: fix rm to erase only python

2020-07-08 17:58:38 +03:00

repair

repair: Fix race between create_writer and wait_for_writer_done

2020-07-28 11:53:40 +03:00

scripts

build: don't package tools/java and tools/jmx in relocatable pacakge

2020-07-22 20:03:18 +03:00

seastar @ 02ad74fa7d

Update seastar submodule

2020-07-21 19:08:36 +03:00

service

Merge "Limit non-paged query memory consumption" from Botond

2020-07-29 13:41:53 +03:00

sstables

compaction: Improve compaction efficiency by killing the procedure that trims jobs

2020-07-28 17:44:00 +03:00

streaming

sstables: clamp estimated_partitions to [1, +inf) in writers

2020-07-27 09:19:37 +02:00

swagger-ui @ 12f1da1082

…

test

Merge "Limit non-paged query memory consumption" from Botond

2020-07-29 13:41:53 +03:00

thrift

query: read_command: add max_result_size

2020-07-28 18:00:29 +03:00

tools

Update tools/jmx and tools/java submodules

2020-07-29 12:55:18 +03:00

tracing

migration_manager: Remove db/schema_tables.hh inclustion into header

2020-07-17 17:54:43 +03:00

transport

Merge "lwt: introduce LWT flag in prepared statement metadata" from Pavel

2020-06-30 12:40:19 +03:00

types

cql3: pass column_specification via lw_shared_ptr

2020-04-27 12:47:42 +03:00

unified

reloc: support unified relocatable package

2020-07-15 20:29:31 +03:00

utils

utils: config_src::add_command_line_options(): drop name and desc args

2020-07-28 18:00:29 +03:00

.dockerignore

.dockerignore: add testlog

2020-02-07 08:59:39 +01:00

.gitattributes

…

.gitignore

test.py: add CQL .reject files to gitignore

2020-01-15 11:41:19 +03:00

.gitmodules

move jmx/tools submodules to tools directory

2020-07-13 17:14:14 +03:00

.gitorderfile

…

absl-flat_hash_map.cc

Add absl wrapper headers

2020-06-14 08:18:39 -07:00

absl-flat_hash_map.hh

Add absl wrapper headers

2020-06-14 08:18:39 -07:00

atomic_cell_hash.hh

…

atomic_cell_or_collection.hh

…

atomic_cell.cc

atomic_cell: special rule for printing counter cells

2020-02-24 17:11:34 +02:00

atomic_cell.hh

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

backlog_controller.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

build_id.cc

build_id: add missing include for assert()

2020-01-29 23:44:50 +02:00

build_id.hh

Print build-id on startup

2019-12-19 15:43:04 +02:00

bytes_ostream.hh

bytes_ostream: make it a FragmentRange

2019-12-02 10:10:31 +02:00

bytes.cc

everywhere: Use uninitialized_string instead of sstring::initialized_later

2020-03-10 13:17:49 -07:00

bytes.hh

bytes: compare_unsigned: do not pass nullptr to memcmp

2020-07-09 17:54:46 +03:00

cache_flat_mutation_reader.hh

treewide: throw std::bad_function_call with backtraces

2020-04-08 13:54:06 +02:00

cache_temperature.hh

…

caching_options.hh

treewide: replace libjsoncpp usage with rjson

2020-07-03 10:27:23 +02:00

canonical_mutation.cc

everywhere: Use uninitialized_string instead of sstring::initialized_later

2020-03-10 13:17:49 -07:00

canonical_mutation.hh

header: De-bloat schema.hh

2020-03-03 11:34:00 +01:00

cartesian_product.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

cell_locking.hh

…

checked-file-impl.hh

treewide: replace calls to engine().some_api() with some_api()

2020-04-05 12:46:04 +03:00

clocks-impl.cc

clocks-impl: switch to thread-safe time conversion

2020-05-04 14:11:38 +03:00

clocks-impl.hh

…

clustering_bounds_comparator.hh

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

clustering_interval_set.hh

header: De-bloat schema.hh

2020-03-03 11:34:00 +01:00

clustering_key_filter.hh

header: De-bloat schema.hh

2020-03-03 11:34:00 +01:00

clustering_ranges_walker.hh

…

CMakeLists.txt

CMakeLists.txt: Update to C++20

2020-06-18 09:51:23 +03:00

collection_mutation.cc

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

collection_mutation.hh

collection_mutation_view: add type-aware pretty printer

2020-01-07 12:06:29 +02:00

column_computation.hh

treewide: replace libjsoncpp usage with rjson

2020-07-03 10:27:23 +02:00

combine.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

compaction_garbage_collector.hh

header: De-bloat schema.hh

2020-03-03 11:34:00 +01:00

compaction_strategy_type.hh

compaction_strategy: add method to reshape SSTables

2020-06-18 09:37:18 -04:00

compaction_strategy.hh

distributed_loader: reshard before the node is made online

2020-06-18 09:37:18 -04:00

compatible_ring_position.hh

…

compound_compat.hh

bytes: compare_unsigned: do not pass nullptr to memcmp

2020-07-09 17:54:46 +03:00

compound.hh

compound_type: implement validate()

2020-05-07 16:19:56 +03:00

compress.cc

compressor: Add an explicit cast to const sstring&

2020-03-10 13:13:48 -07:00

compress.hh

…

concrete_types.hh

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

configure.py

build: Use -fdata-sections and -ffunction-sections

2020-07-28 19:39:26 +03:00

connection_notifier.cc

system_keyspace: Added infrastructure for table `system.clients'

2019-12-17 11:31:28 +01:00

connection_notifier.hh

system_keyspace: Added infrastructure for table `system.clients'

2019-12-17 11:31:28 +01:00

CONTRIBUTING.md

Fix a link to contributor-agreement in the CONTRIBUTING page

2020-05-17 14:15:49 +03:00

converting_mutation_partition_applier.cc

converting_mutation_partition_applier: move to .cc file

2020-03-04 12:42:57 +02:00

converting_mutation_partition_applier.hh

converting_mutation_partition_applier: move to .cc file

2020-03-04 12:42:57 +02:00

counters.cc

…

counters.hh

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

cql_serialization_format.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

database_fwd.hh

…

database.cc

mutation_partition: abort read when hard limit is exceeded for non-paged reads

2020-07-29 08:32:31 +03:00

database.hh

query_*(): use the coordinator specified memory limit for unlimited queries

2020-07-28 18:00:29 +03:00

db_clock.hh

clocks: add printing functions

2020-01-30 11:10:08 +01:00

debug.hh

…

digest_algorithm.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

digester.hh

…

dirty_memory_manager.hh

commitlog+region_group: timeout exceptions with names

2019-12-03 19:07:19 +01:00

distributed_loader.cc

sstables: sstable_version_types: implement operator<=>

2020-07-08 14:23:11 +03:00

distributed_loader.hh

distributed_loader: remove declaration of inexistent do_populate_column_family()

2020-06-29 14:23:42 -03:00

Doxyfile

…

duration.cc

duration: adjust for C++20 char8_t type

2020-05-12 20:40:30 +02:00

duration.hh

…

encoding_stats.hh

…

enum_set.hh

…

fix_system_distributed_tables.py

…

flat_mutation_reader.cc

query: query_class_config: use max_result_size for the max_memory_for_unlimited_query field

2020-07-28 18:00:29 +03:00

flat_mutation_reader.hh

query: query_class_config: use max_result_size for the max_memory_for_unlimited_query field

2020-07-28 18:00:29 +03:00

frozen_mutation.cc

headers:: Remove flat_mutation_reader.hh from several other headers

2020-07-17 17:54:47 +03:00

frozen_mutation.hh

headers:: Remove flat_mutation_reader.hh from several other headers

2020-07-17 17:54:47 +03:00

frozen_schema.cc

…

frozen_schema.hh

header: De-bloat schema.hh

2020-03-03 11:34:00 +01:00

gc_clock.hh

…

gen_segmented_compress_params.py

…

HACKING.md

README: better explanation of dependencies and build

2020-06-16 13:26:04 +02:00

hashers.cc

…

hashers.hh

…

hashing_partition_visitor.hh

…

hashing.hh

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

idl-compiler.py

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

init.cc

init: init_ms_fd_gossiper: use logger for error message

2020-06-30 12:46:44 +03:00

init.hh

storage_service: Kill initialization helper from init.cc

2020-01-15 14:27:27 +03:00

install-dependencies.sh

build: install jmx and tools-java submodule dependencies

2020-07-22 20:13:50 +03:00

install.sh

install.sh: support calling install.sh from other directory

2020-07-15 18:55:12 +03:00

interval.hh

range: rename range template family to interval

2020-06-16 13:36:20 +03:00

intrusive_set_external_comparator.hh

…

keys.cc

dht: add dht::get_token

2020-02-17 10:59:15 +01:00

keys.hh

partition_key_view: add validate method

2020-05-12 12:07:00 +03:00

LICENSE.AGPL

…

lister.cc

treewide: replace calls to engine().some_api() with some_api()

2020-04-05 12:46:04 +03:00

lister.hh

treewide: replace calls to engine().some_api() with some_api()

2020-04-05 12:46:04 +03:00

log.hh

…

lua.cc

big_decimal: Add a as_rational member function

2020-06-25 15:33:31 -07:00

lua.hh

lua: Handle nil returns correctly

2020-01-29 14:05:01 -08:00

main.cc

main: Add missing calls to unregister RPC hanlers

2020-07-22 16:35:07 +03:00

MAINTAINERS

Update MAINTAINERS

2020-07-16 17:29:41 +03:00

map_difference.hh

…

marshal_exception.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

memtable-sstable.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

memtable.cc

memtable: Switch onto B+ rails

2020-07-14 16:30:02 +03:00

memtable.hh

headers:: Remove flat_mutation_reader.hh from several other headers

2020-07-17 17:54:47 +03:00

multishard_mutation_query.cc

mutation_partition: abort read when hard limit is exceeded for non-paged reads

2020-07-29 08:32:31 +03:00

multishard_mutation_query.hh

storage_proxy: use read_command::max_result_size to pass max result size around

2020-07-28 18:00:29 +03:00

mutation_cleaner.hh

…

mutation_compactor.hh

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

mutation_fragment.cc

clustering_interval_set: split to own header file

2020-02-16 17:40:47 +02:00

mutation_fragment.hh

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

mutation_partition_serializer.cc

cql3/query_processor.hh: Debloat from other headers

2020-02-16 11:22:30 +02:00

mutation_partition_serializer.hh

…

mutation_partition_view.cc

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

mutation_partition_view.hh

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

mutation_partition_visitor.hh

…

mutation_partition.cc

mutation_partition: abort read when hard limit is exceeded for non-paged reads

2020-07-29 08:32:31 +03:00

mutation_partition.hh

tombstone: use comparison operator instead of ad-hoc compare() function and with_relational_operators

2020-06-02 09:28:52 +03:00

mutation_query.cc

result_memory_accounter: remove default constructor

2020-07-28 18:00:29 +03:00

mutation_query.hh

mutation_partition: abort read when hard limit is exceeded for non-paged reads

2020-07-29 08:32:31 +03:00

mutation_reader.cc

mutation_reader: expose new_reader_base_cost

2020-07-20 11:23:39 +03:00

mutation_reader.hh

mutation_reader: expose new_reader_base_cost

2020-07-20 11:23:39 +03:00

mutation_rebuilder.hh

…

mutation_source_metadata.hh

…

mutation.cc

result_memory_accounter: remove default constructor

2020-07-28 18:00:29 +03:00

mutation.hh

result_memory_accounter: remove default constructor

2020-07-28 18:00:29 +03:00

noexcept_traits.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

NOTICE.txt

tests: port Cassandra CQL tests to cql repl

2020-03-26 15:19:38 +02:00

ORIGIN

…

partition_builder.hh

…

partition_range_compat.hh

…

partition_slice_builder.cc

…

partition_slice_builder.hh

partition_slice_builder: add with_option()

2020-07-28 18:00:29 +03:00

partition_snapshot_reader.hh

…

partition_snapshot_row_cursor.hh

…

partition_version_list.hh

…

partition_version.cc

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

partition_version.hh

…

position_in_partition.hh

position_in_partition: Introduce external_memory_usage()

2020-06-16 16:15:24 +02:00

querier.cc

querier: move common stuff into querier_base

2020-06-03 18:45:33 +03:00

querier.hh

query: query_class_config: use max_result_size for the max_memory_for_unlimited_query field

2020-07-28 18:00:29 +03:00

query_class_config.hh

query: query_class_config: use max_result_size for the max_memory_for_unlimited_query field

2020-07-28 18:00:29 +03:00

query_result_merger.hh

…

query-request.hh

query: read_command: add max_result_size

2020-07-28 18:00:29 +03:00

query-result-reader.hh

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

query-result-set.cc

result_memory_accounter: remove default constructor

2020-07-28 18:00:29 +03:00

query-result-set.hh

mutation_partition: Debloat header form others

2020-03-18 11:53:36 +02:00

query-result-writer.hh

…

query-result.hh

mutation_partition: abort read when hard limit is exceeded for non-paged reads

2020-07-29 08:32:31 +03:00

query.cc

result_memory_limiter: add unlimited_result_size constant

2020-07-28 18:00:29 +03:00

range_tombstone_list.cc

tombstone: use comparison operator instead of ad-hoc compare() function and with_relational_operators

2020-06-02 09:28:52 +03:00

range_tombstone_list.hh

…

range_tombstone.cc

…

range_tombstone.hh

…

range.hh

range: rename range template family to interval

2020-06-16 13:36:20 +03:00

read_context.hh

row_cache: pass a valid permit to underlying read

2020-05-28 11:34:35 +03:00

reader_concurrency_semaphore.cc

reader_concurrency_semaphore: make inactive read handles unique across semaphores

2020-07-23 16:43:33 +03:00

reader_concurrency_semaphore.hh

Merge "messaging: make verb handler registering independent of current scheduling group" from Botond

2020-07-27 13:56:52 +03:00

reader_permit.hh

reader_permit: reader_resources: add operator- and operator+

2020-07-20 11:23:39 +03:00

README.md

README.md: Add Slack and Twitter social banners

2020-07-16 10:55:15 +03:00

real_dirty_memory_accounter.hh

…

release.cc

…

release.hh

…

reversibly_mergeable.hh

…

row_cache.cc

memtable: Switch onto B+ rails

2020-07-14 16:30:02 +03:00

row_cache.hh

headers:: Remove flat_mutation_reader.hh from several other headers

2020-07-17 17:54:47 +03:00

schema_builder.hh

cdc::schema: Make extensions expicitly settable from builder

2020-07-15 08:21:34 +00:00

schema_fwd.hh

…

schema_mutations.cc

schema: include partitioner name in scylla tables mutation

2020-03-15 10:25:20 +01:00

schema_mutations.hh

schema: include partitioner name in scylla tables mutation

2020-03-15 10:25:20 +01:00

schema_registry.cc

everywhere: Replace engine().cpu_id() with this_shard_id()

2020-03-27 11:40:03 +03:00

schema_registry.hh

header: De-bloat schema.hh

2020-03-03 11:34:00 +01:00

schema_upgrader.hh

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

schema.cc

cdc::schema: Make extensions expicitly settable from builder

2020-07-15 08:21:34 +00:00

schema.hh

snap: Get rid of storage_service reference in schema.cc

2020-06-26 20:28:25 +03:00

scylla_post_install.sh

dist: add scylla_memory_setup

2020-04-26 13:34:05 +03:00

scylla-gdb.py

scylla-gdb.py: scylla fiber: add suggestion for further investigation

2020-07-12 15:43:21 +03:00

SCYLLA-VERSION-GEN

SCYLLA-VERSION-GEN: skip updating version files when git hash unchanged

2020-02-06 18:36:46 +02:00

seastarx.hh

…

serialization_visitors.hh

…

serializer_impl.hh

repair: Switch to btree_set for repair_hash.

2020-07-09 11:35:18 +03:00

serializer.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

service_permit.hh

…

setup.py

…

supervisor.hh

supervisor: drop unused Upstart code, always use libsystemd

2020-06-10 08:17:35 +03:00

table_helper.cc

everywhere: Replace engine().cpu_id() with this_shard_id()

2020-03-27 11:40:03 +03:00

table_helper.hh

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

table.cc

mutation_partition: abort read when hard limit is exceeded for non-paged reads

2020-07-29 08:32:31 +03:00

test.py

Export TMPDIR pointing at subdir of testlog/

2020-07-13 22:22:43 +03:00

timeout_config.cc

config: Place timeout_config() into own .cc file

2020-03-08 17:57:58 +02:00

timeout_config.hh

config: Place timeout_config() into own .cc file

2020-03-08 17:57:58 +02:00

timestamp.hh

add missing include to timestamp.hh

2020-02-05 19:42:18 +02:00

to_string.hh

treewide: add missing headers and/or forward declarations

2020-03-23 09:29:45 +02:00

tombstone.hh

tombstone: use comparison operator instead of ad-hoc compare() function and with_relational_operators

2020-06-02 09:28:52 +03:00

tox.ini

…

types.cc

treewide: update concepts language from the Concepts TS to C++20

2020-06-02 09:12:21 +03:00

types.hh

types, compound: pass std::current_exception() to on_internal_error()

2020-05-07 11:25:25 +02:00

unimplemented.cc

…

unimplemented.hh

…

user_types_metadata.hh

user_types_metadata: don't implement enable_lw_shared_from_this

2019-12-11 10:44:40 -08:00

validation.cc

validation: add is_cql_key_invalid()

2020-05-12 12:07:00 +03:00

validation.hh

validation: add is_cql_key_invalid()

2020-05-12 12:07:00 +03:00

version.hh

…

view_info.hh

header: De-bloat schema.hh

2020-03-03 11:34:00 +01:00

vint-serialization.cc

…

vint-serialization.hh

…

xx_hasher.hh

build: replace xxhash submodule with OS package

2020-04-27 14:00:31 +03:00

zstd.cc

build: remove zstd submodule

2020-06-11 17:12:49 +03:00

README.md

Scylla

What is Scylla?

Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB. Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.

For more information, please see the ScyllaDB web site.

Build Prerequisites

Scylla is fairly fussy about its build environment, requiring very recent versions of the C++20 compiler and of many libraries to build. The document HACKING.md includes detailed information on building and developing Scylla, but to get Scylla building quickly on (almost) any build machine, Scylla offers a frozen toolchain. This is a pre-configured Docker image which includes recent versions of all the required compilers, libraries and build tools. Using the frozen toolchain allows you to avoid changing anything in your build machine to meet Scylla's requirements: you just need to meet the frozen toolchain's prerequisites (mostly, Docker or Podman being available).

Building Scylla

Building Scylla with the frozen toolchain dbuild is as easy as:

$ git submodule update --init --force --recursive
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla

For further information, please see:

Developer documentation for more information on building Scylla.
Packaging documentation on how to build Scylla packages for different Linux distributions.
Docker image build documentation for information on how to build Docker images.

Running Scylla

To start Scylla server, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1

This will start a Scylla node with one CPU core allocated to it and data files stored in the tmp directory. The --developer-mode option is needed to disable the various checks Scylla performs at startup to ensure the machine is configured for maximum performance (these checks are not relevant on development workstations). Please note that you need to run Scylla with dbuild if you built it with the frozen toolchain.

For more run options, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --help

Testing

See test.py manual.

Scylla APIs and compatibility

By default, Scylla is compatible with Apache Cassandra and its APIs - CQL and Thrift. There is also support for the API of Amazon DynamoDB™, which needs to be enabled and configured in order to be used. For more information on how to enable the DynamoDB™ API in Scylla, and the current compatibility of this feature as well as Scylla-specific extensions, see Alternator and Getting started with Alternator.

Documentation

Documentation can be found in ./docs and on the wiki. There is currently no clear definition of what goes where, so when looking for something be sure to check both. Seastar documentation can be found here. User documentation can be found here.

Training

Training material and online courses can be found at Scylla University. The courses are free, self-paced and include hands-on examples. They cover a variety of topics including Scylla data modeling, administration, architecture, basic NoSQL concepts, using drivers for application development, Scylla setup, failover, compactions, multi-datacenters and how Scylla integrates with third-party applications.

Contributing to Scylla

If you want to report a bug or submit a pull request or a patch, please read the contribution guidelines.

If you are a developer working on Scylla, please read the developer guidelines.

Contact

The users mailing list and Slack channel are for users to discuss configuration, management, and operations of the ScyllaDB open source.
The developers mailing list is for developers and people interested in following the development of ScyllaDB to discuss technical topics.

Languages

C++ 72.2%

Python 26.6%

CMake 0.3%

GAP 0.3%

Shell 0.3%