mirror of https://github.com/scylladb/scylladb.git synced 2026-05-30 03:30:49 +00:00

Nadav Har'El 14cdd034ee test/alternator: fix flaky test for partition-tombstone scan

The test test_scan.py::test_scan_long_partition_tombstone_string
checks that a full-table Scan operation ends a page in the middle of
a very long string of partition tombstones, and does NOT scan the
entire table in one page (if we did that, getting a single page could
take an unbounded amount of time).

The test is currently flaky, having failed in CI runs three times in
the past two months.

The reason for the flakiness is that we don't know exactly how long
we need to make the sequence of partition tombstones in the test before
we can be absolutely sure a single page will not read this entire sequence.
For single-partition scans we have the "query_tombstone_page_limit"
configuration parameter, which tells us exactly how long we need to
make the sequence of row tombstones. But for a full-table scan of
partition tombstones, the situation is more complicated, because the
scan is done in parallel on several vnodes and each of them needs to
read query_tombstone_page_limit tombstones before it stops.

In my experiments, using query_tombstone_page_limit * 4 consecutive
tombstones was always enough - I ran this test hundreds of times and it
didn't fail once. But since it did fail on Jenkins very rarely (3 times
in the last two months), maybe the multiplier 4 isn't enough. So this
patch doubles it to 8. Hopefully this would be enough for anyone (TM).
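
The sizing argument above can be sketched as a small calculation. This is
only an illustration under a simplified model, not the code of test_scan.py:
the function names, the page limit of 10000 and the figure of four parallel
ranges are all assumptions made for the example.

```python
# Hypothetical sketch of the sizing reasoning; not taken from test_scan.py.
ASSUMED_PAGE_LIMIT = 10000  # assumed value of query_tombstone_page_limit


def tombstones_to_write(page_limit: int, multiplier: int = 8) -> int:
    """Number of consecutive partition tombstones the test would create."""
    return page_limit * multiplier


def one_page_could_cover(total: int, page_limit: int, parallel_ranges: int) -> bool:
    """Simplified model: each of `parallel_ranges` concurrent vnode reads may
    consume up to `page_limit` tombstones before the page ends, so one page
    could cover up to parallel_ranges * page_limit tombstones."""
    return parallel_ranges * page_limit >= total


if __name__ == "__main__":
    for multiplier in (4, 8):
        total = tombstones_to_write(ASSUMED_PAGE_LIMIT, multiplier)
        # Four parallel ranges is an arbitrary illustrative figure.
        print(multiplier, total, one_page_could_cover(total, ASSUMED_PAGE_LIMIT, 4))
```

Under this model, a multiplier of 4 sits exactly at the boundary when four
ranges are read in parallel, while a multiplier of 8 leaves headroom.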

This makes this test even bigger and slower than it was. To make it
faster, I changed this test's write isolation mode from the default
always_use_lwt to forbid_rmw (which does not use LWT). This keeps the
test's total run time similar to what it was before this patch - around
0.5 seconds in dev build mode on my laptop.
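
As a rough sketch of how a per-table write isolation mode can be requested in
Alternator: the mode is carried by a table tag named system:write_isolation.
The helper name, table name and key schema below are hypothetical, and this
is not the fixture that test_scan.py actually uses.

```python
# Hypothetical helper; builds CreateTable arguments without contacting a server.
def table_kwargs(name: str, write_isolation: str = "forbid_rmw") -> dict:
    """Return CreateTable keyword arguments for a table with a single string
    hash key "p" (an assumed schema) and the requested write isolation tag."""
    return {
        "TableName": name,
        "BillingMode": "PAY_PER_REQUEST",
        "KeySchema": [{"AttributeName": "p", "KeyType": "HASH"}],
        "AttributeDefinitions": [{"AttributeName": "p", "AttributeType": "S"}],
        # Alternator reads the write isolation mode from this tag.
        "Tags": [{"Key": "system:write_isolation", "Value": write_isolation}],
    }


if __name__ == "__main__":
    print(table_kwargs("tombstone_scan_test"))
```

With boto3, the resulting dictionary could be passed as
create_table(**table_kwargs("tombstone_scan_test")) on a DynamoDB resource
that points at an Alternator endpoint.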

Fixes #12817

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #12819

2023-02-14 08:09:44 +02:00

.github

Update CODEOWNERS file

2022-12-06 19:26:03 +02:00

alternator

alternator::streams: Special case single table in list_streams

2023-01-24 09:14:33 +00:00

api

system_keyspace: De-static calls that update view-building tables

2023-02-03 21:56:54 +03:00

auth

…

cdc

test: relax NULL check test predicate

2023-01-18 10:38:24 +02:00

compaction

main: use defer_verbose_shutdown() to shutdown compaction manager

2023-02-07 16:00:40 +08:00

conf

conf: enable consistent_cluster_management by default

2023-01-20 13:29:06 +01:00

cql3

Merge 'functions: handle replacing UDFs used in UDAs' from Wojciech Mitros

2023-02-13 16:30:24 +02:00

data_dictionary

data_dictionary: add get_all_keyspaces() and get_user_keyspaces()

2022-12-10 12:51:05 +01:00

Merge 'functions: handle replacing UDFs used in UDAs' from Wojciech Mitros

2023-02-13 16:30:24 +02:00

debug

…

dht

Merge 'table: trim ranges for compaction group cleanup' from Benny Halevy

2023-01-30 13:11:28 +02:00

direct_failure_detector

direct_failure_detector: don't change meaning of endpoint_liveness

2022-11-28 21:58:30 +02:00

dist

dist/debian: drop unused Makefile variable

2023-02-03 11:18:51 +08:00

docs

Merge 'cql3: convert LWT IF clause to expressions' from Avi Kivity

2023-02-13 16:30:24 +02:00

exceptions

…

gms

raft: replace experimental raft option with dedicated flag

2023-01-03 11:15:11 +02:00

idl

forward_service: fix timeout support in parallel aggregates

2023-01-16 12:08:13 +02:00

index

rjson: avoid copy constructors in from_string calls when possible

2023-01-16 15:15:26 +01:00

interface

…

lang

wasm udf: deserialize counters as integers

2023-02-13 14:24:20 +01:00

licenses

…

locator

abstract_replication_strategy: add for_each_natural_endpoint_until

2023-02-13 16:30:24 +02:00

message

messaging: check that a node knows its own topology before accessing it

2023-01-02 11:53:14 +02:00

mutation_writer

position_in_partition: Make after_key() work with non-full keys

2022-12-14 14:47:33 +01:00

raft

raft conf error injection for snapshot

2023-02-03 22:33:33 +01:00

readers

range_tombstone_change_merger: Introduce peek()

2023-01-27 19:15:39 +01:00

redis

…

reloc

…

repair

Merge 'repair: finish repair immediately on local keyspaces' from Aleksandra Martyniuk

2023-02-09 18:44:37 +01:00

replica

Merge 'Keep reshape and reshard logic in distributed loader' from Pavel Emelyanov

2023-02-09 10:01:44 +02:00

rust

test: assert that WASM allocations can fail without crashing

2023-01-06 14:07:29 +01:00

scripts

open-coredump.sh: handle dev versions

2023-01-12 19:28:58 +02:00

seastar @ 943c09f869

Revert "Update seastar submodule"

2023-02-06 14:56:44 +02:00

service

Merge 'Sequence CDC preimage select with Paxos learn write' from Kamil Braun

2023-02-12 13:28:34 +02:00

sstables

sstable_directory: Rename remove_input_sstables_from_reshaping()

2023-02-08 15:00:44 +03:00

streaming

db/view/view_update_check: check_needs_view_update_path(): filter out non-member hosts

2023-01-27 17:12:45 +03:00

swagger-ui @ 12f1da1082

…

tasks

Merge 'Method to create and start task manager task' from Aleksandra Martyniuk

2023-02-06 12:38:35 +02:00

test

test/alternator: fix flaky test for partition-tombstone scan

2023-02-14 08:09:44 +02:00

thrift

treewide: drop cql_serialization_format

2023-01-03 19:54:13 +02:00

tools

install-dependencies.sh: update node_exporter to 1.5.0

2023-02-13 16:30:24 +02:00

tracing

cql3, transport, tests: remove "unset" from value type system

2023-01-16 21:10:56 +02:00

transport

transport server: fix "request size too large" handling

2023-02-08 00:07:08 +04:00

types

types: allow lists with NULL

2023-01-18 10:38:24 +02:00

unified

install.sh: Skip systemd existence check when --without-systemd

2022-11-14 14:07:46 +02:00

utils

Merge 'Introduce recent_entries_map datatype to track least recent visited entries.' from Andrii Patsula

2023-02-06 18:01:26 +01:00

.dockerignore

…

.gitattributes

…

.gitignore

Add rust/Cargo.lock to .gitignore

2022-10-14 13:54:50 +03:00

.gitmodules

build: drop abseil submodule, replace with distribution abseil

2022-12-28 19:02:23 +02:00

.gitorderfile

…

.mailmap

…

absl-flat_hash_map.cc

…

absl-flat_hash_map.hh

…

amplify.yml

docs: automatic previews configuration

2022-11-04 15:44:22 +02:00

atomic_cell_hash.hh

…

atomic_cell_or_collection.hh

…

atomic_cell.cc

…

atomic_cell.hh

…

backlog_controller.hh

…

build_mode.hh

…

bytes_ostream.hh

bytes_ostream: don't take reference to packed variable

2022-11-28 21:40:18 +02:00

bytes.cc

…

bytes.hh

…

cache_flat_mutation_reader.hh

row_cache: Distinguish dummy insertion site in trace log

2023-01-27 21:56:31 +01:00

cache_temperature.hh

…

caching_options.cc

…

caching_options.hh

…

canonical_mutation.cc

…

canonical_mutation.hh

…

cartesian_product.hh

…

cell_locking.hh

…

checked-file-impl.hh

…

client_data.cc

…

client_data.hh

…

clocks-impl.cc

…

clocks-impl.hh

…

clustering_bounds_comparator.hh

…

clustering_interval_set.hh

…

clustering_key_filter.hh

…

clustering_ranges_walker.hh

…

CMakeLists.txt

build: drop abseil submodule, replace with distribution abseil

2022-12-28 19:02:23 +02:00

collection_mutation.cc

…

collection_mutation.hh

treewide: drop cql_serialization_format

2023-01-03 19:54:13 +02:00

column_computation.hh

column_computation: adjust to use clustering_or_static_row

2022-12-06 11:21:16 +01:00

combine.hh

…

compatible_ring_position.hh

…

compound_compat.hh

…

compound.hh

treewide: drop cql_serialization_format

2023-01-03 19:54:13 +02:00

compress.cc

…

compress.hh

…

concrete_types.hh

types: make empty type deserialize to non-null value

2023-01-18 10:38:24 +02:00

configure.py

configure.py: use seastar_dep and seastar_testing_dep

2023-02-13 16:30:24 +02:00

CONTRIBUTING.md

…

converting_mutation_partition_applier.cc

…

converting_mutation_partition_applier.hh

…

counters.cc

…

counters.hh

…

cql_serialization_format.hh

treewide: drop cql_serialization_format

2023-01-03 19:54:13 +02:00

db_clock.hh

…

debug.cc

test: extract debug::the_database out

2023-01-19 17:42:23 +08:00

debug.hh

…

default.nix

build: explicitly add rustc to Nix devenv

2023-01-19 15:53:49 +01:00

digest_algorithm.hh

…

digester.hh

…

Doxyfile

…

duration.cc

…

duration.hh

…

encoding_stats.hh

…

enum_set.hh

…

fix_system_distributed_tables.py

…

flake.lock

build: bump Lua version (5.3 -> 5.4) in Nix devenv

2023-01-19 15:53:49 +01:00

flake.nix

build: fix Nix devenv

2022-12-19 20:53:07 +02:00

frozen_mutation.cc

…

frozen_mutation.hh

…

frozen_schema.cc

…

frozen_schema.hh

…

full_position.hh

…

gc_clock.hh

…

gdbinit

gdbinit: add ignore clause for SIG35

2023-01-12 12:13:04 +02:00

gen_segmented_compress_params.py

…

generic_server.cc

…

generic_server.hh

…

HACKING.md

…

hashers.cc

…

hashers.hh

…

hashing_partition_visitor.hh

…

hashing.hh

…

idl-compiler.py

…

inet_address_vectors.hh

…

init.cc

init: do not allow cfg.replace_node_first_boot of seed node

2023-01-13 18:30:48 +02:00

init.hh

…

install-dependencies.sh

install-dependencies.sh: update node_exporter to 1.5.0

2023-02-13 16:30:24 +02:00

install.sh

install.sh: drop locale workaround from python3 thunk

2022-11-28 13:07:03 +02:00

interval.hh

…

intrusive_set_external_comparator.hh

…

keys.cc

…

keys.hh

…

LICENSE.AGPL

…

log.hh

…

main.cc

main: use defer_verbose_shutdown() to shutdown compaction manager

2023-02-07 16:00:40 +08:00

map_difference.hh

…

marshal_exception.hh

…

multishard_mutation_query.cc

…

multishard_mutation_query.hh

…

mutation_cleaner.hh

db: mutation_cleaner: Extract make_region_space_guard()

2023-01-27 19:15:38 +01:00

mutation_compactor.hh

mutation_compactor: only pass consumed range-tombstone-change to validator

2023-01-27 14:03:45 +01:00

mutation_consumer_concepts.hh

…

mutation_consumer.hh

mutation{,_consumer,_partition}: remove consume_in_reverse::legacy_half_reverse

2023-01-05 18:48:55 +01:00

mutation_fragment_fwd.hh

…

mutation_fragment_stream_validator.hh

mutation_fragment_stream_validator: avoid allocation when stream is correct

2022-11-22 19:19:18 +02:00

mutation_fragment_v2.hh

…

mutation_fragment.cc

position_in_partition: Make after_key() work with non-full keys

2022-12-14 14:47:33 +01:00

mutation_fragment.hh

Merge 'Fix handling of non-full clustering keys in the read path' from Tomasz Grabiec

2022-12-15 10:47:12 +02:00

mutation_partition_serializer.cc

…

mutation_partition_serializer.hh

…

mutation_partition_v2.cc

cache: Fix empty partition entries being left in cache in some cases

2023-02-09 23:03:23 +02:00

mutation_partition_v2.hh

Pass is_evictable to apply()

2023-01-27 21:56:31 +01:00

mutation_partition_view.cc

…

mutation_partition_view.hh

…

mutation_partition_visitor.hh

…

mutation_partition.cc

mutation_partition_v2: Store range tombstones together with rows

2023-01-27 19:15:39 +01:00

mutation_partition.hh

row_cache, lru: Introduce evict_shallow()

2023-01-27 21:56:31 +01:00

mutation_query.cc

…

mutation_query.hh

…

mutation_rebuilder.hh

…

mutation_source_metadata.hh

…

mutation.cc

…

mutation.hh

mutation{,_consumer,_partition}: remove consume_in_reverse::legacy_half_reverse

2023-01-05 18:48:55 +01:00

noexcept_traits.hh

…

NOTICE.txt

…

ORIGIN

…

partition_builder.hh

…

partition_range_compat.hh

…

partition_slice_builder.cc

treewide: drop cql_serialization_format

2023-01-03 19:54:13 +02:00

partition_slice_builder.hh

…

partition_snapshot_reader.hh

db: Use mutation_partition_v2 in mvcc

2023-01-27 21:56:28 +01:00

partition_snapshot_row_cursor.hh

db: Use mutation_partition_v2 in mvcc

2023-01-27 21:56:28 +01:00

partition_version_list.hh

…

partition_version.cc

Pass is_evictable to apply()

2023-01-27 21:56:31 +01:00

partition_version.hh

Pass is_evictable to apply()

2023-01-27 21:56:31 +01:00

position_in_partition.hh

position_in_partition: Optimize equality check

2023-01-27 19:15:38 +01:00

protocol_server.hh

…

querier.cc

…

querier.hh

querier: consume_page(): use partition_start as the sentinel value

2022-11-11 09:58:18 +02:00

query_class_config.hh

…

query_ranges_to_vnodes.cc

query_ranges_to_vnodes_generator: fix for exclusive boundaries

2023-02-07 16:02:31 +02:00

query_ranges_to_vnodes.hh

…

query_result_merger.hh

…

query-request.hh

forward_service: fix timeout support in parallel aggregates

2023-01-16 12:08:13 +02:00

query-result-reader.hh

treewide: use ::for_partition_start() instead of ::partition_start_tag_t{}

2022-11-11 09:58:18 +02:00

query-result-set.cc

treewide: drop cql_serialization_format

2023-01-03 19:54:13 +02:00

query-result-set.hh

…

query-result-writer.hh

…

query-result.hh

…

query.cc

treewide: drop cql_serialization_format

2023-01-03 19:54:13 +02:00

range_tombstone_assembler.hh

…

range_tombstone_change_generator.hh

…

range_tombstone_list.cc

…

range_tombstone_list.hh

…

range_tombstone_splitter.hh

…

range_tombstone.cc

db: Print range_tombstone bounds as position_in_partition

2023-01-27 19:15:39 +01:00

range_tombstone.hh

…

range.hh

…

read_context.hh

…

reader_concurrency_semaphore.cc

reader_concurrency_semaphore: un-bless permits when they become inactive

2023-02-01 21:02:17 +02:00

reader_concurrency_semaphore.hh

reader_concurrency_semaphore: un-bless permits when they become inactive

2023-02-01 21:02:17 +02:00

reader_permit.hh

reader_permit: expose operator<<(reader_permit::state)

2023-01-17 05:27:04 -05:00

README.md

…

real_dirty_memory_accounter.hh

dirty_memory_manager: move to replica module

2022-12-06 22:24:17 +02:00

release.cc

…

release.hh

…

reversibly_mergeable.hh

…

row_cache.cc

row_cache, lru: Introduce evict_shallow()

2023-01-27 21:56:31 +01:00

row_cache.hh

…

schema_builder.hh

…

schema_fwd.hh

…

schema_mutations.cc

…

schema_mutations.hh

…

schema_registry.cc

…

schema_registry.hh

…

schema_upgrader.hh

…

schema.cc

rjson: avoid copy constructors in from_string calls when possible

2023-01-16 15:15:26 +01:00

schema.hh

implement keyspace_element interface

2022-12-10 12:34:09 +01:00

scylla_post_install.sh

…

scylla-gdb.py

scylla-gdb.py: scylla memory: remove 'sstable reads' from semaphore names

2023-01-25 21:55:27 +02:00

SCYLLA-VERSION-GEN

release: prepare for 5.3.0-dev

2023-01-18 16:22:41 +02:00

seastarx.hh

…

serialization_visitors.hh

…

serializer_impl.hh

serializer_impl.hh: add reverse vector serializer

2022-11-14 16:06:24 +01:00

serializer.cc

…

serializer.hh

…

service_permit.hh

…

setup.py

…

shell.nix

…

sstables_loader.cc

…

sstables_loader.hh

…

supervisor.hh

…

table_helper.cc

…

table_helper.hh

…

test.py

Merge 'test.py: improve test failure handling' from Kamil Braun

2023-02-12 12:13:25 +02:00

timeout_config.cc

…

timeout_config.hh

…

timestamp.hh

…

to_string.hh

…

tombstone_gc_extension.hh

…

tombstone_gc_options.cc

…

tombstone_gc_options.hh

…

tombstone_gc.cc

…

tombstone_gc.hh

…

tombstone.hh

…

tox.ini

…

types.cc

Merge 'Make regexes in types.cc static and remove unnecessary tolower transform' from Marcin Maliszkiewicz

2023-01-24 16:13:59 +02:00

types.hh

test: relax NULL check test predicate

2023-01-18 10:38:24 +02:00

ubsan-suppressions.supp

…

unimplemented.cc

…

unimplemented.hh

…

validation.cc

…

validation.hh

…

version.hh

version: Reverse version increase

2022-12-12 18:45:32 +02:00

view_info.hh

view_info: adjust view_column to accept column_kind

2022-12-06 11:21:16 +01:00

vint-serialization.cc

…

vint-serialization.hh

…

xx_hasher.hh

…

zstd.cc

…

README.md

Scylla

What is Scylla?

Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB. Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.

For more information, please see the ScyllaDB web site.

Build Prerequisites

Scylla is fairly fussy about its build environment, requiring very recent versions of the C++20 compiler and of many libraries to build. The document HACKING.md includes detailed information on building and developing Scylla, but to get Scylla building quickly on (almost) any build machine, Scylla offers a frozen toolchain. This is a pre-configured Docker image which includes recent versions of all the required compilers, libraries and build tools. Using the frozen toolchain allows you to avoid changing anything in your build machine to meet Scylla's requirements - you just need to meet the frozen toolchain's prerequisites (mostly, Docker or Podman being available).

Building Scylla

Building Scylla with the frozen toolchain dbuild is as easy as:

$ git submodule update --init --force --recursive
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla

For further information, please see:

Developer documentation for more information on building Scylla.
Build documentation on how to build Scylla binaries, tests, and packages.
Docker image build documentation for information on how to build Docker images.

Running Scylla

To start Scylla server, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1

This will start a Scylla node with one CPU core allocated to it and data files stored in the tmp directory. The --developer-mode option is needed to disable the various checks Scylla performs at startup to ensure the machine is configured for maximum performance (not relevant on development workstations). Please note that you need to run Scylla with dbuild if you built it with the frozen toolchain.

For more run options, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --help

Testing

See test.py manual.

Scylla APIs and compatibility

By default, Scylla is compatible with Apache Cassandra and its APIs - CQL and Thrift. There is also support for the API of Amazon DynamoDB™, which needs to be enabled and configured in order to be used. For more information on how to enable the DynamoDB™ API in Scylla, and the current compatibility of this feature as well as Scylla-specific extensions, see Alternator and Getting started with Alternator.

Documentation

Documentation can be found here. Seastar documentation can be found here. User documentation can be found here.

Training

Training material and online courses can be found at Scylla University. The courses are free, self-paced and include hands-on examples. They cover a variety of topics including Scylla data modeling, administration, architecture, basic NoSQL concepts, using drivers for application development, Scylla setup, failover, compactions, multi-datacenters and how Scylla integrates with third-party applications.

Contributing to Scylla

If you want to report a bug or submit a pull request or a patch, please read the contribution guidelines.

If you are a developer working on Scylla, please read the developer guidelines.

Contact

The users mailing list and Slack channel are for users to discuss configuration, management, and operations of open-source ScyllaDB.
The developers mailing list is for developers and people interested in following the development of ScyllaDB to discuss technical topics.

Languages

C++ 72.3%

Python 26.5%

CMake 0.3%

GAP 0.3%

Shell 0.3%