mirror of https://github.com/scylladb/scylladb.git synced 2026-05-12 19:02:12 +00:00

Nadav Har'El 1f15e05946 test: fix replica_read_timeout_no_exception flakiness on slow systems

The test uses a 10ms read timeout to exercise code paths that handle
timed-out reads without throwing C++ exceptions.  As part of setup, it
inserts rows and flushes them to two SSTables, then runs a warm-up
SELECT to populate internal caches (e.g. the auth cache) before the
real test begins.

The warm-up read exists because the first read may perform additional
work (such as reading and caching authentication data) that might
throw exceptions internally. I couldn't verify that such exceptions
actually happen in today's code, but they might (re)appear in the
future, so we should keep the warm-up SELECT.
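
To illustrate why a warm-up read matters when exceptions are counted,
here is a minimal, self-contained sketch. It is not code from the test:
fake_node, its cxx_exceptions counter, and measured_delta() are
hypothetical stand-ins, and the "cache miss" exception is an assumed
example of a first-read side effect.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for a node that counts thrown C++ exceptions.
struct fake_node {
    std::size_t cxx_exceptions = 0;
    bool cache_ready = false;

    // The first read populates a cache and, in this sketch only, uses an
    // exception internally to signal the cache miss.
    void read() {
        if (!cache_ready) {
            try {
                throw std::runtime_error("cache miss");
            } catch (const std::runtime_error&) {
                ++cxx_exceptions;
                cache_ready = true;
            }
        }
    }
};

// Returns how many exceptions were thrown during `reads` measured reads,
// optionally running one warm-up read before the baseline is sampled.
inline std::size_t measured_delta(bool warm_up, std::size_t reads) {
    fake_node node;
    if (warm_up) {
        node.read();  // absorbs the one-time exception before the baseline
    }
    const std::size_t before = node.cxx_exceptions;
    for (std::size_t i = 0; i < reads; ++i) {
        node.read();
    }
    return node.cxx_exceptions - before;
}
```

With the warm-up, the one-time exception lands before the baseline and
the delta over the measured reads is zero; without it, the same
exception is charged to the measured reads.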

On slow CI machines (aarch64, debug build), that warm-up SELECT can
take longer than 10ms to read from the two SSTables.  When it does, the
read times out: the coordinator receives 0 responses from the local
replica within the deadline and propagates a read_timeout_exception.
Since the exception is not caught, it escapes the test lambda, is
logged as "cql env callback failed", and causes Boost.Test to report a
C++ failure at the do_with_cql_env_thread call site.  This matches the
CI failure seen in SCYLLADB-1774:

  ERROR ... replica_read_timeout_no_exception: cql env callback failed,
  error: exceptions::read_timeout_exception (Operation timed out for
  replica_read_timeout_no_exception.tbl - received only 0 responses
  from 1 CL=ONE.)

The CI log also shows that only 12 reads were admitted (the warm-up
read plus the 11 reads from the two prepare() calls and CREATE/INSERT
statements made earlier), and the current permit was stuck in
need_cpu state -- the reactor hadn't had a chance to schedule the read
before the 10ms window elapsed.

The fix catches read_timeout_exception from the warm-up SELECT and
retries until the read succeeds. The warm-up is needed for the test to
be reliable: some lazy-init code paths (e.g. auth cache population)
may use C++ exceptions for control flow internally. Any such exceptions
must be absorbed before the cxx_exceptions baseline is sampled inside
execute_test(); otherwise they would appear in the delta and cause a
false test failure. Simply ignoring a timed-out warm-up is not safe,
because the lazy-init exceptions could then fire during the 1000 test
reads, inflating cxx_exceptions_after relative to
cxx_exceptions_before.
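
The catch-and-retry pattern can be sketched as follows. This is a
minimal, self-contained illustration and not the code from the commit:
read_timeout_error is a hypothetical stand-in for
exceptions::read_timeout_exception, warm_up_with_retry() is an invented
helper name, and the max_attempts cap is an assumption added here so
the sketch cannot loop forever.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for exceptions::read_timeout_exception.
struct read_timeout_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Calls `read` until it returns without timing out and reports how many
// attempts were needed. Only read_timeout_error is swallowed; any other
// exception propagates to the caller. The max_attempts cap is an
// assumption of this sketch, not something the commit message describes.
template <typename Read>
std::size_t warm_up_with_retry(Read&& read, std::size_t max_attempts = 1000) {
    assert(max_attempts > 0);
    for (std::size_t attempt = 1; ; ++attempt) {
        try {
            read();
            return attempt;
        } catch (const read_timeout_error&) {
            if (attempt == max_attempts) {
                throw;  // give up: rethrow the last timeout
            }
            // Otherwise fall through and try the read again.
        }
    }
}
```

A caller would pass the warm-up read as a callable, for example
warm_up_with_retry([&] { run_warm_up_select(); }), where
run_warm_up_select is again a hypothetical name.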

No other calls in setup are susceptible to the 10ms read timeout:
- CREATE KEYSPACE, CREATE TABLE, INSERT, and flush use the write
  timeout (10s) and are not reads.
- e.prepare() goes through the query processor without reading table
  data, so it is not subject to the read timeout.
- The semaphore manipulation in Test 2 is internal and has no timeout.
- All 1000 reads in execute_test() are expected to fail, so a timeout
  there is the happy path, not a failure.

The 10ms timeout itself is fine for the test's purpose: it is
deliberately aggressive so that reads reliably time out on the hot path
being tested.  The problem was only that the pre-test warm-up was not
guarded against the same timeout.

Fixes: SCYLLADB-1774

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#29731

2026-05-05 15:13:13 +03:00

.github

Fix CODEOWNERS to cover nested docs subfolders

2026-04-20 17:55:43 +03:00

abseil @ 255c84dadd

abseil: update to lts_2026_01_07

2026-04-08 12:19:54 +03:00

alternator

alternator: use stream_arn instead of std::string in list_streams

2026-04-22 14:02:53 +02:00

api

storage_service: gate REST-facing async operations during shutdown

2026-04-22 10:30:33 +02:00

audit

audit: assert storage ordering invariants at runtime

2026-04-28 18:58:49 +02:00

auth

auth: make shutdown the exact reverse of startup

2026-04-24 13:34:09 +02:00

bin

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

cdc

treewide: fix spelling errors.

2026-04-21 18:20:26 +03:00

cmake

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

compaction

compaction: Restrict tombstone GC sstable set to repaired sstables for tombstone_gc=repair mode

2026-04-20 16:59:09 -03:00

conf

conf: pair sstable_format=ms with column_index_size_in_kb=1

2026-04-20 17:53:56 +03:00

cql3

Merge 'audit: set audit_info for native-protocol BATCH messages' from Andrzej Jackowski

2026-04-22 18:56:28 +02:00

data_dictionary

db: add columns to system_schema.keyspaces

2026-04-17 09:58:07 +02:00

db

Merge 'Restrict tombstone GC sstable set to repaired sstables for tombstone_gc=repair mode' from Raphael Raph Carvalho

2026-04-22 10:21:37 +03:00

debug

…

dht

locator: tablets: Support arbitrary tablet boundaries

2026-04-15 01:25:14 +02:00

dist

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

docs

Merge 'service: Support adding/removing a datacenter with tablets by changing RF' from Aleksandra Martyniuk

2026-04-22 01:46:11 +02:00

ent

encryption: cover system.raft table in system_info_encryption

2026-04-16 13:22:10 +02:00

exceptions

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

gms

gms/gossiper: fix use-after-move in do_send_ack2_msg

2026-04-30 07:07:39 +03:00

idl

logstor: split log record to header and data

2026-04-16 10:00:35 +03:00

index

Merge 'vector_index: allow recreating vector indexes on the same column' from Dawid Pawlik

2026-04-15 14:40:15 +03:00

keys

keys: move key_to_str() to keys/keys.hh

2026-04-16 08:42:54 +03:00

lang

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

licenses

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

locator

service: implement make_rf_change_plan

2026-04-17 09:58:07 +02:00

message

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

mutation

alternator: fix Alternator writing unnecesary cdc entries

2026-04-17 18:00:25 +02:00

mutation_writer

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

node_ops

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

pgo

test: auth_cluster: use safe_driver_shutdown() for Cluster teardown

2026-04-21 17:45:11 +02:00

query

Merge 'query: result_set: change row member to a chunked vector' from Benny Halevy

2026-04-15 14:40:15 +03:00

raft

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

readers

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

reloc

…

repair

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

replica

replica/database: fix cross-shard deadlock in lock_tables_metadata()

2026-04-29 21:13:53 +02:00

rust

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

schema

alternator/streams: Block tablet merges when Alternator Streams are enabled

2026-04-19 03:54:33 +02:00

scripts

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

seastar @ 4d268e0ef5

Revert "Update seastar submodule"

2026-04-19 15:14:48 +03:00

service

raft/group0: fix destroy assertion on startup failure

2026-05-04 11:25:46 +02:00

sstables

sstables: only wipe TemporaryHashes for sstable formats that have it

2026-04-29 08:06:36 +03:00

streaming

Merge 'streaming: add oos protection in mutation based streaming' from Łukasz Paszkowski

2026-04-20 17:56:36 +03:00

swagger-ui @ 12f1da1082

…

tasks

service: Add virtual task for vnodes-to-tablets migrations

2026-04-17 20:59:05 +03:00

test

test: fix replica_read_timeout_no_exception flakiness on slow systems

2026-05-05 15:13:13 +03:00

tools

compaction: Restrict tombstone GC sstable set to repaired sstables for tombstone_gc=repair mode

2026-04-20 16:59:09 -03:00

tracing

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

transport

Merge 'audit: set audit_info for native-protocol BATCH messages' from Andrzej Jackowski

2026-04-22 18:56:28 +02:00

types

Merge 'query: result_set: change row member to a chunked vector' from Benny Halevy

2026-04-15 14:40:15 +03:00

unified

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

utils

utils/error_injection: add waiters() API

2026-04-30 11:45:12 +02:00

vector_search

vector_search: decrease default connection timeout to 3s

2026-04-17 12:26:39 +03:00

.clang-format

…

.dockerignore

…

.gitattributes

…

.gitignore

…

.gitmodules

…

.gitorderfile

…

.mailmap

…

absl-flat_hash_map.cc

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

absl-flat_hash_map.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

AGENTS.md

tree: add AGENTS.md router and improve AI instruction files

2026-04-19 21:59:52 +03:00

amplify.yml

…

backlog_controller_fwd.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

backlog_controller.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

build_mode.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

bytes_fwd.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

bytes_ostream.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

bytes.cc

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

bytes.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

cartesian_product.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

client_data.cc

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

client_data.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

clocks-impl.cc

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

clocks-impl.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

CMakeLists.txt

Merge 'Introduce maintenance scheduling supergroup and do initial population' from Pavel Emelyanov

2026-04-12 00:34:48 +03:00

configure.py

Merge 'table_helper: fix use-after-free on prepared-statement invalidation' from Marcin Maliszkiewicz

2026-05-04 17:21:05 +02:00

CONTRIBUTING.md

…

coverage_excludes.txt

…

coverage_sources.list

…

db_clock.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

debug.cc

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

debug.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

default.nix

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

Doxyfile

…

encoding_stats.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

enum_set.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

exported_templates.cc

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

exported_templates.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

fix_system_distributed_tables.py

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

flake.lock

…

flake.nix

…

gc_clock.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

gdbinit

…

gen_segmented_compress_params.py

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

HACKING.md

…

hashing_partition_visitor.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

idl-compiler.py

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

inet_address_vectors.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

init.cc

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

init.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

install-dependencies.sh

build: add slirp4netns to dependencies

2026-03-05 17:44:17 +02:00

install.sh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

LICENSE-ScyllaDB-Source-Available.md

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

main.cc

audit: assert storage ordering invariants at runtime

2026-04-28 18:58:49 +02:00

marshal_exception.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

mutation_query.cc

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

mutation_query.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

NOTICE.txt

…

ORIGIN

…

partition_builder.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

partition_range_compat.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

partition_slice_builder.cc

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

partition_slice_builder.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

query_ranges_to_vnodes.cc

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

query_ranges_to_vnodes.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

reader_concurrency_semaphore_group.cc

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

reader_concurrency_semaphore_group.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

reader_concurrency_semaphore.cc

reader_concurrency_semaphore: drop unused stop_ext_{pre,post}()

2026-04-15 14:40:15 +03:00

reader_concurrency_semaphore.hh

reader_concurrency_semaphore: drop unused stop_ext_{pre,post}()

2026-04-15 14:40:15 +03:00

reader_permit.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

README.md

docs: fix link to docker build README.MD

2026-02-18 12:12:46 +01:00

real_dirty_memory_accounter.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

release.cc

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

release.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

reversibly_mergeable.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

schema_upgrader.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

scylla_post_install.sh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

scylla-gdb.py

scylla-gdb: fix compaction-tasks command for intrusive list

2026-04-29 13:11:13 +03:00

SCYLLA-VERSION-GEN

Update ScyllaDB version to: 2026.3.0-dev

2026-04-26 15:30:13 +03:00

seastarx.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

serialization_visitors.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

serializer_impl.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

serializer.cc

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

serializer.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

service_permit.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

shell.nix

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

sstable_dict_autotrainer.cc

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

sstable_dict_autotrainer.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

sstables_loader.cc

sstables_loader: prevent use-after-free on table drop during streaming

2026-04-20 07:39:51 +03:00

sstables_loader.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

stdafx.cc

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

stdafx.hh

build: drop utils/rolling_max_tracker.hh from precompiled header

2026-04-22 15:46:50 +03:00

supervisor.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

table_helper.cc

test/boost: add regression test for table_helper insert() UAF

2026-04-30 11:45:12 +02:00

table_helper.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

test.py

test: add --keep-duplicates and assign RUN_ID via shared cache

2026-04-29 02:36:05 +00:00

timeout_config.cc

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

timeout_config.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

tombstone_gc_extension.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

tombstone_gc_options.cc

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

tombstone_gc_options.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

tombstone_gc-internals.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

tombstone_gc.cc

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

tombstone_gc.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

ubsan-suppressions.supp

…

unimplemented.cc

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

unimplemented.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

validation.cc

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

validation.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

version.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

view_info.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

vint-serialization.cc

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

vint-serialization.hh

LICENSE: Update to version 1.1

2026-04-12 19:46:33 +03:00

README.md

Scylla

What is Scylla?

Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB. Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.

For more information, please see the ScyllaDB web site.

Build Prerequisites

Scylla is fairly fussy about its build environment: it requires a very recent C++23 compiler and recent versions of many libraries. The document HACKING.md includes detailed information on building and developing Scylla, but to get Scylla building quickly on (almost) any build machine, Scylla offers a frozen toolchain. This is a pre-configured Docker image which includes recent versions of all the required compilers, libraries and build tools. With the frozen toolchain you do not need to change anything on your build machine to meet Scylla's requirements; you only need to meet the frozen toolchain's prerequisites (mostly, Docker or Podman being available).

Building Scylla

Building Scylla with dbuild, the frozen toolchain's wrapper script, is as easy as:

$ git submodule update --init --force --recursive
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla

For further information, please see:

- Developer documentation for more information on building Scylla.
- Build documentation on how to build Scylla binaries, tests, and packages.
- Docker image build documentation for information on how to build Docker images.

Running Scylla

To start the Scylla server, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1

This will start a Scylla node with one CPU core allocated to it and data files stored in the tmp directory. The --developer-mode flag is needed to disable the various checks Scylla performs at startup to ensure the machine is configured for maximum performance (these checks are not relevant on development workstations). Note that if you built Scylla with the frozen toolchain, you also need to run it with dbuild.

For more run options, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --help

Testing

See test.py manual.

Scylla APIs and compatibility

By default, Scylla is compatible with Apache Cassandra and its API - CQL. There is also support for the API of Amazon DynamoDB™, which needs to be enabled and configured in order to be used. For more information on how to enable the DynamoDB™ API in Scylla, and the current compatibility of this feature as well as Scylla-specific extensions, see Alternator and Getting started with Alternator.

Documentation

Documentation can be found here. Seastar documentation can be found here. User documentation can be found here.

Training

Training material and online courses can be found at Scylla University. The courses are free, self-paced and include hands-on examples. They cover a variety of topics including Scylla data modeling, administration, architecture, basic NoSQL concepts, using drivers for application development, Scylla setup, failover, compactions, multi-datacenters and how Scylla integrates with third-party applications.

Contributing to Scylla

If you want to report a bug or submit a pull request or a patch, please read the contribution guidelines.

If you are a developer working on Scylla, please read the developer guidelines.

Contact

- The community forum and Slack channel are for users to discuss configuration, management, and operations of ScyllaDB.
- The developers mailing list is for developers and people interested in following the development of ScyllaDB to discuss technical topics.

Languages

C++ 72.5%

Python 26.2%

CMake 0.4%

GAP 0.3%

Shell 0.3%