Nadav Har'El 1f15e05946 test: fix replica_read_timeout_no_exception flakiness on slow systems
The test uses a 10ms read timeout to exercise code paths that handle
timed-out reads without throwing C++ exceptions.  As part of setup, it
inserts rows and flushes them to two SSTables, then runs a warm-up
SELECT to populate internal caches (e.g. the auth cache) before the
real test begins.

The reason for this warm-up read was the possibility that the first
read does additional operations (such as reading and caching
authentication) that might throw exceptions internally. I couldn't
verify that such exceptions actually happen in today's code, but
they might (re)appear in the future, so we should keep the warm-up
SELECT.

On slow CI machines (aarch64, debug build), that warm-up SELECT can
take longer than 10ms to read from the two SSTables.  When it does, the
read times out: the coordinator receives 0 responses from the local
replica within the deadline and propagates a read_timeout_exception.
Since the exception is not caught, it escapes the test lambda, is
logged as "cql env callback failed", and causes Boost.Test to report a
C++ failure at the do_with_cql_env_thread call site.  This matches the
CI failure seen in SCYLLADB-1774:

  ERROR ... replica_read_timeout_no_exception: cql env callback failed,
  error: exceptions::read_timeout_exception (Operation timed out for
  replica_read_timeout_no_exception.tbl - received only 0 responses
  from 1 CL=ONE.)

The CI log also shows that only 12 reads were admitted (the warm-up
read plus the 11 reads from the two prepare() calls and CREATE/INSERT
statements made earlier), and the current permit was stuck in
need_cpu state -- the reactor hadn't had a chance to schedule the read
before the 10ms window elapsed.

The fix catches read_timeout_exception from the warm-up SELECT and
retries until the read succeeds. The warm-up is required for
correctness: some lazy-init code paths (e.g. auth cache population)
use C++ exceptions for control flow internally. Those exceptions must
be absorbed before the cxx_exceptions baseline is sampled inside
execute_test(); otherwise they would appear in the delta and cause a
false test failure. Simply ignoring a timed-out warm-up is not safe,
because the lazy-init exceptions would then fire during the 1000 test
reads, inflating cxx_exceptions_after relative to
cxx_exceptions_before.

No other calls in setup are susceptible to the 10ms read timeout:
- CREATE KEYSPACE, CREATE TABLE, INSERT, and flush use the write
  timeout (10s) and are not reads.
- e.prepare() goes through the query processor without reading table
  data, so it is not subject to the read timeout.
- The semaphore manipulation in Test 2 is internal and has no timeout.
- All 1000 reads in execute_test() are expected to fail, so a timeout
  there is the happy path, not a failure.

The 10ms timeout itself is fine for the test's purpose: it is
deliberately aggressive so that reads reliably time out on the hot path
being tested.  The problem was only that the pre-test warm-up was not
guarded against the same timeout.

Fixes: SCYLLADB-1774

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#29731
2026-05-05 15:13:13 +03:00
2026-04-08 12:19:54 +03:00
2026-04-12 19:46:33 +03:00
2026-04-21 18:20:26 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00
2026-04-12 19:46:33 +03:00

Scylla

Slack Twitter

What is Scylla?

Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB. Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.

For more information, please see the ScyllaDB web site.

Build Prerequisites

Scylla is fairly fussy about its build environment, requiring very recent versions of the C++23 compiler and of many libraries to build. The document HACKING.md includes detailed information on building and developing Scylla, but to get Scylla building quickly on (almost) any build machine, Scylla offers a frozen toolchain. This is a pre-configured Docker image which includes recent versions of all the required compilers, libraries and build tools. Using the frozen toolchain allows you to avoid changing anything in your build machine to meet Scylla's requirements - you just need to meet the frozen toolchain's prerequisites (mostly, Docker or Podman being available).

Building Scylla

Building Scylla with the frozen toolchain dbuild is as easy as:

$ git submodule update --init --force --recursive
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla

For further information, please see:

Running Scylla

To start Scylla server, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1

This will start a Scylla node with one CPU core allocated to it and data files stored in the tmp directory. The --developer-mode is needed to disable the various checks Scylla performs at startup to ensure the machine is configured for maximum performance (not relevant on development workstations). Please note that you need to run Scylla with dbuild if you built it with the frozen toolchain.

For more run options, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --help

Testing

Build with the latest Seastar Check Reproducible Build clang-nightly

See test.py manual.

Scylla APIs and compatibility

By default, Scylla is compatible with Apache Cassandra and its API - CQL. There is also support for the API of Amazon DynamoDB™, which needs to be enabled and configured in order to be used. For more information on how to enable the DynamoDB™ API in Scylla, and the current compatibility of this feature as well as Scylla-specific extensions, see Alternator and Getting started with Alternator.

Documentation

Documentation can be found here. Seastar documentation can be found here. User documentation can be found here.

Training

Training material and online courses can be found at Scylla University. The courses are free, self-paced and include hands-on examples. They cover a variety of topics including Scylla data modeling, administration, architecture, basic NoSQL concepts, using drivers for application development, Scylla setup, failover, compactions, multi-datacenters and how Scylla integrates with third-party applications.

Contributing to Scylla

If you want to report a bug or submit a pull request or a patch, please read the contribution guidelines.

If you are a developer working on Scylla, please read the developer guidelines.

Contact

  • The community forum and Slack channel are for users to discuss configuration, management, and operations of ScyllaDB.
  • The developers mailing list is for developers and people interested in following the development of ScyllaDB to discuss technical topics.
Description
No description provided
Readme 397 MiB
Languages
C++ 72.5%
Python 26.2%
CMake 0.4%
GAP 0.3%
Shell 0.3%