Commit ce00d61917 ("db: implement large_data virtual tables with feature flag gating") changed these two tests to construct their mutation with a randomly generated partition key (simple_schema::make_pkey()) instead of the previously fixed pk "pv", with the comment that this avoids a "Failed to generate sharding metadata" error. simple_schema::make_pkey() delegates to tests::generate_partition_key(), which defaults to key_size{1, 128}, i.e. the partition key length is uniformly random in [1, 128] bytes.

That interacts badly with the fact that both tests pick thresholds at exact byte boundaries of the MC sstable row encoding:

- The large-data handler records a row's size as _data_writer->offset() - current_pos (sstables/mx/writer.cc: collect_row_stats()), i.e. the number of bytes the row took on disk.
- For the first clustering row, the body includes a vint-encoded prev_row_size = pos - _prev_row_start.
- _prev_row_start is captured at the start of the partition (consume_new_partition()) before the partition key is written to the data stream, so prev_row_size rolls in the partition key's serialized length (2-byte prefix + pk bytes) + deletion_time + static row size.

A random-size partition key therefore perturbs the first clustering row's encoded size by 1-2 bytes across runs (the vint of prev_row_size crosses the 128 boundary), flipping the test's byte-exact threshold comparison. On seed 2104744000 this produced:

    critical check row_size_count == expected.size() has failed [3 != 2]

Fix the two byte-exact-sensitive tests by reverting their partition key to the fixed s.new_mutation("pv") used before ce00d61917. Under smp=1 (which these tests run with, per -c1 in the test invocation) a fixed key is always shard-local, so no sharding-metadata issue arises here. The other tests modified by ce00d61917 (test_sstable_log_too_many_rows, test_sstable_log_too_many_dead_rows, test_sstable_too_many_collection_elements, test_large_data_records_round_trip, etc.) assert on row/element counts or use thresholds with enough slack that the partition key size does not matter, and are left unchanged. Add an explanatory comment to each fixed site so the pitfall is not re-introduced by a future refactor.

Verified stable with:

    ./test.py --mode=dev test/boost/sstable_3_x_test.cc::test_sstable_write_large_row --repeat 100 --max-failures 1
    ./test.py --mode=dev test/boost/sstable_3_x_test.cc::test_sstable_write_large_cell --repeat 100 --max-failures 1
    ./test.py --mode=release test/boost/sstable_3_x_test.cc::test_sstable_write_large_row --repeat 100 --max-failures 1
    ./test.py --mode=release test/boost/sstable_3_x_test.cc::test_sstable_write_large_cell --repeat 100 --max-failures 1

All four invocations: 100/100 passed.

Fixes: SCYLLADB-1685
Closes scylladb/scylladb#29621
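To illustrate the boundary effect, here is a minimal sketch (not Scylla's actual implementation) of how a Cassandra-style unsigned vint grows by a byte when the encoded value crosses a power-of-128 boundary; the helper name vint_size is hypothetical:

    #include <cassert>
    #include <cstdint>

    // Sketch only: number of bytes a Cassandra-style unsigned vint
    // occupies. Values below 2^7 take 1 byte, below 2^14 take 2 bytes,
    // and so on -- each additional leading 1-bit in the first byte
    // signals one more continuation byte (9 bytes covers 64 bits).
    static unsigned vint_size(uint64_t v) {
        unsigned bytes = 1;
        while (bytes < 9 && v >= (1ULL << (7 * bytes))) {
            ++bytes;
        }
        return bytes;
    }

    int main() {
        assert(vint_size(127) == 1);   // just under the boundary
        assert(vint_size(128) == 2);   // crossing 128 adds a byte
        // A randomly sized partition key that pushes prev_row_size
        // across such a boundary shifts the first clustering row's
        // on-disk size by one byte -- enough to flip a byte-exact
        // threshold comparison in the test.
        return 0;
    }

This is why a test asserting an exact on-disk row size is only stable when every input that feeds prev_row_size, including the partition key length, is fixed.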
Scylla
What is Scylla?
Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB. Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.
For more information, please see the ScyllaDB web site.
Build Prerequisites
Scylla is fairly fussy about its build environment, requiring very recent versions of the C++23 compiler and of many libraries to build. The document HACKING.md includes detailed information on building and developing Scylla, but to get Scylla building quickly on (almost) any build machine, Scylla offers a frozen toolchain. This is a pre-configured Docker image which includes recent versions of all the required compilers, libraries and build tools. Using the frozen toolchain allows you to avoid changing anything in your build machine to meet Scylla's requirements - you just need to meet the frozen toolchain's prerequisites (mostly, Docker or Podman being available).
Building Scylla
Building Scylla with the frozen toolchain dbuild is as easy as:
$ git submodule update --init --force --recursive
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla
For further information, please see:
- Developer documentation for more information on building Scylla.
- Build documentation on how to build Scylla binaries, tests, and packages.
- Docker image build documentation for information on how to build Docker images.
Running Scylla
To start the Scylla server, run:
$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1
This will start a Scylla node with one CPU core allocated to it and data files stored in the tmp directory.
The --developer-mode flag is needed to disable the various checks Scylla performs at startup to ensure the machine is configured for maximum performance (not relevant on development workstations).
Please note that you need to run Scylla with dbuild if you built it with the frozen toolchain.
For more run options, run:
$ ./tools/toolchain/dbuild ./build/release/scylla --help
Testing
See test.py manual.
Scylla APIs and compatibility
By default, Scylla is compatible with Apache Cassandra and its API - CQL. There is also support for the API of Amazon DynamoDB™, which needs to be enabled and configured in order to be used. For more information on how to enable the DynamoDB™ API in Scylla, and the current compatibility of this feature as well as Scylla-specific extensions, see Alternator and Getting started with Alternator.
Documentation
Documentation can be found here. Seastar documentation can be found here. User documentation can be found here.
Training
Training material and online courses can be found at Scylla University. The courses are free, self-paced and include hands-on examples. They cover a variety of topics including Scylla data modeling, administration, architecture, basic NoSQL concepts, using drivers for application development, Scylla setup, failover, compactions, multi-datacenters and how Scylla integrates with third-party applications.
Contributing to Scylla
If you want to report a bug or submit a pull request or a patch, please read the contribution guidelines.
If you are a developer working on Scylla, please read the developer guidelines.
Contact
- The community forum and Slack channel are for users to discuss configuration, management, and operations of ScyllaDB.
- The developers mailing list is for developers and people interested in following the development of ScyllaDB to discuss technical topics.