
Marcin Maliszkiewicz dfd77a7a9c transport: fix connection code to consume only initially taken semaphore units

The connection's cpu_concurrency_t struct tracks the state of a connection
to manage the admission of new requests and prevent CPU overload during
connection storms. When a connection holds units (allowed only 0 or 1), it is
considered to be in the "CPU state" and contributes to the concurrency limits
used when accepting new connections.

The bug stems from the fact that `counted_data_source_impl::get` and
`counted_data_sink_impl::put` calls can interleave during execution. This
occurs because of `should_parallelize` and `_ready_to_respond`, the latter being
a future chain that can run in the background while requests are being read.
Consequently, while reading request (N), the system may concurrently be
writing the response for request (N-1) on the same connection.

This interleaving allows `return_all()` to be called twice before the
subsequent `consume_units()` is invoked. While the second `return_all()` call
correctly returns 0 units, the matching `consume_units()` call would
mistakenly take an extra unit from the semaphore. Over time, a connection
blocked on a read operation could end up holding an unreturned semaphore
unit. If this pattern repeats across multiple connections, the semaphore
units are eventually depleted, preventing the server from accepting any
new connections.

The fix ensures that we always consume the exact number of units that were
previously returned. With this change, interleaved operations behave as
follows:

get() return_all     — returns 1 unit
put() return_all     — returns 0 units
get() consume_units  — takes back 1 unit
put() consume_units  — takes back 0 units

Logically, the networking phase ends when the first network operation
concludes. But more importantly, when a network operation
starts, we no longer hold any units.
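
The accounting described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `toy_semaphore`, `toy_connection`, `consume_units_buggy` and `units_left_after_interleaving` are hypothetical names invented for illustration, and the model does not reproduce the actual code in generic_server.cc or Seastar's semaphore. It only shows why taking back exactly what each `return_all()` returned keeps the semaphore balanced when get() and put() interleave.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for a counting semaphore (hypothetical, not Seastar's type).
struct toy_semaphore {
    std::size_t available;
    void take(std::size_t n) { assert(available >= n); available -= n; }
    void give(std::size_t n) { available += n; }
};

// Hypothetical model of a connection that should hold 0 or 1 unit.
struct toy_connection {
    toy_semaphore& sem;
    std::size_t held = 0;

    // Gives back everything currently held and reports how much that was.
    std::size_t return_all() {
        std::size_t n = held;
        sem.give(n);
        held = 0;
        return n;
    }

    // Assumed shape of the bug: always takes one unit, regardless of
    // what the matching return_all() actually returned.
    void consume_units_buggy() {
        sem.take(1);
        held += 1;
    }

    // Assumed shape of the fix: takes back exactly the returned amount.
    void consume_units(std::size_t n) {
        sem.take(n);
        held += n;
    }
};

// Replays the interleaving from the commit message once and returns how
// many units remain in the semaphore afterwards.
std::size_t units_left_after_interleaving(bool fixed, std::size_t initial) {
    toy_semaphore sem{initial};
    toy_connection conn{sem};
    conn.consume_units(1);                         // connection enters the "CPU state"

    std::size_t get_returned = conn.return_all();  // get(): returns 1 unit
    std::size_t put_returned = conn.return_all();  // put(): returns 0 units

    if (fixed) {
        conn.consume_units(get_returned);          // get(): takes back 1 unit
        conn.consume_units(put_returned);          // put(): takes back 0 units
    } else {
        conn.consume_units_buggy();                // get(): takes 1 unit
        conn.consume_units_buggy();                // put(): takes an extra unit
    }
    return sem.available;
}
```

Starting from a semaphore of 4 units, `units_left_after_interleaving(true, 4)` leaves 3 units available (the connection holds exactly 1), while `units_left_after_interleaving(false, 4)` leaves only 2, because the second consume took a unit that was never returned.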

Other solutions are possible but the chosen one seems to be the
simplest and safest to backport.

Fixes SCYLLADB-485

(cherry picked from commit 0376d16)

2026-02-19 16:33:23 +01:00

.github

.github/workflows/backport-pr-fixes-validation: support Atlassian URL format in backport PR fixes validation

2026-01-27 16:06:44 +02:00

abseil @ d7aaad83b4

…

alternator

alternator: use storage_proxy from the correct shard in executor::delete_table

2026-02-18 13:01:05 +02:00

api

Merge '[Backport 2025.3] api: storage_service: tasks: unify sync and async compaction APIs' from Scylladb[bot]

2026-01-08 17:56:18 +02:00

audit

audit: introduce debug level logs on happy path

2025-06-27 16:27:27 +02:00

auth

auth: add query_state parameter to query functions

2025-10-09 12:48:07 +00:00

bin

…

cdc

cdc: check if recreating a column too soon

2025-11-16 10:03:07 +01:00

cmake

build: cmake: Use LINKER: prefix for consistent linker option handling

2025-06-25 11:17:15 +03:00

compaction

Merge '[Backport 2025.3] compaction: ensure that all compaction executors are stopped' from Scylladb[bot]

2025-09-26 13:20:52 +03:00

conf

scylla.yaml: add recommended value for stream_io_throughput_mb_per_sec

2025-08-01 15:02:01 +03:00

cql3

cql3: Make abstract_type explicitly noncopyable

2025-11-13 11:51:22 +01:00

data_dictionary

…

mv: don't mark the view as built if the reader produced no partitions

2026-02-18 13:03:05 +02:00

debug

…

dht

interval: reduce sizeof

2025-06-14 21:29:43 +03:00

dist

Make scylla_io_setup detect request size for best write IOPS

2025-10-27 16:53:44 +03:00

docs

doc: fix the default compaction strategy for Materialized Views

2026-01-21 06:45:36 +02:00

ent

encryption::kms_host: Add exponential backoff-retry for 503 errors

2025-11-17 11:48:42 +00:00

exceptions

transport: storage_proxy: release ERM when waiting for query timeout

2025-04-23 09:29:47 +02:00

gms

gossiper: add_saved_endpoint: make generations of excluded nodes negative

2026-01-13 16:03:22 +01:00

idl

direct_failure_detector: pass timeout to direct_fd_ping verb

2025-12-07 14:57:10 +00:00

index

interval: rename start() to start_ref() (and end() etc).

2025-06-14 21:26:16 +03:00

lang

…

licenses

…

locator

load_sketch: Allow populating load_sketch with normalized current load

2026-01-16 13:50:16 +01:00

message

direct_failure_detector: run direct failure detector in the gossiper scheduling group

2025-12-09 17:07:12 +02:00

mutation

Merge 'sstables/mx/writer: handle non-full prefix row keys' from Botond Dénes

2025-06-29 18:18:36 +03:00

mutation_writer

replica: Fix split compaction when tablet boundaries change

2025-09-29 20:29:05 -03:00

node_ops

storage_service: change node_ops_info::ignore_nodes to host id

2025-09-26 10:53:47 +02:00

pgo

Update pgo profiles - aarch64

2026-02-15 05:15:44 +02:00

raft

raft: server_impl: use named gate

2025-04-12 11:28:48 +03:00

readers

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

redis

generic_server: Two-step connection shutdown.

2025-08-18 15:46:46 +02:00

reloc

…

repair

service: pass topology guard to RBNO

2026-01-22 11:00:10 +01:00

replica

database: truncate_table_on_all_shards: drop outdated TODO comment

2026-01-09 08:35:00 +02:00

rust

…

schema

schema: speculative_retry: update exception type for sstring ops

2025-11-13 19:44:43 +00:00

scripts

docs: expose alternator metrics

2025-09-01 09:10:41 +03:00

seastar @ 04d7c63acc

Update seastar submodule (assorted fixes for S3 client update)

2026-02-03 11:40:47 +03:00

service

Merge '[Backport 2025.3] service: pass topology guard to RBNO' from Scylladb[bot]

2026-02-18 12:58:10 +02:00

sstables

scylla-sstable: correctly dump sharding_metadata

2025-11-16 15:43:35 +02:00

streaming

streaming: Enclose potential throws in try block and ensure sink close before logging

2025-09-21 18:11:43 +03:00

swagger-ui @ 12f1da1082

…

tasks

tasks: change _finished_children type

2025-08-06 07:36:04 +03:00

test

mv: don't mark the view as built if the reader produced no partitions

2026-02-18 13:03:05 +02:00

tools

tools: toolchain: prepare: replace 'reg' with 'skopeo'

2025-11-24 16:32:04 +02:00

tracing

tracing: trace_keyspace_helper: use named gate

2025-04-12 11:29:48 +03:00

transport

transport: call update_scheduling_group for non-auth connections

2025-10-30 18:38:43 +01:00

types

cql3: Make abstract_type explicitly noncopyable

2025-11-13 11:51:22 +01:00

unified

…

utils

s3_client: limit multipart upload concurrency

2026-02-18 12:56:56 +02:00

.clang-format

…

.dockerignore

…

.gitattributes

…

.gitignore

…

.gitmodules

Update seastar submodule

2025-10-28 13:10:12 +03:00

.gitorderfile

…

.mailmap

…

absl-flat_hash_map.cc

…

absl-flat_hash_map.hh

…

amplify.yml

…

backlog_controller.hh

…

build_mode.hh

…

bytes_fwd.hh

…

bytes_ostream.hh

…

bytes.cc

…

bytes.hh

…

cache_temperature.hh

…

cartesian_product.hh

…

cell_locking.hh

…

client_data.cc

…

client_data.hh

…

clocks-impl.cc

…

clocks-impl.hh

…

clustering_bounds_comparator.hh

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

clustering_interval_set.hh

…

clustering_key_filter.hh

…

clustering_ranges_walker.hh

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

CMakeLists.txt

tools: add patchelf utility

2025-06-30 07:24:05 +03:00

collection_mutation.cc

…

collection_mutation.hh

…

column_computation.hh

…

combine.hh

…

compound_compat.hh

…

compound.hh

compound: optimize is_full() for single-component types

2025-06-23 09:38:45 +03:00

compress.cc

main: Validate SSTable compression options from config

2025-10-20 09:28:13 +03:00

compress.hh

main: Validate SSTable compression options from config

2025-10-20 09:28:13 +03:00

concrete_types.hh

…

configure.py

utils: Introduce helper for replicated data structures

2025-12-04 14:50:31 +01:00

CONTRIBUTING.md

…

converting_mutation_partition_applier.cc

…

converting_mutation_partition_applier.hh

…

counters.cc

…

counters.hh

…

coverage_excludes.txt

…

coverage_sources.list

…

cql_serialization_format.hh

…

db_clock.hh

…

debug.cc

…

debug.hh

…

default.nix

…

Doxyfile

…

duration.cc

…

duration.hh

…

encoding_stats.hh

…

enum_set.hh

…

fix_system_distributed_tables.py

…

flake.lock

…

flake.nix

…

frozen_schema.cc

Revert "Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz"

2025-06-16 22:38:12 +03:00

frozen_schema.hh

Revert "Merge 'Atomic in-memory schema changes application' from Marcin Maliszkiewicz"

2025-06-16 22:38:12 +03:00

full_position.hh

…

gc_clock.hh

…

gdbinit

…

gen_segmented_compress_params.py

…

generic_server.cc

transport: fix connection code to consume only initially taken semaphore units

2026-02-19 16:33:23 +01:00

generic_server.hh

generic_server: Two-step connection shutdown.

2025-08-18 15:46:46 +02:00

HACKING.md

build: replace tools/java submodule with packaged cassandra-stress

2025-04-15 10:11:28 +03:00

hashing_partition_visitor.hh

…

idl-compiler.py

idl-compiler.py: generate skip() definition for enums serializers

2025-06-24 11:05:31 +03:00

inet_address_vectors.hh

…

init.cc

compress: distribute compression dictionaries over shards

2025-05-07 14:43:18 +02:00

init.hh

…

install-dependencies.sh

install-dependencies.sh: update node_exporter to 1.10.2

2025-11-14 10:28:56 +02:00

install.sh

…

interval.hh

interval: reduce sizeof

2025-06-14 21:29:43 +03:00

keys.cc

keys: from_nodetool_style_string don't split single partition keys

2025-09-01 15:36:56 +03:00

keys.hh

…

LICENSE-ScyllaDB-Source-Available.md

…

main.cc

direct_failure_detector: run direct failure detector in the gossiper scheduling group

2025-12-09 17:07:12 +02:00

map_difference.hh

…

marshal_exception.hh

…

multishard_mutation_query.cc

readers/mutation_source: s/make_reader_v2/make_mutation_reader/

2025-05-09 07:53:29 -04:00

multishard_mutation_query.hh

…

mutation_query.cc

…

mutation_query.hh

…

NOTICE.txt

…

ORIGIN

…

partition_builder.hh

…

partition_range_compat.hh

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

partition_slice_builder.cc

…

partition_slice_builder.hh

…

partition_snapshot_reader.hh

…

protocol_server.hh

…

querier.cc

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

querier.hh

readers/mutation_source: s/make_reader_v2/make_mutation_reader/

2025-05-09 07:53:29 -04:00

query_id.hh

…

query_ranges_to_vnodes.cc

interval: rename start_ref() back to start() (and end_ref() etc).

2025-06-14 21:26:16 +03:00

query_ranges_to_vnodes.hh

…

query_result_merger.hh

…

query-request.hh

mapreduce: add shard_id_hint to mapreduce request

2025-06-25 19:23:07 +02:00

query-result-reader.hh

…

query-result-set.cc

…

query-result-set.hh

…

query-result-writer.hh

…

query-result.hh

…

query.cc

mapreduce: add missing comma and space in mapreduce_request operator<<

2025-06-25 19:23:07 +02:00

reader_concurrency_semaphore_group.cc

…

reader_concurrency_semaphore_group.hh

…

reader_concurrency_semaphore.cc

reader_concurrency_semaphore: improve handling of base resources

2026-01-21 06:45:13 +02:00

reader_concurrency_semaphore.hh

reader_concurrency_semaphore: use named gate

2025-04-12 11:28:48 +03:00

reader_permit.hh

…

README.md

…

real_dirty_memory_accounter.hh

…

release.cc

release: adjust doc_link() for the post source-available world

2025-10-03 14:28:44 +00:00

release.hh

…

reversibly_mergeable.hh

…

schema_mutations.cc

…

schema_mutations.hh

…

schema_upgrader.hh

…

scylla_post_install.sh

…

scylla-gdb.py

scylla-gdb: Fix fair-queue entry printing

2025-09-30 11:29:06 +03:00

SCYLLA-VERSION-GEN

Update ScyllaDB version to: 2025.3.8

2026-02-10 22:43:29 +02:00

seastarx.hh

…

serialization_visitors.hh

…

serializer_impl.hh

serializer_impl.hh: add as_input_stream(managed_bytes_view) overload

2025-05-13 10:32:32 +02:00

serializer.cc

…

serializer.hh

…

service_permit.hh

…

shell.nix

…

sstable_dict_autotrainer.cc

compress: distribute compression dictionaries over shards

2025-05-07 14:43:18 +02:00

sstable_dict_autotrainer.hh

…

sstables_loader.cc

streaming:: add more logging

2025-12-02 12:13:21 +01:00

sstables_loader.hh

streaming:: add more logging

2025-12-02 12:13:21 +01:00

supervisor.hh

…

table_helper.cc

…

table_helper.hh

…

test.py

test.py: add bypassing x_log2_compaction_groups to boost tests

2025-08-25 15:15:30 +02:00

timeout_config.cc

…

timeout_config.hh

…

timestamp.hh

…

tombstone_gc_extension.hh

…

tombstone_gc_options.cc

…

tombstone_gc_options.hh

…

tombstone_gc-internals.hh

…

tombstone_gc.cc

…

tombstone_gc.hh

…

ubsan-suppressions.supp

…

unimplemented.cc

…

unimplemented.hh

…

validation.cc

…

validation.hh

…

version.hh

…

view_info.hh

base_info: remove the lw_shared_ptr variant

2025-04-24 01:08:40 +02:00

vint-serialization.cc

…

vint-serialization.hh

…

README.md

Scylla

What is Scylla?

Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB. Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.

For more information, please see the ScyllaDB web site.

Build Prerequisites

Scylla is fairly fussy about its build environment, requiring a very recent C++23 compiler and recent versions of many libraries to build. The document HACKING.md includes detailed information on building and developing Scylla, but to get Scylla building quickly on (almost) any build machine, Scylla offers a frozen toolchain. This is a pre-configured Docker image which includes recent versions of all the required compilers, libraries and build tools. Using the frozen toolchain allows you to avoid changing anything in your build machine to meet Scylla's requirements - you just need to meet the frozen toolchain's prerequisites (mostly, Docker or Podman being available).

Building Scylla

Building Scylla with the frozen toolchain dbuild is as easy as:

$ git submodule update --init --force --recursive
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla

For further information, please see:

Developer documentation for more information on building Scylla.
Build documentation on how to build Scylla binaries, tests, and packages.
Docker image build documentation for information on how to build Docker images.

Running Scylla

To start the Scylla server, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1

This will start a Scylla node with one CPU core allocated to it and data files stored in the tmp directory. The --developer-mode flag is needed to disable the various checks Scylla performs at startup to ensure the machine is configured for maximum performance (not relevant on development workstations). Please note that you need to run Scylla with dbuild if you built it with the frozen toolchain.

For more run options, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --help

Testing

See test.py manual.

Scylla APIs and compatibility

By default, Scylla is compatible with Apache Cassandra and its API - CQL. There is also support for the API of Amazon DynamoDB™, which needs to be enabled and configured in order to be used. For more information on how to enable the DynamoDB™ API in Scylla, and the current compatibility of this feature as well as Scylla-specific extensions, see Alternator and Getting started with Alternator.

Documentation

Documentation can be found here. Seastar documentation can be found here. User documentation can be found here.

Training

Training material and online courses can be found at Scylla University. The courses are free, self-paced and include hands-on examples. They cover a variety of topics including Scylla data modeling, administration, architecture, basic NoSQL concepts, using drivers for application development, Scylla setup, failover, compactions, multi-datacenters and how Scylla integrates with third-party applications.

Contributing to Scylla

If you want to report a bug or submit a pull request or a patch, please read the contribution guidelines.

If you are a developer working on Scylla, please read the developer guidelines.

Contact

The community forum and Slack channel are for users to discuss configuration, management, and operations of ScyllaDB.
The developers mailing list is for developers and people interested in following the development of ScyllaDB to discuss technical topics.

Languages

C++ 73.5%

Python 25.3%

CMake 0.3%

GAP 0.3%

Shell 0.3%