mirror of https://github.com/scylladb/scylladb.git synced 2026-06-03 13:37:04 +00:00

Michał Chojnowski ed98102c45 row_cache: update _prev_snapshot_pos even if apply_to_incomplete() is preempted

Commit e81fc1f095 accidentally broke the control
flow of row_cache::do_update().

Before that commit, the body of the loop was wrapped in a lambda.
Thus, to break out of the loop, `return` was used.

The bad commit removed the lambda, but didn't update the `return` accordingly.
Thus, since the commit, the statement doesn't just break out of the loop as
intended, but also skips the code after the loop, which updates `_prev_snapshot_pos`
to reflect the work done by the loop.
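
The control-flow difference described above can be sketched in isolation. This is a toy model, not the code of `row_cache::do_update()`: `toy_cache`, `update_with_lambda()`, `update_buggy()`, `update_fixed()` and the `preempted_at` vector are hypothetical names made up for illustration, and only the field name echoes `_prev_snapshot_pos`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the cache; not the real row_cache.
struct toy_cache {
    int prev_snapshot_pos = 0;
};

// Loop body wrapped in a lambda: `return` leaves only the lambda,
// so the bookkeeping after the loop still runs.
int update_with_lambda(toy_cache& c, const std::vector<bool>& preempted_at) {
    int processed = 0;
    bool stop = false;
    for (std::size_t i = 0; i < preempted_at.size() && !stop; ++i) {
        [&] {
            ++processed;
            if (preempted_at[i]) {
                stop = true;
                return; // exits the lambda, not the enclosing function
            }
        }();
    }
    c.prev_snapshot_pos = processed; // post-loop update always reached
    return processed;
}

// Lambda removed, `return` left in place: the post-loop update is skipped.
int update_buggy(toy_cache& c, const std::vector<bool>& preempted_at) {
    int processed = 0;
    for (std::size_t i = 0; i < preempted_at.size(); ++i) {
        ++processed;
        if (preempted_at[i]) {
            return processed; // BUG: exits the whole function
        }
    }
    c.prev_snapshot_pos = processed;
    return processed;
}

// Same loop with `break`: leaves the loop, then records the progress.
int update_fixed(toy_cache& c, const std::vector<bool>& preempted_at) {
    int processed = 0;
    for (std::size_t i = 0; i < preempted_at.size(); ++i) {
        ++processed;
        if (preempted_at[i]) {
            break;
        }
    }
    c.prev_snapshot_pos = processed;
    return processed;
}

// Small self-check: preemption on the second of three entries.
void check_all() {
    const std::vector<bool> p{false, true, false};
    toy_cache a, b, c;
    assert(update_with_lambda(a, p) == 2 && a.prev_snapshot_pos == 2);
    assert(update_buggy(b, p) == 2 && b.prev_snapshot_pos == 0); // stale
    assert(update_fixed(c, p) == 2 && c.prev_snapshot_pos == 2);
}
```

In the toy model all three variants process the same two entries, but only the buggy one leaves the recorded position untouched, which mirrors the stale `_prev_snapshot_pos` described in this message.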

As a result, whenever `apply_to_incomplete()` (the `updater`) is preempted,
`do_update()` fails to update `_prev_snapshot_pos`. It remains in a
stale state, until `do_update()` runs again and either finishes or
is preempted outside of `updater`.

If we read a partition processed by `do_update()` but not covered by
`_prev_snapshot_pos`, we will read stale data (from the previous snapshot),
which will be remembered in the cache as the current data.

This results in outdated data being returned by the replica.
(And perhaps in something worse if range tombstones are involved.
I didn't investigate this possibility in depth.)

Note: for queries with CL>1, occurrences of this bug are likely to be hidden
by reconciliation, because the reconciled query will only see stale data if
the queried partition is affected by the bug on *all* queried replicas
at the time of the query.

Fixes #16759

Closes scylladb/scylladb#17138

2024-02-04 11:17:41 +02:00

.github

.git: add more skip words

2024-01-29 14:37:03 +02:00

alternator

Merge 'alternator: enable tablets by default if experimental feature is enabled' from Nadav Har'El

2024-01-29 09:22:13 +02:00

api

Merge 'compaction_manager: flush tables before cleanup' from Kefu Chai

2024-02-01 13:47:45 +02:00

auth

service/maintenance_mode: move maintenance_socket_enabled definition to seperate file

2024-01-25 15:27:53 +01:00

bin

tools: add cqlsh shortcut

2023-07-12 09:36:59 +03:00

cdc

cdc: not include unused headers

2024-01-11 09:13:37 +02:00

cmake

build: cmake: use # for line comment

2024-01-03 15:05:00 +02:00

compaction

compaction: run rewrite_sstables_compaction_task_executor tasks in maintenance group

2024-02-02 11:18:49 +02:00

conf

Merge 'Add maintenance socket' from Mikołaj Grzebieluch

2023-12-20 19:04:40 +02:00

cql3

cql3: Sanitize ALTER KEYSPACE check for non-local storages

2024-02-02 11:13:29 +02:00

data_dictionary

data_dictionary: Add formatter for keyspace-metadata

2024-02-02 11:26:39 +02:00

db: add formatter for db::write_type

2024-02-01 10:22:45 +02:00

debug

…

dht

db: add formatter for dht::decorated_key and repair_sync_boundary

2024-01-29 11:11:41 +02:00

direct_failure_detector

…

dist

scylla_raid_setup: drop unused import

2024-02-02 15:20:40 +01:00

docs

docs: s/ontop/on top/

2024-02-02 15:20:40 +01:00

exceptions

Typos: fix typos in code

2023-12-05 15:18:11 +02:00

gms

gms: Remove unused operator<< for feature object

2024-02-01 19:00:46 +02:00

idl

storage_service: Implement table_load_stats RPC

2024-01-25 18:36:08 -03:00

index

Merge 'scylla-sstable: add support for loading schema of views and indexes' from Botond Dénes

2024-01-24 23:36:54 +02:00

interface

Typos: fix typos in comments

2023-12-02 22:37:22 +02:00

lang

Typos: fix typos in code

2023-12-05 15:18:11 +02:00

licenses

…

locator

locator/utils: remove stale comment

2024-02-02 11:07:35 +02:00

message

messaging: Add formatter for netw::msg_addr

2024-02-02 15:20:40 +01:00

mutation

treewide: fix misspellings in code comments

2024-01-31 09:16:10 +02:00

mutation_writer

mutation_writer: do not include unused headers

2024-01-24 15:20:02 +02:00

node_ops

token_metadata: drop the template

2023-12-12 23:19:54 +04:00

raft

Merge 'Add an API to trigger snapshot in Raft servers' from Kamil Braun

2024-01-29 15:06:04 +02:00

readers

reader: do not include unused headers

2024-01-29 16:21:42 +02:00

redis

redis: do not include unused headers

2024-01-31 09:17:18 +02:00

reloc

…

repair

treewide: fix misspellings in code comments

2024-01-31 09:16:10 +02:00

replica

replica/database: use structured-bind when appropriate

2024-02-01 16:31:29 +02:00

rust

rust: update dependencies

2023-12-17 13:20:25 +02:00

schema

schema: column_mapping::{static,regular}_column_at(): use on_internal_error()

2024-01-31 05:12:33 -05:00

scripts

Typos: fix typos in code

2023-12-13 10:45:21 +02:00

seastar @ 85359b2866

Update seastar submodule

2024-01-22 11:29:50 +01:00

service

storage_service: Drop unnecessary co_return in raft_topology_cmd_handler

2024-02-02 08:20:06 +08:00

sstables

Remove unnecessary calculations in integrity_checked_file_impl::write_dma.

2024-02-01 13:42:59 +02:00

streaming

streaming: Verify stream consumer runs inside streaming group

2024-02-01 10:37:24 +08:00

swagger-ui @ 12f1da1082

…

tasks

tasks: do not include unused headers

2024-02-02 15:20:40 +01:00

test

Merge 'Move some tablets tests from topology_custom to cql-pytest' from Pavel Emelyanov

2024-02-01 16:28:43 +02:00

thrift

thrift: remove unused namespace definition

2024-01-30 09:16:47 +02:00

tools

tools: lua_sstable_consumer.cc: load os and math libs

2024-02-02 19:00:57 +03:00

tracing

tracing: add formatter for tracing::span_id

2024-01-31 13:43:46 +02:00

transport

transport: do not include unused headers

2024-02-02 11:20:24 +02:00

types

utils: do not include unused headers

2024-01-18 12:50:06 +02:00

unified

Update unified/build_unified.sh

2023-12-05 15:23:38 +02:00

utils

utils: Remove unused operator<< for file_lock object

2024-02-02 15:20:40 +01:00

.dockerignore

…

.gitattributes

…

.gitignore

docs: download iam csv files

2023-10-02 12:28:56 +03:00

.gitmodules

…

.gitorderfile

…

.mailmap

…

absl-flat_hash_map.cc

…

absl-flat_hash_map.hh

…

amplify.yml

…

backlog_controller.hh

treewide: apply codespell to the comments in source code

2023-12-20 10:25:03 +02:00

build_mode.hh

…

bytes_ostream.hh

…

bytes.cc

…

bytes.hh

bytes.hh: correct spelling of delimiter and delimited

2023-12-18 20:46:21 +02:00

cache_flat_mutation_reader.hh

cache_flat_mutation_reader: fix a broken iterator validity guarantee in ensure_population_lower_bound()

2023-11-16 19:01:18 +01:00

cache_temperature.hh

…

cartesian_product.hh

…

cell_locking.hh

…

checked-file-impl.hh

code: Switch to seastar API level 7

2023-06-06 13:29:16 +03:00

client_data.cc

…

client_data.hh

…

clocks-impl.cc

clocks-impl: format time_point using fmt

2023-11-22 17:44:07 +02:00

clocks-impl.hh

…

clustering_bounds_comparator.hh

…

clustering_interval_set.hh

…

clustering_key_filter.hh

…

clustering_ranges_walker.hh

…

CMakeLists.txt

build: cmake: add "mode_list" target

2023-12-24 12:35:02 +08:00

collection_mutation.cc

…

collection_mutation.hh

…

column_computation.hh

Typos: fix typos in code

2023-12-05 15:18:11 +02:00

combine.hh

…

compound_compat.hh

compound_compat: do not format an sstring with {:d}

2023-07-08 15:13:11 +03:00

compound.hh

Typos: fix typos in code

2023-12-05 15:18:11 +02:00

compress.cc

./: not include unused headers

2024-01-17 16:30:14 +02:00

compress.hh

./: not include unused headers

2024-01-17 16:30:14 +02:00

concrete_types.hh

make timestamp string format cassandra compatible

2023-07-27 12:01:09 +03:00

configure.py

Merge 'Fix mintimeuuid() call that could crash Scylla' from Nadav Har'El

2024-02-01 10:48:48 +02:00

CONTRIBUTING.md

…

converting_mutation_partition_applier.cc

…

converting_mutation_partition_applier.hh

…

counters.cc

counters: move fmt::formatter<counter_{shard,cell}_view>::format() to .cc

2023-05-24 09:36:49 +03:00

counters.hh

counters: move fmt::formatter<counter_{shard,cell}_view>::format() to .cc

2023-05-24 09:36:49 +03:00

coverage_excludes.txt

test.py: support code coverage

2024-01-18 11:11:34 +02:00

coverage_sources.list

configure.py support coverage profiles on standrad build modes

2024-01-18 11:11:34 +02:00

cql_serialization_format.hh

…

db_clock.hh

…

debug.cc

…

debug.hh

…

default.nix

…

Doxyfile

…

duration.cc

Typos: fix typos in code

2023-12-05 15:18:11 +02:00

duration.hh

…

encoding_stats.hh

encoding_state: mark helper methods protected

2023-08-29 15:41:13 +03:00

enum_set.hh

…

fix_system_distributed_tables.py

…

flake.lock

…

flake.nix

…

frozen_schema.cc

…

frozen_schema.hh

…

full_position.hh

…

gc_clock.hh

…

gdbinit

…

gen_segmented_compress_params.py

Typos: fix typos in code

2023-12-13 10:45:21 +02:00

generic_server.cc

generic_server: use mutable reference in for_each_gently

2023-11-14 14:25:22 +02:00

generic_server.hh

generic_server: use mutable reference in for_each_gently

2023-11-14 14:25:22 +02:00

HACKING.md

…

hashing_partition_visitor.hh

./: not include unused headers

2024-01-17 16:30:14 +02:00

idl-compiler.py

Typos: fix typos in code

2023-12-13 10:45:21 +02:00

inet_address_vectors.hh

abstract_replication_strategy: calculate_natural_endpoints: make it work with both versions of token_metadata

2023-12-12 23:19:53 +04:00

init.cc

./: not include unused headers

2024-01-17 16:30:14 +02:00

init.hh

Merge 'Typos: fix typos in code' from Yaniv Kaul

2023-12-06 07:36:41 +02:00

install-dependencies.sh

install-dependencies.sh: remove duplicate python3-pyudev package

2024-02-02 15:20:40 +01:00

install.sh

install.sh: use a temporary file when packaging scylla.yaml

2024-01-01 21:50:29 +02:00

interval.hh

interval: make default ctor and make_open_ended_both_sides constexpr

2023-11-06 18:39:53 +01:00

keys.cc

keys: Move exploded_clustering_prefix's operator<< to keys.cc

2023-07-19 11:57:27 +03:00

keys.hh

keys: do not use zip_iterator for printing key components

2023-07-01 23:49:02 +03:00

LICENSE.AGPL

…

log.hh

…

main.cc

directories: prevent inode cache fragmentation by orderly verifying data directory contents

2024-02-01 12:20:23 +05:30

map_difference.hh

./: not include unused headers

2024-01-17 16:30:14 +02:00

marshal_exception.hh

…

multishard_mutation_query.cc

reader_permit: store schema_ptr instead of raw schema pointer

2024-01-11 08:37:56 +02:00

multishard_mutation_query.hh

treewide: apply codespell to the comments in source code

2023-12-20 10:25:03 +02:00

mutation_query.cc

./: not include unused headers

2024-01-17 16:30:14 +02:00

mutation_query.hh

mutation_query: add formatter for reconcilable_result::printer

2023-11-26 20:20:50 +02:00

noexcept_traits.hh

…

NOTICE.txt

…

ORIGIN

…

partition_builder.hh

…

partition_range_compat.hh

./: not include unused headers

2024-01-17 16:30:14 +02:00

partition_slice_builder.cc

partition_slice_builder: add set_specific_ranges()

2023-05-08 07:35:39 -04:00

partition_slice_builder.hh

partition_slice_builder: add set_specific_ranges()

2023-05-08 07:35:39 -04:00

partition_snapshot_reader.hh

…

partition_snapshot_row_cursor.hh

./: not include unused headers

2024-01-17 16:30:14 +02:00

protocol_server.hh

…

querier.cc

Typos: fix typos in code

2023-12-05 15:18:11 +02:00

querier.hh

Typos: fix typos in code

2023-12-05 15:18:11 +02:00

query_id.hh

…

query_ranges_to_vnodes.cc

everywhere: reduce dependencies on i_partitioner.hh

2023-11-05 20:47:44 +02:00

query_ranges_to_vnodes.hh

everywhere: reduce dependencies on i_partitioner.hh

2023-11-05 20:47:44 +02:00

query_result_merger.hh

…

query-request.hh

./: not include unused headers

2024-01-17 16:30:14 +02:00

query-result-reader.hh

./: not include unused headers

2024-01-17 16:30:14 +02:00

query-result-set.cc

./: not include unused headers

2024-01-17 16:30:14 +02:00

query-result-set.hh

…

query-result-writer.hh

…

query-result.hh

treewide: do not mark return value const if this has no effect

2023-11-17 17:46:19 +08:00

query.cc

treewide: use #include <seastar/...> for seastar headers

2023-06-06 08:36:09 +03:00

range.hh

…

read_context.hh

compact and remove expired rows from cache on read

2023-06-26 15:29:01 +02:00

reader_concurrency_semaphore.cc

reader_concurrency_semaphore.cc: move stringstream content instead of copying it

2024-01-31 09:31:50 +02:00

reader_concurrency_semaphore.hh

reader_permit: store schema_ptr instead of raw schema pointer

2024-01-11 08:37:56 +02:00

reader_permit.hh

reader_permit: store schema_ptr instead of raw schema pointer

2024-01-11 08:37:56 +02:00

README.md

…

real_dirty_memory_accounter.hh

real_dirty_memory_accounter: document what the class is doing

2023-05-23 09:11:31 +03:00

release.cc

…

release.hh

…

reversibly_mergeable.hh

…

row_cache.cc

row_cache: update _prev_snapshot_pos even if apply_to_incomplete() is preempted

2024-02-04 11:17:41 +02:00

row_cache.hh

Merge 'row_cache: abort on exteral_updater::execute errors' from Benny Halevy

2023-10-31 10:07:01 +02:00

schema_mutations.cc

schema_mutations, migration_manager: Ignore empty partitions in per-table digest

2023-07-03 23:06:55 +02:00

schema_mutations.hh

schema_mutations, migration_manager: Ignore empty partitions in per-table digest

2023-07-03 23:06:55 +02:00

schema_upgrader.hh

…

scylla_post_install.sh

dist: drop legacy control group parameters

2023-12-11 19:38:28 +09:00

scylla-gdb.py

reader_permit: store schema_ptr instead of raw schema pointer

2024-01-11 08:37:56 +02:00

SCYLLA-VERSION-GEN

Typos: fix typos in code

2023-12-13 10:45:21 +02:00

seastarx.hh

…

serialization_visitors.hh

…

serializer_impl.hh

./: not include unused headers

2024-01-17 16:30:14 +02:00

serializer.cc

…

serializer.hh

./: not include unused headers

2024-01-17 16:30:14 +02:00

service_permit.hh

…

setup.py

…

shell.nix

…

sstables_loader.cc

sstables_loader: load_new_sstables: auto-enable load-and-stream for tablets

2024-01-16 18:43:52 +02:00

sstables_loader.hh

…

supervisor.hh

…

table_helper.cc

keyspace_metadata: Add default value for new_keyspace's durable_writes

2023-12-26 11:47:37 +03:00

table_helper.hh

Typos: fix typos in code

2023-12-05 15:18:11 +02:00

test.py

test.py: add boost_tests() to suite

2024-01-31 13:43:21 +02:00

timeout_config.cc

./: not include unused headers

2024-01-17 16:30:14 +02:00

timeout_config.hh

./: not include unused headers

2024-01-17 16:30:14 +02:00

timestamp.hh

…

tombstone_gc_extension.hh

…

tombstone_gc_options.cc

…

tombstone_gc_options.hh

…

tombstone_gc.cc

./: not include unused headers

2024-01-17 16:30:14 +02:00

tombstone_gc.hh

./: not include unused headers

2024-01-17 16:30:14 +02:00

tox.ini

…

ubsan-suppressions.supp

…

unimplemented.cc

unimplemented: add format_as() for unimplemented::cause

2024-01-19 08:38:30 +02:00

unimplemented.hh

./: not include unused headers

2024-01-17 16:30:14 +02:00

validation.cc

…

validation.hh

…

version.hh

…

view_info.hh

everywhere: reduce dependencies on i_partitioner.hh

2023-11-05 20:47:44 +02:00

vint-serialization.cc

./: not include unused headers

2024-01-17 16:30:14 +02:00

vint-serialization.hh

Typos: fix typos in code

2023-12-05 15:18:11 +02:00

zstd.cc

./: not include unused headers

2024-01-17 16:30:14 +02:00

README.md

Scylla

What is Scylla?

Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB. Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.

For more information, please see the ScyllaDB web site.

Build Prerequisites

Scylla is fairly fussy about its build environment, requiring very recent versions of the C++20 compiler and of many libraries to build. The document HACKING.md includes detailed information on building and developing Scylla, but to get Scylla building quickly on (almost) any build machine, Scylla offers a frozen toolchain. This is a pre-configured Docker image which includes recent versions of all the required compilers, libraries and build tools. Using the frozen toolchain allows you to avoid changing anything in your build machine to meet Scylla's requirements - you just need to meet the frozen toolchain's prerequisites (mostly, Docker or Podman being available).

Building Scylla

Building Scylla with the frozen toolchain dbuild is as easy as:

$ git submodule update --init --force --recursive
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla

For further information, please see:

Developer documentation for more information on building Scylla.
Build documentation on how to build Scylla binaries, tests, and packages.
Docker image build documentation for information on how to build Docker images.

Running Scylla

To start the Scylla server, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1

This will start a Scylla node with one CPU core allocated to it and data files stored in the tmp directory. The --developer-mode option is needed to disable the various checks Scylla performs at startup to ensure the machine is configured for maximum performance (not relevant on development workstations). Please note that you need to run Scylla with dbuild if you built it with the frozen toolchain.

For more run options, run:

$ ./tools/toolchain/dbuild ./build/release/scylla --help

Testing

See test.py manual.

Scylla APIs and compatibility

By default, Scylla is compatible with Apache Cassandra and its APIs - CQL and Thrift. There is also support for the API of Amazon DynamoDB™, which needs to be enabled and configured in order to be used. For more information on how to enable the DynamoDB™ API in Scylla, and the current compatibility of this feature as well as Scylla-specific extensions, see Alternator and Getting started with Alternator.

Documentation

Documentation can be found here. Seastar documentation can be found here. User documentation can be found here.

Training

Training material and online courses can be found at Scylla University. The courses are free, self-paced and include hands-on examples. They cover a variety of topics including Scylla data modeling, administration, architecture, basic NoSQL concepts, using drivers for application development, Scylla setup, failover, compactions, multi-datacenters and how Scylla integrates with third-party applications.

Contributing to Scylla

If you want to report a bug or submit a pull request or a patch, please read the contribution guidelines.

If you are a developer working on Scylla, please read the developer guidelines.

Contact

The community forum and Slack channel are for users to discuss configuration, management, and operations of open-source ScyllaDB.
The developers mailing list is for developers and people interested in following the development of ScyllaDB to discuss technical topics.

Languages

C++ 72.2%

Python 26.6%

CMake 0.3%

GAP 0.3%

Shell 0.3%