scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-01 12:36:56 +00:00

Tomasz Grabiec bbfa52822e row_cache: Switch readers to use per-entry snapshots

Currently readers always use the latest snapshot. This respects write
atomicity as long as partitions are fully continuous in cache, which is
the case now, but it will break write atomicity once partial population
is allowed.

Consider the following case:

  flush write(ck=1), write(ck=2) -> snapshot_1
  cache reader 1 reads and inserts ck=1 @snapshot_1
  flush write(ck=1), write(ck=2) -> snapshot_2
  cache reader 2 reads and inserts ck=2 @snapshot_2

Because cache update is not atomic, it can happen that reader 2 will
complete while the partition hasn't been updated yet for snapshot_2.
In such a case, after reader 2 completes, the partition would contain
ck=1 from snapshot_1 and ck=2 from snapshot_2. It would match neither
of the snapshots, and this could violate write atomicity.
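
The mixed state described above can be illustrated with a small
stand-alone model. This is only a sketch: partition_image and
consistent_with() are hypothetical names invented for illustration and
are not part of row_cache. A partition is modeled as a map from
clustering key to cell value, and the cached image is compared against
each snapshot.

```cpp
#include <map>
#include <string>

// Hypothetical model: a partition image maps clustering key -> cell value.
using partition_image = std::map<int, std::string>;

// Returns true if every row present in `cached` has the same value in
// `snapshot`. Rows missing from `cached` are acceptable, because partial
// population caches only a subset of the partition.
inline bool consistent_with(const partition_image& cached,
                            const partition_image& snapshot) {
    for (const auto& row : cached) {
        auto it = snapshot.find(row.first);
        if (it == snapshot.end() || it->second != row.second) {
            return false;
        }
    }
    return true;
}
```

With snapshot_1 = {1: "v1", 2: "v1"} and snapshot_2 = {1: "v2", 2: "v2"},
a cached image of {1: "v1", 2: "v2"} is consistent with neither
snapshot, which is the situation the patch sets out to prevent.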

To solve this problem, we conceptually assign to each partition key in
the ring the snapshot which it currently reflects. The update process
gradually converts entries, in ring order, to the new snapshot. Reads
no longer use the latest snapshot, but rather the current snapshot for
the position in the ring they are at.
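
A minimal sketch of this rule, assuming ring positions are plain
integers and snapshots are identified by increasing phase numbers.
update_state, phase_of() and the field names are hypothetical and only
model the idea; they are not the row_cache API.

```cpp
#include <cassert>
#include <cstdint>

using ring_position = std::int64_t;  // assumption: integer ring positions
using phase_t = std::uint64_t;       // assumption: one number per snapshot

// State of an in-progress cache update (hypothetical model).
struct update_state {
    phase_t old_phase;       // snapshot entries reflected before the update
    phase_t new_phase;       // snapshot entries reflect once converted
    bool any_converted;      // false until the first entry is converted
    ring_position frontier;  // last position already converted
};

// Returns the snapshot (phase) which a read at `pos` should use: the new
// one for positions the update has already passed, the old one otherwise.
inline phase_t phase_of(const update_state& u, ring_position pos) {
    assert(u.new_phase > u.old_phase);
    if (u.any_converted && pos <= u.frontier) {
        return u.new_phase;
    }
    return u.old_phase;
}
```

As the update advances its frontier through the ring, the result of
phase_of() for a fixed position flips from the old phase to the new one
exactly once.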

There is a race between the update process and populating reads. Since
after the update all entries must reflect the new snapshot, reads
using the old snapshot cannot be allowed to insert data which can no
longer be reached by the update process. Before this patch this race
was prevented by the use of a phased_barrier, where readers would keep
phased_barrier::operation alive between starting a read of a partition
and inserting it into cache. Cache update was waiting for all prior
operations before starting the update. Any later read which was not
waited for would use the latest snapshot for reads, so the update
process didn't have to fix anything up for such reads.

After this change, later reads cannot always use the latest snapshot;
they have to use the snapshot corresponding to the given entry. So it's
not enough for update() to wait for prior reads in order to prevent
stale populations. The (simple) solution implemented in this patch is
to detect the conflict and abandon population of the given sub-range.
In general, reads are allowed to populate a given range only if it
belongs to a single snapshot.

Note that the range here is not the whole query range. For population
of continuity, it is the range starting after the previous key and
ending after the key being inserted. When populating a partition
entry, the range is a singular range containing only the partition
key. Readers switch to new snapshots automatically as they move across
the ring. It's possible that the insertion of the partition doesn't
conflict, but continuity does. In such a case the entry will be
inserted but continuity will not be set.
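
The conflict check can be sketched as follows. This is a simplified
model under stated assumptions, not the code from the patch: ring
positions are integers, the update converts positions up to a frontier,
and cache_model, populate() and populate_result are hypothetical names.
The entry check uses the singular range [key, key]; the continuity check
uses the range (prev, key].

```cpp
#include <cstdint>
#include <map>

using ring_position = std::int64_t;  // assumption: integer ring positions
using phase_t = std::uint64_t;       // assumption: one number per snapshot

struct entry {
    phase_t phase;
    bool continuous;  // true if the range (previous entry, this entry] is cached
};

struct populate_result {
    bool inserted;
    bool continuity_set;
};

struct cache_model {
    std::map<ring_position, entry> entries;
    phase_t old_phase;
    phase_t new_phase;
    bool any_converted;      // false until the update converts its first entry
    ring_position frontier;  // last position converted to new_phase

    phase_t phase_of(ring_position pos) const {
        return (any_converted && pos <= frontier) ? new_phase : old_phase;
    }

    // The phase changes at most once along the ring, so the range [lo, hi]
    // belongs to a single snapshot equal to `p` iff both ends have phase `p`.
    bool range_in_phase(ring_position lo, ring_position hi, phase_t p) const {
        return phase_of(lo) == p && phase_of(hi) == p;
    }

    // A reader holding snapshot `reader_phase` has read `key`, having
    // previously been positioned at `prev` (prev < key).
    populate_result populate(ring_position prev, ring_position key,
                             phase_t reader_phase) {
        populate_result r{false, false};
        if (!range_in_phase(key, key, reader_phase)) {
            return r;  // stale read: abandon population of this entry
        }
        entry& e = entries[key];  // a new entry starts as non-continuous
        e.phase = reader_phase;
        r.inserted = true;
        if (range_in_phase(prev + 1, key, reader_phase)) {
            e.continuous = true;
            r.continuity_set = true;
        }
        return r;
    }
};
```

For example, with old_phase = 1, new_phase = 2 and the frontier at
position 7, a reader on phase 1 calling populate(5, 10, 1) inserts the
entry for 10 but leaves continuity unset, because part of (5, 10] has
already been converted to phase 2.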

2017-06-24 18:06:11 +02:00

perf

tests: fix call to seastar::sleep()

2017-06-22 18:16:13 +03:00

snitch_property_files

tests: added ec2_snitch_test

2015-10-08 20:57:20 +03:00

sstables

tests/sstables: test reading and writing counters

2017-02-02 10:35:14 +00:00

allocation_strategy_test.cc

build: support for linking statically with boost

2016-10-26 08:51:21 +03:00

anchorless_list_test.cc

build: support for linking statically with boost

2016-10-26 08:51:21 +03:00

auth_test.cc

build: support for linking statically with boost

2016-10-26 08:51:21 +03:00

batchlog_manager_test.cc

Merge seatar upstream (seastar namespace)

2017-05-21 12:26:15 +03:00

bytes_ostream_test.cc

build: support for linking statically with boost

2016-10-26 08:51:21 +03:00

canonical_mutation_test.cc

tests/canonical_mutation: don't try to upgrade incompatible schemas

2017-02-07 15:17:14 +00:00

cartesian_product_test.cc

build: support for linking statically with boost

2016-10-26 08:51:21 +03:00

cell_locker_test.cc

cell_locker: add metrics for lock acquisition

2017-03-02 09:05:12 +00:00

commitlog_test.cc

commitlog_test: Fix reader test dropping rp handles

2017-06-16 22:45:46 +01:00

compound_test.cc

compound_compat: Return composite from serialize_value()

2017-03-28 18:10:39 +02:00

config_test.cc

build: support for linking statically with boost

2016-10-26 08:51:21 +03:00

counter_test.cc

tests/counter: test transform_counter_updates_to_shards

2017-04-28 16:29:34 +01:00

cql_assertions.cc

Merge seatar upstream (seastar namespace)

2017-05-21 12:26:15 +03:00

cql_assertions.hh

Merge seatar upstream (seastar namespace)

2017-05-21 12:26:15 +03:00

cql_query_test.cc

Merge seatar upstream (seastar namespace)

2017-05-21 12:26:15 +03:00

cql_test_env.cc

Merge seatar upstream (seastar namespace)

2017-05-21 12:26:15 +03:00

cql_test_env.hh

Merge seatar upstream (seastar namespace)

2017-05-21 12:26:15 +03:00

crc_test.cc

utils: Put crc32 under utils namespace

2016-12-05 11:48:29 +02:00

database_test.cc

storage_proxy: pass maximum result size to replicas

2016-12-22 17:16:23 +01:00

dynamic_bitset_test.cc

build: support for linking statically with boost

2016-10-26 08:51:21 +03:00

ec2_snitch_test.cc

build: support for linking statically with boost

2016-10-26 08:51:21 +03:00

flush_queue_test.cc

Merge seatar upstream (seastar namespace)

2017-05-21 12:26:15 +03:00

frozen_mutation_test.cc

build: support for linking statically with boost

2016-10-26 08:51:21 +03:00

gossip_test.cc

Merge seatar upstream (seastar namespace)

2017-05-21 12:26:15 +03:00

gossip.cc

Merge seatar upstream (seastar namespace)

2017-05-21 12:26:15 +03:00

gossiping_property_file_snitch_test.cc

build: support for linking statically with boost

2016-10-26 08:51:21 +03:00

hash_test.cc

build: support for linking statically with boost

2016-10-26 08:51:21 +03:00

idl_test.cc

Merge seatar upstream (seastar namespace)

2017-05-21 12:26:15 +03:00

input_stream_test.cc

build: support for linking statically with boost

2016-10-26 08:51:21 +03:00

keys_test.cc

build: support for linking statically with boost

2016-10-26 08:51:21 +03:00

log_histogram_test.cc

tests: log_historgram_test: Fix compiation on Ubuntu

2017-04-25 12:15:28 +03:00

logalloc_test.cc

logalloc: reduce descriptor overhead

2017-04-24 12:23:12 +02:00

lsa_async_eviction_test.cc

tests: lsa_async_eviction_test: Allocate objects under allocating section

2017-03-16 10:21:10 +01:00

lsa_sync_eviction_test.cc

Fix pre-ScyllaDB copyright statements

2016-04-08 08:12:47 +03:00

make_random_string.hh

tests: Extract simple_schema

2017-03-10 14:42:22 +01:00

managed_vector_test.cc

build: support for linking statically with boost

2016-10-26 08:51:21 +03:00

map_difference_test.cc

build: support for linking statically with boost

2016-10-26 08:51:21 +03:00

memory_footprint.cc

row_cache: Switch to using snapshot_source

2017-06-24 18:06:11 +02:00

memtable_snapshot_source.hh

tests: Introduce memtable_snapshot_source

2017-06-24 18:06:11 +02:00

memtable_test.cc

database: rework dirty memory hierarchy

2016-12-13 14:07:53 -05:00

message.cc

Merge seatar upstream (seastar namespace)

2017-05-21 12:26:15 +03:00

murmur_hash_test.cc

build: support for linking statically with boost

2016-10-26 08:51:21 +03:00

mutation_assertions.hh

tests: mutation_assertions: Add ability to limit verification to given clustering_row_ranges

2017-06-24 18:06:11 +02:00

mutation_query_test.cc

Allow reading exactly desired byte ranges and fast_forward_to

2017-06-19 18:31:32 +03:00

mutation_reader_assertions.hh

tests: mutation_assertions: Add ability to limit verification to given clustering_row_ranges

2017-06-24 18:06:11 +02:00

mutation_reader_test.cc

Convert to use dht::partition_range_vector and dht::token_range_vector

2016-12-19 14:08:50 +08:00

mutation_source_test.cc

tests: mutation_source: Relax expectations about range tombstones

2017-06-24 18:06:11 +02:00

mutation_source_test.hh

tests: Add range generators to random_mutation_generator

2017-02-23 18:50:53 +01:00

mutation_test.cc

tests: Add test for continuity merging rules

2017-06-24 18:06:11 +02:00

network_topology_strategy_test.cc

Merge seatar upstream (seastar namespace)

2017-05-21 12:26:15 +03:00

nonwrapping_range_test.cc

Convert to use dht::partition_range

2016-12-19 08:04:30 +08:00

partitioner_test.cc

tests: fix partitioner_test build on gcc 5

2017-06-14 17:22:01 +03:00

perf_row_cache_update.cc

row_cache: Switch to using snapshot_source

2017-06-24 18:06:11 +02:00

query_processor_test.cc

build: support for linking statically with boost

2016-10-26 08:51:21 +03:00

range_assert.hh

Fix pre-ScyllaDB copyright statements

2016-04-08 08:12:47 +03:00

range_test.cc

tests: Add test case for nonwrapping_range::intersection()

2017-05-17 10:33:18 +02:00

range_tombstone_list_test.cc

mutation_partition_serializer: Assume range tombstone support

2017-06-15 09:54:05 +03:00

result_set_assertions.cc

Merge "Fix query digest mismatch" from Tomasz

2016-04-08 12:13:29 +03:00

result_set_assertions.hh

Merge "Fix query digest mismatch" from Tomasz

2016-04-08 12:13:29 +03:00

row_cache_alloc_stress.cc

row_cache: Switch to using snapshot_source

2017-06-24 18:06:11 +02:00

row_cache_test.cc

row_cache: Switch readers to use per-entry snapshots

2017-06-24 18:06:11 +02:00

schema_change_test.cc

v3 schema test fixes

2017-05-10 16:44:48 +00:00

schema_registry_test.cc

build: support for linking statically with boost

2016-10-26 08:51:21 +03:00

simple_schema.hh

tests: sstable: Add test for fast forwarding within partition using index

2017-03-28 18:34:55 +02:00

snitch_reset_test.cc

build: support for linking statically with boost

2016-10-26 08:51:21 +03:00

sstable_atomic_deletion_test.cc

sstables: add unit tests for atomic deletion

2016-11-04 15:48:43 +02:00

sstable_datafile_test.cc

lcs: actually prefer oldest sstables of L0 when it falls behind

2017-06-19 20:45:39 -03:00

sstable_mutation_test.cc

sstables: Introduce sstable::as_mutation_source()

2017-05-25 19:30:20 +03:00

sstable_resharding_test.cc

sstables: Introduce sstable::as_mutation_source()

2017-05-25 19:30:20 +03:00

sstable_test.cc

Allow reading exactly desired byte ranges and fast_forward_to

2017-06-19 18:31:32 +03:00

sstable_test.hh

sstables: Introduce sstable::as_mutation_source()

2017-05-25 19:30:20 +03:00

storage_proxy_test.cc

Convert to use dht::partition_range_vector and dht::token_range_vector

2016-12-19 14:08:50 +08:00

streamed_mutation_test.cc

mutation_partition: Add support for specifying continuity

2017-06-24 18:06:11 +02:00

test_services.hh

Merge seatar upstream (seastar namespace)

2017-05-21 12:26:15 +03:00

test-serialization.cc

utils::serialization: remove not used deserialization_xxx() functions

2017-05-26 19:26:20 +03:00

tmpdir.hh

Fix pre-ScyllaDB copyright statements

2016-04-08 08:12:47 +03:00

total_order_check.hh

utils: Extract to_boost_visitor() to a separate header

2017-05-22 19:30:02 +02:00

types_test.cc

Merge "Migrate schema tables to v3 format" from Calle

2017-05-17 11:25:52 +03:00

UUID_test.cc

utils::UUID: operator< should behave as comparison of hex strings/bytes

2017-02-22 09:19:22 +00:00

view_schema_test.cc

tests/view_schema_test: Add more test cases

2017-05-17 11:21:58 +02:00

virtual_reader_test.cc

tests: Fix failure virtual_reader_test

2017-04-23 14:06:35 +03:00