Commit Graph

131 Commits

Author SHA1 Message Date
Botond Dénes
5998a859f7 tombstone_gc: allow use of repair-mode for RF=1 tables
Modify the methods which calculate the default gc mode as well as that
which validates whether repair-mode can be used at all, so both accepts
use of repair-mode on RF=1 tables.

This de-facto changes the default tombstone-gc to repair-mode for all
tables. Documentation is updated accordingly.

Some tests need adjusting:
* cqlpy/test_select_from_mutation_fragments.py: disable GC for some test
  cases because this patch makes tombstones they write subject to GC
  when using defaults.
* test/cluster/test_mv.py::test_mv_tombstone_gc_not_inherited used
  repair-mode as a non-default for the base table and expected the MV to
  revert to default. Another mode has to be used as the non-default
  (immediate).
* test/cqlpy/test_tools.py::test_scylla_sstable_dump_schema: don't
  compare tombstone_gc schema extension when comparing dumped schema vs.
  original. The tool's schema loader doesn't have access to the keyspace
  definition so it will come up with different defaults for
  tombstone-gc.
* test/boost/row_cache_test.cc::test_populating_cache_with_expired_and_nonexpired_tombstones
  sets tombstone expiry assuming the tombstone-gc timeout-mode default.
  Change the CREATE TABLE statement to set the expected mode.
2026-03-04 09:44:24 +02:00
Botond Dénes
6004e84f18 test: move away from tombstone_gc_state(nullptr) ctor
Use for_tests() instead (or no_gc() where approriate).
2026-03-03 14:09:28 +02:00
Botond Dénes
6364e35403 replica/table: add get_tombstone_gc_state()
Shorthand for
get_compaction_manager().get_shared_tombstone_gc_state().get_tombstone_gc_state().
2026-03-03 14:09:28 +02:00
Botond Dénes
83e20d920e db/row_cache: use tombstone_gc_state with value semantics
Instead of keeping a pointer to it. Replace nullptr with
tombstone_gc_state::no_gc().

This object is now designed to be used as a value-type, after recent
refactoring.
2026-03-03 14:09:27 +02:00
Łukasz Paszkowski
8829098e90 reader_concurrency_semaphore: Remove cpu_concurrency's default value
The commit 59faa6d, introduces a new parameter called cpu_concurrency
and sets its default value to 1 which violates the commit fbb83dd that
removes all default values from constructors but one used by the unit
tests.

The patch removes the default value of the cpu_concurrency parameter
and alters tests to use the test dedicated reader_concurrency_semaphore
constructor wherever possible.
2026-01-27 15:40:11 +01:00
Botond Dénes
a53f989d2f db/row_cache: make_nonpopulating_reader(): pass cache tracker to snapshot
The API contract in partition_version.hh states that when dealing with
evictable entries, a real cache tracker pointer has to be passed to all
methods that ask for it. The nonpopulating reader violates this, passing
a nullptr to the snapshot. This was observed to cause a crash when a
concurrent cache read accessed the snapshot with the null tracker.

A reproducer is included which fails before and passes after the fix.

Fixes: #26847

Closes scylladb/scylladb#28163
2026-01-20 12:34:37 +01:00
Botond Dénes
1e09a34686 replica: add abort polling to memtable and cache readers
Continuing the read once it is aborted (e.g. due to timeout) is a waste
of resources, as the produced results will be discarded.
Poll the permit's abort exception in the memtable and cache reader's
fill_buffer(). This results in one poll per buffer filled (8KB of data).

We already have similar poll for sstable readers, as disk reads are
usually much heavier and therefore it is more important to stop them
ASAP after abort. Cache and memtable reads are usually quick but not
always, hence it is important to also have polling in the cache and
memtable readers.

Refs: #11469
Fixes: #28148

Closes scylladb/scylladb#28149
2026-01-16 18:03:04 +01:00
Aleksandra Martyniuk
75b772adfb db: optimize cache invalidation following repair/streaming
Currently, if a new sstable is created during repair/streaming,
we invalidate its whole	token range in cache. If the sstable
is sparse, we unnecessarily clear too much data.

Modify cache invalidation, so that only the partitions present
in the sstable are cleared.

To check whether a partition is present in the sstable, we use bloom
filters. Bloom filters may return false positives and show that
an sstable contains a partition, even though it does not. Due to that
we may invalidate a bit more than we need to, but the cache will be
in valid state.

An issue arises when we do not invalidate two consecutive partitions
that are continuous. The sstable may contain a token that falls
between these partitions, breaking the continuity. To check that, we
would need to scan sstable index. However, such a change would
noticeably complicate the invalidation, both performance and code.
In this change, sstable index reader isn't used. Instead, the continuity
flag is unset for all scanned partitions. This comes at a cost of
heavier reads, as we will need to verify continuity when reading more
than one partition from cache.

Fixes: https://github.com/scylladb/scylladb/issues/9136.

Closes scylladb/scylladb#25996
2025-09-14 19:48:14 +03:00
Botond Dénes
65c770f21a test/boost/row_cache_test: add test for memtable overlap check elision 2025-08-11 17:20:12 +03:00
Botond Dénes
614d17347a tombstone_gc: extract shared state into shared_tombstone_gc_state
Instead of storing it partially in tombstone_gc and partially in an
external map. Move all external parts into the new
shared_tombstone_gc_state. This new class is responsible for
keeping and updating the repair history. tombstone_gc_state just keeps
const pointers to the shared state as before and is only responsible for
querying the tombstone gc before times.
This separation makes the code easier to follow and also enables further
patching of tombstone_gc_state.
2025-08-11 07:09:14 +03:00
Botond Dénes
c150bdd59c test/boost/row_cache_test: refactor test_populating_reader_tombstone_gc_with_data_in_memtable
This test currently uses gc_grace_seconds=0. The introduction
of memtable overlap elision will break these tests because the
optimization is always active with this tombstone-gc.
Switch the tests to use tombstone-gc=repair, which allows for greater
control over when the memtable overlap elision is triggered.
This requires a move to vnodes, as tombstone-gc=repair doesn't
work with RF=1 currently, and using RF=3 won't work with tablets.
2025-08-11 07:09:13 +03:00
Botond Dénes
c052f2ad1d test: rewrite test_compacting_reader_tombstone_gc_with_data_in_memtable in C++
This test will soon need to be changed to use tombstone-gc=repair. This
cannot work as of now, as the test uses a single-node cluster.
The options are the following:
* Make it use more than one nodes
* Make repair work with single node clusters
* Rewrite in C++ where repair can be done synthetically

We chose the last option, it is the simplest one both in terms of code
and runtime footprint.

The new test is in test/boost/row_cache_test.cc
Two changes were done during the migration
* Change the name to
  test_populating_reader_tombstone_gc_with_data_in_memtable
  to better express which cache component this test is targetting;
* Use NullCompactionStrategy on the table instead of disabling
  auto-compaction.
2025-08-11 07:09:13 +03:00
Botond Dénes
e4c048ada1 test/boost/row_cache_test: refactor cache tombstone GC with memtable overlap tests
These tests currently use tombstone-gc=immediate. The introduction
of memtable overlap elision will break these tests because the
optimization is always active with this tombstone-gc.
Switch the tests to use tombstone-gc=repair, which allows for greater
control over when the memtable overlap elision is triggered.
This requires a move to vnodes, as tombstone-gc=repair doesn't
work with RF=1 currently, and using RF=3 won't work with tablets.
2025-08-11 07:09:13 +03:00
Benny Halevy
3feb759943 everywhere: use utils::chunked_vector for list of mutations
Currently, we use std::vector<*mutation> to keep
a list of mutations for processing.
This can lead to large allocation, e.g. when the vector
size is a function of the number of tables.

Use a chunked vector instead to prevent oversized allocations.

`perf-simple-query --smp 1` results obtained for fixed 400MHz frequency
and PGO disabled:

Before (read path):
```
enable-cache=1
Running test with config: {partitions=10000, concurrency=100, mode=read, query_single_key=no, counters=no}
Disabling auto compaction
Creating 10000 partitions...

89055.97 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.2 tasks/op,   39417 insns/op,   18003 cycles/op,        0 errors)
103372.72 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.2 tasks/op,   39380 insns/op,   17300 cycles/op,        0 errors)
98942.27 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.2 tasks/op,   39413 insns/op,   17336 cycles/op,        0 errors)
103752.93 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.2 tasks/op,   39407 insns/op,   17252 cycles/op,        0 errors)
102516.77 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.2 tasks/op,   39403 insns/op,   17288 cycles/op,        0 errors)
throughput:
	mean=   99528.13 standard-deviation=6155.71
	median= 102516.77 median-absolute-deviation=3844.59
	maximum=103752.93 minimum=89055.97
instructions_per_op:
	mean=   39403.99 standard-deviation=14.25
	median= 39406.75 median-absolute-deviation=9.30
	maximum=39416.63 minimum=39380.39
cpu_cycles_per_op:
	mean=   17435.81 standard-deviation=318.24
	median= 17300.40 median-absolute-deviation=147.59
	maximum=18002.53 minimum=17251.75
```

After (read path)
```
enable-cache=1
Running test with config: {partitions=10000, concurrency=100, mode=read, query_single_key=no, counters=no}
Disabling auto compaction
Creating 10000 partitions...
59755.04 tps ( 66.2 allocs/op,   0.0 logallocs/op,  14.2 tasks/op,   39466 insns/op,   22834 cycles/op,        0 errors)
71854.16 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.2 tasks/op,   39417 insns/op,   17883 cycles/op,        0 errors)
82149.45 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.2 tasks/op,   39411 insns/op,   17409 cycles/op,        0 errors)
49640.04 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.3 tasks/op,   39474 insns/op,   19975 cycles/op,        0 errors)
54963.22 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.3 tasks/op,   39474 insns/op,   18235 cycles/op,        0 errors)
throughput:
	mean=   63672.38 standard-deviation=13195.12
	median= 59755.04 median-absolute-deviation=8709.16
	maximum=82149.45 minimum=49640.04
instructions_per_op:
	mean=   39448.38 standard-deviation=31.60
	median= 39466.17 median-absolute-deviation=25.75
	maximum=39474.12 minimum=39411.42
cpu_cycles_per_op:
	mean=   19267.01 standard-deviation=2217.03
	median= 18234.80 median-absolute-deviation=1384.25
	maximum=22834.26 minimum=17408.67
```

`perf-simple-query --smp 1 --write` results obtained for fixed 400MHz frequency
and PGO disabled:

Before (write path):
```
enable-cache=1
Running test with config: {partitions=10000, concurrency=100, mode=write, query_single_key=no, counters=no}
Disabling auto compaction
63736.96 tps ( 59.4 allocs/op,  16.4 logallocs/op,  14.3 tasks/op,   49667 insns/op,   19924 cycles/op,        0 errors)
64109.41 tps ( 59.3 allocs/op,  16.0 logallocs/op,  14.3 tasks/op,   49992 insns/op,   20084 cycles/op,        0 errors)
56950.47 tps ( 59.3 allocs/op,  16.0 logallocs/op,  14.3 tasks/op,   50005 insns/op,   20501 cycles/op,        0 errors)
44858.42 tps ( 59.3 allocs/op,  16.0 logallocs/op,  14.3 tasks/op,   50014 insns/op,   21947 cycles/op,        0 errors)
28592.87 tps ( 59.3 allocs/op,  16.0 logallocs/op,  14.3 tasks/op,   50027 insns/op,   27659 cycles/op,        0 errors)
throughput:
	mean=   51649.63 standard-deviation=15059.74
	median= 56950.47 median-absolute-deviation=12087.33
	maximum=64109.41 minimum=28592.87
instructions_per_op:
	mean=   49941.18 standard-deviation=153.76
	median= 50005.24 median-absolute-deviation=73.01
	maximum=50027.07 minimum=49667.05
cpu_cycles_per_op:
	mean=   22023.01 standard-deviation=3249.92
	median= 20500.74 median-absolute-deviation=1938.76
	maximum=27658.75 minimum=19924.32
```

After (write path)
```
enable-cache=1
Running test with config: {partitions=10000, concurrency=100, mode=write, query_single_key=no, counters=no}
Disabling auto compaction
53395.93 tps ( 59.4 allocs/op,  16.5 logallocs/op,  14.3 tasks/op,   50326 insns/op,   21252 cycles/op,        0 errors)
46527.83 tps ( 59.3 allocs/op,  16.0 logallocs/op,  14.3 tasks/op,   50704 insns/op,   21555 cycles/op,        0 errors)
55846.30 tps ( 59.3 allocs/op,  16.0 logallocs/op,  14.3 tasks/op,   50731 insns/op,   21060 cycles/op,        0 errors)
55669.30 tps ( 59.3 allocs/op,  16.0 logallocs/op,  14.3 tasks/op,   50735 insns/op,   21521 cycles/op,        0 errors)
52130.17 tps ( 59.3 allocs/op,  16.0 logallocs/op,  14.3 tasks/op,   50757 insns/op,   21334 cycles/op,        0 errors)
throughput:
	mean=   52713.91 standard-deviation=3795.38
	median= 53395.93 median-absolute-deviation=2955.40
	maximum=55846.30 minimum=46527.83
instructions_per_op:
	mean=   50650.57 standard-deviation=182.46
	median= 50731.38 median-absolute-deviation=84.09
	maximum=50756.62 minimum=50325.87
cpu_cycles_per_op:
	mean=   21344.42 standard-deviation=202.86
	median= 21334.00 median-absolute-deviation=176.37
	maximum=21554.61 minimum=21060.24
```

Fixes #24815

Improvement for rare corner cases. No backport required

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#24919
2025-07-13 19:13:11 +03:00
Botond Dénes
ab96c703ff mutation: check key of inserted rows
Make sure the keys are full prefixes as it is expected to be the case
for rows. At severeal occasions we have seen empty row keys make their
ways into the sstables, despite the fact that they are not allowed by
the CQL frontend. This means that such empty keys are possibly results
of memory corruption or use-after-{free,copy} errors. The source of the
corruption is impossible to pinpoint when the empty key is discovered in
the sstable. So this patch adds checks for such keys to places where
mutations are built: when building or unserializing mutations.

The test row_cache_test/test_reading_of_nonfull_keys needs adjustment to
work with the changes: it has to make the schema use compact storage,
otherwise the non-full changes used by this tests are rejected by the
new checks.

Fixes: https://github.com/scylladb/scylladb/issues/24506
2025-06-23 09:38:45 +03:00
Botond Dénes
75fddbc078 readers/mutation_readers: s/delegating_reader_v2/delegating_reader/ 2025-05-09 07:53:30 -04:00
Botond Dénes
674d41e3e6 readers/mutation_source: s/make_reader_v2/make_mutation_reader/ 2025-05-09 07:53:29 -04:00
Botond Dénes
b5170e27d0 replica/table: s/make_reader_v2/make_mutation_reader/ 2025-05-09 07:53:29 -04:00
Botond Dénes
c29c696780 readers: mv from_mutations_v2.hh from_mutations.hh
Completely mechanical change.
2025-04-16 04:46:08 -04:00
Botond Dénes
b104862702 tree: s/make_mutation_reader_from_mutations_v2/make_mutation_reader_from_mutations/s
Completely mechanical change.
2025-04-16 04:46:07 -04:00
Botond Dénes
a9d75c4f9d readers: mv empty_v2.hh empty.hh
Completely mechanical change.
2025-04-16 04:32:56 -04:00
Botond Dénes
05829f98f3 tree: s/make_empty_flat_reader_v2/make_empty_mutation_reader/
Completely mechanical change.
2025-04-16 04:32:56 -04:00
Botond Dénes
c7f68a2649 readers/delegating_v2.hh: move reader definition to _impl.hh file
The idea behind readers/ is that each reader has its minimal header with
just a factory method declaration. The delegating reader is defined in
the factory header because it has a derived class in row_cache_test.cc.
Move the definition to delegating_impl.hh so users not interested in
deriving from it don't pay the price in header include cost.
2025-04-16 03:47:57 -04:00
Botond Dénes
0d39091df2 test/boost/row_cache_test: add memtable overlap check tests
Similar to test/cluster/test_data_resurrection_in_memtable.py but works
on a single node and uses more low-level mechanism. These tests can also
reproduce more advanced scenarios, like concurrent reads, with some
reading from flushed memtables.
2025-04-08 00:11:36 -04:00
Botond Dénes
6b5b563ef7 db/row_cache: add overlap-check for cache tombstone garbage collection
The cache should not garbage-collect tombstone which cover data in the
memtable. Add overlap checks (get_max_purgeable) to garbage collection
to detect tombstones which cover data in the memtable and to prevent
their garbage collection.
2025-04-08 00:11:35 -04:00
Botond Dénes
a0d8102a1f replica/memtable: s/make_flat_reader/make_mutation_reader/
Following the recent refactoring of removing "flat" and "v2" from reader
names, replacing all the fully qualified names with simply "mutation_reader".

Closes scylladb/scylladb#23346
2025-04-01 17:58:13 +03:00
Kefu Chai
9a20fb43ab tree: replace boost::min_element() with std::ranges::min_element()
in order to reduce the external header dependency, let's switch to
the standardlized std::ranges::min_element().

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22572
2025-02-05 21:54:01 +02:00
Ran Regev
edd56a2c1c moved cache files to db
As requested in #22097, moved the files
and fixed other includes and build system.

Fixes: #22097
Signed-off-by: Ran Regev <ran.regev@scylladb.com>

Closes scylladb/scylladb#22495
2025-02-04 12:21:31 +03:00
Takuya ASADA
03461d6a54 test: compile unit tests into a single executable
To reduce test executable size and speed up compilation time, compile unit
tests into a single executable.

Here is a file size comparison of the unit test executable:

- Before applying the patch
$ du -h --exclude='*.o' --exclude='*.o.d' build/release/test/boost/ build/debug/test/boost/
11G	build/release/test/boost/
29G	build/debug/test/boost/

- After applying the patch
du -h --exclude='*.o' --exclude='*.o.d' build/release/test/boost/ build/debug/test/boost/
5.5G	build/release/test/boost/
19G	build/debug/test/boost/

It reduces executable sizes 5.5GB on release, and 10GB on debug.

Closes #9155

Closes scylladb/scylladb#21443
2024-12-22 19:14:09 +02:00
Avi Kivity
f3eade2f62 treewide: relicense to ScyllaDB-Source-Available-1.0
Drop the AGPL license in favor of a source-available license.
See the blog post [1] for details.

[1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/
2024-12-18 17:45:13 +02:00
Kefu Chai
ce2f80c227 treewide: migrate from boost::make_iterator_range to ranges::subrange
Replace boost::make_iterator_range() with std::ranges::subrange.

This change improves code modernization and reduces external dependencies:

- Replace boost::make_iterator_range() with std::ranges::subrange
- Remove boost/range/iterator_range.hpp include
- Improve iterator type detection in interval.hh using std::ranges::const_iterator_t<Range>

This is part of ongoing efforts to modernize our codebase and minimize
external dependencies.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21787
2024-12-09 21:31:53 +02:00
Alexander Turetskiy
e83ab28d2d Improve compation on read of expired tombstones
compact expired tombstones in cache even if they are blocked by
commitlog

fixes #16781

Closes scylladb/scylladb#21613
2024-11-22 10:31:21 +02:00
Avi Kivity
1c26c8deeb mutation: mutation_partition_v2.hh: switch from boost ranges to std ranges
Consolidate on one range solution. Fallout in mutation_partition_v2.cc
and row_cache_test.cc due to interoperability problems is adjusted.
2024-11-15 14:36:28 +02:00
Kefu Chai
59eb2ab119 treewide: s/boost::algorithm::any_of/std::ranges::any_of/
now that we are allowed to use C++23. we now have the luxury of using
`std::ranges::any_of`.

in this change, we replace `boost::algorithm::any_of` with
`std::ranges::any_of`

to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-11-05 14:06:09 +08:00
Kefu Chai
f9091066b7 treewide: replace boost::irange with std::views::iota where possible
when building scylla with the standard library from GCC-14.2, shipped by
fedora 41, we have following build failure:

```
/home/kefu/.local/bin/clang++ -DDEBUG -DDEBUG_LSA_SANITIZER -DFMT_SHARED -DSANITIZE -DSCYLLA_BUILD_MODE=debug -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=7 -DSEASTAR_DEBUG -DSEASTAR_DEBUG_PROMISE -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_SSTRING -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"Debug\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -isystem /home/kefu/dev/scylladb/abseil -g -Og -g -gz -std=gnu++23 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb/build=. -march=x86-64-v3 -mpclmul -Xclang -fexperimental-assignment-tracking=disabled -Werror=unused-result -fstack-clash-protection -fsanitize=address -fsanitize=undefined -MD -MT CMakeFiles/scylla-main.dir/Debug/init.cc.o -MF CMakeFiles/scylla-main.dir/Debug/init.cc.o.d -o CMakeFiles/scylla-main.dir/Debug/init.cc.o -c /home/kefu/dev/scylladb/init.cc
In file included from /home/kefu/dev/scylladb/init.cc:12:
In file included from /home/kefu/dev/scylladb/db/config.hh:20:
In file included from /home/kefu/dev/scylladb/locator/abstract_replication_strategy.hh:26:
/home/kefu/dev/scylladb/locator/tablets.hh:410:30: error: unexpected type name 'size_t': expected expression
  410 |         return boost::irange<size_t>(0, tablet_count()) | boost::adaptors::transformed([] (size_t i) {
      |                              ^
/home/kefu/dev/scylladb/locator/tablets.hh:410:23: error: no member named 'irange' in namespace 'boost'
  410 |         return boost::irange<size_t>(0, tablet_count()) | boost::adaptors::transformed([] (size_t i) {
      |                ~~~~~~~^
/home/kefu/dev/scylladb/locator/tablets.hh:410:38: error: left operand of comma operator has no effect [-Werror,-Wunused-value]
  410 |         return boost::irange<size_t>(0, tablet_count()) | boost::adaptors::transformed([] (size_t i) {
      |                                      ^
3 errors generated.
[16/782] Building CXX object CMakeFiles/scylla-main.dir/Debug/keys.cc.o
[17/782] Building CXX object CMakeFiles/scylla-main.dir/Debug/counters.cc.o
[18/782] Building CXX object CMakeFiles/scylla-main.dir/Debug/partition_slice_builder.cc.o
[19/782] Building CXX object CMakeFiles/scylla-main.dir/Debug/mutation_query.cc.o
FAILED: CMakeFiles/scylla-main.dir/Debug/mutation_query.cc.o
/home/kefu/.local/bin/clang++ -DDEBUG -DDEBUG_LSA_SANITIZER -DFMT_SHARED -DSANITIZE -DSCYLLA_BUILD_MODE=debug -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=7 -DSEASTAR_DEBUG -DSEASTAR_DEBUG_PROMISE -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_SSTRING -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"Debug\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -isystem /home/kefu/dev/scylladb/abseil -g -Og -g -gz -std=gnu++23 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb/build=. -march=x86-64-v3 -mpclmul -Xclang -fexperimental-assignment-tracking=disabled -Werror=unused-result -fstack-clash-protection -fsanitize=address -fsanitize=undefined -MD -MT CMakeFiles/scylla-main.dir/Debug/mutation_query.cc.o -MF CMakeFiles/scylla-main.dir/Debug/mutation_query.cc.o.d -o CMakeFiles/scylla-main.dir/Debug/mutation_query.cc.o -c /home/kefu/dev/scylladb/mutation_query.cc
In file included from /home/kefu/dev/scylladb/mutation_query.cc:12:
In file included from /home/kefu/dev/scylladb/schema/schema_registry.hh:17:
In file included from /home/kefu/dev/scylladb/replica/database.hh:11:
In file included from /home/kefu/dev/scylladb/locator/abstract_replication_strategy.hh:26:
/home/kefu/dev/scylladb/locator/tablets.hh:410:30: error: unexpected type name 'size_t': expected expression
  410 |         return boost::irange<size_t>(0, tablet_count()) | boost::adaptors::transformed([] (size_t i) {
      |                              ^
/home/kefu/dev/scylladb/locator/tablets.hh:410:23: error: no member named 'irange' in namespace 'boost'
  410 |         return boost::irange<size_t>(0, tablet_count()) | boost::adaptors::transformed([] (size_t i) {
      |                ~~~~~~~^
/home/kefu/dev/scylladb/locator/tablets.hh:410:38: error: left operand of comma operator has no effect [-Werror,-Wunused-value]
  410 |         return boost::irange<size_t>(0, tablet_count()) | boost::adaptors::transformed([] (size_t i) {
      |                                      ^
In file included from /home/kefu/dev/scylladb/mutation_query.cc:12:
In file included from /home/kefu/dev/scylladb/schema/schema_registry.hh:17:
In file included from /home/kefu/dev/scylladb/replica/database.hh:37:
In file included from /home/kefu/dev/scylladb/db/snapshot-ctl.hh:20:
/home/kefu/dev/scylladb/tasks/task_manager.hh:403:54: error: no member named 'irange' in namespace 'boost'
  403 |         co_await coroutine::parallel_for_each(boost::irange(0u, smp::count), [&tm, id, &res, &func] (unsigned shard) -> future<> {
      |                                               ~~~~~~~^
4 errors generated.
```

so let's take the opportunity to switch from `boost::irange` to
`std::views::iota`.

in this change, we:

- switch from boost::irange to std::views::iota for better standard library compatibility
- retain boost::irange where step parameter is used, as std::views::iota doesn't support it
- this change partially modernizes our range usage while maintaining
- existing functionality

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#20924
2024-10-03 10:33:33 +03:00
Kefu Chai
3e84d43f93 treewide: use seastar::format() or fmt::format() explicitly
before this change, we rely on `using namespace seastar` to use
`seastar::format()` without qualifying the `format()` with its
namespace. this works fine until we changed the parameter type
of format string `seastar::format()` from `const char*` to
`fmt::format_string<...>`. this change practically invited
`seastar::format()` to the club of `std::format()` and `fmt::format()`,
where all members accept a templated parameter as its `fmt`
parameter. and `seastar::format()` is not the best candidate anymore.
despite that argument-dependent lookup (ADT for short) favors the
function which is in the same namespace as its parameter, but
`using namespace` makes `seastar::format()` more competitive,
so both `std::format()` and `seastar::format()` are considered
as the condidates.

that is what is happening scylladb in quite a few caller sites of
`format()`, hence ADT is not able to tell which function the winner
in the name lookup:

```
/__w/scylladb/scylladb/mutation/mutation_fragment_stream_validator.cc:265:12: error: call to 'format' is ambiguous
  265 |     return format("{} ({}.{} {})", _name_view, s.ks_name(), s.cf_name(), s.id());
      |            ^~~~~~
/usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/format:4290:5: note: candidate function [with _Args = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>]
 4290 |     format(format_string<_Args...> __fmt, _Args&&... __args)
      |     ^
/__w/scylladb/scylladb/seastar/include/seastar/core/print.hh:143:1: note: candidate function [with A = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>]
  143 | format(fmt::format_string<A...> fmt, A&&... a) {
      | ^
```

in this change, we

change all `format()` to either `fmt::format()` or `seastar::format()`
with following rules:
- if the caller expects an `sstring` or `std::string_view`, change to
  `seastar::format()`
- if the caller expects an `std::string`, change to `fmt::format()`.
  because, `sstring::operator std::basic_string` would incur a deep
  copy.

we will need another change to enable scylladb to compile with the
latest seastar. namely, to pass the format string as a templated
parameter down to helper functions which format their parameters.
to miminize the scope of this change, let's include that change when
bumping up the seastar submodule. as that change will depend on
the seastar change.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-09-11 23:21:40 +03:00
Łukasz Paszkowski
da95f44adc readers: Use reversed schema and native reversed slices
The reconcilable_result is built as it would be constructed for
forward read queries for tables with reversed order.

Mutations constructed for reversed queries are consumed forward.

Drop overloaded reversed functions that reverse read_command and
reconcilable_result directly and keep only those requiring smart
pointers. They are not used any more.
2024-08-13 10:03:46 +02:00
Avi Kivity
aa1270a00c treewide: change assert() to SCYLLA_ASSERT()
assert() is traditionally disabled in release builds, but not in
scylladb. This hasn't caused problems so far, but the latest abseil
release includes a commit [1] that causes a 1000 insn/op regression when
NDEBUG is not defined.

Clearly, we must move towards a build system where NDEBUG is defined in
release builds. But we can't just define it blindly without vetting
all the assert() calls, as some were written with the expectation that
they are enabled in release mode.

To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT()
macro in utils/assert.hh. This macro is always defined and is not conditional
on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release
mode.

[1] 66ef711d68

Closes scylladb/scylladb#20006
2024-08-05 08:23:35 +03:00
Avi Kivity
fdc1449392 treewide: rename flat_mutation_reader_v2 to mutation_reader
flat_mutation_reader_v2 was introduced in a pair of commits in 2021:

  e3309322c3 "Clone flat_mutation_reader related classes into v2 variants"
  08b5773c12 "Adapt flat_mutation_reader_v2 to the new version of the API"

as a replacement for flat_mutation_reader, using range_tombstone_change
instead of range_tombstone to represent represent range tombstones. See
those commits for more information.

The transition was incremental; the last use of the original
flat_mutation_reader was removed in 2022 in commit

  026f8cc1e7 "db: Use mutation_partition_v2 in mvcc"

In turn, flat_mutation_reader was introduced in 2017 in commit

  748205ca75 "Introduce flat_mutation_reader"

To transition from a mutation_reader that nested rows within
a partition in a separate stream, to a flat reader that streamed
partitions and rows in the same stream.

Here, we reclaim the original name and rename the awkward
flat_mutation_reader_v2 to mutation_reader.

Note that mutation_fragment_v2 remains since we still use the original
for compatibilty, sometimes.

Some notes about the transition:

 - files were also renamed. In one case (flat_mutation_reader_test.cc), the
   rename target already existed, so we rename to
    mutation_reader_another_test.cc.

 - a namespace 'mutation_reader' with two definitions existed (in
   mutation_reader_fwd.hh). Its contents was folded into the mutation_reader
   class. As a result, a few #includes had to be adjusted.

Closes scylladb/scylladb#19356
2024-06-21 07:12:06 +03:00
Kefu Chai
a439ebcfce treewide: include fmt/ranges.h and/or fmt/std.h
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we include `fmt/ranges.h` and/or `fmt/std.h`
for formatting the container types, like vector, map
optional and variant using {fmt} instead of the homebrew
formatter based on operator<<.
with this change, the changes adding fmt::formatter and
the changes using ostream formatter explicitly, we are
allowed to drop `FMT_DEPRECATED_OSTREAM` macro.

Refs scylladb#13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-04-19 22:56:16 +08:00
Michał Chojnowski
8147ab69ac row_cache_test: avoid a throw in external_updater
In test_exception_safety_of_update_from_memtable, we have a potential
throw from external_updater.

external_updater is supposed to be infallible.
Scylla currently aborts when an external_updater throws, so a throw from
there just fails the test.

This isn't intended. We aren't testing external_updater in this test.

Fixes #18163

Closes scylladb/scylladb#18171
2024-04-03 23:22:08 +02:00
Michał Chojnowski
295b27a07b cache_flat_mutation_reader: only call get_iterator_in_latest() when pointing at a row
Calling `_next_row.get_iterator_in_latest()` is illegal when `_next_row` is not
pointing at a row. In particular, the iterator returned by such call might be
dangling.

We have observed this to cause a use-after-free in the field, when a reverse
read called `maybe_add_to_cache` after `_latest_it` was left dangling after
a dead row removal in `copy_from_cache_to_buffer`.

To fix this, we should ensure that we only call `_next_row.get_iterator_in_latest`
is pointing at a row.

Only the occurrences of this problem in `maybe_add_to_cache` are truly dangerous.
As far as I can see, other occurrences can't break anything as of now.
But we apply fixes to them anyway.

Closes scylladb/scylladb#18046
2024-03-27 11:48:42 +01:00
Pavel Emelyanov
d90db016bf treewide: Use partition_slice::is_reversed()
Continuation of cc56a971e8, more noisy places detected

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#17763
2024-03-13 08:52:46 +02:00
Michał Chojnowski
f5e3a728e4 row_cache_test: test cache consistency during memtable-to-cache merge
A rather minimal reproducer for #16759. Not extensive.
2024-02-07 18:31:36 +01:00
Avi Kivity
7cb1c10fed treewide: replace seastar::future::get0() with seastar::future::get()
get0() dates back from the days where Seastar futures carried tuples, and
get0() was a way to get the first (and usually only) element. Now
it's a distraction, and Seastar is likely to deprecate and remove it.

Replace with seastar::future::get(), which does the same thing.
2024-02-02 22:12:57 +08:00
Lakshmi Narayanan Sreethar
76f0d5e35b reader_permit: store schema_ptr instead of raw schema pointer
Store schema_ptr in reader permit instead of storing a const pointer to
schema to ensure that the schema doesn't get changed elsewhere when the
permit is holding on to it. Also update the constructors and all the
relevant callers to pass down schema_ptr instead of a raw pointer.

Fixes #16180

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>

Closes scylladb/scylladb#16658
2024-01-11 08:37:56 +02:00
Botond Dénes
e1b30f50be reader_concurrency_semaphore: add register_metrics constructor parameter
To be used in the next patch to control whether the semaphore registers
and exports metrics or not. We want to move metric registration to the
semaphore but we don't want all semaphores to export metrics. The
decision on whether a semaphore should or shouldn't export metrics
should be made on a case-by-case basis so this new parameter has no
default value (except for the for_tests constructor).
2023-12-13 06:25:45 -05:00
Pavel Emelyanov
eeee58def8 tests: Make use of make_memtable() helper
There's one in the utils that creates lw_shared_ptr<memtable> and
applies provided vector of mutations into it. Lots of other test cases
do literally the same by hand.

The make_memtable() assumes that the caller is sitting in the seastar
thread, and all the test cases that can benfit from it already are.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-11-02 19:28:35 +03:00
Botond Dénes
460bc7d8e1 test/boost/row_cache_test: test_exception_safety_of_reads: also cover single-partition reads
The test currently only covers scans. Single partition reads have a
different code-path, make sure it is also covered.
2023-10-20 03:16:57 -04:00
Avi Kivity
e600f35d1e Merge 'logalloc, reader_concurrency_semaphore: cooperate on OOM kills' from Botond Dénes
Consider the following code snippet:
```c++
future<> foo() {
    semaphore.consume(1024);
}

future<> bar() {
    return _allocating_section([&] {
        foo();
    });
}
```

If the consumed memory triggers the OOM kill limit, the semaphore will throw `std::bad_alloc`. The allocating section will catch this, bump std reserves and retry the lambda. Bumping the reserves will not do anything to prevent the next call to `consume()` from triggering the kill limit. So this cycle will repeat until std reserves are so large that ensuring the reserve fails. At this point LSA gives up and re-throws the `std::bad_alloc`. Beyond the useless time spent on code that is doomed to fail, this also results in expensive LSA compaction and eviction of the cache (while trying to ensure reserves).
Prevent this situation by throwing a distinct exception type which is derived from `std::bad_alloc`. Allocating section will not retry on seeing this exception.
A test reproducing the bug is also added.

Fixes: #15278

Closes scylladb/scylladb#15581

* github.com:scylladb/scylladb:
  test/boost/row_cache_test: add test_cache_reader_semaphore_oom_kill
  utils/logalloc: handle utils::memory_limit_reached in with_reclaiming_disabled()
  reader_concurrency_semaphore: use utils::memory_limit_reached exception
  utils: add memory_limit_reached exception
2023-10-05 19:47:21 +03:00