Commit Graph

31056 Commits

Author SHA1 Message Date
Gleb Natapov
e52abdca30 migration_manager: pass abort source to raft primitives
We want to be able to abort raft operations on migration manager drain.
MM already has an abort source that is signaled on drain, so all that is
left is to pass it to raft calls.
2022-03-31 10:00:29 +03:00
Gleb Natapov
1409b885a0 storage_proxy: relax some read error reporting
Silence request_aborted read error since it is expected to happen suring
shutdown and report remote rpc errors as warnings instead of errors since
if they are indeed server they should be handled by the rpc client, but
OTOH some non critical errors do expect to happen during shutdown.
2022-03-31 10:00:29 +03:00
Botond Dénes
2e634883d9 frozen_mutation: fragment_and_freeze(): convert to v2 2022-03-31 09:57:48 +03:00
Botond Dénes
7f3986ed1a frozen_mutation: coroutinize fragment_and_freeze() 2022-03-31 09:57:48 +03:00
Botond Dénes
fc27b6b7ed readers: migrate away from v1 reversing reader
The only internal user is the v1 make reader from mutations, we use a
downgrade/upgrade to be able to use the v2 reversing reader there. This
is ugly but the v1 reader from mutations is going away soon too, so not
a real problem.
2022-03-31 09:57:48 +03:00
Botond Dénes
eb125b98eb db/virtual_table: use v2 variant of reversing and forwardable readers 2022-03-31 09:57:48 +03:00
Botond Dénes
c10d7bf9f8 replica/table: use v2 variant of reversing reader 2022-03-31 09:57:48 +03:00
Botond Dénes
3b67c25e49 sstables/sstable: remove unused make_crawling_reader_v1() 2022-03-31 09:57:48 +03:00
Botond Dénes
219cb881a4 sstables/sstable: remove make_reader_v1()
No external users, only used internally, by make_reader(), who delegates
cases currently unsupported by v2 to it. The code needed from
make_reader_v1() is inlined into make_reader() and the former is
removed.
2022-03-31 09:57:48 +03:00
Botond Dénes
470dc0d013 readers: add v2 variant of reversing reader
The v2 format allows for a much simpler reversing mechanism since
clustering fragments can simply be reversed as they are read. Fragments
are directly pushed in the reader's buffer eliminating a separate move
phase.
Existing reverse reader unit tests are converted to test the v2 one.
2022-03-31 09:57:48 +03:00
Botond Dénes
06bea6ae6b readers/reversing: remove FIXME 2022-03-31 09:57:48 +03:00
Botond Dénes
c38a7963b1 readers: reader from mutations: use mutation's own schema when slicing
Instead of the schema that is used for the reader. The schema of
individual mutations might be different (albeit compatible) and in debug
mode this can trigger an assert in mutation partition.
2022-03-31 09:57:48 +03:00
Piotr Sarna
4a3890b79c cql3: disambiguate an error message for indexes and IN clause
A user pointed out a misleading error message produced when
an indexed column is queried along with an IN relation
on the partition key. The message suggests that such queries are
not supported, but they are supported - just without indexing.
In particular, with ALLOW FILTERING, such queries are perfectly
fine.

Closes #10299
2022-03-31 07:04:00 +03:00
Piotr Sarna
85e95a8cc3 cql3: fix misleading error message for service level timeouts
The error message incorrectly stated that the timeout value cannot
be longer than 24h, but it can - the actual restriction is that the
value cannot be expressed in units like days or months, which was done
in order to significantly simplify the parsing routines (and the fact
that timeouts counted in days are not expected to be common).

Fixes #10286

Closes #10294
2022-03-31 07:04:00 +03:00
Raphael S. Carvalho
20a1ef3bee compaction_backlog_tracker: Raise logging level to error when disabling tracker on exception
If exception is caught while updating backlog tracker, the backlog
tracker will be disabled for the underlying table, potentially
causing compaction to fall behind.
That being said, let's raise the log level to error, to give it
its due importance and allow tests to detect the problem.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20220330151421.49054-1-raphaelsc@scylladb.com>
2022-03-31 07:04:00 +03:00
Wojciech Mitros
8a9d55d3a1 wasm: add wasm ABI version 2
Because the only available version of wasm ABI did not allow
freeing any allocated memory, a new version of the ABI is
introduced. In this version, the host is required to export
_scylla_malloc and _scylla_free methods, which are later used
for the memory management.

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
2022-03-30 20:49:35 +02:00
Wojciech Mitros
5b60cd1eab wasm: add WASI handling
One of the issues that comes with compiling programs to WebAssembly
is the lack of a default implementation of a memory allocator. As
a result, the only available solutions to the need of memory allocation
are growing the wasm memory for each new allocated segment, or
implementing one's own memory allocator. To avoid both of these
approaches, for many languages, the user may compile a program to
a WASI target. By doing so, the compiler adds default implementations
of malloc and free methods, and the user can use them for dynamic
memory management.
This patch enables executing programs compiled with WASI by enabling
it in the wasmtime runtime.

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
2022-03-30 19:44:36 +02:00
Wojciech Mitros
1f81e05d52 wasm: add documentation
The ABI of wasm UDFs changed since the last time the documentation
was written, so it's being update in this patch.

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
2022-03-30 19:44:30 +02:00
Pavel Solodovnikov
3e2a42fcf0 db: system_keyspace: add bootstrap_needed() method
The method checks that bootstrap state is equal to
`NEEDS_BOOTSTRAP`. This will be used later to check
if we are in the state of "fresh" start (i.e. starting
a node from scratch).

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2022-03-30 20:41:35 +03:00
Pavel Solodovnikov
c69c47b2bb db: system_keyspace: mark getter methods for bootstrap state as "const"
The `bootstrap_complete()`, `bootstrap_in_progress()`,
`was_decommissioned()` and `get_bootstrap_state()` don't
modify internal state, so eligible to be marked as `const`.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2022-03-30 20:40:02 +03:00
Wojciech Mitros
93b082e8d9 wasm: add _scylla_abi export for specifying abi for wasm udfs
Different languages may require different ABIs for passing
parameters, etc. This patch adds a requirement for all wasm
UDFs to export an _scylla_abi symbol, that is an 32-bit integer
with a value specifying the ABI version.

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
2022-03-30 19:37:11 +02:00
Wojciech Mitros
a7ee3ccf52 wasm: update ABI for passing parameters to wasm UDFs
WebAssembly uses 32-bit address space, while also
having 64-bit integers as it native types. As a result,
when passing size of an object in memory and its address,
it can be combined into one 64-bit value. As a bonus,
if the object is null, we can signal it by passing -1 as
its size.

This patch implements handling of this new ABI and adjusts
expamples in test_wasm.py.

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
2022-03-30 17:13:25 +02:00
Wojciech Mitros
7fd81e6dae wasm: move common code to a separate function
Both init_nullable_arg_visitor and, in case
of abstract_type, init_arg_visitor were
the same method with one difference. The
common part was moved to init_abstract_arg,
and the difference remained in the operator()
method.

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
2022-03-30 17:13:22 +02:00
Wojciech Mitros
62761a7cf3 wasm: use wasm pages for wasm memory
The memory.grow and memory.size wasm methods return
the memory size in pages, and memory.size takes its
argument in the number of pages. A WebAssembly page
has a size of 64KiB, so during memory allocation
we have to divide our desired size in bytes by page
size and round up. Similarly, when reading memory
size we need to multiply the result by 64KiB to
get the size in bytes.

The change affects current naive allocator for
arguments when calling wasm UDFs and the examples
in wasm_test.py - both commented code and compiled
wasm in text representation.

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
2022-03-30 17:13:13 +02:00
Avi Kivity
caa4cddebf Update tools/java submodule
* tools/java b1e09c8b8f...ac5d6c840d (1):
  > Merge 'Sync with Cassandra 3.11.12' from Michael Livshin
2022-03-30 17:03:55 +03:00
Avi Kivity
9cb43f7029 Merge "Split mutation_reader.hh" from Botond
"
Following up on the recent split of flat_mutation_reader.hh and friends,
this series applies the same treatment to mutation_reader.hh. Each
readers gets its own header, while definitions are moved into
readers/mutation_readers.cc. There are two exceptions to this: the
combined and multishard reader families each make up more than 1K SLOC,
so these get their own source file, to avoid a SLOC explosion in
mutation_readers.cc.

This series is almost completely mechanical, moving code and patching
inclusion sites.

Tests: unit(dev)
"

* 'mutation-reader-hh-split/v1' of https://github.com/denesb/scylla:
  readers: merge fmr_logger and mrlog
  tree: remove now empty mutation_reader.{hh,cc}
  tree: remove mutation_reader.hh include
  mutation_reader: move mrlog (mutation reader logger) to readers/
  mutation_reader: move compacting reader into readers/
  mutation_reader: move queue reader to readers/
  mutation_reader: move mutation source into readers/
  mutation_reader: move slicing filtering reader into readers/
  mutation_reader: move filtering reader into readers/
  readers: move multishard reader & friends to reader/multishard.cc
  mutation_reader: remove unused remote_fill_buffer_result
  readers: move combined reader into readers/
2022-03-30 16:33:20 +03:00
Botond Dénes
3289df2e74 readers: merge fmr_logger and mrlog
By folding the former to the latter. Now that all the readers are nicely
co-located in the same folder, no point in having two distinct logger
for them.
2022-03-30 15:44:08 +03:00
Botond Dénes
c9e30b9a6c tree: remove now empty mutation_reader.{hh,cc} 2022-03-30 15:42:51 +03:00
Botond Dénes
b029bd3db7 tree: remove mutation_reader.hh include
In most files it was unused. We should move these to the patch which
moved out the last interesting reader from mutation_reader.hh (and added
the corresponding new header include) but its probably not worth the
effort.
Some other files still relied on mutation_reader.hh to provide reader
concurrency semaphore and some other misc reader related definitions.
2022-03-30 15:42:51 +03:00
Botond Dénes
20c9e556e1 mutation_reader: move mrlog (mutation reader logger) to readers/ 2022-03-30 15:42:51 +03:00
Botond Dénes
b7954138ac mutation_reader: move compacting reader into readers/ 2022-03-30 15:42:51 +03:00
Botond Dénes
11c378a175 mutation_reader: move queue reader to readers/ 2022-03-30 15:42:51 +03:00
Botond Dénes
11109f4c45 mutation_reader: move mutation source into readers/ 2022-03-30 15:42:51 +03:00
Botond Dénes
7eae66efe0 mutation_reader: move slicing filtering reader into readers/
Only the declaration has to be moved, the definition is already in
readers/mutation_readers.cc.
2022-03-30 15:42:51 +03:00
Botond Dénes
f24f2f726a mutation_reader: move filtering reader into readers/ 2022-03-30 15:42:51 +03:00
Botond Dénes
d0ea895671 readers: move multishard reader & friends to reader/multishard.cc
Since the multishard reader family weighs more than 1K SLOC, it gets
its own .cc file.
2022-03-30 15:42:51 +03:00
Botond Dénes
3505ef8a49 mutation_reader: remove unused remote_fill_buffer_result 2022-03-30 15:42:51 +03:00
Botond Dénes
f8015d9c26 readers: move combined reader into readers/
Since the combined reader family weighs more than 1K SLOC, it gets its
own .cc file.
2022-03-30 15:42:51 +03:00
Botond Dénes
0c3d4091a4 Merge "Make TWCS' cleanup bucket aware" from Raphael S. Carvalho
"
Quoting patch 3/4:
"This continues the work in a69d98c3d0,
by implementing the cleanup method in TWCS to make it bucket aware.
Till now, the default impl was used which cleanups on file at a
time, starting from the smallest.

The cleanup strategy for TWCS is simple. It's simply calling the
size tiered cleanup method for each bucket, so there will be
one job for each tier in each window.

The next strategies to receive this improvement are LCS and ICS
(the latter one being only available in enterprise).

Refs #10097."

** Simply put, the goal is to reduce writeamp when performing cleanup
on a TWCS table, therefore reducing the operation time. **

tests: unit(dev).
"

* 'twcs_cleanup_bucket_aware/v1' of https://github.com/raphaelsc/scylla:
  tests: sstable_compaction_test: Add test for TWCS' bucket-aware cleanup
  compaction: TWCS: Implement cleanup method for bucket awareness
  compaction: TWCS: change get_buckets() signature to work with const qualified functions
  compaction_strategy: get_cleanup_compaction_jobs: accept candidates by value
2022-03-30 11:45:28 +03:00
Pavel Emelyanov
edd0481b38 Merge 'Scrub compaction: prevent mishandling of range tombstone changes' from Botond
With v2 having individual bounds of range tombstone as separate
fragments, out-of-order fragments become more difficult to handle,
especially in the presence of active range tombstone.
Scrub in both SKIP and SEGREGATE mode closes the partition on
seeing the first invalid fragment (SEGREAGE re-opens it immediately).
If there is an active range tombstone, scrub now also has to take care
of closing said tombstone when closing the partition. In a normal stream
it could just use the last position-in-partition to create a closing
bound. But when out-of-order fragments are on the table this is not
possible: the closing bound may be found later in the stream, with a
position smaller than that of the current position-in-partition.
To prevent extending range tombstone changes like that, Scrub now aborts
the compaction on the first invalid fragment seen *inside* an active
range tombstone.
Fixing a v2 stream with range tombstone changes is definitely possible,
but non-trivial, so we defer it until there is demand for it.

This series also makes the mutation fragment stream validator check for
open range tombstones on partition-end and adds a comprehensive
test-suite for the validator.

Fixes: #10168

Tests: unit(dev)

* scrub-rtc-handling-fix/v2 of github.com/denesb/scylla.git:
  compaction/compaction: abort scrub when attempting to rectify stream with active tombstone
  test/boost/mutation_test: add test for mutation_fragment_stream_validator
  mutation_fragment_stream_validator: validate range tombstone changes
2022-03-30 11:42:52 +03:00
Pavel Emelyanov
ba6d2ecc6f api: Remove unused argument from set_tables_autocompaction helper
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20220329093113.5953-1-xemul@scylladb.com>
2022-03-30 11:42:52 +03:00
Raphael S. Carvalho
177a8e8259 compaction_manager: allow sstable to be moved into rewrite_sstable()
Caller was already trying to move sstable, but rewrite_sstable() signature
was incorrect.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20220329022149.250655-1-raphaelsc@scylladb.com>
2022-03-30 11:42:52 +03:00
Kamil Braun
475a408792 test: raft: randomized_nemesis_test: on timeout, abort calls instead of discarding them
When a Raft API call such as `add_entry`, `set_configuration` or
`modify_config` takes too long, we need to time-out. There was no way to
abort these calls previously so we would do that by discarding the futures.
Recently the APIs were extended with `abort_source` parameters. Use this.

Also improve debuggability if the functions throw an exception type that
we don't expect. Previously if they did, a cryptic assert would fail
somewhere deep in the generator code, making the problem hard to debug.
2022-03-29 18:25:26 +02:00
Kamil Braun
0f0d75fd66 raft: server: translate semaphore_aborted to request_aborted 2022-03-29 15:10:29 +02:00
Kamil Braun
8d4dd53c25 test: raft: logical_timer: add abortable version of sleep_until
`sleep_until(time_point tp, abort_source& as)` will sleep until `tp`, or
until `as.request_abort()` is called, whatever comes first.
2022-03-29 15:10:29 +02:00
Kamil Braun
6f9dcd784c test: raft: randomized_nemesis_test: collect statistics on successful and failed ops 2022-03-29 15:10:29 +02:00
Raphael S. Carvalho
a1fd9c1ee8 tests: sstable_compaction_test: Add test for TWCS' bucket-aware cleanup
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2022-03-29 09:52:11 -03:00
Raphael S. Carvalho
568bb40127 compaction: TWCS: Implement cleanup method for bucket awareness
This continues the work in a69d98c3d0,
by implementing the cleanup method in TWCS to make it bucket aware.
Till now, the default impl was used which cleanups on file at a
time, starting from the smallest.

The cleanup strategy for TWCS is simple. It's simply calling the
size tiered cleanup method for each bucket, so there will be
one job for each tier in each window.

The next strategies to receive this improvement are LCS and ICS
(the latter one being only available in enterprise).

Refs #10097.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2022-03-29 09:52:06 -03:00
Raphael S. Carvalho
8f4c04c38a compaction: TWCS: change get_buckets() signature to work with const qualified functions
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2022-03-29 09:49:14 -03:00
Raphael S. Carvalho
2a9bfa3e3f compaction_strategy: get_cleanup_compaction_jobs: accept candidates by value
Then caller can decide whether to copy or move candidate set into the
function. cleanup_sstables_compaction_task can move candidates as
it's no longer needed once it retrieves all descriptors.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2022-03-29 09:49:13 -03:00