31056 Commits

Author SHA1 Message Date
Benny Halevy
40ad057b6c database: delete db_apply_executor forward declaration
The class is long gone, since version 3.0.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20220407094632.2647967-1-bhalevy@scylladb.com>
2022-04-07 17:11:38 +03:00
Pavel Solodovnikov
293c5f39ee service: raft_group0: make join_group0 re-entrant
Detect if we have already finished joining group0 before
and do nothing in that case.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2022-04-07 12:36:40 +03:00
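The re-entrancy described in the commit above can be pictured with a small Python model. This is only a sketch: the `Group0` class, its `joined` flag and the `join_attempts` counter are hypothetical stand-ins, not the actual `service::raft_group0` code.

```python
class Group0:
    """Toy stand-in for a group0 membership object (hypothetical)."""

    def __init__(self):
        self.joined = False
        self.join_attempts = 0

    def join_group0(self):
        # Re-entrant: if an earlier call already finished joining, do nothing.
        if self.joined:
            return False
        # Placeholder for the real join procedure.
        self.join_attempts += 1
        self.joined = True
        return True
```

Calling `join_group0()` a second time returns without repeating the join work, which is what lets a caller invoke it both early and at the usual point in startup.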
Pavel Solodovnikov
057a12e213 service: storage_service: add join_group0 method
Just delegates work to `service::raft_group0::join_group0()`
so that it can be used in `main` to activate raft group0
early in some cases (before waiting for gossiper to settle).

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2022-04-07 12:36:33 +03:00
Pavel Solodovnikov
0d5e2157e1 raft_group_registry: update gossiper state only on shard 0
Since `gossiper::add_local_application_state` is not
safe to call concurrently from multiple shards (which
will cause a deadlock inside the method), call this
only on shard 0 in `_raft_support_listener`.

This fixes sporadic hangs during startup of a fresh node in an
empty cluster.

Tests: unit(dev), manual

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2022-04-07 12:33:40 +03:00
Pavel Solodovnikov
7903d2afa8 raft: don't update gossiper state if raft is enabled early or not enabled at all
There is a listener in the `raft_group_registry`
which makes the gossiper re-publish the supported
features app state to the cluster.

We don't need to do this in case `USES_RAFT_CLUSTER_MANAGEMENT`
feature is enabled before the usual time, i.e. before the
gossiper settles. So, short-circuit the listener logic in
that case and do nothing.

Also, don't do anything if raft group registry is not enabled
at all, this is just a generic safeguard.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2022-04-07 12:31:29 +03:00
Pavel Solodovnikov
ccb59ba6c7 gms: feature_service: add cluster_uses_raft_mgmt accessor method
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2022-04-07 12:30:21 +03:00
Wojciech Mitros
97408078a1 dependencies: add rust
The main reason for adding rust dependency to scylla is the
wasmtime library, which is written in rust. Although there
exist c++ bindings, they don't expose all of its features,
so we want to do that ourselves using rust's cxx.

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
[avi: update toolchain]
[avi: remove example, saving for a follow-on]
2022-04-07 12:26:05 +03:00
Botond Dénes
ad075b27a4 test/lib/mutation_diff: s/colordiff/diff/
Colordiff is problematic when writing the diff into a file for later
examination. Use regular diff instead. One can still get syntax
highlighting by writing the output into `.diff` file (which most editors
will recognize).

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20220407080944.324108-1-bdenes@scylladb.com>
2022-04-07 12:07:24 +03:00
Michael Livshin
da7c7fd3dc delete code of the unused normalizing_reader class
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
Message-Id: <20220406161107.2376568-3-michael.livshin@scylladb.com>
2022-04-07 09:29:41 +03:00
Michael Livshin
d8598d048a enormous_table_reader: inherit from flat_mutation_reader_v2::impl
(completely mechanical change)

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
Message-Id: <20220406161107.2376568-2-michael.livshin@scylladb.com>
2022-04-07 09:29:41 +03:00
Michael Livshin
702ad7447a enormous_table_reader: remove the duplicate _schema field
flat_mutation_reader{,_v2}::impl already contains one, which makes for
very exciting debugging experience (and no, clang does not mind at
all).

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
Message-Id: <20220406161107.2376568-1-michael.livshin@scylladb.com>
2022-04-07 09:29:41 +03:00
Pavel Emelyanov
9066224cf4 table: Don't export compaction manager reference
There's a public call on replica::table to get back the compaction
manager reference. It's not needed, actually. The users of the call are
distributed loader which already has database at hand, and a test that
creates its own instance of compaction manager for its testing tables
and thus also has it available.

tests: unit(dev)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20220406171351.3050-1-xemul@scylladb.com>
2022-04-07 09:27:45 +03:00
Pavel Emelyanov
2cab2a32b8 database: Coroutinize close_tables
To make next patch a bit simpler

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-04-06 18:43:32 +03:00
Pavel Emelyanov
401c0edea2 test: Add test for cross_shard_barrier::abort()
The test runs a loop of arrivals, each of which can randomly
throw before arriving. As a result, the test expects all shards
to resolve into an exception in the same phase.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-04-06 18:21:59 +03:00
Pavel Emelyanov
8d7a7cbe21 cross-shard-barrier: Add .abort() method
The method makes all the .arrive_and_wait()s in the current phase
resolve with a barrier_aborted_exception() exceptional future.

The barrier then turns into a broken state and is not supposed to
serve any subsequent arrivals in any reasonable way.

The .abort() method is re-entrant in two senses. The first is that
more than one shard can abort a barrier, which is pretty natural.
The second is that exception-safety fuses imply that if
arrive_and_wait() resolves into an exception, the caller will try
to abort() the barrier as well, even though the phase would be over.
This case is also "supported".

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-04-06 18:21:59 +03:00
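As a rough analogy for the abort semantics above, Python's standard `threading.Barrier` behaves similarly: `abort()` breaks the barrier, every waiter gets `BrokenBarrierError`, and calling `abort()` more than once is harmless. The sketch below uses threads in place of shards; `run_phase` and `ArrivalFailed` are hypothetical names, and nothing here is the Seastar-based implementation.

```python
import threading


class ArrivalFailed(Exception):
    """Hypothetical error raised by a worker before it reaches the barrier."""


def run_phase(n_workers, failing):
    """Run one barrier phase; workers listed in `failing` throw before arriving."""
    barrier = threading.Barrier(n_workers)
    outcomes = [None] * n_workers

    def worker(i):
        try:
            if i in failing:
                raise ArrivalFailed(i)
            barrier.wait(timeout=5)
            outcomes[i] = "ok"
        except (ArrivalFailed, threading.BrokenBarrierError):
            # Both the failing worker and every woken waiter call abort();
            # repeated aborts of an already broken barrier are no-ops.
            barrier.abort()
            outcomes[i] = "aborted"

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outcomes
```

With one failing worker, every worker in the phase ends up in the aborted state, mirroring the expectation that all shards resolve into an exception in the same phase.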
Botond Dénes
18be2e9faf Merge "Remove gossiper->snitch kicking" from Pavel Emelyanov
"
Gossiper calls snitch->gossiper_starting() when being enabled. This
generates a dependency loop -- snitch needs gossiper to gossip its
states and get DC/RACK, gossiper needs snitch to do this kick.

This set removes this notification. The new approach is to kick the
snitch to gossip its states in the same places where gossiper is
enabled() so that only the snitch->gossiper dependency remains.

As a side effect the set ditches a bunch of references to global
snitch instance.

tests: unit(dev)
"

* 'br-snitch-gossiper-starting' of https://github.com/xemul/scylla:
  snitch: Remove gossiper_starting()
  snitch: Remove gossip_snitch_info()
  property-file snitch: Re-gossip states with the help of .get_app_states()
  property-file snitch: Reload state in .start()
  ec2 multi-region snitch: Register helper in .start()
  snitch, storage service: Gossip snitch info once
  snitch: Introduce get_app_states() method
  property-file snitch: Use _my_distributed to re-shard
  storage service: Shuffle snitch name gossiping
2022-04-06 17:41:36 +03:00
Piotr Sarna
2683b54402 Merge 'CQL3: Optional FINALFUNC and INITCOND for UDA' from Michał Jadwiszczak
Makes the final function and initial condition optional when
creating a UDA. No final function means the UDA returns the final state,
and the default initial condition is `null`.
Both items were optional in CQL's grammar but they were treated as required in code.

Additionally, I've added a check that the state function returns the state type.

Fixes #10324

Closes #10331

* github.com:scylladb/scylla:
  CQL3: check sfunc return type in UDA
  cql-pytest: UDA no final_func/initcond tests
  cql3: allow no final_func and no initcond in UDA
2022-04-06 16:04:47 +02:00
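The semantics described in this merge (no FINALFUNC means the final state is returned, and INITCOND defaults to `null`) can be modeled in a few lines of Python. The `run_uda` function and its parameter names are illustrative assumptions, not the cql3 implementation; `None` plays the role of CQL `null`.

```python
def run_uda(rows, sfunc, finalfunc=None, initcond=None):
    """Fold `rows` with the state function, starting from `initcond`.

    Without a final function, the accumulated state itself is the result.
    """
    state = initcond
    for value in rows:
        state = sfunc(state, value)
    return state if finalfunc is None else finalfunc(state)


def add(state, value):
    # A state function has to cope with a null (None) initial state.
    return value if state is None else state + value
```

For example, `run_uda([1, 2, 3], add)` folds from a null state and returns the state directly, while passing `finalfunc` and `initcond` restores the fully specified form.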
Michael Livshin
a90e02c302 skeleton_reader: inherit from flat_mutation_reader_v2::impl
(completely mechanical change)

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
Message-Id: <20220406122912.2248111-1-michael.livshin@scylladb.com>
2022-04-06 16:55:54 +03:00
Michael Livshin
6001a0fef1 multi_partition_reader: inherit from flat_mutation_reader_v2::impl
(completely mechanical change)

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
Message-Id: <20220406122122.2246058-1-michael.livshin@scylladb.com>
2022-04-06 16:55:07 +03:00
Michał Sala
28970389bc forward_service: uncoroutinize dispatch method
Done to mitigate potential miscompilations.
2022-04-06 15:01:31 +02:00
Michał Sala
edc32a7118 forward_service: uncoroutinize retrying_dispatcher
Done to mitigate potential miscompilations.
2022-04-06 14:52:59 +02:00
Michał Sala
59ff51c824 forward_service: retry a failed forwarder call
Failed-to-forward sub-queries will be executed locally (on a
super-coordinator). This local execution is meant as a fallback for
forward_requests that could not be sent to their destined coordinator
(e.g. due to the gossiper not reacting fast enough). Local execution was chosen
as the safest one - it does not require sending data to another
coordinator.
2022-04-06 14:44:55 +02:00
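The fallback described above amounts to "try to forward, and run locally if forwarding fails". A minimal Python sketch follows, assuming hypothetical `forward` and `execute_locally` callables and using `ConnectionError` as a stand-in for whatever failure the real forwarder reports.

```python
def dispatch(sub_query, forward, execute_locally):
    """Forward a sub-query; on failure, execute it on the local node instead."""
    try:
        return forward(sub_query)
    except ConnectionError:
        # Fallback: no data has to be sent to another coordinator.
        return execute_locally(sub_query)
```

The design choice in the commit message is visible here: the fallback path needs nothing beyond the local node, so it cannot fail for the same reason the forward did.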
Benny Halevy
17358ac2a0 cmake: CMakeLists.txt: rename flat_mutation_reader.cc to readers/mutation_readers.cc
It was moved in 31d84a254c00b36dc2576e06ee288e28a13238195.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20220406110512.3731011-3-bhalevy@scylladb.com>
2022-04-06 14:10:34 +03:00
Benny Halevy
4b3d0643a8 cmake: CMakeLists.txt: remove connection_notifier.cc
It was removed in 3aa05f7f03.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20220406110512.3731011-2-bhalevy@scylladb.com>
2022-04-06 14:10:33 +03:00
Benny Halevy
8d95e12ecd cmake: CMakeLists.txt: update source paths
Those were moved to subdirectories.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20220406110512.3731011-1-bhalevy@scylladb.com>
2022-04-06 14:10:32 +03:00
Avi Kivity
82733aeadb Merge 'Perf: Add extended template version of timed_perf + use in CL perf' from Calle Wilund
Adds a sub-template for time_parallel with a templated result type and an optional per-iteration post-process function. The idea is that Res may be a subtype of perf_result with additional stats, initialized on init, and the post-process function can fix up and apply the stats, so we can add stats to the result.

Then uses this mighty construct to add some IO stats to CL perf.

Closes #10334

* github.com:scylladb/scylla:
  perf_commitlog: Add bytes + bytes written stats
  perf: Add aio_writes mixin for perf_results
  test/perf/perf.hh: Make templated version of test routine to allow extended stats
2022-04-06 12:52:53 +03:00
Nadav Har'El
0f3cd6ad18 test/cql-pytest: fix fails_without_raft tests on Cassandra
We had a Python typo ("false" instead of "False") which prevented
tests with the fails_without_raft marker from running on Cassandra.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220405170337.36321-1-nyh@scylladb.com>
2022-04-06 11:20:25 +03:00
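Why such a typo breaks things: in Python, `False` is a built-in constant, while lowercase `false` is just an undefined name that raises `NameError` when evaluated. The helper below only demonstrates that difference; `evaluates` is a hypothetical name and the snippet is not taken from the cql-pytest sources.

```python
def evaluates(expr):
    """Return True if `expr` evaluates without a NameError."""
    try:
        eval(expr, {})
        return True
    except NameError:
        return False
```

Here `evaluates("False")` succeeds and `evaluates("false")` does not, so code that reaches a lowercase `false` fails at run time rather than being treated as a boolean.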
Jadw1
b560286ffe CQL3: check sfunc return type in UDA
The return type of the state function is now checked when creating a UDA.
Appropriate test added to cql-pytest.
2022-04-06 09:25:17 +02:00
Jadw1
977d6ac8b0 cql-pytest: UDA no final_func/initcond tests
Cql-pytests to check if UDA works properly without final function
or initial condition.
2022-04-06 09:25:12 +02:00
Jadw1
c921efd1b3 cql3: allow no final_func and no initcond in UDA
Makes the final function and initial condition optional when
creating a UDA. No final function means the UDA returns the final state,
and the default initial condition is `null`.

Fixes: #10324
2022-04-06 09:08:50 +02:00
Kamil Braun
424411ee5f test: raft: randomized_nemesis_test: enable entry forwarding
The test will now, with probability 1/2, enable forwarding of entries by
followers to leaders. This is possible thanks to the new abort_source&
APIs which we use to ensure that no operations are running on servers
before we destroy them.
2022-04-05 19:29:26 +02:00
Nadav Har'El
cfe04e6437 test/cql-pytest: nicer error message if a test can't find nodetool
When testing Scylla, cql-pytest does *not* need an external nodetool
command - it uses the REST API instead because it is much faster and
there is no need to install anything. However, if cql-pytest is run
against Cassandra, the tests do want to use the "nodetool" utility and
want to know what it is. The tests use either the NODETOOL environment
variable, or if that doesn't exist, look for "nodetool" in the path.

If nodetool wasn't found in that way, before this patch, we got an ugly
error message with long irrelevant Python backtraces. It wasn't easy
to understand that what actually happened was that the user forgot
to set the NODETOOL environment variable.

This patch cleans up this error handling. Now, if nodetool cannot be
found, every test that tries to run nodetool will report just a one-
line error message, clearly explaining what went wrong and how to
fix it:

        Error: Can't find nodetool. Please set the NODETOOL
        environment variable to the path of the nodetool utility.

To reiterate, when testing Scylla, nodetool is *not* needed even after
this patch. These errors will not happen even if you don't have the
nodetool utility. You only need nodetool if you plan to test Cassandra.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220405171835.43992-1-nyh@scylladb.com>
2022-04-05 20:29:02 +03:00
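The lookup order described in this commit (the NODETOOL environment variable first, then "nodetool" in the path, otherwise a one-line error) could look roughly like the sketch below. `find_nodetool` and `NodetoolNotFound` are hypothetical names, and the `environ` and `which` parameters exist only to make the sketch testable; this is not the actual cql-pytest code.

```python
import os
import shutil


class NodetoolNotFound(Exception):
    """Raised with a short, self-explanatory message (hypothetical class)."""


def find_nodetool(environ=None, which=shutil.which):
    """Return the nodetool path from NODETOOL, else from the search path."""
    environ = os.environ if environ is None else environ
    path = environ.get("NODETOOL") or which("nodetool")
    if path is None:
        raise NodetoolNotFound(
            "Error: Can't find nodetool. Please set the NODETOOL "
            "environment variable to the path of the nodetool utility."
        )
    return path
```

Raising a dedicated exception with the full message lets the test harness print one line instead of a long backtrace.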
Kamil Braun
f31c61c7c9 test: raft: randomized_nemesis_test: increase logging level on some rare operations
Increase the logging level on the few operations which happen at the end
of the test but make debugging a bit easier if the test hangs for some
reason.
2022-04-05 19:19:59 +02:00
Kamil Braun
ad3141d3e0 raft: server: translate abort_requested_exception to raft::request_aborted
The `wait_for_leader` function would throw a low-level
`abort_requested_exception` from seastar::shared_promise.
Translate it to the high-level raft::request_aborted so we can reduce
the number of different exception types which cross the Raft API
boundary.

Also, add comments on Raft API functions about the exception thrown when
requests are aborted.
2022-04-05 19:18:53 +02:00
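The translation pattern in this commit, catching a low-level exception at the API boundary and re-raising a single high-level one, is sketched below in Python. `AbortRequested` and `RequestAborted` are hypothetical stand-ins for the Seastar and Raft exception types, and `wait` is a placeholder for the underlying wait operation.

```python
class AbortRequested(Exception):
    """Stand-in for the low-level abort exception (hypothetical)."""


class RequestAborted(Exception):
    """Stand-in for the high-level Raft API exception (hypothetical)."""


def wait_for_leader(wait):
    """Call `wait()` and translate a low-level abort into the API-level error."""
    try:
        return wait()
    except AbortRequested as exc:
        # Keep the original exception as the cause for debugging.
        raise RequestAborted("wait for leader was aborted") from exc
```

Callers of the API then only need to handle one abort-related exception type.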
Kamil Braun
7da586b912 raft: fsm: when stopping, become follower to reject new requests
After enabling add_entry forwarding in randomized_nemesis_test, the test
would sometimes hang on the _rpc->abort() call: add_entry messages
from followers were waiting on log_limiter_semaphore on the leader,
preventing _rpc from finishing the abort, and the log_limiter_semaphore would
not get unblocked because part of the server was already stopped.

Prevent log_limiter_semaphore from being waited on when stopping the
server by becoming a follower in fsm::stop.
2022-04-05 19:11:44 +02:00
Calle Wilund
af28fb6d94 perf_commitlog: Add bytes + bytes written stats
Uses the extended perf_result with aio_writes + aio_write_bytes to
include some IO stats for the benchmark.
2022-04-05 13:43:57 +00:00
Calle Wilund
5b60a6cf7c perf: Add aio_writes mixin for perf_results
Can be used with time_parallel_ex. Adds measurements for aio writes/aio written bytes.
2022-04-05 13:42:36 +00:00
Calle Wilund
12ab34a3d9 test/perf/perf.hh: Make templated version of test routine to allow extended stats
Adds a sub-template for time_parallel with a templated result type and an
optional per-iteration post-process function. The idea is that Res may be a
subtype of perf_result with additional stats, initialized on init, and the
post-process function can fix up and apply the stats, so we can add stats to the result.
2022-04-05 13:30:42 +00:00
Avi Kivity
0d5fd526a5 Merge "tools/scylla-sstable alternative schema load method for system tables" from Botond
"
Examining sstables of system tables is quite a common task. Having to
dump the schemas of such tables into a schema.cql is annoying knowing
that these schemas are readily available in scylla, as they are
hardcoded. This mini-series adds a method to make use of this fact, by
adding a new option: `--system-schema`, which takes the name of a system
table and looks up its schema.

Tests: unit(dev)
"

* 'scylla-sstable-system-schema/v1' of https://github.com/denesb/scylla:
  tools/scylla-sstable: add alternative schema load method for system tables
  tools/schema_loader: add load_system_schema()
  db/system_distributed_keyspace: add all tables methods
  tools/scylla-sstable: reorganize main help text
2022-04-05 15:48:29 +03:00
Avi Kivity
6cfc1d6f6a Update seastar submodule
* seastar 798ec50701...2a2a13058e (2):
  > condition_variable: Add "has_waiters()" accessor + test
  > Merge "RPC tester" from Pavel E
2022-04-05 13:47:51 +03:00
Gleb Natapov
7bf557332f storage_service: remove maybe from maybe_start_sys_dist_ks
There is nothing "maybe" about it now.

Message-Id: <Ykv/bj8MvKh0UU23@scylladb.com>
2022-04-05 12:49:56 +03:00
Benny Halevy
abbf5de68c frozen_mutation: introduce consume method
Allows consuming the frozen_mutation directly
into a stream rather than unfreezing it first
and then consuming the unfrozen mutation.

Streaming directly from the frozen_mutation
saves both CPU and memory, and will make it
easier to make the operation async as a follow-up, to allow
yielding, e.g. between rows.

This is used today only in to_data_query_result
which is invoked on the read-repair path.

Refs #10038
Fixes #10021

Test: unit(release)

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20220405055807.1834494-1-bhalevy@scylladb.com>
2022-04-05 10:51:21 +03:00
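The difference between "unfreeze, then consume" and "consume the frozen form directly" can be illustrated with a toy serialization in Python. Everything here is an assumption made for illustration: a frozen value is modeled as packed 64-bit integers, and `freeze`, `unfreeze` and `consume` are hypothetical helpers unrelated to the real frozen_mutation format.

```python
import struct


def freeze(rows):
    """Serialize integer rows into one bytes object (toy frozen form)."""
    return b"".join(struct.pack("<q", r) for r in rows)


def unfreeze(frozen):
    """Materialize every row in memory before anyone can consume them."""
    return [value for (value,) in struct.iter_unpack("<q", frozen)]


def consume(frozen, consumer):
    """Feed rows to `consumer` one at a time, straight from the frozen bytes."""
    for (value,) in struct.iter_unpack("<q", frozen):
        consumer(value)
```

In this model `consume` never builds the intermediate list, which is the kind of saving the commit message refers to, and the per-row loop is a natural place to yield between rows.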
Nadav Har'El
67e0590bbc alternator: remove old TODO (with test verifying it)
We had an old TODO in the Alternator "Scan" operation code which
suggested that we may need to do something to limit the size of pages
when a row limit ("Limit") isn't given.

But we do already have a built-in limit on page sizes (1 MB),
so this TODO isn't needed and can be removed.

But I also wanted to make sure we have a test that this limit works:

We already had a test that this 1 MB limit works for a single-partition
Query (test_query.py::test_query_reverse_longish - tested both forward
and reversed queries). In this patch I add a similar test for a whole-
table Scan. It turns out that although page size is limited in this case
as well, it's not exactly 1 MB... For small tables it can even reach 3 MB.
I consider this "good enough" and that we can drop the TODO, but also
opened issue #10327 to document this surprising (for me) finding.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220404145240.354198-1-nyh@scylladb.com>
2022-04-05 09:23:23 +03:00
Nadav Har'El
56936d3c16 test/alternator: add reproducers for scan of long string of tombstones
This patch adds two xfailing tests for issue #7933. That issue is about
what Scan or Query paging does when encountering a very long string of
consecutive tombstones (partition or row tombstones). Ideally, in that
case the scan could stop on one of these tombstones after already
processing too many. But as these two tests demonstrate, the scan can't
stop in the middle of a long string of tombstones - and as a result
retrieving a single page can take an unbounded amount of time, which is
wrong.

Currently the tests are marked `@veryslow` (they each take more than a
minute) because they each create a huge number of tombstones to
demonstrate a huge amount of work for a single page. When we fix
issue #7933 and have a much smaller limit on the number of tombstones
processed in a single page, we can hopefully make these tests much
shorter and remove the `@veryslow` tag. The `@veryslow` tag means
that although these tests can be used manually (with `--runveryslow`)
they will not yet be run as part of the usual regression tests.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220403070706.250147-1-nyh@scylladb.com>
2022-04-05 09:11:38 +03:00
Raphael S. Carvalho
840500fc4d compaction: Make cleanup for Leveled strategy bucket-aware
Bucket awareness in cleanup was introduced in a69d98c3d0.
STCS and TWCS already support it, and now LCS will receive it.

The goal of bucket awareness is to reduce writeamp in cleanup,
therefore reducing operation time. Additionally, garbage collection
becomes more efficient as shadowed data can now be potentially
compacted with the data that shadows it, assuming they're on
the same level.

The implementation for LCS is simple. It reuses the STCS procedure
for returning jobs in level 0, and returns one job for each
non-empty level > 0. What allows us to do it
is our incremental selection approach used in compaction,
which sets a limit on memory usage and disk space requirement.

Fixes #10097.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20220331173417.211257-1-raphaelsc@scylladb.com>
2022-04-05 09:10:21 +03:00
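The job selection described above (level 0 handled by the STCS procedure, plus one job per non-empty level above 0) can be sketched as follows. `lcs_cleanup_jobs`, the `sstables_by_level` mapping and the `one_bucket` placeholder are hypothetical; in particular, `one_bucket` does not implement real size-tiered bucketing.

```python
def one_bucket(level0_sstables):
    """Placeholder for the STCS procedure: a single job if level 0 is non-empty."""
    return [list(level0_sstables)] if level0_sstables else []


def lcs_cleanup_jobs(sstables_by_level, level0_jobs=one_bucket):
    """Return cleanup jobs: STCS-style jobs for level 0, one job per level > 0."""
    jobs = list(level0_jobs(sstables_by_level.get(0, [])))
    for level in sorted(l for l in sstables_by_level if l > 0):
        if sstables_by_level[level]:  # skip empty levels
            jobs.append(list(sstables_by_level[level]))
    return jobs
```

Grouping a whole level into one job is what allows shadowed data to be compacted together with the data that shadows it when both sit on the same level.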
Benny Halevy
2d80057617 range_tombstone_list: insert_from: correct rev.update range_tombstone in not overlapping case
2nd std::move(start) looks like a typo
in fe2fa3f20d.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20220404124741.1775076-1-bhalevy@scylladb.com>
2022-04-04 22:26:29 +02:00
Tomasz Grabiec
0a3aba36e6 Merge 'range_tombstone_change_generator: flush: emit closing range_tombstone_change' from Benny Halevy
When the highest tombstone is open ended, we must
emit a closing range_tombstone_change at
position_in_partition::after_all_clustered_rows().

Since all consumers need to do it, implement the logic
in the range_tombstone_change_generator itself.

It turned out that mutation::consume doesn't do that,
hence this series, and 5a09e5234ef4e1ee673bc7fca481defbbb2c0384 in particular,
fix the issue.

Change 028b2a8cdfdc12721b2be23d175cbc756d2507de exposes the issue
by generating a richer set of random range_tombstone that include open-ended
range tombstones.

Fixes #10316

Test: unit(dev)

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #10317

* github.com:scylladb/scylla:
  test: random_mutation_generator: make more interesting range tombstones
  reader: upgrading_consumer: let range_tombstone_change_generator emit last closing change
  range_tombstone_change_generator: flush: emit end_position when upper limit is after all clustered rows
  range_tombstone_change_generator: flush: use tri_compare rather than less
  range_tombstone_change_generator: flush: return early if empty
2022-04-04 19:07:45 +02:00
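To picture why an open-ended tombstone needs a closing change, here is a simplified Python model. It assumes sorted, non-overlapping tombstones given as `(start, end, timestamp)` with `end=None` meaning open-ended, and it uses `float("inf")` as a stand-in for `position_in_partition::after_all_clustered_rows()`. The `to_changes` function is hypothetical and much simpler than the real generator.

```python
AFTER_ALL_CLUSTERED_ROWS = float("inf")  # stand-in for the end-of-partition position


def to_changes(tombstones):
    """Turn range tombstones into (position, timestamp-or-None) changes.

    A change with timestamp None closes the currently open tombstone.
    """
    changes = []
    for start, end, timestamp in tombstones:
        changes.append((start, timestamp))
        # An open-ended tombstone is closed after all clustered rows.
        close_at = AFTER_ALL_CLUSTERED_ROWS if end is None else end
        changes.append((close_at, None))
    return changes
```

Without the final closing entry, a consumer of the change stream would be left with a tombstone that is still open at the end of the partition.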
Michał Sala
e170961b4d forward_service: copy arguments/captured vars to local variables
Copying captured variables into local variables (that live in a
coroutine's frame) is a mitigation of suspected lifetime issues.
Arguments of forward_service::dispatch are also copied (to prevent
potential undefined behavior or miscompilation triggered by
referencing the arguments in a capture list of a lambda that produces a
coroutine).
2022-04-04 16:58:08 +02:00
Benny Halevy
b3e2bbe5bd test: random_mutation_generator: make more interesting range tombstones
Include also singular prefix and semi-bounded range tombstones.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-04-04 17:34:49 +03:00
Piotr Grabowski
63fa5ac915 generic_server.hh: add missing include
Add missing include of "<list>" which caused compile errors on GCC:

In file included from generic_server.cc:9:
generic_server.hh:91:10: error: ‘list’ in namespace ‘std’ does not name a template type
   91 |     std::list<gentle_iterator> _gentle_iterators;
      |          ^~~~
generic_server.hh:19:1: note: ‘std::list’ is defined in header ‘<list>’; did you forget to ‘#include <list>’?
   18 | #include <seastar/net/tls.hh>
  +++ |+#include <list>
   19 |

Note that there are some GCC compilation problems still left apart from
this one.

Closes #10328
2022-04-04 17:31:55 +03:00