Commit Graph

30749 Commits

Author SHA1 Message Date
Benny Halevy
abbf5de68c frozen_mutation: introduce consume method
Allowing to consume the frozen_mutation directly
to a stream rather than unfreezing it first
and then consuming the unfrozen mutation.

Streaming directly from the frozen_mutation
saves both cpu and memory, and will make it
easier to be made async as a follow, to allow
yielding, e.g. between rows.

This is used today only in to_data_query_result
which is invoked on the read-repair path.

Refs #10038
Fixes #10021

Test: unit(release)

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20220405055807.1834494-1-bhalevy@scylladb.com>
2022-04-05 10:51:21 +03:00
Nadav Har'El
67e0590bbc alternator: remove old TODO (with test verifying it)
We had an old TODO in the Alternator "Scan" operation code which
suggested that we may need to do something to limit the size of pages
when a row limit ("Limit") isn't given.

But we do already have a built-in limit on page sizes (1 MB),
so this TODO isn't needed and can be removed.

But I also wanted to make sure we have a test that this limit works:

We already had a test that this 1 MB limit works for a single-partition
Query (test_query.py::test_query_reverse_longish - tested both forward
and reversed queries). In this patch I add a similar test for a whole-
table Scan. It turns out that although page size is limited in this case
as well, it's not exactly 1 MB... For small tables can even reach 3 MB.
I consider this "good enough" and that we can drop the TODO, but also
opened issue #10327 to document this surprising (for me) finding.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220404145240.354198-1-nyh@scylladb.com>
2022-04-05 09:23:23 +03:00
Nadav Har'El
56936d3c16 test/alternator: add reproducers for scan of long string of tombstones
This patch adds two xfailing tests for issue #7933. That issue is about
what Scan or Query paging does when encountering a very long string of
consecutive tombstones (partition or row tombstones). Ideally, in that
case the scan could stop on one of these tombstones after already
processing too many. But as these two tests demonstrate, the scan can't
stop in the middle of a long string of tombstones - and as a result
retrieving a single page can take an unbounded amount of time, which is
wrong.

Currently the tests are marked `@veryslow` (they each take more than a
minute) because they each create a huge number of tombstones to
demonstrate a huge amount of work for a single page. When we fix
issue #7933 and have a much smaller limit on the number of tombstones
processed in a single page, we can hopefully make these tests much
shorter and remove the `@veryslow` tag. The `@veryslow` tags means
that although these tests can be used manually (with `--runveryslow`)
they will not yet be run as part of the usual regression tests.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220403070706.250147-1-nyh@scylladb.com>
2022-04-05 09:11:38 +03:00
Raphael S. Carvalho
840500fc4d compaction: Make cleanup for Leveled strategy bucket-aware
Bucket awareness in cleanup was introduced in a69d98c3d0.
STCS and TWCS already support it, and now LCS will receive it.

The goal of bucket awareness is to reduce writeamp in cleanup,
therefore reducing operation time. Additionally, garbage collection
becomes more efficient as shadowed data can now be potentially
compacted with the data that shadows it, assuming they're on
the same level.

The implementation for LCS is simple. Will reuse the procedure
for STCS for returning jobs in level 0. And one job will be
returned for each non-empty level > 0. What allows us to do it
is our incremental selection approach used in compaction,
that sets a limit on memory usage and disk space requirement.

Fixes #10097.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20220331173417.211257-1-raphaelsc@scylladb.com>
2022-04-05 09:10:21 +03:00
Benny Halevy
2d80057617 range_tombstone_list: insert_from: correct rev.update range_tombstone in not overlapping case
2nd std::move(start) looks like a typo
in fe2fa3f20d.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20220404124741.1775076-1-bhalevy@scylladb.com>
2022-04-04 22:26:29 +02:00
Tomasz Grabiec
0a3aba36e6 Merge 'range_tombstone_change_generator: flush: emit closing range_tombstone_change' from Benny Halevy
When the highest tombstone is open ended, we must
emit a closing range_tombstone_change at
position_in_partition::after_all_clustered_rows().

Since all consumers need to do it, implement the logic
in the range_tombstone_change_generator itself.

It turned out that mutation::consume doesn't do that,
hence this series, and 5a09e5234ef4e1ee673bc7fca481defbbb2c0384 in particular,
fix the issue.

Change 028b2a8cdfdc12721b2be23d175cbc756d2507de exposes the issue
by generating a richer set of random range_tombstone that include open-ended
range tombstones.

Fixes #10316

Test: unit(dev)

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #10317

* github.com:scylladb/scylla:
  test: random_mutation_generator: make more interesting range tombstones
  reader: upgrading_consumer: let range_tombstone_change_generator emit last closing change
  range_tombstone_change_generator: flush: emit end_position when upper limit is after all clustered rows
  range_tombstone_change_generator: flush: use tri_compare rather than less
  range_tombstone_change_generator: flush: return early if empty
2022-04-04 19:07:45 +02:00
Benny Halevy
b3e2bbe5bd test: random_mutation_generator: make more interesting range tombstones
Include also singular prefix and semi-bounded range tombstones.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-04-04 17:34:49 +03:00
Piotr Grabowski
63fa5ac915 generic_server.hh: add missing include
Add missing include of "<list>" which caused compile errors on GCC:

In file included from generic_server.cc:9:
generic_server.hh:91:10: error: ‘list’ in namespace ‘std’ does not name a template type
   91 |     std::list<gentle_iterator> _gentle_iterators;
      |          ^~~~
generic_server.hh:19:1: note: ‘std::list’ is defined in header ‘<list>’; did you forget to ‘#include <list>’?
   18 | #include <seastar/net/tls.hh>
  +++ |+#include <list>
   19 |

Note that there are some GCC compilation problems still left apart from
this one.

Closes #10328
2022-04-04 17:31:55 +03:00
Lukasz Sojka
5727f196e3 Add big batch logs tests
Tests for warning and error lines in logfile when user executes
big batch (above preconfigured thresholds in scylla.yaml).

Signed-off-by: Lukasz Sojka <lukasz.sojka@scylladb.com>

Closes #10232
2022-04-04 17:25:13 +03:00
Takuya ASADA
f95a531407 docker: run scylla as root
Previous versions of Docker image runs scylla as root, but cb19048
accidently modified it to scylla user.
To keep compatibility we need to revert this to root.

Fixes #10261

Closes #10325
2022-04-04 17:25:13 +03:00
Pavel Emelyanov
9fdb49c86a Merge 'fix hang on shutdown while ddl query is running and there is no quorum' from Gleb
A node that runs DDL query while its cluster does not have a quorum
cannot be shutdown since the query is not abortable. The series makes it
abortable and also fixes the order in which components are shutdown to
avoid the deadlock.

* gleb/raft_shutdown_v4 of git@github.com:scylladb/scylla-dev.git:
  migration_manager: drain migration manager before stopping protocol servers on shutdown
  migration_manager: pass abort source to raft primitives
  storage_proxy: relax some read error reporting
2022-04-04 17:25:13 +03:00
Benny Halevy
002be743f6 reader: upgrading_consumer: let range_tombstone_change_generator emit last closing change
When flushing range tombstones up to
position_in_partition::after_all_clustered_rows(),
the range_tombstone_change_generator now emits
the closing range_tombstone_change, so there's
no need for the upgrading_consumer to do so too.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-04-04 17:00:53 +03:00
Benny Halevy
cd171f309c range_tombstone_change_generator: flush: emit end_position when upper limit is after all clustered rows
When the highest tombstone is open ended, we must
emit a closing range_tombstone_change at
position_in_partition::after_all_clustered_rows().

Since all consumers need to do it, implement the logic
int the range_tombstone_change_generator itself.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-04-04 17:00:53 +03:00
Benny Halevy
2c5a6b3894 range_tombstone_change_generator: flush: use tri_compare rather than less
less is already using tri_compare internally,
and we'll use tri_compare for equality in the next patch.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-04-04 17:00:53 +03:00
Benny Halevy
18a80a98b8 range_tombstone_change_generator: flush: return early if empty
Optimize the common, empty case.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-04-04 17:00:53 +03:00
Takuya ASADA
41edc045d9 docker: revert scylla-server.conf service name change
We changed supervisor service name at cb19048, but this breaks
compatibility with scylla-operator.
To fix the issue we need to revert the service name to previous one.

Fixes #10269

Closes #10323
2022-04-03 19:18:18 +03:00
Avi Kivity
538fabc05d Update tools/java submodule
* tools/java ac5d6c840d...9bc83b7a32 (1):
  > fix metadata printing for ME sstables
2022-04-03 18:33:04 +03:00
Alexey Kartashov
d86c3a8061 dist/docker: fix incorrect locale value
Docker build script contains an incorrect locale specification for LC_ALL setting,
this commit fixes that.

Fixes #10310

Closes #10321
2022-04-03 14:24:54 +03:00
David Garcia
934beb6e20 docs: update theme 1.2.1
Related issue scylladb/sphinx-scylladb-theme#395

ScyllaDB Sphinx Theme 1.2 is now released partying_face

We’ve added automatic checks for broken links and introduced numerous UI updates.

You can read more about all notable changes here.

Closes #10313
2022-04-03 13:45:07 +03:00
Nadav Har'El
0a67c87438 Update seastar submodule
* seastar 44389842...798ec507 (4):
  > CONTRIBUTING: update to note that pull requests are accepted (#1036)
  > semaphore: improve documentation of timeout and abort errors
  > condition_variable: fix cv.signal with active "when" wait would switch fiber
  > abortable_fifo: stop dereferencing null pointers

Fixes #10319 with "abortable_fifo: stop dereferencing null pointers".
2022-04-03 13:41:41 +03:00
Benny Halevy
5ca73019dd shard_reader_v2: do_fill_buffer: reserve buffer space ahead
To prevent unneeded reallocations, just reserve the
pre-known number of entries before pushing them.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20220402130847.625085-2-bhalevy@scylladb.com>
2022-04-03 11:28:32 +03:00
Benny Halevy
8ab57aa4ab shard_reader_v2: do_fill_buffer: maybe yield when copying result
Prevent a reactor stall with e.g. large number of range tombstones.

Fixes #10314

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20220402130847.625085-1-bhalevy@scylladb.com>
2022-04-03 11:05:14 +03:00
Tomasz Grabiec
9c96a37143 Merge "raft: nemesis test: use abort_source for time-outs" from Kamil
When a Raft API call such as `add_entry`, `set_configuration` or
`modify_config` takes too long, we need to time-out. There was no way to
abort these calls previously so we would do that by discarding the futures.
Recently the APIs were extended with `abort_source` parameters. Use this.

Also improve debuggability if the functions throw an exception type that
we don't expect. Previously if they did, a cryptic assert would fail
somewhere deep in the generator code, making the problem hard to debug.

Also collect some statistics in the test about the number of successful
and failed ops. I used it to manually check whether there was a
difference in how often operations fail with using the out timeout
method and the new timeout method (there doesn't seem to be any).

* kbr/nemesis-abort-source:
  test: raft: randomized_nemesis_test: on timeout, abort calls instead of discarding them
  raft: server: translate semaphore_aborted to request_aborted
  test: raft: logical_timer: add abortable version of `sleep_until`
  test: raft: randomized_nemesis_test: collect statistics on successful and failed ops
2022-04-01 16:25:23 +02:00
Pavel Emelyanov
886a275192 Merge 'replica/table: remove v1 reader factory methods' from Botond
Only users are internal and tests.

Tests: unit(dev)

* replica-table-remove-make-reader-v1/v2 of github.com/denesb/scylla.git
  replica/table: remove v1 reader factory methods
  tests: move away from table::make_reader()
  replica/table: add short make_reader_v2() variant:
2022-04-01 13:57:10 +03:00
Botond Dénes
9338affb8e replica/table: remove v1 reader factory methods 2022-04-01 13:52:08 +03:00
Botond Dénes
c8ea0715e9 tests: move away from table::make_reader()
Use v2 equivalents instead.
2022-04-01 13:39:26 +03:00
Botond Dénes
5aa97ccf0d replica/table: add short make_reader_v2() variant: 2022-04-01 13:39:26 +03:00
Botond Dénes
a325d3434a Merge "make_slicing_filtering_reader(): return flat mutation reader v2" from Michael Livshin
"
Tests: unit(dev)
"

* 'slicing-filtering-v2' of https://github.com/cmm/scylla:
  make_slicing_filtering_reader(): return flat mutation reader v2
  mutation_readers: refactor generic partition slicing logic
2022-04-01 11:08:25 +03:00
Nadav Har'El
758f8f01d7 test/alternator: turn REST API finding into a fixture
In test_tracing.py and util.py, we already have three duplicates of code
which looks for the Scylla REST API. We'll soon want to add even more uses
of this REST API, so it's good time to add a single fixture, "rest_api",
which can be use in all tests that need the Scylla REST API instead of
duplicating the same code.

A test using the "rest_api" fixture will be skipped if the server isn't
Scylla, or its port 10000 is not available or not responsive.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220331195337.64352-1-nyh@scylladb.com>
2022-04-01 10:51:59 +03:00
Raphael S. Carvalho
61c67105d2 compaction_manager: move internal stop functions into private namespace
They don't belong to public interface.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20220331202255.237688-1-raphaelsc@scylladb.com>
2022-04-01 10:50:27 +03:00
Michael Livshin
830aa041a8 make_slicing_filtering_reader(): return flat mutation reader v2
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-03-31 19:59:53 +03:00
Michael Livshin
aac51be0cc mutation_readers: refactor generic partition slicing logic
There are at least 1 actual and 1 potential users for it; this
change converts the existing one.

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-03-31 19:59:53 +03:00
Avi Kivity
af07519928 Merge "Remove reader from mutations v1" from Botond
"
First migrate all users to the v2 variant, all of which are tests.
However, to be able to properly migrate all tests off it, a v2 variant
of the restricted reader is also needed. All restricted reader users are
then migrated to the freshly introduced v2 variant and the v1 variant is
removed.
Users include:
* replica::table::make_reader_v2()
* streaming_virtual_table::as_mutation_source()
* sstables::make_reader()
* tests

This allows us to get rid of a bunch of conversions on the query path,
which was mostly v2 already.

With a few tests we did kick the can down the road by wrapping the v2
reader in `downgrade_to_v1()`, but this series is long enough already.

Tests: unit(dev), unit(boost/flat_mutation_reader_test:debug)
"

* 'remove-reader-from-mutations-v1/v3' of https://github.com/denesb/scylla:
  readers: remove now unused v1 reader from mutations
  test: move away from v1 reader from mutations
  test/boost/mutation_reader_test: use fragment_scatterer
  test/boost/mutation_fragment_test: extract fragment_scatterer into a separate hh
  test/boost: mutation_fragment_test: refactor fragment_scatterer
  readers: remove now unused v1 reversing reader
  test/boost/flat_mutation_reader_test: convert to v2
  frozen_mutation: fragment_and_freeze(): convert to v2
  frozen_mutation: coroutinize fragment_and_freeze()
  readers: migrate away from v1 reversing reader
  db/virtual_table: use v2 variant of reversing and forwardable readers
  replica/table: use v2 variant of reversing reader
  sstables/sstable: remove unused make_crawling_reader_v1()
  sstables/sstable: remove make_reader_v1()
  readers: add v2 variant of reversing reader
  readers/reversing: remove FIXME
  readers: reader from mutations: use mutation's own schema when slicing
2022-03-31 13:29:11 +03:00
Avi Kivity
5fc093ad42 Merge 'wasm: manage memory using exports from the client' from Wojciech Mitros
This patch adds importing the `malloc` and `free` method from the wasm client, and using them for allocating wasm memory for UDF arguments and freeing its result. When the methods are not exported, the old behaviour is used instead. To make that possible, this patch also includes a fix to the usage of pages in wasm memory (methods `size` and `grow`) that were used for allocating memory for arguments until now. (The source codes  for the examples didn't work on my machine in their original form, so when updating paging I've also added small unrelated modifications)

Tests:unit(dev)

Closes #10234

* github.com:scylladb/scylla:
  wasm: add wasm ABI version 2
  wasm: add WASI handling
  wasm: add documentation
  wasm: add _scylla_abi export for specifying abi for wasm udfs
  wasm: update ABI for passing parameters to wasm UDFs
  wasm: move common code to a separate function
  wasm: use wasm pages for wasm memory
2022-03-31 12:33:55 +03:00
Wojciech Mitros
56c5459c50 wasm: add null handling for wasm udf
As the name suggests, for UDFs defined as RETURNS NULL ON NULL
INPUT, we sometimes want to return nulls. However, currently
we do not return nulls. Instead, we fail on the null check in
init_arg_visitor. Fix by adding null handling before passing
arguments, same as in lua.

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>

Closes #10298
2022-03-31 12:27:38 +03:00
Piotr Sarna
c0fd53a9d7 cql3: fix qualifying restrictions with IN for indexing
When a query contains IN restriction on its partition key,
it's currently not eligible for indexing. It was however
erroneously qualified as such, which lead to fetching incorrect
results. This commit fixes the issue by not allowing such queries
to undergo indexing, and comes with a regression test.

Fixes #10300

Closes #10302
2022-03-31 11:04:17 +03:00
Nadav Har'El
3eaafbbdf7 test/cql-pytest: mark a test failing on Cassandra with cassandra_bug
We have a test for the LIKE restriction with ALLOW FILTERING.

Cassandra does not yet support this combination (it only supports LIKE
with SASI indexes), so this test fails on Cassandra, suggesting either
the test is wrong, or Cassandra is wrong. In this case, Cassandra is
wrong - they have an issue requesting this to be fixed -
https://issues.apache.org/jira/browse/CASSANDRA-17198, and even an
implementation which is being reviewed.

So let's mark this test with "cassandra_bug", meaning it is expected
to fail (xfail) when running against Cassandra. When CASSANDRA-17198
is fixed, we can remove the cassandra_bug mark.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220330211734.4103691-1-nyh@scylladb.com>
2022-03-31 09:47:44 +02:00
Botond Dénes
7d49afe78b readers: remove now unused v1 reader from mutations 2022-03-31 10:36:26 +03:00
Botond Dénes
fd69add579 test: move away from v1 reader from mutations
Use the v2 variant instead.
2022-03-31 10:36:23 +03:00
Botond Dénes
2e00ff314d test/boost/mutation_reader_test: use fragment_scatterer
Instead of the open-coded equivalent the test currently has.
2022-03-31 10:25:45 +03:00
Botond Dénes
feecc19d5b test/boost/mutation_fragment_test: extract fragment_scatterer into a separate hh
We want to use it in test/boost/mutation_reader_test.cc too.
2022-03-31 10:25:45 +03:00
Botond Dénes
226f01162e test/boost: mutation_fragment_test: refactor fragment_scatterer
Instead of taking an output parameter in the constructor, take just the
desired number of mutations to build and return the mutation list from
`consume_end_of_stream()`.
2022-03-31 10:25:45 +03:00
Botond Dénes
b8f0ab3b98 readers: remove now unused v1 reversing reader 2022-03-31 10:04:45 +03:00
Botond Dénes
56e3c6add6 test/boost/flat_mutation_reader_test: convert to v2 2022-03-31 10:04:29 +03:00
Gleb Natapov
c17a03727c migration_manager: drain migration manager before stopping protocol servers on shutdown
When protocol servers are stopping they wait for all active queries to
complete, but DDL queries use migration manager internally, so if they
hang there protocol servers will not be able to stop since migration
manager is drained afterwords. The patch moves the migration manager
draining before protocol servers stoppage.

Since after the patch migration managers is drained before messaging
service is stopped we need to make sure that no rpc request triggers new
migration manager requests. We do it by making sure that any attempt to
issue such a request after aborted will return abort_requested_exception.
2022-03-31 10:00:29 +03:00
Gleb Natapov
e52abdca30 migration_manager: pass abort source to raft primitives
We want to be able to abort raft operations on migration manager drain.
MM already has an abort source that is signaled on drain, so all that is
left is to pass it to raft calls.
2022-03-31 10:00:29 +03:00
Gleb Natapov
1409b885a0 storage_proxy: relax some read error reporting
Silence request_aborted read error since it is expected to happen suring
shutdown and report remote rpc errors as warnings instead of errors since
if they are indeed server they should be handled by the rpc client, but
OTOH some non critical errors do expect to happen during shutdown.
2022-03-31 10:00:29 +03:00
Botond Dénes
2e634883d9 frozen_mutation: fragment_and_freeze(): convert to v2 2022-03-31 09:57:48 +03:00
Botond Dénes
7f3986ed1a frozen_mutation: coroutinize fragment_and_freeze() 2022-03-31 09:57:48 +03:00
Botond Dénes
fc27b6b7ed readers: migrate away from v1 reversing reader
The only internal user is the v1 make reader from mutations, we use a
downgrade/upgrade to be able to use the v2 reversing reader there. This
is ugly but the v1 reader from mutations is going away soon too, so not
a real problem.
2022-03-31 09:57:48 +03:00