Commit Graph

2087 Commits

Author SHA1 Message Date
Tomasz Grabiec
e115fce8f7 Merge "raft: sometimes become a candidate even if outside the configuration" from Kamil
There are situations where a node outside the current configuration is
the only node that can become a leader. We become candidates in such
cases. But there is an easy check for when we don't need to; a comment was
added explaining that.

* kbr/candidate-outside-config-v3:
  raft: sometimes become a candidate even if outside the configuration
  raft: fsm: update _commit_idx when applying snapshot
2021-08-09 12:29:03 +02:00
Piotr Dulikowski
e18b29765a hints: add hint sync point structure
Adds a sync_point structure. A sync point is a (possibly incomplete)
mapping from hint queues to a replay position in it. Users will be able
to create sync points consisting of the last written positions of some
hint queues, so then they can wait until hint replay in all of the
queues reach that point.

The sync point supports serialization - first it is serialized with the
help of IDL to a binary form, and then converted to a hexadecimal
string. Deserialization is also possible.
2021-08-09 09:24:36 +02:00
Piotr Dulikowski
5a0942a0f8 utils,alternator: move base64 code from alternator to utils
The base64 encoding/decoding functions will be used for serialization of
hint sync point descriptions. Base64 format is not specific to
Alternator, so it can be moved to utils.
2021-08-09 09:24:36 +02:00
Kamil Braun
7533c84e62 raft: sometimes become a candidate even if outside the configuration
There are situations where a node outside the current configuration is
the only node that can become a leader. We become candidates in such
cases. But there is an easy check for when we don't need to; a comment was
added explaining that.
2021-08-06 13:18:32 +02:00
Kamil Braun
93822b0ee7 test: raft: regression test for storing cluster configuration when taking snapshots
Before the fix introduced in the previous patch, the cluster would
forget its configuration when taking a snapshot, making it unable to
reelect a leader. This regression test catches that.
2021-08-06 12:17:22 +02:00
Avi Kivity
52364b5da0 Merge 'cql3: Use expressions to calculate the local-index clustering ranges' from Jan Ciołek
Calculating clustering ranges on a local index has been rewritten to use the new `expression` variant.

This allows us to finally remove the old `bounds_ranges` function.

Closes #9080

* github.com:scylladb/scylla:
  cql3: Remove unused functions like bounds_ranges
  cql3: Use expressions to calculate the local-index clustering ranges
  statement_restrictions_test: tests for extracting column restrictions
  expression: add a function to extract restrictions for a column
2021-08-05 18:32:11 +03:00
Kamil Braun
f050d3682c raft: fsm: stronger check for outdated remote snapshots
We must not apply remote snapshots with commit indexes smaller than our
local commit index; this could result in out-of-order command
application to the local state machine replica, leading to
serializability violations.
Message-Id: <20210805112736.35059-1-kbraun@scylladb.com>
2021-08-05 14:29:50 +02:00
Nadav Har'El
ae51fef57c cql-pytest: add tests for estimated partition count
In issue #9083 a user noted that whereas Cassandra's partition-count
estimation is accurate, Scylla's (rewritten in commit b93cc21) is very
inaccurate. The tests introduce here, which all xfail on Scylla, confirm
this suspicion.

The most important tests are the "simple" tests, involving a workload
which writes N *distinct* partitions and then asks for the estimated
partition count. Cassandra provides accurate estimates, which grow
more accurate with more partitions, so it passes these tests, while
Scylla provides bad estimates and fails them.

Additional tests demonstrate that neither Scylla nor Cassandra
can handle anything beyond the "simple" case of distinct partitions.
Two tests which xfail on both Cassandra and Scylla demonstrate that
if we write the same partitions to multiple sstables - or also delete
partitions - the estimated partition counts will be way off.

Refs #9083

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210726211315.1515856-1-nyh@scylladb.com>
2021-08-05 08:50:19 +02:00
Nadav Har'El
d640998ca8 test/cql-pytest: add test for another ALLOW FILTERING case
In this patch we add another test case for a case where ALLOW FILTERING
should not be required (and Cassandra doesn't require it) but Scylla
does.

This problem was introduced by pull request #9122. The pull request
fixed an incorrect query (see issue #9085) involving both an index and
a multi-column restriction on a compound clustering key - and the fix is
using filtering. However, in one specific case involving a full prefix,
it shouldn't require filtering. This test reproduces this case.

The new test passes on Cassandra (and also theoretically, should pass),
but fails on Scylla - the check_af_optional() call fails because Scylla
made the ALLOW FILTERING mandatory for that case.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210803092046.1677584-1-nyh@scylladb.com>
2021-08-04 15:24:47 +03:00
Nadav Har'El
dba184039a test/alternator: another test for Query's ExclusiveStartKey
We already have tests for Query's ExclusiveStartKey option, but we
only exercised it as a way for paging linearly through all the results.
Now we add a test that confirms that ExclusiveStartKey can be used not
just for paging through all the result - but also for jumping directly to
the middle of a partition after any clustering key (existing or non-
existing clustering key). The new test also for the first time verifies
that ExclusiveStartKey with a specific format works (previous tests just
copied LastEvaluatedKey to ExclusiveStartKey, so any opaque cookie could
have worked).

The test passes on both DynamoDB and Alternator so it did not find a new
bug. But it's useful to have as a regression test, in case in the future
we want to improve paging performance (see #6278) - and need to keep in
mind that ExclusiveStartKey is not just for paging.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210729114703.1609058-1-nyh@scylladb.com>
2021-08-04 15:24:47 +03:00
Kamil Braun
4165045356 test: raft: randomized_nemesis_test: handle timeouts in rpc::send_snapshot
They were already correctly returned to the caller, but we had a
leftover discarded future that would sometimes end up with a
broken_promise exception. Ignore the exception explicitly.
Message-Id: <20210803122207.78406-1-kbraun@scylladb.com>
2021-08-04 15:24:47 +03:00
Raphael S. Carvalho
a869d61c89 tests: Move compaction-related tests into its own unit
With commit 1924e8d2b6, compaction code was moved into a
top level dir as compaction is layered on top of sstables.
Let's continue this work by moving all compaction unit tests
into its own test file. This also makes things much more
organized.

sstable_datafile_test, as its name implies, will only contain
sstable data tests. Perhaps it should be renamed to only
sstable_data_test, as the test also contains tests involving
other components, not only the data one.

BEFORE
$ cat test/boost/sstable_datafile_test.cc | grep TEST_CASE | wc -l
105

AFTER
$ cat test/boost/sstable_compaction_test.cc | grep TEST_CASE | wc -l
57
$ cat test/boost/sstable_datafile_test.cc | grep TEST_CASE | wc -l
48

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20210802192120.148583-1-raphaelsc@scylladb.com>
2021-08-02 22:26:26 +03:00
Jan Ciolek
a7d1dab066 statement_restrictions_test: tests for extracting column restrictions
Add unit tests for the function
extract_single_column_restrictions_for_column()

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-08-02 15:43:42 +02:00
Tomasz Grabiec
3e47f28c65 Merge "raft: use the correct term when storing a snapshot" from Kamil
We should not use the current term; we should use the term of the
snapshot's index, which may be lower.

* https://github.com/kbr-/scylla/tree/snapshot-right-term-fix:
  test: raft: regression test for using the correct term when taking a snapshot
  test: raft: randomized_nemesis_test: server configuration parameter
  raft: use the correct term when storing a snapshot
2021-08-02 15:33:52 +02:00
Dejan Mircevski
debf65e136 cql3: Filter regular-index results on multi-column
When a WHERE clause contains a multi-column restriction and an indexed
regular column, we must filter the results.  It is generally not
possible to craft the index-table query so it fetches only the
matching rows, because that table's clustering key doesn't match up
with the column tuple.

Fixes #9085.

Tests: unit (dev, debug)

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>

Closes #9122
2021-08-02 14:15:43 +03:00
Kamil Braun
ac5121a016 test: raft: regression test for using the correct term when taking a snapshot 2021-08-02 11:48:35 +02:00
Kamil Braun
63fdc718d4 test: raft: randomized_nemesis_test: server configuration parameter 2021-08-02 11:47:19 +02:00
Avi Kivity
ebda2fd4db test: cql_test_env: increase file descriptor limit
It was observed that since fce124bd90 ('Merge "Introduce
flat_mutation_reader_v2" from Tomasz') database_test takes much longer.
This is expected since it now runs the upgrade/downgrade reader tests
on all existing tests. It was also observed that in a similar time frame
database_test sometimes times our on test machines, taking much
longer than usual, even with the extra work for testing reader
upgrade/downgrade.

In an attempt to reproduce, I noticed ti failing on EMFILE (too many
open file descriptors). I saw that tests usually use ~100 open file
descriptors, while the default limit is 1024.

I suspect we have runaway concurrency, but I was not able to pinpoint the
cause. It could be compaction lagging behind, or cleanup work for
deleting tables (the test
test_database_with_data_in_sstables_is_a_mutation_source creates and
deletes many tables).

As a stopgap solution to unblock the tests, this patch raises the file
descriptor limit in the way recommended by [1]. While tests shouldn't
use so many descriptors, I ran out of ideas about how to plug the hole.

Note that main() does something similar, through more elaborate since
it needs to communicate to users. See ec60f44b64 ("main: improve
process file limit handling").

[1] http://0pointer.net/blog/file-descriptor-limits.html

Closes #9121
2021-08-02 11:57:14 +03:00
Pavel Solodovnikov
b1a3b59a08 test: test_materialized_view: test_mv_select_stmt_bound_values: improve error handling
Restrict expected exception message to filter only relevant exception,
matching both for scylla and cassandra.

For example, the former has this message:

    Cannot use query parameters in CREATE MATERIALIZED VIEW statements

While the latter throws this:

    Bind variables are not allowed in CREATE MATERIALIZED VIEW statements

Also, place cleanup code in try-finally clause.

Tests: cql-pytest:test_materialized_view.py(dev)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20210802083912.229886-1-pa.solodovnikov@scylladb.com>
2021-08-02 11:49:50 +03:00
Nadav Har'El
e8fe1817df cql-pytest: translate Cassandra's tests for timestamps
This is a translation of Cassandra's CQL unit test source file
validation/entities/TimestampTest.java into our our cql-pytest framework.

This test file checks has a few tests (8) on various features of
cell timestamps. All these tests pass on Cassandra and on Scylla - i.e.,
these tests no new Scylla bug was detected :-)

Two of the new tests are very slow (6 seconds each) and check a trivial
feature that was already checked elsewhere more efficiently (the fact
that TTL expiration works), so I marked them "skip" after verifying they
really pass.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210801142738.1633126-1-nyh@scylladb.com>
2021-08-02 09:25:49 +02:00
Pavel Solodovnikov
d07f681a95 test: test_non_deterministic_functions: add lwt to test cases names
The tests are related to LWT so add the corresponding prefix
to all the tests cases to emphasize that.

Tests: cql-pytest(dev)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20210801131820.164480-1-pa.solodovnikov@scylladb.com>
2021-08-01 16:23:30 +03:00
Avi Kivity
48860b135a Merge "cql3: fix current*() functions to be non-deterministic" from Pavel S
"
Previously, the following functions were
incorrectly marked as pure, meaning that the
function is executed at "prepare" step:

* `currenttimestamp()`
* `currenttime()`
* `currentdate()`
* `currenttimeuuid()`

For functions that possibly depend on timing and random seed,
this is clearly a bug. Cassandra doesn't have a notion of pure
functions, so they are lazily evaluated.

Make Scylla to match Cassandra behavior for these functions.

Add a unit-test for a fix (excluding `currentdate()` function,
because there is no way to use synthetic clock with query
processor and sleeping for a whole day to demonstrate correct
behavior is clearly not an option).

Also, extend the cql-pytest for #8604 since there are now more
non-deterministic CQL functions, they are all subject to the test
now.

Fixes: #8816
"

* 'timeuuid_function_pure_annotation_v3' of https://github.com/ManManson/scylla:
  test: test_non_deterministic_functions: test more non-pure functions
  cql3: change `current*()` CQL functions to be non-pure
2021-08-01 12:35:36 +03:00
Pavel Solodovnikov
a130921120 test: test_non_deterministic_functions: test more non-pure functions
Check that all existing non-pure functions (except for
`currentdate()`) work correctly with or without prepared
statements.

Tests: cql-pytest/test_non_deterministic_functions.py(dev)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2021-08-01 12:18:26 +03:00
Pavel Solodovnikov
21d758020a cql3: change current*() CQL functions to be non-pure
These include the following:
* `currenttimestamp()`
* `currenttime()`
* `currentdate()`
* `currenttimeuuid()`

Previously, they were incorrectly marked as pure, meaning
that the function is executed at "prepare" step.

For functions that possibly depend on timing and random seed,
this is clearly a bug. Cassandra doesn't have a notion of pure
functions, so they are lazily evaluated.

Make Scylla to match Cassandra behavior for these functions.

Add a unit-test for a fix (excluding `currentdate()` function,
because there is no way to use synthetic clock with query
processor and sleeping for a whole day to demonstrate correct
behavior is clearly not an option).

Tests: unit(dev, debug)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2021-08-01 12:17:23 +03:00
Pavel Solodovnikov
1ca7825cf6 test: add a test checking that bind markers within MVs SELECT statement don't lead to a crash
The request should fail with `InvalidRequest` exception and shouldn't
crash the database.

Don't check for actual error messages, because they are different
between Scylla and Cassandra.

The former has this message:

        Cannot use query parameters in CREATE MATERIALIZED VIEW statements

While the latter throws this:

        Bind variables are not allowed in CREATE MATERIALIZED VIEW statements

Tests: cql-pytest/test_materialized_view.py(scylla dev, cassandra trunk)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2021-07-30 17:57:24 +03:00
Piotr Sarna
1c7af8d46f cql-pytest: adjust a test case for Cassandra 4
One of the test cases stopped working against Cassandra 4, but that's
just because it returns a slightly different error type.
The test case is adjusted to work on both Scylla and new Cassandra.

Message-Id: <222a7f63a3e9739c6fc646173306fcdb3da25890.1627655555.git.sarna@scylladb.com>
2021-07-30 17:36:23 +03:00
Avi Kivity
0876248c2b Merge "cql3: cache function calls evaluation for non-deterministic functions" from Pavel S
"
`function_call` AST nodes are created for each function
with side effects in a CQL query, i.e. non-deterministic
functions (`uuid()`, `now()` and some others timeuuid-related).

These nodes are evaluated either when a query itself is executed
or query restrictions are computed (e.g. partition/clustering
key ranges for LWT requests).

We need to cache the calls since otherwise when handling a
`bounce_to_shard` request for an LWT query, we can possibly
enter an infinite bouncing loop (in case a function is used
to calculate partition key ranges for a query), since the
results can be different each time.

Furthermore, we don't support bouncing more than one time.
Returning `bounce_to_shard` message more than one time
will result in a crash.

Caching works only for LWT statements and only for the function
calls that affect partition key range computation for the query.

`variable_specifications` class is renamed to `prepare_context`
and generalized to record information about each `function_call`
AST node and modify them, as needed:
* Check whether a given function call is a part of partition key
  statement restriction.
* Assign ids for caching if above is true and the call is a part
  of an LWT statement.

There is no need to include any kind of statement identifier
in the cache key since `query_options` (which holds the cache)
is limited to a single statement, anyway.

Function calls are indexed by the order in which they appear
within a statement while parsing. There is no need to
include any kind of statement identifier to the cache key
since `query_options` (which holds the cache) is limited
to a single statement, anyway.

Note that `function_call::raw` AST nodes are not created
for selection clauses of a SELECT statement hence they
can only accept only one of the following things as parameters:
* Other function calls.
* Literal values.
* Parameter markers.

In other words, only parameters that can be immediately reduced
to a byte buffer are allowed and we don't need to handle
database inputs to non-pure functions separately since they
are not possible in this context. Anyhow, we don't even have
a single non-pure function that accepts arguments, so precautions
are not needed at the moment.

Add a test written in `cql-pytest` framework to verify
that both prepared and unprepared lwt statements handle
`bounce_to_shard` messages correctly in such scenario.

Fixes: #8604

Tests: unit(dev, debug)

NOTE: the patchset uses `query_options` as a container for
cached values. This doesn't look clean and `service::query_state`
seems to be a better place to store them. But it's not
forwarded to most of the CQL code and would mean that a huge number
of places would have to be amended.
The series presents a trade-off to avoid forwarding `query_state`
everywhere (but maybe it's the thing that needs to be done, nonetheless).
"

* 'lwt_bounce_to_shard_cached_fn_v6' of https://github.com/ManManson/scylla:
  cql-pytest: add a test for non-pure CQL functions
  cql3: cache function calls evaluation for non-deterministic functions
  cql3: rename `variable_specifications` to `prepare_context`
2021-07-30 14:21:11 +03:00
Pavel Solodovnikov
eaf70df203 cql-pytest: add a test for non-pure CQL functions
Introduce a test using `cql-pytest` framework to assert that
both prepared an unprepared LWT statements (insert with
`IF NOT EXISTS`) with a non-deterministic function call
work correctly in case its evaluation affects partition
key range computation (hence the choice of `cas_shard()`
for lwt query).

Tests: cql-pytest/test_non_deterministic_functions.py

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2021-07-30 01:22:50 +03:00
Tomasz Grabiec
7c28f77412 Merge 'Convert all remaining int tri-compares to std::strong_ordering' from Avi Kivity
Convert all known tri-compares that return an int to return std::strong_ordering.
Returning an int is dangerous since the caller can treat it as a bool, and indeed
this series uncovered a minor bug (#9103).

Test: unit (dev)

Fixes #1449

Closes #9106

* github.com:scylladb/scylla:
  treewide: remove redundant "x <=> 0" compares
  test: mutation_test: convert internal tri-compare to std::strong_ordering
  utils: int_range: change to std::strong_ordering
  test: change some internal comparators to std::strong_ordering
  utils: big_decimal: change to std::strong_ordering
  utils: fragment_range: change to std::strong_ordering
  atomic_cell: change compare_atomic_cell_for_merge() to std::strong_ordering
  types: drop scaffolding erected around lexicographical_tri_compare
  sstables: keys: change to std::strong_ordering internally
  bytes: compare_unsigned(): change to std::strong_ordering
  uuid: change comparators to std::strong_ordering
  types: convert abstract_type::compare and related to std::strong_ordering
  types: reduce boilerplate when comparing empty value
  serialized_tri_compare: change to std::strong_ordering
  compound_compat: change to std::strong-ordering
  types: change lexicographical_tri_compare, prefix_equality_tri_compare to std::strong_ordering
2021-07-29 21:43:54 +02:00
Avi Kivity
e44d3cc0ea Merge "Remove global storage service instance" from Pavel E
"
There are few places that call global storage service, but all
are easily fixable without significant changes.

1. alternator -- needs token metadata, switch to using proxy
2. api -- calls methods from storage service, all handlers are
   registered in main and can capture storage service from there
3. thrift -- calls methods from storage service, can carry the
   reference via controller
4. view -- needs tokens, switch to using (global) proxy
5. storage_service -- (surprisingly) can use "this"

tests: unit(dev), dtest(simple_boot_shutdown, dev)
"

* 'br-unglobal-storage-service' of https://github.com/xemul/scylla:
  storage_service: Make it local
  storage_service: Remove (de)?init_storage_service()
  storage_service: Use container() in run_with(out)_api_lock
  storage_service: Unmark update_topology static
  storage_service: Capture this when appropriate
  view: Use proxy to get token metadata from
  thrift: Use local storage service in handlers
  thrift: Carry sharded<storage_service>& down to handler
  api: Capture and use sharded<storage_service>& in handlers
  api: Carry sharded<storage_service>& down to some handlers
  alternator: Take token metadata from server's storage_proxy
  alternator: Keep storage_proxy on server
2021-07-29 11:47:16 +03:00
Avi Kivity
8d2255d82c Merge "Parallelize multishard_combining_reader_as_mutation_source test" from Pavel E
"
This is the 3rd slowest test in the set. There are 3 cases out
there that are hard-coded to be sequential. However, splitting
them into boost test cases helps running this test faster in
--parallel-cases mode. Timings for debug mode:

         Total before the patch: 25 min
     Sequential after the patch: 25 min
                     Basic case:  5 min
      Evict-paused-readers case:  5 min
    Single-mutation-buffer case: 15 min

tests: unit.multishard_combining_reader_as_mutation_source(debug)
"

* 'br-parallel-mcr-test' of https://github.com/xemul/scylla:
  test: Split test_multishard_combining_reader_as_mutation_source into 3
  test: Fix indentation after previous patch
  test: Move out internals of test_multishard_combining_reader_as_mutation_source
2021-07-29 11:39:02 +03:00
Pavel Emelyanov
f9132b582b storage_service: Make it local
There are 3 places that can now declare local instance:

- main
- cql_test_env
- boost gossiper test

The global pointer is saved in debug namespace for debugging.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-29 05:12:36 +03:00
Gleb Natapov
4764028cb3 raft: Remove leader_id from append_request
The filed is not used anywhere.

Message-Id: <YP0khmjK2JSp77AG@scylladb.com>
2021-07-28 20:30:07 +02:00
Avi Kivity
42e1f318d7 Merge "Respect "bypass cache" in sstable index caching" from Tomasz
"
This series changes the behavior of the system when executing reads
annotated with "bypass cache" clause in CQL. Such reads will not
use nor populate the sstable partition index cache and sstable index page cache.
"

* 'bypass-cache-in-sstable-index-reads' of github.com:tgrabiec/scylla:
  sstables: Do not populate page cache when searching in promoted index for "bypass cache" reads
  sstables: Do not populate partition index cache for "bypass cache" reads
2021-07-28 18:45:39 +03:00
Avi Kivity
331eb57e17 Revert "compression: define 'class' attribute for compression and deprecate 'sstable_compression'"
This reverts commit 5571ef0d6d. It causes
rolling upgrade failures.

Fixes #9055.

Reopens #8948.
2021-07-28 14:14:22 +03:00
Avi Kivity
0909e3c17d treewide: remove redundant "x <=> 0" compares
If x is of type std::strong_ordering, then "x <=> 0" is equivalent to
x. These no-ops were inserted during #1449 fixes, but are now unnecessary.
They have potential for harm, since they can hide an accidental of the
type of x to an arithmetic type, so remove them.

Ref #1449.
2021-07-28 13:30:32 +03:00
Avi Kivity
70f481a1f0 test: mutation_test: convert internal tri-compare to std::strong_ordering
Drop the temporary merge_container() overload we had to support
tri-compares that returned int.
2021-07-28 13:30:07 +03:00
Avi Kivity
11fa402ecc test: change some internal comparators to std::strong_ordering
Ref #1449.
2021-07-28 13:28:51 +03:00
Avi Kivity
7729ff03ad uuid: change comparators to std::strong_ordering
Ref #1449.
2021-07-28 13:20:32 +03:00
Avi Kivity
e52ebe2da5 types: convert abstract_type::compare and related to std::strong_ordering
Change comparators around types to std::strong_ordering.

Ref #1449.
2021-07-28 13:19:24 +03:00
Avi Kivity
d86e529239 serialized_tri_compare: change to std::strong_ordering
Also convert a users in mutation_test.

Ref #1449.
2021-07-28 13:19:00 +03:00
Benny Halevy
67d5addc09 test: mutation_reader_test: clustering_order_merger_test_generator: use explicit type for num_ranges
gcc 10.3.1 spews the following error:
```
_test_generator::generate_scenario(std::mt19937&) const’:
test/boost/mutation_reader_test.cc:3731:28: error: comparison of integer expressions of different signedness: ‘int’ and ‘long unsigned int’ [-Werror=sign-compare]
 3731 |         for (auto i = 0; i < num_ranges; ++i) {
      |                          ~~^~~~~~~~~~~~
```

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210728073538.2467040-1-bhalevy@scylladb.com>
2021-07-28 11:22:59 +03:00
Avi Kivity
77a2b4b520 test: perf: perf_simple_query: add instructions_per_op to the json-result output
It's in text output, but 863b49af03 forgot to add it to the machine
readable results.

Closes #9017
2021-07-27 20:26:19 +02:00
Pavel Emelyanov
b3c89787be mutation_partition: Return immutable collection for range tombstones
Patch the .row_tombstones() to return the range_tombstone_list
wrapped into the immutable_collection<> so that callers are
guaranteed not to touch the collection itself, but still can
modify the tombstones.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-27 20:06:53 +03:00
Pavel Emelyanov
1bf643d4fd mutation_partition: Pin mutable access to range tombstones
Some callers of mutation_partition::row_tomstones() don't want
(and shouldn't) modify the list itself, while they may want to
modify the tombstones. This patch explicitly locates those that
need to modify the collection, because the next patch will
return immutable collection for the others.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-27 20:06:53 +03:00
Pavel Emelyanov
05b8cdfd24 mutation_partition: Return immutable collection for rows
Patch the .clustered_rows() method to return the btree of rows
wrapped into the immutable_collection<> so that callers are
guaranteed not to touch the collection itself, but still can
modify the elements in it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-27 20:06:53 +03:00
Pavel Emelyanov
e652b03b4e btree tests: Dont use iterator erase
Next patches will mark btree::iterator methods that modify
the tree itself as private, so stop using them in tests.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-27 20:06:53 +03:00
Avi Kivity
f86e65b4e7 Merge "Fix quadratic behavior in memtable/row_cache with lots of range tombstones" from Tomasz
"
This series fixes two issues which cause very poor efficiency of reads
when there is a lot of range tombstones per live row in a partition.

The first issue is in the row_cache reader. Before the patch, all range
tombstones up to the next row were copied into a vector, and then put
into the buffer until it's full. This would get quadratic if there is
much more range tombstones than fit in a buffer.

The fix is to avoid the accumulation of all tombstones in the vector
and invoke the callback instead, which stops the iteration as soon as
the buffer is full.

Fixes #2581.

The second, similar issue was in the memtable reader.

Tests:

  - unit (dev)
  - perf_row_cache_update (release)
"

* tag 'no-quadratic-rt-in-reads-v1' of github.com:tgrabiec/scylla:
  test: perf_row_cache_update: Uncomment test case for lots of range tombstones
  row_cache: Consume range tombstones incrementally
  partition_snapshot_reader: Avoid quadratic behavior with lots of range tombstones
  tests: mvcc: Relax monotonicity check
  range_tombstone_stream: Introduce peek_next()
2021-07-27 14:39:13 +03:00
Avi Kivity
2cca461652 Merge 'sstables: merge row consumer interfaces with implementations' from Wojciech Mitros
This patch follows #9002, further reducing the complexity of the sstable readers.
The split between row consumer interfaces and implementations has been first added in 2015, and there is no reason to create new implementations anymore. By merging those classes, we achieve a sizeable reduction in sstable reader length and complexity.
Refs #7952
Tests: unit(dev)

Closes #9073

* github.com:scylladb/scylla:
  sstables: merge row_consumer into mp_row_consumer_k_l
  sstables: move kl row_consumer
  sstables: merge consumer_m into mp_row_consumer_m
  sstables: move mp_row_consumer_m
2021-07-27 12:23:29 +03:00
Pavel Emelyanov
ca2dfac7d7 test: Split test_multishard_combining_reader_as_mutation_source into 3
There are 3 independent cases in this test that benefit
from running in parallel.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-27 09:29:20 +03:00