Commit Graph

28451 Commits

Author SHA1 Message Date
Nadav Har'El
5cbe9178fd alternator: add missing BatchGetItem metric
Unfortunately, defining metrics in Scylla requires some code
duplication, with the metrics declared in one place but exported in a
different place in the code. When we duplicated this code in Alternator,
we accidentally dropped the first metric - for BatchGetItem. The metric
was accounted in the code, but not exported to Prometheus.

In addition to fixing the missing metric, this patch also adds a test
that confirms that the BatchGetItem metric increases when the
BatchGetItem operation is used. This test failed before this patch, and
passes with it. The test only currently tests this for BatchGetItem
(and BatchWriteItem) but it can be later expanded to cover all the other
operations as well.

Fixes #9406

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210929121611.373074-1-nyh@scylladb.com>
2021-09-29 14:16:54 +02:00
Tomasz Grabiec
11a3b411c5 Merge 'mutation_source_test: test reverse reads' from Botond Dénes
Currently no mutation-source supports reading in reverse natively but
we are working on changing that, adding native reverse read support to
memtable, cache and sstable readers. To ensure that all mutation
sources work in a correct and uniform manner when reading in reverse,
we add a reverse test to the mutation source test suite. This test
reverses the data that it passes to `populate()`, then reads in
forward order (in reverse compared to the data order). For this we use
the currently established reverse read API: reverse schema (schema
order == query order) and half-reversed (legacy) slice.  All mutation
sources are prepared to work with reversed reads, using the
`make_reversing_reader()` adapter. As we progress with our native
reverse support, we will replace these adapters with native reversing
support. As part of this, we push down the reversing reader adapter
currently existing on the `query::consume_page()` level, to the
individual mutation sources.

Closes #9384

* github.com:scylladb/scylla:
  test: mutation_reader_test: reversed version of test_clustering_order_merger_sstable_set
  querier: consume_page(): remove now unused max_size parameter
  test/lib: mutation_source_test: test reading in reverse
  test: mutation_reader_test: clustering_combined_reader_mutation_source_test: prepare for reading in reverse
  test: flat_mutation_reader_test: test_reverse_reader_is_mutation_source: prepare for reading in reverse
  test: mutation_reader_test: test_manual_paused_evictable_reader_is_mutation_source: use query schema instead of table schema
  treewide: move reversing to the mutation sources
  mutation_query: reconcilable_result_builder: document reverse query preconditions
  sstable_set: time_series_sstable_set: reverse mode
  mutlishard_mutation_query: set max result size on used permits
  db/virtual_table: streaming_virtual_table::as_mutation_source(): use query schema instead of table schema
  flat_mutation_reader: make_reversing_reader(): add convenience stored slice
  mutation_reader: evictable_reader: add reverse read support
  flat_mutation_reader: make_flat_mutation_reader_from_fragments(): add reverse read support
  flat_mutation_reader: flat_mutation_reader_from_mutations(): add reverse read support
  flat_mutation_reader: flat_mutation_reader_from_mutations(): document preconditions
  query-request: introduce `half_reverse_slice`
  flat_mutation_reader_assertions: log what's expected
2021-09-29 12:57:57 +02:00
Avi Kivity
d4aa6c2746 Merge "compaction: Update backlog tracker correctly when schema is updated" from Raphael
"
Backlog tracker isn't updated correctly when facing a schema change, and
may leak a SSTable if compaction strategy is changed, which causes
backlog to be computed incorrectly. Most of these problems happen because
sstable set and tracker are updated independently, so it could happen
that tracker lose track (pun intended) of changes applied to set.

The first patch will fix the leak when strategy is changed, and the third
patch will make sure that tracker is updated atomically with sstable set,
so these kind of problems will not happen anymore.

Fixes #9157
"

* 'fixes_to_backlog_tracker_v4' of github.com:raphaelsc/scylla:
  compaction: Update backlog tracker correctly when schema is updated
  compaction: Don't leak backlog of input sstable when compaction strategy is changed
  compaction: introduce compaction_read_monitor_generator::remove_exhausted_sstables()
  compaction: simplify removal of monitors
2021-09-29 13:55:37 +03:00
Kamil Braun
075a894a89 test: mutation_reader_test: reversed version of test_clustering_order_merger_sstable_set 2021-09-29 12:15:48 +03:00
Botond Dénes
42b677ef6f querier: consume_page(): remove now unused max_size parameter 2021-09-29 12:15:48 +03:00
Botond Dénes
bc49c27a06 test/lib: mutation_source_test: test reading in reverse
To ensure all mutation sources uniformly support the current API of
reverse reading: reversed schema and half-reversed slice. This test will
also ensure that once we switch to native-reverse slice, all
mutation-sources will keep on working.
2021-09-29 12:15:48 +03:00
Kamil Braun
7d5273b044 test: mutation_reader_test: clustering_combined_reader_mutation_source_test: prepare for reading in reverse
For reversed reads we must adjust the lower/upper bounds used by the
`position_reader_queue` and `clustering_combined_reader`. The bounds are
calculated using the mutation schema, but we need bounds calculated
using the query schema which is reversed.
2021-09-29 12:15:48 +03:00
Botond Dénes
9399f379ec test: flat_mutation_reader_test: test_reverse_reader_is_mutation_source: prepare for reading in reverse
The mutation source test suite will soon test reads in reverse. Prepare
for this by checking the reversed flag on the slice and not reversing
the data when set. The test will have two modes effectively:
* Forward mode: data is reversed before read, the reversed again during
  read.
* Reverse mode: data is already reversed and it is reversed back during
  read.
2021-09-29 12:15:48 +03:00
Botond Dénes
c048d854d9 test: mutation_reader_test: test_manual_paused_evictable_reader_is_mutation_source: use query schema instead of table schema
The two might not be the same in case the schema was upgraded or if we
are reading in reverse. It is important to use the passed-in query
 schema consistently during a read.
2021-09-29 12:15:48 +03:00
Botond Dénes
41facb3270 treewide: move reversing to the mutation sources
Push down reversing to the mutation-sources proper, instead of doing it
on the querier level. This will allow us to test reverse reads on the
mutation source level.
The `max_size` parameter of `consume_page()` is now unused but is not
removed in this patch, it will be removed in a follow-up to reduce
churn.
2021-09-29 12:15:45 +03:00
Nadav Har'El
88177d7be7 test/alternator: add test for too many items in BatchWriteItem
DynamoDB limits the number of items that a BatchWriteItem call can write
to 25. As noted in issue #5057, in Alternator we don't have this limit
or any limit on the number of items in a BatchWriteItem - which probably
isn't wise.

This patch adds a simple xfailing test for this.

Refs #5057

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210912140736.76995-1-nyh@scylladb.com>
2021-09-29 10:48:58 +02:00
Nadav Har'El
a1bab2c4c9 Merge 'cql3: improve expression ergonomics' from Avi Kivity
The `expression` type (an std::variant) suffers from bad ergonomics:
 - std::variant has poor/no constraints, so compiler error messages are long and uninformative
 - it cannot be forward-declared (since std::variant does not support incomplete types)
 - the type name is long, polluting compiler error messages and debug symbols
 - it requires an artificial `nested_expression` when one expression is nested inside another

This series fixes those drawbacks by wrapping the variant in a class, adding constraints, and adding an extra indirection.

Test: unit (dev)

Closes #9402

* github.com:scylladb/scylla:
  cql3: expr: drop nested_expression
  cql3: expr: make expression forward declarable, easier to use
  cql3: expr: construct column_value explicitly
  cql3: expr: introduce as/as_if/is
  cql3: expr: introduce expr::visit, replacing std::visit
2021-09-29 10:47:39 +03:00
Takuya ASADA
cd7fe9a998 scylla_cpuscaling_setup: disable ondemand.service on Ubuntu
On Ubuntu, scaling_governor becomes powersave after rebooted, even we configured cpufrequtils.
This is because ondemand.service, it unconditionally change scaling_governor to ondemand or powersave.
cpufrequtils will start before ondemand.service, scaling_governor overwrite by ondemand.service.
To configure scaling_governor correctly, we have to disable this service.

Fixes #9324

Closes #9325
2021-09-29 10:32:34 +03:00
Avi Kivity
c72906a2ee cql3: expr: drop nested_expression
Now that expression can be nested in its component types
directly, we can remove nested_expression. Most of the patch
adjusts uses to drop the dereference that was needed for
nested_expression.
2021-09-28 23:49:21 +03:00
Avi Kivity
448c06f150 cql3: expr: make expression forward declarable, easier to use
Make expression a class, holding a unique_ptr to a variant,
instead of just a variant.

This has some advantages:
 - the constructor can be properly constrained
 - the type can be forward-declared
 - the type name is just "expression", rather than
   a huge variant. This makes compiler error messages easier
   to read.
 - the internal indirection allows removal of nested_expression
   (later in the series)
2021-09-28 23:49:21 +03:00
Avi Kivity
d43e72a747 cql3: expr: construct column_value explicitly
We have a few cases where a column_definition* is converted
directly to an expression without an explicit call to column_value{}.

The new expression implementation will not allow this, so make
these cases explicit. IMO this is better form than to rely
on the compiler picking the right expression subtype.
2021-09-28 23:49:21 +03:00
Avi Kivity
be44b579a1 cql3: expr: introduce as/as_if/is
Simple wrappers for std::get, std::get_if, std::holds_alternative.

The new names are shorter and IMO more readable.

Call sites are updated.

We will later replace the implementation.
2021-09-28 23:49:11 +03:00
Avi Kivity
e7db3def4f cql3: expr: introduce expr::visit, replacing std::visit
The new expr::visit() is just a wrapper around std::visit(),
but has better constraints. A call to expr::visit() with a
visitor that misses an overload will produce an error message
that points at the missing type. This is done using the new
invocable_on_expression concept. Note it lists the expression
types one by one rather than using template magic, since
otherwise we won't get the nice messages.

Later, we will change the implementation when expression becomes
our own type rather than std::variant.

Call sites are updated.
2021-09-28 23:48:42 +03:00
Botond Dénes
c7619de929 mutation_query: reconcilable_result_builder: document reverse query preconditions 2021-09-28 17:03:57 +03:00
Kamil Braun
7dc4ee35c9 sstable_set: time_series_sstable_set: reverse mode
`time_series_sstable_set` uses `clustering_combined_reader` to implement
efficient single-partition reads. It provides a `position_reader_queue`
to the reader. This queue returns readers to the sstables from the set
in order of the sstables' lower bounds, and with each reader it provides
an upper bound for the positions-in-partition returned by the reader.

Until now we would assume non-reversed queries only. Reversed queries
were implemented by performing forward query in the lower layers
and reversing the results at the upper-most layer of the reader stack.
Before pushing the reversing down to the sources (in particular,
to sstable readers), we need to support the reverse mode in
`time_series_sstable_set` and the queue it provides to
`clustering_combined_reader`.

This requires using different lower and upper bounds in the queue.
For non-reversed reads we used `sstable::min_position()` as the lower
bound and `sstable::max_position()` as the upper bound. For reversed
reads all comparisons performed by `clustering_combined_reader` will be
reversed, as it will use a reversed schema. We can then use
`sstable::max_position().reversed()` for the lower bound and
`sstable::min_position().reversed()` for the upper bound.
2021-09-28 17:03:57 +03:00
Botond Dénes
22e216563a mutlishard_mutation_query: set max result size on used permits
08042c1688 added the query max result size
to the permit but only set it for single partition queries. This patch
does the same for range-scans in preparation of `query::consume_page()`
not propagating max size soon.
2021-09-28 17:03:57 +03:00
Botond Dénes
dec282e050 db/virtual_table: streaming_virtual_table::as_mutation_source(): use query schema instead of table schema
The two might not be the same in case the schema was upgraded (unlikely
for virtual tables) or if we are reading in reverse. It is important to
use the passed-in query schema consistently during a read.
2021-09-28 17:03:57 +03:00
Botond Dénes
f5ef88c0c5 flat_mutation_reader: make_reversing_reader(): add convenience stored slice
This serves as a convenience slice storage for reads that have to
store an edited slice somewhere. This is common for reads that work
with a native-reversed slice and so have to convert the one used in the
query -- which is in half-reversed format.
2021-09-28 17:03:57 +03:00
Botond Dénes
2bd295ee80 mutation_reader: evictable_reader: add reverse read support
Evictable reader has to be made aware of reverse reads as it checks/edits the
slice. This shouldn't require reverse awareness normally, it is only
required because we still use the half-reversed (legacy) slice format
for reversed reads. Once we switch to the native format this commit can
be reverted.
2021-09-28 17:03:57 +03:00
Botond Dénes
eeebe4ab63 flat_mutation_reader: make_flat_mutation_reader_from_fragments(): add reverse read support
Implemented with the `make_reversing_reader()` adaptor.
2021-09-28 17:03:57 +03:00
Botond Dénes
cc222e5332 flat_mutation_reader: flat_mutation_reader_from_mutations(): add reverse read support
Implemented with the `make_reversing_reader()` adaptor.
2021-09-28 17:03:57 +03:00
Botond Dénes
1a2bdba25f flat_mutation_reader: flat_mutation_reader_from_mutations(): document preconditions 2021-09-28 17:03:57 +03:00
Kamil Braun
4bd601c6fd query-request: introduce half_reverse_slice
A utility function for converting between forward and half-reversed (or
'legacy'-reversed) slices to be used in the next commit.
2021-09-28 17:03:57 +03:00
Kamil Braun
270093b251 flat_mutation_reader_assertions: log what's expected 2021-09-28 17:03:57 +03:00
Tomasz Grabiec
c4328ffc4d tests: mutation_test: Add test for position_in_partition::reversed()
Message-Id: <20210927154942.44236-1-tgrabiec@scylladb.com>
2021-09-28 13:09:39 +02:00
Tomasz Grabiec
6bf873b663 Merge "raft: misc documentation edits" from Kostja
* scylla-dev/raft-misc-v4-docedit:
  raft: document pre-voting and protection against disruptive leaders
  raft: style edits of README.md.
  raft: document snapshot API
2021-09-28 12:12:46 +02:00
Konstantin Osipov
0adff23c21 raft: document pre-voting and protection against disruptive leaders 2021-09-27 22:04:18 +03:00
Konstantin Osipov
0e63e99b5a raft: style edits of README.md. 2021-09-27 22:04:04 +03:00
Konstantin Osipov
de2beac6ca raft: document snapshot API 2021-09-27 22:03:38 +03:00
Raphael S. Carvalho
9718173598 compaction: Update backlog tracker correctly when schema is updated
Currently the following can happen:
1) there's ongoing compaction with input sstable A, so sstable set
and backlog tracker both contains A.
2) ongoing compaction replaces input sstable A by B, so sstable set
contains only B now.
3) schema is updated, so a new backlog tracker is built without A
because sstable set now contains only B.
4) ongoing compaction tries to remove A from tracker, but it was
excluded in step 3.
5) tracker can now have a negative value if table is decreasing in
size, which leads to log(<negative number>) == -NaN

This problem happens because backlog tracker updates are decoupled
from sstable set updates. Given that the essential content of
backlog tracker should be the same as one of sstable set, let's move
tracker management to table.
Whenever sstable set is updated, backlog tracker will be updated with
the same changes, making their management less error prone.

Fixes #9157

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2021-09-27 14:15:29 -03:00
Raphael S. Carvalho
afd45b9f49 compaction: Don't leak backlog of input sstable when compaction strategy is changed
The generic backlog formula is: ALL + PARTIAL - COMPACTING

With transfer_ongoing_charges() we already ignore the effect of
ongoing compactions on COMPACTING as we judge them to be pointless.

But ongoing compactions will run to completion, meaning that output
sstables will be added to ALL anyway, in the formula above.

With stop_tracking_ongoing_compactions(), input sstables are never
removed from the tracker, but output sstables are added, which means
we end up with duplicate backlog in the tracker.

By removing this tracking mechanism, pointless ongoing compaction
will be ignored as expected and the leaks will be fixed.

Later, the intention is to force a stop on ongoing compactions if
strategy has changed as they're pointless anyway.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2021-09-27 14:03:28 -03:00
Raphael S. Carvalho
05126cfe29 compaction: introduce compaction_read_monitor_generator::remove_exhausted_sstables()
This new function makes it easier to remove monitor of exhausted
sstables.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2021-09-27 14:01:40 -03:00
Raphael S. Carvalho
35050a8217 compaction: simplify removal of monitors
by switching to unordered_map, removal of generated monitors is
made easier. this is a preparatory change for patch which will
remove monitor for all exhausted sstables

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2021-09-27 13:59:30 -03:00
Tomasz Grabiec
2b3ae6aca4 position_in_partition: Introduce reversed() transformation
It transforms the position from a forward-clustering-order schema
domain to a reversed-clustering-order schema domain.

The object still refers to the same element of the space of keys under this
transformation. However, the identification of the position,
the position_in_partition object, is schema-dependent, it is always
interpreted relative to some schema. Hence the need to transform it
when switching schema domains.

Message-Id: <20210917102612.308149-1-tgrabiec@scylladb.com>
2021-09-27 14:23:09 +03:00
Gleb Natapov
78774a485a raft: drop local snapshot if it cannot be installed
If a locally taken snapshot cannot be installed because newer one was
received meanwhile it should be dropped, otherwise it will take space
needlessly.

Message-Id: <YUrWXxVfBjEio1Ol@scylladb.com>
2021-09-27 13:03:23 +02:00
Asias He
1657e7be14 gossiper: Send generation number with shutdown message
Consider:
- n1, n2 in the cluster
- n2 shutdown
- n2 sends gossip shutdown message to n1
- n1 delays processing of the handler of shutdown message
- n2 restarts
- n1 learns new gossip state of n2
- n1 resumes to handle the shutdown message
- n1 will mark n2 as shutdown status incorrectly until n2 restarts again

To prevent this, we can send the gossip generation number along with the
shutdown message. If the generation number does not match the local
generation number for the remote node, the shutdown message will be
ignored.

Since we use the rpc::optional to send the generation number, it works
with mixed cluster.

Fixes #8597

Closes #9381
2021-09-27 11:08:43 +03:00
Avi Kivity
d7ac699a55 Revert "Merge "compaction: Update backlog tracker correctly when schema is updated" from Raphael"
This reverts commit b5cf0b4489, reversing
changes made to e8493e20cb. It causes
segmentation faults when sstable readers are closed.

Fixes #9388.
2021-09-26 18:31:49 +03:00
Avi Kivity
bf94c06fc7 Revert "Merge "simplifications and layer violation fix for compaction manager" from Raphael"
This reverts commit 7127c92acc, reversing
changes made to 88480ac504. We need to
revert b5cf0b4489 to fix #9388, and this stands
in the way.

Ref #9388.
2021-09-26 18:30:36 +03:00
Piotr Sarna
06f724857f transport: remove unused map of stream_id->query states
The map is never touched, so it only occupies precious space
for each connection.

Closes #9383
2021-09-26 13:41:58 +03:00
Avi Kivity
936de92876 Merge 'cql3: Add evaluate(expression) and use instead of term::bind()' from Jan Ciołek
This PR adds the function:
```c++
constant evaluate(const expression&, const query_options&);
```
which evaluates the given expression to a constant value.
It binds all the bound values, calls functions, and reduces the whole expression to just raw bytes and `data_type`, just like `bind()` and `get()` did for `term`.

The code is often similar to the original `bind()` implementation in `lists.cc`, `sets.cc`, etc.

* For some reason in the original code, when a collection contains `unset_value`, then the whole collection is evaluated to `unset_value`. I'm not sure why this is the case, considering it's impossible to have `unset_value` inside a collection, because we forbid bind markers inside collections. For example here: cc8fc73761/cql3/lists.cc (L134)
This seems to have been introduced by Pekka Enberg in 50ec81ee67, but he has left the company.
I didn't change the behaviour, maybe there is a reason behind it, although maybe it would be better to just throw `invalid_request_exception`.
* There was a strange limitation on map key size, it seems incorrect: cc8fc73761/cql3/maps.cc (L150), but I left it in.
* When evaluating a `user_type` value, the old code tolerated `unset_value` in a field, but it was later converted to NULL. This means that `unset_value` doesn't work inside a `user_type`, I didn't change it, will do in another PR.
* We can't fully get rid of `bind()` yet, because it's used in `prepare_term` to return a `terminal`. It will be removed in the next PR, where we finally get rid of `term`.

Closes #9353

* github.com:scylladb/scylla:
  cql3: types: Optimize abstract_type::contains_collection
  cql3: expr: Convert evaluate_IN_list to use evaluate(expression)
  cql3: expr: Use only evaluate(expression) to evaluate term
  cql3: expr: Implement evaluate(expr::function_call)
  cql3: expr: Implement evaluate(expr::usertype_constructor)
  cql3: expr: Implement evaluate(expr::collection_constructor)
  cql3: expr: Implement evaluate(expr::tuple_constructor)
  cql3: expr: Implement evaluate(expr::bind_variable)
  cql3: Add contains_collection/set_or_map to abstract_type
  cql3: expr: Add evaluate(expression, query_options)
  cql3: Implement term::to_expression for function_call
  cql3: Implement term::to_expression for user_type
  cql3: Implement term::to_expression for collections
  cql3: Implement term::to_expression for tuples
  cql3: Implement term::to_expression for marker classes
  cql3: expr: Add data_type to *_constructor structs
  cql3: Add term::to_expression method
  cql3: Reorganize term and expression includes
2021-09-26 12:58:11 +03:00
Eliran Sinvani
0b2861d014 Prepare for inheriting from reader_concurrency_semaphore
Some future and enterprise features requires us to inherit from
reader_concurrency_semaphore, this might require additional
"wrap up" operations to be done on stop which serves as a barrier
for the semaphore. Here we simply make stop virtual so it is
inherited and can be augmented.
This change have no significant impact on  performance since stop
can get called once in a lifetime of a semaphore.
The approach is to add two extenction points to the
reader_concurrency_semaphore class, one just before the stop code is
executed and one just after.

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>

Closes #9373
2021-09-26 12:57:48 +03:00
Avi Kivity
2d352820f4 Update tools/java and tools/jmx submodules
* tools/java 9c5c0ad1fd...05ec511bbb (2):
  > reloc/build_reloc.sh: Add missing space
  > reloc: stop removing entire BUILDDIR

* tools/jmx 658818b...5c383b6 (1):
  > reloc: stop removing entire $BUILDDIR
2021-09-26 12:33:55 +03:00
Pavel Emelyanov
88e5b7c547 database: Shutdown in tests
There's a circular dependency:

  query processor needs database
  database owns large_data_handler and compaction_manager
  those two need qctx
  qctx owns a query_processor

Respectively, the latter hidden dependency is not "tracked" by
constructor arguments -- the query processor is started after
the database and is deferred to be stopped before it. This works
in scylla, because query processor doesn't really stop there,
but in cql_test_env it's problematic as it stops everything,
including the qctx.

Recent database start-stop sanitation revealed this problem --
on database stop either l.d.h. or compaction manager try to
start (or continue) messing with the query processor. One problem
was faced immediatelly and pluged with the 75e1d7ea safety check
inside l.d.h., but still cql_test_env tests continue suffering
from use after free on stopped query processor.

The fix is to partially revert the 4b7846da by making the tests
stop some pieces of the database (inclusing l.d.h. and compaction
manager) as it used to before. In scylla this is, probably, not
needed, at least now -- the database shutdown code was and still
is run right before the stopping one.

tests: unit(debug)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20210924080248.11764-1-xemul@scylladb.com>
2021-09-26 11:09:01 +03:00
Benny Halevy
7498ac4869 dht: boot_strapper: bootstrap: fixup indentation
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210923144206.1690576-2-bhalevy@scylladb.com>
2021-09-26 11:09:01 +03:00
Benny Halevy
798aee6747 dht: boot_strapper: coroutinize bootstrap
Prepare for futurizing get_pending_address_ranges.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210923144206.1690576-1-bhalevy@scylladb.com>
2021-09-26 11:09:01 +03:00