27891 Commits

Author SHA1 Message Date
Avi Kivity
bd6460f00a sstables: convert parse(summary&) to a coroutine 2021-08-18 18:21:33 +03:00
Raphael S. Carvalho
3a1cf3aa88 database: document database::get_keyspace_local_ranges()
Documentation was extracted from abstract_replication_strategy::get_ranges(),
which says:
    // get_ranges() returns the list of ranges held by the given endpoint.
    // The list is sorted, and its elements are non overlapping and non wrap-around.

That's important because users of get_keyspace_local_ranges() expect
that the returned list is both sorted and non overlapping, so let's
document it to prevent someone from removing any of these properties.
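
A minimal sketch of why callers depend on both properties: with a sorted,
non-overlapping list, ownership of a token can be checked with a binary
search. The `token_range` struct and the `owns` helper are hypothetical
stand-ins, not Scylla's types, and ranges are assumed to be (start, end]
over integer tokens.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a token range: (start, end], non wrap-around.
struct token_range {
    int64_t start;
    int64_t end;
};

// Assumes `ranges` is sorted and non-overlapping.
// Returns true if `token` falls inside one of the ranges.
bool owns(const std::vector<token_range>& ranges, int64_t token) {
    // Find the first range whose end is >= token. This is only valid because
    // the list is sorted and non-overlapping, so the ends are increasing too.
    auto it = std::lower_bound(ranges.begin(), ranges.end(), token,
        [](const token_range& r, int64_t t) { return r.end < t; });
    return it != ranges.end() && it->start < token;
}
```

If either property were dropped, the search above could silently return
wrong answers, which is the kind of breakage the new comment guards against.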

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20210805140628.537368-1-raphaelsc@scylladb.com>
2021-08-17 21:44:24 +03:00
Asias He
eaf4d2afb4 storage_service: Generate view update for load and stream
Currently, the view is not updated because the streaming reason is set
to streaming::stream_reason::rebuild. On the receiver side, only
streaming with the reason streaming::stream_reason::repair will trigger
view update.

Change the stream reason to repair to trigger view update for load and
stream. This makes load_and_stream behave the same as nodetool refresh.

Note: this is not very efficient, though.

Consider RF = 3, sst1, sst2, sst3 from the older cluster. When sst1 is
loaded, it streams to 3 replica nodes, if we generate view updates, we
will have 3 view updates for this replica (each of the peer nodes finds
its peer and writes the view update to peer). After loading sst2 and
sst3, we will have 9 view updates in total for a single partition.
If we create the view after the load and stream process, we will only
have 3 view updates for a single partition.
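
The arithmetic in this example can be written out as a small sketch. The
helper names are made up for illustration, and the model simply assumes one
view update per replica per loaded sstable, as the example describes.

```cpp
// View updates for one partition when the view exists during load and stream:
// each of the `sstables` is streamed to `rf` replicas, and each replica
// generates one view update.
constexpr int view_updates_during_load(int rf, int sstables) {
    return rf * sstables;
}

// View updates when the view is created after loading: one per replica.
constexpr int view_updates_after_load(int rf) {
    return rf;
}

static_assert(view_updates_during_load(3, 3) == 9, "RF = 3, three sstables");
static_assert(view_updates_after_load(3) == 3, "view created afterwards");
```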

Fixes #9205

Closes #9213
2021-08-17 21:44:24 +03:00
Tomasz Grabiec
9fe3e86368 db: Print more fields of read_command
Message-Id: <20210810143752.420988-1-tgrabiec@scylladb.com>
2021-08-17 12:24:40 +03:00
Pavel Emelyanov
9c7bcd1d85 bound_view: Rewrite tri_compare() tail
The new implementation is shorter and allows the compiler to
produce nicer assembly. In particular, with clang-11 and the -O3 flag:

Was:
    if (d1 == d2) {
        return w1 - w2;
    }
    return d1 < d2 ? w1 - (w1 <= 0) : -(w2 - (w2 <= 0));

    89 f0           mov    %esi,%eax
    39 d7           cmp    %edx,%edi
    74 13           je     403f69 <_Z7cmp_intiiii+0x19>
    7d 0a           jge    403f62 <_Z7cmp_intiiii+0x12>
    31 c9           xor    %ecx,%ecx
    85 c0           test   %eax,%eax
    0f 9e c1        setle  %cl
    29 c8           sub    %ecx,%eax
    c3              retq
    31 c0           xor    %eax,%eax
    85 c9           test   %ecx,%ecx
    0f 9e c0        setle  %al
    29 c8           sub    %ecx,%eax
    c3              retq

14 instructions, 2 cond jumps, 2 cond sets

Now:
    return ((d1 <= d2) ? w1 << 1 : 1) - ((d2 <= d1) ? w2 << 1 : 1);

    8d 04 36        lea    (%rsi,%rsi,1),%eax
    39 d7           cmp    %edx,%edi
    be 01 00 00 00  mov    $0x1,%esi
    0f 4f c6        cmovg  %esi,%eax
    01 c9           add    %ecx,%ecx
    39 fa           cmp    %edi,%edx
    0f 4f ce        cmovg  %esi,%ecx
    29 c8           sub    %ecx,%eax
    c3              retq

9 instructions, 0 cond jumps, 2 cond movs
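
A small sketch that checks the two tails agree in sign. It assumes the
weights `w1` and `w2` only take the values -1, 0 and 1, and it takes the
parameter order (d1, w1, d2, w2) from the register usage in the listings
above. The function names are invented for this check. The new tail is
written with `2 * w` rather than `w << 1`, since left-shifting a negative
value is undefined behavior before C++20.

```cpp
// Old tail, as quoted above.
int tail_old(int d1, int w1, int d2, int w2) {
    if (d1 == d2) {
        return w1 - w2;
    }
    return d1 < d2 ? w1 - (w1 <= 0) : -(w2 - (w2 <= 0));
}

// New tail, with the shift spelled as a multiplication.
int tail_new(int d1, int w1, int d2, int w2) {
    return ((d1 <= d2) ? 2 * w1 : 1) - ((d2 <= d1) ? 2 * w2 : 1);
}

int sign(int x) { return (x > 0) - (x < 0); }

// Returns true if both tails have the same sign for all small inputs.
bool tails_agree() {
    for (int d1 = 0; d1 <= 3; ++d1)
    for (int d2 = 0; d2 <= 3; ++d2)
    for (int w1 = -1; w1 <= 1; ++w1)
    for (int w2 = -1; w2 <= 1; ++w2) {
        if (sign(tail_old(d1, w1, d2, w2)) != sign(tail_new(d1, w1, d2, w2))) {
            return false;
        }
    }
    return true;
}
```

The magnitudes differ between the two versions; only the sign is compared,
which is what a three-way comparison result is normally used for.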

tests: unit(dev), perf(simple_query, release)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20210730092629.18940-1-xemul@scylladb.com>
2021-08-16 17:17:27 +03:00
Tomasz Grabiec
09b575474b Merge "test: raft: generators infrastructure with an actual random nemesis test" from Kamil
Operations and generators can be composed to create more complex
operations and generators. There are certain composition patterns useful
for many different test scenarios.

We implement a couple of such patterns. For example:
- Given multiple different operation types, we can create a new
  operation type - `either_of` - which is a "union" of the original
  operation types. Executing `either_of` operation means executing an
  operation of one of the original types, but the specific type
  can be chosen at runtime.
- Given a generator `g`, `op_limit(n, g)` is a new generator which
  limits the number of operations produced by `g`.
- Given a generator `g` and a time duration of `d` ticks, `stagger(g, d)` is a
  new generator which spreads the operations from `g` roughly every `d`
  ticks. (The actual definition in code is more general and complex but
  the idea is similar.)

Some of these patterns have corresponding notions in Jepsen, e.g. our
`stagger` has a corresponding `stagger` in Jepsen (although our
`stagger` is more general).

Finally, we implement a test that uses this new infrastructure.

Two `Executable` operations are implemented:
- `raft_call` is for calling to a Raft cluster with a given state
  machine command,
- `network_majority_grudge` partitions the network in half,
  putting the leader in the minority.

We run a workload of these operations against a cluster of 5 nodes with
6 threads for executing the operations: one "nemesis thread" for
`network_majority_grudge` and 5 "client threads" for `raft_call`.
Each client thread randomly chooses a contact point which it tries first
when executing a `raft_call`, but it can also "bounce" - call a
different server when the previous one returned "not_a_leader" (we use the
generic "bouncing" wrapper to do this).

For now we only print the resulting history. In a follow-up patchset
we will analyze it for consistency anomalies.

* kbr/raft-test-generator-v4:
  test: raft: randomized_nemesis_test: a basic generator test
  test: raft: generator: a library of basic generators
  test: raft: introduce generators
  test: raft: introduce `future_set`
  test: raft: randomized_nemesis_test: handle `raft::stopped_error` in timeout futures
2021-08-16 15:55:25 +02:00
Kamil Braun
3344ac8a6c test: raft: randomized_nemesis_test: a basic generator test
The previous commits introduced the basic generator concept and a
library of most common composition patterns.

In this commit we implement a test that uses this new infrastructure.

Two `Executable` operations are implemented:
- `raft_call` is for calling to a Raft cluster with a given state
  machine command,
- `network_majority_grudge` partitions the network in half,
  putting the leader in the minority.

We run a workload of these operations against a cluster of 5 nodes with
6 threads for executing the operations: one "nemesis thread" for
`network_majority_grudge` and 5 "client threads" for `raft_call`.
Each client thread randomly chooses a contact point which it tries first
when executing a `raft_call`, but it can also "bounce" - call a
different server when the previous one returned "not_a_leader" (we use the
generic "bouncing" wrapper to do this).

For now we only print the resulting history. In a follow-up patchset
we will analyze it for consistency anomalies.
2021-08-16 13:07:08 +02:00
Kamil Braun
66ec484730 test: raft: generator: a library of basic generators
Operations and generators can be composed to create more complex
operations and generators. There are certain composition patterns useful
for many different test scenarios.

This commit introduces a couple of such patterns. For example:
- Given multiple different operation types, we can create a new
  operation type - `either_of` - which is a "union" of the original
  operation types. Executing `either_of` operation means executing an
  operation of one of the original types, but the specific type
  can be chosen at runtime.
- Given a generator `g`, `op_limit(n, g)` is a new generator which
  limits the number of operations produced by `g`.
- Given a generator `g` and a time duration of `d` ticks, `stagger(g, d)` is a
  new generator which spreads the operations from `g` roughly every `d`
  ticks. (The actual definition in code is more general and complex but
  the idea is similar.)

And so on.
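
As an illustration of the composition idea, here is a simplified sketch of
an `op_limit`-style wrapper. It is not the code from this commit: operations
are plain `int`s, and `repeat`, `count_ops` and the exact shape of `op()` are
assumptions made for the example.

```cpp
#include <cstddef>
#include <optional>
#include <utility>

// Hypothetical generator that yields the same operation forever.
struct repeat {
    int value;

    // Returns the operation together with the next generator state.
    std::optional<std::pair<int, repeat>> op() const {
        return std::make_pair(value, *this);
    }
};

// Sketch of an op_limit-style wrapper: yields at most `remaining`
// operations from the wrapped generator `g`.
template <typename G>
struct op_limit {
    std::size_t remaining;
    G g;

    std::optional<std::pair<int, op_limit>> op() const {
        if (remaining == 0) {
            return std::nullopt;
        }
        auto next = g.op();
        if (!next) {
            return std::nullopt;  // the wrapped generator ran out first
        }
        return std::make_pair(next->first, op_limit{remaining - 1, next->second});
    }
};

// Drains a (finite) generator and counts the operations it produced.
template <typename G>
std::size_t count_ops(G g) {
    std::size_t n = 0;
    while (auto next = g.op()) {
        ++n;
        g = next->second;
    }
    return n;
}
```

Because the wrapper is itself a generator, it can be nested or combined with
other wrappers in the same way.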

Some of these patterns have corresponding notions in Jepsen, e.g. our
`stagger` has a corresponding `stagger` in Jepsen (although our
`stagger` is more general).
2021-08-16 13:07:08 +02:00
Kamil Braun
d8863c5a7b test: raft: introduce generators
We introduce the concepts of "operations" and "generators", basic
building blocks that will allow us to declaratively write randomized
tests for torturing simulated Raft clusters.

An "operation" is a data structure representing a computation which
may cause side effects such as calling a Raft cluster or partitioning
the network, represented in the code with the `Executable` concept.
It has an `execute` function performing the computation and returns
a result of type `result_type`. Different computations of the same type
share state of type `state_type`. The state can, for example, contain
database handles.

Each execution is performed on an abstract `thread` (represented by a `thread_id`)
and has a logical starting time point. The thread and start point together form
the execution's `context` which is passed as a reference to `execute`.
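
A rough sketch of what an operation in this style could look like. The
member names `execute`, `state_type` and `result_type` follow the description
above; `append_op`, the fields of `context`, and the exact signature are
assumptions for illustration, not the code from this commit.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical execution context: the abstract thread running the operation
// and the logical time at which it starts.
struct context {
    unsigned thread_id;
    uint64_t start;
};

// Hypothetical operation that appends a value to a shared log.
struct append_op {
    using state_type = std::vector<int>;  // shared by all append_op executions
    using result_type = std::size_t;      // size of the log after the append

    int value;

    result_type execute(state_type& s, const context&) const {
        s.push_back(value);
        return s.size();
    }
};
```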

Two operations may be called in parallel only if they are on different threads.

A generator, represented through the `Generator` concept, produces a
sequence of operations. An operation can be fetched from a generator
using the `op` function, which also returns the next state of the
generator (generators are purely functional data structures).

The generator concept is inspired by the generators in the Jepsen
testing library for distributed systems.

We also implement `interpreter` which "interprets", or "runs", a given
generator, by fetching operations from the generator and executing them
with concurrency controlled by the abstract threads.

The algorithm used in the interpreter is also similar to the interpreter
algorithm in Jepsen, although there are differences. Most notably we don't
have a "worker" concept - everything runs on a single shard; but we use
"abstract threads" combined with futures for concurrency.
There is also no notion of "process". Finally, the interpreter doesn't
keep an explicit history, but instead uses a callback `Recorder` to notify
the user about operation invocations and completions. The user can
decide to save these events in a history, or perhaps they can analyze
them on the fly using constant memory.
2021-08-16 13:07:08 +02:00
Kamil Braun
421b1b9494 test: raft: introduce future_set
A set of futures that can be polled.

Polling the set (`poll` function) returns the value of one of
the futures which became available or `std::nullopt` if the given
logical duration passes (according to the given timer), whichever
event happens first.  The current implementation assumes sequential
polling.

New futures can be added to the set with `add`.
All futures can be removed from the set with `release`.
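
The polling semantics can be modelled with a simplified, synchronous sketch.
This is not the Seastar-based implementation: each "future" is just an `int`
that becomes available at a given logical tick, and the class name, the
`add` signature and the return value of `release` are assumptions.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Simplified model of a pollable set of futures driven by a logical clock.
class future_set_model {
    struct pending {
        uint64_t ready_at;
        int value;
    };
    std::vector<pending> _pending;
    uint64_t _now = 0;  // logical clock, in ticks
public:
    void add(uint64_t ready_at, int value) {
        _pending.push_back({ready_at, value});
    }

    // Returns the value of the earliest future that becomes available within
    // `duration` ticks, or std::nullopt if none does. Advances the clock.
    std::optional<int> poll(uint64_t duration) {
        const uint64_t deadline = _now + duration;
        auto best = _pending.end();
        for (auto it = _pending.begin(); it != _pending.end(); ++it) {
            if (it->ready_at <= deadline &&
                (best == _pending.end() || it->ready_at < best->ready_at)) {
                best = it;
            }
        }
        if (best == _pending.end()) {
            _now = deadline;  // the whole duration passed
            return std::nullopt;
        }
        _now = std::max(_now, best->ready_at);
        const int v = best->value;
        _pending.erase(best);
        return v;
    }

    // Removes all futures from the set; returns how many were dropped.
    std::size_t release() {
        const std::size_t n = _pending.size();
        _pending.clear();
        return n;
    }
};
```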
2021-08-16 13:07:08 +02:00
Kamil Braun
a5e92e1c45 test: raft: randomized_nemesis_test: handle raft::stopped_error in timeout futures
The timeout futures in `call` and `reconfigure` may be discarded after
Raft servers were `abort()`ed which would result in
`raft::stopped_error` and the test complained about discarded
exceptional futures. Discard these errors explicitly.
2021-08-16 13:07:08 +02:00
Takuya ASADA
cb19048186 docker: use dist/common/supervisor script for docker
The supervisor scripts for Docker and for the offline installer are
almost the same, so drop the Docker ones and share the same code to
deduplicate them.

Closes #9143

Fixes #9194
2021-08-16 13:36:14 +03:00
Avi Kivity
0ba697d515 Merge 'Add service level config change subscription API' from Eliran Sinvani
In order to decouple the service level controller from the system's logic, we introduce an API for subscribing to configuration changes. The timing of the calls was determined with resource creation and destruction in mind. An API subscriber can create
resources that will be available from the very start of the service level's existence. It can also destroy them, since the service level
is guaranteed not to exist anymore at the time of the call to the deletion notification callback.

Testing:
unit tests - all + a newly added one.
dtests - next-gating (dev mode)

Closes #9097

* github.com:scylladb/scylla:
  service level controller: Subscriber API unit test
  Service Level Controller: Add a listener API for service level config changes
2021-08-16 11:47:33 +03:00
Eliran Sinvani
403db8e943 service level controller: Subscriber API unit test
Here we add a very simple unit test for the configuration
change API.
2021-08-16 11:38:59 +03:00
Eliran Sinvani
47d3862b63 Service Level Controller: Add a listener API for service level config changes

This change adds an API for registering a listener for service_level
configuration changes. It notifies about removal, addition, and change of
a service level.
The underlying assumption is that some listeners are going to create and/or
manage service-level-specific resources, and this is what guided the
timing of the calls to the subscriber.
Addition and change notifications are sent before the actual
change takes place. This guarantees that resource creation can take
place before the service level or new config starts to be used.
The deletion notification is sent only after the deletion took place,
which guarantees that the service level can't be active and the
resources created can be safely destroyed.
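
The ordering guarantee can be sketched as follows. All names here
(`config_change_subscriber`, `controller_sketch`, the callback names) are
hypothetical and only model the timing described in this message. The change
notification is left out for brevity.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical subscriber interface; the real API may look different.
struct config_change_subscriber {
    virtual ~config_change_subscriber() = default;
    virtual void on_before_add(const std::string& name) = 0;
    virtual void on_after_remove(const std::string& name) = 0;
};

// Hypothetical controller that stores service level names.
class controller_sketch {
    std::vector<std::string> _levels;
    std::vector<config_change_subscriber*> _subscribers;
public:
    void subscribe(config_change_subscriber* s) { _subscribers.push_back(s); }

    bool exists(const std::string& name) const {
        return std::find(_levels.begin(), _levels.end(), name) != _levels.end();
    }

    void add(const std::string& name) {
        // Notify first: the level is not in use yet, so resources can be set up.
        for (auto* s : _subscribers) s->on_before_add(name);
        _levels.push_back(name);
    }

    void remove(const std::string& name) {
        _levels.erase(std::remove(_levels.begin(), _levels.end(), name), _levels.end());
        // Notify last: the level is already gone, so resources can be torn down.
        for (auto* s : _subscribers) s->on_after_remove(name);
    }
};

// Example subscriber recording whether the level existed at callback time.
struct recording_subscriber : config_change_subscriber {
    const controller_sketch& ctl;
    std::vector<bool> existed;

    explicit recording_subscriber(const controller_sketch& c) : ctl(c) {}
    void on_before_add(const std::string& name) override { existed.push_back(ctl.exists(name)); }
    void on_after_remove(const std::string& name) override { existed.push_back(ctl.exists(name)); }
};
```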
2021-08-16 11:38:59 +03:00
Pavel Emelyanov
6dd67012bb main: Fix internode encryption warning check
It should check for dc || rack, not dc || dc. The correct behavior
is described in both -- the warning message and the commit that
introduced it (a0745f94).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20210730094549.19477-1-xemul@scylladb.com>
2021-08-16 11:14:20 +03:00
Calle Wilund
3633c077be commitlog/config: Make hard size enforcement false by default + add config opt
Refs #9053

Flips the default for commitlog disk footprint hard limit enforcement to off due
to observed latency stalls with stress runs. Instead adds an optional flag
"commitlog_use_hard_size_limit" which can be turned on to enforce it.

A tape-and-string fix of sorts until we can properly tweak the balance between
cl & sstable flush rate.

Closes #9195
2021-08-15 15:10:27 +03:00
Asias He
97bb2e47ff storage_service: Enable Repair Based Node Operations (RBNO) by default for replace
We decided to enable repair based node operations by default for replace
node operations.

To do that, a new option --allowed-repair-based-node-ops is added. It
lists the node operations that are allowed to enable repair based node
operations.

The operations can be bootstrap, replace, removenode, decommission and rebuild.

By default, --allowed-repair-based-node-ops is set to contain "replace".

Note, the existing option --enable-repair-based-node-ops is still in
play. It is the global switch to enable or disable the feature.

Examples:

- To enable bootstrap and replace node ops:

```
scylla --enable-repair-based-node-ops true --allowed-repair-based-node-ops replace,bootstrap
```

- To disable any repair based node ops:

```
scylla --enable-repair-based-node-ops false
```
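
As I read the description, the two options combine as a global switch plus
an allow-list. The following sketch models that reading. The helper names
are hypothetical and this is not the option-parsing code from the patch.

```cpp
#include <set>
#include <sstream>
#include <string>

// Splits a comma-separated option value such as "replace,bootstrap".
std::set<std::string> parse_allowed_ops(const std::string& csv) {
    std::set<std::string> ops;
    std::istringstream in(csv);
    std::string op;
    while (std::getline(in, op, ',')) {
        if (!op.empty()) {
            ops.insert(op);
        }
    }
    return ops;
}

// Hypothetical decision helper: an operation uses repair based node ops only
// if the global switch is on and the operation is in the allowed list.
bool use_rbno(bool enabled, const std::set<std::string>& allowed, const std::string& op) {
    return enabled && allowed.count(op) > 0;
}
```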

Closes #9197
2021-08-15 13:30:46 +03:00
Nadav Har'El
b53eeb8a6c Merge 'Enable user-defined aggregates' from Piotr Sarna
It turns out that user-defined aggregates did not need any elaborate coding in order to make them exposed to the users. The whole infrastructure is already there, including system schema tables and support for running aggregate queries, so this series simply adds lots and lots of boilerplate glue code to make UDA usable.

It also comes with a simple test which shows that it's possible to define and use such an aggregate.

Performance not tested, since user-defined functions are still experimental, so nothing really changes in this matter.

Tests: unit(release)

Fixes #7201

Closes #9165

* github.com:scylladb/scylla:
  cql-pytest: add a test suite for user-defined aggregates
  cql-pytest: add context managers for functions and aggregates
  cql3: enable user-defined aggregates in CQL grammar
  cql3: add statements for user-defined aggregates
  cql3,functions: add checking if a function is used in UDA
  gms: add UDA feature
  migration_manager: add migrating user-defined aggregates
  db,schema_tables: add handling user-defined aggregates
  pagers: make a lambda mutable in fetch_page
  cql3: wrap handling paging result with with_thread_if_needed
  cql3: correctly mark function selectors as needing threads
  cql3: add user-defined aggregate representation
2021-08-14 12:14:12 +03:00
Piotr Sarna
38c1fd0762 cql-pytest: add a test suite for user-defined aggregates
The test suite now consists of a single user aggregate:
a custom implementation for existing avg() built-in function,
as well as a couple of cases for catching incorrect operations,
like using wrong function signatures or dropping used functions.
2021-08-13 11:16:52 +02:00
Piotr Sarna
5f773d04d2 cql-pytest: add context managers for functions and aggregates
These context managers can be used to create temporary
user-defined functions and user-defined aggregates.
2021-08-13 11:16:52 +02:00
Piotr Sarna
2ebf018e74 cql3: enable user-defined aggregates in CQL grammar
Statements for creating and dropping user-defined aggregates
are now accepted by the grammar and can be used by the users.
2021-08-13 11:16:52 +02:00
Piotr Sarna
ec25cf965e cql3: add statements for user-defined aggregates
The following statements are added:
 - CREATE AGGREGATE
 - DROP AGGREGATE
2021-08-13 11:16:52 +02:00
Piotr Sarna
a9ae753cd6 cql3,functions: add checking if a function is used in UDA
If a function is used by a user-defined aggregate, it must not
be dropped or the aggregate would be left with a dangling function.
2021-08-13 11:16:47 +02:00
Piotr Sarna
da67c594c8 gms: add UDA feature
UDA stands for user-defined aggregates and the feature implies
that the whole cluster supports them.
2021-08-13 11:14:12 +02:00
Piotr Sarna
e1be04852b migration_manager: add migrating user-defined aggregates
User-defined aggregate creation and deletion can now be announced.
2021-08-13 11:14:12 +02:00
Piotr Sarna
84876a165b db,schema_tables: add handling user-defined aggregates
Aggregates are propagated, created and dropped very similarly
to user-defined functions - a set of helper functions
for aggregates are added based on the UDF implementation.
2021-08-13 11:14:11 +02:00
Piotr Sarna
ad2093539b pagers: make a lambda mutable in fetch_page
The lambda passed to with_thread_if_needed helper function
relies on moving its captured parameters, so it's made mutable
in order to avoid copying.
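
A self-contained illustration of the language rule involved (plain C++, not
the pager code): without `mutable`, captures are const inside the lambda
body, so `std::move` on a capture falls back to the copy constructor. The
`tracked` type and both helper functions are made up for the demonstration.

```cpp
#include <utility>

// Counts how many times it gets copied.
struct tracked {
    int* copies;
    explicit tracked(int* c) : copies(c) {}
    tracked(const tracked& o) : copies(o.copies) { ++*copies; }
    tracked(tracked&& o) noexcept : copies(o.copies) {}
};

// With `mutable`, the capture can be moved out of the lambda.
int copies_with_mutable() {
    int copies = 0;
    tracked t(&copies);
    auto task = [t = std::move(t)]() mutable { return std::move(t); };
    tracked out = task();
    (void)out;
    return copies;
}

// Without `mutable`, the capture is const and the "move" becomes a copy.
int copies_without_mutable() {
    int copies = 0;
    tracked t(&copies);
    auto task = [t = std::move(t)]() { return std::move(t); };
    tracked out = task();
    (void)out;
    return copies;
}
```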
2021-08-13 11:13:43 +02:00
Piotr Sarna
260604d053 cql3: wrap handling paging result with with_thread_if_needed
One of the pagers did not spawn a Seastar thread even if it was
required by its underlying selectors - the behavior is now fixed.
2021-08-13 11:13:43 +02:00
Piotr Sarna
cac321cd12 cql3: correctly mark function selectors as needing threads
Function call selectors correctly checked if their arguments
are required to run in threaded context, but forgot to check
the function itself - which is now done.
2021-08-13 11:13:43 +02:00
Piotr Sarna
ee81453596 cql3: add user-defined aggregate representation
A user-defined aggregate is represented as an aggregate
which calls its state function on each input row
and then finalizes its execution by calling its final function
on the final state, after all rows were already processed.
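
The shape described above can be sketched in a few lines. The
`user_aggregate` template and `make_avg` are illustrative stand-ins, not the
cql3 classes. The example models an average with a (sum, count) state.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical aggregate: a state function applied to each row, followed by
// a final function applied to the resulting state.
template <typename State, typename Row, typename Result>
struct user_aggregate {
    State initial;
    std::function<State(State, const Row&)> state_fn;
    std::function<Result(const State&)> final_fn;

    Result run(const std::vector<Row>& rows) const {
        State s = initial;
        for (const auto& r : rows) {
            s = state_fn(std::move(s), r);
        }
        return final_fn(s);
    }
};

// Average expressed as a (sum, count) state plus a final division.
using avg_state = std::pair<long, long>;

user_aggregate<avg_state, int, double> make_avg() {
    return {
        avg_state{0, 0},
        [](avg_state s, const int& v) { return avg_state{s.first + v, s.second + 1}; },
        [](const avg_state& s) { return s.second ? double(s.first) / s.second : 0.0; },
    };
}
```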
2021-08-13 11:13:41 +02:00
Piotr Sarna
58196e8ea6 db,view: avoid ignoring failed future in background view updates
The code for handling background view updates used to propagate
exceptions unconditionally, which leads to "exceptional future
ignored" warnings if the update was put to background.
From now on, the exception is only propagated if its future
is actually waited on.

Fixes #6187

Tested manually, the warning was not observed after the patch

Closes #9179
2021-08-12 17:32:35 +03:00
Piotr Sarna
ea0e0c924d configure,install-dependencies: add wasmtime dependency
If the wasmtime library is available for download, it will be
set up by install-dependencies and prepared for linking.

Closes #9151

[avi: regenerate toolchain, which also updates clang to 12.0.1]
2021-08-12 12:33:43 +03:00
Asias He
cc44edb4e2 database: Detemplate run_async
I initially tried to use a noncopyable_function to avoid the unnecessary
template usage.

However, since database::apply_in_memory is a hot function, it is better
to use with_gate directly. The run_async function does nothing but call
with_gate anyway.

Closes #9160
2021-08-12 07:53:10 +03:00
Takuya ASADA
e5bb88b69a scylla_cpuscaling_setup: change scaling_governor path
In some environments, /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
does not exist even though CPU scaling is supported.
In contrast, /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor is
available in both kinds of environment, so we should switch to it.

Fixes #9191

Closes #9193
2021-08-11 15:31:14 +03:00
Nadav Har'El
89724533f8 test/cql-pytest: CREATE INDEX IF NOT EXISTS vs. Cassandra
What should the following pair of statements do?

    CREATE INDEX xyz ON tbl(a)
    CREATE INDEX IF NOT EXISTS xyz ON tbl(b)

There are two reasonable choices:
1. An index with the name xyz already exists, so the second command should
   do nothing, because of the "IF NOT EXISTS".
2. The index on tbl(b) does *not* yet exist, so the command should try to
   create it. And when it can't (because the name xyz is already taken),
   it should produce an error message.

Currently, Cassandra went with choice 1, and Scylla went with choice 2.

After some discussions on the mailing list, we agreed that Scylla's
choice is the better one and Cassandra's choice could be considered a
bug: The "IF NOT EXISTS" feature is meant to allow idempotent creation of
an index - and not to make it easy to make mistakes without noticing.
The second command listed above is most likely a mistake by the user,
not anything intentional: The command intended to ensure that an index
on column b exists, but after the silent success of the command, no such
index exists.

So this patch doesn't change any Scylla code (it just adds a comment),
and rather it adds a test which "enshrines" the current behavior.
The test passes on Scylla and fails on Cassandra so we tag it
"cassandra_bug", meaning that we consider this difference to be
intentional and we consider Cassandra's behavior in this case to be wrong.
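
Choice 2 can be modelled with a toy catalog. This is a simplified sketch
for a single table, not Scylla's schema code: `index_catalog` and
`create_index_if_not_exists` are hypothetical names.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical index catalog for one table: index name -> indexed column.
using index_catalog = std::map<std::string, std::string>;

// Models choice 2: "IF NOT EXISTS" is a silent no-op only when an index on
// the requested column already exists. Otherwise creation is attempted and
// fails if the name is taken. Returns true if an index was created.
bool create_index_if_not_exists(index_catalog& cat,
                                const std::string& name,
                                const std::string& column) {
    for (const auto& entry : cat) {
        if (entry.second == column) {
            return false;  // an index on this column already exists
        }
    }
    if (cat.count(name) > 0) {
        throw std::invalid_argument("index name '" + name + "' is already in use");
    }
    cat.emplace(name, column);
    return true;
}
```

Under choice 1 the final check would be replaced by a silent return, which
is exactly the case where the user ends up without the index they asked for.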

Fixes #9182.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210811113906.2105644-1-nyh@scylladb.com>
2021-08-11 13:41:58 +02:00
Asias He
ce8fd051c9 storage_service: Fix argument in send_meta_data::do_receive
The extra status print is not needed in the log.

Fixes the following error:

ERROR 2021-08-10 10:54:21,088 [shard 0] storage_service -
service/storage_service.cc:3150 @do_receive: failed to log message:
fmt='send_meta_data: got error code={}, from node={}, status={}':
fmt::v7::format_error (argument not found)

Fixes #9183

Closes #9189
2021-08-11 11:35:30 +02:00
Asias He
040b626235 table: Fix is_shared assert for load and stream
The reader is used by load and stream to read sstables from the upload
directory which are not guaranteed to belong to the local shard.

Use make_range_sstable_reader instead of
make_local_shard_sstable_reader.

Tests:

backup_restore_tests.py:TestBackupRestore.load_and_stream_using_snapshot_test
backup_restore_tests.py:TestBackupRestore.load_and_stream_to_new_cluster_2_test
backup_restore_tests.py:TestBackupRestore.load_and_stream_to_new_cluster_1_test
migration_test.py:TestLoadAndStream.load_and_stream_asymmetric_cluster_test
migration_test.py:TestLoadAndStream.load_and_stream_decrease_cluster_test
migration_test.py:TestLoadAndStream.load_and_stream_frozen_pk_test
migration_test.py:TestLoadAndStream.load_and_stream_increase_cluster_test
migration_test.py:TestLoadAndStream.load_and_stream_primary_replica_only_test

Fixes #9173

Closes #9185
2021-08-11 12:18:40 +03:00
Piotr Jastrzebski
db4c9199f5 sstables: remove unused uppermost_bound from clustering_ranges_walker and mutation_fragment_filter
Those methods are never used, so it's better not to keep dead code
around.

Tests: unit(dev)

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>

Closes #9188
2021-08-11 10:54:59 +02:00
Nadav Har'El
49ca1f86b2 Merge 'hints: error injection for pausing hint replay' from Piotr Dulikowski
Adds a `hinted_handoff_pause_hint_replay` error injection point. When
enabled, hint replay logic behaves as if it is run, but it gets stuck in
a loop and no hints are actually sent until the point is disabled again.

This injection point will be useful in dtests - it will simulate
infinitely slow hint replay and will make it possible to test how some
operations behave while hint replay logic is running.

The first intended use case of this injection point is testing the HTTP
API for waiting for hints (#8728).

Refs: #6649

Closes #8801

* github.com:scylladb/scylla:
  hints: fix indentation after previous patch
  hints: error injection for pausing hint replay
  hints: coroutinize lambda inside send_one_file
2021-08-11 11:42:29 +03:00
Piotr Dulikowski
f2e1339f38 hints: use an abort_source with sleep_abortable in flush+send loop
Each hint sender runs an asynchronous loop which tries to flush and then
send hints. Between each attempt, it sleeps at most 10 seconds using
sleep_abortable. However, an overload of sleep_abortable is used which
does not take an abort_source - it should abort the sleep in case
Seastar handles a SIGINT or SIGTERM signal. However, in order for that
to work, the application must not prevent default handling of those
signals in Seastar - but Scylla explicitly does it by disabling the
`auto_handle_sigint_sigterm` option in reactor config. As a result,
those sleeps are never aborted, and - because we wait for the async
loops to stop - they can delay shutdown by up to 10 seconds.

To fix that, an abort_source is added to the hints sender, and the
abort_source is triggered when the corresponding sender is requested to
stop.
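
The idea can be illustrated with a standard-library analogue. This is not
Seastar's `abort_source` or `sleep_abortable`: the `abort_flag` class below
is a made-up stand-in built on `std::condition_variable`.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Standard-library analogue of an abort source with an abortable sleep.
class abort_flag {
    std::mutex _m;
    std::condition_variable _cv;
    bool _aborted = false;
public:
    // Called by whoever asks the loop to stop.
    void request_abort() {
        {
            std::lock_guard<std::mutex> lk(_m);
            _aborted = true;
        }
        _cv.notify_all();
    }

    // Sleeps for up to `d`. Returns true if the sleep ended because an abort
    // was requested, false if the full duration elapsed.
    bool sleep_for(std::chrono::milliseconds d) {
        std::unique_lock<std::mutex> lk(_m);
        return _cv.wait_for(lk, d, [this] { return _aborted; });
    }
};
```

A flush+send loop built on this would call `sleep_for` between attempts and
exit as soon as it returns true, instead of waiting out the full interval.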

Fixes: #9176

Closes #9177
2021-08-11 10:32:53 +02:00
Tomasz Grabiec
e177cd382b db: Remove superfluous } from read_command printout
Message-Id: <20210810131429.407903-1-tgrabiec@scylladb.com>
2021-08-10 17:32:34 +03:00
Michał Chojnowski
2aa0a2e6a1 test: perf: perf_collection: use the optimized version of bptree
Since key_compare does not conform to SimpleLessCompare, the benchmark
tests the non-optimized version of bptree (without SIMD key search).
We want to test the optimized version.

Closes #9180
2021-08-10 17:04:34 +03:00
Nadav Har'El
65381bd155 test/alternator: add tests for expression length limits
The DynamoDB documentation
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Limits.html
describes several hard limits on the size of expressions
(ProjectionExpression, ConditionExpression, UpdateExpression,
FilterExpression) and various elements they contain.

In this patch we begin testing those limits with a comprehensive test for
the *length* of each of these four expressions: we test that lengths up to
(and including) 4096 bytes are allowed but longer expressions are rejected.
We also add TODOs for additional documented limits that should be tested
in the future.

Currently, this test passes on DynamoDB but xfails on Alternator because
Alternator does *not* enforce any limits on the expression length. I don't
think this is a real problem, and we may consider keeping it this way,
but we should at least be aware that this difference exists and an
xfailing test will remind us.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210810081948.2012120-2-nyh@scylladb.com>
2021-08-10 12:06:21 +02:00
Nadav Har'El
9d49a32486 test/alternator: add tests for attribute name limits
DynamoDB limits attribute names in items to lengths of up to 65535 bytes,
but in some cases (such as key attributes) the limit is lower - 255.
This patch adds tests for many of these cases.
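
The two limits quoted above amount to a check like the following sketch.
The function and constant names are hypothetical, and this is neither
DynamoDB's nor Alternator's validation code. It only covers the distinction
between key and non-key attributes.

```cpp
#include <cstddef>
#include <string>

// Limits quoted in this commit message, in bytes.
constexpr std::size_t max_attribute_name_bytes = 65535;
constexpr std::size_t max_key_attribute_name_bytes = 255;

// Returns true if `name` is within the length limit for its kind.
bool attribute_name_length_ok(const std::string& name, bool is_key_attribute) {
    const std::size_t limit = is_key_attribute ? max_key_attribute_name_bytes
                                               : max_attribute_name_bytes;
    return name.size() <= limit;
}
```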

All the new tests pass on DynamoDB, but some still xfail on Alternator
because Alternator is too lenient - sometimes allowing longer attribute
names than DynamoDB allows. While this may sound great, it also has
downsides: The oversized attribute names perform badly, and as they
grow, Alternator's internal limits will be reached as well, and result
in an unsightly "internal server error" being reported instead of the
expected user-friendly error.

Refs #9169.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210810081948.2012120-1-nyh@scylladb.com>
2021-08-10 12:06:13 +02:00
Avi Kivity
112cee4960 Merge "make sstable::make_reader() return flat_mutation_reader_v2" from Michael
"
* Make `sstable::make_reader()` return `flat_mutation_reader_v2`,
  retain the old one as `sstable::make_reader_v1()`

* Start weaning tests off `sstable::make_reader_v1()` (done all the
  easy ones, i.e. those not involving range tombstones)
"

* tag 'sstable-make-reader-v2-v1' of github.com:cmm/scylla:
  tests: use flat_mutation_reader_v2 in the easier part of sstable_3_x_test
  tests: upgrade the "buffer_overflow" test to flat_mutation_reader_v2
  tests: get rid of sstable::make_reader_v1() in broken_sstable_test
  tests: get rid of sstable::make_reader_v1() in the trivial cases
  sstables: make sstable::make_reader() return flat_mutation_reader_v2
2021-08-10 12:57:10 +03:00
Avi Kivity
a7ef826c2b Merge "Fold validation compaction into scrub" from Botond
"
Validation compaction -- although I still maintain that it is a good
descriptive name -- was an unfortunate choice for the underlying
functionality because Origin has burned the name already as it uses it
for a compaction type used during repair. This opens the door for
confusion for users coming from Cassandra who will associate Validation
compaction with the purpose it is used for in Origin.
Additionally, since Origin's validation compaction was not user
initiated, it didn't have a corresponding `nodetool` command to start
it. Adding such a command would create an operational difference between
us and Origin.

To avoid all this we fold validation compaction into scrub compaction,
under a new "validation" mode. I decided against using the also
suggested `--dry-mode` flag as I feel that a new mode is a more natural
choice: we don't have to define how it interacts with all the other
modes, unlike with a `--dry-mode` flag.

Fixes: #7736

Tests: unit(dev), manual(REST API)
"

* 'scrub-validation-mode/v2' of https://github.com/denesb/scylla:
  compaction/compaction_descriptor: add comment to Validation compaction type
  compaction/compaction_descriptor: compaction_options: remove validate
  api: storage_service: validate_keyspace -> scrub_keyspace (validate mode)
  compaction/compaction_manager: hide perform_sstable_validation()
  compaction: validation compaction -> scrub compaction (validate mode)
  compaction/compaction_descriptor: compaction_options: add options() accessor
  compaction/compaction_descriptor: compaction_options::scrub::mode: add validate
2021-08-10 12:18:35 +03:00
Michael Livshin
c0ba657a86 tests: use flat_mutation_reader_v2 in the easier part of sstable_3_x_test
That is, anything not involving range tombstones.

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2021-08-09 19:20:48 +03:00
Michael Livshin
7c2854a094 tests: upgrade the "buffer_overflow" test to flat_mutation_reader_v2
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2021-08-09 19:20:48 +03:00
Michael Livshin
a4c43eda3a tests: get rid of sstable::make_reader_v1() in broken_sstable_test
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2021-08-09 19:20:48 +03:00