Commit Graph

4568 Commits

Author SHA1 Message Date
Botond Dénes
bae62f899d mutation/mutation_compactor: consume_partition_end(): reset _stop
The purpose of `_stop` is to remember whether the consumption of the
last partition was interrupted or it was consumed fully. In the former
case, the compactor allows retrieving the compaction state for the given
partition, so that its compaction can be resumed at a later point in
time.
Currently, `_stop` is set to `stop_iteration::yes` whenever the return
value of any of the `consume()` methods is also `stop_iteration::yes`.
Meaning, if the consuming of the partition is interrupted, this is
remembered in `_stop`.
However, a partition whose consumption was interrupted is not always
continued later. Sometimes consumption of a partition is interrupted
because the partition is not interesting and the downstream consumer
wants to stop it. In these cases the compactor should not return an
engaged optional from `detach_state()`, because there is no state to
detach; the state should be thrown away. This was incorrectly handled so
far and is fixed in this patch, by overwriting `_stop` in
`consume_partition_end()` with whatever the downstream consumer returns.
Meaning if they want to skip the partition, then `_stop` is reset to
`stop_iteration::no` and `detach_state()` will return a disengaged
optional as it should in this case.
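
The behaviour described above can be modelled with a small Python toy. This is only a sketch under stated assumptions: `ToyCompactor`, `PagingConsumer` and `SkippingConsumer` are hypothetical names, booleans stand in for `stop_iteration`, and `None` stands in for a disengaged optional. It is not the actual compactor code.

```python
class PagingConsumer:
    """Hypothetical consumer that interrupts a partition once its page is full."""
    def __init__(self, page_size):
        self.page_size = page_size
        self.rows = 0

    def consume(self, row):
        self.rows += 1
        return self.rows >= self.page_size  # True plays the role of stop_iteration::yes

    def consume_end_of_partition(self):
        return self.rows >= self.page_size


class SkippingConsumer:
    """Hypothetical consumer that is not interested in the partition at all."""
    def consume(self, row):
        return True   # stop feeding this partition

    def consume_end_of_partition(self):
        return False  # but carry on with the next partition


class ToyCompactor:
    def __init__(self, consumer):
        self._consumer = consumer
        self._stop = False
        self._state = None

    def consume_new_partition(self, key):
        self._stop = False
        self._state = {"partition": key, "rows_seen": 0}

    def consume(self, row):
        self._state["rows_seen"] += 1
        stop = self._consumer.consume(row)
        if stop:
            self._stop = True  # remember that consumption was interrupted
        return stop

    def consume_partition_end(self):
        # The fix described above: overwrite _stop with the consumer's answer.
        self._stop = self._consumer.consume_end_of_partition()
        return self._stop

    def detach_state(self):
        # Only an interrupted partition has state worth resuming from.
        return self._state if self._stop else None


# Interrupted mid-partition (page full): the partition end is never consumed.
paged = ToyCompactor(PagingConsumer(page_size=1))
paged.consume_new_partition("pk1")
paged.consume("row1")
resumable_state = paged.detach_state()

# Partition skipped by the consumer: the partition end resets _stop.
skipped = ToyCompactor(SkippingConsumer())
skipped.consume_new_partition("pk2")
skipped.consume("row1")
skipped.consume_partition_end()
skipped_state = skipped.detach_state()
```

In this model only the paged read yields state to resume from; the skipped partition yields `None`.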

Fixes: #12629

Closes #13365
2023-03-29 17:48:45 +03:00
Calle Wilund
6525209983 alternator/rest api tests: Remove name assumption and rely on actual scylla info
Fixes #13332
The tests use the discriminator "system" as prefix to assume keyspaces are marked
"internal" inside scylla. This is not true in enterprise universe (replicated key
provider). It maybe/probably should, but that train is sailing right now.

Fix by removing one assert (not correct) and using actual API info in the alternator
test.

Closes #13333
2023-03-28 15:41:23 +03:00
Botond Dénes
b6c022a142 Merge 'cmake: sync with configure.py (15/n)' from Kefu Chai
this is the 15th changeset of a series which tries to give an overhaul to the CMake building system. this series has two goals:
    - to enable developers to use CMake for building scylla, so they can use tools (CLion for instance) with CMake integration for better developer experience
    - to enable us to tweak the dependencies in a simpler way. a well-defined cross module / subsystem dependency is a prerequisite for building this project with the C++20 modules.

this changeset includes following changes:

 - build: cmake: add two missing tests
 - build: cmake: port more cxxflags from configure.py

Closes #13262

* github.com:scylladb/scylladb:
  build: cmake: add missing source files to idl and service
  build: cmake: port more cxxflags from configure.py
  build: cmake: add two missing tests
2023-03-28 09:16:38 +03:00
Kamil Braun
cd282cf0ab Merge 'Raft, use schema commit log' from Gusev Petr
We need this so that we can have multi-partition mutations which are applied atomically. If they live on different shards, we can't guarantee atomic write to the commitlog.

Fixes: #12642

Closes #13134

* github.com:scylladb/scylladb:
  test_raft_upgrade: add a test for schema commit log feature
  scylla_cluster.py: add start flag to server_add
  ServerInfo: drop host_id
  scylla_cluster.py: add config to server_add
  scylla_cluster.py: add expected_error to server_start
  scylla_cluster.py: ScyllaServer.start, refactor error reporting
  scylla_cluster.py: fix ScyllaServer.start, reset cmd if start failed
  raft: check if schema commitlog is initialized Refuse to boot if neither the schema commitlog feature nor force_schema_commit_log is set. For the upgrade procedure the user should wait until the schema commitlog feature is enabled before enabling consistent_cluster_management.
  raft: move raft initialization after init_system_keyspace
  database: rename before_schema_keyspace_init->maybe_init_schema_commitlog
  raft: use schema commitlog for raft tables
  init_system_keyspace: refactoring towards explicit load phases
2023-03-27 13:27:30 +02:00
Botond Dénes
b5afdf56c3 Merge 'Cleanup keyspace compaction task' from Aleksandra Martyniuk
Task manager task implementations of classes that cover
cleanup keyspace compaction which can be started through
/storage_service/keyspace_compaction/ api.

Top level task covers the whole compaction and creates child
tasks on each shard.

Closes #12712

* github.com:scylladb/scylladb:
  test: extend test_compaction_task.py to test cleanup compaction
  compaction: create task manager's task for cleanup keyspace compaction on one shard
  compaction: create task manager's task for cleanup keyspace compaction
  api: add get_table_ids to get table ids from table infos
  compaction: create cleanup_compaction_task_impl
2023-03-27 11:52:51 +03:00
Botond Dénes
ab61704c54 Merge 'mutation: replace operator<<(.., const range_tombstone&) with fmt formatter' from Kefu Chai
this is a part of a series migrating from `operator<<(ostream&, ..)` based formatting to fmtlib based formatting. the goal here is to enable fmtlib to print `range_tombstone` and `range_tombstone_change` without using ostream<<. also, this change removes all existing callers of `operator<<(ostream, const range_tombstone &)` and `operator<<(ostream, const range_tombstone_change &)`, and then removes these two `operator<<`s.

Refs #13245

Closes #13260

* github.com:scylladb/scylladb:
  mutation: drop operator<<(ostream, const range_tombstone{_change,} &)
  mutation: use fmtlib to print range_tombstone{_change,}
  mutation: mutation_fragment_v2: specialize fmt::formatter<range_tombstone_change>
  mutation: range_tombstone: specialize fmt::formatter<range_tombstone>
2023-03-27 11:38:59 +03:00
Kefu Chai
33f4012eeb test: cql-pytest: test_describe: clamp bloom filter's fp rate
before this change, we use `round(random.random(), 5)` for
the value of `bloom_filter_fp_chance` config option. there are
chances that this expression could return a number lower than or equal
to 6.71e-05.

but we do have a minimum for this option, which is defined by
`utils::bloom_calculations::probs`. and the minimum false positive
rate is 6.71e-05.

we are observing test failures where we are using 0 for
the option, and scylla rightly rejected it with the error message of
```
bloom_filter_fp_chance must be larger than 6.71e-05 and less than or equal to 1.0 (got 0)
```.

so, in this change, to address the test failure, we always use a number
greater than or equal to a value slightly above the minimum, to
ensure that the randomly picked number is in the range of supported
false positive rates.
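
A minimal sketch of such a clamp, assuming the minimum is the 6.71e-05 quoted in the error message above. The helper name `random_fp_chance` and the `margin` value are illustrative and not taken from the test itself:

```python
import random

# Minimum taken from the error message quoted above.
MIN_FP_CHANCE = 6.71e-05


def random_fp_chance(rng=random, margin=1e-05):
    """Return a random bloom_filter_fp_chance strictly above the minimum.

    `margin` is an arbitrary safety distance chosen for illustration.
    """
    floor = MIN_FP_CHANCE + margin
    # round() alone may yield 0.0; max() clamps it back into the valid range.
    return max(floor, round(rng.random(), 5))


samples = [random_fp_chance(random.Random(seed)) for seed in range(100)]
```

Passing a seeded `random.Random` instance instead of the module keeps the picked values reproducible.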

Fixes #13313
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13314
2023-03-26 19:41:22 +03:00
Avi Kivity
f937fad25a Merge 'readers/multishard: shard_reader: fast-forward created reader to current range' from Botond Dénes
When creating the reader, the lifecycle policy might return one that was saved on the last page and survived in the cache. This reader might have skipped some fast-forwarding ranges while sitting in the cache. To avoid using a reader reading a stale range (from the read's POV), check its read range and fast forward it if necessary.

Fixes: https://github.com/scylladb/scylladb/issues/12916

Closes #12932

* github.com:scylladb/scylladb:
  readers/multishard: shard_reader: fast-forward created reader to current range
  readers/multishard: reader_lifecycle_policy: add get_read_range()
  test/boost/multishard_mutation_query_test: paging: handle range becoming wrapping
2023-03-26 18:39:50 +03:00
Kefu Chai
a5547ea11b build: cmake: add two missing tests
they are leftovers in f113dac5bf

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-03-26 14:01:21 +08:00
Botond Dénes
19560419d2 Merge 'treewide: improve compatibility with gcc 13' from Avi Kivity
An assortment of patches that reduce our incompatibilities with the upcoming gcc 13.

Closes #13243

* github.com:scylladb/scylladb:
  transport: correctly format unknown opcode
  treewide: catch by reference
  test: raft: avoid confusing string compare
  utils, types, test: extract lexicographical compare utilities
  test: raft: fsm_test: disambiguate raft::configuration construction
  test: reader_concurrency_semaphore_test: handle all enum values
  repair: fix signed/unsigned compare
  repair: fix incorrect signed/unsigned compare
  treewide: avoid unused variables in if statements
  keys: disambiguate construction from initializer_list<bytes>
  cql3: expr: fix serialize_listlike() reference-to-temporary with gcc
  compaction: error on invalid scrub type
  treewide: prevent redefining names
  api: task_manager: fix signed/unsigned compare
  alternator: streams: fix signed/unsigned comparison
  test: fix some mismatched signed/unsigned comparisons
2023-03-24 15:16:05 +02:00
Botond Dénes
0aa03f85a3 readers/multishard: reader_lifecycle_policy: add get_read_range()
Allows retrieving the current read-range for the reader on the given
shard (where the method is called).
2023-03-24 08:40:11 -04:00
Botond Dénes
1c7a66cd2a test/boost/multishard_mutation_query_test: paging: handle range becoming wrapping
After each page, the read range is adjusted so it continues from/after
the last read partition. Sometimes this can result in the range becoming
wrapped like this: (pk, pk]. In this case, we can just drop this range
and continue with the rest of the ranges (if there are multiple ones).
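
The adjustment can be sketched as follows. This is a toy model only: `ToyRange` and `ranges_for_next_page` are hypothetical names, integers stand in for partition keys, and the last read partition is assumed to belong to the first range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyRange:
    start: int
    end: int
    start_inclusive: bool = True
    end_inclusive: bool = True

    def is_degenerate(self):
        """True for ranges such as (pk, pk] that cannot contain any key."""
        if self.start > self.end:
            return True
        return self.start == self.end and not (
            self.start_inclusive and self.end_inclusive
        )


def ranges_for_next_page(ranges, last_pk):
    """Continue after last_pk, which was read from the first range."""
    first, rest = ranges[0], list(ranges[1:])
    adjusted = ToyRange(last_pk, first.end, False, first.end_inclusive)
    if adjusted.is_degenerate():
        return rest  # drop (pk, pk] and continue with the remaining ranges
    return [adjusted] + rest


ranges = [ToyRange(1, 5), ToyRange(7, 9)]
mid_range = ranges_for_next_page(ranges, 3)   # first range becomes (3, 5]
at_end = ranges_for_next_page(ranges, 5)      # (5, 5] is dropped
```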
2023-03-24 08:40:11 -04:00
Tomasz Grabiec
c54a3d9c10 Merge 'Clean enabled features manipulations in system keyspace' from Pavel Emelyanov
There was an attempt to cut feature-service -> system-keyspace dependency (#13172) which turned out to require more changes. Here's a preparation squeezing from this future work.

This set
- leaves only batch-enabling API in feature service
- keeps the need for async context in feature service
- narrows down system keyspace features API to only load and store records
- relaxes features updating logic in sys.ks.
- cosmetic

Closes #13264

* github.com:scylladb/scylladb:
  feature_service: Indentation fix after previous patch
  feature_service: Move async context into enable()
  system_keyspace: Refactor local features load/save helpers
  feature_service: Mark supported_feature_set() const
  feature_service: Remove single feature enabling method
  boot: Enable features in batch
  gossiper: Enable features in batch
2023-03-24 13:12:49 +01:00
Petr Gusev
c1634ea5fa test_raft_upgrade: add a test for schema commit log feature
The test tries to start a node with
consistent_cluster_management but without
force_schema_commit_log. This is expected to fail,
since the schema commitlog feature should be enabled
by all the cluster nodes.
2023-03-24 16:08:17 +04:00
Petr Gusev
e407956e9f scylla_cluster.py: add start flag to server_add
Sometimes when creating a node it's useful
to just install it and not start. For example,
we may want to try to start it later with
an expected error.

The ScyllaServer.install method has been made
exception safe: if an exception occurs, it
reverts to the original state. This avoids
duplicating the try/except logic
in two of its call sites.
2023-03-24 16:08:17 +04:00
Petr Gusev
794d0e4000 ServerInfo: drop host_id
We are going to allow the
ScyllaCluster.add_server function not to
start the server if the caller has requested
that with a special parameter. The host_id
can only be obtained from a running node, so
add_server won't be able to return it in
this case. I've grepped the tests for host_id
and there doesn't seem to be any
reference to it in the code.
2023-03-24 16:08:17 +04:00
Petr Gusev
8e3392c64f scylla_cluster.py: add config to server_add
Sometimes when creating a node it's useful
to pass a custom node config.
2023-03-24 16:08:17 +04:00
Petr Gusev
c1d0ee2bce scylla_cluster.py: add expected_error to server_start
Sometimes it's useful to check that the node has failed
to start for a particular reason. If server_start can't
find expected_error in the node's log or if the
node has started without errors, it throws an exception.
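
The check described above could look roughly like this. It is a sketch, not the code in scylla_cluster.py: `check_start_outcome` and `ToyStartError` are hypothetical names, and the log is modelled as a plain string.

```python
class ToyStartError(Exception):
    """Raised when the start outcome does not match the expectation."""


def check_start_outcome(started_ok, log_text, expected_error=None):
    if expected_error is None:
        # Ordinary start: any failure is an error.
        if not started_ok:
            raise ToyStartError("node failed to start")
        return
    # A failure was requested, so a clean start is itself an error.
    if started_ok:
        raise ToyStartError(
            f"node started, but expected error {expected_error!r}"
        )
    # The node failed, but it must have failed for the expected reason.
    if expected_error not in log_text:
        raise ToyStartError(
            f"expected error {expected_error!r} not found in the log"
        )
```

With this shape, a test that expects a particular boot failure passes only when the node fails and the expected text is present in its log.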
2023-03-24 16:08:11 +04:00
Petr Gusev
a4411e9ec4 scylla_cluster.py: ScyllaServer.start, refactor error reporting
Extract the function that encapsulates all the error
reporting logic. We are going to use it in several
other places to implement the expected_error feature.
2023-03-24 15:54:52 +04:00
Petr Gusev
21b505e67c scylla_cluster.py: fix ScyllaServer.start, reset cmd if start failed
The ScyllaServer expects cmd to be None if the
Scylla process is not running. Otherwise, if start failed
and the test called update_config, the latter will
try to send a signal to a non-existent process via cmd.
2023-03-24 15:54:52 +04:00
Petr Gusev
5a5d664a5a init_system_keyspace: refactoring towards explicit load phases
We aim (#12642) to use the schema commit log
for raft tables. Now they are loaded at
the first call to init_system_keyspace in
main.cc, but the schema commitlog is only
initialized shortly before the second
call. This is important, since the schema
commitlog initialization
(database::before_schema_keyspace_init)
needs to access schema commitlog feature,
which is loaded from system.scylla_local
and therefore is only available after the
first init_system_keyspace call.

So the idea is to defer the loading of the raft tables
until the second call to init_system_keyspace,
just as it works for schema tables.
For this we need a tool to mark which tables
should be loaded in the first or second phase.

To do this, in this patch we introduce system_table_load_phase
enum. It's set in the schema_static_props for schema tables.
It replaces the system_keyspace::table_selector in the
signature of init_system_keyspace.

The call site for populate_keyspace in init_system_keyspace
was changed, table_selector.contains_keyspace was replaced with
db.local().has_keyspace. This check prevents calling
populate_keyspace(system_schema) on phase1, but allows for
populate_keyspace(system) on phase2 (to init raft tables).
On this second call some tables from system keyspace
(e.g. system.local) may have already been populated on phase1.
This check protects from double-populating them, since every
populated cf is marked as ready_for_writes.
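
The two-phase idea can be illustrated with a toy model. Everything here is an assumption made for illustration: `LoadPhase`, `ToyTable` and this `init_system_keyspace` are simplified Python stand-ins, not the real C++ types, and the second table name is a placeholder.

```python
from enum import Enum


class LoadPhase(Enum):
    PHASE1 = 1
    PHASE2 = 2


class ToyTable:
    def __init__(self, name, phase):
        self.name = name
        self.phase = phase
        self.loaded = False
        self.ready_for_writes = False
        self.populate_count = 0


def init_system_keyspace(tables, phase):
    # Load only the tables tagged with the requested phase.
    for table in tables:
        if table.phase is phase:
            table.loaded = True
    # Populate every loaded table once; ready_for_writes guards against
    # populating a phase-1 table again during phase 2.
    for table in tables:
        if table.loaded and not table.ready_for_writes:
            table.populate_count += 1
            table.ready_for_writes = True


tables = [
    ToyTable("system.local", LoadPhase.PHASE1),
    ToyTable("raft table (illustrative)", LoadPhase.PHASE2),
]
init_system_keyspace(tables, LoadPhase.PHASE1)
loaded_after_phase1 = [t.name for t in tables if t.loaded]
init_system_keyspace(tables, LoadPhase.PHASE2)
populate_counts = [t.populate_count for t in tables]
```

After both calls each table has been populated exactly once, even though the second call walks over the phase-1 table again.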
2023-03-24 15:54:46 +04:00
Nadav Har'El
4fdcee8415 test/alternator: increase CQL connection timeout
This patch increases the connection timeout in the get_cql_cluster()
function in test/cql-pytest/run.py. This function is used to test
that Scylla came up, and also test/alternator/run uses it to set
up the authentication - which can only be done through CQL.

The Python driver has 2-second and 5-second default timeouts that should
have been more than enough for everybody (TM), but in #13239 we saw
that in one case it apparently wasn't enough. So to be extra safe,
let's increase the default connection-related timeouts to 60 seconds.

Note this change only affects the Scylla *boot* in the test/*/run
scripts, and it does not affect the actual tests - those have different
code to connect to Scylla (see cql_session() in test/cql-pytest/util.py),
and we already increased the timeouts there in #11289.

Fixes #13239

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #13291
2023-03-23 16:03:20 +02:00
Avi Kivity
afe6b0d8c9 Merge 'reader_concurrency_semaphore: add trace points for important events' from Botond Dénes
Currently we have no visibility into what happens to a read in the reader concurrency semaphore as far as tracing is concerned. This series fixes that, storing a trace state pointer on the reader permit and using it to add trace messages to important semaphore related events:
* admission decision
* execution (execution stage functionality)
* eviction

This allows for seeing if the read suffered any delay in the semaphore.

Example tracing (2 pages):
```
Tracing session: 8cc80d50-c72d-11ed-8427-14e21cc3ed56

 activity                                                                                                                                  | timestamp                  | source    | source_elapsed | client
-------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-----------+----------------+-----------
                                                                                                                        Execute CQL3 query | 2023-03-20 10:43:16.773000 | 127.0.0.1 |              0 | 127.0.0.1
                                                                                                             Parsing a statement [shard 0] | 2023-03-20 10:43:16.773754 | 127.0.0.1 |             -- | 127.0.0.1
                                                                                                          Processing a statement [shard 0] | 2023-03-20 10:43:16.773837 | 127.0.0.1 |             83 | 127.0.0.1
          Creating read executor for token -4911109968640856406 with all: {127.0.0.1} targets: {127.0.0.1} repair decision: NONE [shard 0] | 2023-03-20 10:43:16.773874 | 127.0.0.1 |            121 | 127.0.0.1
                                                                                                     read_data: querying locally [shard 0] | 2023-03-20 10:43:16.773877 | 127.0.0.1 |            123 | 127.0.0.1
                                      Start querying singular range {{-4911109968640856406, pk{000d73797374656d5f736368656d61}}} [shard 0] | 2023-03-20 10:43:16.773881 | 127.0.0.1 |            128 | 127.0.0.1
                                                                             [reader concurrency semaphore] admitted immediately [shard 0] | 2023-03-20 10:43:16.773884 | 127.0.0.1 |            130 | 127.0.0.1
                                                                                   [reader concurrency semaphore] executing read [shard 0] | 2023-03-20 10:43:16.773890 | 127.0.0.1 |            137 | 127.0.0.1
                  Querying cache for range {{-4911109968640856406, pk{000d73797374656d5f736368656d61}}} and slice {(-inf, +inf)} [shard 0] | 2023-03-20 10:43:16.773903 | 127.0.0.1 |            149 | 127.0.0.1
 Page stats: 1 partition(s), 0 static row(s) (0 live, 0 dead), 100 clustering row(s) (100 live, 0 dead) and 0 range tombstone(s) [shard 0] | 2023-03-20 10:43:16.774674 | 127.0.0.1 |            920 | 127.0.0.1
                                                                   Caching querier with key 5eff94d2-e47a-43b2-8e3a-2d80a9cc3b3e [shard 0] | 2023-03-20 10:43:16.774685 | 127.0.0.1 |            931 | 127.0.0.1
                                                                                                                Querying is done [shard 0] | 2023-03-20 10:43:16.774688 | 127.0.0.1 |            934 | 127.0.0.1
                                                                                            Done processing - preparing a result [shard 0] | 2023-03-20 10:43:16.774706 | 127.0.0.1 |            953 | 127.0.0.1
                                                                                                                          Request complete | 2023-03-20 10:43:16.774225 | 127.0.0.1 |           1225 | 127.0.0.1

Tracing session: 8d26f630-c72d-11ed-8427-14e21cc3ed56

 activity                                                                                                                                                | timestamp                  | source    | source_elapsed | client
---------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-----------+----------------+-----------
                                                                                                                                      Execute CQL3 query | 2023-03-20 10:43:17.395000 | 127.0.0.1 |              0 | 127.0.0.1
                                                                                                                           Parsing a statement [shard 0] | 2023-03-20 10:43:17.395498 | 127.0.0.1 |             -- | 127.0.0.1
                                                                                                                        Processing a statement [shard 0] | 2023-03-20 10:43:17.395558 | 127.0.0.1 |             60 | 127.0.0.1
                        Creating read executor for token -4911109968640856406 with all: {127.0.0.1} targets: {127.0.0.1} repair decision: NONE [shard 0] | 2023-03-20 10:43:17.395597 | 127.0.0.1 |             99 | 127.0.0.1
                                                                                                                   read_data: querying locally [shard 0] | 2023-03-20 10:43:17.395600 | 127.0.0.1 |            102 | 127.0.0.1
                                                    Start querying singular range {{-4911109968640856406, pk{000d73797374656d5f736368656d61}}} [shard 0] | 2023-03-20 10:43:17.395604 | 127.0.0.1 |            106 | 127.0.0.1
 Found cached querier for key 5eff94d2-e47a-43b2-8e3a-2d80a9cc3b3e and range(s) {{{-4911109968640856406, pk{000d73797374656d5f736368656d61}}}} [shard 0] | 2023-03-20 10:43:17.395610 | 127.0.0.1 |            112 | 127.0.0.1
                                                                                                                               Reusing querier [shard 0] | 2023-03-20 10:43:17.395614 | 127.0.0.1 |            116 | 127.0.0.1
                                                                                                 [reader concurrency semaphore] executing read [shard 0] | 2023-03-20 10:43:17.395622 | 127.0.0.1 |            125 | 127.0.0.1
                 Page stats: 1 partition(s), 0 static row(s) (0 live, 0 dead), 11 clustering row(s) (11 live, 0 dead) and 0 range tombstone(s) [shard 0] | 2023-03-20 10:43:17.395711 | 127.0.0.1 |            213 | 127.0.0.1
                                                                                                                              Querying is done [shard 0] | 2023-03-20 10:43:17.395718 | 127.0.0.1 |            221 | 127.0.0.1
                                                                                                          Done processing - preparing a result [shard 0] | 2023-03-20 10:43:17.395734 | 127.0.0.1 |            236 | 127.0.0.1
                                                                                                                                        Request complete | 2023-03-20 10:43:17.395276 | 127.0.0.1 |            276 | 127.0.0.1

```
Fixes: https://github.com/scylladb/scylladb/issues/12781

Closes #13255

* github.com:scylladb/scylladb:
  reader_concurrency_semaphore: add trace points for important events
  reader_permit: refresh trace_state on new pages
  reader_permit: keep trace_state pointer on permit
  test/perf/perf_collection: give more unique names to key comparators
2023-03-23 15:37:33 +02:00
Nadav Har'El
b5e61e1b83 test/cql-pytest, lwt: test for detection of contradicting batches
Cassandra detects when a batch has both an IF EXISTS and IF NOT EXISTS
on the same row, and complains this is not a useful request (after all,
it can never succeed, because the batch can only succeed if both conditions
are true, and that can't be if one checks IF EXISTS and the other
IF NOT EXISTS).

This patch adds a test, test_lwt_with_batch_conflict_1, which checks
that this case results in an error. It passes on Cassandra, but xfails
on Scylla which doesn't report an error in this case.

A second test, test_lwt_with_batch_conflict_2, shows that the detection
of the EXISTS / NOT EXISTS conflict is special, and other conflicts
such as having both "r=1" and "r=2" for the same row, are NOT detected
by Cassandra.

Refs #13011.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #13270
2023-03-23 13:35:21 +02:00
Kefu Chai
1197664f09 test: network_topology_strategy_test: silence warning
clang warns when the implicit conversion changes the precision of the
converted number. in this case, before being multiplied,
`std::numeric_limits<unsigned long>::max() >> 1` is implicitly
promoted to double so it can obtain the common type of double and
unsigned long. and the compiler warns:

```
/home/kefu/dev/scylladb/test/boost/network_topology_strategy_test.cc:129:84: error: implicit conversion from 'unsigned long' to 'double' changes value from 9223372036854775807 to 9223372036854775808 [-Werror,-Wimplicit-const-int-float-conversion]
    return static_cast<unsigned long>(d*(std::numeric_limits<unsigned long>::max() >> 1)) << 1;
                                       ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~
```
but

1. we don't really care about the precision here, we just want to map a
   double to a token represented by an int64_t
2. the maximum possible number being converted is less than
   9223372036854775807, which is the maximum number of int64_t, which
   is in general an alias of `long long`, not to mention that
   LONG_MAX is always 2147483647, after shifting right, the result
   would be 1073741823

so this is a false alarm. in order to silence it, we explicitly
cast the RHS of `*` operator to double.
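
The value change reported in the warning can be reproduced outside C++. The sketch below uses Python, whose `float` is an IEEE-754 double on mainstream platforms, and assumes a 64-bit `unsigned long`, which matches the numbers in the compiler output above:

```python
# std::numeric_limits<unsigned long>::max() for a 64-bit unsigned long (assumption)
ULONG_MAX = 2**64 - 1

operand = ULONG_MAX >> 1      # 9223372036854775807, the value before conversion
as_double = float(operand)    # nearest representable double
round_trip = int(as_double)   # 9223372036854775808, the value after conversion
```

The operand needs 63 significant bits while a double carries 53, so the conversion rounds up to the next power of two.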

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13221
2023-03-23 08:55:29 +02:00
Nadav Har'El
d1e6d9103a Merge 'api: reference httpd::* symbols like 'httpd::*'' from Kefu Chai
this change is a leftover of 063b3be8a7, which failed to include the changes in the header files.

it turns out we have `using namespace httpd;` in seastar's `request_parser.rl`, and we should not rely on this statement to expose the symbols in `seastar::httpd` to `seastar` namespace. in this change,

also, since `get_name()`, previously a non-static member function of `seastar_test`, is now a static member function, we need to update the tests which capture `this` for calling this function, so they don't capture `this` anymore.

Closes #13202

* github.com:scylladb/scylladb:
  test: drop unused captured variables
  Update seastar submodule
2023-03-22 18:16:15 +02:00
Kefu Chai
596ea6d439 test: drop unused captured variables
this should silence the warning like:
```
test/boost/multishard_mutation_query_test.cc:493:29: error: lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture]
    do_with_cql_env_thread([this] (cql_test_env& env) -> future<> {
                            ^~~~
test/boost/multishard_mutation_query_test.cc:577:29: error: lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture]
    do_with_cql_env_thread([this] (cql_test_env& env) -> future<> {
                            ^~~~
2 errors generated.
```

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-03-22 21:21:04 +08:00
Botond Dénes
156e5d346d reader_permit: keep trace_state pointer on permit
And propagate it down to where it is created. This will be used to add
trace points for semaphore related events, but this will come in the
next patches.
2023-03-22 04:58:01 -04:00
Botond Dénes
27a4c24522 test/perf/perf_collection: give more unique names to key comparators
perf.cc has two key comparators: key_compare and key_tri_compare. These
are very generic names; in fact key_compare directly clashes with a
comparator with the same name in types.hh. Avoid the clash by renaming
both of these to a more unique name.
2023-03-22 04:58:01 -04:00
Nadav Har'El
2038388268 cql-pytest: translate Cassandra's tests for multi-column relations
This is a translation of Cassandra's CQL unit test source file
validation/operations/SelectMultiColumnRelationTest.java into our
cql-pytest framework.

The tests reproduce four already-known Scylla bugs and three new bugs.
All tests pass on Cassandra. Because of these bugs 9 of the 22 tests
are marked xfail, and one is marked skip (it crashes Scylla).

Already known issues:

Refs    #64: CQL Multi column restrictions are allowed only on a clustering
             key prefix
Refs  #4178: Not covered corner case for key prefix optimization in filtering
Refs  #4244: Add support for mixing token, multi- and single-column
             restrictions
Refs  #8627: Cleanly reject updates with indexed values where value > 64k

New issues discovered by these tests:

Refs #13217: Internal server error when null is used in multi-column relation
Refs #13241: Multi-column IN restriction with tuples of different lengths
             crashes Scylla
Refs #13250: One-element multi-column restriction should be handled like a
             single-column restriction

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #13265
2023-03-22 09:54:32 +02:00
Kefu Chai
124410c059 api: reference httpd::* symbols like 'httpd::*'
this change is a leftover of 063b3be,
which failed to include the changes in the header files.

it turns out we have `using namespace httpd;` in seastar's
`request_parser.rl`, and we should not rely on this statement to
expose the symbols in `seastar::httpd` to `seastar` namespace.
in this change,

* api/*.hh: all httpd symbols are referenced by `httpd::*`
  instead of being referenced as if they are in `seastar`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-03-21 15:49:10 +02:00
Avi Kivity
e75009cd49 treewide: catch by reference
gcc rightly warns about catching by value, so catch by
reference.
2023-03-21 15:43:00 +02:00
Avi Kivity
eaad38c682 test: raft: avoid confusing string compare
gcc doesn't like comparing a C string to an sstring -- apparently
it has different promotion rules than clang. Fix by doing an
explicit conversion.
2023-03-21 15:43:00 +02:00
Avi Kivity
bdfc0aa748 utils, types, test: extract lexicographical compare utilities
UUID_test uses lexicographical_compare from the types module. This
is a layering violation, since UUIDs are at a much lower level than
the database type system. In practical terms, this causes link failures
with gcc due to some thread-local-storage variables defined in types.hh
but not provided by any object, since we don't link with types.o in this
test.

Fix by extracting the relevant functions into a new header.
2023-03-21 15:42:53 +02:00
Avi Kivity
32a724fada test: raft: fsm_test: disambiguate raft::configuration construction
gcc thinks the constructor call is ambiguous since "{}" can match
the default constructor. Fix by making the parameter type explicit.

Use "{}" for the constructor call to avoid the most-vexing-parse
problem.
2023-03-21 13:45:57 +02:00
Avi Kivity
83e149c341 test: reader_concurrency_semaphore_test: handle all enum values
gcc considers values outside the enum class enumeration lists to be
valid, so handle them. In this case, we don't think they can happen,
so abort.
2023-03-21 13:45:57 +02:00
Avi Kivity
7bb717d2f9 treewide: prevent redefining names
gcc dislikes a member name that matches a type name, as it changes
the type name retroactively. Fix by fully-qualifying the type name,
so it is not changed by the newly-introduced member.
2023-03-21 13:42:49 +02:00
Nadav Har'El
77bf90bf7d Merge 'Sanitize {format_types|version_types} to/from string converters' from Pavel Emelyanov
There's a need to convert both -- version and format -- to string and back. Currently, there's a dispersed set of helpers in sstables/ code doing that, and this PR brings some order to it

- adds fmt::formatter<> specialization for both types
- leaves one set of {format|version}_from_string() helpers converting any string-ish object into value

refs: #12523

Closes #13214

* github.com:scylladb/scylladb:
  sstables: Expel sstable_version_types from_string() helper
  sstables: Generalize ..._from_string helpers
  sstables: Implement fmt::formatter<sstable_format_types>
  sstables: Implement fmt::formatter<sstable_version_types>
  sstables: Move format maps to namespace scope
2023-03-21 13:39:24 +02:00
Avi Kivity
0770b328c7 test: fix some mismatched signed/unsigned comparisons
gcc likes to complain about signed/unsigned compares as they
can yield surprising results. The fixes are trivial, so apply
them.
2023-03-21 13:15:12 +02:00
Pavel Emelyanov
8600cb2db0 feature_service: Move async context into enable()
Callers don't need to know that enabling features has this requirement.
Indentation is deliberately left broken (until the next patch).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-03-21 11:59:34 +03:00
Pavel Emelyanov
fe7609865d Merge 'reader_concurrency_semaphore: improve diagnostics printout' from Botond Dénes
Remove redundant "Total: ..." line.
Include the entire `reader_concurrency_semaphore::stats` in the printout. This includes a lot of metrics not exported to monitoring. These metrics are very valuable when debugging timeouts but are otherwise uninteresting. To avoid bloating our monitoring with such niche metrics, we dump them when they are interesting: when timeouts happen. To be really helpful, we do need historic values too, but this shouldn't be a problem: timeouts come in bursts, we usually get at least a handful of diagnostics dumps at a time.
New stats are also added to record the reason why reads are queued on the semaphore.

Printout before:
```
INFO  2023-03-14 12:43:54,496 [shard 0] reader_concurrency_semaphore - Semaphore test_reader_concurrency_semaphore_memory_limit_no_leaks with 4/4 count and 7168/4096 memory resources: kill limit triggered, dumping permit diagnostics:
permits count   memory  table/description/state
4       4       7K      *.*/reader/active/unused
2       0       0B      *.*/reader/waiting_for_admission

6       4       7K      total

Total: 6 permits with 4 count and 7K memory resources
```

Printout after:
```
INFO  2023-03-16 04:23:41,791 [shard 0] reader_concurrency_semaphore - Semaphore test_reader_concurrency_semaphore_memory_limit_no_leaks with 3/4 count and 7168/4096 memory resources: kill limit triggered, dumping permit diagnostics:
permits count   memory  table/description/state
2       2       6K      *.*/reader/active/unused
1       1       1K      *.*/reader/waiting_for_memory
2       0       0B      *.*/reader/waiting_for_admission

5       3       7K      total

Stats:
permit_based_evictions: 0
time_based_evictions: 0
inactive_reads: 0
total_successful_reads: 0
total_failed_reads: 0
total_reads_shed_due_to_overload: 0
total_reads_killed_due_to_kill_limit: 1
reads_admitted: 4
reads_enqueued_for_admission: 4
reads_enqueued_for_memory: 5
reads_admitted_immediately: 2
reads_queued_because_ready_list: 0
reads_queued_because_used_permits: 0
reads_queued_because_memory_resources: 0
reads_queued_because_count_resources: 4
reads_queued_with_eviction: 0
total_permits: 6
current_permits: 5
used_permits: 0
blocked_permits: 0
disk_reads: 0
sstables_read: 0
```

Closes #13173

* github.com:scylladb/scylladb:
  test/boost/reader_concurrency_semaphore_test: remove redundant stats printouts
  reader_concurrency_semaphore: do_dump_reader_permit_diagnostics(): print the stats
  reader_concurrency_semaphore: add stats to record reason for queueing permits
  reader_concurrency_semaphore: can_admit_read(): also return reason for rejection
2023-03-21 10:41:11 +03:00
Pavel Emelyanov
eecb9244dd sstables: Expell sstable_version_types from_string() helper
Its name is too generic despite its narrow specialization. Also,
there's a version_from_string() method that does the same in a more
convenient way.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-03-21 09:56:18 +03:00
Pavel Emelyanov
6b04eb74d6 sstables: Implement fmt::formatter<sstable_version_types>
This way the version type can be fed as-is into fmt:: code, and
the conversion to string is as simple as fmt::to_string(v). Also drop
the existing explicit to_string() helper and update all callers.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-03-21 09:56:18 +03:00
Nadav Har'El
511308bccf test/cql-pytest: tests for single-element multi-column restrictions
It turns out that Cassandra handles a restriction like `(c2) = (1)` just
like `c2 = 1`, and is not limited like multi-column restrictions. In
particular, this query works despite missing "c1", and may also use an
index if c2 is indexed.

But currently in Scylla, `(c2) = (1)` is handled like a multi-column
restriction, so complains if c2 is not the first clustering key column,
and cannot use an index.

This patch adds several tests demonstrating this difference between
Scylla and Cassandra (#13250). The xfailing tests pass on Cassandra
but fail on Scylla.

Refs #13250

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #13252
2023-03-21 07:56:24 +02:00
Kefu Chai
d146535ec6 mutation: use fmtlib to print range_stombstone{_change,}
Prepare for removing `operator<<(std::ostream&, const range_tombstone&)` and
`operator<<(std::ostream& out, const range_tombstone_change&)`.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-03-21 11:37:07 +08:00
Gleb Natapov
f017aa1ad3 service: raft: pass storage service to group0_state_machine
To apply topology_change commands, group0_state_machine needs access
to the storage service to support topology changes over raft.

Message-Id: <20230316112801.1004602-10-gleb@scylladb.com>
2023-03-20 11:45:57 +01:00
Gleb Natapov
2fc8e13dd8 raft: add server::wait_for_state_change() function
Add a function that allows waiting for a state change of a raft server.
It is useful for a user that wants to know when a node becomes/stops
being a leader.

Message-Id: <20230316112801.1004602-4-gleb@scylladb.com>
2023-03-20 11:31:55 +01:00
Nadav Har'El
c550e681d7 test/rest_api: fix flaky test for toppartitions
The REST test test_storage_service.py::test_toppartitions_pk_needs_escaping
was flaky. It tests the toppartition request, which unfortunately needs
to choose a sampling duration in advance. We chose 1 second, which we
considered more than enough (indeed, typically even 1ms is enough), but
very rarely (we know of only one occurrence, in issue #13223) one
second is not enough.

Instead of increasing this 1 second and making this test even slower,
this patch takes a retry approach: the test starts with a 0.01-second
duration, and is then retried with increasing durations until it succeeds
or a 5-second duration is reached. This retry approach has two benefits:
1. It de-flakes the test (allowing a very slow test to take 5 seconds
instead of the 1 second which wasn't enough), and 2. At the same time it
makes a successful test much faster (it used to always take a full
second, now it takes 0.07 seconds on a dev build on my laptop).

A *failed* test may, in some cases, take 10 seconds after this patch
(although in some other cases, an error will be caught immediately),
but I consider this acceptable - this test should pass, after all,
and a failure indicates a regression, so taking 10 seconds will be
the least of our worries in that case.

Fixes #13223.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #13238
2023-03-20 11:32:53 +02:00
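The retry approach described in c550e681d7 can be sketched generically as below. The actual test is written in Python; this C++ helper, its name, and the doubling factor are assumptions for illustration only (the commit states just the 0.01-second start and the 5-second ceiling):

```cpp
#include <algorithm>
#include <functional>
#include <optional>

namespace demo {   // hypothetical helper, for illustration only

// Calls attempt(duration) with growing durations until it reports success
// or the limit has been tried. Returns the duration that worked, if any.
inline std::optional<double> retry_with_growing_duration(
        const std::function<bool(double)>& attempt,
        double first = 0.01, double limit = 5.0, double factor = 2.0) {
    double duration = first;
    while (true) {
        if (attempt(duration)) {
            return duration;       // succeeded with this sampling duration
        }
        if (duration >= limit) {
            return std::nullopt;   // even the longest duration was not enough
        }
        duration = std::min(duration * factor, limit);
    }
}

} // namespace demo
```

Starting small keeps the common successful case fast, while the cap bounds how long a genuinely failing run can take.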
Avi Kivity
bab29a2f27 Merge 'Unit tests cleanup for sstable generation changes' from Benny Halevy
This series cleans up unit tests in preparation for PR #12994.
Helpers are added (or reused) to not rely on specific sstable generation numbers where possible (other than loading reference sstables that are committed to the repo with given generation numbers), and to generate the sstables for tests easily, taking advantage of generation management in `sstable_test_env`, `table_for_tests`, or `replica::table` itself.

Closes #13242

* github.com:scylladb/scylladb:
  test: add verify_mutation helpers.
  test: add make_sstable_containing memtable
  test: table_for_tests: add make_sstable function
  test: sstable_test_env: add make_sst_factory methods
  test: sstable_compaction_test: do not rely on specific generations
  tests: use make_sstable defaults as much as possible
  test: sstable_test_env: add make_table_for_tests
  test: sstable_datafile_test: do not rely on sepecific sstable generations
  test: sstable_test_env: add reusable_sst(shared_sstable)
  sstable: expose get_storage function
  test: mutation_reader_test: create_sstable: do not rely on specific generations
  test: mutation_reader_test: do_test_clustering_order_merger_sstable_set: rely on test_envsstable generation
  test: mutation_reader_test: combined_mutation_reader_test: define a local sst_factory function
  test: mutation_reader_test: do not use tmpdir
  test: use big format by default
  test: sstable_compaction_test: use highest sstable version by default
  test: test_env: make_db_config: set cfg host_id
  test: sstable_datafile_test: fixup indentation
  test: sstable_datafile_test: various tests: do_with_async
  test: sstable_3_x_test: validate_read, sstable_assertions: get shared_sstable
  test: sstable_3_x_test: compare_sstables: get shared_sstable
  test: sstable_3_x_test: write_sstables: return shared_sstable
  test: sstable_3_x_test: write, compare, validate_sstables: use env.tempdir
  test: sstable_3_x_test: compacted_sstable_reader: do not reopen compacted_sst
  test: lib: test_services: delete now unused stop_and_keep_alive
  test: sstable_compaction_test: use deferred_stop to stop table_for_tests
  test: sstable_compaction_test: compound_sstable_set_incremental_selector_test: do_with_async
  test: sstable_compaction_test: sstable_needs_cleanup_test: do_with_async
  test: sstable_compaction_test: leveled_05: fixup indentation
  test: sstable_compaction_test: leveled_05: do_with_async
  test: sstable_compaction_test: compact_02: do_with_async
  test: sstable_compaction_test: compact_sstables: simplify variable allocation
  test: sstable_compaction_test: compact_sstables: reindent
  test: sstable_compaction_test: compact_sstables: use thread
  test: sstable_compaction_test: sstable_rewrite: simplify variable allocation
  test: sstable_compaction_test: sstable_rewrite: fixup indentation
  test: sstable_compaction_test: sstable_rewrite: do_with_async
  test: sstable_compaction_test: compact: fixup indentation
  test: sstable_compaction_test: compact: complete conversion to async thread
  test: sstable_compaction_test: compaction_manager_basic_test: rename generations to idx
2023-03-20 11:16:46 +02:00
Nadav Har'El
8b0822be77 test/cql-pytest: reproducer for bug crashing Scylla on mismatched tuple
This patch adds a reproducing test for issue #13241, where attempting
a SELECT restriction (b,c,d) IN ((1,2)) - where the tuple is shorter
than needed - crashes Scylla (with a segmentation fault) instead of
generating a clean error as it should (and as is done on Cassandra).

The test also demonstrates that if the tuple is longer than needed
(instead of shorter), the behavior is correct, and it is also
correct if "=" is used instead of IN. Only the combination of IN
and a too-short tuple seems to be broken - but broken in a bad way
(it can be used to crash Scylla).

Because the test crashes Scylla when it fails, it is marked "skip".

Refs #13241

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #13244
2023-03-20 11:13:02 +02:00