Commit Graph

3158 Commits

Author SHA1 Message Date
Calle Wilund
adda43edc7 CDC - do not remove log table on CDC disable
Fixes #10489

Killing the CDC log table on CDC disable is unhelpful in many ways,
partly because it can cause random exceptions on nodes trying to
do a CDC-enabled write at the same time as log table is dropped,
but also because it makes it impossible to collect data generated
before CDC was turned off, but which is not yet consumed.

Since data should be TTL:ed anyway, retaining the table should not
really add any overhead beyond the compaction to eventually clear
it. And user did set TTL=0 (disabled), then he is already responsible
for clearing out the data.

This also has the nice feature of meshing with the alternator streams
semantics.

Closes #10601
2022-05-31 19:07:07 +03:00
Mikołaj Sielużycki
bc18e97473 sstable_writer: Fix mutation order violation
The change
- adds a test which exposes a problem of a peculiar setup of
tombstones that trigger a mutation fragment stream validation exception
- fixes the problem

Applying tombstones in the order:

range_tombstone_change pos(ck1), after_all_prefixed, tombstone_timestamp=1
range_tombstone_change pos(ck2), before_all_prefixed, tombstone=NONE
range_tombstone_change pos(NONE), after_all_prefixed, tombstone=NONE

Leads to swapping the order of mutations when written and read from
disk via sstable writer. This is caused by conversion of
range_tombstone_change (in memory representation) to range tombstone
marker (on disk representation) and back.

When this mutation stream is written to disk, the range tombstone
markers type is calculated based on the relationship between
range_tombstone_changes. The RTC series as above produces markers
(start, end, start). When the last marker is loaded from disk, it's kind
gets incorrectly loaded as before_all_prefixed instead of
after_all_prefixed. This leads to incorrect order of mutations.

The solution is to skip writing a new range_tombstone_change with empty
tombstone if the last range_tombstone_change already has empty
tombstone. This is redundant information and can be safely removed,
while the logic of encoding RTCs as markers doesn't handle such
redundancy well.

Closes #10643
2022-05-31 13:39:48 +03:00
Konstantin Osipov
94a192a7aa Revert "test.py: temporarily disable raft"
This reverts commit 26128a222b.

The issue the commit depends on is fixed, so enable raft back.

Closes #10694
2022-05-31 14:39:26 +03:00
Piotr Sarna
7169e021e5 Merge 'cql3: support list subscripts in WHERE clause' from Avi Kivity
I noticed that `column_condition` (used in LWT `IF` clause) supports lists.
As part of the Grand Expression Unification we'll need to migrate that to
expressions, so we'll need to support list subscripts.

Use the opportunity to relax the normal filtering to allow filtering on
list subscripts: `WHERE my_list[:index] = :value`.

Closes #10645

* github.com:scylladb/scylla:
  test: cql-pytest: add test for list subscript filtering
  doc: document list subscripts usable in WHERE clause
  cql3: expr: drop restrictions on list subscripts
  cql3: expr: prepare_expr: support subscripted lists
  cql3: expressions: reindent get_value()
  cql3: expression: evaluate() support subscripting lists
2022-05-31 09:28:52 +02:00
Avi Kivity
4b53af0bd5 treewide: replace parallel_for_each with coroutine::parallel_for_each in coroutines
coroutine::parallel_for_each avoids an allocation and is therefore preferred. The lifetime
of the function object is less ambiguous, and so it is safer. Replace all eligible
occurences (i.e. caller is a coroutine).

One case (storage_service::node_ops_cmd_heartbeat_updater()) needed a little extra
attention since there was a handle_exception() continuation attached. It is converted
to a try/catch.

Closes #10699
2022-05-31 09:06:24 +03:00
Avi Kivity
248cdf0e34 test: cql-pytest: add test for list subscript filtering
Test match and mismatch, as well as out of bound cases.
2022-05-30 20:47:47 +03:00
Kamil Braun
78f81171ba Merge 'raft: test non-voters in randomized_nemesis_test' from Kamil Braun
We modify the `reconfigure` and `modify_config` APIs to take a vector of
<server_id, bool> pairs (instead of just a vector of server_ids), where
the bool indicates whether the server is a voter in the modified config.

The `reconfiguration` operation would previously shuffle the set of
servers and split it into two parts: members and non-members. Now it
partitions it into three parts: voters, non-voters, and non-members.

The PR also includes fixes for some liveness problems stumbled upon
during testing.

Closes #10640

* github.com:scylladb/scylla:
  test: raft: randomized_nemesis_test: include non-voters during reconfigurations
  raft: server: if `add_entry` with `wait_type::applied` successfully returns, ensure `state_machine::apply` is called for this entry
  raft: tracker: fix the definition of `voters()`
  raft: when printing `raft::server_address`, include `can_vote`
2022-05-30 15:06:35 +02:00
Avi Kivity
35e0474410 cql3: expr: prepare_expr: support subscripted lists
Infer the type of a list index as int32_type.

The error message when a non-subscriptable type is provided is
changed, so the corresponding test is changed too.
2022-05-30 13:29:49 +03:00
Piotr Sarna
5d59c841d0 Merge 'alternator: add Describe operations even if a feature...
is not supported' from Nadav Har'El

This small series implements the DescribeTimeToLive and
DescribeContinuousBackups operations in Alternator. Even if the
corresponding features aren't implemented, it can help applications that
we implement just the Describe operation that can say that this feature
is in fact currently disabled.

Fixes  #10660

Closes #10670

* github.com:scylladb/scylla:
  alternator: remove dead code
  alternator: implement DescribeContinuousBackups operation
  alternator: allow DescribeTimeToLive even without TTL enabled
2022-05-30 09:26:13 +02:00
Nadav Har'El
63132431e8 test/cql-pytest: reproducers for three scan bugs
This patch contains five tests which reproduce three old bugs in
Scylla's handling of multi-column restrictions like (c1,c2)<(1,2).

These old bugs are:

    Refs #64 (yes, a two-digit issue!)
    Refs #4244
    Refs #6200

The three github issues are closely intertwined, exposing the same
or similar bugs in our internal implementation, and I suspect that
eventually most of them could be fixed together.

In writing these tests, I carefully read all three issues and the
various failure scenarios described in them, tried to distill and
simplify the scenarios, and also consider various other broken
variants of the scenarios. The resulting tests are heavily commented,
explaining the motivation of each test and exactly which of the
above bugs it reproduces.

All five tests included in this patch pass on Cassandra and currently
fail on Scylla, so are marked "xfail".

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #10675
2022-05-30 09:25:00 +02:00
Avi Kivity
6bcbe3157a Merge "Turn table::config::sstables_manager* into table::sstables_manager&" from Pavel E
"
The table keeps references on sstables_ and compaction_ managers
(among other things), but the latter sits as a pointer on table's
config while the former -- as on-table direct reference.

This set unifies both by turning sstables manager on-config pointer
into on-table reference.

branch: https://github.com/xemul/scylla/tree/br-table-vs-sstables-manager
tests: https://jenkins.scylladb.com/job/releng/job/Scylla-CI/574/
"

* 'br-table-vs-sstables-manager' of https://github.com/xemul/scylla:
  tests: Remove sstables_manager& from column_family_test_config()
  table: Move sstables_manager from config onto table itself
  table, db, tests: Pass sstables_manager& into table constructor
2022-05-29 13:02:50 +03:00
Nadav Har'El
52362de3df test/cql-pytest: tests for assigning an empty string to non-string
In issues #7944 and #10625 it was noticed that by assigning an empty
string to a non-string type (int, date, etc.) using INSERT or
INSERT JSON, some combinations of the above can create "empty" values
while they should produce a clear error.

The tests added in this patch explore the different combinations of
types and insert modes, and reproduce several buggy cases in Scylla
(resulting in xfail'ing tests) as well as Cassandra.

We feared that there might be a way using those buggy statements to
create a partition with an empty key - something which used to kill
older versions of Scylla. But the tests show that this is not possible -
while a user can use the buggy statements to create an empty value,
Scylla refuses it when it is used as a single-column partition key.

Refs #10625
Refs #7944

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #10628
2022-05-27 16:37:01 +02:00
Michał Radwański
906cee7052 treewide: remove unqualified calls to std::move
clang 15 emits such a warning:

cql3/statements/raw/parsed_statement.cc:46:16: error: unqualified call to 'std::move' [-Werror,-Wunqualified-std-cast-call]
    , warnings(move(warnings))
               ^
               std::
cql3/statements/raw/parsed_statement.cc:52:101: error: unqualified call to 'std::move' [-Werror,-Wunqualified-std-cast-call]
    : prepared_statement(statement_, ctx.get_variable_specifications(), partition_key_bind_indices, move(warnings))
                                                                                                    ^
                                                                                                    std::

Closes #10656
2022-05-27 16:36:49 +02:00
Tomasz Grabiec
e9fbc0b6c5 Merge 'test.py: add cluster and new approval tests' from Konstantin Osipov
The purpose of this series is to introduce infrastructure
for managed scylla processes into test.py,
switch some existing suites to use test.py managed processes
and introduce cluster tests.

All of this is expected to make possible to test Raft topology
changes and schema changes using an easy to use and fast tool
such as test.py. In general this will allow testing Scylla clusters
from within the development test harness.

Branch URL: kostja/test.py.v5

Closes #10406

* github.com:scylladb/scylla:
  test: disable topology/test_null
  test.py: disable cdc_with_lwt_test it's flaky in debug mode
  test.py: workaround for a python bug
  test: cleanup (drop keyspace) in two cql tests to support --repeat
  test.py: respect --verbose even if output is a tty
  test: remove tools/cql_repl
  test.py: switch cql/ suite to pytest/tabular output
  test: remove a flaky test case
  test.py: implement CQL approval tests over pytest
  test.py: implement cql_repl
  test.py: add topology suite
  test.py: add common utility functions to test/pylib
  test.py: switch cql-pytest and rest_api suites to PythonTestSuite
  test.py: introduce PythonTest and PythonTestSuite
  test.py: use artifact registry
  test.py: temporarily disable raft
  test.py: (pylib) add Scylla Server and Artifact Registry
  test.py: (pylib) add Host Registry to track used server hosts
  test.py: (pylib) add a pool of scylla servers (or clusters)
2022-05-27 16:36:08 +02:00
Pavel Emelyanov
3dab0bfc8d tests: Remove sstables_manager& from column_family_test_config()
It's unused arg in there after last patch.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-05-27 16:37:21 +03:00
Pavel Emelyanov
490bf65e11 table: Move sstables_manager from config onto table itself
The manager reference is already available in constructor and thus
can be copied to on-table member.

The code that chooses the manager (user/system one) should be moved
from make_column_family_config() into add_column_family() method.

Once this happens, the get_sstables_manager() should be fixed to
return the reference from its new location. While at it -- mark the
method in question noexcept and add it's mutable overload.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-05-27 16:37:21 +03:00
Pavel Emelyanov
50e6810536 table, db, tests: Pass sstables_manager& into table constructor
In core code there's only one place that constructs table -- in
database.cc -- and this place currently has the sstables_manager pointer
sitting on table config (despite it's a pointer, it's always non-null).

All the tests always use the manager from one of _env's out there.

For now the new contructor arg is unused.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-05-27 16:27:44 +03:00
Kamil Braun
2dafd99e3b test: raft: randomized_nemesis_test: include non-voters during reconfigurations
We modify the `reconfigure` and `modify_config` APIs to take a vector of
<server_id, bool> pairs (instead of just a vector of server_ids), where
the bool indicates whether the server is a voter in the modified config.

The `reconfiguration` operation would previously shuffle the set of
servers and split it into two parts: members and non-members. Now it
partitions it into three parts: voters, non-voters, and non-members.
2022-05-27 12:06:18 +02:00
Nadav Har'El
e363febeb1 test/cql-pytest: reproducer for bug in index+filter+limit
This patch adds reproducing tests for wrong handling of LIMIT in a query
which uses a secondary index *and* filtering, described in issue #10649.

In that case, Scylla incorrectly limits the number of rows found in the
index *before* the filtering, while it should limit the number of rows
*after* the filtering.

The tests in this patch (which xfail on Scylla, and pass on Cassandra)
go beyond the minimum required to reproduce this bug. It turns out that
there are different sub-cases of this problem that go through different
code paths, namely whether the base table has clustering keys or just
partition keys, and whether the overall LIMITed result spans more than
one page. So these tests attempt to also cover all these sub-cases.
Without all these test sub-cases, an incomplete and incorrect fix of this
bug may, by chance, cause the original test to succeed.

Refs #10649

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #10658
2022-05-26 16:46:45 +02:00
Tomasz Grabiec
edd447ed32 Merge 'raft: test read_barriers in randomized_nemesis_test' from Kamil Braun
Introduce a new operation, `raft_read`, which calls `read_barrier`
on a server, reads the state of the server's state machine, and returns
that state.

Extend the generator in `basic_generator_test` to generate `raft_read`s.
Only do it if forwarding is enabled (although it may make sense to test
read barriers in non-forwarding scenario as well - we may think about it
and do it in a follow-up).

Check the consistency of the read results by comparing them with the model
and using the result to extend the model with any newly observed elements.

The patchset includes some fixes for correctness (#10578)
and liveness (handling aborts correctly).

Closes #10561

* github.com:scylladb/scylla:
  test: raft: randomized_nemesis_test: check consistency of reads
  test: raft: randomized_nemesis_test: perform linearizable reads using read_barriers
  test: raft: randomized_nemesis_test: add flags for disabling nemeses
  raft: server: in `abort()`, abort read barriers before waiting for rpc abort
  raft: server: handle aborts correctly in `read_barrier`
  raft: fsm: don't advance commit index further than match_idx during read_quorum
2022-05-26 16:46:35 +02:00
Nadav Har'El
d0ca09a925 alternator: implement DescribeContinuousBackups operation
Although we don't yet support the DynamoDB API's backup features (see
issue #5063), we can already implement the DescribeContinuousBackups
operation. It should just say that continuous backups, and point-in-time
restores, and disabled.

This will be useful for client code which tries to inquire about
continuous backups, even if not planning to use them in practice
(e.g., see issue #10660).

Refs #5063
Refs #10660

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-05-26 15:13:50 +03:00
Konstantin Osipov
7b3f9bb5fa test: disable topology/test_null
Scylla has a bug that only fires ones in a hundred runs in debug mode
when a schema change parallel to a topology change leads to a lost
keyspace and internal error. Disable the tests until Raft is enabled for
schema.
2022-05-26 14:09:58 +03:00
Konstantin Osipov
5e67e48f8b test.py: disable cdc_with_lwt_test it's flaky in debug mode
The test is flaky in debug mode, see issue #10661 for details.
2022-05-26 09:23:13 +03:00
Konstantin Osipov
180cc0fc4d test: cleanup (drop keyspace) in two cql tests to support --repeat 2022-05-25 20:26:42 +03:00
Konstantin Osipov
9512e076af test: remove tools/cql_repl
Remove tools/cql_repl from the source, build targets and
use in test.py. Superseded by ApprovalTest and
test/pylib/cql_repl/cql_repl.py.
2022-05-25 20:26:42 +03:00
Konstantin Osipov
b82c8447d7 test.py: switch cql/ suite to pytest/tabular output 2022-05-25 20:26:42 +03:00
Konstantin Osipov
c43736c53b test: remove a flaky test case
When we append an entry to a list with the same user-defined
timestamp, the behaviour is actually undefined. If the append
is processed by the same coordinator as the one that accepted
the existing entry, then it gets the same timeuuid as the list key,
and replaces (potentially) the existing list valiue. Then it
gets a timeuuid which maybe both larger and smaller than the existing
key's timeuuid, and then turns either to an append or a prepend.

The part of the timestamp responsible for the result is the shard
id's spoof node address implemented in scope of fixing Scylla's
timeuuid uniqueness. When the test was implemented all spoof node ids
where 0 on all shards and all coordinators. Later the difference
in behaviour was dormant because cql_repl would always execute
the append on the same shard.

We could fix Scylla to use a zero spoof node address in case a user
timestamp is supplied, but the purpose of this is unclear, it
may actually be to the contrary of the user's intent.
2022-05-25 20:26:42 +03:00
Konstantin Osipov
c119087719 test.py: implement cql_repl
Implement a pytest which would run CQL commands against
a scylla server and pretty print server output.

Will be used in existing Approval tests in subsequent patches.
2022-05-25 20:26:42 +03:00
Konstantin Osipov
5b9262f567 test.py: add topology suite 2022-05-25 20:26:42 +03:00
Konstantin Osipov
26fa9336d1 test.py: add common utility functions to test/pylib 2022-05-25 20:26:42 +03:00
Konstantin Osipov
1955f9168a test.py: switch cql-pytest and rest_api suites to PythonTestSuite
Manage scylla servers for rest_api and cql-pytest suites
using PythonTestSuite. The pool size determines the max
number of servers test.py would run concurrently per
suite. For tiny suites (rest_api) the cost of starting
the servers overweights the cost of running tests so keep
it at a minimum. cql-pytest cas dozens of tests, so run them
in 4 parallel tracks.
2022-05-25 20:26:42 +03:00
Konstantin Osipov
26128a222b test.py: temporarily disable raft
Raft boot sporadically hangs in the master due to
issue #10355.
2022-05-25 20:26:42 +03:00
Konstantin Osipov
7c1c83320c test.py: (pylib) add Scylla Server and Artifact Registry
Allow starting clusters of Scylla servers. Chain up the next
server start to the end of the previous one, and set the next
server's seed to the previous server.

As a workaround for a race between token dissemination through
gossip and streaming, change schema version to force a gossip
round and make sure all tokens end up at the joining node in time.

Make sure scylla start is not race prone.

auth::standard_role_manager creates "cassandra" role in an async loop
auth::do_after_system_ready(), which retries role creation with an
exponential back-off. In other words, even after CQL port is up, Scylla
may still be initializing.

This race condition could lead to spurious errors during cluster
bootstrap or during a test under CI.

When the role is ready, queries begin to work, so rely on this "side
effect".

To start or stop servers, use a new class, ScyllaCluster,
which encapsulates multiple servers united into a cluster.
In it, validate that a test case cleans up after itself.
Additionally, swallow startup errors and throw them when
the test is actually used.
2022-05-25 20:26:37 +03:00
Kamil Braun
6268c63739 test: raft: randomized_nemesis_test: check consistency of reads
The test would perform `read_barrier`s but not check the correctness
of the reads: whether the state observed by a read is consistent with
the model and recent enough (in short, check linearizability).

This commit adds the correctness checks.
2022-05-25 15:00:19 +02:00
Kamil Braun
6b2b400143 test: raft: randomized_nemesis_test: perform linearizable reads using read_barriers
Introduce a new operation, `raft_read`, which calls `read_barrier`
on a server, reads the state of the server's state machine, and returns
that state.

Extend the generator in `basic_generator_test` to generate `raft_read`s.
Only do it if forwarding is enabled (although it may make sense to test
read barriers in non-forwarding scenario as well - we may think about it
and do it in a follow-up).

For now, we don't check the consistency of the results of the reads.
They do return the observed state, but we don't compare it yet with the
model. For now we simply issue the reads concurrently with other
operations to introduce some more chaos to the cluster and check
liveness and consistency of existing operations.
2022-05-25 15:00:19 +02:00
Kamil Braun
4ea5807862 test: raft: randomized_nemesis_test: add flags for disabling nemeses
Makes it easier to debug stuff.
2022-05-25 15:00:16 +02:00
Konstantin Osipov
1b5472f0be test.py: (pylib) add Host Registry to track used server hosts 2022-05-25 14:59:01 +03:00
Konstantin Osipov
2781c4fc66 test.py: (pylib) add a pool of scylla servers (or clusters) 2022-05-25 14:59:01 +03:00
Avi Kivity
19316dee94 Merge 'auto-scale promoted index' from Benny Halevy
Add column_index_auto_scale_threshold_in_kb to the configuration (defaults to 10MB).

When the promoted index (serialized) size gets to this
threshold, it's halved by merging each two adjacent blocks
into one and doubling the desired_block_size.

Fixes #4217

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #10646

* github.com:scylladb/scylla:
  sstables: mx: add pi_auto_scale_events metric
  sstables: mx/writer: auto-scale promoted index
2022-05-25 10:27:19 +03:00
Tomasz Grabiec
f87274f66a sstable: partition_index_cache: Fix abort on bad_alloc during page loading
When entry loading fails and there is another request blocked on the
same page, attempt to erase the failed entry will abort because that
would violate entry_ptr guarantees, which is supposed to keep the
entry alive.

The fix in 92727ac36c was incomplete. It
only helped for the case of a single loader. This patch makes a more
general approach by relaxing the assert.

The assert manifested like this:

scylla: ./sstables/partition_index_cache.hh:71: sstables::partition_index_cache::entry::~entry(): Assertion `!is_referenced()' failed.

Fixes #10617

Closes #10653
2022-05-25 09:27:04 +03:00
Kamil Braun
e8f9bca288 Merge 'raft: test modify_config API in randomized_nemesis_test' from Kamil Braun
Extend the reconfiguration nemesis to send `modify_config` requests as
well as `reconfigure` requests. It chooses one or the other with
probability 1/2.

Fix a bunch of problems that surfaced during testing.

Closes #10544

* github.com:scylladb/scylla:
  test: raft: randomized_nemesis_test: send `modify_config` requests in reconfiguration nemsesis
  test: raft: randomized_nemesis_test: fix `rpc` reply ID generation
  test: raft: randomized_nemesis_test: during bouncing call, allow a leader to reroute to itself
  test: raft: randomized_nemesis_test: handle timed_out_error from modify_config
  service: raft: rpc: don't call `execute...` functions after `abort()`
  raft: server: fix bad_variant_access in `modify_config`
2022-05-24 16:47:31 +02:00
Benny Halevy
6677028212 sstables: mx/writer: auto-scale promoted index
Add column_index_auto_scale_threshold_in_kb to the configuration
(defaults to 10MB).

When the promoted index (serialized) size gets to this
threshold, it's halved by merging each two adjacent blocks
into one and doubling the desired_block_size.

Fixes #4217

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-05-24 13:32:35 +03:00
Kamil Braun
700a1fdd20 test: raft: randomized_nemesis_test: send modify_config requests in reconfiguration nemsesis
Extend the reconfiguration nemesis to send `modify_config` requests as
well as `reconfigure` requests. It chooses one or the other with
probability 1/2.
2022-05-24 11:39:08 +02:00
Kamil Braun
2222f095b3 test: raft: randomized_nemesis_test: fix rpc reply ID generation
When `rpc` wants to perform a two-way RPC call it sends a message
containing a `reply_id`. The other side will send the `reply_id` back
when answering, so the original side can match the response to the promise
corresponding to the future being waited on by the RPC caller.

Previously each instance of `rpc` generated reply IDs independently as
increasing integers starting from 0. The network delivers messages
based on Raft server IDs. A response message may thus be delievered not
to the original instance which invoked the RPC, but to a new instance
which uses the same Raft server ID (after we simulated a server
crash/stop and restart, creating a new server with the same ID that
reuses the previous instance's `persistence` instance but has a new `rpc`).
The new instance could have started a new RPC call using the same
`reply_id` as one currently being in-flight that was started by the
previous instance. The new instance could then receive and handle a
response that was intended for the previous instance, leading to weird
bugs.

Fix this by replacing the local reply ID counters by a global counter so
that every two-way RPC call gets a unique reply ID.
2022-05-24 11:39:08 +02:00
Kamil Braun
b9807f07e6 test: raft: randomized_nemesis_test: during bouncing call, allow a leader to reroute to itself
A server executing a `modify_config` call, even if it initially was a
leader and accepted the request, may end up throwing a `not_a_leader`
error, rerouting the caller to a new leader - but this new leader may be
that same server. This happens because `execute_modify_config`
translates certain errors that it considers transient (such as
`conf_change_in_progress`) into `not_a_leader{last_known_leader}`,
in attempt to notify the caller that they should retry the request; but
when this translation happens, the `last_known_leader` may be that same
server (it could have even lost leadership and then regained it back
while the request was being handled).

This is not strictly an error, and it should be safe for the client to
retry the request by sending it to the same server. The nemesis test
assumed that a server never returns `not_a_leader{itself}`; this commit
drops the assumption.

An alternative solution would be to extend the error types that are now
translated to `not_a_leader` so they include information about the last
known leader. This way the client does not lose information about the
original error and still gets a potential contact point for retry.
2022-05-24 11:36:51 +02:00
Kamil Braun
b33bc7a5d6 test: raft: randomized_nemesis_test: handle timed_out_error from modify_config
May be propagated from `rpc::send_modify_config` to the caller of
`modify_config`.
2022-05-24 11:36:51 +02:00
Nadav Har'El
dc5c9321fe test/cql-pytest: have new_test_table() recycle table names
Scylla has a long-standing bug (issue #7620) where having many
tombstones in the schema table significantly slows down further
schema operations.

Many cql-pytest tests use new_test_table() to create a temporary test
table with a specific schema. Before this patch, each temporary table
was created with a random name, and deleted after the test. When
running many tests on the same Scylla server, this results in a lot
of tombstones in the schema tables, and really slow schema operations.
For example, look at home much time it takes to run the same test file
N times:

$ test/cql-pytest/run --count N test_filtering.py

 N=25 -  16 seconds (total time for the N repetitions)
 N=50 -  41 seconds
N=100 - 122 seconds

Notice how progressively slower each repetition is becoming - the
total test time should have been linear in N, but it isn't!

In this patch, we keep a cache of already-deleted table names (not the
tables, just their names!) so as to reuse the same name when we can
instead of inventing a new random name. With this patch, the performance
improvement after some repetitions is amazing (compare to the table above):

 N=25 - 14 seconds
 N=50 - 29 seconds
N=100 - 46 seconds

Note how the testing time is now more-or-less linear in the number of
repetitions, as expected.

The table-name recycling trick is the same trick I already used in the
past for the translated Cassandra tests (test/cql-pytest/cassandra_tests).
The problem was even more obvious there because those tests create a
lot of different tables. But the same problem also exists in cql-pytest
in general, so let's solve it here too.

Refs #7620

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #10635
2022-05-24 11:32:25 +03:00
Avi Kivity
21728dff6f Merge 'cql: Remove support for null inside lists of IN values' from Jan Ciołek
Currently we support queries like:
```cql
SELECT * FROM ks.tab WHERE p IN (1, 2, null, 4);
```
Nothing can be equal to null so this is equivalent to:
```cql
SELECT * FROM ks.tab WHERE p IN (1, 2, 4);
```

Cassandra doesn't support it at all.
```cql
> SELECT * FROM ks.tab WHERE p IN (1, 2, null, 4)
Error: DbError(Invalid, "Invalid null value in condition for column p")

> SELECT * FROM ks.tab WHERE p IN (1, 2, ?, 4) # ? is NULL
Error: DbError(Invalid, "Invalid null value in condition for column p")

> SELECT * FROM ks.tab WHERE p IN ? # ? is (1, 2, null, 4)
Error: DbError(Invalid, "Invalid null value in condition for column p")
```
It makes little sense to send a null inside list of IN values and supporting it is a bit cumbersome.

Supporting it causes trouble because internally the values are represented as a list, not a tuple, and lists can't contain nulls.
Because of that code requires exceptions because in this single case there can be a null inside of a collection.

This PR starts treating a llist of IN values the same as any other list and as result nulls are forbidden inside them.

In case of a null the message is the same as any other collection:
```
null is not supported inside collections
```
I'm not entirely happy about it - someone could be confused if they received this message after a query that didn't involve any collections.
The problem with making a prettier error message is that once again we would have to give `evaluate` additional information that it's now evaluating a list of IN values. And we would end up back with `evaluate_IN_list`

I think we could consider adding some kind of generic context to evaluate. The context would contain the whole expression and a mark on the part that we are currently evaluating. Then in case of error we could use this context and use it to create more helpful error messages, e.g. point to the part of the expression where a problem occured. But that's outside of the scope of this PR.

Fixes #10579

Closes #10620

* github.com:scylladb/scylla:
  cql: Add test for null in IN list
  cql: Forbid null in lists of IN values
2022-05-24 09:15:13 +03:00
Jan Ciolek
ff3205cf19 cql: Add test for null in IN list
Added tests that check for error
when null or unset appears in a list
of IN values.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-05-24 00:17:54 +02:00
Jan Ciolek
f9b1fc0b69 cql: Forbid null in lists of IN values
We used to allow nulls in lists of IN values,
i.e. a query like this would be valid:
SELECT * FROM tab WHERE pk IN (1, null, 2);

This is an old feature that isn't really used
and is already forbidden in Cassandra.

Additionally the current implementation
doesn't allow for nulls inside the list
if it's sent as a bound value.
So something like:
SELECT * FROM tab WHERE pk IN ?;
would throw an error if ? was (1, null, 2).
This is inconsistent.

Allowing it made writing code cumbersome because
this was the only case where having a null
inside of a collection was allowed.
Because of it there needed to be
separate code paths to handle regular lists
and lists of NULL values.

Forbidding it makes the code nicer and consistent
at the cost of a feature that isn't really
important.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-05-24 00:17:41 +02:00