We have in test_filtering.py two tests which fail when running on an old
version of the Python driver which has a specific bug, so we skip those
tests if the buggy driver is installed.
But the code to check the driver version is duplicated, so in this
patch we move the version-checking-and-skipping code into a fixture,
which we can use in both places.
The motivation is that in the next patch we will want to introduce a
third use of the same code - and a fixture is cleaner than a third
duplicate.
This patch is supposed to be code-movement only, without functional
changes.
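The shape of such a fixture can be sketched roughly as follows (the helper
names and the version cutoff here are invented for illustration, not the
actual values used in test_filtering.py):

```python
# Hypothetical sketch of a version-checking-and-skipping fixture.
# The cutoff (3, 25, 5) and all names are illustrative assumptions.
import pytest


def parse_version(version_string):
    # "3.25.0" -> (3, 25, 0)
    return tuple(int(part) for part in version_string.split(".")[:3])


def driver_has_filtering_bug(version_string):
    # Assume, for illustration, the bug is fixed in driver 3.25.5.
    return parse_version(version_string) < (3, 25, 5)


@pytest.fixture(scope="module")
def skip_on_buggy_driver():
    # Skip the test if the installed Python driver has the bug.
    cassandra = pytest.importorskip("cassandra")
    if driver_has_filtering_bug(cassandra.__version__):
        pytest.skip("installed Python driver version has this bug")
```

A test then simply takes `skip_on_buggy_driver` as a parameter, and the
skip logic lives in one place.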
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
When a memtable receives a tombstone, it can happen under some workloads
that it covers data which is still in the memtable. Some workloads
insert and delete data within a short time frame. We could reduce the
rate of memtable flushes if we eagerly drop tombstoned data.
One workload which benefits is the raft log. It stores a row for each
uncommitted raft entry. When entries are committed they are
deleted. So the live set is expected to be short under normal
conditions.
Fixes #652.
This patch prevents virtual dirty from going negative during memtable
flush in case partition version merging erases data previously
accounted for by the flush reader. There is an assert in
~flush_memory_accounter which guards against this.
This will start happening after tombstones are compacted with rows on
partition version merging.
This problem is prevented by the patch by having the cleaner notify
the memtable layer via callback about the amount of dirty memory released
during merging, so that the memtable layer can adjust its accounting.
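The accounting interplay can be modeled in a few lines of Python (the real
code is C++ in Scylla's memtable layer; all names here are invented):

```python
# Toy model of the accounting fix: the cleaner notifies the memtable about
# dirty memory it released during partition-version merging, so the flush
# reader's accounting never drives virtual dirty below zero.

class MemtableAccounting:
    def __init__(self):
        self.virtual_dirty = 0

    def add_dirty(self, nbytes):
        # Writes make memory dirty.
        self.virtual_dirty += nbytes

    def on_cleaner_released(self, nbytes):
        # Callback from the cleaner: merging erased data that the flush
        # reader would otherwise have accounted for.
        self.virtual_dirty -= nbytes
        assert self.virtual_dirty >= 0

    def flush_accounted(self, nbytes):
        # The flush reader accounts data as it is written out; the assert
        # models the one in ~flush_memory_accounter.
        self.virtual_dirty -= nbytes
        assert self.virtual_dirty >= 0
```

Without the `on_cleaner_released` adjustment, the flush reader would try to
account the full original amount and the balance would go negative.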
Memtables and cache will compact eagerly, so tests should not expect
readers to produce the exact mutations written, only ones which are
equivalent after applying compaction.
The compacting reader created using make_compacting_reader() was not
dropping range_tombstone_change fragments which were shadowed by the
partition tombstones. As a result the output fragment stream was not
minimal.
Lack of this change would cause problems in unit tests later in the
series after the change which makes memtables lazily compact partition
versions. In test_reverse_reader_reads_in_native_reverse_order we
compare output of two readers, and assume that compacted streams are
the same. If the compacting reader doesn't produce minimal output, then
the streams could differ if one of them went through compaction in
the memtable (which is minimal).
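What "minimal output" means here can be illustrated with a small Python
model (invented names; the real code is the C++ compacting reader): a
range_tombstone_change covered by the partition tombstone is equivalent to
"no tombstone", and consecutive changes to the same effective state are
redundant.

```python
# Illustrative model: drop range tombstone changes shadowed by the
# partition tombstone, and collapse consecutive equal effective states.

def compact_rtcs(partition_tombstone_ts, rtcs):
    """rtcs: list of (position, timestamp-or-None) pairs."""
    out = []
    current = None  # effective range tombstone currently in effect
    for pos, ts in rtcs:
        # A tombstone not newer than the partition tombstone is shadowed.
        effective = ts if (ts is not None and ts > partition_tombstone_ts) else None
        if effective != current:
            out.append((pos, effective))
            current = effective
    return out
```

Two streams carrying the same data compact to the same minimal sequence
under this rule, which is exactly what the test comparison relies on.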
Prerequisite for eagerly applying tombstones, which we want to be
preemptible. Before this patch, the apply path to the memtable was not
preemptible.
Because merging can now be deferred, we need to involve snapshots to
kick off background merging in case of preemption. This requires us to
propagate the region and cleaner objects in order to create a snapshot.
Before the change, the test artificially set the soft pressure
condition, hoping that the background flusher would flush the
memtable. That won't happen if, by the time the background flusher runs,
the LSA region is updated and the soft pressure (which is not really
there) is lifted. Once apply() becomes preemptible, background partition
version merging can lift the soft pressure, making the memtable flush
not occur and making the test fail.
Fix by triggering soft pressure on retries.
- Introduce a simpler substitute for `flat_mutation_reader`-resulting-from-a-downgrade that is adequate for the remaining uses but is _not_ a full-fledged reader (does not redirect all logic to an `::impl`, does not buffer, does not really have `::peek()`), so hopefully carries a smaller performance overhead. The name `mutation_fragment_v1_stream` is kind of a mouthful but it's the best I have
- (not tests) Use the above instead of `downgrade_to_v1()`
- Plug it in as another option in `mutation_source`, in and out
- (tests) Substitute deliberate uses of `downgrade_to_v1()` with `mutation_fragment_v1_stream()`
- (tests) Replace all the previously-overlooked occurrences of `mutation_source::make_reader()` with `mutation_source::make_reader_v2()`, or with `mutation_source::make_fragment_v1_stream()` where deliberate or still required (see below)
- (tests) This series still leaves some tests with `mutation_fragment_v1_stream` (i.e. at v1) where not called for by the test logic per se, because another missing piece of work is figuring out how to properly feed `mutation_fragment_v2` (i.e. range tombstone changes) to `mutation_partition`. While that is not done (and I think it's better to punt on it in this PR), we have to produce `mutation_fragment` instances in tests that `apply()` them to `mutation_partition`, thus we still use downgraded readers in those tests
- Remove the `flat_mutation_reader` class and things downstream of it
Fixes #10586
Closes #10654
* github.com:scylladb/scylla:
fix "ninja dev-headers"
flat_mutation_reader ist tot
tests: downgrade_to_v1() -> mutation_fragment_v1_stream()
tests: flat_reader_assertions: refactor out match_compacted_mutation()
tests: ms.make_reader() -> ms.make_fragment_v1_stream()
repair/row_level: mutation_fragment_v1_stream() instead of downgrade_to_v1()
stream_transfer_task: mutation_fragment_v1_stream() instead of downgrade_to_v1()
sstables_loader: mutation_fragment_v1_stream() instead of downgrade_to_v1()
mutation_source: add ::make_fragment_v1_stream()
introduce mutation_fragment_v1_stream
tests: ms.make_reader() -> ms.make_reader_v2()
tests: remove test_downgrade_to_v1_clear_buffer()
mutation_source_test: fix indentation
tests: remove some redundant calls to downgrade_to_v1()
tests: remove some to-become-pointless ms.make_reader()-using tests
tests: remove some to-become-pointless reader downgrade tests
The test sometimes fails because the order of rows in the SELECT results
depends on how stream IDs for the different partition keys get generated.
In some runs the stream ID for pk=1 may go before the stream ID for
pk=4, in some runs the other way.
The fix is to use the same partition key but different clustering keys
for the different rows.
Refs: #10601
Closes #10718
The planned, more limited replacement of the downgraded v1 mutation reader
will not do its own buffering, so this test would be pointless.
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
mutation_source objects are going to be created only from v2 readers, and
the ::make_reader() method family is scheduled for removal.
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
Fixes #10489
Killing the CDC log table on CDC disable is unhelpful in many ways,
partly because it can cause random exceptions on nodes trying to
do a CDC-enabled write at the same time as the log table is dropped,
but also because it makes it impossible to collect data generated
before CDC was turned off but not yet consumed.
Since the data should be TTL'd anyway, retaining the table should not
really add any overhead beyond the compaction needed to eventually clear
it. And if the user set TTL=0 (disabled), they are already responsible
for clearing out the data.
This also has the nice feature of meshing with the alternator streams
semantics.
Closes #10601
The change
- adds a test which exposes a problem with a peculiar setup of
tombstones that triggers a mutation fragment stream validation exception
- fixes the problem
Applying tombstones in the order:
range_tombstone_change pos(ck1), after_all_prefixed, tombstone_timestamp=1
range_tombstone_change pos(ck2), before_all_prefixed, tombstone=NONE
range_tombstone_change pos(NONE), after_all_prefixed, tombstone=NONE
Leads to the order of mutations being swapped when they are written to
and read back from disk via the sstable writer. This is caused by the
conversion of range_tombstone_change (the in-memory representation) to a
range tombstone marker (the on-disk representation) and back.
When this mutation stream is written to disk, the range tombstone
marker's type is calculated based on the relationship between
range_tombstone_changes. The RTC series above produces markers
(start, end, start). When the last marker is loaded from disk, its kind
gets incorrectly decoded as before_all_prefixed instead of
after_all_prefixed. This leads to an incorrect order of mutations.
The solution is to skip writing a new range_tombstone_change with an empty
tombstone if the last range_tombstone_change already has an empty
tombstone. This is redundant information and can be safely removed,
while the logic of encoding RTCs as markers doesn't handle such
redundancy well.
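Rendered in Python (the actual change is in C++ stream-compaction code),
the rule is simply:

```python
# Skip emitting a range_tombstone_change with an empty tombstone when the
# previously emitted change already had an empty tombstone.

def drop_redundant_rtcs(rtcs):
    """rtcs: list of (position, tombstone-or-None) pairs."""
    out = []
    for pos, tomb in rtcs:
        if tomb is None and out and out[-1][1] is None:
            # Redundant "no tombstone" change; the RTC -> start/end marker
            # encoding mishandles such sequences, so drop it.
            continue
        out.append((pos, tomb))
    return out
```

Applied to the series above, the final empty change at pos(NONE) is the one
that gets dropped, so the marker stream round-trips correctly.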
Closes #10643
I noticed that `column_condition` (used in LWT `IF` clause) supports lists.
As part of the Grand Expression Unification we'll need to migrate that to
expressions, so we'll need to support list subscripts.
Use the opportunity to relax the normal filtering to allow filtering on
list subscripts: `WHERE my_list[:index] = :value`.
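The semantics of such a filter can be modeled in Python (the real
evaluation lives in cql3/expr, in C++; treating an out-of-range subscript
as null, which never matches an equality filter, is an assumption of this
sketch):

```python
# Hedged model of: WHERE my_list[:index] = :value

def matches_subscript_filter(row_list, index, value):
    # A null list or an out-of-range subscript evaluates to null,
    # which never satisfies an equality filter (assumption).
    if row_list is None or not (0 <= index < len(row_list)):
        return False
    return row_list[index] == value


rows = [{"pk": 1, "my_list": [10, 20, 30]},
        {"pk": 2, "my_list": [10, 99]}]
# Filter: WHERE my_list[1] = 99
hits = [r["pk"] for r in rows if matches_subscript_filter(r["my_list"], 1, 99)]
```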
Closes #10645
* github.com:scylladb/scylla:
test: cql-pytest: add test for list subscript filtering
doc: document list subscripts usable in WHERE clause
cql3: expr: drop restrictions on list subscripts
cql3: expr: prepare_expr: support subscripted lists
cql3: expressions: reindent get_value()
cql3: expression: evaluate() support subscripting lists
coroutine::parallel_for_each avoids an allocation and is therefore preferred. The lifetime
of the function object is less ambiguous, and so it is safer. Replace all eligible
occurrences (i.e. where the caller is a coroutine).
One case (storage_service::node_ops_cmd_heartbeat_updater()) needed a little extra
attention since there was a handle_exception() continuation attached. It is converted
to a try/catch.
Closes #10699
We modify the `reconfigure` and `modify_config` APIs to take a vector of
<server_id, bool> pairs (instead of just a vector of server_ids), where
the bool indicates whether the server is a voter in the modified config.
The `reconfiguration` operation would previously shuffle the set of
servers and split it into two parts: members and non-members. Now it
partitions it into three parts: voters, non-voters, and non-members.
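The three-way split can be sketched in Python (the real harness is C++;
all names here are invented for illustration):

```python
# Shuffle the servers, then partition into voters, non-voters and
# non-members; reconfigure() now takes (server_id, is_voter) pairs.
import random


def pick_new_config(all_servers, rng):
    servers = list(all_servers)
    rng.shuffle(servers)
    member_count = rng.randint(1, len(servers))  # at least one member
    voter_count = rng.randint(1, member_count)   # at least one voter
    config = [(s, True) for s in servers[:voter_count]]                 # voters
    config += [(s, False) for s in servers[voter_count:member_count]]   # non-voters
    non_members = servers[member_count:]
    return config, non_members
```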
The PR also includes fixes for some liveness problems stumbled upon
during testing.
Closes #10640
* github.com:scylladb/scylla:
test: raft: randomized_nemesis_test: include non-voters during reconfigurations
raft: server: if `add_entry` with `wait_type::applied` successfully returns, ensure `state_machine::apply` is called for this entry
raft: tracker: fix the definition of `voters()`
raft: when printing `raft::server_address`, include `can_vote`
Infer the type of a list index as int32_type.
The error message when a non-subscriptable type is provided is
changed, so the corresponding test is changed too.
This small series implements the DescribeTimeToLive and
DescribeContinuousBackups operations in Alternator. Even though the
corresponding features aren't implemented, it can help applications if
we implement just the Describe operations, which can report that these
features are in fact currently disabled.
Fixes #10660
Closes #10670
* github.com:scylladb/scylla:
alternator: remove dead code
alternator: implement DescribeContinuousBackups operation
alternator: allow DescribeTimeToLive even without TTL enabled
This patch contains five tests which reproduce three old bugs in
Scylla's handling of multi-column restrictions like (c1,c2)<(1,2).
These old bugs are:
Refs #64 (yes, a two-digit issue!)
Refs #4244
Refs #6200
The three github issues are closely intertwined, exposing the same
or similar bugs in our internal implementation, and I suspect that
eventually most of them could be fixed together.
In writing these tests, I carefully read all three issues and the
various failure scenarios described in them, tried to distill and
simplify the scenarios, and also consider various other broken
variants of the scenarios. The resulting tests are heavily commented,
explaining the motivation of each test and exactly which of the
above bugs it reproduces.
All five tests included in this patch pass on Cassandra and currently
fail on Scylla, so are marked "xfail".
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #10675
"
The table keeps references on the sstables and compaction managers
(among other things), but the sstables manager sits as a pointer on the
table's config while the compaction manager is a direct on-table
reference.
This set unifies both by turning the sstables manager's on-config
pointer into an on-table reference.
branch: https://github.com/xemul/scylla/tree/br-table-vs-sstables-manager
tests: https://jenkins.scylladb.com/job/releng/job/Scylla-CI/574/
"
* 'br-table-vs-sstables-manager' of https://github.com/xemul/scylla:
tests: Remove sstables_manager& from column_family_test_config()
table: Move sstables_manager from config onto table itself
table, db, tests: Pass sstables_manager& into table constructor
In issues #7944 and #10625 it was noticed that assigning an empty
string to a non-string type (int, date, etc.) using INSERT or
INSERT JSON can, in some combinations, create "empty" values
where it should produce a clear error.
The tests added in this patch explore the different combinations of
types and insert modes, and reproduce several buggy cases in Scylla
(resulting in xfail'ing tests) as well as Cassandra.
We feared that there might be a way using those buggy statements to
create a partition with an empty key - something which used to kill
older versions of Scylla. But the tests show that this is not possible -
while a user can use the buggy statements to create an empty value,
Scylla refuses it when it is used as a single-column partition key.
Refs #10625
Refs #7944
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #10628
The purpose of this series is to introduce infrastructure
for managed scylla processes into test.py,
switch some existing suites to use test.py managed processes
and introduce cluster tests.
All of this is expected to make it possible to test Raft topology
changes and schema changes using an easy-to-use and fast tool
such as test.py. In general this will allow testing Scylla clusters
from within the development test harness.
Branch URL: kostja/test.py.v5
Closes #10406
* github.com:scylladb/scylla:
test: disable topology/test_null
test.py: disable cdc_with_lwt_test it's flaky in debug mode
test.py: workaround for a python bug
test: cleanup (drop keyspace) in two cql tests to support --repeat
test.py: respect --verbose even if output is a tty
test: remove tools/cql_repl
test.py: switch cql/ suite to pytest/tabular output
test: remove a flaky test case
test.py: implement CQL approval tests over pytest
test.py: implement cql_repl
test.py: add topology suite
test.py: add common utility functions to test/pylib
test.py: switch cql-pytest and rest_api suites to PythonTestSuite
test.py: introduce PythonTest and PythonTestSuite
test.py: use artifact registry
test.py: temporarily disable raft
test.py: (pylib) add Scylla Server and Artifact Registry
test.py: (pylib) add Host Registry to track used server hosts
test.py: (pylib) add a pool of scylla servers (or clusters)
The manager reference is already available in the constructor and thus
can be copied into an on-table member.
The code that chooses the manager (user/system one) should be moved
from make_column_family_config() into the add_column_family() method.
Once this happens, get_sstables_manager() should be fixed to
return the reference from its new location. While at it, mark the
method in question noexcept and add its mutable overload.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
In core code there's only one place that constructs a table -- in
database.cc -- and this place currently has the sstables_manager pointer
sitting on the table's config (despite being a pointer, it's always
non-null).
All the tests use the manager from one of the _env's out there.
For now the new constructor arg is unused.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
We modify the `reconfigure` and `modify_config` APIs to take a vector of
<server_id, bool> pairs (instead of just a vector of server_ids), where
the bool indicates whether the server is a voter in the modified config.
The `reconfiguration` operation would previously shuffle the set of
servers and split it into two parts: members and non-members. Now it
partitions it into three parts: voters, non-voters, and non-members.
This patch adds reproducing tests for wrong handling of LIMIT in a query
which uses a secondary index *and* filtering, described in issue #10649.
In that case, Scylla incorrectly limits the number of rows found in the
index *before* the filtering, while it should limit the number of rows
*after* the filtering.
The tests in this patch (which xfail on Scylla, and pass on Cassandra)
go beyond the minimum required to reproduce this bug. It turns out that
there are different sub-cases of this problem that go through different
code paths, namely whether the base table has clustering keys or just
partition keys, and whether the overall LIMITed result spans more than
one page. So these tests attempt to also cover all these sub-cases.
Without all these test sub-cases, an incomplete and incorrect fix of this
bug may, by chance, cause the original test to succeed.
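The essence of the bug is easy to state as a toy Python model (the real
paths are C++ query-execution code):

```python
# LIMIT must bound the rows that survive filtering, not the rows
# fetched from the secondary index.

def query_correct(index_rows, predicate, limit):
    # LIMIT applied after filtering (the desired behaviour).
    return [r for r in index_rows if predicate(r)][:limit]


def query_buggy(index_rows, predicate, limit):
    # LIMIT applied before filtering (the behaviour described in #10649):
    # the index scan is cut off at `limit` rows, then filtered.
    return [r for r in index_rows[:limit] if predicate(r)]
```

With a selective filter, the buggy version can return far fewer rows than
the requested LIMIT even when enough matching rows exist.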
Refs #10649
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #10658
Introduce a new operation, `raft_read`, which calls `read_barrier`
on a server, reads the state of the server's state machine, and returns
that state.
Extend the generator in `basic_generator_test` to generate `raft_read`s.
Only do it if forwarding is enabled (although it may make sense to test
read barriers in a non-forwarding scenario as well; we may think about it
and do it in a follow-up).
Check the consistency of the read results by comparing them with the model
and using the result to extend the model with any newly observed elements.
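The check-and-extend step can be sketched in Python (invented names; a
simplified view of the actual C++ test logic):

```python
# After a read_barrier, a linearizable read must observe at least
# everything the model already knows was committed; anything extra it
# returns is folded back into the model.

def check_and_extend(model_state, read_result):
    observed = set(read_result)
    # The read may not miss elements the model has already proven written.
    assert set(model_state) <= observed, "linearizability violation"
    return observed  # new model state, extended with newly observed elements
```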
The patchset includes some fixes for correctness (#10578)
and liveness (handling aborts correctly).
Closes #10561
* github.com:scylladb/scylla:
test: raft: randomized_nemesis_test: check consistency of reads
test: raft: randomized_nemesis_test: perform linearizable reads using read_barriers
test: raft: randomized_nemesis_test: add flags for disabling nemeses
raft: server: in `abort()`, abort read barriers before waiting for rpc abort
raft: server: handle aborts correctly in `read_barrier`
raft: fsm: don't advance commit index further than match_idx during read_quorum
Although we don't yet support the DynamoDB API's backup features (see
issue #5063), we can already implement the DescribeContinuousBackups
operation. It should just say that continuous backups, and point-in-time
restores, are disabled.
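Following the DynamoDB response shape for this operation, the handler can
return what amounts to a static document (a Python sketch, not Alternator's
actual C++ handler):

```python
# Sketch of a DescribeContinuousBackups response reporting the feature
# as disabled, in the DynamoDB wire format.
import json


def describe_continuous_backups():
    return {
        "ContinuousBackupsDescription": {
            "ContinuousBackupsStatus": "DISABLED",
            # Point-in-time restore is likewise reported as disabled.
            "PointInTimeRecoveryDescription": {
                "PointInTimeRecoveryStatus": "DISABLED",
            },
        }
    }
```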
This will be useful for client code which tries to inquire about
continuous backups, even if not planning to use them in practice
(e.g., see issue #10660).
Refs #5063
Refs #10660
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Scylla has a bug that fires only once in a hundred runs in debug mode,
when a schema change running in parallel with a topology change leads to
a lost keyspace and an internal error. Disable the tests until Raft is
enabled for schema.
When we append an entry to a list with the same user-defined
timestamp, the behaviour is actually undefined. If the append
is processed by the same coordinator as the one that accepted
the existing entry, then it gets the same timeuuid as the list key,
and (potentially) replaces the existing list value. Otherwise it
gets a timeuuid which may be either larger or smaller than the existing
key's timeuuid, and then turns into either an append or a prepend.
The part of the timestamp responsible for the result is the shard
id's spoof node address, implemented in the scope of fixing Scylla's
timeuuid uniqueness. When the test was implemented, all spoof node ids
were 0 on all shards and all coordinators. Later the difference
in behaviour was dormant because cql_repl would always execute
the append on the same shard.
We could fix Scylla to use a zero spoof node address when a user
timestamp is supplied, but the purpose of doing so is unclear; it
may actually be contrary to the user's intent.
Implement a pytest which runs CQL commands against
a Scylla server and pretty-prints the server output.
It will be used in the existing approval tests in subsequent patches.
Manage Scylla servers for the rest_api and cql-pytest suites
using PythonTestSuite. The pool size determines the max
number of servers test.py will run concurrently per
suite. For tiny suites (rest_api) the cost of starting
the servers outweighs the cost of running the tests, so keep
it at a minimum. cql-pytest has dozens of tests, so run them
in 4 parallel tracks.