Commit Graph

3127 Commits

Author SHA1 Message Date
Konstantin Osipov
c43736c53b test: remove a flaky test case
When we append an entry to a list with the same user-defined
timestamp, the behaviour is actually undefined. If the append
is processed by the same coordinator as the one that accepted
the existing entry, then it gets the same timeuuid as the list key,
and replaces (potentially) the existing list valiue. Then it
gets a timeuuid which maybe both larger and smaller than the existing
key's timeuuid, and then turns either to an append or a prepend.

The part of the timestamp responsible for the result is the shard
id's spoof node address implemented in scope of fixing Scylla's
timeuuid uniqueness. When the test was implemented all spoof node ids
where 0 on all shards and all coordinators. Later the difference
in behaviour was dormant because cql_repl would always execute
the append on the same shard.

We could fix Scylla to use a zero spoof node address in case a user
timestamp is supplied, but the purpose of this is unclear, it
may actually be to the contrary of the user's intent.
2022-05-25 20:26:42 +03:00
Konstantin Osipov
c119087719 test.py: implement cql_repl
Implement a pytest which would run CQL commands against
a scylla server and pretty print server output.

Will be used in existing Approval tests in subsequent patches.
2022-05-25 20:26:42 +03:00
Konstantin Osipov
5b9262f567 test.py: add topology suite 2022-05-25 20:26:42 +03:00
Konstantin Osipov
26fa9336d1 test.py: add common utility functions to test/pylib 2022-05-25 20:26:42 +03:00
Konstantin Osipov
1955f9168a test.py: switch cql-pytest and rest_api suites to PythonTestSuite
Manage scylla servers for rest_api and cql-pytest suites
using PythonTestSuite. The pool size determines the max
number of servers test.py would run concurrently per
suite. For tiny suites (rest_api) the cost of starting
the servers overweights the cost of running tests so keep
it at a minimum. cql-pytest cas dozens of tests, so run them
in 4 parallel tracks.
2022-05-25 20:26:42 +03:00
Konstantin Osipov
26128a222b test.py: temporarily disable raft
Raft boot sporadically hangs in the master due to
issue #10355.
2022-05-25 20:26:42 +03:00
Konstantin Osipov
7c1c83320c test.py: (pylib) add Scylla Server and Artifact Registry
Allow starting clusters of Scylla servers. Chain up the next
server start to the end of the previous one, and set the next
server's seed to the previous server.

As a workaround for a race between token dissemination through
gossip and streaming, change schema version to force a gossip
round and make sure all tokens end up at the joining node in time.

Make sure scylla start is not race prone.

auth::standard_role_manager creates "cassandra" role in an async loop
auth::do_after_system_ready(), which retries role creation with an
exponential back-off. In other words, even after CQL port is up, Scylla
may still be initializing.

This race condition could lead to spurious errors during cluster
bootstrap or during a test under CI.

When the role is ready, queries begin to work, so rely on this "side
effect".

To start or stop servers, use a new class, ScyllaCluster,
which encapsulates multiple servers united into a cluster.
In it, validate that a test case cleans up after itself.
Additionally, swallow startup errors and throw them when
the test is actually used.
2022-05-25 20:26:37 +03:00
Konstantin Osipov
1b5472f0be test.py: (pylib) add Host Registry to track used server hosts 2022-05-25 14:59:01 +03:00
Konstantin Osipov
2781c4fc66 test.py: (pylib) add a pool of scylla servers (or clusters) 2022-05-25 14:59:01 +03:00
Tomasz Grabiec
f87274f66a sstable: partition_index_cache: Fix abort on bad_alloc during page loading
When entry loading fails and there is another request blocked on the
same page, attempt to erase the failed entry will abort because that
would violate entry_ptr guarantees, which is supposed to keep the
entry alive.

The fix in 92727ac36c was incomplete. It
only helped for the case of a single loader. This patch makes a more
general approach by relaxing the assert.

The assert manifested like this:

scylla: ./sstables/partition_index_cache.hh:71: sstables::partition_index_cache::entry::~entry(): Assertion `!is_referenced()' failed.

Fixes #10617

Closes #10653
2022-05-25 09:27:04 +03:00
Kamil Braun
e8f9bca288 Merge 'raft: test modify_config API in randomized_nemesis_test' from Kamil Braun
Extend the reconfiguration nemesis to send `modify_config` requests as
well as `reconfigure` requests. It chooses one or the other with
probability 1/2.

Fix a bunch of problems that surfaced during testing.

Closes #10544

* github.com:scylladb/scylla:
  test: raft: randomized_nemesis_test: send `modify_config` requests in reconfiguration nemsesis
  test: raft: randomized_nemesis_test: fix `rpc` reply ID generation
  test: raft: randomized_nemesis_test: during bouncing call, allow a leader to reroute to itself
  test: raft: randomized_nemesis_test: handle timed_out_error from modify_config
  service: raft: rpc: don't call `execute...` functions after `abort()`
  raft: server: fix bad_variant_access in `modify_config`
2022-05-24 16:47:31 +02:00
Kamil Braun
700a1fdd20 test: raft: randomized_nemesis_test: send modify_config requests in reconfiguration nemsesis
Extend the reconfiguration nemesis to send `modify_config` requests as
well as `reconfigure` requests. It chooses one or the other with
probability 1/2.
2022-05-24 11:39:08 +02:00
Kamil Braun
2222f095b3 test: raft: randomized_nemesis_test: fix rpc reply ID generation
When `rpc` wants to perform a two-way RPC call it sends a message
containing a `reply_id`. The other side will send the `reply_id` back
when answering, so the original side can match the response to the promise
corresponding to the future being waited on by the RPC caller.

Previously each instance of `rpc` generated reply IDs independently as
increasing integers starting from 0. The network delivers messages
based on Raft server IDs. A response message may thus be delievered not
to the original instance which invoked the RPC, but to a new instance
which uses the same Raft server ID (after we simulated a server
crash/stop and restart, creating a new server with the same ID that
reuses the previous instance's `persistence` instance but has a new `rpc`).
The new instance could have started a new RPC call using the same
`reply_id` as one currently being in-flight that was started by the
previous instance. The new instance could then receive and handle a
response that was intended for the previous instance, leading to weird
bugs.

Fix this by replacing the local reply ID counters by a global counter so
that every two-way RPC call gets a unique reply ID.
2022-05-24 11:39:08 +02:00
Kamil Braun
b9807f07e6 test: raft: randomized_nemesis_test: during bouncing call, allow a leader to reroute to itself
A server executing a `modify_config` call, even if it initially was a
leader and accepted the request, may end up throwing a `not_a_leader`
error, rerouting the caller to a new leader - but this new leader may be
that same server. This happens because `execute_modify_config`
translates certain errors that it considers transient (such as
`conf_change_in_progress`) into `not_a_leader{last_known_leader}`,
in attempt to notify the caller that they should retry the request; but
when this translation happens, the `last_known_leader` may be that same
server (it could have even lost leadership and then regained it back
while the request was being handled).

This is not strictly an error, and it should be safe for the client to
retry the request by sending it to the same server. The nemesis test
assumed that a server never returns `not_a_leader{itself}`; this commit
drops the assumption.

An alternative solution would be to extend the error types that are now
translated to `not_a_leader` so they include information about the last
known leader. This way the client does not lose information about the
original error and still gets a potential contact point for retry.
2022-05-24 11:36:51 +02:00
Kamil Braun
b33bc7a5d6 test: raft: randomized_nemesis_test: handle timed_out_error from modify_config
May be propagated from `rpc::send_modify_config` to the caller of
`modify_config`.
2022-05-24 11:36:51 +02:00
Nadav Har'El
dc5c9321fe test/cql-pytest: have new_test_table() recycle table names
Scylla has a long-standing bug (issue #7620) where having many
tombstones in the schema table significantly slows down further
schema operations.

Many cql-pytest tests use new_test_table() to create a temporary test
table with a specific schema. Before this patch, each temporary table
was created with a random name, and deleted after the test. When
running many tests on the same Scylla server, this results in a lot
of tombstones in the schema tables, and really slow schema operations.
For example, look at home much time it takes to run the same test file
N times:

$ test/cql-pytest/run --count N test_filtering.py

 N=25 -  16 seconds (total time for the N repetitions)
 N=50 -  41 seconds
N=100 - 122 seconds

Notice how progressively slower each repetition is becoming - the
total test time should have been linear in N, but it isn't!

In this patch, we keep a cache of already-deleted table names (not the
tables, just their names!) so as to reuse the same name when we can
instead of inventing a new random name. With this patch, the performance
improvement after some repetitions is amazing (compare to the table above):

 N=25 - 14 seconds
 N=50 - 29 seconds
N=100 - 46 seconds

Note how the testing time is now more-or-less linear in the number of
repetitions, as expected.

The table-name recycling trick is the same trick I already used in the
past for the translated Cassandra tests (test/cql-pytest/cassandra_tests).
The problem was even more obvious there because those tests create a
lot of different tables. But the same problem also exists in cql-pytest
in general, so let's solve it here too.

Refs #7620

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #10635
2022-05-24 11:32:25 +03:00
Avi Kivity
21728dff6f Merge 'cql: Remove support for null inside lists of IN values' from Jan Ciołek
Currently we support queries like:
```cql
SELECT * FROM ks.tab WHERE p IN (1, 2, null, 4);
```
Nothing can be equal to null so this is equivalent to:
```cql
SELECT * FROM ks.tab WHERE p IN (1, 2, 4);
```

Cassandra doesn't support it at all.
```cql
> SELECT * FROM ks.tab WHERE p IN (1, 2, null, 4)
Error: DbError(Invalid, "Invalid null value in condition for column p")

> SELECT * FROM ks.tab WHERE p IN (1, 2, ?, 4) # ? is NULL
Error: DbError(Invalid, "Invalid null value in condition for column p")

> SELECT * FROM ks.tab WHERE p IN ? # ? is (1, 2, null, 4)
Error: DbError(Invalid, "Invalid null value in condition for column p")
```
It makes little sense to send a null inside list of IN values and supporting it is a bit cumbersome.

Supporting it causes trouble because internally the values are represented as a list, not a tuple, and lists can't contain nulls.
Because of that code requires exceptions because in this single case there can be a null inside of a collection.

This PR starts treating a llist of IN values the same as any other list and as result nulls are forbidden inside them.

In case of a null the message is the same as any other collection:
```
null is not supported inside collections
```
I'm not entirely happy about it - someone could be confused if they received this message after a query that didn't involve any collections.
The problem with making a prettier error message is that once again we would have to give `evaluate` additional information that it's now evaluating a list of IN values. And we would end up back with `evaluate_IN_list`

I think we could consider adding some kind of generic context to evaluate. The context would contain the whole expression and a mark on the part that we are currently evaluating. Then in case of error we could use this context and use it to create more helpful error messages, e.g. point to the part of the expression where a problem occured. But that's outside of the scope of this PR.

Fixes #10579

Closes #10620

* github.com:scylladb/scylla:
  cql: Add test for null in IN list
  cql: Forbid null in lists of IN values
2022-05-24 09:15:13 +03:00
Jan Ciolek
ff3205cf19 cql: Add test for null in IN list
Added tests that check for error
when null or unset appears in a list
of IN values.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-05-24 00:17:54 +02:00
Jan Ciolek
f9b1fc0b69 cql: Forbid null in lists of IN values
We used to allow nulls in lists of IN values,
i.e. a query like this would be valid:
SELECT * FROM tab WHERE pk IN (1, null, 2);

This is an old feature that isn't really used
and is already forbidden in Cassandra.

Additionally the current implementation
doesn't allow for nulls inside the list
if it's sent as a bound value.
So something like:
SELECT * FROM tab WHERE pk IN ?;
would throw an error if ? was (1, null, 2).
This is inconsistent.

Allowing it made writing code cumbersome because
this was the only case where having a null
inside of a collection was allowed.
Because of it there needed to be
separate code paths to handle regular lists
and lists of NULL values.

Forbidding it makes the code nicer and consistent
at the cost of a feature that isn't really
important.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-05-24 00:17:41 +02:00
Avi Kivity
3fadae74e7 Merge "Keep cluster-join code in one place" from Pavel E
"
There are several issues with it

- it's scattered between main() and storage_service methods
- yet another incarnation of it also sits in the cql-test-env
- the prepare_to_join() and join_token_ring() names are lying to readers,
  as sometimes node joins the ring in prepare- stage
- storage service has to carry several private fields to keep the state
  between prepare- and join- parts
- some storage service dependencies are only needed to satisfy joining,
  but since they cannot start early enough, they are pushed to storage
  service uninitialized "in the hope" that it won't use them until join

This patch puts joining steps in one place and enlightens storage service
not to carry unneeded dependencies/state onboard. And eliminates one more
usage of global proxy instance while at it.

branch: https://github.com/xemul/scylla/tree/br-merge-init-server-and-join-cluster
tests: https://jenkins.scylladb.com/job/releng/job/Scylla-CI/466/
refs: #2795
"

* 'br-merge-init-server-and-join-cluster' of https://github.com/xemul/scylla:
  storage_service: Remove global proxy call
  storage_service: Remove sys_dist_ks from storage_service dependencies
  storage_service: Remove cdc_gen_service from storage_service dependencies
  storage_service: Make _cdc_gen_id local variable
  storage_service: Make _bootstrap_tokens local variable
  storage_service: Merge prepare- and join- private members
  storage_service: Move some code up the file
  storage_service: Coroutinize join_token_ring
  storage_service: Fix indentation after previous patch
  storage_service: Execute its .bootstrap() into async()
  storage_service: Dont assume async context in mark_existing_views_as_built
  storage_service: Merge init-server and join-cluster
  main, storage_service: Move wait for gossip to settle
  main, storage_service: Move passive announce subscription
  main, storage_service: Move early group0 join call
2022-05-23 17:33:02 +03:00
Pavel Emelyanov
d755fdc1f4 storage_service: Remove global proxy call
Storage service needs it to calculate schema version on join. The proxy
at this point can be passed as an argument to the joining helper.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-05-23 12:55:30 +03:00
Pavel Emelyanov
bc051387c5 storage_service: Remove sys_dist_ks from storage_service dependencies
The service in question is only needed join_cluster-time, no need to
keep it in the dependencies list. This also solves the dependency
trouble -- the distributed keyspace is sharded::start-ed after it's
passed to storage_service initialization.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-05-23 12:55:30 +03:00
Pavel Emelyanov
5a97ba7121 storage_service: Remove cdc_gen_service from storage_service dependencies
This service is only needed join-time, it's better to pass it as
argument to join_cluster(). This solves current reversed dependency
issuse -- the cdc_gen_svc is now started after it's passed to storage
service initialization.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-05-23 12:55:30 +03:00
Pavel Emelyanov
282cc070bc storage_service: Merge init-server and join-cluster
Now they always follow one another both in main and cql-test-env.
Also, despite the name, init-server does joins the cluster when it's
just a normal node restarting, so join-cluster is called when the
cluster is already joind. This merge make the function be named as
what it really does.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-05-23 12:55:30 +03:00
Pavel Emelyanov
b2b86b0c83 main, storage_service: Move wait for gossip to settle
And make cql-test-env configure to skip it not to slow down tests in
vain. Another side effect is that cql-test-env would trigger features
enabling at this point, but that's OK, they are enabled anyway.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-05-23 12:55:30 +03:00
Nadav Har'El
02fb6f33fb test/*/run: improve signal handling in test runners
When a user runs a script and presses control-C, a SIGINT (signal 2)
gets sent to every process in the script's "process group". By default,
every subprocess started by a script joins the parent's process group.

Our test/*/run test-runner scripts typically start two processes: scylla
and pytest. If we keep them in the same process group, a control-C
would kill them in a random order and that is ugly - if Scylla is
killed before pytest, we'll see a few test failures before pytest
is finally killed. So the existing code put Scylla in its own process
group, and killed it on exit after killing pytest.

But there were a few inconsistencies in our implementation, leading
to some annoying behaviors:

1. Doing "kill -2" to the runner's process (not a control-C which sends
   a signal to the process group) caused scylla and pytest to be killed
   on exit. So far so good. But, we should kill their entire process
   groups, not just the one process. This is important when pytest starts
   its own subprocesses (as happens in cql-pytest/test_tools.py),
   otherwise they just remain running.
   We need to call pgkill() instead of kill(), but also we forgot
   to start a new process group for the pytest run - so this patch
   fixes it.

2. Our exit handler - which kills the subprocesses - only gets called
   on signals which Python catches, and this is only SIGINT. Killing
   the test runner with SIGTERM or SIGHUP before this patch caused
   the subprocesses to be left running. In this patch we also catch
   SIGTERM and SIGHUP, so our exit handler is also run in that case.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #10629
2022-05-23 12:24:19 +03:00
Nadav Har'El
ac393a62a1 cql-pytest: translate Cassandra's tests for relations
This is a translation of Cassandra's CQL unit test source file
validation/operations/SelectSingleColumnRelationTest.java into our
cql-pytest framework.

This test file includes 23 tests for various types of SELECT operations
which involve relations, a.k.a expressions (i.e., WHERE).

All 23 tests pass on Cassandra. 3 of the tests fail on Scylla
reproducing 2 already known Scylla issues and three minor
previously-unknown issues:

Previously known issues:

Refs #2962:  Collection column indexing
Refs #10358: Comparison with UNSET_VALUE should produce an error

Three new (and minor) issue:

Refs #10577: Is max-clustering-key-restrictions-per-query too low?
Refs #10631: Invalid IN restriction is reported as a '=' restriction
Refs #10632: Column name printed in a strange way in error message

NOTE: Scylla supports some expressions which Cassandra does not. In some
cases the Cassandra unit test had checks that certain constructs are not
allowed, and I had to comment out such checks when the expression *does*
work in Scylla. But of course, in such cases, it is not enough to comment
out a check - we also need to verify that Scylla's unique behavior
is the correct one. For that, we will have separate cql-pytest test for
those features - they won't be in the translated Cassandra unit tests
(of course). For example, in this test I had to comment out a check
that filtering on *non*-frozen UDTs is not allowed. In a separate
patch which I'm sending in parallel, I added a new test -
test_filter_UDT_restriction_nonfrozen - which will verify that what
Scylla does in that case is the correct behavior.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-05-22 20:49:04 +03:00
Nadav Har'El
8af2c4aced test/cql-pytest: add test for filtering UDTs
Both Scylla and Cassandra support filtering on frozen UDTs, which are
compared using lexicographical order. This patch adds a test to verify
that the behavior here is the same - and indeed it is.

For *non*-frozen UDTs, Cassandra does not allow filtering on them (this
was decided in CASSANDRA-13247), but Scylla does. So we also add a test
on how non-frozen UDTs work - that passes on Scylla (and of course not
in Cassandra).

The two tests here - for frozen and non-frozen UDTs - are identical
(they just call the same function) - to ensure these two cases work the
same. This is important because we can't judge the correctness of the
non-frozen test by comparison to Cassandra - because it can't run there.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-05-22 20:20:15 +03:00
Nadav Har'El
1c355d773a test/cql-pytest: tests for IN restrictions and filtering
Cassandra only allows IN restrictions on primary key columns.
With filtering (ALLOW FILTERING), this limitation makes little
sense, and it turns out that Scylla does not have limitation.
So this patch adds a test that we support such queries *correctly*
(we can't compare to Cassandra because it doesn't implement this).

Another test checks IN restrictions on an *indexed* column
(without ALLOW FILTERING). We could have implemented this - the
indexed column behaves like a partition key - but we didn't,
so this test xfails. It also fails on Cassandra because as mentioned
above, Cassandra didn't implement IN except for primary key columns.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-05-22 20:20:15 +03:00
Nadav Har'El
bd71e6d962 test/cql-pytest: test more cases of overlapping restrictions
As noted by test_filtering.py::test_multiple_restrictions_on_same_column
Cassandra WHERE does not allow specifying two restrictions on the the
same column, but Scylla does allow it, and this test verifies that the
results are correct (conflicting restrictions would lead to no results,
but overlapping restrictions can return some results).

In this patch we add yet another example of multiple restrictions
on the same column that was seen in a Cassandra unit test - this time
one of the restrictions involves a IN. These patch helps confirm that
the expression evaluation is done correct (and, again, differently from
Cassandra - Cassandra results in an error in this case).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-05-22 20:20:15 +03:00
Avi Kivity
fcf292b5d1 Merge 'Add range deserializer' from Benny Halevy
Currently, the idl-generated deserialization code, e.g. mutation_partition_view::rows() deserializes and returns a complete utils::chunked_vector<deletable_row_view> . And that could be arbitrarily long.

To consume it gently, we don't need the whole vector in advance, but rather we can consume it one element at a time (and in a nested way for cells in a row in the future).

Use `range_deserializer` to consume range tombstones and rows one item at a time.

We may consider in the future also gently iterating over cells
in a row and then dipping into collection cells that might also
contain a large number of items.

Fixes #10558

Closes #10566

* github.com:scylladb/scylla:
  ser: use vector_deserializer by default for all idl vectors
  mutation_partition_view: do_accept_gently: use the range based deserializers
  idl-compiler: generate *_range methods using vector_deserializer
  serializer_impl: add vector_deserializer
  test: frozen_mutation_test: test_writing_and_reading_gently: log detailed error
2022-05-19 17:21:35 +03:00
Avi Kivity
5285ccbb12 Merge 'Add prune ghost rows statement' from Piotr Sarna
This series is split from another, bigger RFC series which provides
manual remedies to deal with inconsistencies between the base table
and its views. This part deals with ghost rows by providing a statement
which fetches view rows from a given range, then reads its corresponding
rows from the base table (cl=ALL), and finally removes rows which were
not present in the base table at all, qualifying them as ghost rows.
Motivations for introducing such a statement:
 * in case of detected inconsistencies, it can be used to fix
   materialized views without recreating them from scratch, which can
   take days and generates lots of throughput
 * a tool which periodically scrubs a materialized view can be easily
   created on top of this statement, especially that it's possible
   to remove ghost rows from a user-defined view token range;

This series comes with a unit test.

The reason for digging up this series is because it's still possible to end up with ghost rows in certain rather improbable scenarios, and we lack a way of fixing them without rebuilding the whole view. For instance, in case of a failed synchronous update to a local view, the user will be notified that the query failed, but a ghost row can be created nonetheless. The pruning statement introduced in this series would allow healing the failure locally, without rebuilding the whole view.

Tests: unit(dev)

Closes #10426

* github.com:scylladb/scylla:
  docs: add a paragraph on PRUNE MATERIALIZED VIEW statement
  service,test: add a test case for error during pruning
  tests: add ghost row deletion test case
  cql3: enable ghost row deletion via CQL
  cql3: add a statement for deleting ghost rows
  cql3: convert is_json statement parameter to enum
  pager: add ghost row deleting pager
  db,view: add delete ghost rows visitor
2022-05-19 17:21:35 +03:00
Gleb Natapov
c2ef390a52 service: raft: move group0 write path into a separate file
Writing into the group0 raft group on a client side involves locking
the state machine, choosing a state id and checking for its presence
after operation completes. The code that does it resides now in the
migration manager since the currently it is the only user of group0. In
the near future we will have more client for group0 and they all will
have to have the same logic, so the patch moves it to a separate class
raft_group0_client that any future user of group0 can use to write
into it.

Message-Id: <YoYAJwdTdbX+iCUn@scylladb.com>
2022-05-19 17:21:35 +03:00
Avi Kivity
08ed4d7405 Merge 'scylla-gdb.py: add commands to dump sstables summary and index-cache' from Botond Dénes
This series adds two commands:
* scylla sstable-summary
* scylla sstable-index-cache

The former dumps the content of the sstable summary. This component is kept in memory in its entirety, so this can be easily done. The latter command dumps the content of the sstable index cache. This contains all the index-pages that are currently cached. The promoted index is not dumped yet and there is no indication of whether a given entry is in the LRU or not, but this already allows at seeing what pages are in the cache and what aren't.

Closes #10546

* github.com:scylladb/scylla:
  scylla-gdb.py: add scylla sstable-index-cache command
  scylla-gdb.py: add scylla sstable-summary command
  test/scylla-gdb: add sstable fixture
  scylla-gdb.py: make chunked_vector a proper container wrapper"
  scylla-gdb.py: make small_vector a proper container wrapper"
  scylla-gdb.py: add sstring container wrapper
  scylla-gdb.py: add chunked_managed_vector container wrapper
  scylla-gdb.py: add managed_vector container wrapper
  scylla-gdb.py: std_variant: add workaround for clang template bug
  scylla-gdb.py: add bplus_tree container wrapper
2022-05-19 17:21:35 +03:00
Nadav Har'El
0040e9e7f4 Merge 'cql: Add proper validation for null and unset inside collections send as bound values' from Jan Ciołek
Let's say we have a query like:
```cql
INSERT INTO ks.t (list_column) VALUES (?);
```
And the driver sends a list with null inside as the bound value, something like `[1, 2, null, 4]`.

In such case we should throw `invalid_request_exception` because `nulls` are not allowed inside collections.

Currently when a query like this gets executed Scylla throws an ugly marshalling error.
This is because the validation code reads size of the next element, interprets it as an unsigned integer and tries to read this much.
In case of `null` element the size is `-1`, which when converted to unsigned `size_t` gives 18446744073709551615 and it fails to read this much.

This PR adds proper validation checks to make the error message better.

I also added some tests.
I originally tried to write them in python, but python driver really doesn't like sending invalid values.
Trying to send `[1, None, 2]` results in a list with empty value instead of null.
Trying to send `[1, UNSET_VALUE, 2]` Fails before query even leaves the driver.

Fixes #10580

Closes #10599

* github.com:scylladb/scylla:
  cql3: Add tests for null and unset inside collections
  cql3: Add null and unset checks in collection validation
2022-05-19 11:25:24 +03:00
Piotr Sarna
b8a36ff253 service,test: add a test case for error during pruning
The test case checks that errors which occur during materialized
view pruning are properly propagated back to the user.
2022-05-19 10:16:04 +02:00
Piotr Sarna
995468520e tests: add ghost row deletion test case
The tests checks if manually injected ghost rows are properly deleted
by the ghost row delete statement - and, that non-ghost regular rows
are left intact.
2022-05-19 10:16:03 +02:00
cvybhu
7adc572ec6 cql3: Add tests for null and unset inside collections
Add a bunch of tests that test what happens
when there is a null or unset value inside collections.

They are not allowed so every such attempt
should end with invalid_request_exception
with proper message.

I had to write a new function for collection serialization.
I tried to use data_value and its methods, but it's impossible
to create a data_value that represents an unset value.

Signed-off-by: cvybhu <jan.ciolek@scylladb.com>
2022-05-19 00:15:17 +02:00
Benny Halevy
4e78b4de1b ser: use vector_deserializer by default for all idl vectors
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-05-18 19:24:18 +03:00
Benny Halevy
5b902d9fd6 serializer_impl: add vector_deserializer
To be used for streaming through a serialized
vector, deserializing the items as we go
when dereferencing or incrementing the iterator.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-05-18 19:10:13 +03:00
Avi Kivity
e6fe9cc683 Merge 'Reapply: "disable_auto_compaction: stop ongoing compactions"' from Eliran Sinvani
Reapply: "disable_auto_compaction: stop ongoing compactions"
This is a reapplication of a former commit
4affa801a5 which was reverted by
8e8dc2c930.
This commit is a fixed version of the original where a call to the
compaction_manager constructor accidentally issued (`compaction_manager()`)
instead a call to retrieve a compaction manager reference
(`get_compaction_manager()`), we don't use this function because it
doesn't exist anymore - it existed at the time the patch was written
bu was removed in 9066224cf4 later on,
instead, we just use the private table member _compaction_manager which refs
the compaction manager.

The explanation for the bad effect is probably that a `this` pointer
capture down the call chain, resulted in a use after free which had
an unknown effect on the system. (memory corruption at startup).

Test: unit (dev,debug)
      write performance test as the one used to find the bug.
A screenshot of the performance test can be found at
https://github.com/scylladb/scylla/issues/10146/#issuecomment-1129578381

Fixes https://github.com/scylladb/scylla/issues/9313
Refs https://github.com/scylladb/scylla/issues/10146

For completeness, the original commit message was:
The api call disables new regular compaction jobs from starting
but it doesn't wait for ongoing compaction to stop and so it's
much less useful.

Returning after stopping regular compaction jobs and waiting
for them to stop guarantees that no regular compactions job are
running when nodetool disableautocompaction returns successfully.

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>

Closes #10597

* github.com:scylladb/scylla:
  compaction_manager: Make invoking the empty constructor more explicit
  Reapply: "disable_auto_compaction: stop ongoing compactions"
2022-05-18 18:33:12 +03:00
Eliran Sinvani
c5e5692a01 compaction_manager: Make invoking the empty constructor more explicit
The compaction manager's empty constructor is supposed to be invoked
only in testing environment, however, it is easy to invoke it by mistake
from production code.
Here we add a more verbose constructor and making the default compaction
private, the verbose compiler need to be invoked with a tag
for_testing_tag, this will ensure that this constructor will be invoked
only when intended.
The unit tests were changed according to this new paradigm.

Tests: unit (dev)

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
2022-05-18 14:57:10 +03:00
Benny Halevy
9e1c76ea9e test: frozen_mutation_test: test_writing_and_reading_gently: log detailed error
The nested exception is also interesting
and boost doesn't print it.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-05-18 08:21:54 +03:00
Botond Dénes
6fd1322bf3 test/boost/mutation_test: test_query_digest: use the same now everywhere
This test started failing sporadically of late. This failure is seen
quite often in CI tests but is very hard to reproduce locally. The
problem seems to be timing related, as the same seeds that fail in CI
don't fail locally. This patch is a speculative fix. The test has a
single time-related components: `gc_clock::now()`. This is invoked in 4
different places during a single iteration, giving ample opportunity for
off-by-one errors to appear. Although there is no solid proof for this
being the problem, this is a good candidate. This patch replaces all
those different invocations, with a single one per test: this value is
then propagated to all places that need it.

Fixes: #10554

Marking the patch as a fix for the issue, if the problem re-surfaces
after this patch we'll re-poen it.

Closes #10589
2022-05-17 16:25:04 +03:00
Avi Kivity
a8f9ed56fb Merge 'cql3: Replace relation class with expression' from Jan Ciołek
Before this change parser used to output instances of the `relation` class, which were later converted to `restrction`.
`relation` took care of initial processing such as preparing and some validation checks.

This PR aims to remove the `relation` class and perform it's functionality using only `expression`.
This is a step towards removing the legacy classes and converting all AST analysis to work on `expressions`.

Closes #10409

* github.com:scylladb/scylla:
  cql3: Remove scalar from bind_variable_scalar_prepare_expression
  cql3: expr: Remove shape_type from bind_variable
  cql3: Remove prepare_expression_multi_column
  cql3: Remove relation class
  cql3: Add more tests for expr::printer
  cql3: Make parser output expression for relations
  cql3: expr: add printer for expression
  cql3: expr: expr::to_restriction: Handle token relations
  cql3: expr: expr::to_restriction: Handle multi column relations
  cql3: expr: Add expr::to_restriction for single column relations
  cql3: expr: Add prepare_binary_operator
  cql3: expr: Change how prepare_expression handles bind_variable
  clq3: expr: Add columns to expr::token struct
  cql3: expr: Modify list_prepare_expression to handle lists of IN values
  cql3: expr: Add expr::as_if for non-const expressions
2022-05-17 12:51:54 +03:00
cvybhu
9b49d27a8d cql3: expr: Remove shape_type from bind_variable
shape_type was used in prepare_expression to differentiate
between a few cases and create the correct receivers.
This was used by the relation class.

Now creating the correct receiver has been delegated to the caller
of prepare_expression and all bind_variables can be handled
in the same simple way.

shape_type is not needed anymore.

Not having it is better because it simplifies things.

Signed-off-by: cvybhu <jan.ciolek@scylladb.com>
2022-05-16 18:17:58 +02:00
cvybhu
575c4bd76b cql3: Add more tests for expr::printer
Now that parser outputs expressions
it's much easier to check whether
expression printer works correctly.

We can prepare a bunch of strings
which will be parsed and then printed
back to string.
Then we can compare those strings.

It's much easier than creating
expresions to print manually.

The only downside is that this tests
only unprepared version of expression,
so instead of column_value there will
be unresolved identifier, insted of constant
untyped_constant etc.

Signed-off-by: cvybhu <jan.ciolek@scylladb.com>
2022-05-16 18:17:58 +02:00
cvybhu
51cdbdeacb cql3: Make parser output expression for relations
Parser used to output the where clause as a vector of relations,
but now we can change it to a vector of expressions.

Cql.g needs to be modified to output expressions instead
of relations.

The WHERE clause is kept in a few places in the code that
need to be changed to vector<expression>.

Finally relation->to_restriction is replaced by expr::to_restriction
and the expressions are converted to restrictions where required.

The relation class isn't used anywhere now and can be removed.

Signed-off-by: cvybhu <jan.ciolek@scylladb.com>
2022-05-16 18:17:58 +02:00
Michał Sala
f6bdc4d694 cql3: expr: add printer for expression
expression::printer is used to print CQL expressions
in a pretty way that allows them to be parsed back
to the same representation.

There is a bunch of things that need to be changed when
compared to the current implementation of opreatorr<<(expression)
to output something parsable.

column names should be printed without 'unresolved_identifier()'
and sometimes they need to be quoted to perserve case sensitivity.

I needed to write new code for printing constant values
because the current one did debug printing
(e.g. a set was printed as '1; 2; 3').

A list of IN values should be printed inside () intead of [],
but because it is internally represented as a list it is
by default printed with [].
To fix this a temporary tuple_constructor is created and printed.

Signed-off-by: cvybhu <jan.ciolek@scylladb.com>
2022-05-16 18:17:58 +02:00
cvybhu
be6e741b6c clq3: expr: Add columns to expr::token struct
The expr::token struct is created when something
like token(p1, p2) occurs in the WHERE clause.

Currently expr::token doesn't keep columns passed
as arguemnts to the token function.

They weren't needed because token() validation
was done inside token_relation.

Now that we want to use only expressions
we need to have columns inside the token struct
and validate that those are the correct columns.

Signed-off-by: cvybhu <jan.ciolek@scylladb.com>
2022-05-16 18:03:11 +02:00