Commit Graph

33204 Commits

Author SHA1 Message Date
Avi Kivity
aa5f4bf1f3 storage_proxy: convert handle_read_mutation_data() to a coroutine
The do_with() makes it at least a break-even, but there's some allocating
continuations that make it a win.

A variable named cmd had two different definitions (a value and a
lw_shared_ptr) that lived in different scopes. I renamed one to cmd1
to disambiguate. We should probably move that to the caller, but that
is not done here.
2022-10-03 19:33:39 +03:00
Avi Kivity
bcd134e9b8 storage_proxy: convert handle_read_data() to a coroutine
The do_with() makes it at least a break-even, but there's some allocating
continuations that make it a win.

A variable named cmd had two different definitions (a value and a
lw_shared_ptr) that lived in different scopes. I renamed one to cmd1
to disambiguate. We should probably move that to the caller, but that
is not done here.
2022-10-03 19:33:39 +03:00
Avi Kivity
167c8b1b5e storage_proxy: convert handle_write() to a coroutine
A do_with() makes this at least a break-even.

Some internal lambdas were not converted since they commonly
do not allocate or block.

A finally() continuation is converted to seastar::defer().
2022-10-03 19:33:39 +03:00
Avi Kivity
741d6609a5 storage_proxy: convert handle_counter_mutation() to a coroutine
The do_with means the coroutine conversion is free, and conversion
of parallel_for_each to coroutine::parallel_for_each saves a
possible allocation (though it would not have been allocated usually.

An inner continuation is not converted since it usually doesn't
block, and therefore doesn't allocate.
2022-10-03 19:33:39 +03:00
Avi Kivity
ac5fae4b93 storage_proxy: convert query_nonsingular_mutations_locally() to a coroutine
It's simpler, and the do_with() allocation + task cancels out the
coroutine allocation + task.
2022-10-03 19:33:29 +03:00
Botond Dénes
060dda8e00 Merge 'Reduce dependencies on large data handler header' from Benny Halevy
Reduce the false dependencies on db/large_data_handler.hh by
not including it from commonly used header files, and rather including
it only in the source files that actually need it.

The is in preparation for https://github.com/scylladb/scylladb/issues/11449

Closes #11654

* github.com:scylladb/scylladb:
  test: lib: do not include db/large_data_handler.hh in test_service.hh
  test: lib: move sstable test_env::impl ctor out of line
  sstables: do not include db/large_data_handler.hh in sstables.hh
  api/column_family: add include db/system_keyspace.hh
2022-09-30 13:27:38 +03:00
Tomasz Grabiec
5268f0f837 test: lib: random_mutation_generator: Don't generate mutations with marker uncompacted with shadowable tombstone
The generator was first setting the marker then applied tombstones.

The marker was set like this:

  row.marker() = random_row_marker();

Later, when shadowable tombstones were applied, they were compacted
with the marker as expected.

However, the key for the row was chosen randomly in each iteration and
there are multiple keys set, so there was a possibility of a key clash
with an earlier row. This could override the marker without applying
any tombstones, which is conditional on random choice.

This could generate rows with markers uncompacted with shadowable tombstones.

This broken row_cache_test::test_concurrent_reads_and_eviction on
comparison between expected and read mutations. The latter was
compacted because it went through an extra merge path, which compacts
the row.

Fix by making sure there are no key clashes.

Closes #11663
2022-09-30 11:27:01 +03:00
Kamil Braun
1793d43b15 test/pylib: scylla_cluster: mark server_remove as not implemented
The `server_remove` function did a very weird thing: it shut down a
server and made the framework 'forget' about it. From the point of view
of the Scylla cluster and the driver the server was still there.

Replace the function's body with `raise NotImplementedError`. In the
future it can be replaced with an implementation that calls
`removenode` on the Scylla cluster.

Remove `test_remove_server_add_column` from `test_topology`. It
effectively does the same thing as `test_stop_server_add_column`, except
that the framework also 'forgets' about the stopped server. This could
lead to weird situations because the forgotten server's IP could be
reused in another test that was running concurrently with this test.

Closes #11657
2022-09-29 21:03:18 +03:00
Benny Halevy
776b009c0f test: lib: do not include db/large_data_handler.hh in test_service.hh
It was needed for defining and referencing nop_lp_handler
and in sstable_3_x_test for testing the large_data_handler.

Remove the include from the commonly used header file
to reduce the false dependencies on large_data_handler.hh

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-09-29 18:36:16 +03:00
Benny Halevy
678d88576b test: lib: move sstable test_env::impl ctor out of line
To prepare for removing the include of db/large_data_handler.hh
from test/lib/test_services.hh

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-09-29 18:35:12 +03:00
Botond Dénes
ad04f200d3 Merge 'database: automatically take snapshot of base table views' from Benny Halevy
The logic to reject explicit snapshot of views/indexes was improved in aa127a2dbb. However, we never implemented auto-snapshot of
view/indexes when taking a snapshot of the base table.

This is implemented in this patch.

The implementation is built on top of
ba42852b0e
so it would be hard to backport to 5.1 or earlier
releases.

Fixes #11612

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes #11616

* github.com:scylladb/scylladb:
  database: automatically take snapshot of base table views
  api: storage_service: reject snapshot of views in api layer
2022-09-29 13:33:31 +03:00
Benny Halevy
ae7fd1c7b2 sstables: do not include db/large_data_handler.hh in sstables.hh
Reduce dependencies by only forward-declaring
class db::large_data_handler in sstables.hh

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-09-29 12:42:58 +03:00
Benny Halevy
fb7e55b0a8 api/column_family: add include db/system_keyspace.hh
For db::system_keyspace::load_view_build_progress that currently
indirectly satisfied via sstables/sstables.hh ->
db/large_data_handler.hh

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-09-29 12:42:54 +03:00
Nadav Har'El
4a7794fb64 alternator: better error message when adding a GSI to an existing table
Due to issue #11567, Alternator do not yet support adding a GSI to an
existing table via UpdateTable with the GlobalSecondaryIndexUpdates
parameter.

However, currently, we print a misleading error message in this case,
complaining about the AttributeDefinitions parameter. This parameter
is also required with GlobalSecondaryIndexUpdates, but it's not the
main problem, and the user is likely to be confused why the error message
points to that specific paramter and what it means that this parameter
is claimed to be "not supported" (while it is supported, in CreateTable).
With this patch, we report that GlobalSecondaryIndexUpdates is not
supported.

This patch does not fix the unsupported feature - it just improves
the error message saying that it's not supported.

Refs #11567

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #11650
2022-09-29 09:00:31 +03:00
Asias He
c194c811df repair: Yield in repair_service::do_decommission_removenode_with_repair
When walking through the ranges, we should yield to prevent stalls. We
do similar yield in other node operations.

Fix a stall in 5.1.dev.20220724.f46b207472a3 with build-id
d947aaccafa94647f71c1c79326eb88840c5b6d2

```
!INFO | scylla[6551]: Reactor stalled for 10 ms on shard 0. Backtrace:
0x4bbb9d2 0x4bba630 0x4bbb8e0 0x7fd365262a1f 0x2face49 0x2f5caff
0x36ca29f 0x36c89c3 0x4e3a0e1
````

Fixes #11146

Closes #11160
2022-09-28 18:21:35 +03:00
Avi Kivity
cf3830a249 Merge 'Add support for TRUNCATE USING TIMEOUT' from Benny Halevy
Extend the cql3 truncate statement to accept attributes,
similar to modification statements.

To achieve that we define cql3::statements::raw::truncate_statement
derived from raw::cf_statement, and implement its pure virtual
prepare() method to make a prepared truncate_statement.

The latter is no longer derived from raw::cf_statement,
and just stores a schema_ptr to get to the keyspace and column_family.

`test_truncate_using_timeout` cql-pytest was added to test
the new USING TIMEOUT feature.

Fixes #11408

Also, update docs/cql/ddl.rst truncate-statement section respectively.

Closes #11409

* github.com:scylladb/scylladb:
  docs: cql-extensions: add TRUNCATE to USING TIMEOUT section.
  docs: cql: ddl: add support for TRUNCATE USING TIMEOUT
  cql3, storage_proxy: add support for TRUNCATE USING TIMEOUT
  cql3: selectStatement: restrict to USING TIMEOUT in grammar
  cql3: deleteStatement: restrict to USING TIMEOUT|TIMESTAMP in grammar
2022-09-28 18:19:03 +03:00
Avi Kivity
19374779bb Merge 'Fix large data warning and docs' from Benny Halevy
The series contains fixes for system.large_* log warning and respective documentation.
This prepares the way for adding a new system.large_collections table (See #11449):

Fixes #11620
Fixes #11621
Fixes #11622

the respective fixes should be backported to different release branches, based on the respective patches they depend on (mentioned in each issue).

Closes #11623

* github.com:scylladb/scylladb:
  docs: adjust to sstable base name
  docs: large-partition-table: adjust for additional rows column
  docs: debugging-large-partition: update log warning example
  db/large_data_handler: print static cell/collection description in log warning
  db/large_data_handler: separate pk and ck strings in log warning with delimiter
2022-09-28 17:52:23 +03:00
Nadav Har'El
de1bc147bc Merge 'test.py: cleanups in topology test suites' from Kamil Braun
Fix the type of `create_server`, rename `topology_for_class` to `get_cluster_factory`, simplify the suite definitions and parameters passed to `get_cluster_factory`

Closes #11590

* github.com:scylladb/scylladb:
  test.py: replace `topology` with `cluster_size` in Topology tests
  test.py: rename `topology_for_class` to `get_cluster_factory`
  test/pylib: ScyllaCluster: fix create_server parameter type
2022-09-28 15:19:54 +03:00
Kamil Braun
1bcc28b48b test/topology_raft_disabled: reenable test_raft_upgrade
The test was disabled due to a bug in the Python driver which caused the
driver not to reconnect after a node was restarted (see
scylladb/python-driver#170).

Introduce a workaround for that bug: we simply create a new driver
session after restarting the nodes. Reenable the test.

Closes #11641
2022-09-28 15:13:42 +03:00
Mikołaj Grzebieluch
be8fcba8c1 raft: broadcast_tables: add support for bind variables
Extended the queries language to support bind variables which are bound in the
execution stage, before creating a raft command.

Adjusted `test_broadcast_tables.py` to prepare statements at the beginning of the test.

Fixed a small bug in `strongly_consistent_modification_statement::check_access`.

Closes #11525
2022-09-28 09:54:59 +03:00
Alejo Sanchez
02933c9b82 test.py: close aiohttp session for topology tests
Close the aiohttp ClientSession after pytest session finishes.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>

Closes #11648
2022-09-27 18:09:08 +02:00
Kamil Braun
82481ae31b Merge 'raft server, log size limit in bytes' from Gusev Petr
Before this patch we could get an OOM if we
received several big commands. The number of
commands was small, but their total size
in bytes was large.

snapshot_trailing_size is needed to guarantee
progress. Without this limit the fsm could
get stuck if the size of the next item is greater than
 max_log_size - (size of trailing entries).

Closes #11397

* github.com:scylladb/scylladb:
  raft replication_test, make backpressure test to do actual backpressure
  raft server, shrink_to_fit on log truncation
  raft server, release memory if add_entry throws
  raft server, log size limit in bytes
2022-09-27 14:25:08 +02:00
Kamil Braun
ed67f0e267 Merge 'test.py: fix topology init error handling' from Alecco
When there are errors starting the first cluster(s) the logs of the server logs are needed. So move `.start()` to the `try` block in `test.py` (out of `asynccontextmanager`).

While there, make `ScyllaClusterManager.start()` idempotent.

Closes #11594

* github.com:scylladb/scylladb:
  test.py: fix ScyllaClusterManager start/stop
  test.py: fix topology init error handling
2022-09-27 11:36:07 +02:00
Petr Gusev
bc50b7407f raft replication_test, make backpressure test to do actual backpressure
Before this patch this test didn't actually experience any backpressure since all the commands were executed sequentially.
2022-09-27 12:04:14 +04:00
Petr Gusev
cbfe033786 raft server, shrink_to_fit on log truncation
We don't want to keep memory we don't use, shrink_to_fit guarantees that.

In fact, boost::deque frees up memory when items are deleted, so this change has little effect at the moment, but it may pay off if we change the container in the future.
2022-09-27 12:02:36 +04:00
Petr Gusev
b34dfed307 raft server, release memory if add_entry throws
We consume memory from semaphore in add_entry_on_leader, but never release it if add_entry throws.
2022-09-27 12:02:34 +04:00
Benny Halevy
b178813cba docs: cql-extensions: add TRUNCATE to USING TIMEOUT section.
List the queries that support the TIMEOUT parameter.

Mention the newly added support for TRUNCATE
USING TIMEOUT.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-09-26 18:30:39 +03:00
Benny Halevy
b0bad0b153 docs: cql: ddl: add support for TRUNCATE USING TIMEOUT
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-09-26 18:30:39 +03:00
Benny Halevy
64140ccf05 cql3, storage_proxy: add support for TRUNCATE USING TIMEOUT
Extend the cql3 truncate statement to accept attributes,
similar to modification statements.

To achieve that we define cql3::statements::raw::truncate_statement
derived from raw::cf_statement, and implement its pure virtual
prepare() method to make a prepared truncate_statement.

The latter, statements::truncate_statement, is no longer derived
from raw::cf_statement, and just stores a schema_ptr to get to the
keyspace and column_family names.

`test_truncate_using_timeout` cql-pytest was added to test
the new USING TIMEOUT feature.

Fixes #11408

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-09-26 18:30:39 +03:00
Benny Halevy
27d3e48005 cql3: selectStatement: restrict to USING TIMEOUT in grammar
It is preferred to reject USING TLL / TIMESTAMP at the grammar
level rather than functionally validating the USING attributes.

test_using_timeout was adjusted respectively to expect the
`SyntaxException` error rather than `InvalidRequest`.

Note that cql3::statements::raw::select_statement validate_attrs
now asserts that the ttl or the timestamp attributes aren't set.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-09-26 18:30:39 +03:00
Benny Halevy
0728d33d5f cql3: deleteStatement: restrict to USING TIMEOUT|TIMESTAMP in grammar
It is preferred to reject USING TLL / TIMESTAMP at the grammar
level rather than functionally validating the USING attributes.

test_using_timeout was adjusted respectively to expect the
`SyntaxException` error rather than `InvalidRequest`.

Note that now delete_statement ctor asserts that the ttl
attribute is not set.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-09-26 18:30:39 +03:00
Kamil Braun
696bdb2de7 test.py: replace topology with cluster_size in Topology tests
First, a reminder of a few basic concepts in Scylla:
- "topology" is a mapping: for each node, its DC and Rack.
- "replication strategy" is a method of calculating replica sets in
  a cluster. It is not a cluster-global property; each keyspace can have
  a different replication strategy. A cluster may have multiple
  keyspaces.
- "cluster size" is the number of nodes in a cluster.

Replication strategy is orthogonal to topology. Cluster size can be
derived from topology and is also orthogonal to replication strategy.

test.py was confusing the three concepts together. For some reason,
Topology suites were specifying a "topology" parameter which contained
replication strategy details - having nothing to do with topology. Also
it's unclear why a test suite would specify anything to do with
replication strategies - after all, a test may create keyspaces with
different replication strategies, and a suite may contain multiple
different tests.

Get rid of the "topology" parameter, replace it with a simple
"cluster_size". In the future we may re-introduce it when we actually
implement the possibility to start clusters with custom topologies
(which involves configuring the snitch etc.) Simplify the test.py code.
2022-09-26 15:17:50 +02:00
Kamil Braun
0725ab3a3e test.py: rename topology_for_class to get_cluster_factory
The previous name had nothing to do with what the function calculated
and returned (it returned a `create_cluster` function; the standard name
for a function that constructs objects would be 'factory', so
`get_cluster_factory` is an appropriate name for a function that returns
cluster factories).
2022-09-26 11:45:44 +02:00
Kamil Braun
06cc4f9259 test/pylib: ScyllaCluster: fix create_server parameter type
The only usage of `ScyllaCluster` constructor passed a `create_server`
function which expected a `List[str]` for the second parameter, while
the constructor specified that the function should expect an
`Optional[List[str]]`. There was no reason for the latter, we can easily
fix this type error.

Also give a type hint for `create_cluster` function in
`PythonTestSuite.topology_for_class`. This is actually what catched the
type error.
2022-09-26 11:45:44 +02:00
Petr Gusev
27e60ecbf4 raft server, log size limit in bytes
Before this patch we could get an OOM if we
received several big commands. The number of
commands was small, but their total size
in bytes was large.

snapshot_trailing_size is needed to guarantee
progress. Without this limit the fsm could
get stuck if the size of the next item is
greater than max_log_size - (size of trailing entries).
2022-09-26 13:10:10 +04:00
Benny Halevy
d32c497cd9 database: automatically take snapshot of base table views
The logic to reject explicit snapshot of views/indexes
was improved in aa127a2dbb.
However, we never implemented auto-snapshot of
view/indexes when taking a snapshot of the base table.

This is implemented in this patch.

The implementation is built on top of
ba42852b0e
so it would be hard to backport to 5.1 or earlier
releases.

Fixes #11612

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-09-26 11:02:54 +03:00
Benny Halevy
55b0b8fe2c api: storage_service: reject snapshot of views in api layer
Rather than pushing the check to
`snapshot_ctl::take_column_family_snapshot`, just check
that explcitly when taking a snapshot of a particular
table by name over the api.

Other paths that call snapshot_ctl::take_column_family_snapshot
are internal and use it to snap views already.

With that, we can get rid of the allow_view_snapshots flag
that was introduced in aab4cd850c.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-09-26 10:44:56 +03:00
Nadav Har'El
868a884b79 test/cql-pytest: add reproducer for ignored IS NOT NULL
This test reproduces issue #10365: It shows that although "IS NOT NULL" is
not allowed in regular SELECT filters, in a materialized view it is allowed,
even for non-key columns - but then outright ignored and does not actually
filter out anything - a fact which already surprised several users.

The test also fails on Cassandra - it also wrongly allows IS NOT NULL
on the non-key columns but then ignores this in the filter. So the test
is marked with both xfail (known to fail on Scylla) and cassandra_bug
(fails on Cassandra because of what we consider to be a Cassandra bug).

Refs #10365
Refs #11606

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #11615
2022-09-26 09:02:08 +03:00
Anna Stuchlik
c5285bcb14 doc: remove the section about updating OS packages during upgrade from upgrade guides for Ubunut and Debian (from 4.5 to 4.6)
Closes #11629
2022-09-26 08:04:02 +03:00
Avi Kivity
ad2f1dc704 Merge 'Avoid default initialization of token_metadata and topology when not needed' from Pavel Emelyanov
The goal is not to default initialize an object when its fields are about to be immediately overwritten by the consecutive code.

Closes #11619

* github.com:scylladb/scylladb:
  replication_strategy: Construct temp tokens in place
  topology: Define copy-sonctructor with init-lists
2022-09-25 18:08:42 +03:00
Jan Ciolek
ac152af88c expression: Add for_each_boolean factor
boolean_factors is a function that takes an expression
and extracts all children of the top level conjunction.

The problem is that it returns a vector<expression>,
which is inefficent.

Sometimes we would like to iterate over all boolean
factors without allocations. for_each_boolean_factor
is implemented for this purpose.

boolean_factors() can be implemented using
for_each_boolean_factor, so it's done to
reduce code duplication.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-09-25 16:34:22 +03:00
Avi Kivity
2a538f5543 Merge 'Cut one of snitch->gossiper links' from Pavel Emelyanov
Snitch uses gossiper for several reasons, one of is to re-gossip the topology-related app states when property-file snitch config changes. This set cuts this link by moving re-gossiping into the existing storage_service::snitch_reconfigured() subscription. Since initial snitch state gossiping happens in storage service as well, this change is not unreasonable.

Closes #11630

* github.com:scylladb/scylladb:
  storage_service: Re-gossiping snitch data in reconfiguration callback
  storage_service: Coroutinize snitch_reconfigured()
  storage_service: Indentation fix after previous patch
  storage_service: Reshard to shard-0 earlier
  storage_service: Refactor snitch reconfigured kick
2022-09-25 16:08:48 +03:00
Benny Halevy
a1adbf1f59 docs: adjust to sstable base name
Since 244df07771 (scylla 5.1),
only the sstable basename is kept in the large_* system tables.
The base path can be determined from the keyspace and
table name.

Fixes #11621

Adjust the examples in documentation respectively.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-09-25 14:38:13 +03:00
Benny Halevy
33924201cc docs: large-partition-table: adjust for additional rows column
Since a7511cf600 (scylla 5.0),
sstables containing partitions with too many rows are recorded in system.large_partitions.

Adjust the doc respectively.

Fixes #11622

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-09-25 14:38:13 +03:00
Benny Halevy
92ff17c6e3 docs: debugging-large-partition: update log warning example
The log warning format has changed since f3089bf3d1
and was fixed in the previous patch to include
a delimiter between the partition key, clustering key, and
column name.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-09-25 14:38:13 +03:00
Benny Halevy
fcbbc3eb9c db/large_data_handler: print static cell/collection description in log warning
When warning about a large cell/collection in a static row,
print that fact in the log warning to make it clearer.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-09-25 14:37:42 +03:00
Benny Halevy
4670829502 db/large_data_handler: separate pk and ck strings in log warning with delimiter
Currently (since f3089bf3d1),
when printing a warning to the log about large rows and/or cells
the clustering key string is concatenated to the partition key string,
rendering the warning confsing and much less useful.

This patch adds a '/' delimiter to separate the fields,
and also uses one to separate the clustering key from the column name
for large cells.  In case of a static cell, the clustering key is null
hence the warning will look like: `pk//column`.

This patch does NOT change anything in the large_* system
table schema or contents.  It changes only the log warning format
that need not be backward compatible.

Fixes #11620

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-09-25 14:36:41 +03:00
Pavel Emelyanov
47958a4b37 storage_service: Re-gossiping snitch data in reconfiguration callback
Nowadays it's done inside snitch, and snitch needs to carry gossiper
refernece for that. There's an ongoing effort in de-globalizing snitch
and fixing its dependencies. This patch cuts this snitch->gossiper link
to facilitate the mentioned effort.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-09-23 14:31:55 +03:00
Pavel Emelyanov
932566d448 storage_service: Coroutinize snitch_reconfigured()
Next patch will add more sleeping code to it and it's simpler if the new
call is co_await-ed rather than .then()-ed

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-09-23 14:31:55 +03:00
Pavel Emelyanov
7fee98cad0 storage_service: Indentation fix after previous patch
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-09-23 14:31:55 +03:00