The do_with() makes it at least a break-even, but there's some allocating
continuations that make it a win.
A variable named cmd had two different definitions (a value and a
lw_shared_ptr) that lived in different scopes. I renamed one to cmd1
to disambiguate. We should probably move that to the caller, but that
is not done here.
The do_with() makes it at least a break-even, but there's some allocating
continuations that make it a win.
A variable named cmd had two different definitions (a value and a
lw_shared_ptr) that lived in different scopes. I renamed one to cmd1
to disambiguate. We should probably move that to the caller, but that
is not done here.
A do_with() makes this at least a break-even.
Some internal lambdas were not converted since they commonly
do not allocate or block.
A finally() continuation is converted to seastar::defer().
The do_with means the coroutine conversion is free, and conversion
of parallel_for_each to coroutine::parallel_for_each saves a
possible allocation (though it would not have been allocated usually.
An inner continuation is not converted since it usually doesn't
block, and therefore doesn't allocate.
Reduce the false dependencies on db/large_data_handler.hh by
not including it from commonly used header files, and rather including
it only in the source files that actually need it.
The is in preparation for https://github.com/scylladb/scylladb/issues/11449Closes#11654
* github.com:scylladb/scylladb:
test: lib: do not include db/large_data_handler.hh in test_service.hh
test: lib: move sstable test_env::impl ctor out of line
sstables: do not include db/large_data_handler.hh in sstables.hh
api/column_family: add include db/system_keyspace.hh
The generator was first setting the marker then applied tombstones.
The marker was set like this:
row.marker() = random_row_marker();
Later, when shadowable tombstones were applied, they were compacted
with the marker as expected.
However, the key for the row was chosen randomly in each iteration and
there are multiple keys set, so there was a possibility of a key clash
with an earlier row. This could override the marker without applying
any tombstones, which is conditional on random choice.
This could generate rows with markers uncompacted with shadowable tombstones.
This broken row_cache_test::test_concurrent_reads_and_eviction on
comparison between expected and read mutations. The latter was
compacted because it went through an extra merge path, which compacts
the row.
Fix by making sure there are no key clashes.
Closes#11663
The `server_remove` function did a very weird thing: it shut down a
server and made the framework 'forget' about it. From the point of view
of the Scylla cluster and the driver the server was still there.
Replace the function's body with `raise NotImplementedError`. In the
future it can be replaced with an implementation that calls
`removenode` on the Scylla cluster.
Remove `test_remove_server_add_column` from `test_topology`. It
effectively does the same thing as `test_stop_server_add_column`, except
that the framework also 'forgets' about the stopped server. This could
lead to weird situations because the forgotten server's IP could be
reused in another test that was running concurrently with this test.
Closes#11657
It was needed for defining and referencing nop_lp_handler
and in sstable_3_x_test for testing the large_data_handler.
Remove the include from the commonly used header file
to reduce the false dependencies on large_data_handler.hh
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
The logic to reject explicit snapshot of views/indexes was improved in aa127a2dbb. However, we never implemented auto-snapshot of
view/indexes when taking a snapshot of the base table.
This is implemented in this patch.
The implementation is built on top of
ba42852b0e
so it would be hard to backport to 5.1 or earlier
releases.
Fixes#11612
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes#11616
* github.com:scylladb/scylladb:
database: automatically take snapshot of base table views
api: storage_service: reject snapshot of views in api layer
For db::system_keyspace::load_view_build_progress that currently
indirectly satisfied via sstables/sstables.hh ->
db/large_data_handler.hh
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Due to issue #11567, Alternator do not yet support adding a GSI to an
existing table via UpdateTable with the GlobalSecondaryIndexUpdates
parameter.
However, currently, we print a misleading error message in this case,
complaining about the AttributeDefinitions parameter. This parameter
is also required with GlobalSecondaryIndexUpdates, but it's not the
main problem, and the user is likely to be confused why the error message
points to that specific paramter and what it means that this parameter
is claimed to be "not supported" (while it is supported, in CreateTable).
With this patch, we report that GlobalSecondaryIndexUpdates is not
supported.
This patch does not fix the unsupported feature - it just improves
the error message saying that it's not supported.
Refs #11567
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes#11650
When walking through the ranges, we should yield to prevent stalls. We
do similar yield in other node operations.
Fix a stall in 5.1.dev.20220724.f46b207472a3 with build-id
d947aaccafa94647f71c1c79326eb88840c5b6d2
```
!INFO | scylla[6551]: Reactor stalled for 10 ms on shard 0. Backtrace:
0x4bbb9d2 0x4bba630 0x4bbb8e0 0x7fd365262a1f 0x2face49 0x2f5caff
0x36ca29f 0x36c89c3 0x4e3a0e1
````
Fixes#11146Closes#11160
Extend the cql3 truncate statement to accept attributes,
similar to modification statements.
To achieve that we define cql3::statements::raw::truncate_statement
derived from raw::cf_statement, and implement its pure virtual
prepare() method to make a prepared truncate_statement.
The latter is no longer derived from raw::cf_statement,
and just stores a schema_ptr to get to the keyspace and column_family.
`test_truncate_using_timeout` cql-pytest was added to test
the new USING TIMEOUT feature.
Fixes#11408
Also, update docs/cql/ddl.rst truncate-statement section respectively.
Closes#11409
* github.com:scylladb/scylladb:
docs: cql-extensions: add TRUNCATE to USING TIMEOUT section.
docs: cql: ddl: add support for TRUNCATE USING TIMEOUT
cql3, storage_proxy: add support for TRUNCATE USING TIMEOUT
cql3: selectStatement: restrict to USING TIMEOUT in grammar
cql3: deleteStatement: restrict to USING TIMEOUT|TIMESTAMP in grammar
The series contains fixes for system.large_* log warning and respective documentation.
This prepares the way for adding a new system.large_collections table (See #11449):
Fixes#11620Fixes#11621Fixes#11622
the respective fixes should be backported to different release branches, based on the respective patches they depend on (mentioned in each issue).
Closes#11623
* github.com:scylladb/scylladb:
docs: adjust to sstable base name
docs: large-partition-table: adjust for additional rows column
docs: debugging-large-partition: update log warning example
db/large_data_handler: print static cell/collection description in log warning
db/large_data_handler: separate pk and ck strings in log warning with delimiter
Fix the type of `create_server`, rename `topology_for_class` to `get_cluster_factory`, simplify the suite definitions and parameters passed to `get_cluster_factory`
Closes#11590
* github.com:scylladb/scylladb:
test.py: replace `topology` with `cluster_size` in Topology tests
test.py: rename `topology_for_class` to `get_cluster_factory`
test/pylib: ScyllaCluster: fix create_server parameter type
The test was disabled due to a bug in the Python driver which caused the
driver not to reconnect after a node was restarted (see
scylladb/python-driver#170).
Introduce a workaround for that bug: we simply create a new driver
session after restarting the nodes. Reenable the test.
Closes#11641
Extended the queries language to support bind variables which are bound in the
execution stage, before creating a raft command.
Adjusted `test_broadcast_tables.py` to prepare statements at the beginning of the test.
Fixed a small bug in `strongly_consistent_modification_statement::check_access`.
Closes#11525
Before this patch we could get an OOM if we
received several big commands. The number of
commands was small, but their total size
in bytes was large.
snapshot_trailing_size is needed to guarantee
progress. Without this limit the fsm could
get stuck if the size of the next item is greater than
max_log_size - (size of trailing entries).
Closes#11397
* github.com:scylladb/scylladb:
raft replication_test, make backpressure test to do actual backpressure
raft server, shrink_to_fit on log truncation
raft server, release memory if add_entry throws
raft server, log size limit in bytes
When there are errors starting the first cluster(s) the logs of the server logs are needed. So move `.start()` to the `try` block in `test.py` (out of `asynccontextmanager`).
While there, make `ScyllaClusterManager.start()` idempotent.
Closes#11594
* github.com:scylladb/scylladb:
test.py: fix ScyllaClusterManager start/stop
test.py: fix topology init error handling
We don't want to keep memory we don't use, shrink_to_fit guarantees that.
In fact, boost::deque frees up memory when items are deleted, so this change has little effect at the moment, but it may pay off if we change the container in the future.
List the queries that support the TIMEOUT parameter.
Mention the newly added support for TRUNCATE
USING TIMEOUT.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Extend the cql3 truncate statement to accept attributes,
similar to modification statements.
To achieve that we define cql3::statements::raw::truncate_statement
derived from raw::cf_statement, and implement its pure virtual
prepare() method to make a prepared truncate_statement.
The latter, statements::truncate_statement, is no longer derived
from raw::cf_statement, and just stores a schema_ptr to get to the
keyspace and column_family names.
`test_truncate_using_timeout` cql-pytest was added to test
the new USING TIMEOUT feature.
Fixes#11408
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
It is preferred to reject USING TLL / TIMESTAMP at the grammar
level rather than functionally validating the USING attributes.
test_using_timeout was adjusted respectively to expect the
`SyntaxException` error rather than `InvalidRequest`.
Note that cql3::statements::raw::select_statement validate_attrs
now asserts that the ttl or the timestamp attributes aren't set.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
It is preferred to reject USING TLL / TIMESTAMP at the grammar
level rather than functionally validating the USING attributes.
test_using_timeout was adjusted respectively to expect the
`SyntaxException` error rather than `InvalidRequest`.
Note that now delete_statement ctor asserts that the ttl
attribute is not set.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
First, a reminder of a few basic concepts in Scylla:
- "topology" is a mapping: for each node, its DC and Rack.
- "replication strategy" is a method of calculating replica sets in
a cluster. It is not a cluster-global property; each keyspace can have
a different replication strategy. A cluster may have multiple
keyspaces.
- "cluster size" is the number of nodes in a cluster.
Replication strategy is orthogonal to topology. Cluster size can be
derived from topology and is also orthogonal to replication strategy.
test.py was confusing the three concepts together. For some reason,
Topology suites were specifying a "topology" parameter which contained
replication strategy details - having nothing to do with topology. Also
it's unclear why a test suite would specify anything to do with
replication strategies - after all, a test may create keyspaces with
different replication strategies, and a suite may contain multiple
different tests.
Get rid of the "topology" parameter, replace it with a simple
"cluster_size". In the future we may re-introduce it when we actually
implement the possibility to start clusters with custom topologies
(which involves configuring the snitch etc.) Simplify the test.py code.
The previous name had nothing to do with what the function calculated
and returned (it returned a `create_cluster` function; the standard name
for a function that constructs objects would be 'factory', so
`get_cluster_factory` is an appropriate name for a function that returns
cluster factories).
The only usage of `ScyllaCluster` constructor passed a `create_server`
function which expected a `List[str]` for the second parameter, while
the constructor specified that the function should expect an
`Optional[List[str]]`. There was no reason for the latter, we can easily
fix this type error.
Also give a type hint for `create_cluster` function in
`PythonTestSuite.topology_for_class`. This is actually what catched the
type error.
Before this patch we could get an OOM if we
received several big commands. The number of
commands was small, but their total size
in bytes was large.
snapshot_trailing_size is needed to guarantee
progress. Without this limit the fsm could
get stuck if the size of the next item is
greater than max_log_size - (size of trailing entries).
The logic to reject explicit snapshot of views/indexes
was improved in aa127a2dbb.
However, we never implemented auto-snapshot of
view/indexes when taking a snapshot of the base table.
This is implemented in this patch.
The implementation is built on top of
ba42852b0e
so it would be hard to backport to 5.1 or earlier
releases.
Fixes#11612
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Rather than pushing the check to
`snapshot_ctl::take_column_family_snapshot`, just check
that explcitly when taking a snapshot of a particular
table by name over the api.
Other paths that call snapshot_ctl::take_column_family_snapshot
are internal and use it to snap views already.
With that, we can get rid of the allow_view_snapshots flag
that was introduced in aab4cd850c.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
This test reproduces issue #10365: It shows that although "IS NOT NULL" is
not allowed in regular SELECT filters, in a materialized view it is allowed,
even for non-key columns - but then outright ignored and does not actually
filter out anything - a fact which already surprised several users.
The test also fails on Cassandra - it also wrongly allows IS NOT NULL
on the non-key columns but then ignores this in the filter. So the test
is marked with both xfail (known to fail on Scylla) and cassandra_bug
(fails on Cassandra because of what we consider to be a Cassandra bug).
Refs #10365
Refs #11606
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes#11615
The goal is not to default initialize an object when its fields are about to be immediately overwritten by the consecutive code.
Closes#11619
* github.com:scylladb/scylladb:
replication_strategy: Construct temp tokens in place
topology: Define copy-sonctructor with init-lists
boolean_factors is a function that takes an expression
and extracts all children of the top level conjunction.
The problem is that it returns a vector<expression>,
which is inefficent.
Sometimes we would like to iterate over all boolean
factors without allocations. for_each_boolean_factor
is implemented for this purpose.
boolean_factors() can be implemented using
for_each_boolean_factor, so it's done to
reduce code duplication.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Snitch uses gossiper for several reasons, one of is to re-gossip the topology-related app states when property-file snitch config changes. This set cuts this link by moving re-gossiping into the existing storage_service::snitch_reconfigured() subscription. Since initial snitch state gossiping happens in storage service as well, this change is not unreasonable.
Closes#11630
* github.com:scylladb/scylladb:
storage_service: Re-gossiping snitch data in reconfiguration callback
storage_service: Coroutinize snitch_reconfigured()
storage_service: Indentation fix after previous patch
storage_service: Reshard to shard-0 earlier
storage_service: Refactor snitch reconfigured kick
Since 244df07771 (scylla 5.1),
only the sstable basename is kept in the large_* system tables.
The base path can be determined from the keyspace and
table name.
Fixes#11621
Adjust the examples in documentation respectively.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Since a7511cf600 (scylla 5.0),
sstables containing partitions with too many rows are recorded in system.large_partitions.
Adjust the doc respectively.
Fixes#11622
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
The log warning format has changed since f3089bf3d1
and was fixed in the previous patch to include
a delimiter between the partition key, clustering key, and
column name.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
When warning about a large cell/collection in a static row,
print that fact in the log warning to make it clearer.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Currently (since f3089bf3d1),
when printing a warning to the log about large rows and/or cells
the clustering key string is concatenated to the partition key string,
rendering the warning confsing and much less useful.
This patch adds a '/' delimiter to separate the fields,
and also uses one to separate the clustering key from the column name
for large cells. In case of a static cell, the clustering key is null
hence the warning will look like: `pk//column`.
This patch does NOT change anything in the large_* system
table schema or contents. It changes only the log warning format
that need not be backward compatible.
Fixes#11620
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Nowadays it's done inside snitch, and snitch needs to carry gossiper
refernece for that. There's an ongoing effort in de-globalizing snitch
and fixing its dependencies. This patch cuts this snitch->gossiper link
to facilitate the mentioned effort.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Next patch will add more sleeping code to it and it's simpler if the new
call is co_await-ed rather than .then()-ed
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>