By default, cluster tests have skip_wait_for_gossip_to_settle=0 and
ring_delay_ms=0. In tests with gossip topology, it may lead to a race,
where nodes see different state of each other.
In case of test_auth_v2_migration, there are three nodes. If the first
node already knows that the third node is NORMAL, and the second node
does not, the system_auth tables can return incomplete results.
To avoid such a race, this commit adds a check that all nodes see other
nodes as NORMAL before any writes are done.
Refs: #24163Closesscylladb/scylladb#24185
There is a difference how ScyllaDB and Cassandra handle conditional
batches with different IF statements (such as "IF EXISTS" and "IF NOT
EXISTS"). Cassandra tries to detect condition conflicts, and prints
an error instead of silently failing the batch, but in ScyllaDB
we considered this check to be inconsistent and unhelpful, and
decided not to implement it.
In this series, we extend the documentation of the ScyllaDB behaviour
by extending the documents and improving relevant LWT tests.
Fixes: https://github.com/scylladb/scylladb/issues/13011
Backport not needed, only docs and minor tests changes.
Closesscylladb/scylladb#24086
* github.com:scylladb/scylladb:
test: mark difference in handling IFs in LWT as scylla_only
docs: cql: add explicit explanation how mixing IFs works in LWT
docs: lwt: add two missing spaces
PythonTestSuite::recycle_cluster is a function that releases resources
of an old, dirty cluster to make it reusable. It closes log_file and
maintenance_socket_dir for running nodes in a dirty cluster, however it
doesn't do the same for stopped nodes. It leads to leakage of file
descriptors of stopped nodes, which in turn can lead to hitting ulimit
of open files (that is often 1024) if the leaking test is repeated with
`./test.py --repeat ...`. The problem was detected when tests from
`test/cluster/dtest/` directory were executed with high `repeat` value.
This commit extends `recycle_cluster` to close and cleanup logfile and
`socket_dir` for nodes that are stopped (because self.servers in
ScyllaCluster is ChainMap of self.running and self.stopped).
Closesscylladb/scylladb#24243
There is a difference how ScyllaDB and Cassandra handle conditional
batches with different IF statements (such as "IF EXISTS" and "IF NOT
EXISTS"). Cassandra tries to detect condition conflicts, and prints
an error instead of silently failing the batch, but in ScyllaDB
we considered this check to be inconsistent and unhelpful, and
decided not to implement it.
This commit:
- Make test_lwt_with_batch_conflict_1 scylla_only instead of xfail,
change the scenario to pass with the current implementation.
- Add test_lwt_with_batch_conflict_3 that shows how Cassandra fails
batch statement with different conditions, even when the conditions
are not contradictory.
- Add test_lwt_with_batch_conflict_4/5 that shows how static rows
are handled in conditional batches.
Fixes: #13011
There are few problems found in the dtest shim code after scylladb/scylladb#21580 was merged:
- The call of `init_default_config()` method was missed in scylladb/scylladb#21580. It is required to handle dtest options and markers.
- The implementation of dtest shim uses `server_id` to format a name of a node in a cluster. This is a difference in behavior with dtest. Some of dtests use code like `cluster.nodes()["node1"]` to get access to a node object.
- Default timeout was missed in `ScyllaNode.wait_until_stopped()` method. Set it to 600 for debug mode or to 127 otherwise.
Closesscylladb/scylladb#24225
* github.com:scylladb/scylladb:
test.py: dtest: set default wait_seconds based on build mode
test.py: dtest: name nodes in cluster using index starting from 1
test.py: dtest: initialize default config in dtest setup fixture
It seems that tests in test/boost/combined_tests have to define a test
suite name, otherwise they aren't picked up by test.py.
Fixes#24199Closesscylladb/scylladb#24200
This PR adds the possibility to gather coverage for the boost tests when they're executed with pytest. Since the pytest will be used as the main runner for boost tests as well, we need this before switching the runners.
This pull request adds support for creating custom indexes (at a metadata level) as long as a supported custom class is provided (currently only vector search).
The patch contains:
- a change in CREATE INDEX statement that allows for the USING keyword to be present as long as one of the supported classes is used
- support for describing custom indexes in the DESCRIBE statement
- unit tests
Co-authored by: @Balwancia
Closesscylladb/scylladb#23720
* github.com:scylladb/scylladb:
test/cqlpy: add custom index tests
index: support storing metadata for custom indices
The current implementation of dtest shim use `server_id` to format a
name of a node in a cluster. This is a difference in behavior with dtest.
Some of dtests use code like `cluster.nodes()["node1"]` to get access
to a node object. This commit changes it to be more consistent with
dtest.
Clean different output files from the boost and unit tests.
Move logs for boost test to the testlog directory instead of having additional directory pytest
Fix the issue introduced with scylladb/scylladb#22960. Suite log dir was changed, and the path for metrics DB was relying on it. As a result, DB is now located in the mode directory instead of the root of the testlog.
Move the run_process method to the resource gather instance, since we need to start monitor to check memory consumption in the cgroup. Since resource_gather needs test.py test object, and pytest has no clue about it, adding a simple namespace object to emulate such a test object. It needed only to gather some information regarding the test to be able to add records to the DB.
Since we have two facades that can share the same run process procedure, adding a common method to handle this to avoid code duplication.
Instead of finding dynamically the repo root directory relatively to the temp dir, that's in most cases in the repo, will fail if a non-default temp dir parameter is used. Additionally, to have the single source of truth of finding the repo root directory switching to the constants.
Add injecting environment variables to the process
Switch from print to propper logger
Set buffer size to 1 to avoid losing any data from the boost test if the test collapsed.
Currently, run process logs and return stdout and stderr, but boost tests are using stderr only. So stderr redirected to stdout. This helps with Jenkins as well, since we are reducing the number of files to store.
resource_gather.py needs test.py test object to work. It needs some information about the test to be able to write down this information to the DB with metrics. When running with pytest, there's no such test object, that's why adding make_test_object to mimic the test.py's test object.
Switching the getting the mode for constructing path to chgroup to test
instead of suite. They are the same, but this helps to have emulate less
in make_test_object method.
Use universalasync library to make test.py async code compatible
with synchronous code of dtest/ccm
Also, copied unmodified error_example_test.py from dtest as an example.
Run the test in `dev` mode only.
Add parameters to server_start() method to provide ability to
change Scylla' CLI and env options on a node start.
Also, add `expected_server_up_state` parameter as we have for
server_add() method.
Rework `ScyllaLogFile.wait_for()` method to make it easier
to add required methods to ScyllaNode class of ccm-like shim.
Also, added `ScyllaLogFile.grep_for_errors()` method and
reworked `ScyllaLogFile.grep()`
Currently, stream_manager is initialized after storage_service and
so it is stopped before the storage_service is. In its stop method
storage_service accesses stream_manager which is uninitialized
at a time.
Move stream_manager initialization over the storage_service initialization.
Fixes: #23207.
Closesscylladb/scylladb#24008
This patch fixes "test/cqlpy/run --release 2025.1" which fails as
follows on all tests with indexes or views:
Secondary indexes are not supported on base tables with tablets
test/cqlpy/run can run cqlpy (and alternator) tests on various official
releases of Scylla which it knows how to download. When running old
versions of Scylla, we need to change the configuration options to those
that were needed on specific versions.
On new versions of Scylla we need to pass
--experimental-features=views-with-tablets
to be able to test materialized views, but in older versions we need to
remove that parameter because it didn't exist. We incorrectly removed it
for any versions 2025.1 or earlier, but that's incorrect - it just needs
to be removed for versions strictly earlier than 2025.1 - it is needed
for 2025.1 (I tested it is indeed needed even in the earliers RCs).
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#24144
chunked_vector is a replacement for std::vector that avoids large contiguous
allocations.
In this series, we add some missing modifiers and improve quality-of-life for
chunked_vector users (the static_assert patch).
Those modifiers were generally unused since they have O(n) complexity
and therefore not useful for hot paths, but they are used in some
control plane code on vectors which we'd like to replace with chunked_vectors.
A candidate for such a replacement is token_range_vector (see #3335).
This is a prerequisite for fixing some minor stalls; I don't expect we'll backport
fixes to those stalls.
Closesscylladb/scylladb#24162
* github.com:scylladb/scylladb:
utils: chunked_vector: add swap() method
utils: chunked_vector: add range insert() overloads
utils: chunked_vector: relax static_assert
utils: chunked_vector: implement erase() for single elements and ranges
utils: chunked_vector: implement insert() for single-element inserts
Negative load sizes don't make sense, but we've seen a case in
production, where a negative number was returned by ScyllaDB REST API,
so be prepared to handle these too.
Fixes: scylladb/scylladb#24134Closesscylladb/scylladb#24135
Lots of code from this test can be reused in PR #23861. I'm splitting it now in this change so we can merge it cleanly as a separate patch.
Refs #23564Closesscylladb/scylladb#24105
* github.com:scylladb/scylladb:
Refactor out code from test_restore_with_streaming_scopes
Refactor out code from test_restore_with_streaming_scopes
Refactor out code from test_restore_with_streaming_scopes
Refactor out code from test_restore_with_streaming_scopes
Refactor out code from test_restore_with_streaming_scopes
The non-streaming loading of sstables performs cleanup since recently [1]. For vnodes, unfortunately, cleanup is almost unavoidable, because of the nature of vnodes sharding, even if sstable is already clean. This leads to waste of IO and CPU for nothing. Skipping the cleanup in a smart way is possible, but requires too many changes in the code and in the on-disk data. However, the effort will not help existing SSTables and it's going to be obsoleted by tablets some time soon.
Said that, the easiest way to skip cleanup is the explicit --skip-cleanup option for nodetool and respective skip_cleanup parameter for API handler.
New feature, no backport
fixes#24136
refs #12422 [1]
Closesscylladb/scylladb#24139
* github.com:scylladb/scylladb:
nodetool: Add refresh --skip-cleanup option
api: Introduce skip_cleanup query parameter
distributed_loader: Don't create owned ranges if skip-cleanup is true
code: Push bool skip_cleanup flag around
Inserts an iterator range at some position.
Again we insert the range at the end and use std::rotate() to
move the newly inserted elements into place, forgoing possible
optimizations.
Unit tests are added.
Implement using std::rotate() and resize(). The elements to be erased
are rotated to the end, then resized out of existence.
Again we defer optimization for trivially copyable types.
Unit tests are added.
Needed for range_streamer with token_ranges using chunked_vector.
partition_range_compat's unwrap() needs insert if we are to
use it for chunked_vector (which we do).
Implement using push_back() and std::rotate().
emplace(iterator, args) is also implemented, though the benefit
is diluted (it will be moved after construction).
The implementation isn't optimal - if T is trivially copyable
then using std::memmove() will be much faster that std::rotate(),
but this complex optimization is left for later.
Unit tests are added.
Added function returning custom index class name.
Added printing custom index class name when using DESCRIBE.
Changed validation to reflect current support of indices.
One of pytest parameters in test_long_query_timeout_erm.py was
a CQL query containing spaces and special chars such as '*', '(', ')',
'{', '}'. After upgrading to Fedora 42, the test started to
fail with the error "test.pylib.rest_client.HTTPError: HTTP error 404"
with uri=`http://...[SELECT * FROM {}-True-False].dev.1`.
To prevent from such errors, this commit changes the parameter to
a string without spaces and such special characters.
Fixes: scylladb/scylladb#24124Closesscylladb/scylladb#24130