It's not a good usage as there is only one non-empty implementation.
Also we need to change it further in the following commit which
makes it incompatible with listener code.
There is already implicit logical dependency via migration_notifier
but in the next commits we'll be moving store_service out from it
as we need better control (i.e. return a value from the call).
- remove load_tablet_metadata(), instead we add wake_up_load_balancer flag
to update_tablet_metadata(), it reduces number of public functions and
also serves as a comment (removed comment with very similar meaning)
- reimplement the code to not use mutate_token_metadata(), this way
it's more readable and it's also needed as we'll split
update_tablet_metadata() in following commits so that we can have
subroutine which doesn't yield (for ensuring atomicity)
This is similar work as for drop_table in previous commit.
add_column_family_and_make_directory() behaves exactly the same
as before but calls to it in schema_applier will be replaced by
calls directly to split steps. Other usages will remain intact as
they don't need atomicity (like creating system tables at startup).
This is done so that actual dropping can be
an atomic step which could be composed with other
schema operations, and eventually all subsystems modified
via raft so that we could introduce atomic changes which
span across different subsystems.
We split drop_table_on_all_shards() into:
- prepare_tables_metadata_change_on_all_shards()
- prepare_drop_table_on_all_shards()
- drop_table()
- cleanup_drop_table_on_all_shards()
prepare_tables_metadata_change_on_all_shards() is necessary
because when applying multiple schema changes at once (e.g. drop
and add tables) we need to lock only once.
We add legacy_drop_table_on_all_shards() which
behaves exactly like old drop_table_on_all_shards() to be
compatible with code which doesn't need to play with atomicity.
Usages of legacy_drop_table_on_all_shards() in schema_applier
will be replaced with direct calls to split functions in the following
commits - that's the place we will take advantage of drop_table not
yielding (as it returns void now).
This will be the place for all atomic schema switching
operations.
Note that atomicity is observed only from single shard
point of view. All shards may switch at slightly different times
as global locking for this is not feasible.
Once we create types atomically the code which is before commit
may depend on newly added types, so it has to access both old and
new types. New storage called in_progress_types_storage was added.
- first phase is preemptive (prepare_update_keyspace)
- second phase is non-preemptive (update_keyspace)
This is done so that schema change can be applied atomically.
Aditionally create keyspace code was changed to share common
part with update keyspace flow.
This commit doesn't yet change the behaviour of the code,
as it doesn't guarantee atomicity, it will be done in following
commits.
Merging types code now returns generic affected_types structure which
is used both for notifications and dropping types. New static
function drop_types() replaces dropping lambda used before.
While I think it's not necessary for dropping nor notifications to
use per shard copies (like it's using before and after this patch)
it could just use string parameters or something similar but
this requires too many changes in other classes so it's out of scope
here.
In following commits we want to separate updating code from committing
shema change (making it visible). Since notifications should be issued
after change is visible we need to separate them and call after
committing.
In subsequent commits other notification types will be moved too.
We change here order of notification calls with regards to rest
of schema updating code. I.e. before keyspace notifications triggered
before tables were updated, after the change they will trigger once
everything is updated. There is no indication that notification
listeners depend on this behaviour.
This commit doesn't yet change how schema merging
works but it prepares the ground for it.
We split merging code into several functions.
Main reasons for it are that:
- We want to generalize and create some interface
which each subsystem would use.
- We need to pull mutation's apply() out
of the code because raft will call it directly,
and it will contain a mix of mutations from more
than one subsystem. This is needed because we have
the need to update multiple subsystems atomically
(e.g. auth and schema during auto-grant when creating
a table).
In this commit do_merge_schema() code is split between
prepare(), update(), commit(), post_commit(). The idea
behind each of these phases is described in the comments.
The last 2 phases are not yet implemented as it requires more
code changes but adding schema_applier enclosing class
will help to create some copied state in the future and
implement commit() and post_commit() phases.
The intention was to fail the REST API call in case --skip-cleanup is
requested for --load-and-stream loading. The corresponding if expression
is checking something else :( despite log message is correct.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#24208
By default, cluster tests have skip_wait_for_gossip_to_settle=0 and
ring_delay_ms=0. In tests with gossip topology, it may lead to a race,
where nodes see different state of each other.
In case of test_auth_v2_migration, there are three nodes. If the first
node already knows that the third node is NORMAL, and the second node
does not, the system_auth tables can return incomplete results.
To avoid such a race, this commit adds a check that all nodes see other
nodes as NORMAL before any writes are done.
Refs: #24163Closesscylladb/scylladb#24185
`read_checksum()` loads the checksum component from disk and stores a
non-owning reference in the shareable components. To avoid loading the
same component twice, the function has an early return statement.
However, this does not guarantee atomicity - two fibers or threads may
load the component and update the shareable components concurrently.
This can lead to use-after-free situations when accessing the component
through the shareable components, since the reference stored there is
non-owning. This can happen when multiple compaction tasks run on the
same SSTable (e.g., regular compaction and scrub-validate).
Fix this by not updating the reference in shareable components, if a
reference is already in place. Instead, create an owning reference to
the existing component for the current fiber. This is less efficient
than using a mutex, since the component may be loaded multiple times
from disk before noticing the race, but no locks are used for any other
SSTable component either. Also, this affects uncompressed SSTables,
which are not that common.
Fixes#23728.
Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
Closesscylladb/scylladb#23872
Fix for https://github.com/scylladb/scylladb/pull/24097
The stable branch does not contain the split API reference yet. This change fixes the 404 error raised when accessing the API reference on the stable branch due to the redirect.
Closesscylladb/scylladb#24259
* github.com:scylladb/scylladb:
docs: fix typo
docs: remove API reference redirect
There is a difference how ScyllaDB and Cassandra handle conditional
batches with different IF statements (such as "IF EXISTS" and "IF NOT
EXISTS"). Cassandra tries to detect condition conflicts, and prints
an error instead of silently failing the batch, but in ScyllaDB
we considered this check to be inconsistent and unhelpful, and
decided not to implement it.
In this series, we extend the documentation of the ScyllaDB behaviour
by extending the documents and improving relevant LWT tests.
Fixes: https://github.com/scylladb/scylladb/issues/13011
Backport not needed, only docs and minor tests changes.
Closesscylladb/scylladb#24086
* github.com:scylladb/scylladb:
test: mark difference in handling IFs in LWT as scylla_only
docs: cql: add explicit explanation how mixing IFs works in LWT
docs: lwt: add two missing spaces
PythonTestSuite::recycle_cluster is a function that releases resources
of an old, dirty cluster to make it reusable. It closes log_file and
maintenance_socket_dir for running nodes in a dirty cluster, however it
doesn't do the same for stopped nodes. It leads to leakage of file
descriptors of stopped nodes, which in turn can lead to hitting ulimit
of open files (that is often 1024) if the leaking test is repeated with
`./test.py --repeat ...`. The problem was detected when tests from
`test/cluster/dtest/` directory were executed with high `repeat` value.
This commit extends `recycle_cluster` to close and cleanup logfile and
`socket_dir` for nodes that are stopped (because self.servers in
ScyllaCluster is ChainMap of self.running and self.stopped).
Closesscylladb/scylladb#24243
The stable branch does not contain the split API reference yet.
This change fixes the 404 error raised when accessing the API reference on the stable branch.
There is a difference how ScyllaDB and Cassandra handle conditional
batches with different IF statements (such as "IF EXISTS" and "IF NOT
EXISTS"). Cassandra tries to detect condition conflicts, and prints
an error instead of silently failing the batch, but in ScyllaDB
we considered this check to be inconsistent and unhelpful, and
decided not to implement it.
This commit:
- Make test_lwt_with_batch_conflict_1 scylla_only instead of xfail,
change the scenario to pass with the current implementation.
- Add test_lwt_with_batch_conflict_3 that shows how Cassandra fails
batch statement with different conditions, even when the conditions
are not contradictory.
- Add test_lwt_with_batch_conflict_4/5 that shows how static rows
are handled in conditional batches.
Fixes: #13011
There is a difference how ScyllaDB and Cassandra handle conditional
batches with different IF statements (such as "IF EXISTS" and "IF NOT
EXISTS").
This commit explicitly documents the differences in the behavior.
Refs: #13011
There are few problems found in the dtest shim code after scylladb/scylladb#21580 was merged:
- The call of `init_default_config()` method was missed in scylladb/scylladb#21580. It is required to handle dtest options and markers.
- The implementation of dtest shim uses `server_id` to format a name of a node in a cluster. This is a difference in behavior with dtest. Some of dtests use code like `cluster.nodes()["node1"]` to get access to a node object.
- Default timeout was missed in `ScyllaNode.wait_until_stopped()` method. Set it to 600 for debug mode or to 127 otherwise.
Closesscylladb/scylladb#24225
* github.com:scylladb/scylladb:
test.py: dtest: set default wait_seconds based on build mode
test.py: dtest: name nodes in cluster using index starting from 1
test.py: dtest: initialize default config in dtest setup fixture
Today we send a reminder to PR's author when backport PRs has conflicts.
Often, PR authors wait for their PR to be reviewed/merged, but the merge is not happening because the PR now conflicts with master and so maintainers won't merge it. This can lead to a stall, where maintainers wait for the author to rebase and authors are waiting for merge.
In this PR we added the ability to notify the PR author as soon as base
branch moved forward and rebase is requried
Fixes: https://github.com/scylladb/scylla-pkg/issues/4955Closesscylladb/scylladb#24209
Since 5e1cf90a51
("build: replace tools/java submodule with packaged cassandra-stress")
we run pre-packaged cassandra-stress. As such, we don't need to look for
a Java runtime (which is missing on the frozen toolchain) and can
rely on the cassandra-stress package finding its own Java runtime.
Fix by just dropping all the Java-finding stuff.
Note: Java 11 is in fact present on the frozen toolchain, just
not in a way that pgo.py can find it.
Fixes#24176.
Closesscylladb/scylladb#24178
Blobs can be large, and unfragmented blobs can easily exceed 128k
(as seen in #23903). Rename get_blob() to get_blob_unfragmented()
to warn users.
Note that most uses are fine as the blobs are really short strings.
Closesscylladb/scylladb#24102
It seems that tests in test/boost/combined_tests have to define a test
suite name, otherwise they aren't picked up by test.py.
Fixes#24199Closesscylladb/scylladb#24200
This PR adds the possibility to gather coverage for the boost tests when they're executed with pytest. Since the pytest will be used as the main runner for boost tests as well, we need this before switching the runners.
Closesscylladb/scylladb#24236
* github.com:scylladb/scylladb:
test.py: add support for coverage for boost test
test.py: get the temp dir from facade
This PR adds the possibility to gather coverage for the boost tests when they're executed with pytest. Since the pytest will be used as the main runner for boost tests as well, we need this before switching the runners.
This pull request adds support for creating custom indexes (at a metadata level) as long as a supported custom class is provided (currently only vector search).
The patch contains:
- a change in CREATE INDEX statement that allows for the USING keyword to be present as long as one of the supported classes is used
- support for describing custom indexes in the DESCRIBE statement
- unit tests
Co-authored by: @Balwancia
Closesscylladb/scylladb#23720
* github.com:scylladb/scylladb:
test/cqlpy: add custom index tests
index: support storing metadata for custom indices
Move the run_process method to resource gather instance, since we need to start a monitor to check memory consumption in the cgroup. Pytest has concept of the test, but it is completely different from test.py. Resource gather instance take test instance to save and extract information about the test. Additional method emulating test.py test instance added not to rewrite the resource gather instance. Finally, combining all these changes to have ability to get metrics for test in both runners: test.py and pytest.
Closesscylladb/scylladb#24091
* github.com:scylladb/scylladb:
test.py: add missing parameter for boost tests for pytest runner
test.py: add support for boost_data_test_case in combined tests
test.py: clean log files after a successful run
test.py: attach output of the boost test to the report
test.py: fix metrics DB location
test.py: move run_process to resource_gather.py
test.py: unify using constant for finding repo root directory
test.py: refactor run_process in facade.py
test.py: add the possibility to create a test alike object
The current implementation of dtest shim use `server_id` to format a
name of a node in a cluster. This is a difference in behavior with dtest.
Some of dtests use code like `cluster.nodes()["node1"]` to get access
to a node object. This commit changes it to be more consistent with
dtest.