The local node's dc:rack pair is cached in the system keyspace on start. However, most other code doesn't need it, since it gets dc:rack from topology or directly from the snitch. A few places still use the system keyspace cache, but they are easy to patch. After this series, all the core code uses two sources of dc:rack (topology and snitch) instead of three.
Closes #15280
* github.com:scylladb/scylladb:
system_keyspace: Don't require snitch argument on start
system_keyspace: Don't cache local dc:rack pair
system_keyspace: Save local info with explicit location
storage_service: Get endpoint location from snitch, not system keyspace
snitch: Introduce and use get_location() method
repair: Local location variables instead of system keyspace's one
repair: Use full endpoint location instead of datacenter part
A reviewer noted that test_update_expression_list_append_non_list_arguments
has too much code duplication - the same long API call to run
"SET a = list_append(...)" was repeated many times.
So in this patch we add a short inner function "try_list_append" to
avoid this duplication.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes: #15298
Find progress of repair tasks based on the number of ranges
that have been repaired.
Fixes: [#1156](https://github.com/scylladb/scylla-enterprise/issues/1156).
Closes #14698
* github.com:scylladb/scylladb:
test: repair tasks test
repair: add methods making repair progress more precise
tasks: make progress related methods virtual
repair: add get_progress method to shard_repair_task_impl
repair: add const noexcept qualifiers to shard_repair_task_impl::ranges_size()
repair: log a name of a particular table repair is working on
tasks: delete move and copy constructors from task_manager::task::impl
Currently, the topology coordinator has the
`topology::transition_state::publish_cdc_generation` state responsible
for publishing the already created CDC generations to the user-facing
description tables. This process cannot fail as it would cause some CDC
updates to be missed. On the other hand, we would like to abort the
`publish_cdc_generation` state when bootstrap aborts. Of course, we
could also wait until handling this state finishes, even in the case of
the bootstrap abort, but that would be inefficient. We don't want to
unnecessarily block topology operations by publishing CDC generations.
The solution proposed by this PR is to remove the
`publish_cdc_generation` state completely and introduce a new background
fiber of the topology coordinator -- `cdc_generation_publisher` -- that
continually publishes committed CDC generations.
Apart from introducing the CDC generation publisher, we add
`test_cdc_generation_publishing.py` that verifies its correctness and we
adapt other CDC tests to the new changes.
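A minimal sketch of the publisher idea, written in Python with asyncio rather than the Seastar C++ used by the topology coordinator. The class name `CdcGenerationPublisher`, its methods, and the string generation ids are hypothetical. The sketch only illustrates that committed generations are queued and then published in commit order by a separate background task, so the main fiber never waits for publication.

```python
import asyncio
from collections import deque


class CdcGenerationPublisher:
    """Hypothetical stand-in for the background publisher fiber."""

    def __init__(self):
        self.unpublished = deque()  # committed but not yet published, oldest first
        self.published = []         # stands in for the user-facing description tables
        self._wakeup = asyncio.Event()

    def commit(self, gen_id):
        """Called by the main fiber: record the generation and return at once."""
        self.unpublished.append(gen_id)
        self._wakeup.set()

    async def run(self):
        """Background loop: publish the oldest unpublished generation first."""
        while True:
            while self.unpublished:
                gen_id = self.unpublished[0]
                await asyncio.sleep(0)  # placeholder for the actual table write
                self.published.append(gen_id)
                # Drop the entry only after publishing, so nothing is lost
                # if the loop is interrupted during the write.
                self.unpublished.popleft()
            self._wakeup.clear()
            await self._wakeup.wait()


async def demo():
    publisher = CdcGenerationPublisher()
    task = asyncio.create_task(publisher.run())
    publisher.commit("gen-1")
    publisher.commit("gen-2")
    while publisher.unpublished:  # wait until the background task catches up
        await asyncio.sleep(0)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return publisher.published
```

Running `asyncio.run(demo())` returns the generations in the order they were committed.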
Fixes #15194
Closes #15281
* github.com:scylladb/scylladb:
test: test_cdc: introduce wait_for_first_cdc_generation
test: move cdc_streams_check_and_repair check
test: add test_cdc_generation_publishing
docs: remove information about publish_cdc_generation
raft topology: introduce the CDC generation publisher
system_keyspace: load unpublished_cdc_generations to topology
raft topology: mark committed CDC generations as unpublished
raft topology: add unpublished_cdc_generations to system.topology
Add tests for gossiper/endpoint/live and gossiper/endpoint/down
which run only in release mode.
Enable test_remove_node_with_concurrent_ddl and fix the types and
variable names used by it, so that they can be reused in the gossiper
test.
Fixes: #15223.
Closes #15244
* github.com:scylladb/scylladb:
test: topology: add gossiper test
test: fix types and variable names in wait_for_host_down
After introducing the CDC generation publisher,
test_cdc_log_entries_use_cdc_streams could (at least in theory)
fail by accessing system_distributed.cdc_streams_descriptions_v2
before the first CDC generation has been published.
To avoid flakiness, we simply wait until the first CDC generation
is published in a new function -- wait_for_first_cdc_generation.
The part of test_topology_ops that tests the
cdc_streams_check_and_repair request could (at least in theory)
fail on
`assert(len(gen_timestamps) + 1 == len(new_gen_timestamps))`
after introducing the CDC generation publisher because we can
no longer assume that all previously committed CDC generations
have been published before sending the request.
To prevent flakiness, we move this part of the test to
test_cdc_generations_are_published. This test allows for ensuring
that all previous CDC generations have been published.
Additionally, checking cdc_streams_check_and_repair there is
simpler and arguably fits the test better.
We add two test cases that test the new CDC generation publisher
to detect potential bugs like incorrect order of publications or
not publishing some generations at all.
The purpose of the second test case --
test_multiple_unpublished_cdc_generations -- is to enforce and test
a scenario when there are multiple unpublished CDC generations at
the same time. We expect that this is a rare case. The main fiber
of the topology coordinator would have to make much more progress
(like finishing two bootstraps) than the CDC generation publisher
fiber. Since multiple unpublished CDC generations might never
appear in other tests but could be handled incorrectly, having
such a test is valuable.
Some tests use non-threaded do_with_cql_env() and wrap the inner lambda with seastar::async(). The cql env already provides a helper for that.
Closes #15305
* github.com:scylladb/scylladb:
cql_query_test: Fix indentation after previous patch
cql_query_test: Use do_with_cql_env_thread() explicitly
The Alternator tests can run against HTTPS - namely when using
test/alternator/run with the "--https" option (local Alternator
configured with HTTPS) or "--aws" option (DynamoDB, using HTTPS).
In some cases we make these HTTPS requests with verify=False, to avoid
checking the SSL certificates. E.g., this is necessary for Alternator
with a self-signed certificate. Unfortunately, the urllib3 library adds
an ugly warning message when SSL certificate verification is disabled.
In the past we tried to disable these warnings, using the documented
urllib3.disable_warnings() function, but it didn't help. It turns out
that pytest has its own warning handling, so to disable warnings in
pytest we must say so in a special configuration parameter in pytest.ini.
So in this patch, we drop the disable_warnings call from conftest.py
(where it didn't help), and instead put a similar declaration in
pytest.ini. The disable_warnings call in the test/alternator/run
script needs to remain - it is run outside pytest, so pytest.ini
doesn't affect it.
After this patch, running test/alternator/run with --https or --aws
finishes without warnings, as desired.
Fixes #15287
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #15292
Some tests use non-threaded do_with_cql_env() and wrap the inner lambda
with seastar::async(). The cql env already provides a helper for that.
Indentation is deliberately left broken until the next patch.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Passing the gate_closed_exception to the task promise
results in an abandoned exception, since no one is waiting
for it.
Instead, enter the gate when the task is made,
so that make_task fails if the gate is already closed.
Fixes scylladb/scylladb#15211
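The change in when the gate is entered can be illustrated with a small Python sketch. `Gate`, `Task` and `Module` below are simplified, hypothetical stand-ins for the Seastar gate and the task_manager classes, not their real interfaces. The point is only that the failure surfaces synchronously in `make_task`, where the caller can handle it.

```python
class GateClosedError(Exception):
    """Raised when entering a gate that has already been closed."""


class Gate:
    def __init__(self):
        self._closed = False
        self.holders = 0

    def enter(self):
        if self._closed:
            raise GateClosedError("gate closed")
        self.holders += 1

    def leave(self):
        self.holders -= 1

    def close(self):
        self._closed = True


class Task:
    def __init__(self, name, gate):
        self.name = name
        self._gate = gate

    def finish(self):
        # The task holds the gate for its whole lifetime.
        self._gate.leave()


class Module:
    def __init__(self):
        self.gate = Gate()

    def make_task(self, name):
        # Enter the gate at creation time: a closed gate makes this call
        # fail right here instead of leaving an exception nobody awaits.
        self.gate.enter()
        return Task(name, self.gate)

    def stop(self):
        self.gate.close()
```

After `Module.stop()`, any further `make_task()` call raises `GateClosedError` directly to its caller.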
In addition, this series adds a private abort_source for each task_manager module
(chained to the main task_manager::abort_source), and abort is requested on task_manager::module::stop().
Gate holding in compaction_manager is hardened, and the sstable_compaction_test cases
now make sure to stop compaction_manager and task_manager.
Closes #15213
* github.com:scylladb/scylladb:
compaction_manager: stop: close compaction_state:s gates
compaction_manager: gracefully handle gate close
task_manager: task: start: fixup indentation
task_manager: module: make_task: enter gate when the task is created
task_manaer: module: stop: request abort
task_manager: task::impl: subscribe to module about_source
test: compaction_manager_stop_and_drain_race_test: stop compaction and task managers
test: simple_backlog_controller_test: stop compaction and task managers
Improved the coverage of the tests for the list_append() function
in UpdateExpression - test that if one of its arguments is not a list,
including a missing attribute or item, it is reported as an error as
expected.
The new tests pass on both Alternator and DynamoDB.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes #15291
On boot system keyspace is kicked to insert local info into system.local
table. Among other things there's dc:rack pair which sys.ks. gets from
its cache which, in turn, should have been previously initialized from
snitch on sys.ks. start. This patch makes the local info updating method
get the dc:rack from caller via argument. Callers, in turn, call snitch
directly, because these are main and cql_test_env startup routines.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
"experimental" option was marked "Unused" in 64bc8d2f7d, but we
chose to keep it in the hope that the upgrade test would not fail.
despite that the upgrade tests per-se survived the "upgrade",
after the upgrade, the tests exercising the experimental features
are still failing hard. they have not been updated to set the
"experimental-features" option, and are still relying on
"experimental" to enable all the experimental features under
test.
so, in this change, let's just drop the option so that
scylla can fail early at seeing this "experimental" option.
this should help us to identify the tests relying on it
quicker. as the "experimental" features should only be used
in development environment, this change should have no impact
to production.
Refs #15214
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #15233
Since d42685d0cb, the storage service has an on-board query processor
pointer and can use it to join the cluster.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes #15236
Index caching was disabled by default because it caused performance regressions
for some small-partition workloads. See https://github.com/scylladb/scylladb/issues/11202.
However, it also means that there are workloads which could benefit from the
index cache, but (by default) don't.
As a compromise, we can set a default limit on the memory usage of index cache,
which should be small enough to avoid catastrophic regressions in
small-partition workloads, but big enough to accommodate workloads where
index cache is obviously beneficial.
This series adds such a configurable limit, sets it to 0.2 of total cache memory by default,
and re-enables index caching by default.
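As a rough illustration of how a fractional limit translates into bytes, the sketch below computes the cap from the `index_cache_fraction` option named in this series. The function names and the byte figures are hypothetical, and the actual eviction logic in cache_tracker is not shown.

```python
DEFAULT_INDEX_CACHE_FRACTION = 0.2  # default value introduced by this series


def index_cache_limit(total_cache_bytes, fraction=DEFAULT_INDEX_CACHE_FRACTION):
    """Return the maximum number of bytes the index cache may occupy."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be within [0, 1]")
    return int(total_cache_bytes * fraction)


def index_cache_over_limit(index_cache_bytes, total_cache_bytes,
                           fraction=DEFAULT_INDEX_CACHE_FRACTION):
    """True when the index cache exceeds its share of the total cache memory."""
    return index_cache_bytes > index_cache_limit(total_cache_bytes, fraction)
```

With the default fraction, an index cache holding 300 bytes out of a 1000-byte cache is over its limit, while one holding 100 bytes is not.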
Fixes #15118
Closes #14994
* github.com:scylladb/scylladb:
test: boost/cache_algorithm_test: add cache_algorithm_test
sstables: partition_index_cache: deglobalize stats
utils: cached_file: deglobalize cached_file metrics
db: config: enable index caching by default
config: add index_cache_fraction
utils: lru: add move semantics to list links
Move partition_index_cache stats from a thread_local variable
to cache_tracker. After the change, partition_index_cache
receives a reference to the stats via constructor, instead of
referencing a global.
This is needed so that cache_tracker can know the memory usage
of index caches (for cache eviction purposes) without relying on
globals.
But it also makes sense even without that motive.
Move cached_file metrics from a thread_local variable
to cache_tracker.
This is needed so that cache_tracker can know
the memory usage of index caches (for purposes
of cache eviction) without relying on globals.
But it also makes sense even without that motive.
This PR collects followups described in #14972:
- The `system.topology` table is now flushed every time feature-related
columns are modified. This is done because of the feature check that
happens before the schema commitlog is replayed.
- The implementation now guarantees that, if all nodes support some
feature as described by the `supported_features` column, then support
for that feature will not be revoked by any node. Previously, in an
edge case where a node is the last one to add support for some feature
`X` in `supported_features` column, crashes before applying/persisting
it and then restarts without supporting `X`, it would be allowed to boot
anyway and would revoke support for the `X` in `system.topology`.
The existing behavior, although counterintuitive, was safe - the
topology coordinator is responsible for explicitly marking features as
enabled, and in order to enable a feature it needs to perform a special
kind of a global barrier (`barrier_after_feature_update`) which only
succeeds after the node has updated its features column - so there is no
risk of enabling an unsupported feature. In order to make the behavior
less confusing, the node now will perform a second check when it tries
to update its `supported_features` column in `system.topology`.
- The `barrier_after_feature_update` is removed and the regular global
`barrier` topology command is used instead. The `barrier` handler now
performs a feature check if the node did not have a chance to verify and
update its cluster features for the second time.
JOIN_NODE rpc will be sent separately as it is a big item on its own.
Fixes: #14972
Closes #15168
* github.com:scylladb/scylladb:
test: topology{_experimental_raft}: don't stop gracefully in feature tests
storage_service: remove _topology_updated_with_local_metadata
topology_coordinator: remove barrier_after_feature_update
topology_coordinator: perform feature check during barrier
storage_service: repeat the feature check after read barrier
feature_service: introduce unsupported_feature_exception
feature_service: move startup feature check to a separate function
topology_coordinator: account for features to enable in should_preempt_balancing
group0_state_machine: flush system.topology when updating features columns
Scrub tests use a lot of temporary directories. This is suspected to
cause problems in some cases. To improve the situation, this patch:
* Creates a single root temporary directory for all scrub tests
* All further fixtures create their files/directories inside this root
dir.
* All scrub tests create their temporary directories within this root
dir.
* All temporary directories now use an appropriate "prefix", so we can
tell which temporary directory is part of the problem if a test fails.
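The directory layout described above can be sketched with the standard `tempfile` module. The helper names and prefixes below are hypothetical, and the real tests use pytest fixtures rather than these plain functions.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def scrub_root_dir():
    """Create one root directory for all scrub tests; remove it on exit."""
    with tempfile.TemporaryDirectory(prefix="scrub-tests-root-") as root:
        yield root


def make_scrub_dir(root, prefix):
    """Create a prefixed temporary directory inside the shared root."""
    return tempfile.mkdtemp(prefix=prefix, dir=root)
```

A test would call `make_scrub_dir(root, "scrub-sstables-")`, so a leftover directory name points back to the fixture or test that created it, and everything is removed together with the root.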
Refs: #14309
Closes #15117
The current cluster feature tests are stopping nodes in a graceful way.
Doing it gracefully isn't strictly necessary for the test scenarios and
we can switch `server_stop_gracefully` calls to `server_stop`. This only
became possible after a previous commit which causes the `system.topology`
table to be flushed when cluster feature columns are modified, and will
serve as a good test for it.
We want to stop supporting IPs for `--ignore-dead-nodes` in
`raft_removenode` and `--ignore-dead-nodes-for-replace` for
`raft_replace`. However, we shouldn't remove these features without the
deprecation period because the original `removenode` and `replace`
operations still support them. So, we add them for now.
The `IP -> Raft ID` translation is done through the new
`raft_address_map::find_by_addr` member function.
We update the documentation to inform about the deprecation of the IP
support for `--ignore-dead-nodes`.
Fixes #15126
Closes #15156
* github.com:scylladb/scylladb:
docs: inform about deprecating IP support for --ignore-dead-nodes
raft topology: support IPs for --ignore-dead-nodes
raft_address_map: introduce find_by_addr
The cql_test_env has a virtual require_column_has_value() helper that better fits the cql_assertions crowd. Also, the helper in question duplicates some existing code, so it can also be made shorter (and one class table helper gets removed afterwards).
Closes #15208
* github.com:scylladb/scylladb:
cql_assertions: Make permit from env
table: Remove find_partition_slow() helper
sstable_compaction_test: Do not re-decorate key
cql_test_env: Move .require_column_has_value
cql_test_env: Use table.find_row() shortcut
Motivation:
The user can bootstrap 3 different clusters and then connect them
(#14448). When these clusters start gossiping, their token rings will be
merged, but there will be 3 different group 0s in there. It results in a
corrupted cluster.
We need to prevent such situations from happening in clusters which
don't use Raft-based topology.
-------
Gossiper service sets its group0 id on startup if it is stored in
`scylla_local` or sets it during joining group0.
Send group0_id (if it is set) when the node tries to initiate the gossip
round. When a node gets gossip_digest_syn, it checks whether the sender's
group0 id equals the local one and, if not, the message is discarded.
Fixes #14448
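The receiving side of this check can be sketched as follows. `GossipDigestSyn` here is a reduced, hypothetical stand-in for the real message, and accepting a message when either side has no group0 id yet is an assumption made for the sketch; the text above only specifies that mismatching ids cause the message to be discarded.

```python
from dataclasses import dataclass
from typing import Optional
from uuid import UUID, uuid4


@dataclass
class GossipDigestSyn:
    """Reduced stand-in for the gossip SYN message."""
    cluster_name: str
    group0_id: Optional[UUID] = None  # only sent when the sender has one set


def should_process_syn(local_group0_id: Optional[UUID], syn: GossipDigestSyn) -> bool:
    """Return False when the message must be discarded."""
    # Assumption: if either side has no group0 id yet, there is nothing to compare.
    if local_group0_id is None or syn.group0_id is None:
        return True
    return local_group0_id == syn.group0_id
```

With this check, two nodes that belong to different group 0s never complete a gossip round with each other, so their token rings are not merged.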
Performed manual tests with the following scenario:
1. setup a cluster of two nodes (one compiled with and one without this patch)
2. setup a new node
3. create a basic keyspace and table
4. execute simple select and insert queries
Tested 4 scenarios: the seed node was with or without this patch, and the third node was with or without this patch.
These tests didn't detect any errors.
Closes #15004
* github.com:scylladb/scylladb:
tests: raft: cluster of nodes with different group0 ids
gossip: add group0_id attribute to gossip_digest_syn
To call table::find_row() one needs to provide a permit. Tests have
a short and neat helper to create one from cql_test_env.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The is_partition_dead() local helper accepts a partition key argument and
decorates it. However, its caller gets the partition key from the decorated key
itself, and can just pass it along.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The require_column_has_value() finds the cell in three steps -- finds
partition, then row, then cell. The class table already has a method to
facilitate row finding by partition and clustering key
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The main goal of this PR is to stop cdc_generation_service from calling
system_keyspace::bootstrap_complete(). The reason why it's there is that
the gen. service doesn't want to handle generations before the node has joined the
ring or after it was decommissioned. The cleanup is done with the help
of storage_service->cdc_generation_service explicit dependency brought
back and this, in turn, suddenly freed the raft and API code from the
need to carry cdc gen. service reference around.
Closes #15047
* github.com:scylladb/scylladb:
cdc: Remove bootstrap state assertion from after_join()
cdc: Rework gen. service check for bootstrap state
api: Don't carry cdc gen. service over
storage_service: Use local cdc gen. service in join_cluster()
storage_service: Remove cdc gen. service from raft_state_monitor_fiber()
raft: Do not carry cdc gen. service over
storage_service: Use local cdc gen. service in topo calls
storage_service: Bring cdc_generation_service dependency back
The reproducer for #14448.
The test starts two nodes with different group0_ids. The second node
is restarted and tries to join the cluster consisting of the first node.
gossip_digest_syn message should be rejected by the first node, so
the second node will not be able to join the cluster.
The test uses repair-based node operations to make the check easier.
If the second node successfully joins the cluster, the nodes' token metadata
will be merged and the repair service will allow decommissioning the second node.
If not, decommissioning the second node will fail with the exception
"zero replica after the removal" thrown by the repair service.
Gossiper service sets its group0 id on startup if it is stored in `scylla_local`
or sets it during joining group0.
Send group0_id (if it is set) when the node tries to initiate the gossip round.
When a node gets gossip_digest_syn, it checks whether the sender's group0 id equals
the local one and, if not, the message is discarded.
Fixes #14448.
Modeled after get_live_members_synchronized,
get_unreachable_members_synchronized calls
replicate_live_endpoints_on_change to synchronize
the state of unreachable_members on all shards.
Fixes #12261
Fixes #15088
Also, add a rest_api unit test for those APIs.
Closes #15093
* github.com:scylladb/scylladb:
test: rest_api: add test_gossiper
gossiper: add get_unreachable_members_synchronized
The method in question accepts a cdc_generation_service ref argument from
main and cql_test_env, but storage service now has a local cdc gen.
service reference, so this argument and its propagation down the stack
can be removed.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It sort of reverts the 5a97ba7121 commit, because storage service now
uses the cdc generation service to serve raft topo updates which, in
turn, takes the cdc gen. service all over the raft code _just_ to make
it as an argument to storage service topo calls.
Also there's API carrying cdc gen. service for the single call and also
there's an implicit need to kick cdc gen. service on decommission which
also needs storage service to reference cdc gen. after boot is complete
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Add a class that handles log file browsing with the following features:
* mark: returns "a mark" to the current position of the log.
* wait_for: asynchronously checks if the log contains the given message.
* grep: returns a list of lines matching the regular expression in the log.
Add a new endpoint in `ManagerClient` to obtain the scylla logfile path.
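A minimal sketch of a class with the three features listed above. The class name `LogBrowser`, the use of a byte offset as the mark, and the polling parameters are assumptions made for illustration; they are not taken from the class added by this patch.

```python
import asyncio
import os
import re


class LogBrowser:
    """Hypothetical helper for browsing a log file that is still being written."""

    def __init__(self, path):
        self.path = path

    def mark(self):
        """Return the current end of the log as a byte offset."""
        return os.path.getsize(self.path)

    def grep(self, pattern, from_mark=0):
        """Return the lines at or after from_mark that match the regex."""
        regex = re.compile(pattern)
        # Binary mode keeps seek() well defined for arbitrary byte offsets.
        with open(self.path, "rb") as f:
            f.seek(from_mark)
            lines = [raw.decode("utf-8", errors="replace").rstrip("\n") for raw in f]
        return [line for line in lines if regex.search(line)]

    async def wait_for(self, pattern, from_mark=0, timeout=5.0, period=0.05):
        """Poll the log until the pattern appears; raise TimeoutError otherwise."""
        loop = asyncio.get_running_loop()
        deadline = loop.time() + timeout
        while not self.grep(pattern, from_mark):
            if loop.time() >= deadline:
                raise TimeoutError(f"{pattern!r} not found in {self.path}")
            await asyncio.sleep(period)
```

A test would typically take a mark before triggering an operation and then pass it to `wait_for` or `grep`, so that messages logged earlier are ignored.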
Fixes #14782
Closes #14834
We want to stop supporting IPs for --ignore-dead-nodes in
raft_removenode and --ignore-dead-nodes-for-replace for
raft_replace. However, we shouldn't remove these features without
the deprecation period because the original removenode and
replace operations still support them. So, we add them for now.
Additionally, we modify test_raft_ignore_nodes.py so that it
verifies the added IP support.
In the following commit, we add IP support for --ignore-dead-nodes
in raft_removenode and raft_replace. To implement it, we need
a way to translate IPs to Raft IDs. The solution is to add a new
member function -- find_by_addr -- to raft_address_map that
does the IP->ID translation.
The IP support for --ignore-dead-nodes will be deprecated and
find_by_addr shouldn't be called for other reasons, so it always
logs a warning.
We also add some unit tests for find_by_addr.
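The lookup described above amounts to a reverse search over the ID-to-address map that warns on every call. The Python sketch below shows the idea; the class, the `add_or_update_entry` method and the warning text are hypothetical stand-ins for the C++ `raft_address_map`, not its real interface.

```python
import logging
from ipaddress import ip_address
from uuid import uuid4

logger = logging.getLogger("raft_address_map")


class RaftAddressMap:
    """Simplified stand-in: maps Raft IDs to IP addresses."""

    def __init__(self):
        self._addr_by_id = {}

    def add_or_update_entry(self, raft_id, addr):
        self._addr_by_id[raft_id] = ip_address(addr)

    def find_by_addr(self, addr):
        """Translate an IP to a Raft ID; return None if the IP is unknown."""
        # The IP-based path is deprecated, so every call logs a warning.
        logger.warning("Finding Raft nodes by IP address is deprecated: %s", addr)
        wanted = ip_address(addr)
        for raft_id, known in self._addr_by_id.items():
            if known == wanted:
                return raft_id
        return None
```

The linear scan is acceptable in this sketch because the lookup is only meant for the deprecated command-line path and not for hot code.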
It's possible that a compaction task is preempted after completion and
before reevaluation, causing pending_tasks to be > 1.
Let's only exit the loop if there are no pending tasks, and also
reduce the 100ms sleep, which is an eternity for this test.
Fixes #14809.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes #15059