Draining hints may occur in one of the two scenarios:
* a node leaves the cluster and the local node drains all of the hints
saved for that node,
* the local node is being decommissioned.
Draining may take some time and the hint manager won't stop until it
finishes. It's not a problem when decommissioning a node, especially
because we want the cluster to retain the data stored in the hints.
However, it may become a problem when the local node started draining
hints saved for another node and now it's being shut down.
There are two reasons for that:
* Generally, in situations like that, we'd like to be able to shut down
nodes as fast as possible. The data stored in the hints won't
disappear from the cluster yet since we can restart the local node.
* Draining hints may introduce flakiness in tests. Replaying hints doesn't
have the highest priority and it's reflected in the scheduling groups we
use as well as the explicitly enforced throughput. If there are a large
number of hints to be replayed, it might affect our tests.
It's already happened, see: scylladb/scylladb#21949.
To solve those problems, we change the semantics of draining. It will behave
as before when the local node is being decommissioned. However, when the
local node is only being stopped, we will immediately cancel all ongoing
draining processes and stop the hint manager. To amend for that, when we
start a node and it initializes a hint endpoint manager corresponding to
a node that's already left the cluster, we will begin the draining process
of that endpoint manager right away.
That should ensure all data is retained, while possibly speeding up
the shutdown process.
There's a small trade-off to it, though. If we stop a node, we can then
remove it. It won't have a chance to replay hints it might've before
these changes, but that's an edge case. We expect this commit to bring
more benefit than harm.
We also provide tests verifying that the implementation works as intended.
Fixesscylladb/scylladb#21949Closesscylladb/scylladb#22811
Before this patch we silently allowed and ignored PER PARTITION LIMIT.
While using aggregate functions in conjunction with PER PARTITION LIMIT
can make sense, we want to disable it until we can offer proper
implementation, see #9879 for discussion.
We want to match Cassandra, and for queries with aggregate functions it
behaves as follows:
- it silently ignores PER PARTITION LIMIT if GROUP BY is present, which
matches our previous implementation.
- rejects PER PARTITION LIMIT when GROUP BY is *not* present.
This patch adds rejection of the second group.
Fixes#9879Closesscylladb/scylladb#23086
In case invoke_on_all(tester::start) throws, the sharded<tester>
instance remains non-stopped and calltrace is reported on test stop. Not
nice, fix it so that sharded<> thing is stopped in any case.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#23244
seastar::at_exit() was marked deprecated recently. so let's use
the recommended approach to perform cleanups.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Several updates and improvements to the retryable HTTP client functionality, as well as enhancements to error handling and integration with AWS services, as part of this PR. Below is a summary of the changes:
- Moved the retryable HTTP client functionality out of the S3 client to improve modularity and reusability across other services like AWS STS.
- Isolated the retryable_http_client into its own file, improving clarity and maintainability.
- Added a make_request method that introduces a response-skipping handler.
- Introduced a custom error handler constructor, providing greater flexibility in handling errors.
- Updated the STS and Instance Metadata Service credentials providers to utilize the new retryable HTTP client, enhancing their robustness and reliability.
- Extended the AWS error list to handle errors specific to the STS service, ensuring more granular and accurate error management for STS operations.
- Enhanced error handling for system errors returned by Seastar’s HTTP client, ensuring smoother operations.
- Properly closed the HTTP client in instance_profile_credentials_provider and sts_assume_role_credentials_provider to prevent resource leaks.
- Reduced the log severity in the retry strategy to avoid SCT test failures that occur when any log message is tagged as an ERROR.
No backport needed since we dont have any s3 related activity on the scylla side been released
Closesscylladb/scylladb#21933
* github.com:scylladb/scylladb:
s3_client: Adjust Log Severity in Retry Strategy
aws_error: Enhance error handling for AWS HTTP client
aws_error: Add STS specific error handling
credentials_providers: Close retryable clients in Credentials Providers
credentials_providers: Integrate retryable_http_client with Credentials Providers
s3_client: enhance `retryable_http_client` functionality
s3_client: isolate `retryable_http_client`
s3_client: Prepare for `retryable_http_client` relocation
s3_client: Remove `is_redirect_status` function
s3_client: Move retryable functionality out of s3 client
Before this patch, the load balancer was equalizing tablet count per
shard, so it achieved balance assuming that:
1) tablets have the same size
2) shards have the same capacity
That can cause imbalance of utilization if shards have different
capacity, which can happen in heterogeneous clusters with different
instance types. One of the causes for capacity difference is that
larger instances run with fewer shards due to vCPUs being dedicated to
IRQ handling. This makes those shards have more disk capacity, and
more CPU power.
After this patch, the load balancer equalizes shard's storage
utilization, so it no longer assumes that shards have the same
capacity. It still assumes that each tablet has equal size. So it's a
middle step towards full size-aware balancing.
One consequence is that to be able to balance, the load balancer need
to know about every node's capacity, which is collected with the same
RPC which collects load_stats for average tablet size. This is not a
significant set back because migrations cannot proceed anyway if nodes
are down due to barriers. We could make intra-node migration
scheduling work without capacity information, but it's pointless due
to above, so not implemented.
Also, per-shard goal for tablet count is still the same for all nodes in the cluster,
so nodes with less capacity will be below limit and nodes with more capacity will
be slightly above limit. This shouldn't be a significant problem in practice, we could
compensate for this by increasing the limit.
Refs #23042Closesscylladb/scylladb#23079
* github.com:scylladb/scylladb:
tablets: Make load balancing capacity-aware
topology_coordinator: Fix confusing log message
topology_coordinator: Refresh load stats after adding a new node
topology_coordinator: Allow capacity stats to be refreshed with some nodes down
topology_coordinator: Refactor load status refreshing so that it can be triggered from multiple places
test: boost: tablets_test: Always provide capacity in load_stats
test: perf_load_balancing: Set node capacity
test: perf_load_balancing: Convert to topology_builder
config, disk_space_monitor: Allow overriding capacity via config
storage_service, tablets: Collect per-node capacity in load_stats
Send digest ack and ack2 by host ids as well now since the id->ip
mapping is available after receiving digest syn. It allows to convert
more code to host id here.
We want to move to use host ids as soon as possible. Currently it is
possible only after the full gossiper exchange (because only at this
point gossiper state is added and with it address map entry). To make it
possible to move to host ids earlier this patch adds address map entries
on incoming communication during CLIENT_ID verb processing. The patch
also adds generation to CLIENT_ID to use it when address map is updated.
It is done so that older gossiper entries can be overwritten with newer
mapping in case of IP change.
The partition range vector is an std::vector, which means it performs
contiguous allocations. Large allocations are known to cause problems
(e.g., reactor stalls).
For paged queries, limit the vector size to 1000. If more partition keys
are available in the query result, discard them. Ideally, we should not
be fetching them at all, but this is not possible without knowing the
size of each partition.
Currently, each vector element is 120 bytes and the standard allocator's
max preferred contiguous allocation is 128KiB. Therefore, the chosen
value of 1000 satisfies the constraint (128 KiB / 120 = 1092 > 1000).
This should be good enough for most cases. Since secondary index queries
involve one base table query per partition key, these queries are slow.
A higher limit would only make them slower and increase the probability
of a timeout. For the same reason, saving a follow-up paged request from
the client would not increase the efficiency much.
For unpaged queries, do not apply any limit. This means they remain
susceptible to stalls, but unpaged queries are considered unoptimized
anyway.
Finally, update the unit test reproducer since the bug is now fixed.
Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
- Seastar's HTTP client is known to throw exceptions for various reasons, including network errors, TLS errors and other transient issues.
- Update error handling to correctly capture and process all exceptions from Seastar's HTTP client.
- Previously, only aws_exception was handled, causing retryable errors to be missed and `should_retry` not invoked.
- Now, all exceptions trigger the appropriate retry logic per the intended strategy.
- Add tests for the S3 proxy to ensure robustness and reliability of these enhancements.
Extended existing Scylla Tools tests to cover the new functionality of
reading SSTables from S3. This ensures that the new S3 integration is
thoroughly tested and performs as expected.
Added utility functions to handle S3 Fully Qualified Names (FQN). These
functions enable parsing, splitting, and identification of S3 paths,
enhancing our ability to work with S3 object storage more effectively.
During development of #22428 we decided that we have
no need for `object-storage.yaml`, and we'd rather store
the endpoints in `scylla.yaml` and get a REST api to exopose
the endpoints for free.
This patch removes the credentials provider used to read the
aws keys from this yaml file.
Followup work will remove the `object-storage.yaml` file
altogether and move the endpoints to `scylla.yaml`.
Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>
Closesscylladb/scylladb#22951
Before this patch the load balancer was equalizing tablet count per
shard, so it achieved balance assuming that:
1) tablets have the same size
2) shards have the same capacity
That can cause imbalance of utilization if shards have different
capacity, which can happen in heterogenous clusters with different
instance types. One of the causes for capacity difference is that
larger instances run with fewer shards due to vCPUs being dedicated to
IRQ handling. This makes those shards have more disk capacity, and
more CPU power.
After this patch, the load balancer equalizes shard's storage
utilization, so it no longer assumes that shards have the same
capacity. It still assummes that each tablet has equal size. So it's a
middle step towards full size-aware balancing.
One consequence is that to be able to balance, the load balancer need
to know about every node's capacity, which is collected with the same
RPC which collects load_stats for average tablet size. This is not a
significant set back because migrations cannot proceed anyway if nodes
are down due to barriers. We could make intra-node migration
scheduling work without capacity information, but it's pointless due
to above, so not implemented.
With capacity-aware balancing, if we're missing capacity for a normal
node, we won't be able to proceed with tablet drain. Consider the
following scenario:
1. Nodes: A, B
2. refresh stats with A and B
3. Add node C
4. Node B goes down
5. removenode B starts
6. stats refreshing fails because B is down
If we don't have capacity stats for node C, load balancer cannot make
decisions and removenode is blocked indefinitely. A reproducer is
added in this patch.
To alleviate that, we allow capacity stats to be collected for nodes
which are reachable, we just don't update the table size part.
To keep table stats monotonic, we cache previous results per node, so
even if it's unreachable now, we use its last reported sizes. It's
still more accurate than not refreshing stats at all. A node can be
down for a long period, and other replicas can grow in size. It's not
perfect, because the stale node can skew the stats in its direction,
but ignoring it completely has its pitfalls too. Better solution is
left for later.
Move shared_load_stats to topology_builder.hh so that topology_builder
can maintain it. It will set capacity for all created nodes. Needed
after load balancer requires capacity to make decisions.
The test no longer worked becuase load balancer requires proper schema
in the database now. Convert to topology_builder which builds topology
in the database and create schema in the database (which needs proper
topology).
Intended for testing, or hot-fixing out-of-space issues in production.
Tablet load balancer uses this information for determining per-shard load
so reducing capacity will cause tablets to be migrated away from the node.
The scylla-sstable dump-* command suite has proven invaluable in many investigations. In certain cases however, I found that `dump-data` is quite cumbersome. An example would be trying to find certain values in an sstable, or trying to read the content of system tables when a node is down. For these cases, `dump-data` is very cumbersome: one has to trudge through tons of uninteresting metadata and do compaction in their heads. This PR introduces the new scylla-sstable query command, specifically targeted at situations like this: it allows executing queries on sstables, exposing to the user all the power of CQL, to tailor the output as they see fit.
Select everything from a table:
$ scylla sstable query --system-schema /path/to/data/system_schema/keyspaces-*/*-big-Data.db
keyspace_name | durable_writes | replication
-------------------------------+----------------+-------------------------------------------------------------------------------------
system_replicated_keys | true | ({class : org.apache.cassandra.locator.EverywhereStrategy})
system_auth | true | ({class : org.apache.cassandra.locator.SimpleStrategy}, {replication_factor : 1})
system_schema | true | ({class : org.apache.cassandra.locator.LocalStrategy})
system_distributed | true | ({class : org.apache.cassandra.locator.SimpleStrategy}, {replication_factor : 3})
system | true | ({class : org.apache.cassandra.locator.LocalStrategy})
ks | true | ({class : org.apache.cassandra.locator.NetworkTopologyStrategy}, {datacenter1 : 1})
system_traces | true | ({class : org.apache.cassandra.locator.SimpleStrategy}, {replication_factor : 2})
system_distributed_everywhere | true | ({class : org.apache.cassandra.locator.EverywhereStrategy})
Select everything from a single SSTable, use the JSON output (filtered through [jq](https://jqlang.github.io/jq/) for better readability):
$ scylla sstable query --system-schema --output-format=json /path/to/data/system_schema/keyspaces-*/me-3gm7_127s_3ndxs28xt4llzxwqz6-big-Data.db | jq
[
{
"keyspace_name": "system_schema",
"durable_writes": true,
"replication": {
"class": "org.apache.cassandra.locator.LocalStrategy"
}
},
{
"keyspace_name": "system",
"durable_writes": true,
"replication": {
"class": "org.apache.cassandra.locator.LocalStrategy"
}
}
]
Select a specific field in a specific partition using the command-line:
$ scylla sstable query --system-schema --query "select replication from scylla_sstable.keyspaces where keyspace_name='ks'" ./scylla-workdir/data/system_schema/keyspaces-*/*-Data.db
replication
-------------------------------------------------------------------------------------
({class : org.apache.cassandra.locator.NetworkTopologyStrategy}, {datacenter1 : 1})
Select a specific field in a specific partition using ``--query-file``:
$ echo "SELECT replication FROM scylla_sstable.keyspaces WHERE keyspace_name='ks';" > query.cql
$ scylla sstable query --system-schema --query-file=./query.cql ./scylla-workdir/data/system_schema/keyspaces-*/*-Data.db
replication
-------------------------------------------------------------------------------------
({class : org.apache.cassandra.locator.NetworkTopologyStrategy}, {datacenter1 : 1})
New functionality: no backport needed.
Closesscylladb/scylladb#22007
* github.com:scylladb/scylladb:
docs/operating-scylla: document scylla-sstable query
test/cqlpy/test_tools.py: add tests for scylla-sstable query
test/cqlpy/test_tools.py: make scylla_sstable() return table name also
scylla-sstable: introduce the query command
tools/utils: get_selected_operation(): use std::string for operation_options
utils/rjson: streaming_writer: add RawValue()
cql3/type_json: add to_json_type()
test/lib/cql_test_env: introduce do_with_cql_env_noreentrant_in_thread()
There are several API endpoints that walk a specific list of sstables and sum up their bytes_on_disk() values. All those endpoints accumulate a map of sstable names to their sizes, then squashe the maps together and, finally, sum up the map values to report it back. Maintaining these intermediate collections is the waste of CPU and memory, the usage values can be summed up instantly.
Also add a test for per-cf endpoints to validate the change, and generalize the helper functions while at it.
Closesscylladb/scylladb#23143
* github.com:scylladb/scylladb:
api: Generalize disk space counting for table and system
api: Use map_reduce_cf_raw() overload with table name
api: Don't collect sstables map to count disk space usage
test: Add unit test for total/live sstable sizes
There are two semaphores in table for synchronizing changes to sstable list:
sstable_set_mutation_sem: used to serialize two concurrent operations updating
the list, to prevent them from racing with each other.
sstable_deletion_sem: A deletion guard, used to serialize deletion and
iteration over the list, to prevent iteration from finding deleted files on
disk.
they're always taken in this order to avoid deadlocks:
sstable_set_mutation_sem -> sstable_deletion_sem.
problem:
A = tablet cleanup
B = take_snapshot()
1) A acquires sstable_set_mutation_sem for updating list
2) A acquires sstable_deletion_sem, then delete sstable before updating list
3) A releases sstable_deletion_sem, then yield
4) B acquires sstable_deletion_sem
5) B iterates through list and bumps sstable deleted in step 2
6) B fails since it cannot find the file on disk
Initial reaction is to say that no procedure must delete sstable before
updating the list, that's true.
But we want a iteration, running concurrently to cleanup, to not find sstables
being removed from the system. Otherwise, e.g. snapshot works with sstables
of a tablet that was just cleaned up. That's achieved by serializing iteration
with list update.
Since sstable_deletion_sem is used within the scope of deletion only, it's
useless for achieving this. Cleanup could acquire the deletion sem when
preparing list updates, and then pass the "permit" to deletion function, but
then sstable_deletion_sem would essentially become sstable_set_mutation_sem,
which was created exactly to protect the list update.
That being said, it makes sense to merge both semaphores. Also things become
easier to reason about, and we don't have to worry about deadlocks anymore.
The deletion goes through sstable_list_builder, which holds a permit throughout
its lifetime, which guarantees that list updates and deletion are atomic to
other concurrent operations. The interface becomes less error prone with that.
It allowed us to find discard_sstables() was doing deletion without any permit,
meaning another race could happen between truncate and snapshot.
So we're fixing race of (truncate|cleanup) with take_snapshot, as far as we
know. It's possible another unknown races are fixed as well.
Fixes#23049.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closesscylladb/scylladb#23117
A wrapper around a shared service allowing
safe plug and unplug of the service from its user
using a phased-barrier operation permit guarding
the service while in use.
Also add a unit test for this class.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Now that we support suite subfolders, there is no
need to create an own suite for object_store and auth_cluster, topology, topology_custom.
this PR merge all these folders into one: 'cluster"
this pr also introduce and apply 'prepare_3_nodes_cluster' fixture that allow preparing non-dirty 3 nodes cluster
that can be reused between tests(for tests that was in topology folder)
number of tests in master
release -3461
dev -3472
debug -3446
number of tests in this PR
release -3460
dev -3471
debug -3445
There is a minus one test in each mode because It was 2 test_topology_failure_recovery files(topology and topology_custom) with the same utility functions but different test cases. This PR merged them into one
Closesscylladb/scylladb#22917
* github.com:scylladb/scylladb:
test.py: merge object_store into cluster folder
test.py: merge auth_cluster into cluster folter
test.py: rename topology_custom folder to cluster
test.py: merge topology test suite into topology_custom
test.py delete conftest in topology_custom
test.py apply prepare_3_nodes_cluster in topology
test.py: introduce prepare_3_nodes_cluster marker
The pair of column_family/metrics/(total|live)_disk_space_used/{name}
reports the disk usage by sstables. The test creates table, populates,
flushes and checks that the size corresonds to what stat(2) reports for
the respective files.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Secondary index queries which fetch partitions from the base table can
cause large allocations that can lead to reactor stalls.
Reproduce this with a unit test that runs an indexed query on a table
with thousands of single-row partitions, and checks the memory stats for
any large contiguous allocations.
Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
For the limited voters feature to work properly we need to make sure that we are only managing the voter status through the topology coordinator. This means that we should not change the node votership from the storage_service module for the raft topology directly.
We can drop the voter status changes from the storage_service module because the topology coordinator will handle the votership changes eventually. The calls in the storage_service module were not essential and were only used for optimization (improving the HA under certain conditions).
Furthermore, the other bundled commit improves the reaction again by reacting to the node `on_up()` and `on_down()` events, which again shortens the reaction time and improves the HA.
The change has effect on the timing in the tablets migration test though, as it previously relied on the node being made non-voter from the service_storage `raft_removenode()` function. The fix is to add another server to the topology to make sure we will keep the quorum.
Previously the test worked because the test waits for an injection to be reached and it was ensured that the injection (log line) has only been triggered after the node has been made non-voter from the `raft_removenode()`. This is not the case anymore. An alternative fix would be to wait for the first node to be made non-voter before stopping the second server, but this would make the test more complex (and it is not strictly required to only use 4 servers in the test, it has been only done for optimization purposes).
Fixes: scylladb/scylladb#22860
Refs: scylladb/scylladb#18793
Refs: scylladb/scylladb#21969
No backport: Part of the limited voters new feature, so this shouldn't to be backported.
Closesscylladb/scylladb#22847
* https://github.com/scylladb/scylladb:
raft: use direct return of future for `run_op_with_retry`
raft: adjust the voters interface to allow atomic changes
raft topology: drop removing the node from raft config via storage_service
raft topology: drop changing the raft voters config via storage_service
In commit 2463e524ed, Scylla's default changed
from starting with one tablet per shard to starting 10 per shard. The
functional tests don't need more tablets and it can only slow down the
tests, so the patch added --tablets-initial-scale-factor=1 to test/*/suite.yaml
but forgot to add it to test/cqlpy/run.py (to affect test/cqlpy/run) so
this patch does this now.
This patch should *only* be about making tests faster, although to be
honest, I don't see any measurable improvement in test speed (10 isn't
so many). But, unfortunately, this is only part of the story. Over time
we allowed a few cqlpy tests to be written in a way that relies on having
only a small number of tablets or even exactly one tablet per shard (!).
These tests are buggy and should be fixed - see issues #23115 and #23116
as examples. But adding the option --tablets-initial-scale-factor=1 also
to run.py will make these bugs not affect test/cqlpy/run in the same way
as it doesn't affect test.py.
These buggy tests will still break with `pytest cqlpy` against a Scylla
you ran yourself manually, so eventually will still need to fix those
test bugs.
Refs #23115
Refs #23116Closesscylladb/scylladb#23125
The cqlpy test test_compaction.py::test_compactionstats_after_major_compaction
was written to assume we have just one tablet per shard - if there are many
tablets compaction splitting the data, the test scenario might not need
compaction in the way that the test assumes it does.
Recently (commit 2463e524ed) Scylla's default
was changed to have 10 tablets per shard - not one. This broke this test.
The same commit modified test/cqlpy/suite.yaml, but that affects only test.py
and not test/cqlpy/run, and also not manual runs against a manually-installed
Scylla. If this test absolutely requires a keyspace with 1 and not 10
tablets, then it should create one explicitly. So this is what this test
does (but only if tablets are in use; if vnodes are used that's fine
too).
Before this patch,
test/cqlpy/run test_compaction.py::test_compactionstats_after_major_compaction
fails. After the patch, it passes.
Fixes#23116Closesscylladb/scylladb#23121
The currently used versions of "wasmtime", "idna", "cap-std" and
"cap-primitives" packages had low to moderate security issues.
In this patch we update the dependencies to versions with these
issues fixed.
The update was performed by changing the "wasmtime" (and "wasmtime-wasi")
version in rust/wasmtime_bindings/Cargo.toml and updating rust/Cargo.lock
using the "cargo update" command with the affected package. To fix an
issue with different dependencies having different versions of
sub-dependencies, the package "smallvec" was also updated to "1.13.1".
After the dependency update, the Rust code also needed to be updated
because of the slightly changed API. One Wasm test case needed to be
updated, as it was actually using an incorrect Wat module and not
failing before. The crate also no longer allows multiple tables in
Wasm modules by default - it is now enabled by setting the "gc" crate
feature and configuring the Engine with config.wasm_reference_types(true).
Fixes https://github.com/scylladb/scylladb/issues/23127Closesscylladb/scylladb#23128
It is possible that the permit handed in to register_inactive_read() is already aborted (currently only possible if permit timed out). If the permit also happens to have wait for memory, the current code will attempt to call promise<>::set_exception() on the permit's promise to abort its waiters. But if the permit was already aborted via timeout, this promise will already have an exception and this will trigger an assert. Add a separate case for checking if the permit is aborted already. If so, treat it as immediate eviction: close the reader and clean up.
Fixes: scylladb/scylladb#22919
Bug is present in all live versions, backports are required.
Closesscylladb/scylladb#23044
* github.com:scylladb/scylladb:
reader_concurrency_semaphore: register_inactive_read(): handle aborted permit
test/boost/reader_concurrency_semaphore_test: move away from db::timeout_clock::now()
For the limited voters feature to work properly we need to make sure
that we are only managing the voter status through the topology
coordinator. This means that we should not change the node votership
from the storage_service module for the raft topology directly.
We can drop the voter status changes from the storage_service module
because the topology coordinator will handle the votership changes
eventually. The calls in the storage_service module were not essential
and were only used for optimization (improving the HA under certain
conditions).
This has effect on the timing in the tablets migration test though,
as it relied on the node being made non-voter from the service_storage
`raft_removenode()` function. The fix is to add another server to the
topology to make sure we will keep the quorum.
Previously the test worked because the test waits for an injection to be
reached and it was ensured that the injection (log line) has only been
triggered after the node has been made non-voter from the
`raft_removenode()`. This is not the case anymore. An alternative fix
would be to wait for the first node to be made non-voter before stopping
the second server, but this would make the test more complex (and it is
not strictly required to only use 4 servers in the test, it has been
only done for optimization purposes).
Fixes: scylladb/scylladb#22860
Refs: scylladb/scylladb#18793
Refs: scylladb/scylladb#21969
Previously, variables were marked as const, causing std::move() calls to
be redundant as reported by GCC warnings. This change either removes
const qualifiers or marks related lambdas as mutable, allowing the
compiler to properly utilize move constructors for better performance.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#23066
As a part of the moving to bare pytest we need to extract the required test
environment preparation steps into pytest's hooks/fixtures.
Do this for S3 mock stuff (MinioServer, MockS3Server, and S3ProxyServer)
and for directories with test artifacts.
For compatibility reason add --test-py-init CLI option for bare pytest
test runner: need to add it to pytest command if you need test.py
stuff in your tests (boost, topology, etc.)
Also, postpone initialization of TestSuite.artifacts and TestSuite.hosts
from import-time to runtime.
Closesscylladb/scylladb#23087
LIMIT and PER PARTITION LIMIT limit the number of rows returned or taken
into consideration by a query. It makes no logical sense to have this
value at less than 1. Cassandra also has this requirement.
This patch ensures that the limit value is strictly positive and adds
an explicit test for it - it was only tested in a test ported from
Cassandra, that is disabled due to other issues.
Closesscylladb/scylladb#23013
If hosts and/or dcs filters are specified for tablet repair and
some replicas match these filters, choose the replica that will
be the repair master according to round-robin principle
(currently it's always the first replica).
If hosts and/or dcs filters are specified for tablet repair and
no replica matches these filters, the repair succeeds and
the repair request is removed (currently an exception is thrown
and tablet repair scheduler reschedules the repair forever).
Fixes: https://github.com/scylladb/scylladb/issues/23100.
Needs backport to 2025.1 that introduces hosts and dcs filters for tablet repair
Closesscylladb/scylladb#23101
* github.com:scylladb/scylladb:
test: add new cases to tablet_repair tests
test: extract repiar check to function
locator: add round-robin selection of filtered replicas
locator: add tablet_task_info::selected_by_filters
service: finish repair successfully if no matching replica found