scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-03 13:37:04 +00:00

Author	SHA1	Message	Date
Dawid Mędrek	0a6137218a	db/hints: Cancel draining when stopping node Draining hints may occur in one of the two scenarios: * a node leaves the cluster and the local node drains all of the hints saved for that node, * the local node is being decommissioned. Draining may take some time and the hint manager won't stop until it finishes. It's not a problem when decommissioning a node, especially because we want the cluster to retain the data stored in the hints. However, it may become a problem when the local node started draining hints saved for another node and now it's being shut down. There are two reasons for that: * Generally, in situations like that, we'd like to be able to shut down nodes as fast as possible. The data stored in the hints won't disappear from the cluster yet since we can restart the local node. * Draining hints may introduce flakiness in tests. Replaying hints doesn't have the highest priority and it's reflected in the scheduling groups we use as well as the explicitly enforced throughput. If there are a large number of hints to be replayed, it might affect our tests. It's already happened, see: scylladb/scylladb#21949. To solve those problems, we change the semantics of draining. It will behave as before when the local node is being decommissioned. However, when the local node is only being stopped, we will immediately cancel all ongoing draining processes and stop the hint manager. To amend for that, when we start a node and it initializes a hint endpoint manager corresponding to a node that's already left the cluster, we will begin the draining process of that endpoint manager right away. That should ensure all data is retained, while possibly speeding up the shutdown process. There's a small trade-off to it, though. If we stop a node, we can then remove it. It won't have a chance to replay hints it might've before these changes, but that's an edge case. We expect this commit to bring more benefit than harm. 
We also provide tests verifying that the implementation works as intended. Fixes scylladb/scylladb#21949 Closes scylladb/scylladb#22811	2025-03-13 11:55:15 +02:00
Paweł Zakrzewski	d483051e44	cql3/select_statement: reject aggregate functions when PER PARTITION LIMIT is present Before this patch we silently allowed and ignored PER PARTITION LIMIT. While using aggregate functions in conjunction with PER PARTITION LIMIT can make sense, we want to disable it until we can offer proper implementation, see #9879 for discussion. We want to match Cassandra, and for queries with aggregate functions it behaves as follows: - it silently ignores PER PARTITION LIMIT if GROUP BY is present, which matches our previous implementation. - rejects PER PARTITION LIMIT when GROUP BY is not present. This patch adds rejection of the second group. Fixes #9879 Closes scylladb/scylladb#23086	2025-03-13 10:29:53 +02:00
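The decision table described in this commit can be sketched as follows. This is only an illustration of the stated rules, not the actual cql3 code; the function name `classify_select` and its return strings are hypothetical.

```python
def classify_select(has_aggregate: bool, has_group_by: bool,
                    has_per_partition_limit: bool) -> str:
    """Return how a SELECT is treated under the rules in the commit message.

    Hypothetical helper: the names and return values are illustrative only.
    """
    if has_aggregate and has_per_partition_limit:
        if has_group_by:
            # Matches the previous behavior: the limit is silently ignored.
            return "ignore_limit"
        # New in this patch: aggregate + PER PARTITION LIMIT, no GROUP BY.
        return "reject"
    # All other queries are unaffected by this patch.
    return "ok"
```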
Pavel Emelyanov	f50bcbf4d0	test/perf/s3: Don't forget to stop sharded<tester> on error In case invoke_on_all(tester::start) throws, the sharded<tester> instance remains non-stopped and calltrace is reported on test stop. Not nice, fix it so that sharded<> thing is stopped in any case. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#23244	2025-03-13 09:54:09 +02:00
Kefu Chai	68fc067106	perf/perf_sstable: fix the indent Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2025-03-12 19:00:50 +08:00
Kefu Chai	4f62f79622	perf/perf_sstable: stop using at_exit() seastar::at_exit() was marked deprecated recently, so let's use the recommended approach to perform cleanups. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2025-03-12 19:00:50 +08:00
Nadav Har'El	3ca2e6ddda	Merge 's3_client: Add retries to Security Token Service/EC2 instance metadata credentials providers' from Ernest Zaslavsky Several updates and improvements to the retryable HTTP client functionality, as well as enhancements to error handling and integration with AWS services, as part of this PR. Below is a summary of the changes: - Moved the retryable HTTP client functionality out of the S3 client to improve modularity and reusability across other services like AWS STS. - Isolated the retryable_http_client into its own file, improving clarity and maintainability. - Added a make_request method that introduces a response-skipping handler. - Introduced a custom error handler constructor, providing greater flexibility in handling errors. - Updated the STS and Instance Metadata Service credentials providers to utilize the new retryable HTTP client, enhancing their robustness and reliability. - Extended the AWS error list to handle errors specific to the STS service, ensuring more granular and accurate error management for STS operations. - Enhanced error handling for system errors returned by Seastar’s HTTP client, ensuring smoother operations. - Properly closed the HTTP client in instance_profile_credentials_provider and sts_assume_role_credentials_provider to prevent resource leaks. - Reduced the log severity in the retry strategy to avoid SCT test failures that occur when any log message is tagged as an ERROR. 
No backport needed since no S3-related functionality has been released on the Scylla side yet. Closes scylladb/scylladb#21933 * github.com:scylladb/scylladb: s3_client: Adjust Log Severity in Retry Strategy aws_error: Enhance error handling for AWS HTTP client aws_error: Add STS specific error handling credentials_providers: Close retryable clients in Credentials Providers credentials_providers: Integrate retryable_http_client with Credentials Providers s3_client: enhance `retryable_http_client` functionality s3_client: isolate `retryable_http_client` s3_client: Prepare for `retryable_http_client` relocation s3_client: Remove `is_redirect_status` function s3_client: Move retryable functionality out of s3 client	2025-03-12 10:19:15 +02:00
Avi Kivity	b1d9f80d85	Merge 'tablets: Make load balancing capacity-aware' from Tomasz Grabiec Before this patch, the load balancer was equalizing tablet count per shard, so it achieved balance assuming that: 1) tablets have the same size 2) shards have the same capacity That can cause imbalance of utilization if shards have different capacity, which can happen in heterogeneous clusters with different instance types. One of the causes for capacity difference is that larger instances run with fewer shards due to vCPUs being dedicated to IRQ handling. This makes those shards have more disk capacity, and more CPU power. After this patch, the load balancer equalizes shard's storage utilization, so it no longer assumes that shards have the same capacity. It still assumes that each tablet has equal size. So it's a middle step towards full size-aware balancing. One consequence is that to be able to balance, the load balancer need to know about every node's capacity, which is collected with the same RPC which collects load_stats for average tablet size. This is not a significant set back because migrations cannot proceed anyway if nodes are down due to barriers. We could make intra-node migration scheduling work without capacity information, but it's pointless due to above, so not implemented. Also, per-shard goal for tablet count is still the same for all nodes in the cluster, so nodes with less capacity will be below limit and nodes with more capacity will be slightly above limit. This shouldn't be a significant problem in practice, we could compensate for this by increasing the limit. 
Refs #23042 Closes scylladb/scylladb#23079 * github.com:scylladb/scylladb: tablets: Make load balancing capacity-aware topology_coordinator: Fix confusing log message topology_coordinator: Refresh load stats after adding a new node topology_coordinator: Allow capacity stats to be refreshed with some nodes down topology_coordinator: Refactor load status refreshing so that it can be triggered from multiple places test: boost: tablets_test: Always provide capacity in load_stats test: perf_load_balancing: Set node capacity test: perf_load_balancing: Convert to topology_builder config, disk_space_monitor: Allow overriding capacity via config storage_service, tablets: Collect per-node capacity in load_stats	2025-03-11 14:34:27 +02:00
Gleb Natapov	8425c26462	gossiper: start using host ids to send messages earlier Send digest ack and ack2 by host ids as well now since the id->ip mapping is available after receiving digest syn. This allows converting more code to host ids here.	2025-03-11 12:09:21 +02:00
Gleb Natapov	f0af3f261e	messaging_service: add temporary address map entry on incoming connection We want to move to use host ids as soon as possible. Currently it is possible only after the full gossiper exchange (because only at this point gossiper state is added and with it address map entry). To make it possible to move to host ids earlier this patch adds address map entries on incoming communication during CLIENT_ID verb processing. The patch also adds generation to CLIENT_ID to use it when address map is updated. It is done so that older gossiper entries can be overwritten with newer mapping in case of IP change.	2025-03-11 12:09:21 +02:00
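A minimal sketch of the generation-guarded update described above, assuming a simple in-memory map. The class `AddressMap` and its methods are hypothetical stand-ins, and whether an equal generation should overwrite is an assumption (here it does not).

```python
class AddressMap:
    """Hypothetical host id -> (ip, generation) map, for illustration only."""

    def __init__(self):
        self._entries = {}

    def update(self, host_id: str, ip: str, generation: int) -> bool:
        """Store the mapping only if it is newer than what we already have."""
        current = self._entries.get(host_id)
        if current is None or generation > current[1]:
            self._entries[host_id] = (ip, generation)
            return True
        # Stale information (e.g. an old gossiper entry) must not win.
        return False

    def lookup(self, host_id: str):
        entry = self._entries.get(host_id)
        return entry[0] if entry else None
```

With this shape, an entry learned from an incoming connection after an IP change replaces the older mapping, while a late-arriving old entry is dropped.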
Nikos Dragazis	76b31a3acc	cql3: secondary index: Limit the size of partition range vectors The partition range vector is an std::vector, which means it performs contiguous allocations. Large allocations are known to cause problems (e.g., reactor stalls). For paged queries, limit the vector size to 1000. If more partition keys are available in the query result, discard them. Ideally, we should not be fetching them at all, but this is not possible without knowing the size of each partition. Currently, each vector element is 120 bytes and the standard allocator's max preferred contiguous allocation is 128KiB. Therefore, the chosen value of 1000 satisfies the constraint (128 KiB / 120 = 1092 > 1000). This should be good enough for most cases. Since secondary index queries involve one base table query per partition key, these queries are slow. A higher limit would only make them slower and increase the probability of a timeout. For the same reason, saving a follow-up paged request from the client would not increase the efficiency much. For unpaged queries, do not apply any limit. This means they remain susceptible to stalls, but unpaged queries are considered unoptimized anyway. Finally, update the unit test reproducer since the bug is now fixed. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>	2025-03-10 12:18:42 +02:00
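The sizing argument in this commit can be checked with a few lines of arithmetic. The constants come from the commit message; the helper `cap_partition_ranges` is a hypothetical illustration of the paged/unpaged rule, not the real implementation.

```python
ELEMENT_SIZE_BYTES = 120                 # size of one vector element (from the commit)
MAX_CONTIGUOUS_ALLOC_BYTES = 128 * 1024  # 128 KiB preferred contiguous allocation
MAX_PARTITION_RANGES = 1000              # limit chosen in the commit

# Largest element count that still fits in one preferred allocation.
max_elements = MAX_CONTIGUOUS_ALLOC_BYTES // ELEMENT_SIZE_BYTES


def cap_partition_ranges(ranges: list, paged: bool) -> list:
    """Hypothetical helper: truncate only for paged queries."""
    if paged:
        return ranges[:MAX_PARTITION_RANGES]
    return ranges  # unpaged queries are left unlimited


print(max_elements, max_elements > MAX_PARTITION_RANGES)
```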
Ernest Zaslavsky	8e46929474	aws_error: Enhance error handling for AWS HTTP client - Seastar's HTTP client is known to throw exceptions for various reasons, including network errors, TLS errors and other transient issues. - Update error handling to correctly capture and process all exceptions from Seastar's HTTP client. - Previously, only aws_exception was handled, causing retryable errors to be missed and `should_retry` not invoked. - Now, all exceptions trigger the appropriate retry logic per the intended strategy. - Add tests for the S3 proxy to ensure robustness and reliability of these enhancements.	2025-03-10 09:01:47 +02:00
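The change in exception handling can be sketched as a retry loop that consults the retry strategy for every exception type, not only the AWS-specific one. All names here (`AwsError`, `should_retry`, `call_with_retries`) are hypothetical, and backoff is omitted for brevity.

```python
class AwsError(Exception):
    """Hypothetical stand-in for an AWS-specific exception type."""


def should_retry(exc: Exception, attempt: int, max_attempts: int) -> bool:
    # A real strategy would also inspect the error; this sketch only counts.
    return attempt < max_attempts


def call_with_retries(request, max_attempts: int = 3):
    attempt = 0
    while True:
        attempt += 1
        try:
            return request()
        except Exception as exc:  # previously only AwsError would reach the strategy
            if not should_retry(exc, attempt, max_attempts):
                raise
```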
Ernest Zaslavsky	6a3cef5703	metadata: Correct "DESCRIBE" output for keyspace metadata Update the "DESCRIBE" command output to accurately display `tablet` settings in keyspace metadata. Closes scylladb/scylladb#23056	2025-03-09 14:50:08 +02:00
Ernest Zaslavsky	050c3cdbc2	tests: Add Tests for Scylla-SSTable S3 Functionality Extended existing Scylla Tools tests to cover the new functionality of reading SSTables from S3. This ensures that the new S3 integration is thoroughly tested and performs as expected.	2025-03-09 10:17:48 +02:00
Ernest Zaslavsky	88c4fa6569	s3: Implement S3 Fully Qualified Name Manipulation Functions Added utility functions to handle S3 Fully Qualified Names (FQN). These functions enable parsing, splitting, and identification of S3 paths, enhancing our ability to work with S3 object storage more effectively.	2025-03-09 09:50:36 +02:00
Robert Bindar	27f2d64725	Remove object storage config credentials provider During development of #22428 we decided that we have no need for `object-storage.yaml`, and we'd rather store the endpoints in `scylla.yaml` and get a REST API to expose the endpoints for free. This patch removes the credentials provider used to read the AWS keys from this yaml file. Followup work will remove the `object-storage.yaml` file altogether and move the endpoints to `scylla.yaml`. Signed-off-by: Robert Bindar <robert.bindar@scylladb.com> Closes scylladb/scylladb#22951	2025-03-07 10:40:58 +03:00
Tomasz Grabiec	c4714180cc	tablets: Make load balancing capacity-aware Before this patch the load balancer was equalizing tablet count per shard, so it achieved balance assuming that: 1) tablets have the same size 2) shards have the same capacity That can cause imbalance of utilization if shards have different capacity, which can happen in heterogeneous clusters with different instance types. One of the causes for capacity difference is that larger instances run with fewer shards due to vCPUs being dedicated to IRQ handling. This makes those shards have more disk capacity, and more CPU power. After this patch, the load balancer equalizes shard's storage utilization, so it no longer assumes that shards have the same capacity. It still assumes that each tablet has equal size. So it's a middle step towards full size-aware balancing. One consequence is that to be able to balance, the load balancer needs to know about every node's capacity, which is collected with the same RPC which collects load_stats for average tablet size. This is not a significant setback because migrations cannot proceed anyway if nodes are down due to barriers. We could make intra-node migration scheduling work without capacity information, but it's pointless due to the above, so not implemented.	2025-03-06 13:35:38 +01:00
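To illustrate the difference between equalizing tablet count and equalizing storage utilization, here is a minimal sketch. It assumes, as the commit does, that every tablet has the same size; the function names and the greedy single-step choice are hypothetical and much simpler than a real balancer.

```python
def shard_utilization(tablet_count: int, avg_tablet_size: float,
                      shard_capacity: float) -> float:
    """Fraction of a shard's storage capacity used by its tablets."""
    return tablet_count * avg_tablet_size / shard_capacity


def pick_migration(shards: dict, avg_tablet_size: float):
    """shards maps a name to (tablet_count, capacity). Returns (src, dst) or None.

    Hypothetical greedy step: move one tablet from the most utilized shard to
    the least utilized one, if that does not simply flip the imbalance.
    """
    util = {name: shard_utilization(count, avg_tablet_size, cap)
            for name, (count, cap) in shards.items()}
    src = max(util, key=util.get)
    dst = min(util, key=util.get)
    dst_count, dst_cap = shards[dst]
    after = shard_utilization(dst_count + 1, avg_tablet_size, dst_cap)
    if src == dst or after >= util[src]:
        return None
    return src, dst
```

Two shards with 10 tablets each look balanced by count, but if one has twice the capacity, the sketch still proposes a move towards the larger shard.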
Tomasz Grabiec	d6f8810e66	topology_coordinator: Allow capacity stats to be refreshed with some nodes down With capacity-aware balancing, if we're missing capacity for a normal node, we won't be able to proceed with tablet drain. Consider the following scenario: 1. Nodes: A, B 2. refresh stats with A and B 3. Add node C 4. Node B goes down 5. removenode B starts 6. stats refreshing fails because B is down If we don't have capacity stats for node C, load balancer cannot make decisions and removenode is blocked indefinitely. A reproducer is added in this patch. To alleviate that, we allow capacity stats to be collected for nodes which are reachable, we just don't update the table size part. To keep table stats monotonic, we cache previous results per node, so even if it's unreachable now, we use its last reported sizes. It's still more accurate than not refreshing stats at all. A node can be down for a long period, and other replicas can grow in size. It's not perfect, because the stale node can skew the stats in its direction, but ignoring it completely has its pitfalls too. Better solution is left for later.	2025-03-06 13:35:37 +01:00
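The per-node caching idea can be sketched like this: reachable nodes refresh their entry, unreachable nodes fall back to their last reported value. The class `LoadStatsCache` and the `fetch` callback are hypothetical; the real code works on load_stats collected via RPC.

```python
class LoadStatsCache:
    """Hypothetical cache of the last table size reported by each node."""

    def __init__(self):
        self._last_sizes = {}

    def refresh(self, nodes, fetch) -> dict:
        """fetch(node) returns a size or raises ConnectionError if the node is down."""
        sizes = {}
        for node in nodes:
            try:
                self._last_sizes[node] = fetch(node)
            except ConnectionError:
                pass  # node is down: keep whatever it reported last time
            if node in self._last_sizes:
                sizes[node] = self._last_sizes[node]
        return sizes
```

In the scenario from the commit message, a refresh after node B goes down still returns B's stale size alongside fresh values for A and C, so the refresh as a whole does not fail.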
Tomasz Grabiec	69c49fb1a7	test: boost: tablets_test: Always provide capacity in load_stats Move shared_load_stats to topology_builder.hh so that topology_builder can maintain it. It will set capacity for all created nodes. Needed after load balancer requires capacity to make decisions.	2025-03-06 13:35:37 +01:00
Tomasz Grabiec	dfc9101dfd	test: perf_load_balancing: Set node capacity Otherwise, load balancer will not make any plan once it becomes capacity-aware.	2025-03-06 13:35:37 +01:00
Tomasz Grabiec	6169401dbc	test: perf_load_balancing: Convert to topology_builder The test no longer worked because the load balancer now requires a proper schema in the database. Convert to topology_builder, which builds the topology in the database and creates the schema in the database (which needs a proper topology).	2025-03-06 13:35:37 +01:00
Tomasz Grabiec	d01cc16d1e	config, disk_space_monitor: Allow overriding capacity via config Intended for testing, or hot-fixing out-of-space issues in production. Tablet load balancer uses this information for determining per-shard load so reducing capacity will cause tablets to be migrated away from the node.	2025-03-06 13:35:37 +01:00
Avi Kivity	28906c9261	Merge 'scylla-sstable: introduce the query command' from Botond Dénes The scylla-sstable dump-* command suite has proven invaluable in many investigations. In certain cases however, I found that `dump-data` is quite cumbersome. An example would be trying to find certain values in an sstable, or trying to read the content of system tables when a node is down. For these cases, `dump-data` is very cumbersome: one has to trudge through tons of uninteresting metadata and do compaction in their heads. This PR introduces the new scylla-sstable query command, specifically targeted at situations like this: it allows executing queries on sstables, exposing to the user all the power of CQL, to tailor the output as they see fit. Select everything from a table: $ scylla sstable query --system-schema /path/to/data/system_schema/keyspaces-/-big-Data.db keyspace_name \| durable_writes \| replication -------------------------------+----------------+------------------------------------------------------------------------------------- system_replicated_keys \| true \| ({class : org.apache.cassandra.locator.EverywhereStrategy}) system_auth \| true \| ({class : org.apache.cassandra.locator.SimpleStrategy}, {replication_factor : 1}) system_schema \| true \| ({class : org.apache.cassandra.locator.LocalStrategy}) system_distributed \| true \| ({class : org.apache.cassandra.locator.SimpleStrategy}, {replication_factor : 3}) system \| true \| ({class : org.apache.cassandra.locator.LocalStrategy}) ks \| true \| ({class : org.apache.cassandra.locator.NetworkTopologyStrategy}, {datacenter1 : 1}) system_traces \| true \| ({class : org.apache.cassandra.locator.SimpleStrategy}, {replication_factor : 2}) system_distributed_everywhere \| true \| ({class : org.apache.cassandra.locator.EverywhereStrategy}) Select everything from a single SSTable, use the JSON output (filtered through [jq](https://jqlang.github.io/jq/) for better readability): $ scylla sstable query 
--system-schema --output-format=json /path/to/data/system_schema/keyspaces-/me-3gm7_127s_3ndxs28xt4llzxwqz6-big-Data.db \| jq [ { "keyspace_name": "system_schema", "durable_writes": true, "replication": { "class": "org.apache.cassandra.locator.LocalStrategy" } }, { "keyspace_name": "system", "durable_writes": true, "replication": { "class": "org.apache.cassandra.locator.LocalStrategy" } } ] Select a specific field in a specific partition using the command-line: $ scylla sstable query --system-schema --query "select replication from scylla_sstable.keyspaces where keyspace_name='ks'" ./scylla-workdir/data/system_schema/keyspaces-/-Data.db replication ------------------------------------------------------------------------------------- ({class : org.apache.cassandra.locator.NetworkTopologyStrategy}, {datacenter1 : 1}) Select a specific field in a specific partition using ``--query-file``: $ echo "SELECT replication FROM scylla_sstable.keyspaces WHERE keyspace_name='ks';" > query.cql $ scylla sstable query --system-schema --query-file=./query.cql ./scylla-workdir/data/system_schema/keyspaces-/-Data.db replication ------------------------------------------------------------------------------------- ({class : org.apache.cassandra.locator.NetworkTopologyStrategy}, {datacenter1 : 1}) New functionality: no backport needed. Closes scylladb/scylladb#22007 github.com:scylladb/scylladb: docs/operating-scylla: document scylla-sstable query test/cqlpy/test_tools.py: add tests for scylla-sstable query test/cqlpy/test_tools.py: make scylla_sstable() return table name also scylla-sstable: introduce the query command tools/utils: get_selected_operation(): use std::string for operation_options utils/rjson: streaming_writer: add RawValue() cql3/type_json: add to_json_type() test/lib/cql_test_env: introduce do_with_cql_env_noreentrant_in_thread()	2025-03-06 13:42:45 +02:00
Tomasz Grabiec	7e7f1e6f91	storage_service, tablets: Collect per-node capacity in load_stats New RPC is introduced because load_stats was marked "final" in the IDL. Will be needed by capacity-aware load balancing.	2025-03-06 12:17:32 +01:00
Botond Dénes	1139cf3a98	Merge 'Speed up (and generalize) the way API calculates sstable disk usage' from Pavel Emelyanov There are several API endpoints that walk a specific list of sstables and sum up their bytes_on_disk() values. All those endpoints accumulate a map of sstable names to their sizes, then squash the maps together and, finally, sum up the map values to report it back. Maintaining these intermediate collections is a waste of CPU and memory; the usage values can be summed up instantly. Also add a test for per-cf endpoints to validate the change, and generalize the helper functions while at it. Closes scylladb/scylladb#23143 * github.com:scylladb/scylladb: api: Generalize disk space counting for table and system api: Use map_reduce_cf_raw() overload with table name api: Don't collect sstables map to count disk space usage test: Add unit test for total/live sstable sizes	2025-03-06 11:26:35 +02:00
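The optimization amounts to replacing "build maps, merge, then sum" with a direct sum. A minimal sketch, assuming each shard yields `(sstable_name, bytes_on_disk)` pairs with unique names; both function names are hypothetical.

```python
def total_disk_usage_via_maps(shards) -> int:
    """Old shape: build a name -> size map per shard, merge them, then sum."""
    merged = {}
    for sstables in shards:
        per_shard = {name: size for name, size in sstables}
        merged.update(per_shard)
    return sum(merged.values())


def total_disk_usage_direct(shards) -> int:
    """New shape: sum the sizes as they are seen, with no intermediate maps."""
    return sum(size for sstables in shards for _name, size in sstables)
```

Both return the same total when names are unique; the direct version avoids allocating the intermediate collections.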
Raphael S. Carvalho	fedd838b9d	replica: Fix race of some operations like cleanup with snapshot There are two semaphores in table for synchronizing changes to sstable list: sstable_set_mutation_sem: used to serialize two concurrent operations updating the list, to prevent them from racing with each other. sstable_deletion_sem: A deletion guard, used to serialize deletion and iteration over the list, to prevent iteration from finding deleted files on disk. they're always taken in this order to avoid deadlocks: sstable_set_mutation_sem -> sstable_deletion_sem. problem: A = tablet cleanup B = take_snapshot() 1) A acquires sstable_set_mutation_sem for updating list 2) A acquires sstable_deletion_sem, then delete sstable before updating list 3) A releases sstable_deletion_sem, then yield 4) B acquires sstable_deletion_sem 5) B iterates through list and bumps sstable deleted in step 2 6) B fails since it cannot find the file on disk Initial reaction is to say that no procedure must delete sstable before updating the list, that's true. But we want a iteration, running concurrently to cleanup, to not find sstables being removed from the system. Otherwise, e.g. snapshot works with sstables of a tablet that was just cleaned up. That's achieved by serializing iteration with list update. Since sstable_deletion_sem is used within the scope of deletion only, it's useless for achieving this. Cleanup could acquire the deletion sem when preparing list updates, and then pass the "permit" to deletion function, but then sstable_deletion_sem would essentially become sstable_set_mutation_sem, which was created exactly to protect the list update. That being said, it makes sense to merge both semaphores. Also things become easier to reason about, and we don't have to worry about deadlocks anymore. The deletion goes through sstable_list_builder, which holds a permit throughout its lifetime, which guarantees that list updates and deletion are atomic to other concurrent operations. 
The interface becomes less error-prone with that. It allowed us to find that discard_sstables() was doing deletion without any permit, meaning another race could happen between truncate and snapshot. So, as far as we know, we're fixing the race of (truncate|cleanup) with take_snapshot. It's possible that other unknown races are fixed as well. Fixes #23049. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb/scylladb#23117	2025-03-06 11:00:48 +02:00
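The effect of merging the two semaphores can be sketched with a single lock that covers both the on-disk deletion and the list update, so an iterating reader never sees a listed sstable whose file is already gone. This uses Python threads purely for illustration; Scylla uses Seastar semaphores, and the class `SSTableList` is hypothetical.

```python
import threading


class SSTableList:
    """Hypothetical model: one lock guards both deletion and list updates."""

    def __init__(self, names):
        self._lock = threading.Lock()
        self._listed = set(names)   # what iteration sees
        self._on_disk = set(names)  # what actually exists on disk

    def remove(self, name: str) -> None:
        # Deletion and list update happen under the same permit, so they are
        # atomic with respect to snapshot().
        with self._lock:
            self._on_disk.discard(name)
            self._listed.discard(name)

    def snapshot(self) -> list:
        with self._lock:
            missing = self._listed - self._on_disk
            if missing:
                raise FileNotFoundError(sorted(missing))
            return sorted(self._listed)
```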
Benny Halevy	7a624e3df8	system_keyspace: call shutdown from stop and use that to replace the explicit shutdown when stopped in cql_test_env. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-03-05 08:30:23 +02:00
Benny Halevy	13a22cb6fd	utils: add class pluggable A wrapper around a shared service allowing safe plug and unplug of the service from its user using a phased-barrier operation permit guarding the service while in use. Also add a unit test for this class. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-03-05 08:25:50 +02:00
Nadav Har'El	e0f24c03e7	Merge 'test.py: merge all 'Topology' suite types int one folder 'cluster'' from Artsiom Mishuta Now that we support suite subfolders, there is no need to create a separate suite for object_store, auth_cluster, topology, and topology_custom. This PR merges all these folders into one: 'cluster'. This PR also introduces and applies the 'prepare_3_nodes_cluster' fixture, which allows preparing a non-dirty 3-node cluster that can be reused between tests (for tests that were in the topology folder). Number of tests in master: release 3461, dev 3472, debug 3446. Number of tests in this PR: release 3460, dev 3471, debug 3445. There is one test fewer in each mode because there were two test_topology_failure_recovery files (in topology and topology_custom) with the same utility functions but different test cases. This PR merges them into one. Closes scylladb/scylladb#22917 * github.com:scylladb/scylladb: test.py: merge object_store into cluster folder test.py: merge auth_cluster into cluster folter test.py: rename topology_custom folder to cluster test.py: merge topology test suite into topology_custom test.py delete conftest in topology_custom test.py apply prepare_3_nodes_cluster in topology test.py: introduce prepare_3_nodes_cluster marker	2025-03-04 19:26:32 +02:00
Pavel Emelyanov	a8fc1d64bc	test: Add unit test for total/live sstable sizes The pair of column_family/metrics/(total|live)_disk_space_used/{name} endpoints reports the disk usage by sstables. The test creates a table, populates it, flushes it, and checks that the size corresponds to what stat(2) reports for the respective files. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-04 19:52:33 +03:00
Nikos Dragazis	892690b953	test: Reproduce bug with large allocations from secondary index Secondary index queries which fetch partitions from the base table can cause large allocations that can lead to reactor stalls. Reproduce this with a unit test that runs an indexed query on a table with thousands of single-row partitions, and checks the memory stats for any large contiguous allocations. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>	2025-03-04 18:39:28 +02:00
Patryk Jędrzejczak	c13b6c91d3	Merge 'raft topology: drop changing the raft voters config via storage_service' from Emil Maskovsky For the limited voters feature to work properly we need to make sure that we are only managing the voter status through the topology coordinator. This means that we should not change the node votership from the storage_service module for the raft topology directly. We can drop the voter status changes from the storage_service module because the topology coordinator will handle the votership changes eventually. The calls in the storage_service module were not essential and were only used for optimization (improving the HA under certain conditions). Furthermore, the other bundled commit improves the reaction again by reacting to the node `on_up()` and `on_down()` events, which again shortens the reaction time and improves the HA. The change has effect on the timing in the tablets migration test though, as it previously relied on the node being made non-voter from the service_storage `raft_removenode()` function. The fix is to add another server to the topology to make sure we will keep the quorum. Previously the test worked because the test waits for an injection to be reached and it was ensured that the injection (log line) has only been triggered after the node has been made non-voter from the `raft_removenode()`. This is not the case anymore. An alternative fix would be to wait for the first node to be made non-voter before stopping the second server, but this would make the test more complex (and it is not strictly required to only use 4 servers in the test, it has been only done for optimization purposes). Fixes: scylladb/scylladb#22860 Refs: scylladb/scylladb#18793 Refs: scylladb/scylladb#21969 No backport: Part of the limited voters new feature, so this shouldn't to be backported. 
Closes scylladb/scylladb#22847 * https://github.com/scylladb/scylladb: raft: use direct return of future for `run_op_with_retry` raft: adjust the voters interface to allow atomic changes raft topology: drop removing the node from raft config via storage_service raft topology: drop changing the raft voters config via storage_service	2025-03-04 13:59:47 +01:00
Nadav Har'El	d096aac200	test/cqlpy/run: reduce number of tablets In commit `2463e524ed`, Scylla's default changed from starting with one tablet per shard to starting 10 per shard. The functional tests don't need more tablets and it can only slow down the tests, so the patch added --tablets-initial-scale-factor=1 to test//suite.yaml but forgot to add it to test/cqlpy/run.py (to affect test/cqlpy/run) so this patch does this now. This patch should only be about making tests faster, although to be honest, I don't see any measurable improvement in test speed (10 isn't so many). But, unfortunately, this is only part of the story. Over time we allowed a few cqlpy tests to be written in a way that relies on having only a small number of tablets or even exactly one tablet per shard (!). These tests are buggy and should be fixed - see issues #23115 and #23116 as examples. But adding the option --tablets-initial-scale-factor=1 also to run.py will make these bugs not affect test/cqlpy/run in the same way as they don't affect test.py. These buggy tests will still break with `pytest cqlpy` against a Scylla you ran yourself manually, so we will eventually still need to fix those test bugs. Refs #23115 Refs #23116 Closes scylladb/scylladb#23125	2025-03-04 15:39:21 +03:00
Artsiom Mishuta	97a620cda9	test.py: merge object_store into cluster folder Now that we support suite subfolders, there is no need to create an own suite for object_store	2025-03-04 10:32:44 +01:00
Artsiom Mishuta	a283b391c2	test.py: merge auth_cluster into cluster folter Now that we support suite subfolders, there is no need to create an own suite for auth_cluster	2025-03-04 10:32:44 +01:00
Artsiom Mishuta	d1198f8318	test.py: rename topology_custom folder to cluster rename topology_custom folder to cluster as it contains not only topology test cases	2025-03-04 10:32:44 +01:00
Artsiom Mishuta	d8e17c4356	test.py: merge topology test suite into topology_custom Now that we support suite subfolders, there is no need to create an own suite for topology	2025-03-04 10:32:44 +01:00
Artsiom Mishuta	ef62dfa6a9	test.py delete conftest in topology_custom Delete conftest in a separate commit for a better diff listing when merging topology_custom and topology	2025-03-04 10:32:43 +01:00
Artsiom Mishuta	cf48444e3b	test.py apply prepare_3_nodes_cluster in topology apply prepare_3_nodes_cluster for all tests in the topology folder via applying mark at the test module level using pytestmark https://docs.pytest.org/en/stable/example/markers.html#marking-whole-classes-or-modules set initial initial_size for topology folder to 0	2025-03-04 10:32:43 +01:00
Artsiom Mishuta	20777d7fc6	test.py: introduce prepare_3_nodes_cluster marker prepare_3_nodes_cluster marker will allow preparing non-dirty 3 nodes cluster that can be reused between tests	2025-03-04 10:32:43 +01:00
Nadav Har'El	a56751e71b	test/cqlpy: fix test assuming just one tablet The cqlpy test test_compaction.py::test_compactionstats_after_major_compaction was written to assume we have just one tablet per shard - if there are many tablets compaction splitting the data, the test scenario might not need compaction in the way that the test assumes it does. Recently (commit `2463e524ed`) Scylla's default was changed to have 10 tablets per shard - not one. This broke this test. The same commit modified test/cqlpy/suite.yaml, but that affects only test.py and not test/cqlpy/run, and also not manual runs against a manually-installed Scylla. If this test absolutely requires a keyspace with 1 and not 10 tablets, then it should create one explicitly. So this is what this test does (but only if tablets are in use; if vnodes are used that's fine too). Before this patch, test/cqlpy/run test_compaction.py::test_compactionstats_after_major_compaction fails. After the patch, it passes. Fixes #23116 Closes scylladb/scylladb#23121	2025-03-04 10:15:29 +02:00
Kefu Chai	a43072a21e	cql3,test: replace boost::range::adjacent_find with std::ranges to reduce third-party dependencies and modernize the codebase. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22998	2025-03-04 10:08:02 +02:00
Artsiom Mishuta	d7f9c5654b	test.py: change test uname This commit changes the test uname replacement from "_" to "." to be able to support sub-folders in scylla-pkg scripts logic Closes scylladb/scylladb#23130	2025-03-04 09:58:58 +02:00
Wojciech Mitros	dae7221342	rust: update dependencies The currently used versions of "wasmtime", "idna", "cap-std" and "cap-primitives" packages had low to moderate security issues. In this patch we update the dependencies to versions with these issues fixed. The update was performed by changing the "wasmtime" (and "wasmtime-wasi") version in rust/wasmtime_bindings/Cargo.toml and updating rust/Cargo.lock using the "cargo update" command with the affected package. To fix an issue with different dependencies having different versions of sub-dependencies, the package "smallvec" was also updated to "1.13.1". After the dependency update, the Rust code also needed to be updated because of the slightly changed API. One Wasm test case needed to be updated, as it was actually using an incorrect Wat module and not failing before. The crate also no longer allows multiple tables in Wasm modules by default - it is now enabled by setting the "gc" crate feature and configuring the Engine with config.wasm_reference_types(true). Fixes https://github.com/scylladb/scylladb/issues/23127 Closes scylladb/scylladb#23128	2025-03-04 09:45:23 +02:00
Pavel Emelyanov	e4e15a00b7	Merge 'reader_concurrency_semaphore: register_inactive_read(): handle aborted permit' from Botond Dénes It is possible that the permit handed in to register_inactive_read() is already aborted (currently only possible if permit timed out). If the permit also happens to have wait for memory, the current code will attempt to call promise<>::set_exception() on the permit's promise to abort its waiters. But if the permit was already aborted via timeout, this promise will already have an exception and this will trigger an assert. Add a separate case for checking if the permit is aborted already. If so, treat it as immediate eviction: close the reader and clean up. Fixes: scylladb/scylladb#22919 Bug is present in all live versions, backports are required. Closes scylladb/scylladb#23044 * github.com:scylladb/scylladb: reader_concurrency_semaphore: register_inactive_read(): handle aborted permit test/boost/reader_concurrency_semaphore_test: move away from db::timeout_clock::now()	2025-03-04 10:40:28 +03:00
Emil Maskovsky	834f506790	raft topology: drop changing the raft voters config via storage_service For the limited voters feature to work properly we need to make sure that we are only managing the voter status through the topology coordinator. This means that we should not change the node votership from the storage_service module for the raft topology directly. We can drop the voter status changes from the storage_service module because the topology coordinator will handle the votership changes eventually. The calls in the storage_service module were not essential and were only used for optimization (improving the HA under certain conditions). This has an effect on the timing in the tablets migration test though, as it relied on the node being made non-voter from the storage_service `raft_removenode()` function. The fix is to add another server to the topology to make sure we will keep the quorum. Previously the test worked because the test waits for an injection to be reached and it was ensured that the injection (log line) has only been triggered after the node has been made non-voter from the `raft_removenode()`. This is not the case anymore. An alternative fix would be to wait for the first node to be made non-voter before stopping the second server, but this would make the test more complex (and it is not strictly required to only use 4 servers in the test, it has been only done for optimization purposes). Fixes: scylladb/scylladb#22860 Refs: scylladb/scylladb#18793 Refs: scylladb/scylladb#21969	2025-03-03 15:15:43 +01:00
Artsiom Mishuta	90106c6f19	test.py: skip test_incremental_read_repair[row-tombstone] skip test test_incremental_read_repair[row-tombstone] due to https://github.com/scylladb/scylladb/issues/21179 Closes scylladb/scylladb#23126	2025-03-03 15:26:28 +02:00
Kefu Chai	5571b537b5	tree: Make values mutable to enable move semantics Previously, variables were marked as const, causing std::move() calls to be redundant as reported by GCC warnings. This change either removes const qualifiers or marks related lambdas as mutable, allowing the compiler to properly utilize move constructors for better performance. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#23066	2025-03-03 13:53:02 +03:00
Evgeniy Naydanov	cb0e0ebcf7	test.py: extract prepare dirs and S3 mock steps to test/conftest.py As a part of the moving to bare pytest we need to extract the required test environment preparation steps into pytest's hooks/fixtures. Do this for S3 mock stuff (MinioServer, MockS3Server, and S3ProxyServer) and for directories with test artifacts. For compatibility reason add --test-py-init CLI option for bare pytest test runner: need to add it to pytest command if you need test.py stuff in your tests (boost, topology, etc.) Also, postpone initialization of TestSuite.artifacts and TestSuite.hosts from import-time to runtime. Closes scylladb/scylladb#23087	2025-03-03 13:24:37 +03:00
Paweł Zakrzewski	9e7f79d1ab	cql3/select_statement: require LIMIT and PER PARTITION LIMIT to be strictly positive LIMIT and PER PARTITION LIMIT limit the number of rows returned or taken into consideration by a query. It makes no logical sense to have this value at less than 1. Cassandra also has this requirement. This patch ensures that the limit value is strictly positive and adds an explicit test for it - it was only tested in a test ported from Cassandra, that is disabled due to other issues. Closes scylladb/scylladb#23013	2025-03-03 08:13:27 +02:00
Tomasz Grabiec	0343235aa2	Merge 'tablets: repair: fix hosts and dcs filters behavior for tablet repair' from Aleksandra Martyniuk If hosts and/or dcs filters are specified for tablet repair and some replicas match these filters, choose the replica that will be the repair master according to the round-robin principle (currently it's always the first replica). If hosts and/or dcs filters are specified for tablet repair and no replica matches these filters, the repair succeeds and the repair request is removed (currently an exception is thrown and the tablet repair scheduler reschedules the repair forever). Fixes: https://github.com/scylladb/scylladb/issues/23100. Needs backport to 2025.1, which introduces hosts and dcs filters for tablet repair Closes scylladb/scylladb#23101 * github.com:scylladb/scylladb: test: add new cases to tablet_repair tests test: extract repair check to function locator: add round-robin selection of filtered replicas locator: add tablet_task_info::selected_by_filters service: finish repair successfully if no matching replica found	2025-03-01 14:47:43 +01:00
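
The filter behavior described in the last entry (0343235aa2) can be illustrated with a minimal Python sketch. It is not ScyllaDB's implementation: the `Replica` type, the function names, the use of a tablet index as the round-robin counter, and the assumption that a replica must satisfy every filter that was given are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Set


@dataclass(frozen=True)
class Replica:
    """Hypothetical stand-in for a tablet replica."""
    host: str
    dc: str


def filter_replicas(replicas: Sequence[Replica],
                    hosts: Optional[Set[str]] = None,
                    dcs: Optional[Set[str]] = None) -> List[Replica]:
    # A filter set to None is treated as "not specified".
    # Assumption: a replica must satisfy every filter that is specified.
    return [r for r in replicas
            if (hosts is None or r.host in hosts)
            and (dcs is None or r.dc in dcs)]


def pick_repair_master(replicas: Sequence[Replica],
                       tablet_index: int,
                       hosts: Optional[Set[str]] = None,
                       dcs: Optional[Set[str]] = None) -> Optional[Replica]:
    matching = filter_replicas(replicas, hosts, dcs)
    if not matching:
        # No replica matches: report "nothing to repair" instead of raising,
        # so a caller can finish the request rather than reschedule it.
        return None
    # Round-robin over the matching replicas instead of always taking the first.
    return matching[tablet_index % len(matching)]
```

With three replicas where only two sit in the filtered DC, consecutive tablet indexes alternate between those two, and a filter that matches nothing yields `None` rather than an exception.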

11801 Commits