scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-01 05:35:48 +00:00

Author	SHA1	Message	Date
Aleksandra Martyniuk	4e2cd8640c	test: add test to check tablet repair tasks	2024-11-28 12:15:42 +01:00
Aleksandra Martyniuk	ab3858e050	test: topology_tasks: enable tablets Tablets are no longer an experimental feature, but topology_tasks test suite treats them as if they were. Enable tablets with their own config option in topology_tasks suite.	2024-11-28 11:42:40 +01:00
Kefu Chai	5e391eee25	treewide: use coroutine::parallel_for_each(range) when appropriate `coroutine::parallel_for_each` accepts both a range and a pair of iterators. let's use the former when appropriate. it is simpler this way. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21684	2024-11-27 21:00:47 +02:00
Botond Dénes	20bbb1113e	test/cqlpy: test_tools.py: use xfail more selectively ScyllaDB doesn't support counters with tablets yet. So scylla-sstable tests which use a counter schema are marked with xfail, but this is done too aggressively, disabling too many tests that are otherwise fine. There are two tests affected: * test_scylla_sstable_script - this test uses an early return when the schema parameter is the one with counters and tablets are enabled. This is still too eager because tablets are now always enabled. Also, the early return hides the fact that this test is disabled. So change the check to test whether tablets are used on the test keyspace and use xfail instead of a sneaky early return. * test_scylla_sstable_dump_data - this test is blanket-disabled when run with the tablets parameter, even though only 1 out of 5 schemas tested uses counters. Remove the blanket xfail and only add it when the test keyspace uses tablets and the schema parameter is the one with counters. This makes dozens of tests run again, restoring the test coverage lost with the too eager use of xfail (and sneaky return). Refs: #18180 Closes scylladb/scylladb#21685	2024-11-27 12:17:56 +03:00
Kefu Chai	8ca1c57de0	test: s3_proxy: bring back InjectingHandler.log_message in `0dff187b7a`, we dropped `InjectingHandler.log_message()`, but this method was defined to override the default implementation provided by `BaseHTTPRequestHandler.log_message()`. this change flooded the standard output when testing `aws_error_injection_test` with `test.py` with logging messages like: ``` 127.0.0.1 - - [26/Nov/2024 17:27:34] "PUT /?Policy=0&Key=%2Ftest%2Ftestobject-large-817295 HTTP/1.1" 200 127.0.0.1 - - [26/Nov/2024 17:27:34] "PUT /?Policy=1&Key=%2Ftest%2Ftestobject-large-817306 HTTP/1.1" 200 ``` this is unexpected. in this change, we bring this method back, and additionally, we format the logging message lazily. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21689	2024-11-27 12:16:36 +03:00
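The override described in the commit above can be sketched as follows. This is a minimal illustration, assuming a handler derived from Python's `http.server.BaseHTTPRequestHandler`; the class name `QuietHandler` and the logger name are hypothetical and are not taken from the test suite.

```python
import logging
from http.server import BaseHTTPRequestHandler

logger = logging.getLogger("s3_proxy_sketch")  # hypothetical logger name


class QuietHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a handler such as InjectingHandler."""

    def log_message(self, format, *args):
        # Override the default implementation, which prints one line per
        # request. Passing `format` and `args` separately defers the string
        # formatting until a log handler actually emits the record.
        logger.debug("%s - " + format, self.client_address[0], *args)
```

With this override in place, per-request lines only appear when the logger is configured at `DEBUG` level, so the test output stays quiet by default.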
Botond Dénes	ccb433d767	Merge 'tasks: add api_task_ttl for tasks started with API' from Aleksandra Martyniuk When users start an operation asynchronously with API, they are expected to check the operation's status. Hence, the status should be kept in task manager for reasonable time after the operation is done. The operations that are started internally usually don't need to stay in task manager for that long. Add api_task_ttl that will be used for tasks started with API. By default it's 1 hour. The time for which non-API tasks stay in task manager isn't changed. Fixes: #21499. Refs: #21425. No backport needed - previous versions may use task_ttl Closes scylladb/scylladb#21505 * github.com:scylladb/scylladb: test: add test to check user_task_ttl tasks: api: move make_task method docs: nodetool: update backup and restore commands docs docs: update task manager docs nodetool: add nodetool tasks user-ttl command node_ops: use user task ttl for node ops virtual task tasks: use user_task_ttl for tasks started by user api: task_manager: add /task_manager/user_ttl to get and set user task ttl tasks: add task_manager::task::is_user_task method tasks: keep updateable_value of task_ttl in task manager db: config: add user_task_ttl_seconds named value	2024-11-27 09:57:57 +02:00
Nikita Kurashkin	4ba8a6b1b4	Fix test for DESC TABLE on materialised view to be compatible with Scylla AND Cassandra Fixes #21026 Refs #21500 Closes scylladb/scylladb#21526	2024-11-27 09:49:23 +02:00
Ernest Zaslavsky	793f2c95d1	snapshots: Stop taking snapshots of MVs Stop taking snapshots of MVs and allow taking snapshots of individual tables: one can now take a snapshot of any base table, any view or index. Also add tests to cover the new cases, both a boost test (using cc code) and a pytest (using the API). Also, update documentation to reflect the change. Fixes: #21339 Fixes: #20760 Closes scylladb/scylladb#21433	2024-11-26 15:27:30 +02:00
Kefu Chai	a5ee0c896b	treewide: migrate from boost::adaptors::filtered to std::views::filter Modernize the codebase by replacing Boost range adaptors with C++23 standard library views, reducing external dependencies and leveraging modern C++ language features. Key Changes: - Replace `boost::adaptors::filtered` with `std::views::filter` - Remove `#include <boost/range/adaptor/filtered.hpp>` - Utilize standard library range views Motivation: - Reduce project's external dependency footprint - Leverage standard library's range and view capabilities - Improve long-term code maintainability - Align with modern C++ best practices Implementation Challenges and Considerations: 1. Range Conversion and Move Semantics - `std::ranges::to` adaptor requires rvalue references - Necessitated updates to variable and parameter constness - Example: `cql3/restrictions/statement_restrictions.cc` modified to remove `const` from `common` to enable efficient range conversion 2. Range Iteration and Mutation - Range views may mutate internal state during iteration - Cannot pass ranges by const reference in some scenarios - Solution: Pass ranges by rvalue reference to explicitly indicate state invalidation Limitations: - One instance of `boost::adaptors::filtered` temporarily preserved due to lack of a C++23 alternative for `boost::join()` - A comprehensive replacement will be addressed in a follow-up change This change is part of our ongoing effort to modernize the codebase, reducing external dependencies and adopting modern C++ practices. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21648	2024-11-26 14:26:50 +02:00
Aleksandra Martyniuk	ac6a07117a	test: add test to check user_task_ttl	2024-11-26 09:57:42 +01:00
Aleksandra Martyniuk	1ade668d79	nodetool: add nodetool tasks user-ttl command	2024-11-26 09:57:23 +01:00
Evgeniy Naydanov	1e9d780e89	test.py: deselect random failures which can cause #21534 Following combinations of error injections and cluster events can cause #21534. Disable them for now because they break CI. Closes scylladb/scylladb#21658	2024-11-26 10:38:15 +02:00
Aleksandra Martyniuk	e703ba08f8	node_ops: use user task ttl for node ops virtual task Use user task ttl for node ops virtual task. Modify the test accordingly.	2024-11-25 14:21:53 +01:00
Nadav Har'El	61e8975930	Merge 'test/boost/view_schema_test: Improve test_view_update_generating_writetime' from Dawid Mędrek In this PR, we improve various aspects of the test: * increase obtained information whenever any test case fails, * split test cases, * elaborate on the semantics of generating view updates and what exactly we check and why. Backport: not needed, this is an enhancement. Closes scylladb/scylladb#21579 * github.com:scylladb/scylladb: test/boost/view_schema_test: Improve comments in test_view_update_generating_writetime test/boost/view_schema_test.cc: Improve checks in test_view_update_generating_writetime test/boost/view_schema_test.cc: Split test cases in test_view_update_generating_writetime	2024-11-25 13:46:56 +02:00
Evgeniy Naydanov	5d254b1fdf	test.py: topology_random_failures: increase timeout for Scylla startup We run topology_random_failures in debug mode only and sometimes Scylla is too slow in this mode. Increase timeout for Scylla startup from 30s to 180s to reduce flakiness. Fixes #21101 Closes scylladb/scylladb#21659	2024-11-25 09:58:46 +03:00
Dawid Mędrek	926eaf8fe9	test/boost/view_schema_test: Improve comments in test_view_update_generating_writetime In this commit, we elaborate on the semantics of generating view updates for each case the test goes through so that the reader less familiar with the logic has an easier time understanding it.	2024-11-24 22:48:15 +01:00
Dawid Mędrek	2d12acd09a	test/boost/view_schema_test.cc: Improve checks in test_view_update_generating_writetime We modify the checks in the test to obtain full information whenever a failure happens. Before this change, we compared the number of view updates one-by-one. As a result, when the first check failed, we didn't learn anything about the other two. Now we always compare them all at once. A negative impact of this commit is that if one of the lambdas throws an exception, we don't learn ANYTHING. However, a lambda throwing an exception is a more appalling problem than the comparison failing, and we DO learn about it in such a situation; so we accept that cost.	2024-11-24 22:48:13 +01:00
Dawid Mędrek	fb62fc6061	test/boost/view_schema_test.cc: Split test cases in test_view_update_generating_writetime We split some of the test cases so it's clearer what's going on in the test. Also, if a bug happens in the future, it should be easier to reason about it when it corresponds to exactly one CQL statement instead of possibly two.	2024-11-24 22:47:27 +01:00
Andrei Chekun	8bf62a086f	test.py: Create central conftest. Central conftest allows to reduce code duplication and execute all tests with one pytest command Closes scylladb/scylladb#21454	2024-11-24 20:09:48 +02:00
Nadav Har'El	7014aec452	Merge 'Alternator measuring RCU and WCU' from Amnon Heiman Read and Write Consumed Capacity units are an abstract way of measuring Alternator actions. In general, they correspond to the amount of data read or written. In the long run, the RCU/WCU adds a way of charging an operation and limiting usage. This series addresses two issues: the consume capacity request API and metering. The Alternator (and DynamoDB) API has an optional parameter allowing users to check the number of units an operation consumes. When a user adds that parameter, the response will contain the number of units used for the operation. This series adds the consume capacity support to get_item and put_item, adds a metric to collect the overall RCU and WCU used, and adds a test for the new functionality. Follow-up PRs will add support for more operations and GSI. Replaces #19811 Partially implement: #5027 Closes scylladb/scylladb#21543 * github.com:scylladb/scylladb: alternator/test_metrics: Add tests for table consumption units test_returnconsumedcapacity.py: Add putItem tests Alternator: add WCU support Add test/alternator/test_returnconsumedcapacity.py alternator/executor: Add consume capacity for get_item alternator/stats: Add rcu and wcu metrics to stats alternator/executor.hh: white-space cleanup Add the consume_capacity helper class	2024-11-24 19:27:03 +02:00
Dawid Mędrek	f913ae571f	db/view: Don't generate view updates for unselected columns The semantics of Scylla's materialized views may vary depending on how their primary keys correspond to the base table's one. One of the differences is how we handle writes to columns in the base table that are not selected by a view: * Case 1: The view's PK is a permutation of the base table's PK: Since the view's primary key cannot be changed in an update, a row in the view remains alive as long as the corresponding row in the base table is alive. The tricky part comes when the base table has columns that are NOT selected by the view. CQL3 used to not allow for defining a table that didn't have any other columns besides its primary key. Also, when inserting a row into a table, it was mandatory to provide at least one value aside from the primary key. At some point it changed [1] and the implementation of the solution relied on the notion of the row marker. Putting the details aside, consider the following scenario: (i) the base table has a primary key consisting of columns c_1, ..., c_k, and it has regular columns rc_1, ..., rc_n, (ii) the primary key of an MV defined on that table consists of a permutation of c_1, ..., c_k. The MV doesn't select at least one of the regular columns of the base table. Without loss of generality, let that unselected column be rc_1. (iii) the base table has a row R whose only non-null value is the one in the regular column rc_1. Now, what will R correspond to in the MV? The base table doesn't have a row marker, but all of its regular columns in the MV will be NULLs. That's NOT allowed. To solve that problem, all unselected columns have corresponding virtual columns in the MV; the only information they provide is whether there is a value in the base table or not. This way, the MV knows if a row is still alive or not. 
For that reason, we send view updates to virtual columns in the following cases: (i) the value in the column changes from NULL to a value, i.e. it's created, (ii) the value in the column exists, but its TTL has been updated. * Case 2: The view's PK has one more column than the base table's one: Since the primary key of the view has a regular column C from the base table, it is guaranteed that if there's a row in the MV, the corresponding row in the base table can remain alive: since C is part of the view's PK, it must have a value, so the row in the base table has a value in C too. The problem with virtual columns from the previous case doesn't manifest in this one. The liveness of the cell in C determines the liveness of the whole row in the view. The semantics gets more complex, but the conclusion is this: in case 1, virtual columns exist and we may need to generate view updates for them, while in case 2 virtual columns do NOT exist and so we don't generate view updates for them. What changes in this patch is that we adjust the code to match. If a view has a regular column from the base table as part of its primary key, we no longer emit view updates when we change a column unselected by that view. It is purely an OPTIMIZATION change. [1]: https://issues.apache.org/jira/browse/CASSANDRA-4361 Fixes scylladb/scylladb#21652 Closes scylladb/scylladb#21653	2024-11-24 19:01:28 +02:00
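The two cases in the commit above reduce to a single decision about the view's primary key. The sketch below only illustrates that decision; the function name and the representation of keys as lists of column names are assumptions made for the example, and it does not mirror the actual code in `db/view`.

```python
def emits_updates_for_unselected_columns(base_pk, view_pk):
    """Return True when writes to columns not selected by the view may
    still need a view update (hypothetical helper, illustration only).

    Case 1: the view's PK is a permutation of the base PK, so virtual
            columns exist and may need updates.
    Case 2: the view's PK contains one extra regular base column, so no
            virtual columns exist and no such update is generated.
    """
    return sorted(view_pk) == sorted(base_pk)


# Case 1: same key columns in a different order.
print(emits_updates_for_unselected_columns(["k", "c"], ["c", "k"]))
# Case 2: the view key adds the regular column "v".
print(emits_updates_for_unselected_columns(["k", "c"], ["v", "k", "c"]))
```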
Avi Kivity	29497f8c5d	Merge 'Automatically compute schema version of system tables' from Tomasz Grabiec Schema of system tables is defined statically and table_schema_version needs to be explicitly set in code like this: ``` builder.with_version(system_keyspace::generate_schema_version(table_id, version_offset)); ``` Whenever schema is changed, the schema version needs to change, otherwise we hit undefined behavior when trying to interpret mutation data created with the old schema using the new schema. It's not obvious that one needs to do that and developers often forget to do that. There were several instances of mistakes of omission, some caught during review, some not, e.g.: `31ea74b96e`. This patch changes definitions to call the new `schema_builder::with_hash_version()`, which will make the schema builder compute version from schema definition so that changes of the schema will automatically change the version. This way we no longer rely on the developer to remember to bump the version offset. All nodes should arrive at the same version, which is verified by existing `test_group0_schema_versioning` and a new unit test: `test_system_schema_version_is_stable`. Closes scylladb/scylladb#21602 * github.com:scylladb/scylladb: system_tables: Compute schema version automatically schema_builder: Introduce with_hash_version() schema: Store raw_view_info in schema::raw_schema schema: Remove dead comment hashing: Add hasher for unordered_map hashing: Add hasher for unique_ptr hashing: Add hasher for double [avi: add missing include <memory> to hashing.hh]	2024-11-24 18:44:32 +02:00
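The idea of deriving a version from the schema definition, rather than from a manually bumped offset, can be sketched as follows. This is an illustration only: the function `hash_version`, the use of SHA-256, and the flat list of `(name, type)` columns are assumptions for the example and are not the algorithm used by `schema_builder::with_hash_version()`.

```python
import hashlib
import uuid


def hash_version(table_name, columns):
    """Derive a version UUID from a (hypothetical) schema description."""
    h = hashlib.sha256()
    h.update(table_name.encode())
    for name, type_name in columns:  # column order is part of the definition
        h.update(b"\0" + name.encode() + b"\0" + type_name.encode())
    # Use the first 16 bytes of the digest as the UUID payload.
    return uuid.UUID(bytes=h.digest()[:16])


v1 = hash_version("system.example", [("id", "uuid"), ("value", "text")])
v2 = hash_version("system.example", [("id", "uuid"), ("value", "text")])
v3 = hash_version("system.example", [("id", "uuid"), ("value", "int")])
print(v1 == v2)  # same definition, same version on every node
print(v1 == v3)  # changed definition, different version
```

Because the version is a pure function of the definition, any change to a column changes the version without the developer having to remember to do so.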
Alexander Turetskiy	e83ab28d2d	Improve compaction on read of expired tombstones Compact expired tombstones in cache even if they are blocked by commitlog. Fixes #16781 Closes scylladb/scylladb#21613	2024-11-22 10:31:21 +02:00
Tomasz Grabiec	0d2583600d	Merge 'Add tablet repair scheduler support' from Asias He This adds a new tablet migration kind: repair. It allows the tablet repair scheduler to use this migration kind to schedule repair jobs. The current repair scheduler implementation does the following: - A tablet is picked to be repaired when requested by the user - The tablet repair can be scheduled along with tablet migration and rebuild. It runs in the tablet_migration track. - Repair jobs are scheduled in a smart way so that at any point in time, there are no more than the configured number of jobs per shard, which is similar to scylla manager's control. New feature. No backport is needed. Closes scylladb/scylladb#21088 * github.com:scylladb/scylladb: test: Add tests for tablet repair scheduler repair: Add restful API for tablet repair repair: Add tablet repair scheduler internal API support docs: Update system_keyspace.md for tablet repair related info docs: Add docs for tablet repair migration repair: Add core tablet repair scheduler support messaging_service: Introduce TABLET_REPAIR verb tablet_allocator: Introduce stream_weight for tablet_migration_streaming_info network_topology_strategy: Preserve fields of task_info in reallocate_tablets	2024-11-20 13:28:17 +01:00
Amnon Heiman	1e4fb2442a	alternator/test_metrics: Add tests for table consumption units Adding tests to verify the RCU and WCU metrics. A new helper function check_increases_metric_exact checks that a given metric increased by a given number. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2024-11-20 11:28:53 +02:00
Amnon Heiman	95c45ca269	test_returnconsumedcapacity.py: Add putItem tests This patch adds testing for putItem consume capacity. There is an additional test for number support. Numbers are encoded differently with alternator and dynamoDB, the test adds some flexibility in the result so it would pass both DynamoDB and Alternator. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2024-11-20 11:27:43 +02:00
Botond Dénes	d94591c260	Merge 'treewide: replace boost::find_if with std::ranges::find_if' from Kefu Chai now that we are allowed to use C++23. we now have the luxury of using `std::ranges::find_if`. in this change, we: - replace `boost::find_if` with `std::ranges::find_if` - remove all `#include <boost/range/algorithm/find_if.hpp>` to reduce the dependency to boost for better maintainability, and leverage standard library features for better long-term support. this change is part of our ongoing effort to modernize our codebase and reduce external dependencies where possible. --- it's a cleanup, hence no need to backport. Closes scylladb/scylladb#21495 * github.com:scylladb/scylladb: treewide: replace boost::find_if with std::ranges::find_if counters: replace boost::find_if with std::ranges::find_if combine.hh: use std::iter_const_reference_t when appropriate	2024-11-20 09:58:13 +02:00
Botond Dénes	075ca6cc02	Merge 'cql3: respect PER PARTITION LIMIT for aggregate queries' from Paweł Zakrzewski Currently, PER PARTITION LIMIT is not implemented for aggregates and queries can result in more rows than expected from the same partition. Instrument the result_set_builder class so that it can enforce PER PARTITION LIMIT for aggregate queries, specifically: - add per_partition_limit to the result_set_builder - expose the number of input rows in the selector result_set_builder gets two new functions handling partition start and end: - accept_partition_end for notifying that a partition has been finished. This is also called when a page ends, so we cannot simply flush here, as a naive implementation could do. - accept_new_partition, where we flush_selectors() if it's indeed a new partition (and not a continuation of the previous) and the query has a grouping: we don't want to flush on new partition in a query like SELECT COUNT() FROM foo; Fixes #5363 Closes scylladb/scylladb#21125 * github.com:scylladb/scylladb: test: enable PER PARTITION LIMIT + GROUP BY tests cql3: respect PER PARTITION LIMIT for aggregates cql3: selection: count input rows in the selector cql3: selection: pass per partition limit to the result_set_builder cql3: show different messages for LIMIT and PER PARTITION LIMIT in get_limit	2024-11-20 09:54:28 +02:00
Botond Dénes	5ccbd500e0	Merge 'repair: fix task_manager_module::abort_all_repairs' from Aleksandra Martyniuk Currently, task_manager_module::abort_all_repairs marks top-level repairs as aborted (but does not abort them) and aborts all existing shard tasks. A running repair checks whether its id isn't contained in _aborted_pending_repairs and then proceeds to create shard tasks. If abort_all_repairs is executed after _aborted_pending_repairs is checked but before shard tasks are created, then those new tasks won't be aborted. The issue is the most severe for tablet_repair_task_impl that checks the _aborted_pending_repairs content from different shards, that do not see the top-level task. Hence the repair isn't stopped but it creates shard repair tasks on all shards but the one that initialized repair. Abort top-level tasks in abort_all_repairs. Fix the shard on which the task abort is checked. Fixes: #21612. Needs backport to 6.1 and 6.2 as they contain the bug. Closes scylladb/scylladb#21616 * github.com:scylladb/scylladb: test: add test to check if repair is properly aborted repair: add shard param to task_manager_module::is_aborted repair: use task abort source to abort repair repair: drop _aborted_pending_repairs and utilize tasks abort mechanism repair: fix task_manager_module::abort_all_repairs	2024-11-20 06:43:01 +02:00
Asias He	ddfec068d0	test: Add tests for tablet repair scheduler	2024-11-20 09:42:41 +08:00
Amnon Heiman	3c46d78e6a	Add test/alternator/test_returnconsumedcapacity.py This patch adds testing for the consumedCapacity header. It currently only tests get_item. The test works with both AWS and Alternator. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2024-11-19 18:43:28 +02:00
Nadav Har'El	733a4f94c7	Merge 'test/boost/view_schema_test.cc: Wait for views to build in test_view_update_generating_writetime' from Dawid Mędrek Before these changes, we didn't wait for the materialized views to finish building before writing to the base table. That led to generating an additional view update, which, in turn, led to test failures. The scenario corresponding to the summary above looked like this: 1. The test creates an empty table and MVs on it. 2. The view builder starts, but it doesn't finish immediately. 3. The test performs mutations to the base table. Since the views already exist, view updates are generated. 4. Finally, the view builder finishes. It notices that the base table has a row, so it generates a view update for it because it doesn't notice that we already have data in the view. We solve it by explicitly waiting for both views to finish building and only then start writing to the base table. Additionally, we also fix a lifetime issue of the row the test revolves around, further stabilizing CI. Fixes https://github.com/scylladb/scylladb/issues/20889 Backport: These changes have no semantic effect on the codebase, but they stabilize CI, so we want to backport them to the maintained versions of Scylla. Closes scylladb/scylladb#21632 * github.com:scylladb/scylladb: test/boost/view_schema_test.cc: Increase TTL in test_view_update_generating_writetime test/boost/view_schema_test.cc: Wait for views to build in test_view_update_generating_writetime	2024-11-19 18:10:52 +02:00
Dawid Mędrek	af4afc84ec	test/boost/view_schema_test.cc: Increase TTL in test_view_update_generating_writetime The auxiliary function `eventually()` (defined in `test/lib/eventually.hh`) tries to execute a passed function. If it throws, `eventually()` sleeps for `2^#previous_attempts` milliseconds and tries to perform it again. The default limit of attempts is 17. In `test_view_update_generating_writetime`, right before the last test case, we perform: ```cql UPDATE t USING TTL 10 AND TIMESTAMP 8 SET g=40 WHERE k=1 AND c=1; ``` The test case itself executes: ```cql SELECT WRITETIME(g) FROM t; ``` and asserts that the result of the query is equal to 8, i.e. it corresponds to the timestamp of the last write to the table `t`. However, if the test case keeps failing, then during its 14th attempt (so after sleeping for at least `2^14 - 1` milliseconds, which amounts to about 16 seconds), we'll observe the following error: ``` [Exception] - std::runtime_error: Expected row not found: [0000000000000008] not in {result_message::rows {row: null}} ``` The reason behind it is that the specified TTL is too short. 10 seconds will have already passed before the 14th attempt, so the value in the column `g` will be `NULL` again. In particular, the `WRITETIME(g)` will no longer be equal to `8`. To solve that issue, we change the TTL in the CQL statement to 300. The time spent on 17 loops of `eventually()` amounts to about `2^18 - 1` milliseconds, which is about 262 seconds. That's why setting the TTL to 300 seconds should be enough to prevent the error from occurring.	2024-11-19 13:02:34 +01:00
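The timing figures in the commit above follow from a geometric sum. The sketch below reproduces the arithmetic, assuming each sleep lasts exactly `2^k` milliseconds for `k = 0, 1, ...` as the message describes; the helper names are hypothetical and the code does not model `eventually()` itself.

```python
def total_sleep_ms(sleeps):
    """Total time slept after `sleeps` consecutive sleeps, where the k-th
    sleep (k = 0, 1, ...) lasts 2**k milliseconds. Equals 2**sleeps - 1."""
    return sum(2**k for k in range(sleeps))


def ttl_outlasts(ttl_seconds, sleeps):
    """True if a value written with this TTL is still live after sleeping."""
    return ttl_seconds * 1000 > total_sleep_ms(sleeps)


print(total_sleep_ms(14))    # 16383 ms, about 16 seconds
print(total_sleep_ms(18))    # 262143 ms, about 262 seconds
print(ttl_outlasts(10, 14))  # False: a 10 s TTL has expired by then
print(ttl_outlasts(300, 18)) # True: a 300 s TTL covers the whole retry loop
```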
Dawid Mędrek	5ca0cc4e85	test/boost/view_schema_test.cc: Wait for views to build in test_view_update_generating_writetime Before these changes, we didn't wait for the materialized views to finish building before writing to the base table. That led to generating an additional view update, which, in turn, led to test failures. The scenario corresponding to the summary above looked like this: 1. The test creates an empty table and MVs on it. 2. The view builder starts, but it doesn't finish immediately. 3. The test performs mutations to the base table. Since the views already exist, view updates are generated. 4. Finally, the view builder finishes. It notices that the base table has a row, so it generates a view update for it because it doesn't notice that we already have data in the view. We solve it by explicitly waiting for both views to finish building and only then start writing to the base table. Fixes scylladb/scylladb#20889	2024-11-19 12:51:22 +01:00
Aleksandra Martyniuk	f5795e8aa4	test: add test to check if repair is properly aborted	2024-11-19 11:59:29 +01:00
Paweł Zakrzewski	b893e63b4a	test: enable PER PARTITION LIMIT + GROUP BY tests	2024-11-19 09:28:01 +01:00
Nadav Har'El	7607f5e33e	alternator: fix "/localnodes" to not return down nodes Alternator's "/localnodes" HTTP request is supposed to return the list of nodes in the local DC to which the user can send requests. Before commit `bac7c33313` we used the gossiper is_alive() method to determine if a node should be returned. That commit changed the check to is_normal() - because a node can be alive but in a non-normal (e.g., joining) state and not ready for requests. However, it turns out that checking is_normal() is not enough, because if a node is stopped abruptly, other nodes will still consider it "normal", but down (this is the so-called "DN" state). So we need to check both is_alive() and is_normal(). This patch also adds a test reproducing this case, where a node is shut down abruptly. Before this patch, the test failed ("/localnodes" continued to return the dead node), and after it the test passes. Fixes #21538 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#21540	2024-11-19 10:04:59 +02:00
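The combined check from the commit above can be sketched as a simple filter. The `Node` type and its boolean fields are hypothetical stand-ins for the gossiper's `is_alive()` and `is_normal()` results; this is not Alternator's actual code.

```python
from dataclasses import dataclass


@dataclass
class Node:
    address: str
    alive: bool   # stands in for the gossiper's is_alive()
    normal: bool  # stands in for the gossiper's is_normal()


def local_nodes(nodes):
    """Return addresses of nodes that are both alive and in normal state."""
    return [n.address for n in nodes if n.alive and n.normal]


nodes = [
    Node("10.0.0.1", alive=True, normal=True),    # up and serving
    Node("10.0.0.2", alive=True, normal=False),   # e.g. still joining
    Node("10.0.0.3", alive=False, normal=True),   # "DN": stopped abruptly
]
print(local_nodes(nodes))  # ['10.0.0.1']
```

Checking only one of the two flags would return either the joining node or the abruptly stopped one, which is the gap the commit closes.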
Botond Dénes	36870feb29	Merge 'test: route S3 Proxy server messages through logger' from Kefu Chai This change was created in the same spirit of `f8221b960f`. The S3ProxyServer (introduced in `8919e0abab`) currently prints its status directly to stdout, which can be distracting when reviewing test results. For example: ```console $ ./test.py --mode release object_store/test_backup::test_simple_backup_and_restore Found 1 tests. Setting minio proxy random seed to 1731924995 Starting S3 proxy server on ('127.193.179.2', 9002) ================================================================================ [N/TOTAL] SUITE MODE RESULT TEST ------------------------------------------------------------------------------ [1/1] object_store release [ PASS ] object_store.test_backup.1 Stopping S3 proxy server ------------------------------------------------------------------------------ CPU utilization: 3.1% ``` Move these messages to use proper logging to give developers more control over their visibility: - Make logger parameter mandatory in S3ProxyServer constructor - Route "Stopping S3 proxy" message through the provided logger - Add --log-level option to the standalone proxy server launcher The message is now hidden: ```console $ ./test.py --mode release object_store/test_backup::test_simple_backup_and_restore Found 1 tests. ================================================================================ [N/TOTAL] SUITE MODE RESULT TEST ------------------------------------------------------------------------------ [1/1] object_store release [ PASS ] object_store.test_backup.1 ------------------------------------------------------------------------------ CPU utilization: 4.1% ``` --- this change improves the developer experience, hence no need to backport. Closes scylladb/scylladb#21610 * github.com:scylladb/scylladb: test: route S3 Proxy server messages through logger test: s3_proxy: remove unused method	2024-11-19 06:42:28 +02:00
Kefu Chai	33a0e5b892	treewide: replace boost::find_if with std::ranges::find_if now that we are allowed to use C++23. we now have the luxury of using `std::ranges::find_if`. in this change, we: - replace `boost::find_if` with `std::ranges::find_if` - remove all `#include <boost/range/algorithm/find_if.hpp>` to reduce the dependency to boost for better maintainability, and leverage standard library features for better long-term support. this change is part of our ongoing effort to modernize our codebase and reduce external dependencies where possible. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-11-19 10:50:01 +08:00
Avi Kivity	b14871ad3f	Merge 'code cleanup: remove "sstring_view" and replace its usages by std::string_view' from Nadav Har'El For historic reasons, we have (in bytes.hh) a type sstring_view which is an alias for std::string_view - since the same standard type can hold a pointer into both a seastar::sstring and std::string. This alias is unnecessary and misleading to new developers, who might assume it is somehow different from std::string_view - when it isn't. This series removes all uses of sstring_view (changing them to use std::string_view), and in the last patch removes the alias itself. A few functions whose names referred to "sstring" but take a std::string_view were renamed. The patches are fairly mechanical and trivial, with no functional changes intended. To ease the review the series was split into a few smaller patches that modify specific areas of the code. Fixes #4062. Closes scylladb/scylladb#21617 * github.com:scylladb/scylladb: bytes: remove unused alias sstring_view change remaining sstring_view to std::string_view test: change sstring_view to std::string_view cql3: change sstring_view to std::string_view alternator: change sstring_view to std::string_view type: change from_sstring() to from_string_view() cross-tree: change to_sstring_view() to to_string_view()	2024-11-18 22:43:46 +02:00
Tomasz Grabiec	06d478793d	Merge 'mutation: switch from boost ranges to std ranges' from Avi Kivity Wean the mutation code (at least the headers) from boost ranges to std ranges, in order to reduce the dependency load. Cleanup, so no backport. Closes scylladb/scylladb#21601 * github.com:scylladb/scylladb: partition_snapshot_row_cursor.hh: switch from boost ranges to std ranges mutation: mutation_partition_v2.hh: switch from boost ranges to std ranges mutation: mutation_partition.hh: switch from boost ranges to std ranges partition_snapshot_reader.hh: drop unused include boost/range/algorithm/heap_algorithm.hpp	2024-11-18 21:23:29 +01:00
Nadav Har'El	e72aabae7f	test: change sstring_view to std::string_view Our "sstring_view" is a historic alias for the standard std::string_view. The test/ directory used this old alias in a few random places; let's change them to use the standard type name. Refs #4062. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2024-11-18 16:26:20 +02:00
Nadav Har'El	da99dc3a7f	cross-tree: change to_sstring_view() to to_string_view() For historic reasons, we have (in bytes.hh) a type sstring_view which is an alias for std::string_view - since the same standard type can hold a pointer into both a seastar::sstring and std::string. This alias is unnecessary and misleading to new developers (who might assume it is somehow different from std::string_view). This patch doesn't yet remove all occurrences of sstring_view (the request in #4062), but begins to do it by renaming one commonly used function, to_sstring_view(bytes), to to_string_view() and changing all its uses to the new name. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2024-11-18 14:57:49 +02:00
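The point of the sstring_view commits can be sketched in a few lines: the alias names exactly the same type as `std::string_view`, so a helper that views raw bytes as characters can return the standard type directly. This is an illustration under stated assumptions: `byte_buffer` is a stand-in for the real `bytes` type, and this `to_string_view` is a hypothetical counterpart of the renamed helper, not its actual implementation.

```cpp
#include <cstdint>
#include <string_view>
#include <type_traits>
#include <vector>

// The historic alias, reproduced here only to show that it adds nothing.
using sstring_view = std::string_view;
static_assert(std::is_same_v<sstring_view, std::string_view>);

// Stand-in for a byte container (assumption: the real `bytes` type differs).
using byte_buffer = std::vector<std::int8_t>;

// Hypothetical helper: view the bytes as characters without copying.
// The returned view is valid only while `b` is alive and unmodified.
inline std::string_view to_string_view(const byte_buffer& b) {
    return {reinterpret_cast<const char*>(b.data()), b.size()};
}
```

Because the two names denote one type, replacing the alias is a purely textual change with no effect on overload resolution or ABI.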
Kefu Chai	cb24022b54	test: route S3 Proxy server messages through logger This change was created in the same spirit of `f8221b960f`. The S3ProxyServer (introduced in `8919e0abab`) currently prints its status directly to stdout, which can be distracting when reviewing test results. For example: ```console $ ./test.py --mode release object_store/test_backup::test_simple_backup_and_restore Found 1 tests. Setting minio proxy random seed to 1731924995 Starting S3 proxy server on ('127.193.179.2', 9002) ================================================================================ [N/TOTAL] SUITE MODE RESULT TEST ------------------------------------------------------------------------------ [1/1] object_store release [ PASS ] object_store.test_backup.1 Stopping S3 proxy server ------------------------------------------------------------------------------ CPU utilization: 3.1% ``` Move these messages to use proper logging to give developers more control over their visibility: - Make logger parameter mandatory in S3ProxyServer constructor - Route "Stopping S3 proxy" message through the provided logger - Add --log-level option to the standalone proxy server launcher The message is now hidden: ```console $ ./test.py --mode release object_store/test_backup::test_simple_backup_and_restore Found 1 tests. ================================================================================ [N/TOTAL] SUITE MODE RESULT TEST ------------------------------------------------------------------------------ [1/1] object_store release [ PASS ] object_store.test_backup.1 ------------------------------------------------------------------------------ CPU utilization: 4.1% ``` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-11-18 18:41:17 +08:00
Kefu Chai	0dff187b7a	test: s3_proxy: remove unused method neither `InjectingHandler.log_error`, nor `InjectingHandler.log_message` is used. so let's drop them. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-11-18 18:39:15 +08:00
Aleksandra Martyniuk	572b005774	repair: implement tablet_repair_task_impl::release_resources tablet_repair_task_impl keeps a vector of tablet_repair_task_meta, each of which keeps an effective_replication_map_ptr. So, after the task completes, the token metadata version will not change for task_ttl seconds. Implement tablet_repair_task_impl::release_resources method that clears tablet_repair_task_meta vector when the task finishes. Set task_ttl to 1h in test_tablet_repair to check whether the test won't time out. Fixes: #21503. Closes scylladb/scylladb#21504	2024-11-18 12:29:58 +02:00
Avi Kivity	3a6c0a9b36	Merge 'compaction: Perform integrity checks on compacting SSTables' from Nikos Dragazis This PR enables compaction tasks to verify the integrity of the input data through checksum and digest checks. The mechanism for integrity checking was introduced in previous PRs (#20207, #20720) as a built-in functionality of the input streams. This PR integrates this mechanism with compaction. The change applies to all compaction types and covers both compressed and uncompressed SSTables adhering to the 3.x format. If a compaction task reads only part of an SSTable, then only the per-chunk checksums are verified, not the digest. The PR consists of: * Changes to mx readers to support integrity checking. The kl readers, considered as compatibility-only, were left unchanged. Also, integrity checking on single-partition reversed reads (`data_consume_reversed_partition()`) remains unsupported by mx readers as this is not used in compaction. * Changes to `sstable` and `sstable_set` APIs to allow toggling integrity checks for mx readers. * Activation of integrity checking for all compaction types. * Tests for all compaction types with corrupted SSTables. Integrity checks come at a cost. For uncompressed SSTables, the cost is the loading of the CRC and Digest components from disk, and the calculation of checksums and digest from the actual data. For compressed SSTables, checksums are stored in-place and they are being checked already on all reads, so the only extra cost is the loading and calculation of the digest. The measurements show a ~5% regression in compaction performance for uncompressed SSTables, and a negligible regression for compressed SSTables. 
Command: `perf-sstable --smp=1 --cpuset=1 --poll-mode --mode=compaction --iterations=1000 --partitions 10000 --sstables=1 --key_size=4096 --num_columns=15 --column_size={32, 1024, 3500, 7000, 14500}`

Uncompressed SSTables:
```
+--------------+-----------------------+----------------------+------------+
| SSTable Size | No Integrity (p/sec)  | Integrity (p/sec)    | Regression |
+--------------+-----------------------+----------------------+------------+
| 50 MiB       | 65175.59 +- 80.82     | 61814.63 +- 72.88    | 5.16%      |
| 200 MiB      | 41795.10 +- 60.39     | 39686.28 +- 45.05    | 5.05%      |
| 500 MiB      | 21087.41 +- 30.72     | 20092.93 +- 25.05    | 4.72%      |
| 1 GiB        | 12781.64 +- 21.77     | 12233.94 +- 21.71    | 4.29%      |
| 2 GiB        | 6629.99 +- 9.40       | 6377.13 +- 8.28      | 3.81%      |
+--------------+-----------------------+----------------------+------------+
```

Compressed SSTables:
```
+--------------+-----------------------+----------------------+------------+
| SSTable Size | No Integrity (p/sec)  | Integrity (p/sec)    | Regression |
+--------------+-----------------------+----------------------+------------+
| 50 MiB       | 53975.05 +- 63.18     | 53825.93 +- 62.28    | 0.28%      |
| 200 MiB      | 28687.94 +- 26.58     | 28689.41 +- 26.91    | 0%         |
| 500 MiB      | 13865.35 +- 15.50     | 13790.41 +- 14.88    | 0.54%      |
| 1 GiB        | 7858.10 +- 7.71       | 7829.75 +- 9.66      | 0.36%      |
| 2 GiB        | 4023.11 +- 2.43       | 4010.54 +- 2.55      | 0.31%      |
+--------------+-----------------------+----------------------+------------+
(p/sec = partitions/sec)
```

Refs #19071. New feature, no backport is needed.
Closes scylladb/scylladb#21153 * github.com:scylladb/scylladb: test: Add test for compaction with corrupted SSTables compaction: Enable integrity checks for all compaction types sstables: Add integrity option to factories for sstable_set readers sstables: Add integrity option to sstable::make_reader() sstables: Add integrity option to mx::make_reader() sstables: Load checksums and digests in mx full-scan reader sstables: Add integrity option to data_consume_single_partition() sstables: Disengage integrity_check from sstable class sstables: Allow data sources to disable digest check	2024-11-17 20:59:31 +02:00
Tomasz Grabiec	8738d9bfa0	system_tables: Compute schema version automatically This depends on the previous change to the schema_builder which makes version computation depend on definition only instead of being new time uuid. This way we avoid the possibility for a common mistake when schema of a system table is extended but we forget to bump up its version passed to .with_version().	2024-11-15 19:16:41 +01:00
Avi Kivity	1c26c8deeb	mutation: mutation_partition_v2.hh: switch from boost ranges to std ranges Consolidate on one range solution. Fallout in mutation_partition_v2.cc and row_cache_test.cc due to interoperability problems is adjusted.	2024-11-15 14:36:28 +02:00
Botond Dénes	fed2c6ba83	sstables/mx/reader: release column value buffer after consumed data_consume_rows_context_m has a _column_value buffer it uses to read key and column values into, preparing for parsing and consuming them. This buffer is reset (released) in a few different cases: * When using it for key - after consuming its content * When using it for column value - when a column has no value However, the buffer is not released when used for a column value and the column is consumed. This means that if a large column is read from the sstable, this buffer can potentially linger and keep consuming memory until either one of the other release scenarios is hit, or the reader is destroyed. Add a third release scenario, releasing the buffer after the row end was consumed. This allows the buffer to be re-used between columns of the same row, at the same time ensuring that a large buffer will not linger. This patch can almost halve the memory consumption of reads in certain circumstances. Case in point: the test test_reader_concurrency_semaphore_memory_limit_engages starts to fail after this fix, because the read doesn't trigger the OOM limit anymore and needs doubling of the concurrency to keep passing. This issue was found in a dtest (`test_ics_refresh_with_big_sstable_files`), which writes some large cells of up to 7MiB. After reading the row containing this large cell, the reader holds on to the 7MiB buffer, causing the semaphore's OOM protection to kick in down the line. Fixes: https://github.com/scylladb/scylladb/issues/21160 Closes scylladb/scylladb#21132	2024-11-14 17:24:53 +01:00
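The release-at-row-end idea from the commit above can be modelled with a small sketch. `row_context` is a hypothetical, heavily simplified stand-in for the real parsing context; only the buffer's lifetime is modelled, using a `std::vector<char>` in place of whatever buffer type the reader actually uses.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified parsing context (not the real reader class).
class row_context {
    std::vector<char> _column_value;
public:
    // Reading a column value reuses the existing allocation when it is
    // large enough, so columns of the same row can share the buffer.
    void read_column(std::size_t size) {
        _column_value.assign(size, '\0');
    }

    // Release rather than clear: swapping with an empty vector frees the
    // allocation, so one large cell does not pin memory after its row ends.
    void on_row_end() {
        std::vector<char>().swap(_column_value);
    }

    std::size_t buffered_capacity() const {
        return _column_value.capacity();
    }
};
```

Calling `clear()` instead of swapping would keep the capacity, which is the lingering-buffer behaviour the commit message describes for a 7MiB cell.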


7877 Commits