scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-23 01:50:35 +00:00

Author	SHA1	Message	Date
Aleksandra Martyniuk	9b5d69ae96	tasks: improve task_manager::lookup_virtual_task Currently, lookup_virtual_task gets the list of ids of all operations tracked by a virtual task and checks whether it contains given id. The list of all ids isn't required and the check whether one particular operation id is tracked by the virtual task may be quicker than listing all operations. Add virtual_task::contains method and use it in lookup_virtual_task.	2024-10-30 12:24:38 +01:00
Avi Kivity	73b1f66b70	Revert "Merge 'Allow explicitly enabling or disabling tablets when creating a new keyspace' from Benny Halevy" This reverts commit `c286434e4c`, reversing changes made to `6712fcc316`. The commit causes memtable_test to be very flaky in debug mode, specifically the subtests test_exceptions_in_flush_on_sstable_open and test_exceptions_in_flush_on_sstable_write.	2024-10-30 00:55:29 +02:00
Avi Kivity	b9df3aec12	gdb: avoid @classmethod/@property combinations The @classmethod/@property combination was deprecated in Python 3.11 and removed[1] in Python 3.13. It's used in scylla-gdb.py, breaking it with Python 3.13. To fix, just make all users (size_t and _vptr_type) top-level functions. The definitions are all identical and don't need to be in class scope. [1] https://docs.python.org/3.13/library/functions.html#classmethod Closes scylladb/scylladb#21349	2024-10-29 19:37:07 +02:00
Gleb Natapov	cc7f25062a	topology coordinator: take a copy of a replication state in raft_topology_cmd_handler Current code takes a reference and holds it past preemption points. And while the state itself is not supposed to change, the reference may become stale because the state is re-created on each raft topology command. Fix it by taking a copy instead. This is a slow path anyway. Fixes: scylladb/scylladb#21220 Closes scylladb/scylladb#21316	2024-10-29 15:47:43 +01:00
Avi Kivity	020ccbd76a	Merge 'utils: cached_file: Mark permit as awaiting on page miss' from Tomasz Grabiec Otherwise, the read will be considered as on-cpu during promoted index search, which will severely underutilize the disk because by default on-cpu concurrency is 1. I verified this patch on the worst case scenario, where the workload reads missing rows from a large partition. So partition index is cached (no IO) and there is no data file IO (relies on https://github.com/scylladb/scylladb/pull/20522). But there is IO during promoted index search (via cached_file). Before the patch this workload was doing 4k req/s, after the patch it does 30k req/s. The problem is much less pronounced if there is data file or partition index IO involved because that IO will signal read concurrency semaphore to invite more concurrency. Fixes #21325 Closes scylladb/scylladb#21323 * github.com:scylladb/scylladb: utils: cached_file: Mark permit as awaiting on page miss utils: cached_file: Push resource_unit management down to cached_file	2024-10-29 16:15:21 +02:00
Kamil Braun	36cc3bcc90	test: test_crash_coordinator_before_streaming: enable TRACE for `raft_topology` logger Issue scylladb/scylladb#21114 reported that sometimes during the test we time out when waiting for the node to restart after it was killed. Preliminary investigation showed that the node appears to be hanging inside `topology_state_load`, while holding `token_metadata` lock, which prevents `join_topology` from progressing. Enable TRACE level logging for `raft_topology` so we get more accurate info where inside `topology_state_load` the hang happens, once the problem reproduces again in CI. Closes scylladb/scylladb#21247	2024-10-29 12:46:47 +02:00
Kefu Chai	54d438168a	build: cmake: explicitly mark convenience libraries as STATIC before this change, these [convenience libraries](https://www.gnu.org/software/automake/manual/html_node/Libtool-Convenience-Libraries.html) were implicitly built as static libraries by default, but weren't explicitly marked as STATIC in CMake. While this worked with default settings, it could cause issues if `BUILD_SHARED_LIBS` is enabled. So before we are ready for building these components as shared libraries, let's mark all convenience libraries as STATIC for consistency and to prevent potential issues before we properly support shared library builds. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21274	2024-10-29 10:22:19 +01:00
Yaron Kaikov	94a9efbf1c	github: add script for backports automation instead of Mergify Adding an auto-backport.py script to handle backport automation instead of Mergify. The rules of backport are as follows: * Merged or Closed PRs with any backport/x.y label (one or more) and promoted-to-master label * Backport PR will be automatically assigned to the original PR author * In case of conflicts the backport PR will be opened in the original author's fork in draft mode. This will give the PR owner the option to resolve conflicts and push those changes to the PR branch (Today in Scylla when we have conflicts, the developers are forced to open another PR and manually close the backport PR opened by Mergify) * Fixes cherry-picking the wrong commit SHA. With the new script, we always take the SHA from the stable branch * Support backport for enterprise releases (from Enterprise branch) Fixes: https://github.com/scylladb/scylladb/issues/18973 Closes scylladb/scylladb#21302	2024-10-29 10:04:30 +02:00
Avi Kivity	49d3e281d6	Merge 'Sanitize /system/highest_supported_sstable_version API endpoint' from Pavel Emelyanov Its handler dereferences a long chain of objects to get to the value it needs. There's a shorter way. Also, the endpoint in question is not unregistered on stop. Closes scylladb/scylladb#21279 * github.com:scylladb/scylladb: api: Make get_highest_supported_sstable_version use proper service api: Move system::get_highest_supported_sstable_version set/unset api: Scaffold for sstables-format-selector	2024-10-28 21:42:41 +02:00
Pavel Emelyanov	b09bb6bc19	error_injection: Re-use enter() code in inject() overloads Most of inject() overloads check if the injection is enabled, then optionally clear the one-shot one, then do the injection. Everything but doing the injection is implemented in the enter() method, it's perfectly worth re-using one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#21285	2024-10-28 21:37:20 +02:00
Kefu Chai	7610b907c6	build: include subdirectory rules in compilation database merge Previously in `e65185ba`, when merging Seastar's and ScyllaDB's compilation databases, the "prefix" parameter in merge-compdb.py was too restrictive. It only included build rules for files with "CMakeFiles" prefix, excluding source files in subdirectories like `apps/iotune/CMakeFiles/app_iotune.dir/iotune.cc.o`. In this change, we change the prefix parameter to an empty string to include all source files whose object files are located under build directories, regardless of their path structure. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21312	2024-10-28 21:34:21 +02:00
Avi Kivity	c286434e4c	Merge 'Allow explicitly enabling or disabling tablets when creating a new keyspace' from Benny Halevy Separate the configuration for enabling the tablets feature from the enablement of tablets when creating new keyspaces. This change always enables the TABLETS cluster feature and the tablets logic respectively. The `enable_tablets` config option just controls whether tablets are enabled or disabled by default for new keyspaces. If `enable_tablets` is set to `true`, tablets can be disabled using `CREATE KEYSPACE WITH tablets = { 'enabled': false }` as it is today. If `enable_tablets` is set to `false`, tablets can be enabled using `CREATE KEYSPACE WITH tablets = { 'enabled': true }`. The motivation for this change is to simplify the user experience of using tablets by setting the default for new keyspaces to false and allowing the user to simply opt-in by using tablets = {enabled: true }. This is not possible today. The user has to enable tablets by default for all new keyspaces (that use the NetworkTopologyStrategy) and then actively opt-out to use vnodes. * Not required to be backported to OSS versions. May be backported to specific enterprise versions Closes scylladb/scylladb#20729 * github.com:scylladb/scylladb: data_dictionary: keyspace_metadata::describe: print tablets enabled also when defaulted tablets_test: test enable/disable tablets when creating a new keyspace treewide: always allow tablets keyspaces feature_service: prevent enabling both tablets and gossip topology changes alternator: create_keyspace_metadata: enable tablets using feature_service	2024-10-28 21:33:17 +02:00
Nadav Har'El	6712fcc316	test/cql-pytest: add option to run cql-pytest tests against specific release This patch adds the option "--release <version>" to test/cql-pytest/run, which downloads the pre-compiled Scylla release with the given version number and runs the tests against that version. For example, it can be used to demonstrate that #15559 was indeed a regression between 2022.1 and 2022.2, by running a recently-added test against these two old versions: test/cql-pytest/run --release 2022.1 --runxfail \ test_prepare.py::test_duplicate_named_bind_marker_prepared test/cql-pytest/run --release 2022.2 --runxfail \ test_prepare.py::test_duplicate_named_bind_marker_prepared The first run passes, the second fails - showing the regression. The Scylla releases are downloaded from ScyllaDB's S3 bucket (downloads.scylladb.com). They are saved in the build/ directory (e.g., build/2022.2.9), and if that directory is not removed, when "run --release" requests the same version again, the previous download is reused. Release numbers can look like: * 5.4.7 * 5.4 (will get the latest in the 5.4 branch, e.g., 5.4.7) * 5.4.0~rc2 (a prerelease) * 2021.1.9 (Enterprise release) * 2023.1 (latest in this branch, Enterprise release) Fixes #13189 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#19228	2024-10-28 21:29:44 +02:00
Kefu Chai	f3dee5b636	build: enable CMAKE_CXX_EXTENSIONS explicitly before this change, Seastar enables CXX_EXTENSIONS in its own build rules. but it does not expose it to the parent project. but scylladb's CMake building system respects seastar's .pc file and includes the cflags exposed by it. without this change, scylladb included "-std=c++23" from seastar, and "-std=gnu++23" from itself. this is both confusing and inconsistent with the build rules generated by `configure.py`. in this change, we explicitly set `CMAKE_CXX_EXTENSIONS` when creating Seastar's building rules, so that it can populate this setting to its .pc file. in this way, we don't have two different options for specifying the C++ standard when building scylladb with CMake. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21311	2024-10-28 21:23:04 +02:00
Kefu Chai	8b80ef3290	build: Remove GCC ARM warning workaround (originally added in `193d1942`) The workaround was initially added to silence warnings on GCC < 6.4 for ARM platforms due to a compiler bug (gcc.gnu.org/bugzilla/show_bug.cgi?id=77728). Since our codebase now requires modern GCC versions for coroutine support, and the bug was fixed in GCC 6.4+, this workaround is no longer needed. Refs `193d1942f2` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21308	2024-10-28 21:19:56 +02:00
Avi Kivity	94c21e5c05	Merge 'sstables: Reduce amount of I/O for clustering-key-bounded reads from large partitions' from Tomasz Grabiec Single-row reads from large partition issue 64 KiB reads to the data file, which is equal to the default span of the promoted index block in the data file. If users would want to increase selectivity of the index to speed up single-row reads, this won't be effective. The reason is that the reader uses promoted index to look up the start position in the data file of the read, but end position will in practice extend to the next partition, and amount of I/O will be determined by the underlying file input stream implementation and its read-ahead heuristics. By default, that results in at least 2 IOs 32KB each. There is already infrastructure to lookup end position based on upper bound of the read, in anticipation for sharing the promoted index cache, but it's not effective because it's a non-populating lookup and the upper bound cursor has its own private cached_promoted_index, which is cold when positions are computed. It's non-populating on purpose, to avoid extra index file IO to read upper bound. In case upper bound is far-enough from the lower bound, this will only increase the cost of the read. The solution employed here is to warm up the lower bound cursor's cache before positions are computed, and use that cursor for non-populating lookup of the upper bound. We use the lower bound cursor and the slice's lower bound so that we read the same blocks as later lower-bound slicing would, so that we don't incur extra IO for cases where looking up upper bound is not worth it, that is when upper bound is far from the lower bound. If upper bound is near lower bound, then warming up using lower bound will populate cached_promoted_index with blocks which will allow us to locate the upper bound block accurately. This is especially important for single-row reads, where the bounds are around the same key. 
In this case we want to read the data file range which belongs to a single promoted index block. It doesn't matter that the upper bound is not exactly the same. They both will likely lie in the same block, and if not, binary search will bring adjacent blocks into cache. Even if upper bound is not near, the binary search will populate the cache with blocks which can be used to narrow down the data file range somewhat. Fixes #10030. The change was tested with perf-fast-forward. I populated the data set with `column_index_size_in_kb` set to 1 scylla perf-fast-forward --populate --run-tests=large-partition-slicing --column-index-size-in-kb=1 Test run: build/release/scylla perf-fast-forward --run-tests=large-partition-select-few-rows -c1 --keep-cache-across-test-cases --test-case-duration=0 This test issues two reads of subsequent keys from the middle of a large partition (1M rows in total). The first read will miss in the index file page cache, the second read will hit. Notice that before the change, the second read issued 2 aio requests worth of 64KiB in total. After the change, the second read issued 1 aio worth of 2 KiB. That's because promoted index block is larger than 1 KiB. I verified using logging that the data file range matches a single promoted index block. Also, the first read which misses in cache is still faster after the change. 
Before: ``` running: large-partition-select-few-rows on dataset large-part-ds1 Testing selecting few rows from a large partition: stride rows time (s) iterations frags frag/s mad f/s max f/s min f/s avg aio aio (KiB) blocked dropped idx hit idx miss idx blk c hit c miss c blk allocs tasks insns/f cpu 500000 1 0.009802 1 1 102 0 102 102 21.0 21 196 2 1 0 1 1 0 0 0 568 269 4716050 53.4% 500001 1 0.000321 1 1 3113 0 3113 3113 2.0 2 64 1 0 1 0 0 0 0 0 116 26 555110 45.0% ``` After: ``` running: large-partition-select-few-rows on dataset large-part-ds1 Testing selecting few rows from a large partition: stride rows time (s) iterations frags frag/s mad f/s max f/s min f/s avg aio aio (KiB) blocked dropped idx hit idx miss idx blk c hit c miss c blk allocs tasks insns/f cpu 500000 1 0.009609 1 1 104 0 104 104 20.0 20 137 2 1 0 1 1 0 0 0 561 268 4633407 43.1% 500001 1 0.000217 1 1 4602 0 4602 4602 1.0 1 2 1 0 1 0 0 0 0 0 110 26 313882 64.1% ``` Backports: none, not a regression Closes scylladb/scylladb#20522 * github.com:scylladb/scylladb: perf: perf_fast_forward: Add test case for querying missing rows perf-fast-forward: Allow overriding promoted index block size perf-fast-forward: Test subsequent key reads from the middle in test_large_partition_select_few_rows perf-fast-forward: Allow adding key offset in test_large_partition_select_few_rows perf-fast-forward: Use single-partition reads in test_large_partition_select_few_rows sstables: bsearch_clustered_cursor: Add more tracing points sstables: reader: Log data file range sstables: bsearch_clustered_cursor: Unify skip_info logging sstables: bsearch_clustered_cursor: Narrow down range using "end" position of the block sstables: bsearch_clustered_cursor: Skip even to the first block test: sstables: sstable_3_x_test: Improve failure message sstables: mx: writer: Never include partition_end marker in promoted index block width sstables: Reduce amount of I/O for clustering-key-bounded reads from large partitions sstables: 
clustered_cursor: Track current block	2024-10-28 21:13:23 +02:00
Tomasz Grabiec	0f2101b055	utils: cached_file: Mark permit as awaiting on page miss Otherwise, the read will be considered as on-cpu during promoted index search, which will severely underutilize the disk because by default on-cpu concurrency is 1. I verified this patch on the worst case scenario, where the workload reads missing rows from a large partition. So partition index is cached (no IO) and there is no data file IO. But there is IO during promoted index search (via cached_file). Before the patch this workload was doing 4k req/s, after the patch it does 30k req/s. The problem is much less pronounced if there is data file or index file IO involved because that IO will signal read concurrency semaphore to invite more concurrency.	2024-10-28 19:54:58 +01:00
Tomasz Grabiec	868f5b59c4	utils: cached_file: Push resource_unit management down to cached_file It saves us permit operations on the hot path when we hit in cache. Also, it will lay the ground for marking the permit as awaiting later.	2024-10-28 19:49:58 +01:00
Kamil Braun	101c1d50f0	Merge 'fix nodetool status to show zero-token nodes' from Abhinav Kumar Jha In the current scenario, the nodetool status doesn’t display information regarding zero token nodes. For example, if 5 nodes are spun up by the administrator, out of which 2 nodes are zero token nodes, then nodetool status only shows information regarding the 3 non-zero token nodes. This commit intends to fix this issue by leveraging the “/storage_service/host_id” API and adding appropriate logic in scylla-nodetool.cc to support zero token nodes. A test is also added in nodetool/test_status.py to verify this logic. This test fails without this commit’s zero token node support logic, hence verifying the behavior. This PR fixes a bug. Hence we need to backport it. Backporting needs to be done only to 6.2 version, since earlier versions don't support zero token nodes. Fixes: scylladb/scylladb#19849 Fixes: scylladb/scylladb#17857 Closes scylladb/scylladb#20909 * github.com:scylladb/scylladb: fix nodetool status to show zero-token nodes test: move `wait_for_first_completed` to pylib/util.py token_metadata: rename endpoint_to_host_id_map getter and add support for joining nodes	2024-10-28 12:19:36 +01:00
Kefu Chai	9f8adcd207	backup_task: track the first failure uploading sstables before this change, we only record the exception returned by `upload_file()`, and rethrow the exception. but the exception thrown by `upload_file()` is not propagated to its caller. instead, the exceptional future is ignored on purpose -- we need to perform the uploads in parallel. this is why the task is not marked as failed even if some of the uploads performed by it fail. in this change, we - coroutinize `backup_task_impl::do_backup()`. strictly speaking, this is not necessary to propagate the exception. but, in order to ensure that the possible exception is captured before the gate is closed, and to reduce the indentation, the teardown steps are performed explicitly. - in addition to noting down the exception in the logging message, we also store it in a local variable, which is rethrown before this function returns. Fixes scylladb/scylladb#21248 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21254	2024-10-28 12:54:27 +03:00
Tzach Livyatan	1878af9399	Update os-support-info.rst - add CentOS ScyllaDB supports RHEL 9 and derivatives, including CentOS 9. Fix https://github.com/scylladb/scylladb/issues/21309 Closes scylladb/scylladb#21310	2024-10-28 10:02:31 +02:00
Kefu Chai	8ac471b74b	dht: do not include unused headers in `8d1b3223`, we removed some unused "#include"s, but we failed to address all of them in "dht" subdirectory. and the unaddressed "#include"s are identified by the iwyu workflow. in this change, we address the leftovers. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21291	2024-10-28 09:58:42 +02:00
Anna Stuchlik	44a807f5bc	doc: improve the README file in the docs folder This commit improves the README file so that it's more helpful to documentation contributors. Especially, it: - Adds the link to the prerequisites. - Adds information on troubleshooting (checking the links, headings, etc.) - Removes the section on creating a knowledge base article, as we no longer promote adding KBs in favor of creating a coherent documentation set. Fixes https://github.com/scylladb/scylladb/issues/21257 Closes scylladb/scylladb#21262	2024-10-28 09:55:40 +02:00
Anna Stuchlik	212eb204a7	doc: set 6.2 as the latest stable version This commit updates the configuration for ScyllaDB documentation so that: - 6.2 is the latest version. - 6.2 is removed from the list of unstable versions. It must be merged when ScyllaDB 6.2 is released. In addition, this commit uncomments the redirections that should be applied when version 6.2 is the latest stable version (which will happen when this commit is merged). No backport is required. Closes scylladb/scylladb#21133	2024-10-28 09:45:37 +02:00
Pavel Emelyanov	420baf5035	api: Make get_highest_supported_sstable_version use proper service This endpoint now grabs one via database -> table -> sstables manager chain, but there's a shorter route, namely via sstables format selector. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-10-28 10:18:57 +03:00
Pavel Emelyanov	61c8b571e5	api: Move system::get_highest_supported_sstable_version set/unset It's currently registered with all other system endpoints and is not unregistered. Its correct place is in the sstables-format-selector set/unset functions. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-10-28 10:18:23 +03:00
Pavel Emelyanov	f090bdabbb	api: Scaffold for sstables-format-selector This "service" will have its own endpoint soon Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-10-28 10:17:38 +03:00
Botond Dénes	31342ecb5d	Merge 'tasks: fix virtual tasks children' from Aleksandra Martyniuk Fix how regular tasks that have a virtual parent are created in task_manager::module::make_task: set sequence number of a task and subscribe to module's abort source. Fixes: #21278. Needs backport to 6.2 Closes scylladb/scylladb#21280 * github.com:scylladb/scylladb: tasks: fix sequence number assignment tasks: fix abort source subscription of virtual task's child	2024-10-28 08:59:40 +02:00
Aleksandra Martyniuk	85d9565158	test: repair: drop log checks from test_repair_succeeds_with_unitialized_bm Currently, test_repair_succeeds_with_unitialized_bm checks whether repair finishes successfully and the error is properly handled if batchlog_manager isn't initialized. Error handling depends on logs, making the test fragile to external conditions and flaky. Drop the error handling check, successful repair is a sufficient passing condition. Fixes: #21167. Closes scylladb/scylladb#21208	2024-10-28 08:39:16 +02:00
Botond Dénes	416159e5d9	Merge 'docs/alternator: explain service discovery HTTP requests' from Nadav Har'El Add a description of the service discovery HTTP requests - `/` and `/localnodes` that was previously not documented except in a design document that is unfortunately no longer available publicly (https://docs.google.com/document/d/1twgrs6IM1B10BswMBUNqm7bwu5HCm47LOYE-Hdhuu_8/edit). Fixes https://github.com/scylladb/scylladb/issues/20989 Developer-oriented documentation so no need to backport. Closes scylladb/scylladb#21000 * github.com:scylladb/scylladb: docs/alternator: explain service discovery HTTP requests docs/alternator: split Alternator-specific APIs from alternator.md	2024-10-28 08:21:28 +02:00
Benny Halevy	2268912589	docs: add documentation for scylla_identifier Commit `3a12ad96c7` added an sstable_identifier uuid to the SSTable scylla_metadata component, however it was under-documented and this patch adds the missing documentation for the sstable component format, and to the scylla sstable tool documentation. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#21221	2024-10-28 08:18:08 +02:00
Kefu Chai	0f9d2ab577	build: cmake: disable Seastar exception hack in `cc3953e5`, we disabled Seastar exception hack in configure.py. this change disabled the Seastar exception hack in the following two builds: - build generated directly by configure.py - build configured with multi-config generator using CMake but we also have non-multi-config build using CMake. to be more consistent, let's apply the equivalent change to non-multi-config build of CMake. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21233	2024-10-28 08:11:43 +02:00
Botond Dénes	be70755f47	Merge 'repair: Fix finished ranges metrics for removenode' from Asias He The skipped ranges should be multiplied by the number of tables. Otherwise the finished ranges ratio will not reach 100%. Fixes #21174 Closes scylladb/scylladb#21252 * github.com:scylladb/scylladb: test: Add test_node_ops_metrics.py repair: Make the ranges more consistent in the log repair: Fix finished ranges metrics for removenode	2024-10-28 08:09:32 +02:00
Asias He	9868ccbac0	test: Add test_node_ops_metrics.py It tests the node_ops_metrics_done metric reaches 100% when a node ops is done. Refs: #21174	2024-10-28 08:45:37 +08:00
Pavel Emelyanov	2f9f76fddf	sstables_loader: Mark to_replica_set() private It's not called from outside Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#21210	2024-10-27 22:28:54 +02:00
Anna Stuchlik	ef4bcf8b3f	doc: remove the Cassandra references from nodetool This PR removes the reference to Cassandra from the nodetool index, as the native nodetool is no longer a fork. In addition, it removes the Apache copyright. Fixes https://github.com/scylladb/scylladb/issues/21238 Closes scylladb/scylladb#21240	2024-10-27 22:26:33 +02:00
Kefu Chai	e65185ba6f	build: merge scylla's and seastar's compilation database Since commit `415c83fa`, Seastar is built as an external project. As a result, the compile_commands.json file generated by ScyllaDB's CMake build system no longer contains compilation rules for Seastar's object files. This limitation prevents tools from performing static analysis using the complete dependency tree of translation units. This change merges Seastar's compilation database with ScyllaDB's and places the combined database in the source root directory, maintaining backward compatibility. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21234	2024-10-27 22:01:29 +02:00
Tomasz Grabiec	850d9cfb59	node-exporter: Disable hwmon collector This collector reads nvme temperature sensor, which was observed to cause bad performance on Azure cloud for ~6 seconds following the reading of the sensor. During the event, we can see elevated system time (up to 30%) and softirq time. CPU utilization is high, with nvm_queue_rq taking several orders of magnitude more time than normally. There are signs of contention: we can see __pv_queued_spin_lock_slowpath in the perf profile. This manifests as latency spikes and potentially also throughput drop due to reduced CPU capacity. By default, the monitoring stack queries it once every 60s. Closes scylladb/scylladb#21165	2024-10-27 21:59:15 +02:00
Kefu Chai	f5b29331a2	build: populate --enable-dist --disable-dist to CMake before this change, the "dist" targets are always enabled in the CMake-based building system. but the build rules generated by `configure.py` do respect the `--enable-dist` and `--disable-dist` command line options, and enable/disable the dist targets respectively. in this change, we - add a CMake option named "Scylla_DIST". the "dist" subdirectory is included in CMake only if this option is ON. - populate the `--enable-dist` and `--disable-dist` option down to cmake by setting the `Scylla_DIST` option, when creating the build system using CMake. this brings the CMake-based build system functionality-wise closer to the legacy building system. Refs scylladb/scylladb#2717 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21253	2024-10-27 21:57:46 +02:00
Kefu Chai	24d14b601b	treewide: s/boost::adaptors::map_values/std::views::values/ now that we are allowed to use C++23. we now have the luxury of using `std::views::values`. in this change, we: - replace `boost::adaptors::map_values` with `std::views::values` - update affected code to work with `std::views::values` - the places where we use `boost::join()` are not changed, because we cannot use `std::views::concat` yet. this helper is only available in C++26. to reduce the dependency to boost for better maintainability, and leverage standard library features for better long-term support. this change is part of our ongoing effort to modernize our codebase and reduce external dependencies where possible. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21265	2024-10-27 21:32:45 +02:00
Avi Kivity	3124711fc4	Merge 'Report rows_merged in compaction_history rest api and nodetool' from Łukasz Paszkowski Currently, running the `nodetool compactionhistory` command or using the rest api `curl -X GET --header "Accept: application/json" "http://localhost:10000/compaction_manager/compaction_history"` returns compaction history without the `rows_merged` field. The series computes rows merged during compaction and provides this information to users via both the nodetool command and the rest api. The `rows_merged` field contains information on merged clustering keys across multiple sstable files. For instance, compacting two sstables of a table consisting of 7 rows where two rows are part of both sstables, the output would have the following format: {1: 5, 2: 2}. No backport is required. It extends the existing compaction history output. Fixes https://github.com/scylladb/scylladb/issues/666 Closes scylladb/scylladb#20481 * github.com:scylladb/scylladb: test/rest_api: Add tests for compactionhistory nodetool: Add rows merged stats into compactionhistory output compaction: Update compaction history with collected histogram compaction: Remove const qualifier from methods creating sstable readers sstable_set: Add optional statistics to make_local_shard_sstable_reader make_combined_reader: Add optional parameter, combined_reader_statistics reader_selector: Extend with maximum reader count mutation_fragment_merger: Create histogram while consuming mutation fragment batches	2024-10-27 21:26:11 +02:00
Kefu Chai	158008dd2c	mutation_writer: simplify classification using with_deserialized() return value Since `with_deserialized()` returns the lambda function's result, we can directly return the bucket from within the lambda instead of relying on side effects. This makes the code more explicit and functional. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21273	2024-10-27 21:20:55 +02:00
Nadav Har'El	6fdd0ebd3b	RBAC: confirm that unprivileged users can't read the roles table A worry was raised that an unprivileged user might be able to read the system.roles table - which contains the Alternator secret keys (and also CQL's hashed passwords). This patch adds tests that show that this worry is unjustified - and acts as a regression test to ensure it never becomes justified. The tests show that an unprivileged user cannot read the system.roles table using either CQL or Alternator APIs. More specifically, the two tests in this patch demonstrate that: * The Alternator API does not allow an unprivileged user to read ANY system table, unless explicitly granted permissions for that table. * The CQL API whitelists (see service::client_state::has_access) specific system tables - e.g., system_schema.tables - that are made readable to any unprivileged user. But the system.auth table is NOT whitelisted in this way - and is unreadable to unprivileged users unless explicitly granted permissions on that table. The new tests pass on both Scylla and Cassandra. Refs #5206 (that issue is about removing the Alternator secret keys from the roles table - but stealing CQL salted hashes is still pretty bad, so it's good to know that unprivileged users can't read them). Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#21215	2024-10-27 21:09:38 +02:00
Nadav Har'El	1634a64ffd	cql-pytest: test a few small materialized views CQL issues While documenting materialized view in a new document (Refs #16569) I encountered a few questions on how various CQL operations work on a table that has views, and this patch contains tests that clarify their answer - and can later guarantee that the answer doesn't unintentionally change in the future. The questions that these tests answer are: 1. That TRUNCATE on a base table also TRUNCATEs its views. This is just a basic test, with no attempt to reproduce issue #17635 (which is about the truncation of the base and views not being atomic). 2. That DROP TABLE is not allowed on a base table that has views. 3. That DROP KEYSPACE is allowed, even if there are tables with views. 4. Test that ALTER TABLE tbl DROP is never allowed in Cassandra, but allowed in some cases by Scylla 5. Test that ALTER TABLE tbl ADD is allowed, and "SELECT *" expands to select the new column into the materialized view as well. All the new tests pass on both Scylla and Cassandra. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#21142	2024-10-27 21:08:28 +02:00
Botond Dénes	7c75fc599f	streaming: stream-session: switch to tracking permit The stream-session is the receiving end of streaming, it reads the mutation fragment stream from an RPC stream and writes it onto the disk. As such, this part does no disk IO and therefore, using a permit with count resources is superfluous. Furthermore, after `d98708013c`, the count resources on this permit can cause a deadlock on the receiver end, via the `db::view::check_view_update_path()`, which wants to read the content of a system table and therefore has to obtain a permit of its own. Switch to a tracking-only permit, primarily to resolve the deadlock, but also because admission is not necessary for a read which does no IO. Refs: scylladb/scylladb#20885 (partial fix, solves only one of the deadlocks) Fixes: scylladb/scylladb#21264 Closes scylladb/scylladb#21059	2024-10-27 20:01:25 +02:00
Avi Kivity	7ffbfe8bb3	Merge 'Squash some sstables::test helpers' from Pavel Emelyanov There's a `missing_summary_first_last_sane` test case that uses some very specific way of modifying an sstable -- it loads one from resources, then tries to "write" the loaded stuff elsewhere. For that it uses a special purpose test::store() helper and a bunch of auxiliary ones from the same class. Those aux helpers are not used anywhere else and are also very special for this test case, so it makes sense to keep this whole functionality in a single helper. Closes scylladb/scylladb#21255 * github.com:scylladb/scylladb: test: Squash test::change_generation_number() into test::store() test: Squash test::change_dir() into test::store() test: Coroutinize sstables::test::store()	2024-10-27 19:59:59 +02:00
Anna Stuchlik	aa0dadea48	doc: extend the ToC for CDC This commit adds the missing links to the CDC index page. Fixes https://github.com/scylladb/scylladb/issues/21137 Closes scylladb/scylladb#21286	2024-10-27 19:57:59 +02:00
Anna Stuchlik	b2b9622e32	doc: fix redundant references to version 6.2 This commit removes mentions of version 6.2 that were introduced with https://github.com/scylladb/scylladb/pull/17969. Now that the documentation is versioned, there should be no reference to specific versions. Fixes https://github.com/scylladb/scylladb/issues/21276 Closes scylladb/scylladb#21277	2024-10-27 14:47:40 +02:00
Paweł Zakrzewski	b077685fec	test/cql-pytest: GROUP BY with static columns This commit adds a new test case 'test_group_by_static_column_and_tombstones' to verify the behavior of GROUP BY queries with static columns. The test is adapted from Cassandra's test suite and aims to reproduce issue #21267. Original, larger test: cassandra_tests/validation/operations/select_group_by_test.py::testGroupByWithPaging() Closes scylladb/scylladb#21270	2024-10-27 14:45:53 +02:00
Aleksandra Martyniuk	910a6fc032	tasks: fix sequence number assignment Currently, children of virtual tasks do not have sequence number assigned. Fix it.	2024-10-25 15:30:13 +02:00

45200 Commits