scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-29 20:57:00 +00:00

Author	SHA1	Message	Date
Gleb Natapov	acbc667d3e	storage_service: set raft topology change mode before using it in join_cluster ss::join_cluster calls raft_topology_change_enabled() before the mode is initialized below in the same function. Fix it by changing the order.	2025-01-02 18:44:19 +02:00
Gleb Natapov	491b7232de	locator: drop inet_address usage to figure out per dc/rack replication This makes it possible to calculate the replication map correctly even without knowing the IPs of the nodes.	2025-01-02 18:44:19 +02:00
Gleb Natapov	c4b26ba8dc	test: drop test_old_ip_notification_repro.py The test no longer tests anything since the address map is updated much earlier now by the gossiper itself, not by the notifiers. The functionality is tested by a unit test now.	2025-01-01 12:43:11 +02:00
Gleb Natapov	c4db90799a	test: address_map: check generation handling during entry addition Check that adding an entry with a smaller generation does not overwrite an existing entry.	2025-01-01 12:43:11 +02:00
Benny Halevy	85bd799308	storage_service: replicate_to_all_cores: prevent stalls when preparing per-table erms Although the `network_topology_strategy::make_replication_map` -> `tablet_aware_replication_strategy::do_make_replication_map` path is not CPU intensive, it still allocates and constructs a shared `tablet_effective_replication_map`, and that might stall with thousands of tablet-based tables. Therefore coroutinize the preparation loop to allow yielding. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-12-31 14:52:39 +01:00
Gleb Natapov	745b6d7d0d	gossiper: ignore gossiper entries with local host id in gossiper mode as well We already ignore gossiper entries with a host id equal to the local host id in raft mode, since those are just outdated entries from before an IP change. The same logic applies to gossiper mode as well though, so do the same in both modes. Fixes: scylladb/scylladb#21930 Message-ID: <Z20kBZvpJ1fP9WyJ@scylladb.com>	2024-12-31 15:50:12 +02:00
Avi Kivity	76cf5148e1	Merge 'message: introduce advanced rpc compression' from Michał Chojnowski This is a forward port (from scylla-enterprise) of additional compression options (zstd, dictionaries shared across messages) for inter-node network traffic. It works as follows: After the patch, messaging_service (Scylla's interface for all inter-node communication) compresses its network traffic with compressors managed by the new advanced_rpc_compression::tracker. Those compressors compress with lz4, but can also be configured to use zstd as long as a CPU usage limit isn't crossed. A precomputed compression dictionary can be fed to the tracker. Each connection handled by the tracker will then start a negotiation with the other end to switch to this dictionary, and when it succeeds, the connection will start being compressed using that dictionary. All traffic going through the tracker is passed as a single merged "stream" through dict_sampler. dictionary_service has access to the dict_sampler. On chosen nodes (in the "usual" configuration: the Raft leader), it uses the sampler to maintain a random multi-megabyte sample of the sampler's stream. Every several minutes, it copies the sample, trains a compression dictionary on it (by calling zstd's training library via the alien_worker thread) and publishes the new dictionary to system.dicts via Raft's write_mutation command. This update triggers (eventually) a callback on all nodes, which feeds the new dictionary to advanced_rpc_compression::tracker, and this switches (eventually) all inter-node connections to this dictionary. 
Closes scylladb/scylladb#22032 * github.com:scylladb/scylladb: messaging_service: use advanced_rpc_compression::tracker for compression message/dictionary_service: introduce dictionary_service service: make Raft group 0 aware of system.dicts db/system_keyspace: add system.dicts utils: add advanced_rpc_compressor utils: add dict_trainer utils: introduce reservoir_sampling utils: introduce alien_worker utils: add stream_compressor	2024-12-31 15:02:57 +02:00
Evgeniy Naydanov	4260f3f55a	test.py: topology_random_failures: log randomization parameters in test Logging randomization parameters in the pytest_generate_tests hook doesn't play well for us. To make these parameters more visible move the logging to the test level. Closes scylladb/scylladb#22055	2024-12-31 14:23:47 +02:00
Avi Kivity	2b48c2e72a	Merge 'build: add support for LTO and PGO to the building system' from Kefu Chai This changeset ports LTO and PGO support from scylla-enterprise.git to scylladb.git. Add support for Link-Time Optimization (LTO) and Profile-Guided Optimization (PGO) to improve performance. LTO provides ~7% performance gain and enables crucial binary layout optimizations for PGO. LTO Changes: - Add `-flto` flag to compile and link steps - Use `-ffat-lto-objects` to generate both LLVM IR and machine code - Enable cross-object optimization while maintaining fast test linking PGO Implementation: - Implement three-stage build process: 1. Context-free profiling (`-fprofile-generate`) 2. Context-sensitive profiling (`-fprofile-use` + `-fcs-profile-generate`) 3. Final optimization using merged profiles - Add release-pgo and release-cs-pgo build stages - Integrate with ninja build system - Stages can be enabled independently Profile Management: - Add `pgo/pgo.py` for workload profile collection - Store default profile in `pgo/profiles/profile.profdata.xz` using Git LFS - Add configure.py integration for profile detection and validation - Support custom profiles via `--use-profile` flag - Add profile regeneration script Both optimizations are recommended for maximum performance, though each PGO stage adds a full build cycle. Future optimization may allow dropping one PGO stage if performance impact is minimal. --- this is a forward port, hence no need to backport. 
Closes scylladb/scylladb#22039 * github.com:scylladb/scylladb: build: cmake: add CMake options for PGO support build: cmake: add "Scylla_ENABLE_LTO" option build: set LTO and PGO flags for Seastar in cmake build build: collect scylla libraries with `scylla_libs` variable build: Unify Abseil CXX flags configuration configure.py: prepare the build for a default PGO profile in version control configure.py: introduce profile-guided optimization pgo: add alternator workloads training pgo: add a repair workload pgo: add a counters workload pgo: add a secondary index workload pgo: add a LWT workload pgo: add a decommission workload pgo: add a clustering workload pgo: add a basic workload pgo: introduce a PGO training script configure.py: don't include non-default modes in dist-server-* rules configure.py: enable LTO in release builds by default configure.py: introduce link-time optimization configure.py: add a `default` to `add_tristate`. configure.py: unify build rules for cxxbridge .cc files and regular .cc files	2024-12-31 14:14:40 +02:00
Avi Kivity	4905b1bf76	Merge 'table: make update_effective_replication_map sync again' from Benny Halevy Commit `f2ff701489` introduced a yield in update_effective_replication_map that might cause the storage_group manager to be inconsistent with the new effective_replication_map (e.g. if yielding right before calling `handle_tablet_split_completion`). Also, yielding inside the storage_service::replicate_to_all_cores update loop means that base tables and their views aren't updated atomically, which caused scylladb/scylladb#17786. This change essentially reverts `f2ff701489` and makes handle_tablet_split_completion synchronous too. The stopped compaction groups future is kept as a member and storage_group_manager::stop() consumes this future during table::stop(). - storage_service: replicate_to_all_cores: update base and view tables atomically Currently, the loop updating all tables (including views) with the new effective_replication_map may yield, and therefore expose a state where the base and view tables' effective_replication_map and topology are out of sync (as seen in scylladb/scylladb#17786). To prevent that, loop over all base tables and for each table update the base table and all views atomically, without yielding, and so allow yielding only between base tables. * Regression was introduced in `f2ff701489`, so backport is required to 6.x, 2024.2 Closes scylladb/scylladb#21781 * github.com:scylladb/scylladb: storage_service: replicate_to_all_cores: clear_gently pending erms test_mv_topology_change: drop delay_after_erm_update injection case storage_service: replicate_to_all_cores: update base and view tables atomically table: make update_effective_replication_map sync again	2024-12-30 23:42:06 +02:00
Tomasz Grabiec	bf3d0b3543	reader_concurrency_semaphore: Optimize resource_units destruction by postponing wait list processing Observed 3% throughput improvement in sstable-heavy workload bounded by CPU. SStable parsing involves lots of buffer operations which obtain and destroy resource_units. Before the patch, resource_units destruction invoked maybe_admit_waiters(), which performs some computations on waiting permits. We don't really need to admit on each change of resources, since the CPU is used by other things anyway. We can batch the computation. There is already a fiber which does this for processing the _ready_list. We can reuse it for processing _wait_list as well. The changes violate an assumption made by tests that releasing resources immediately triggers an admission check. Therefore, some of the BOOST_REQUIRE_EQUAL checks need to be replaced with REQUIRE_EVENTUALLY_EQUAL, as the admission check is now done in the fiber processing the _ready_list. `perf-simple-query` --tablets --smp 1 -m 1G results obtained for fixed 400MHz frequency: Before: ``` enable-cache=1 Running test with config: {partitions=10000, concurrency=100, mode=read, frontend=cql, query_single_key=no, counters=no} Disabling auto compaction Creating 10000 partitions... 
112590.60 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 41353 insns/op, 17992 cycles/op, 0 errors) 122620.68 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 41310 insns/op, 17713 cycles/op, 0 errors) 118169.48 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 41353 insns/op, 17857 cycles/op, 0 errors) 120634.65 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 41328 insns/op, 17733 cycles/op, 0 errors) 117317.18 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 41347 insns/op, 17822 cycles/op, 0 errors) throughput: mean=118266.52 standard-deviation=3797.81 median=118169.48 median-absolute-deviation=2368.13 maximum=122620.68 minimum=112590.60 instructions_per_op: mean=41337.86 standard-deviation=18.73 median=41346.89 median-absolute-deviation=14.64 maximum=41352.53 minimum=41309.83 cpu_cycles_per_op: mean=17823.50 standard-deviation=111.75 median=17821.97 median-absolute-deviation=90.45 maximum=17992.04 minimum=17713.00 ``` After ``` enable-cache=1 Running test with config: {partitions=10000, concurrency=100, mode=read, frontend=cql, query_single_key=no, counters=no} Disabling auto compaction Creating 10000 partitions... 
123689.63 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 40997 insns/op, 17384 cycles/op, 0 errors) 129643.24 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 40997 insns/op, 17325 cycles/op, 0 errors) 128907.27 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 41009 insns/op, 17325 cycles/op, 0 errors) 130342.56 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 40993 insns/op, 17286 cycles/op, 0 errors) 130294.09 tps ( 63.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 40972 insns/op, 17336 cycles/op, 0 errors) throughput: mean=128575.36 standard-deviation=2792.75 median=129643.24 median-absolute-deviation=1718.73 maximum=130342.56 minimum=123689.63 instructions_per_op: mean=40993.51 standard-deviation=13.23 median=40996.73 median-absolute-deviation=3.30 maximum=41008.86 minimum=40972.48 cpu_cycles_per_op: mean=17331.16 standard-deviation=35.02 median=17324.84 median-absolute-deviation=6.49 maximum=17383.97 minimum=17286.33 ``` Closes scylladb/scylladb#21918 [avi: patch was co-authored by Łukasz Paszkowski <lukasz.paszkowski@scylladb.com>]	2024-12-30 23:37:46 +02:00
Avi Kivity	b32b7ab806	Merge 'test.py: only access combined_tests executable if it is built' from Konstantin Osipov test.py: only access combined_tests executable if it is built Fixes #22038 Closes scylladb/scylladb#22069 * github.com:scylladb/scylladb: test.py: only access combined_tests if it exists test.py: rethrow CancelledError when executing a test	2024-12-30 15:15:39 +02:00
Piotr Smaron	2352063f20	server: set `connection_stage` to READY when authenticated If authentication is enabled, but STARTUP isn't followed by REGISTER (which is optional, and in practice only happens on only one of a driver's connections — because there's no point listening for the same events on multiple connections), connections are wrongly displayed in the system.clients as AUTHENTICATING instead of READY, even when they are ready. This commit fixes this problem. Fixes: scylladb/scylladb#12640 Closes scylladb/scylladb#21774	2024-12-30 14:04:26 +02:00
Kefu Chai	6281fb825f	test/pytest.ini: ignore warning on deprecated record_property fixture `record_property` generates XML which is not compatible with xunit2, so pytest decided to deprecate it when generating xunit reports, and pytest generates the following warning when a test failure is reported using this fixture: ``` object_store/test_backup.py:337: PytestWarning: record_property is incompatible with junit_family 'xunit2' (use 'legacy' or 'xunit1') ``` this warning is not related to the test, but more about how we report a failure using pytest. it is distracting, so let's silence it. See also https://github.com/pytest-dev/pytest/issues/5202 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22067	2024-12-30 10:58:31 +02:00
Nadav Har'El	27180620af	Merge 'topology_random_failures: deselect more cases which can cause #21534 ' from Evgeniy Naydanov There are many CI failures (repros of https://github.com/scylladb/scylladb/issues/21534) which are caused by the `stop_after_setting_mode_to_normal_raft_topology` and `stop_before_becoming_raft_voter` error injections in combination with some cluster events. We need to deselect them for now to make CI more stable. The first batch was deselected in https://github.com/scylladb/scylladb/pull/21658 Also, add the handling of topology state rollback caused by `stop_before_streaming` or `stop_after_updating_cdc_generation` error injections as a separate commit. See also https://github.com/scylladb/scylladb/issues/21872 and https://github.com/scylladb/scylladb/issues/21957 Closes scylladb/scylladb#22044 * github.com:scylladb/scylladb: test.py: topology_random_failures: more deselects for #21534 test.py: topology_random_failures: handle more node's hangs during 30s sleep	2024-12-30 10:52:22 +02:00
Konstantin Osipov	8b7a5ca88d	test.py: only access combined_tests if it exists When the scylla source tree is only partially built, we still may want to run the tests. test.py builds a case cache at boot, and executes --list-cases for that, for all built tests. After amalgamating boost unit tests into a single file, it started running it unconditionally, which broke partial builds. Hence, only use combined_tests executable if it exists. Fixes #22038	2024-12-27 14:54:13 -05:00
Konstantin Osipov	2b1ba9c3fd	test.py: rethrow CancelledError when executing a test Commit `870f3b00fc`, "Add option to fail after number of failures" adds tracking of the number of cancelled tests. For that purpose, it intercepts CancelledError and sets the test's is_cancelled flag. This introduced a regression reported in gh-21636: Ctrl-C no longer works, since CancelledError is muted. There was no intent to mute the exception, so re-throw it after accounting the test as cancelled.	2024-12-27 14:40:47 -05:00
Michał Chojnowski	fdb2d2209c	messaging_service: use advanced_rpc_compression::tracker for compression This patch sets up an `alien_worker`, `advanced_rpc_compression::tracker`, `dict_sampler` and `dictionary_service` in `main()`, and wires them to each other and to `messaging_service`. `messaging_service` compresses its network traffic with compressors managed by the `advanced_rpc_compression::tracker`. All this traffic is passed as a single merged "stream" through `dict_sampler`. `dictionary_service` has access to `dict_sampler`. On chosen nodes (by default: the Raft leader), it uses the sampler to maintain a random multi-megabyte sample of the sampler's stream. Every several minutes, it copies the sample, trains a compression dictionary on it (by calling zstd's training library via the `alien_worker` thread) and publishes the new dictionary to `system.dicts` via Raft. This update triggers a callback into `advanced_rpc_compression::tracker` on all nodes, which updates the dictionary used by the compressors it manages.	2024-12-27 10:17:58 +01:00
Kefu Chai	6adf70ec03	build: cmake: add CMake options for PGO support - "Scylla_BUILD_INSTRUMENTED" option Scylla_BUILD_INSTRUMENTED allows us to instrument the code at different levels, namely IR and CSIR. this option mirrors the "--pgo" and "--cspgo" options in `configure.py`. please note, the instrumentation at the frontend is not supported, as the IR based instrumentation is better when it comes to the use case of optimization for performance. see https://lists.llvm.org/pipermail/llvm-dev/2015-August/089044.html for the rationales. - "Scylla_PROFDATA_FILE" option this option allows us to specify the profile data previously generated with the "Scylla_BUILD_INSTRUMENTED" option. this option mirrors the `--use-profile` option in `configure.py`, but it does not take the empty option as a special case and consider it as a file fetched from Git LFS. that will be handled by another option in a follow-up change. please note, one cannot use -DScylla_BUILD_INSTRUMENTED=PGO and -DScylla_PROFDATA_FILE=... at the same time. clang just does not allow this. but CSPGO is fine. - "Scylla_PROFDATA_COMPRESSED_FILE" option this option allows us to specify the compressed profile data previously generated with the "Scylla_BUILD_INSTRUMENTED" option. along with "Scylla_PROFDATA_FILE", this option mirrors the functionality of `--use-profile` in `configure.py`. the goal is to ensure the user always gets the result with the specified options. if anything goes wrong, we just error out. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-12-27 16:16:04 +08:00
Kefu Chai	4154789670	build: cmake: add "Scylla_ENABLE_LTO" option add an option named "Scylla_ENABLE_LTO", which is off by default. if it is on, build the whole tree with ThinLTO enabled. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-12-27 16:16:04 +08:00
Kefu Chai	2647369d46	build: set LTO and PGO flags for Seastar in cmake build This change extends scylla commit `7cb74df` to scylla-enterprise-commit 4ece7e1. we recently started building Seastar as an external project, so we need to prepare its compilation flags separately. in enterprise scylla, we prepare the LTO and PGO related cflags in `prepare_advanced_optimizations()`. this function is called when preparing the build rules directly from `configure.py`, and although we have equivalent settings in CMake, they cannot be applied to Seastar due to the reason above. in this change, we set up the LTO and PGO compilation flags when generating the building system for Seastar when building using CMake. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-12-27 16:16:04 +08:00
Kefu Chai	ffe8c5dcdb	build: collect scylla libraries with `scylla_libs` variable with which, we can set the properties of these targets in a single place. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-12-27 16:16:04 +08:00
Kefu Chai	610f1b7a0a	build: Unify Abseil CXX flags configuration - Set ABSL_GCC_FLAGS and ABSL_LLVM_FLAGS with a more generic absl_cxx_flags - Enables more flexible configuration of compiler flags for Abseil libraries - Provides a centralized approach to setting compilation flags Previously, sanitizer-specific flags were directly applied to Abseil library builds. This change allows for more extensible compiling flag management across different build configurations. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-12-27 16:16:04 +08:00
Michał Chojnowski	131b1d6f81	configure.py: prepare the build for a default PGO profile in version control This patch adds the following logic to the release build: pgo/profiles/profile.profdata.xz is the default profile file, compressed. This file is stored in version control using git LFS. A ninja rule is added which creates build/profile.profdata by decompressing it. If no profile file is explicitly specified, ./configure.py checks whether the compressed default profile file exists and is compressed. (If it exists, but isn't compressed, the user most likely has git lfs disabled or not installed. In this case, the file visible in the working tree will be the LFS placeholder text file describing the LFS metadata.) If the compressed file exists, build/profile.profdata is chosen as the used profile file. If it doesn't exist, a warning is printed and configure.py falls back to a profileless build. The default profile file can be explicitly disabled by passing the empty --use-profile="" to configure.py A script is added which re-generates the profile. After the script is run, the re-generated compressed profile can be staged, committed, pushed and merged to update the default profile.	2024-12-27 16:16:04 +08:00
Michał Chojnowski	a868b44ad8	configure.py: introduce profile-guided optimization This commit enables profile-guided optimizations (PGO) in the Scylla build. A full LLVM PGO requires 3 builds: 1. With -fprofile-generate to generate context-free (pre-inlining) profile. This profile influences inlining, indirect-call promotion and call graph simplifications. 2. With -fprofile-use=results_of_build_1 -fcs-profile-generate to generate context-sensitive (post-inlining) profile. This profile influences post-inline and codegen optimizations. 3. With -fprofile-use=merged_results_of_builds_1_2 to build the final binary with both profiles. We do all three in one ninja call by adding release-pgo and release-cs-pgo "stages" to release. They are a copy of regular release mode, just with the flags described above added. With the full course, release objects depend on the profile file produced by build/release-cs-pgo/scylla, while release-cs-pgo depends on the profile file generated by build/release-pgo/scylla. The stages are orthogonal and enabled with separate options. It's recommended to run them both for full performance, but unfortunately each one adds a full build of scylla to the compile time, so maybe we can drop one of them in the future if it turns out e.g. that regular PGO doesn't have a big effect. It's strongly recommended to combine PGO with LTO. The latter enables the entire class of binary layout optimizations, which for us is probably the most important part of the entire thing.	2024-12-27 16:16:04 +08:00
Marcin Maliszkiewicz	80989556ac	pgo: add alternator workloads training This patch adds a set of alternator workloads to the pgo training script. To confirm that the added workloads are indeed affecting the profile we can compare: ⤖ llvm-profdata show ./build/release-pgo/profiles/workdirs/clustering/prof.profdata Instrumentation level: IR entry_first = 0 Total functions: 105075 Maximum function count: 1079870885 Maximum internal block count: 2197851358 and ⤖ llvm-profdata show ./build/release-pgo/profiles/workdirs/alternator/prof.profdata Instrumentation level: IR entry_first = 0 Total functions: 105075 Maximum function count: 5240506052 Maximum internal block count: 9112894084 to see that function counters are on similar levels; they are around 5x higher for alternator, but that's because it combines 5 specific sub-workloads. To confirm that the final profile contains alternator functions we can inspect: ⤖ llvm-profdata show --counts --function=alternator --value-cutoff 100000 ./build/release-pgo/profiles/merged.profdata (...) Instrumentation level: IR entry_first = 0 Functions shown: 356 Total functions: 105075 Number of functions with maximum count (< 100000): 97275 Number of functions with maximum count (>= 100000): 7800 Maximum function count: 7248370728 Maximum internal block count: 13722347326 we can see that 356 functions whose symbol name contains the word alternator were identified as 'hot' (with max count greater than 100'000). Running: ⤖ llvm-profdata show --counts --function=alternator --value-cutoff 1 ./build/release-pgo/profiles/merged.profdata (...) Instrumentation level: IR entry_first = 0 Functions shown: 806 Total functions: 105075 Number of functions with maximum count (< 1): 67036 Number of functions with maximum count (>= 1): 38039 Maximum function count: 7248370728 Maximum internal block count: 13722347326 we can see that 806 alternator functions were executed at least once during training. 
And finally to confirm that alternator specific PGO brings any speedups we run: for workload in read scan write write_gsi write_rmw do ./build/release/scylla perf-alternator-workloads --smp 4 --cpuset "10,12,14,16" --workload $workload --duration 1 --remote-host 127.0.0.1 2> /dev/null | grep median done results BEFORE: median 258137.51910849303 median absolute deviation: 786.06 median 547.2578202937141 median absolute deviation: 6.33 median 145718.19856685458 median absolute deviation: 5689.79 median 89024.67095807113 median absolute deviation: 1302.56 median 43708.101729598646 median absolute deviation: 294.47 results AFTER: median 303968.55333940056 median absolute deviation: 1152.19 median 622.4757636209254 median absolute deviation: 8.42 median 198566.0403745328 median absolute deviation: 1689.96 median 91696.44912842038 median absolute deviation: 1891.84 median 51445.356525664996 median absolute deviation: 1780.15 We can see that single node cluster tps increase is typically 13% - 17% with notable exceptions, improvement for write_gsi is 3% and for write workload whopping 36%. The increase is on top of CQL PGO. Write workload is executed more often because it's involved also as data preparation for read and scan. Some further improvement could be to separate preparation from training as it's done for CQL but it would be a bit odd if ~3x higher counters for one flow have so big impact. Additional disclaimers: - tests are performing exactly the same workloads as in training so there might be some bias - tests are running single node cluster, more realistic setup will likely show lower improvement Fixes https://github.com/scylladb/scylla-enterprise/issues/4066	2024-12-27 16:16:04 +08:00
Michał Chojnowski	95c8d88b96	pgo: add a repair workload This workload is added to teach PGO about repair. Tests are inconclusive about its alignment with existing workloads, because repair doesn't seem to utilize 100% of the reactor.	2024-12-27 16:16:04 +08:00
Michał Chojnowski	1c9ce0a9ee	pgo: add a counters workload This workload is added to teach PGO about counters. Tests seem to show it's mostly aligned with existing CQL workloads. The config YAML is based on the default cassandra-stress schema.	2024-12-27 16:16:04 +08:00
Michał Chojnowski	47dc0399cb	pgo: add a secondary index workload This workload is added to teach PGO about secondary indexes. Tests seem to show that it's mostly aligned with existing CQL workloads. The config YAML was copied from one of scylla-cluster-tests test cases.	2024-12-27 16:16:04 +08:00
Michał Chojnowski	e67f4a5c51	pgo: add a LWT workload This workload is added to teach PGO about LWT codepaths. Tests seem to show that it's mostly aligned with existing CQL workloads. The config YAML was copied from one of scylla-cluster-tests test cases.	2024-12-27 16:16:04 +08:00
Michał Chojnowski	e217c124a6	pgo: add a decommission workload This workload is added to teach PGO about streaming. Tests show that this workload is mostly orthogonal to CQL workloads (where "orthogonal" means that training on workload A doesn't improve workload B much, while training on workload B doesn't improve workload A much), so adding it to the training is quite important.	2024-12-27 16:16:04 +08:00
Michał Chojnowski	65abecaede	pgo: add a clustering workload In contrast to the basic workload, this workload uses clustering keys, CK range queries, RF=1, logged batches, and more CQL types. Tests seem to show that this workload is mostly aligned with the existing basic workload (where "aligned" means that training on workload A improves workload B about as much as training on workload B). The config YAML is based on the example YAML attached to cassandra-stress sources.	2024-12-27 16:16:04 +08:00
Michał Chojnowski	c1297dbcd2	pgo: add a basic workload This commit adds the default cassandra-stress workload to the PGO training suite.	2024-12-27 16:16:04 +08:00
Michał Chojnowski	f73b122de3	pgo: introduce a PGO training script Profile-guided optimization consists of the following steps: 1. Build the program as usual, but with special options (instrumentation or just some supplementary info tables, depending on the exact flavor of PGO in use). 2. Collect an execution profile from the special binary by running a training workload on it. 3. Rebuild the program again, using the collected profile. This commit introduces a script automating step 2: running PGO training workloads on Scylla. The contents of training workloads will be added in future commits. The changes in configure.py responsible for steps 1. and 3. will also appear in future commits. As input, the script takes a path to the instrumented binary, a path to the output file, and a directory with (optionally) prepopulated datasets for use in training. The output profile file can be then passed to the compiler to perform a PGO build. The script currently supports two kinds of PGO instrumentation: LLVM instrumentation (binary instrumented with -fprofile-generate and -fcs-profile-generate passed to clang during compilation) and BOLT instrumentation (binary instrumented with `llvm-bolt -instrument`, with logs from this operation saved to $binary_path.boltlog) The actual training workloads for generating the profile will be added in later commits.	2024-12-27 16:16:04 +08:00
Michał Chojnowski	6f01ceae3d	configure.py: don't include non-default modes in dist-server-* rules dist-server-tar only includes default modes. Let dist-server-deb and dist-server-rpm behave consistently with it.	2024-12-27 16:16:04 +08:00
Michał Chojnowski	dd1a847d61	configure.py: enable LTO in release builds by default	2024-12-27 16:16:04 +08:00
Michał Chojnowski	4b03b91fbd	configure.py: introduce link-time optimization This patch introduces link-time optimization (LTO) to the build. The performance gains from LTO alone are modest (~7%), but it's a vital ingredient of effective profile-guided optimization, which will be introduced later. In general, use of LTO is quite simple and transparent to build systems. It is sufficient to add the -flto flag to compile and link steps, and use an LTO-aware linker. At compile time, -ffat-lto-objects will cause the compiler to emit .o files containing both LTO-ready LLVM IR for main executable optimization and machine code for fast test linking. At link time, those pieces of IR will be compiled together, allowing cross-object optimization of the main executable and the fast linking of test executables. Due to its high compile time cost, the optimization can be toggled with a configure.py option. As of this patch, it's disabled by default.	2024-12-27 16:16:04 +08:00
Michał Chojnowski	192cb6de4b	configure.py: add a `default` to `add_tristate`. It will be used in the next patch.	2024-12-27 16:16:04 +08:00
Michał Chojnowski	1224200d7a	configure.py: unify build rules for cxxbridge .cc files and regular .cc files This is going to prevent some code duplication in following patches.	2024-12-27 16:16:04 +08:00
Benny Halevy	3e22998dc1	sstables: parse(summary): reserve positions vector We know the number of positions in advance so reserve the chunked_vector capacity for that. Note: reservation replaces the existing reset of the positions member. This is safe since we parse the summary only once as sstable::read_summary() returns early if the summary component is already populated. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#21767	2024-12-26 13:33:29 +02:00
Yaron Kaikov	bc487c9456	.github: cherry-pick each commit instead of merge commit when available Until today, when we had a PR with multiple commits, we cherry-picked only the merge commit, which created a PR with only one commit (the merge commit) containing all relevant changes. This caused an issue when there was a need to backport only some of the commits, as in https://github.com/scylladb/scylladb/pull/21990 (reported by @gleb-cloudius). Change the logic to cherry-pick each commit. Closes scylladb/scylladb#22027	2024-12-26 13:10:18 +02:00
Kefu Chai	6acc5294a4	treewide: migrate from boost::copy_range to std::ranges::to now that we are allowed to use C++23. we now have the luxury of using `std::ranges::to`. in this change, we: - replace `boost::copy_range` to `std::ranges::to` - remove unused `#include` of boost headers Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21880	2024-12-26 11:46:26 +02:00
Kefu Chai	6c031ad92f	test/topology: Percent-encode URL in pytest artifact links When embedding HTML documents in pytest reports with links to test artifacts, parameterized test names containing special characters like "[" and "]" can cause URL encoding issues. These characters, when used verbatim in URLs, can trigger HTTP 400 errors on web servers. This commit resolves the issue by percent-encoding the URLs for artifact links, ensuring compatibility with servers like Jenkins and preventing "HTTP ERROR 400 Illegal Path Character" errors. Changes: - Percent-encode test artifact URLs to handle special characters - Improve link robustness for parameterized test names Fixes scylladb/scylla-pkg#4599 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21963	2024-12-26 10:23:52 +02:00
Konstantin Osipov	d87e1eb7ef	test: merge topology_experimental_raft into topology_custom This enables tablets in topology_custom, so explicitly disable them where tests don't support tablets. In scope of this rename, patch a few imports. Importing dependencies from another test is a bad idea - please use shared libraries instead. Fixed #20193 Closes scylladb/scylladb#22014	2024-12-26 00:33:08 +02:00
Yaron Kaikov	0fc7e786dd	.github/scripts/auto-backport.py: fix wrong username param In `2e6755ecca` I have added a comment when PR has conflicts so the assignee can get a notification about it. There was a problem with the user mention param (a missing `.login`) Fixing it Closes scylladb/scylladb#22036	2024-12-25 20:41:34 +02:00
Avi Kivity	465449e4a1	test: combined_test: relicense Was inadvertently released under the AGPL.	2024-12-25 13:53:54 +02:00
Avi Kivity	3ffe93b6ae	Merge 'Enhance load-and-stream with "scope"' from Pavel Emelyanov The main purpose of this change is to enhance the restore from object storage usage. Currently, restore uses the load-and-stream facility. When triggered, the restoring task opens the provided list of sstables directory from the remote bucket and then feeds the list of sstables to load_and_stream() method. The method, in turn, iterates over this list, reads mutations and for each mutation decides where to send one by checking the replication map (it's pretty much the same for both vnodes and tablets, but for tablets that are "fully contained" by a range there's the plan to stream faster). As described above, restore is governed by a single node and this single node reads all sstables from the object store, which can be very slow. This PR allows speeding things up. For that, the load-and-stream code is equipped with the "scope" filter which limits where mutations can be streamed to. There are four options for that -- all, dc, rack and node. The "all" is how things work currently, "dc" and "rack" filter out target nodes that don't belong to this node's dc/rack respectively. The "node" scope only streams mutations to local node. With the "node" scope it's possible to make all nodes in the cluster load mutations that belong to them in parallel, without re-sending them to peers. The last patch in this PR is the test that shows how it can be possible. Closes scylladb/scylladb#21169 * github.com:scylladb/scylladb: test: Add scope-streaming test (for restore from backup) api: New "scope" API param to load-and-stream calls sstables_loader: Propagate scope from API down sstables_loader: Filter tablets based on scope streamer: Disable scoped streaming of primary replica only sstables_loader: Introduce streaming scope sstables_loader: Wrap get_endpoints()	2024-12-25 13:52:51 +02:00
Nadav Har'El	23213e8696	Merge 'Make get_built_indexes REST API endpoint be consistent with system."IndexInfo" table' from Pavel Emelyanov It turned out that aforementioned APIs use slightly different sources of information about view build progress/status which sometimes results in different reporting of whether an index is built. It's good to make those two APIs consistent. Also add a test for the REST API endpoint (system table test was addressed by #21677). Closes scylladb/scylladb#21814 * github.com:scylladb/scylladb: test: Add tests for MVs and indexes reporting by API endpoint(s) api: Use built_views table in get_built_indexes API	2024-12-25 11:47:03 +02:00
Evgeniy Naydanov	5992e8b031	test.py: topology_random_failures: more deselects for #21534 More cases found which can cause the same 'local_is_initialized()' assertion during the node's bootstrap.	2024-12-25 06:38:13 +00:00
Evgeniy Naydanov	f337ecbafa	test.py: topology_random_failures: handle more node's hangs during 30s sleep The node is hanging and the coordinator just rolls back the topology state. It's different from `stop_after_sending_join_node_request` and `stop_after_bootstrapping_initial_raft_configuration` because in these cases the coordinator is just not able to start the topology change at all and the message in the coordinator's log is different. Error injections handled: - `stop_after_updating_cdc_generation` - `stop_before_streaming` And, actually, it can be any cluster event which lasts more than 30s.	2024-12-25 06:38:13 +00:00
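
Commit 6c031ad92f above describes percent-encoding artifact links so that parameterized test names containing "[" and "]" do not trigger HTTP 400 errors. A minimal sketch of that idea follows. It is not the code from the commit: the helper name `artifact_link`, the base URL, and the test name are hypothetical, and only the standard-library `urllib.parse.quote` call is real.

```python
from urllib.parse import quote


def artifact_link(base_url: str, test_name: str) -> str:
    """Build a link to a test artifact (hypothetical helper, for illustration)."""
    # safe="" makes quote() also encode "/", so the test name
    # always stays within a single path segment.
    encoded = quote(test_name, safe="")
    return f"{base_url.rstrip('/')}/{encoded}"


# Hypothetical base URL and parameterized test name.
link = artifact_link("https://jenkins.example/artifacts", "test_restart[tablets-3]")
print(link)
```

Here "[" and "]" become `%5B` and `%5D`, while letters, digits, "-" and "_" pass through unchanged.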


45994 Commits
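
The "scope" filter from merge 3ffe93b6ae above (all, dc, rack, node) can be illustrated with a small sketch. This is an assumption-laden model, not the sstables_loader implementation: the `Node` type, the `filter_targets` function, and the choice to compare the dc as well as the rack for the "rack" scope are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    ALL = "all"
    DC = "dc"
    RACK = "rack"
    NODE = "node"


@dataclass(frozen=True)
class Node:
    # Hypothetical node descriptor, for illustration only.
    host_id: str
    dc: str
    rack: str


def filter_targets(local: Node, replicas: list[Node], scope: Scope) -> list[Node]:
    """Keep only the replicas that the given scope allows streaming to."""
    if scope is Scope.ALL:
        return list(replicas)
    if scope is Scope.DC:
        return [r for r in replicas if r.dc == local.dc]
    if scope is Scope.RACK:
        # Assumption: rack names may repeat across dcs, so compare both.
        return [r for r in replicas if r.dc == local.dc and r.rack == local.rack]
    # Scope.NODE: stream only to the local node.
    return [r for r in replicas if r.host_id == local.host_id]


local = Node("n1", "dc1", "r1")
replicas = [local, Node("n2", "dc1", "r2"), Node("n3", "dc2", "r1")]
node_scope_targets = filter_targets(local, replicas, Scope.NODE)
```

With the "node" scope every node keeps only the mutations it owns, which matches the merge description's point that all nodes can then load their own data in parallel without re-sending it to peers.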