scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-25 19:10:42 +00:00

Author	SHA1	Message	Date
Pavel Emelyanov	d4f3a3ee4f	cql: Remove unused "initial_tablets" mention from guardrails All tablets configuration was moved into its own "with tablets" section, this option name cannot be met among replication factors. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#23555	2025-04-06 16:52:07 +03:00
Nadav Har'El	431de48df9	test/alternator: test for item with many attributes A user complained that he couldn't read or write an item with more than 16 attributes (!) in Alternator. This isn't true, but I realized that we don't have a simple test for this case - all test use just a few attributes. So let's add such a test, doing PutItem, UpdateItem and GetItem with 400 attributes. Unsurprisingly, the test passes. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#23568	2025-04-03 22:35:49 +03:00
Nadav Har'El	a9a6f9eecc	test/alternator: increase timeout in Alternator RBAC test On our testing infrastructure, tests often run a hundred times (!) slower than usual, for various reasons that we can't always avoid. This is why all our test frameworks drastically increase the default timeouts. We forgot to increase the timeout in one place - where Alternator tests use CQL. This is needed for the Alternator role-based access control (RBAC) tests, which is configured via CQL and therefore the Alternator test unusually uses CQL. So in this patch we increase the timeout of CQL driver used by Alternator tests to the same high timeouts (60-120 seconds) used by the regular CQL tests. As the famous saying goes, these timeouts should be enough for anyone. Fixes #23569. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#23578	2025-04-03 22:31:08 +03:00
Benny Halevy	cdf9fe9e50	Update seastar submodule * seastar 2f13c461...ed8952fb (24): > file: explain dsync check in flush method > gate: add named_gate > tests: unit: add gate_test > reactor: Remove global task_quota extern declaration > future: Move report_failed_future to internal namespace > update boost cooking URL > smp: prefault: clear memory map after threads join > change format to sesatar::format > Prevent move / copy constructor / assignment on backtrace_buffer > Remove unnecesary flush calls from backtrace_buffer usage points > Make backtrace_buffer flush on destruction > Add `backtrace_buffer&` param to maybe_report_kernel_trace function > Prevent empty kernel callstack messages > Make cpu_stall_detector_linux_perf_event::maybe_report_kernel_trace function protected. > iotune: Add cli flag to force io depth > smp: prefault: decouple _stop_request from join_threads > reactor: more info, robustness on segfault > net/udp: fix ipv4_udp::next_port calculation > map_reduce: prevent mapper or reducer exception from poisoning state > build: Re-enable ASan's verify_asan_link_order check > tests: enable/disable internet-dependent tests at runtime > test: tls_test: rename test_simple_x509_client variants to avoid naming conflicts > tests: extend test.py to accept arbitrary ctest parameters from positional args > tests: add a handle for building tests in "offline" mode Closes scylladb/scylladb#23566	2025-04-03 19:45:37 +03:00
Botond Dénes	1198213000	Merge 'tablets: Make tablet allocation equalize per-shard load ' from Tomasz Grabiec Before, it was equalizing per-node load (tablet count), which is wrong in heterogeneous clusters. Nodes with fewer shards will end up with overloaded shards. Refs #23378 Closes scylladb/scylladb#23478 * github.com:scylladb/scylladb: tablets: Make tablet allocation equalize per-shard load tablets: load_balancer: Fix reporting of total load per node	2025-04-03 16:32:53 +03:00
Botond Dénes	fcdae20fd1	Merge 'Add tablet enforcing option' from Benny Halevy This series add a new config option: `tablets_mode_for_new_keyspaces` that replaces the existing `enable_tablets` option. It can be set to the following values: disabled: New keyspaces use vnodes by default, unless enabled by the tablets={'enabled':true} option enabled: New keyspaces use tablets by default, unless disabled by the tablets={'disabled':true} option enforced: New keyspaces must use tablets. Tablets cannot be disabled using the CREATE KEYSPACE option `tablets_mode_for_new_keyspaces=disabled` or `tablets_mode_for_new_keyspaces=enabled` control whether tablets are disabled or enabled by default for new keyspaces, respectively. In either cases, tablets can be opted-in or out using the `tablets={'enabled':...}` keyspace option, when the keyspace is created. `tablets_mode_for_new_keyspaces=enforced` enables tablets by default for new keyspaces, like `tablets_mode_for_new_keyspaces=enabled`. However, it does not allow to opt-out when creating new keyspaces by setting `tablets = {'enabled': false}` Refs scylladb/scylla-enterprise#4355 * Requires backport to 2025.1 Closes scylladb/scylladb#22273 * github.com:scylladb/scylladb: boost/tablets_test: verify failure to create keyspace with tablets and non network replication strategy tablets: enforce tablets using tablets_mode_for_new_keyspaces=enforced config option db/config: add tablets_mode_for_new_keyspaces option	2025-04-03 16:32:19 +03:00
Kefu Chai	3760a1c85e	cql3: Remove unnecessary 'virtual' specifiers from final class methods Remove 'virtual' specifiers from member functions in final classes where they can never be overridden. This addresses Clang errors like: ``` /home/kefu/dev/scylladb/cql3/column_identifier.hh:85:21: error: virtual method 'to_string' is inside a 'final' class and can never be overridden [-Werror,-Wunnecessary-virtual-specifier] 85 | virtual sstring to_string() const; | ^ 1 error generated. ``` This change improves code clarity and maintainability by eliminating redundant modifiers that could cause confusion. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#23570	2025-04-03 13:51:42 +03:00
Tomasz Grabiec	fe8187e594	Merge 'repair: release erm in repair_writer_impl::create_writer when possible' from Aleksandra Martyniuk Currently, repair_writer_impl::create_writer keeps erm to ensure that a sharder is valid. If we repair a tablet, erm blocks the state machine and no operation on any tablet of this table might be performed. Use auto_refreshing_sharder and topology_guard to ensure that the operation is safe and that tablet operations on the whole table aren't blocked. Fixes: #23453. Needs backport to 2025.1 that introduces the tablet repair scheduler. Closes scylladb/scylladb#23455 * github.com:scylladb/scylladb: test: add test to check concurrent migration and repair of two different tablets repair: release erm in repair_writer_impl::create_writer when possible	2025-04-03 11:15:08 +02:00
Botond Dénes	7bbfa5293f	test/cluster/test_read_repair.py: increase read request timeout This test enables trace-level logging for the mutation_data logger, which seems to be too much in debug mode and the test read times out. Increase timeout to 1minute to avoid this. Fixes: #23513 Closes scylladb/scylladb#23558	2025-04-03 10:42:11 +03:00
Botond Dénes	07510c07a0	readers/mutation_readers: queue_reader_handle_v2::push_end_of_stream() raise _ex if set Instead of raising std::runtime_error("Dangling queue_reader_handle_v2") unconditionally. push() already raises _ex if set, best to be consistent. Unconditionally raising std::runtime_error can cause an error to be logged, when aborting an operation involving a queue reader. Although the original exception passed to queue_reader_handle_v2::abort() is most likely handled by higher level code (not logged), the generic std::runtime_error raised is not and therefore is logged. Fixes: #23550 Closes scylladb/scylladb#23554	2025-04-03 10:39:56 +03:00
Pavel Emelyanov	3bf4768205	Merge 'Unify http transport in EAR to use seastar http client' from Calle Wilund Fixes #22925 Refs #22885 Some providers in EAR were written before seastar got its own native http connector (as it is). Thus hand-made connectivity is used there. This PR unifies the code paths, and also extract some abstraction between providers where possible. One big reason for this is the handling of abrupt disconnects and retries; Seastar has some handling of things like EPIPE and ECONNRESET situations, that can be safely ignored in a REST call iff data was in fact transferred etc. This PR mainly takes the usage of seastar httpclient from gcp connector, makes a wrapper matching most of the usage of local client in kms connector, ensures common functionality and the replaces the code in the individual connectors. Closes scylladb/scylladb#22926 * github.com:scylladb/scylladb: encryption::gcp: Use seastar http client wrapper encryption::kms: Drop local http client and use seastar wrapper encryption: Break out a "httpclient" wrapper for seastar httpclient	2025-04-03 10:35:14 +03:00
Kefu Chai	0cd6cf1dc5	main: Remove unused member variable `_sys_ks` Fixes a Clang error by removing the unused private field `sstable_dict_deleter::_sys_ks` that was flagged with: [-Werror,-Wunused-private-field] ``` /home/kefu/.local/bin/clang++ -DBOOST_PROGRAM_OPTIONS_DYN_LINK -DBOOST_PROGRAM_OPTIONS_NO_LIB -DSCYLLA_BUILD_MODE=release -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"RelWithDebInfo\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/build -isystem /home/kefu/dev/scylladb/seastar/include -isystem /home/kefu/dev/scylladb/build/RelWithDebInfo/seastar/gen/include -isystem /home/kefu/dev/scylladb/abseil -isystem /home/kefu/dev/scylladb/build/rust -I/usr/include/p11-kit-1 -ffunction-sections -fdata-sections -O3 -g -gz -std=gnu++23 -flto=thin -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb/= -ffile-prefix-map=/home/kefu/dev/scylladb/build=. 
-ffile-prefix-map=/home/kefu/dev/scylladb/build/=build -march=westmere -Xclang -fexperimental-assignment-tracking=disabled -mllvm -inline-threshold=2500 -fno-slp-vectorize -ffat-lto-objects -std=gnu++23 -Werror=unused-result -DSEASTAR_API_LEVEL=7 -DSEASTAR_SSTRING -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_SCHEDULING_GROUPS_COUNT=19 -DSEASTAR_LOGGER_TYPE_STDOUT -DBOOST_PROGRAM_OPTIONS_NO_LIB -DBOOST_PROGRAM_OPTIONS_DYN_LINK -DBOOST_THREAD_NO_LIB -DBOOST_THREAD_DYN_LINK -DFMT_SHARED -MD -MT CMakeFiles/scylla.dir/RelWithDebInfo/main.cc.o -MF CMakeFiles/scylla.dir/RelWithDebInfo/main.cc.o.d -o CMakeFiles/scylla.dir/RelWithDebInfo/main.cc.o -c /home/kefu/dev/scylladb/main.cc /home/kefu/dev/scylladb/main.cc:1660:38: error: private field '_sys_ks' is not used [-Werror,-Wunused-private-field] 1660 \| db::system_keyspace& _sys_ks; \| ^ ``` The member variable is not referenced anywhere in the code, so removing it improves maintainability without affecting functionality. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#23545	2025-04-02 20:07:39 +03:00
Evgeniy Naydanov	84a5037056	test.py: cluster/suite.yaml: update test filters After switching to subfolders the filter `run_in_debug` for random failures test was just copied as is, but need to include the subfolder, actually. Also, `test_old_ip_notification_repro` was deleted, so, we don't need it in the `skip_in_debug` list. Closes scylladb/scylladb#23492	2025-04-02 19:29:27 +03:00
Kefu Chai	a09ec9d60d	.github: add delay before checking for required PR labels Improve the GitHub workflow to prevent premature email notifications about missing labels. Previously, contributors without write permissions to the scylladb repo would receive immediate notification emails about missing required backport labels, even if they were in the process of adding them. This change introduces a 1-minute grace period before checking for required labels, giving contributors sufficient time to add necessary labels (like backport labels) to their pull requests before any warning notifications are sent. The delay makes the experience more user-friendly for non-maintainer contributors while maintaining the labeling requirements. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#23539	2025-04-02 19:28:15 +03:00
Aleksandra Martyniuk	bae6711809	test: add test to check concurrent migration and repair of two different tablets	2025-04-02 15:30:17 +02:00
Radosław Cybulski	c36614e16d	alternator: add size check to BatchItemWrite Add a size check for BatchItemWrite command - if the item count is bigger than configuration value `alternator_maximum_batch_write_size`, an error will be raised and no modification will happen. This is done to synchronize with DynamoDB, where maximum size of BatchItemWrite is 25. To avoid complaints from clients, who use our feature of BatchWriteItem being limitless we set default value to 100. Fixes #5057 Closes scylladb/scylladb#23232	2025-04-02 14:48:00 +03:00
Avi Kivity	882f405eed	Merge "Convert gossiper's endpoint state map to be host id based" from Gleb " The series makes endpoint state map in the gossiper addressable by host id instead of ips. The transition has implication outside of the gossiper as well. Gossiper based topology operations are affected by this change since they assume that the mapping is ip based. On wire protocol is not affected by the change as maps that are sent by the gossiper protocol remain ip based. If old node sends two different entries for the same host id the one with newer generation is applied. If new node has two ids that are mapped to the same ip the newer one is added to the outgoing map. Interoperability was verified manually by running mixed cluster. The series concludes the conversion of the system to be host id based. " * 'gleb/gossipper-endpoint-map-to-host-id-v2' of github.com:scylladb/scylla-dev: gossiper: make examine_gossiper private gossiper: rename get_nodes_with_host_id to get_node_ip treewide: drop id parameter from gossiper::for_each_endpoint_state treewide: move gossiper to index nodes by host id gossiper: drop ip from replicate function parameters gossiper: drop ip from apply_new_states parameters gossiper: drop address from handle_major_state_change parameter list gossiper: pass rpc::client_info to gossiper_shutdown verb handler gossiper: add try_get_host_id function gossiper: add ip to endpoint_state serialization: fix std::map de-serializer to not invoke value's default constructor gossiper: drop template from wait_alive_helper function gossiper: move get_supported_features and its users to host id storage_service: make candidates_for_removal host id based gossiper: use peers table to detect address change storage_service: use std::views::keys instead of std::views::transform that returns a key gossiper: move _pending_mark_alive_endpoints to host id gossiper: do not allow to assassinate endpoint in raft topology mode gossiper: fix indentation after previous patch 
gossiper: do not allow to assassinate non existing endpoint	2025-04-02 12:30:00 +03:00
Pavel Emelyanov	832d83ae4b	sstables_loader: Do not stop sharded<progress_monitor> unconditionally The member in question is unconditionally .stop()-ed in task's release_resources() method, however, it may happen that the thing wasn't .start()-ed in the first place. Start happens in the middle of the task's .run() method and there can be several reasons why it can be skipped -- e.g. the task is aborted early, or collecting sstables from S3 throws. fixes: #23231 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#23483	2025-04-02 12:09:02 +03:00
Kefu Chai	6da758d74c	config: mark uuid_sstable_identifiers_enabled unused the option of `uuid_sstable_identifier_enabled` was introduced in `f014ccf3` . the first version which has this change was 5.4, and 6.1 has been branched. during the discussion of backup and restore, we realized that we've been taking efforts to address problems which could have been addressed with the sstable with UUID-based identifier. see also #10459 which is the issue which proposed to implement UUID-v1 based sstable identifier. now that two major releases passed, we should have the luxury to mark this option "unused". this option which was previously introduced to keep the backward compatibility, and to allow user to opt-out of the feature for some reasons. so in this change, mark the option unused, so that if any user still sets this option with command line, they will get a clear error. but we still parse and handle this setting in `scylla.yaml`, so that this option is still respected for existing settings, and for existing tests, which are not yet prepared for the uuid-based sstable identifiers. Refs #10459 Fixes #20337 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#20341	2025-04-01 20:21:47 +03:00
Botond Dénes	3bad46a6e2	docs/dev: add tombstone.md An exhaustive document on the tombstone related internal logic as well as the user-facing aspects. Closes scylladb/scylladb#23454	2025-04-01 20:17:57 +03:00
Botond Dénes	a0d8102a1f	replica/memtable: s/make_flat_reader/make_mutation_reader/ Following the recent refactoring of removing "flat" and "v2" from reader names, replacing all the fully qualified names with simply "mutation_reader". Closes scylladb/scylladb#23346	2025-04-01 17:58:13 +03:00
Artsiom Mishuta	032b28d793	test.py: remove pylib_test from test.py/CI run pylib_test contains one pure Python test. This test does not test Scylla. This test is not deleted because it can be useful to run during pre-commit, for example, but it definitely should not be run in CI in modes with 3 repeats each. It does not make sense. It is a Unit test for test.py framework. Note: test still can be easily run by pytest via the command: ./tools/toolchain/dbuild pytest test/pylib_test Closes scylladb/scylladb#23181	2025-04-01 16:43:45 +03:00
Pavel Emelyanov	2ee9cec1d3	Merge 'Remove object_storage.yaml and move the endpoints to scylla.yaml' from Robert Bindar Move `object_storage.yaml` endpoints to `scylla.yaml` This change also removes the `object_storage.yaml` file altogether and adds tests for fetching the endpoints via the `v2/config/object_storage_endpoints` REST api. Also, `object_storage_config_file` options is moved to a deprecated state as it's no longer needed. This PR depends on #22951, the reviewers should review patch 393e1ac0ec066475ca94094265a5f88dbbdb1a1f Refs https://github.com/scylladb/scylladb/issues/22428 Closes scylladb/scylladb#22952 * github.com:scylladb/scylladb: Remove db::config::object_storage_config Move `object_storage.yaml` endpoints to `scylla.yaml`	2025-04-01 16:01:44 +03:00
Avi Kivity	69684e16d8	Merge 'sstables: add SSTable compression with shared dictionaries ' from Michał Chojnowski This PR extends Scylla's SSTable compression with the ability to use compression dictionaries shared across compression chunks. This involves several changes: - We refactor `compression_parameters` and friends (`compressor`, `sstables::local_compression`, `sstables::compression`) to prepare for making the construction of `compressor`s asynchronous, to enable sharing pieces of compressors (the dictionaries) across shards. - We introduce the notion of "hidden compression options" which are written to `CompressionInfo.db` and used to construct decompressors, like regular options, but don't appear in the schema. (We later stuff the SSTable's dictionary into `CompressionInfo.db` using a sequence of such options). - We add a cluster feature which guards the creation of dictionary-compressed SSTables. - We introduce a central "compressor factory" (one instance shared by all shards), which from this point onward is used to construct all `compressor` objects (one per SSTable) used to process the SSTables. When constructing a compressor for writing, it uses the "current"/"recommended" dictionary (which is passed to the factory from the actively-observed contents of the group0-managed `system.dicts`). When constructing a compressor for reading, it uses the dictionary written in the hidden compression options in CompressionInfo.db. And it keeps dictionaries deduplicated, so that each unique live dictionary blob has only one instance in memory, shared across shards. - We teach the relevant `lz4` and `zstd` compressor wrappers about the dictionaries. - We add a HTTP API call which samples pieces of the given table (i.e. the Data.db files) from across the cluster, trains a dictionary on it, and publishes it via `system.dicts` as the new current dictionary for that table. (And we add some RPC verbs to support that). 
- We add a HTTP API call which estimates the impact of various available compression configurations on the compression ratio. - We add an autotrainer fiber which periodically retrains dicts for dict-aware tables and publishes them if they seem to be a significant improvement. Known imperfections: - The factory currently keeps one dictionary instance on the entire node, but we probably want one copy per NUMA node. I didn't do that because exposing NUMA knowledge to Scylla seems to require some changes in Seastar first. New feature, no backporting involved. Closes scylladb/scylladb#23025 * github.com:scylladb/scylladb: docs: add user-facing documentation for SSTable compression with shared dicts docs/dev: add sstable-compression-dicts.md test: add test_sstable_compression_dictionaries_autotrain.py test: add test_sstable_compression_dictionaries_basic.py test/pylib/rest_client: add `keyspace_upgrade_sstables` helper main: run a sstable_dict_autotrainer api: add the estimate_compression_ratios API call dict_autotrainer: introduce sstable_dict_autotrainer db/system_keyspace: add query_dict_timestamp compress: add ZstdWithDictsCompressor and LZ4WithDictsCompressor main: clean up sstable compression dicts after table drops sstables/compress: discard hidden compression options after the decompressor is created compress: change compressor_ptr from shared_ptr to unique_ptr api: add the retrain_dict API call storage_service: add some dict-related routines main: in compression_dict_updated_callback, recognize and use SSTable compression dicts storage_service: add do_sample_sstables() messaging_service: add SAMPLE_SSTABLES and ESTIMATE_SSTABLE_VOLUME verbs db/system_keyspace: let `system.dicts` helpers be used for dicts other than the RPC compression dict raft/group0_state_machine: on `system.dicts` mutations, pass the affected partitition keys to the callback database: add sample_data_files() database: add take_sstable_set_snapshot() compress: teach `lz4_processor` about 
dictionaries compress: teach `zstd_processor` about dictionaries sstables: delegate compressor creation to the compressor factory sstables: plug an `sstable_compressor_factory` into `sstables_manager` sstables: introduce sstable_compressor_factory utils/hashers: add get_sha256() gms/feature_service: add the SSTABLE_COMPRESSION_DICTS cluster feature compress: add hidden dictionary options compress: remove `compression_parameters::get_compressor()` sstables/compress: remove get_sstable_compressor() sstables/compress: move ownership of `compressor` to `sstable::compression` compress: remove compressor::option_names() compress: clean up the constructor of zstd_processor compress: squash zstd.cc into compress.cc sstables/compress: break the dependency of `compression_parameters` on `compressor` compress.hh: switch compressor::name() from an instance member to a virtual call bytes: adapt fmt_hex to std::span<const std::byte>	2025-04-01 12:47:34 +03:00
Aleksandra Martyniuk	1dc29ddc86	repair: release erm in repair_writer_impl::create_writer when possible Currently, repair_writer_impl::create_writer keeps erm to ensure that a sharder is valid. If we repair a tablet, erm blocks the state machine and no operation on any tablet of this table might be performed. Use auto_refreshing_sharder and topology_guard to ensure that the operation is safe and that tablet operations on the whole table aren't blocked. Fixes: #23453.	2025-04-01 11:34:21 +02:00
Calle Wilund	c6674619b7	encryption::gcp: Use seastar http client wrapper Refs #22925 Remove direct usage of seastar http client, and instead share this with other connectors via the http client wrapper type.	2025-04-01 08:18:05 +00:00
Calle Wilund	491748cde3	encryption::kms: Drop local http client and use seastar wrapper Fixes #22925 Removes the boost based http client in favour of our seastar wrapper.	2025-04-01 08:18:05 +00:00
Calle Wilund	878f76df1f	encryption: Break out a "httpclient" wrapper for seastar httpclient Refs #22925 Adds some wrapping and helpers for the kind of REST operations we expect to perform. Some things like stream formatting is redundant visavi seastar, but on that level we only have \r\n encoded writing to output_stream and similar, which is less useful for things like logging.	2025-04-01 08:18:05 +00:00
Piotr Smaron	370707b111	service: restore default timeout in `announce_with_raft` This restored timeout seems to have been accidentally removed in `7081215552 (r2005352424)`. Without it, `raft_server_with_timeouts::run_with_timeout` will get `std::nullopt` as a value of the `timeout` parameter and perform an operation without any timeout, whereas previously it would have waited for the default timeout specified in `raft_server_for_group::default_op_timeout`. Closes scylladb/scylladb#23380	2025-04-01 10:20:16 +03:00
David Garcia	6e61fc323b	docs: redirect to docs.scylladb.com/manual/ Define a custom alert to redirect users to the latest version of the docs in https://docs.scylladb.com/manual/ Closes scylladb/scylladb#22636	2025-04-01 09:22:56 +03:00
Botond Dénes	bd9f51a29c	Merge 'transport/server.cc: set default timestamp info in EXECUTE and BATCH tracing' from Vladislav Zolotarov A default timestamp (not to confuse with the timestamp passed via 'USING TIMESTAMP' query clause) can be set using 0x20 flag and the <timestamp> field in the binary CQL frame payload of QUERY, EXECUTE and BATCH ops. It also happens to be a default of a Java CQL Driver. However, we were only setting the corresponding info in the CQL Tracing context of a QUERY operation. For an unknown reason we were not setting this for an EXECUTE and for a BATCH traces (I guess I simply forgot to set it back then). This patch fixes this. Fixes #23173 The issue fixed by this PR is not critical but the fix is simple and safe enough so we should backport it to all live releases. Closes scylladb/scylladb#23174 * github.com:scylladb/scylladb: CQL Tracing: set common query parameters in a single function transport/server.cc: set default timestamp info in EXECUTE and BATCH tracing	2025-04-01 09:16:02 +03:00
Pavel Emelyanov	b5a124f60c	sstable_directory: Move highest_generation_seen() to distributed_loader.cc This method is only used by the loader code (and tests). Also, There's the highest_version_seen() peer that sits in the loader code either. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#23324	2025-04-01 09:15:14 +03:00
Pavel Emelyanov	eafc767cc6	sstable/filesystem: Add convenience helper to generate filename In its operations the fs storage carefully generates full filename from all sstable parameters -- version, format, generation, keyspace and table names and component type or name. However, in all of the cases format, version and keyspace:table names are inherited from the sstable being operated on. This calls for a filename generation helper that wraps most of the arguments thus making the lines shorter. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#23384	2025-04-01 09:14:44 +03:00
Botond Dénes	0fdf2a2090	Merge 'test/pylib: servers_add: support list of property_files' from Benny Halevy So that a multi-dc/multi-rack cluster can be populated in a single call. * Enhancement, no backport required Closes scylladb/scylladb#23341 * github.com:scylladb/scylladb: test/pylib: servers_add: add auto_rack_dc parameter test/pylib: servers_add: support list of property_files	2025-04-01 09:14:20 +03:00
Jenkins Promoter	6c528f5027	Update pgo profiles - aarch64	2025-04-01 04:45:44 +03:00
Jenkins Promoter	3c12029584	Update pgo profiles - x86_64	2025-04-01 04:27:11 +03:00
Michał Chojnowski	36be9d1c9b	docs: add user-facing documentation for SSTable compression with shared dicts	2025-04-01 00:07:31 +02:00
Michał Chojnowski	d33ffb221b	docs/dev: add sstable-compression-dicts.md	2025-04-01 00:07:31 +02:00
Michał Chojnowski	f851efd4fa	test: add test_sstable_compression_dictionaries_autotrain.py Adds a test which checks that sstable compression dict autotraining does its job.	2025-04-01 00:07:31 +02:00
Michał Chojnowski	62da3d8363	test: add test_sstable_compression_dictionaries_basic.py Add a basic integration test for SSTable compression with shared dictionaries.	2025-04-01 00:07:30 +02:00
Michał Chojnowski	7b0eeefd79	test/pylib/rest_client: add `keyspace_upgrade_sstables` helper	2025-04-01 00:07:30 +02:00
Michał Chojnowski	3f7969313f	main: run a sstable_dict_autotrainer Create an instance of `sstable_dict_autotrainer` in `scylla_main` and run it.	2025-04-01 00:07:30 +02:00
Michał Chojnowski	a19d6d95f7	api: add the estimate_compression_ratios API call Add an API call which estimates the effectiveness of possible compression config changes. This can be used to make an informed decision about whether to change the compression method, without actually recompressing any SSTables.	2025-04-01 00:07:30 +02:00
Michał Chojnowski	4f0d453acf	dict_autotrainer: introduce sstable_dict_autotrainer Add a fiber responsible for periodic re-training of compression dictionaries (for tables which opted into dict-aware compression). As of this patch, it works like this: every `$tick_period` (15 minutes), if we are the current Raft leader, we check for dict-aware tables which have no dict, or a dict older than `$retrain_period`. For those tables, if they have enough data (>1GiB) for a training, we train a new dict and check if it's significantly better than the current one (provides ratio smaller than 95% of current ratio), and if so, we update the dict.	2025-04-01 00:07:30 +02:00
Michał Chojnowski	9d02e2c005	db/system_keyspace: add query_dict_timestamp Adds a helper method which queries the creation timestamp of a given dict in `system.dicts`. We will later use the age of the current SSTable compression dict to decide if another training should be done already.	2025-04-01 00:07:30 +02:00
Michał Chojnowski	cb1b291051	compress: add ZstdWithDictsCompressor and LZ4WithDictsCompressor Add new compressor names to `sstable_compression`. When those names are configured in the schema, new SSTables will be compressed with dict-aware Zstd or LZ4 respectively.	2025-04-01 00:07:30 +02:00
Michał Chojnowski	bea866a46f	main: clean up sstable compression dicts after table drops When a table is dropped, its corresponding dictionary in `system.dicts` -- if any -- should be deleted, otherwise it will remain forever as garbage. This commit implements such cleanup.	2025-04-01 00:07:30 +02:00
Michał Chojnowski	cee504f66f	sstables/compress: discard hidden compression options after the decompressor is created Dictionary contents are kept in the list of "compression options" in the header of `CompressionInfo.db`, and they are loaded from disk into memory when the `sstable::compression` object is populated. After the decompressor for the SSTable is created based on those dict contents, they are not needed in RAM anymore. And since they take up a sizeable amount of memory, we would like to free them. In this patch, we discard all "hidden compression options" (currently: only the dictionary contents) from the `sstable::compression` object right after the decompressor is created. (Those options are not supposed to be used for anything else anyway).	2025-04-01 00:07:30 +02:00
Michał Chojnowski	10fa4abde7	compress: change compressor_ptr from shared_ptr to unique_ptr Cleanup patch. After we moved the ownership of compressors to sstables, compressor objects never have shared lifetime. `unique_ptr` is more appropriate for them than `shared_ptr` now. (And besides expressing the intent better, using `unique_ptr` prevents an accidental cross-shard `shared_ptr` copy).	2025-04-01 00:07:29 +02:00
Michał Chojnowski	58ae278d10	api: add the retrain_dict API call Add an API call which will retrain the SSTable compression dictionary for a given table. Currently, it needs all nodes to be alive to succeed. We can relax this later.	2025-04-01 00:07:29 +02:00
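
The retraining rule described in 4f0d453acf (dict_autotrainer: introduce sstable_dict_autotrainer) can be sketched roughly as below. This is only an illustration of the decision logic as the commit message words it, not the Scylla implementation (which is C++). The names `TableInfo`, `wants_training` and `should_publish` are hypothetical, and the retrain period is left as a parameter because the commit message does not state its value. "Ratio" is assumed to mean compressed size divided by uncompressed size, so smaller is better.

```python
from dataclasses import dataclass
from typing import Optional

GIB = 1024 ** 3
MIN_TRAINING_BYTES = 1 * GIB  # "enough data (>1GiB)" per the commit message
IMPROVEMENT_FACTOR = 0.95     # new ratio must be below 95% of the current ratio


@dataclass
class TableInfo:
    """Hypothetical per-table summary; not a Scylla type."""
    dict_aware: bool             # table opted into dict-aware compression
    data_bytes: int              # amount of data available for sampling
    dict_age_s: Optional[float]  # None when the table has no dictionary yet


def wants_training(table: TableInfo, is_raft_leader: bool,
                   retrain_period_s: float) -> bool:
    """Decide, on one tick, whether a new dictionary should be trained."""
    # Only the current Raft leader trains, and only for dict-aware tables.
    if not is_raft_leader or not table.dict_aware:
        return False
    # The table needs more than 1 GiB of data to be worth sampling.
    if table.data_bytes <= MIN_TRAINING_BYTES:
        return False
    # Train when there is no dictionary, or the current one is too old.
    return table.dict_age_s is None or table.dict_age_s > retrain_period_s


def should_publish(current_ratio: float, new_ratio: float) -> bool:
    """Publish only if the new dictionary is significantly better."""
    return new_ratio < IMPROVEMENT_FACTOR * current_ratio


if __name__ == "__main__":
    table = TableInfo(dict_aware=True, data_bytes=2 * GIB, dict_age_s=None)
    # The one-day retrain period here is an arbitrary example value.
    print(wants_training(table, is_raft_leader=True, retrain_period_s=86400.0))
    print(should_publish(current_ratio=0.50, new_ratio=0.47))
```

Keeping the two checks separate mirrors the two stages in the commit message: first a cheap eligibility test on each tick, then a comparison that is only possible after a candidate dictionary has been trained.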

47334 Commits