scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-26 03:20:37 +00:00

Author	SHA1	Message	Date
Nikos Dragazis	bafe2bbbbc	db/config: Deprecate sstable_compression_dictionaries_allow_in_ddl The option is a knob that allows to reject dictionary-aware compressors in the validation stage of CREATE/ALTER statements, and in the validation of `sstable_compression_user_table_options`. It was introduced in `7d26d3c7cb` to allow the admins of Scylla Cloud to selectively enable it in certain clusters. For more details, check: https://github.com/scylladb/scylla-enterprise/issues/5435 As of this series, we want to start offering dictionary compression as the default option in all clusters, i.e., treat it as a generally available feature. This makes the knob redundant. Additionally, making dictionary compression the default choice in `sstable_compression_user_table_options` creates an awkward dependency with the knob (disabling the knob should cause `sstable_compression_user_table_options` to fall back to a non-dict compressor as default). That may not be very clear to the end user. For these reasons, mark the option as "Deprecated", remove all relevant tests, and adjust the business logic as if dictionary compression is always available. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com> (cherry picked from commit `96e727d7b9`)	2025-11-04 15:40:46 +02:00
Nikos Dragazis	260c9972b0	boost/cql_query_test: Get expected compressor from config Since `5b6570be52`, the default SSTable compression algorithm for user tables is no longer hardcoded; it can be configured via the `sstable_compression_user_table_options.sstable_compression` option in scylla.yaml. Modify the `test_table_compression` test to get the expected value from the configuration. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com> (cherry picked from commit `d95ebe7058`)	2025-10-31 23:50:20 +00:00
Pavel Emelyanov	9459a58116	Merge '[Backport 2025.4] cdc: improve cdc metadata loading' from Scylladb[bot] when loading CDC streams metadata for tablets from the tables, read only new entries from the history table instead of reading all entries. This improves the CDC metadata reloading, making it more efficient and predictable. the CDC metadata is loaded as part of group0 reload whenever the internal CDC tables are modified. on tablet split / merge, we create a new CDC timestamp and streams by writing them to the cdc_streams_history table by group0 operation, and when it's applied we reload the in-memory CDC streams map by reading from the tables and constructing the updated map. Previously, on every update, we would read the entire cdc_streams_history entries for the changed table, constructing all its streams and creating a new map from scratch. We improve this now by reading only new entries from cdc_streams_history and append them to the existing map. we can do this because we only append new entries to cdc_streams_history with higher timestamp than all previous entries. This makes this reloading more efficient and predictable, because previously we would read a number of entries that depends on the number of tablets splits and merges, which increases over time and is unbounded, whereas now we read only a single stream set on each update. Fixes https://github.com/scylladb/scylladb/issues/26732 backport to 2025.4 where cdc with tablets is introduced - (cherry picked from commit `8743422241`) - (cherry picked from commit `4cc0a80b79`) Parent PR: #26160 Closes scylladb/scylladb#26798 * github.com:scylladb/scylladb: test: cdc: extend cdc with tablets tests cdc: improve cdc metadata loading	2025-10-30 10:32:27 +03:00
Michael Litvak	59f97d0b71	test: cdc: extend cdc with tablets tests extend and improve the tests of virtual tables for cdc with tablets. split the existing virtual tables test to one test that validates the virtual tables against the internal cdc tables, and triggering some tablet splits in order to create entries in the cdc_streams_history table, and add another test with basic validation of the virtual tables when there are multiple cdc tables. (cherry picked from commit `4cc0a80b79`)	2025-10-30 02:44:47 +00:00
Michael Litvak	0a07c2cb19	cdc: improve cdc metadata loading when loading CDC streams metadata for tablets from the tables, read only new entries from the history table instead of reading all entries. This improves the CDC metadata reloading, making it more efficient and predictable. the CDC metadata is loaded as part of group0 reload whenever the internal CDC tables are modified. on tablet split / merge, we create a new CDC timestamp and streams by writing them to the cdc_streams_history table by group0 operation, and when it's applied we reload the in-memory CDC streams map by reading from the tables and constructing the updated map. Previously, on every update, we would read the entire cdc_streams_history entries for the changed table, constructing all its streams and creating a new map from scratch. We improve this now by reading only new entries from cdc_streams_history and append them to the existing map. we can do this because we only append new entries to cdc_streams_history with higher timestamp than all previous entries. This makes this reloading more efficient and predictable, because previously we would read a number of entries that depends on the number of tablets splits and merges, which increases over time and is unbounded, whereas now we read only a single stream set on each update. Fixes scylladb/scylladb#26732 (cherry picked from commit `8743422241`)	2025-10-30 02:44:47 +00:00
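The incremental reload described in the commit above can be sketched as follows. This is a minimal Python illustration, not ScyllaDB's C++ implementation: `StreamsMap`, `append_new_entries`, and the `(timestamp, stream_set)` tuple layout are hypothetical names chosen for the example, and the list `history` merely stands in for rows of `cdc_streams_history`.

```python
from dataclasses import dataclass, field


@dataclass
class StreamsMap:
    """Hypothetical in-memory map of CDC timestamps to stream sets."""
    timestamps: list = field(default_factory=list)  # kept in ascending order
    streams: dict = field(default_factory=dict)     # timestamp -> stream set

    def last_timestamp(self):
        return self.timestamps[-1] if self.timestamps else None


def append_new_entries(current, history):
    """Append only the history entries newer than everything already loaded.

    `history` is a list of (timestamp, stream_set) pairs sorted by timestamp.
    The filter below plays the role of a "timestamp > last" range read.
    Returns the number of entries that were read and appended.
    """
    last = current.last_timestamp()
    new_entries = [(ts, s) for ts, s in history if last is None or ts > last]
    for ts, stream_set in new_entries:
        current.timestamps.append(ts)
        current.streams[ts] = stream_set
    return len(new_entries)
```

Appending is only valid under the property the commit relies on: new history entries always carry a higher timestamp than all previous ones.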
Pavel Emelyanov	080c55a115	lister: Fix race between readdir and stat Sometimes file::list_directory() returns entries without type set. In that case lister calls file_type() on the entry name to get it. In case the call returns disengaged type, the code assumes that some error occurred and resolves into exception. That's not correct. The file_type() method returns disengaged type only if the file being inspected is missing (i.e. on ENOENT errno). But this can validly happen if a file is removed between readdir and stat. In that case it's not "some error happened", but an entry that should just be skipped. If "some error happened", then file_type() would resolve into exceptional future on its own. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#26595 (cherry picked from commit `d9bfbeda9a`) Closes scylladb/scylladb#26767	2025-10-29 11:29:57 +02:00
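The same readdir/stat race exists in any language that lists a directory and then stats each entry. The sketch below shows the skip-on-ENOENT pattern in Python using only the standard library; it is an analogy to the fix, not the Seastar `lister` code, and `list_entries` is a hypothetical helper name.

```python
import os
import stat


def list_entries(path):
    """Return (name, kind) pairs for `path`, skipping entries that vanish mid-listing."""
    result = []
    for name in os.listdir(path):
        try:
            mode = os.lstat(os.path.join(path, name)).st_mode
        except FileNotFoundError:
            # The entry was removed between the directory read and the stat
            # call. That is a benign race, so skip it instead of failing.
            continue
        # Any other OSError (e.g. EACCES) still propagates as a real error.
        kind = "directory" if stat.S_ISDIR(mode) else "other"
        result.append((name, kind))
    return result
```

Only the "file is missing" case is swallowed; every other failure keeps surfacing as an exception, which mirrors the distinction drawn in the commit message.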
Anna Stuchlik	93de570e33	doc: add --list-active-releases to Web Installer Fixes https://github.com/scylladb/scylladb/issues/26688 V2 of https://github.com/scylladb/scylladb/pull/26687 Closes scylladb/scylladb#26689 (cherry picked from commit `bd5b966208`) Closes scylladb/scylladb#26765	2025-10-29 11:28:51 +02:00
Patryk Jędrzejczak	680bfa9ab7	test: test_raft_recovery_stuck: reconnect driver after rolling restarts It turns out that #21477 wasn't sufficient to fix the issue. The driver may still decide to reconnect the connection after `rolling_restart` returns. One possible explanation is that the driver sometimes handles the DOWN notification after all nodes consider each other UP. Reconnecting the driver after restarting nodes seems to be a reliable workaround that many tests use. We also use it here. Fixes #19959 Closes scylladb/scylladb#26638 (cherry picked from commit `5321720853`) Closes scylladb/scylladb#26763	2025-10-29 11:27:49 +02:00
Anna Stuchlik	ed58815199	doc: add OS support for version 2025.4 Fixes https://github.com/scylladb/scylladb/issues/26450 Closes scylladb/scylladb#26616 (cherry picked from commit `6fa342fb18`) Closes scylladb/scylladb#26750	2025-10-29 11:27:08 +02:00
Botond Dénes	9c8812a154	Merge '[Backport 2025.4] LWT: use shards_ready_for_reads for replica locks' from Scylladb[bot] When a tablet is migrated between shards on the same node, during the write_both_read_new state we begin switching reads to the new shard. Until the corresponding global barrier completes, some requests may still use write_both_read_old erm, while others already use the write_both_read_new erm. To ensure mutual exclusion between these two types of requests, we must acquire locks on both the old and new shards. Once the global barrier completes, no requests remain on the old shard, so we can safely switch to acquiring locks only on the new shard. The idea came from the similar locking problem in the [counters for tablets PR](https://github.com/scylladb/scylladb/pull/26636#discussion_r2463932395). Fixes scylladb/scylladb#26727 backport: need to backport to 2025.4 - (cherry picked from commit `5ab2db9613`) - (cherry picked from commit `478f7f545a`) Parent PR: #26719 Closes scylladb/scylladb#26748 * github.com:scylladb/scylladb: paxos_state: use shards_ready_for_reads paxos_state: inline shards_for_writes into get_replica_lock	2025-10-29 11:26:29 +02:00
Botond Dénes	aac49601c6	Merge '[Backport 2025.4] cdc: garbage collect CDC streams for tablets' from Scylladb[bot] introduce helper functions that can be used for garbage collecting old cdc streams for tablets-based keyspaces. add a background fiber to the topology coordinator that runs periodically and checks for old CDC streams for tablets keyspaces that can be garbage collected. the garbage collection works by finding the newest cdc timestamp that has been closed for more than the configured cdc TTL, and removing all information from the cdc internal tables about cdc timestamps and streams up to this timestamp. in general it should be safe to remove information about these streams because they are closed for more than TTL, therefore all rows that were written to these streams with the configured TTL should be dead. the exception is if the TTL is altered to a smaller value, and then we may remove information about streams that still have live rows that were written with the longer ttl. Fixes https://github.com/scylladb/scylladb/issues/26669 - (cherry picked from commit `440caeabcb`) - (cherry picked from commit `6109cb66be`) Parent PR: #26410 Closes scylladb/scylladb#26728 * github.com:scylladb/scylladb: cdc: garbage collect CDC streams periodically cdc: helpers for garbage collecting old streams for tablets	2025-10-29 11:25:31 +02:00
Asias He	89364d3576	repair: Remove the regular mode name in the tablet repair api The patch `e34deb72f9` (repair: Rename incremental mode name) missed one place that references the removed regular mode name. Fixes #26503 Closes scylladb/scylladb#26660 (cherry picked from commit `5f1febf545`) Closes scylladb/scylladb#26684	2025-10-29 11:22:56 +02:00
Anna Stuchlik	68ea778b6b	doc: add support for Debian 12 Fixes https://github.com/scylladb/scylladb/issues/26640 Closes scylladb/scylladb#26668 (cherry picked from commit `9c0ff7c46b`) Closes scylladb/scylladb#26681	2025-10-29 11:22:29 +02:00
Botond Dénes	087f739bf9	Merge '[Backport 2025.4] alternator/executor: instantly mark view as built when creating it with base table' from Scylladb[bot] `CreateTable` request creates GSI/LSI together with the base table, the base table is empty and we don't need to actually build the view. In tablet-based keyspaces we can just don't create view building tasks and mark the view build status as SUCCESS on all nodes. Then, the view building worker on each node will mark the view as built in `system.built_views` (`view_building_worker::update_built_views()`). Vnode-based keyspaces will use the "old" logic of view builder, which will process the view and mark it as built. Fixes scylladb/scylladb#26615 This fix should be backported to 2025.4. - (cherry picked from commit `8fbf122277`) - (cherry picked from commit `bdab455cbb`) - (cherry picked from commit `34503f43a1`) Parent PR: #26657 Closes scylladb/scylladb#26670 * github.com:scylladb/scylladb: test/alternator/test_tablets: add test for GSI backfill with tablets test/alternator/test_tablets: add reproducer for GSI with tablets alternator/executor: instantly mark view as built when creating it with base table	2025-10-29 11:21:27 +02:00
Petr Gusev	332b776e87	paxos_state: use shards_ready_for_reads Acquiring locks on both shards for the entire tablet migration period is redundant. In most cases, locking only the old shard or only the new shard is sufficient. Using shards_ready_for_reads reduces the situations in which we need to lock both shards to: * intra-node migrations only * only during the write_both_read_new state Once the global barrier completes in the write_both_read_new state, no requests remain on the old shard, so we can safely acquire locks only on the new shard. Fixes scylladb/scylladb#26727 (cherry picked from commit `478f7f545a`)	2025-10-28 16:59:47 +00:00
Petr Gusev	ff0e7ac853	paxos_state: inline shards_for_writes into get_replica_lock No need to have two functions since both callers of get_replica_lock() use shards_for_writes() to compute the shards where the locks must be acquired. Also while at it, inline the acquire() lambda in get_replica_lock() and replace it with a loop over shards. This makes the code more straightforward. (cherry picked from commit `5ab2db9613`)	2025-10-28 16:59:47 +00:00
Michael Litvak	5319759bdb	cdc: garbage collect CDC streams periodically add a background fiber to the topology coordinator that runs periodically and checks for old CDC streams for tablets keyspaces that can be garbage collected. (cherry picked from commit `6109cb66be`)	2025-10-27 19:53:04 +00:00
Michael Litvak	55d9d5e7c2	cdc: helpers for garbage collecting old streams for tablets introduce helper functions that can be used for garbage collecting old cdc streams for tablets-based keyspaces. - get_new_base_for_gc: finds a new base timestamp given a TTL, such that all older timestamps and streams can be removed. - get_cdc_stream_gc_mutations: given new base timestamp and streams, builds mutations that update the internal cdc tables and remove the older streams. - garbage_collect_cdc_streams_for_table: combines the two functions above to find a new base and build mutations to update it for a specific table - garbage_collect_cdc_streams: builds gc mutations for all cdc tables (cherry picked from commit `440caeabcb`)	2025-10-27 19:53:04 +00:00
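A rough sketch of how a new GC base could be chosen, in Python. It assumes (this is not stated in the commit) that a stream set is considered closed at the moment the next timestamp begins; `pick_new_base` and its signature are hypothetical and do not mirror the real `get_new_base_for_gc`.

```python
def pick_new_base(timestamps, now, ttl):
    """Return the index of the new base timestamp.

    `timestamps` is sorted ascending. Under the stated assumption, entry i is
    closed at timestamps[i + 1]. Every entry before the returned index has
    been closed for longer than `ttl` and can be removed.
    """
    base = 0
    for i in range(len(timestamps) - 1):
        closed_at = timestamps[i + 1]
        if now - closed_at > ttl:
            base = i + 1  # entry i is old enough; the base moves past it
        else:
            break         # later entries were closed even more recently
    return base
```

For example, with timestamps `[0, 10, 20, 30]`, `now=100`, and `ttl=75`, the first two entries were closed 90 and 80 time units ago, so the new base is index 2. As the merge message notes, this is only safe if the TTL was not lowered after rows were written with a longer one.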
Jenkins Promoter	7f08d0a6cf	Update ScyllaDB version to: 2025.4.0-rc4	2025-10-27 14:57:11 +02:00
Patryk Jędrzejczak	c406e1dd17	Merge '[Backport 2025.4] raft topology: fix group0 tombstone GC in the Raft-based recovery procedure' from Scylladb[bot] Group0 tombstone GC considers only the current group 0 members while computing the group 0 tombstone GC time. It's not enough because in the Raft-based recovery procedure, there can be nodes that haven't joined the current group 0 yet, but they have belonged to a different group 0 and thus have a non-empty group 0 state ID. The current code can cause a data resurrection in group 0 tables. We fix this issue in this PR and add a regression test. This issue was uncovered by `test_raft_recovery_entry_loss`, which became flaky recently. We skipped this test for now. We will unskip it in a following PR because it's skipped only on master, while we want to backport this PR. Fixes #26534 This PR contains an important bugfix, so we should backport it to all branches with the Raft-based recovery procedure (2025.2 and newer). - (cherry picked from commit `1d09b9c8d0`) - (cherry picked from commit `6b2e003994`) - (cherry picked from commit `c57f097630`) Parent PR: #26612 Closes scylladb/scylladb#26682 * https://github.com/scylladb/scylladb: test: test group0 tombstone GC in the Raft-based recovery procedure group0_state_id_handler: remove unused group0_server_accessor group0_state_id_handler: consider state IDs of all non-ignored topology members	2025-10-27 10:15:49 +01:00
Avi Kivity	e85ab70054	Merge '[Backport 2025.4] tablet_metadata_guard: fix split/merge handling' from Petr Gusev The guard should stop refreshing the ERM when the number of tablets changes. Tablet splits or merges invalidate the tablet_id field (_tablet), which means the guard can no longer correctly protect ongoing operations from tablet migrations. The problem is specific to LWT, since tablet_metadata_guard is used mostly for heavy topology operations, which exclude with split and merge. The guard was used for LWT as an optimization -- we don't need to block topology operations or migrations of unrelated tablets. In the future, we could use the guard for regular reads/writes as well (via the token_metadata_guard wrapper). Fixes https://github.com/scylladb/scylladb/issues/26437 backports: need to backport to 2025.4 since the bug is relevant to LWT over tablets. (cherry picked from commit `e1667afa50`) (cherry picked from commit `6f4558ed4b`) (cherry picked from commit `64ba427b85`) (cherry picked from commit `ec6fba35aa`) (cherry picked from commit `b23f2a2425`) (cherry picked from commit `33e9ea4a0f`) (cherry picked from commit `03d6829783`) Parent PR: https://github.com/scylladb/scylladb/pull/26619 Closes scylladb/scylladb#26700 * github.com:scylladb/scylladb: test_tablets_lwt: add test_tablets_merge_waits_for_lwt test.py: add universalasync_typed_wrap tablet_metadata_guard: fix split/merge handling tablet_metadata_guard: add debug logs paxos_state: shards_for_writes: improve the error message storage_service: barrier_and_drain – change log level to info topology_coordinator: fix log message	2025-10-24 21:22:49 +03:00
Petr Gusev	41f8f6b571	test_tablets_lwt: add test_tablets_merge_waits_for_lwt (cherry picked from commit `03d6829783`)	2025-10-24 12:22:20 +02:00
Petr Gusev	31e4bb1bc3	test.py: add universalasync_typed_wrap The universalasync.wrap function doesn't preserve the type information, which confuses the VS Code Pylance plugin and makes code navigation hard. In this commit we fix the problem by adding a typed wrapped around universalasync.wrap. Fixes: scylladb/scylladb#26639 (cherry picked from commit `33e9ea4a0f`)	2025-10-24 12:21:21 +02:00
Petr Gusev	be94aab207	tablet_metadata_guard: fix split/merge handling The guard should stop refreshing the ERM when the number of tablets changes. Tablet splits or merges invalidate the tablet_id field (_tablet), which means the guard can no longer correctly protect ongoing operations from tablet migrations. Fixes scylladb/scylladb#26437 (cherry picked from commit `b23f2a2425`)	2025-10-24 12:21:21 +02:00
Petr Gusev	a5be65785c	tablet_metadata_guard: add debug logs (cherry picked from commit `ec6fba35aa`)	2025-10-24 12:21:21 +02:00
Petr Gusev	5720dd52b8	paxos_state: shards_for_writes: improve the error message Add the current token and tablet info, remove 'this_shard_id' since it's always written by the logging infrastructure. (cherry picked from commit `64ba427b85`)	2025-10-24 12:21:21 +02:00
Petr Gusev	aa2021888c	storage_service: barrier_and_drain – change log level to info Debugging global barrier issues is difficult without these logs. Since barriers do not occur frequently, increasing the log level should not produce excessive output. (cherry picked from commit `6f4558ed4b`)	2025-10-24 12:21:21 +02:00
Petr Gusev	a09c1b355e	topology_coordinator: fix log message (cherry picked from commit `e1667afa50`)	2025-10-24 12:21:21 +02:00
Pawel Pery	67e0c8e4b0	vector_search: fix flaky dns_refresh_aborted test The test proceeds as follows: - run a long dns refresh process - request hostname resolution with a short abort_source timer - the result should be an empty list, because the request was aborted The test sometimes finishes the long dns refresh before the abort_source fires, and the result list is not empty. There are two issues. First, as.reset() changes the abort_source timeout. The patch adds a get() method to the abort_source_timeout class, so there is no change in the abort_source timeout. Second, a sleep may not be reliable. The patch changes the long sleep inside a dns refresh lambda into condition_variable handling, to properly signal the end of the dns refresh process. Fixes: #26561 Fixes: VECTOR-268 It needs to be backported to 2025.4 Closes scylladb/scylladb#26566 (cherry picked from commit `10208c83ca`) Closes scylladb/scylladb#26598	2025-10-23 11:24:32 +02:00
Piotr Dulikowski	03d57bae80	Merge '[Backport 2025.4] storage_proxy: wait for write handlers destruction' from Scylladb[bot] `shared_ptr<abstract_write_response_handler>` instances are captured in the `lmutate` and `rmutate` lambdas of `send_to_live_endpoints()`. As a result, an `abstract_write_response_handler` object may outlive its removal from the `storage_proxy::_response_handlers` map -> `cancel_all_write_response_handlers()` doesn't actually wait for requests completion -> `sp::drain_on_shutdown()` doesn't guarantee all requests are drained -> `sp::stop_remote()` completes too early and `paxos_store` is destroyed while LWT local writes might still be in progress. In this PR we introduce a `write_handler_destroy_promise` to wait for such pending instances in `cancel_write_handlers()` and `cancel_all_write_response_handlers()` to prevent the `use-after-free`. A better long-term solution might be to replace `shared_ptr` with `unique_ptr` for `abstract_write_response_handler` and use a separate gate to track the `lmutate/rmutate` lambdas. We do not actually need to wait for these lambdas to finish before sending a timeout or error response to the client, as we currently do in `~abstract_write_response_handler`. Fixes scylladb/scylladb#26355 backport: need to be backported to 2025.4 since #26355 is reproduced on LWT over tablets - (cherry picked from commit `bf2ac7ee8b`) - (cherry picked from commit `b269f78fa6`) - (cherry picked from commit `bbcf3f6eff`) - (cherry picked from commit `8925f31596`) Parent PR: #26408 Closes scylladb/scylladb#26658 * github.com:scylladb/scylladb: test_tablets_lwt: add test_lwt_shutdown storage_proxy: wait for write handler destruction storage_proxy: coroutinize cancel_write_handlers storage_proxy: cancel_write_handlers: don't hold a strong pointer to handler	2025-10-23 10:49:52 +02:00
Patryk Jędrzejczak	76560ca095	test: test group0 tombstone GC in the Raft-based recovery procedure We add a regression test for the bug fixed in the previous commits. (cherry picked from commit `c57f097630`)	2025-10-22 17:13:34 +00:00
Patryk Jędrzejczak	8a11535a12	group0_state_id_handler: remove unused group0_server_accessor It became unused in the previous commit. (cherry picked from commit `6b2e003994`)	2025-10-22 17:13:34 +00:00
Patryk Jędrzejczak	d727a086c5	group0_state_id_handler: consider state IDs of all non-ignored topology members It's not enough to consider only the current group 0 members. In the Raft-based recovery procedure, there can be nodes that haven't joined the current group 0 yet, but they have belonged to a different group 0 and thus have a non-empty group 0 state ID. We fix this issue in this commit by considering topology members instead. We don't consider ignored nodes as an optimization. When some nodes are dead, the group 0 state ID handler won't have to wait until all these nodes leave the cluster. It will only have to wait until all these nodes are ignored, which happens at the beginning of the first removenode/replace. As a result, tombstones of group 0 tables will be purged much sooner. We don't rename the `group0_members` variable to keep the change minimal. There seems to be no precise and succinct name for the used set of nodes anyway. We use `std::ranges::join_view` in one place because: - `std::ranges::concat` will become available in C++26, - `boost::range::join` is not a good option, as there is an ongoing effort to minimize external dependencies in Scylla. (cherry picked from commit `1d09b9c8d0`)	2025-10-22 17:13:34 +00:00
Andrei Chekun	d1274f01aa	test.py: rewrite the wait_for_first_completed Rewrite wait_for_first_completed to return only the first completed task, with a guarantee that all cancelled and finished tasks are awaited (and so disappear). Use wait_for_first_completed to avoid falsely passing tests in the future and issues like #26148. Use gather_safely to await tasks and remove the warning that a coroutine was not awaited. Closes scylladb/scylladb#26435 (cherry picked from commit `24d17c3ce5`) Closes scylladb/scylladb#26663 scylla-2025.4.0-rc3-candidate-20251023065946 scylla-2025.4.0-rc3	2025-10-22 18:12:52 +02:00
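One way to get the behaviour described above with plain `asyncio` is sketched below. It is not the test.py implementation: the body is an assumption about how such a helper might look, and it uses `asyncio.gather(..., return_exceptions=True)` where the commit mentions a `gather_safely` helper.

```python
import asyncio


async def wait_for_first_completed(coros):
    """Run all coroutines, return the first finished task, and clean up the rest."""
    tasks = [asyncio.ensure_future(c) for c in coros]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    # Await every task, including the cancelled ones, so that none is left
    # dangling and no "coroutine was never awaited" warning is emitted.
    # return_exceptions=True collects CancelledError instead of raising it.
    await asyncio.gather(*tasks, return_exceptions=True)
    # If several tasks finished in the same loop iteration, pick one of them.
    return next(iter(done))
```

The caller inspects the returned task with `.result()`, which re-raises the task's exception if it failed, so a failing first task is not silently treated as a pass.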
Michael Litvak	aa2065fe2e	storage_service: improve colocated repair error to show table names When requesting repair for tablets of a colocated table, the request fails with an error. Improve the error message to show the table names instead of table IDs, because the table names are more useful for users. Fixes scylladb/scylladb#26567 Closes scylladb/scylladb#26568 (cherry picked from commit `b808d84d63`) Closes scylladb/scylladb#26624	2025-10-22 15:25:15 +02:00
Asias He	5c7eb2ac61	repair: Fix uuid and nodes_down order in the log Fixes #26536 Closes scylladb/scylladb#26547 (cherry picked from commit `33bc1669c4`) Closes scylladb/scylladb#26630	2025-10-22 14:25:18 +02:00
Tomasz Grabiec	0621a8aee5	Merge '[Backport 2025.4] Synchronize tablet split and load-and-stream' from Scylladb[bot] Load-and-stream is broken when running concurrently to the finalization step of tablet split. Consider this: 1) split starts 2) split finalization executes barrier and succeed 3) load-and-stream runs now, starts writing sstable (pre-split) 4) split finalization publishes changes to tablet metadata 5) load-and-stream finishes writing sstable 6) sstable cannot be loaded since it spans two tablets two possible fixes (maybe both): 1) load-and-stream awaits for topology to quiesce 2) perform split compaction on sstable that spans both sibling tablets This patch implements # 1. By awaiting for topology to quiesce, we guarantee that load-and-stream only starts when there's no chance coordinator is handling some topology operation like split finalization. Fixes https://github.com/scylladb/scylladb/issues/26455. - (cherry picked from commit `3abc66da5a`) - (cherry picked from commit `4654cdc6fd`) Parent PR: #26456 Closes scylladb/scylladb#26651 * github.com:scylladb/scylladb: test: Add reproducer for l-a-s and split synchronization issue sstables_loader: Synchronize tablet split and load-and-stream	2025-10-22 14:23:04 +02:00
Jenkins Promoter	10db3f7c85	Update ScyllaDB version to: 2025.4.0-rc3	2025-10-22 14:11:52 +03:00
Michał Jadwiszczak	f6dde0aa4b	test/alternator/test_tablets: add test for GSI backfill with tablets The test should pass without the fix for scylladb/scylladb#26615, because the `executor::updata_table()` uses `service::prepare_new_view_announcement()`, which creates view building tasks for the view. But it's better to add this test. (cherry picked from commit `34503f43a1`)	2025-10-22 10:51:55 +00:00
Michał Jadwiszczak	207c273b29	test/alternator/test_tablets: add reproducer for GSI with tablets (cherry picked from commit `bdab455cbb`)	2025-10-22 10:51:54 +00:00
Michał Jadwiszczak	6df48aacd7	alternator/executor: instantly mark view as built when creating it with base table `CreateTable` request creates GSI/LSI together with the base table, the base table is empty and we don't need to actually build the view. In tablet-based keyspaces we can just don't create view building tasks and mark the view build status as SUCCESS on all nodes. Then, the view building worker on each node will mark the view as built in `system.built_views` (`view_building_worker::update_built_views()`). Vnode-based keyspaces will use the "old" logic of view builder, which will process the view and mark it as built. Fixes scylladb/scylladb#26615 (cherry picked from commit `8fbf122277`)	2025-10-22 10:51:54 +00:00
Pavel Emelyanov	45341ca246	Merge '[Backport 2025.4] s3_client: handle failures which require http::request updating' from Scylladb[bot] Apply two main changes to the s3_client error handling 1. Add a loop to s3_client's `make_request` for the case when the retry strategy will not help since the request itself has to be updated. For example, authentication token expiration or timestamp on the request header 2. Refine the way we handle exceptions in the `chunked_download_source` background fiber, now we carry the original `exception_ptr` and also we wrap EVERY exception in `filler_exception` to prevent retry strategy trying to retry the request altogether Fixes: https://github.com/scylladb/scylladb/issues/26483 Should be ported back to 2025.3 and 2025.4 to prevent deadlocks and failures in these versions - (cherry picked from commit `55fb2223b6`) - (cherry picked from commit `db1ca8d011`) - (cherry picked from commit `185d5cd0c6`) - (cherry picked from commit `116823a6bc`) - (cherry picked from commit `43acc0d9b9`) - (cherry picked from commit `58a1cff3db`) - (cherry picked from commit `1d34657b14`) - (cherry picked from commit `4497325cd6`) - (cherry picked from commit `fdd0d66f6e`) Parent PR: #26527 Closes scylladb/scylladb#26650 * github.com:scylladb/scylladb: s3_client: tune logging level s3_client: add logging s3_client: improve exception handling for chunked downloads s3_client: fix indentation s3_client: add max for client level retries s3_client: remove `s3_retry_strategy` s3_client: support high-level request retries s3_client: just reformat `make_request` s3_client: unify `make_request` implementation	2025-10-22 11:33:53 +03:00
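The first change, a bounded client-level loop that rebuilds the request when a plain resend cannot help, might look roughly like this in Python. Everything here is illustrative: `RequestNeedsUpdate`, `build_request`, `send`, and `max_attempts` are hypothetical names, not part of the s3_client API.

```python
class RequestNeedsUpdate(Exception):
    """Hypothetical error: the request itself is stale (e.g. an expired token)."""


def make_request(build_request, send, max_attempts=3):
    """Send a request, rebuilding it from scratch when it has become stale.

    `build_request` returns a fresh request (new credentials, new timestamp).
    `send` performs the call and raises RequestNeedsUpdate when resending the
    same request would fail again. Other exceptions propagate unchanged.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for _ in range(max_attempts):
        request = build_request()  # regenerate headers on every attempt
        try:
            return send(request)
        except RequestNeedsUpdate as err:
            last_error = err       # stale request: loop and rebuild it
    raise last_error
```

The cap on attempts corresponds to the "add max for client level retries" commit in the series; without it, a permanently failing credential source would loop forever.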
Piotr Dulikowski	1efb2eb174	view_building_worker: access tablet map through erm on sstable discovery Currently, the data returned by `database::get_tables_metadata()` and `database::get_token_metadata()` may not be consistent. Specifically, the tables metadata may contain some tablet-based tables before their tablet maps appear in the token metadata. This is going to be fixed after issue scylladb/scylladb#24414 is closed, but for the time being work around it by accessing the token metadata via `table`->effective_replication_map() - that token metadata is guaranteed to have the tablet map of the `table`. Fixes: scylladb/scylladb#26403 Closes scylladb/scylladb#26588 (cherry picked from commit `f76917956c`) Closes scylladb/scylladb#26631	2025-10-22 11:33:22 +03:00
Pavel Emelyanov	320ef84367	Merge '[Backport 2025.4] compaction/twcs: fix use after free issues' from Scylladb[bot] The `compaction_strategy_state` class holds strategy specific state via a `std::variant` containing different state types. When a compaction strategy performs compaction, it retrieves a reference to its state from the `compaction_strategy_state` object. If the table's compaction strategy is ALTERed while a compaction is in progress, the `compaction_strategy_state` object gets replaced, destroying the old state. This leaves the ongoing compaction holding a dangling reference, resulting in a use after free. Fix this by using `seastar::shared_ptr` for the state variant alternatives(`leveled_compaction_strategy_state_ptr` and `time_window_compaction_strategy_state_ptr`). The compaction strategies now hold a copy of the shared_ptr, ensuring the state remains valid for the duration of the compaction even if the strategy is altered. The `compaction_strategy_state` itself is still passed by reference and only the variant alternatives use shared_ptrs. This allows ongoing compactions to retain ownership of the state independently of the wrapper's lifetime. The method `maybe_wait_for_sstable_count_reduction()`, when retrieving the list of sstables for a possible compaction, holds a reference to the compaction strategy. If the strategy is updated during execution, it can cause a use after free issue. To prevent this, hold a copy of the compaction strategy so it isn’t yanked away during the method’s execution. 
Fixes #25913 Issue probably started after `9d3755f276`, so backport to 2025.4 - (cherry picked from commit `1cd43bce0e`) - (cherry picked from commit `35159e5b02`) - (cherry picked from commit `18c071c94b`) Parent PR: #26593 Closes scylladb/scylladb#26625 * github.com:scylladb/scylladb: compaction: fix use after free when strategy is altered during compaction compaction/twcs: pass compaction_strategy_state to internal methods compaction_manager: hold a copy to compaction strategy in maybe_wait_for_sstable_count_reduction	2025-10-22 11:32:28 +03:00
Petr Gusev	01658f9fcb	test_tablets_lwt: add test_lwt_shutdown (cherry picked from commit `8925f31596`)	2025-10-22 00:10:59 +00:00
Petr Gusev	e56f14b9c5	storage_proxy: wait for write handler destruction shared_ptr<abstract_write_response_handler> instances are captured in the lmutate/rmutate lambdas of send_to_live_endpoints(). As a result, an abstract_write_response_handler object may outlive its removal from the _response_handlers map. We use write_handler_destroy_promise to wait for such pending instances in cancel_write_handlers() and cancel_all_write_response_handlers() to prevent use-after-free. A better long-term solution might be to replace shared_ptr with unique_ptr for abstract_write_response_handler and use a separate gate to track the lmutate/rmutate lambdas. We do not actually need to wait for these lambdas to finish before sending a timeout or error response to the client, as we currently do in ~abstract_write_response_handler. Fixes scylladb/scylladb#26355 (cherry picked from commit `bbcf3f6eff`)	2025-10-22 00:10:59 +00:00
Petr Gusev	5865dad0c9	storage_proxy: coroutinize cancel_write_handlers The cancel_write_handlers() method was assumed to be called in a thread context, likely because it was first used from gossiper events, where a thread context already existed. Later, this method was reused in abort_view_writes() and abort_batch_writes(), where threads are created on the fly and appear redundant. The drain_on_shutdown() method also used a thread, justified by some "delicate lifetime issues", but it is unclear what that actually means. It seems that a straightforward co_await should work just fine. (cherry picked from commit `b269f78fa6`)	2025-10-22 00:10:59 +00:00
Petr Gusev	388dfbe3ee	storage_proxy: cancel_write_handlers: don't hold a strong pointer to handler A strong pointer was held for the duration of thread::yield(), preventing abstract_write_response_handler destruction and possibly delaying the sending of timeout or error responses to the client. This commit removes the strong pointer. Instead, we compute the next iterator before calling timeout_cb(), so if the handler is destroyed inside timeout_cb(), we already have a valid next iterator. (cherry picked from commit `bf2ac7ee8b`)	2025-10-22 00:10:59 +00:00
Raphael S. Carvalho	92a603699e	test: Add reproducer for l-a-s and split synchronization issue Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> (cherry picked from commit `4654cdc6fd`)	2025-10-21 12:26:55 +00:00
Raphael S. Carvalho	d998d9d418	sstables_loader: Synchronize tablet split and load-and-stream Load-and-stream is broken when running concurrently to the finalization step of tablet split. Consider this: 1) split starts 2) split finalization executes barrier and succeed 3) load-and-stream runs now, starts writing sstable (pre-split) 4) split finalization publishes changes to tablet metadata 5) load-and-stream finishes writing sstable 6) sstable cannot be loaded since it spans two tablets two possible fixes (maybe both): 1) load-and-stream awaits for topology to quiesce 2) perform split compaction on sstable that spans both sibling tablets This patch implements #1. By awaiting for topology to quiesce, we guarantee that load-and-stream only starts when there's no chance coordinator is handling some topology operation like split finalization. Fixes #26455. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> (cherry picked from commit `3abc66da5a`)	2025-10-21 12:26:54 +00:00


49943 Commits