scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-30 19:46:48 +00:00

Author	SHA1	Message	Date
Aleksandra Martyniuk	fb2c46dfbe	repair: pass session_id to repair_writer_impl::create_writer (cherry picked from commit `09c74aa294`)	2025-03-19 10:07:00 +01:00
Aleksandra Martyniuk	b4e37600d6	repair: keep materialized topology guard in shard_repair_task_impl Keep materialized topology guard in shard_repair_task_impl and check it in check_in_abort_or_shutdown and before each range repair. (cherry picked from commit `47bb9dcf78`)	2025-03-19 10:04:17 +01:00
Aleksandra Martyniuk	6bbf20a440	repair: pass session_id to repair_meta Pass session_id of tablet repair down the stack from the repair request to repair_meta. The session_id will be utilized in the following patches. (cherry picked from commit `928f92c780`)	2025-03-19 10:02:24 +01:00
Botond Dénes	b8797551eb	Merge '[Backport 2025.1] Rack aware tablet merge colocation migration ' from Tomasz Grabiec service: Introduce rack-aware co-location migrations for tablet merge Merge co-location can emit migrations across racks even when RF=#racks, reducing availability and affecting consistency of base-view pairing. Given replica set of sibling tablets T0 and T1 below: [T0: (rack1,rack3,rack2)] [T1: (rack2,rack1,rack3)] Merge will co-locate T1:rack2 into T0:rack1, so T1 will temporarily be present at only a subset of racks, reducing availability. This is the main problem fixed by this patch. It also lays the ground for consistent base-view replica pairing, which is rack-based. For tables on which views can be created we plan to enforce the constraint that replicas don't move across racks and that all tablets use the same set of racks (RF=#racks). This patch avoids moving replicas across racks unless it's necessary, so if the constraint is satisfied before merge, there will be no co-locating migrations across racks. This constraint of RF=#racks is not enforced yet, it requires more extensive changes. Fixes #22994. Refs #17265. This patch is based on Raphael's work done in PR #23081. The main differences are: 1) Instead of sorting replicas by rack, we try to find replicas in sibling tablets which belong to the same rack. This is similar to how we match replicas within the same host. It reduces the number of across-rack migrations even if RF!=#racks, which the original patch didn't handle. Unlike the original patch, it also avoids rack overload in case RF!=#racks. 2) We emit across-rack co-locating migrations if we have no other choice in order to finalize the merge. This is ok, since views are not supported with tablets yet. Later, we will disallow this for tables which have views, and we will allow creating views in the first place only when no such migrations can happen (RF=#racks).
3) Added boost unit test which checks that rack overload is avoided during merge in case RF<#racks 4) Moved logging of across-rack migration to debug level 5) Exposed metric for across-rack co-locating migrations (cherry picked from commit `af949f3b6a`) Also backports dependent patches: - locator: network_topology_strategy: Fix SIGSEGV when creating a table when there is a rack with no normal nodes - locator: network_topology_startegy: Ignore leaving nodes when computing capacity for new tables - Merge 'test: tablets_test: Create proper schema in load balancer tests' from Tomasz Grabiec Closes scylladb/scylladb#22657 Closes scylladb/scylladb#22652 Closes scylladb/scylladb#23297 * github.com:scylladb/scylladb: service: Introduce rack-aware co-location migrations for tablet merge Merge 'test: tablets_test: Create proper schema in load balancer tests' from Tomasz Grabiec locator: network_topology_startegy: Ignore leaving nodes when computing capacity for new tables locator: network_topology_strategy: Fix SIGSEGV when creating a table when there is a rack with no normal nodes	2025-03-18 16:22:29 +02:00
Nadav Har'El	b1cf1890a9	alternator: document the state of tablet support in Alternator In commit `c24bc3b` we decided that creating a new table in Alternator will by default use vnodes - not tablets - because of all the missing features in our tablets implementation that are important for Alternator, namely - LWT, CDC and Alternator TTL. We never documented this, or the fact that we support a tag `experimental:initial_tablets` which allows overriding this decision and creating an Alternator table using tablets. We also never documented what exactly doesn't work when Alternator uses tablets. This patch adds the missing documentation in docs/alternator/new-apis.md (which is a good place for describing the `experimental:initial_tablets` tag). The patch also adds a new test file, test_tablets.py, which includes tests for all the statements made in the document regarding how `experimental:initial_tablets` works and what works or doesn't work when tablets are enabled. Two existing tests - for TTL and Streams non-support with tablets - are moved to the new test file. When the tablets feature is finally completed, both the document and the tests will need to be modified (some of the tests should be outright deleted). But it seems this will not happen for at least several months, and that is too long to wait without accurate documentation. Fixes #21629 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#22462 (cherry picked from commit `c0821842de`) Closes scylladb/scylladb#23298	2025-03-16 18:25:21 +02:00
Jenkins Promoter	2f0ebe9f49	Update pgo profiles - aarch64	2025-03-15 04:21:14 +02:00
Jenkins Promoter	3633fb9ff8	Update pgo profiles - x86_64	2025-03-15 04:13:25 +02:00
Raphael S. Carvalho	33b5f27057	service: Introduce rack-aware co-location migrations for tablet merge Merge co-location can emit migrations across racks even when RF=#racks, reducing availability and affecting consistency of base-view pairing. Given replica set of sibling tablets T0 and T1 below: [T0: (rack1,rack3,rack2)] [T1: (rack2,rack1,rack3)] Merge will co-locate T1:rack2 into T0:rack1, so T1 will temporarily be present at only a subset of racks, reducing availability. This is the main problem fixed by this patch. It also lays the ground for consistent base-view replica pairing, which is rack-based. For tables on which views can be created we plan to enforce the constraint that replicas don't move across racks and that all tablets use the same set of racks (RF=#racks). This patch avoids moving replicas across racks unless it's necessary, so if the constraint is satisfied before merge, there will be no co-locating migrations across racks. This constraint of RF=#racks is not enforced yet, it requires more extensive changes. Fixes #22994. Refs #17265. This patch is based on Raphael's work done in PR #23081. The main differences are: 1) Instead of sorting replicas by rack, we try to find replicas in sibling tablets which belong to the same rack. This is similar to how we match replicas within the same host. It reduces the number of across-rack migrations even if RF!=#racks, which the original patch didn't handle. Unlike the original patch, it also avoids rack overload in case RF!=#racks. 2) We emit across-rack co-locating migrations if we have no other choice in order to finalize the merge. This is ok, since views are not supported with tablets yet. Later, we will disallow this for tables which have views, and we will allow creating views in the first place only when no such migrations can happen (RF=#racks).
3) Added boost unit test which checks that rack overload is avoided during merge in case RF<#racks 4) Moved logging of across-rack migration to debug level 5) Exposed metric for across-rack co-locating migrations (cherry picked from commit `af949f3b6a`) Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Signed-off-by: Tomasz Grabiec <tgrabiec@scylladb.com>	2025-03-14 20:02:33 +01:00
Anna Stuchlik	11ecc886c3	doc: Remove "experimental" from ALTER KEYSPACE with Tablets Altering a keyspace with tablets is no longer experimental. This commit removes the "Experimental" label from the feature. Fixes https://github.com/scylladb/scylladb/issues/23166 Closes scylladb/scylladb#23183 (cherry picked from commit `562b5db5b8`) Closes scylladb/scylladb#23274	2025-03-14 13:57:55 +01:00
Botond Dénes	eb147ec564	Merge 'test: tablets_test: Create proper schema in load balancer tests' from Tomasz Grabiec This PR converts boost load balancer tests in preparation for load balancer changes which add per-table tablet hints. After those changes, load balancer consults with the replication strategy in the database, so we need to create proper schema in the database. To do that, we need proper topology for replication strategies which use RF > 1, otherwise keyspace creation will fail. Topology is created in tests via group0 commands, which is abstracted by the new `topology_builder` class. Tests cannot modify token_metadata only in memory now as it needs to be consistent with the schema and on-disk metadata. That's why modifications to tablet metadata are now made under group0 guard and save back metadata to disk. Closes scylladb/scylladb#22648 * github.com:scylladb/scylladb: test: tablets: Drop keyspace after do_test_load_balancing_merge_colocation() scenario tests: tablets: Set initial tablets to 1 to exit growing mode test: tablets_test: Create proper schema in load balancer tests test: lib: Introduce topology_builder test: cql_test_env: Expose topology_state_machine topology_state_machine: Introduce lock transition (cherry picked from commit `51a273401c`)	2025-03-13 14:08:30 +01:00
Tomasz Grabiec	637e5fc9b5	locator: network_topology_startegy: Ignore leaving nodes when computing capacity for new tables For example, nodes which are being decommissioned should not be considered as available capacity for new tables. We don't allocate tablets on such nodes. Counting them would result in higher per-shard load than planned. Closes scylladb/scylladb#22657 (cherry picked from commit `3bb19e9ac9`)	2025-03-13 14:08:27 +01:00
Tomasz Grabiec	0d77754c63	locator: network_topology_strategy: Fix SIGSEGV when creating a table when there is a rack with no normal nodes In that case, new_racks will be used, but when we discover no candidates, we try to pop from existing_racks. Fixes #22625 Closes scylladb/scylladb#22652 (cherry picked from commit `e22e3b21b1`)	2025-03-13 14:00:48 +01:00
Benny Halevy	5481c9aedd	docs: document the views-with-tablets experimental feature Refs scylladb/scylladb#22217 Fixes scylladb/scylladb#22893 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#22896 (cherry picked from commit `55dbf5493c`) Closes scylladb/scylladb#23024	2025-03-10 13:26:36 +01:00
Botond Dénes	59db708cba	Merge '[Backport 2025.1] tablets: repair: fix hosts and dcs filters behavior for tablet repair' from Scylladb[bot] If hosts and/or dcs filters are specified for tablet repair and some replicas match these filters, choose the replica that will be the repair master according to round-robin principle (currently it's always the first replica). If hosts and/or dcs filters are specified for tablet repair and no replica matches these filters, the repair succeeds and the repair request is removed (currently an exception is thrown and tablet repair scheduler reschedules the repair forever). Fixes: https://github.com/scylladb/scylladb/issues/23100. Needs backport to 2025.1 that introduces hosts and dcs filters for tablet repair - (cherry picked from commit `9bce40d917`) - (cherry picked from commit `fe4e99d7b3`) - (cherry picked from commit `2b538d228c`) - (cherry picked from commit `c40eaa0577`) - (cherry picked from commit `c7c6d820d7`) Parent PR: #23101 Closes scylladb/scylladb#23109 * github.com:scylladb/scylladb: test: add new cases to tablet_repair tests test: extract repiar check to function locator: add round-robin selection of filtered replicas locator: add tablet_task_info::selected_by_filters service: finish repair successfully if no matching replica found	2025-03-10 12:49:01 +02:00
Botond Dénes	28690f8203	Merge '[Backport 2025.1] repair: Introduce Host and DC filter support' from Scylladb[bot] Currently, the tablet repair scheduler repairs all replicas of a tablet. It does not support hosts or DCs selection. It should be enough for most cases. However, users might still want to limit the repair to certain hosts or DCs in production. https://github.com/scylladb/scylladb/pull/21985 added the preparation work to add the config options for the selection. This patch adds the hosts or DCs selection support. Fixes https://github.com/scylladb/scylladb/issues/22417 New feature. No backport is needed. - (cherry picked from commit `4c75701756`) - (cherry picked from commit `5545289bfa`) - (cherry picked from commit `1c8a41e2dd`) - (cherry picked from commit `e499f7c971`) Parent PR: #22621 Closes scylladb/scylladb#23080 * github.com:scylladb/scylladb: test: add test to check dcs and hosts repair filter test: add repair dc selection to test_tablet_metadata_persistence repair: Introduce Host and DC filter support docs: locator: update the docs and formatter of tablet_task_info	2025-03-10 12:48:49 +02:00
Anna Stuchlik	235c859b98	doc: zero-token nodes and Arbiter DC This commit adds documentation for zero-token nodes and an explanation of how to use them to set up an arbiter DC to prevent a quorum loss in multi-DC deployments. The commit adds two documents: - The one in Architecture describes zero-token nodes. - The other in Cluster Management explains how to use them. We need separate documents because zero-token nodes may be used for other purposes in the future. In addition, the documents are cross-linked, and the link is added to the Create a ScyllaDB Cluster - Multi Data Centers (DC) document. Refs https://github.com/scylladb/scylladb/pull/19684 Fixes https://github.com/scylladb/scylladb/issues/20294 Closes scylladb/scylladb#21348 (cherry picked from commit `9ac0aa7bba`) Closes scylladb/scylladb#23201	2025-03-10 10:59:07 +01:00
Anna Stuchlik	5453e85f39	doc: remove the reference to the 6.2 version This commit removes the OSS version name, which is irrelevant and confusing for 2025.1 and later users. Also, it updates the warning to avoid specifying the release when the deprecated feature will be removed. Fixes https://github.com/scylladb/scylladb/issues/22839 Closes scylladb/scylladb#22936 (cherry picked from commit `d0a48c5661`) Closes scylladb/scylladb#23022	2025-03-07 12:53:42 +02:00
Anna Stuchlik	7a6bcb3a3f	doc: remove references to Enterprise This commit removes the redundant references to Enterprise, which are no longer valid. Fixes https://github.com/scylladb/scylladb/issues/22927 Closes scylladb/scylladb#22930 (cherry picked from commit `a28bbc22bd`) Closes scylladb/scylladb#22963	2025-03-07 12:53:22 +02:00
Anna Stuchlik	8b2a382eb6	doc: add support for Ubuntu 24.04 in 2024.1 Fixes https://github.com/scylladb/scylladb/issues/22841 Refs https://github.com/scylladb/scylla-enterprise/issues/4550 Closes scylladb/scylladb#22843 (cherry picked from commit `439463dbbf`) Closes scylladb/scylladb#23092	2025-03-07 12:51:13 +02:00
Dusan Malusev	cdd51d8b7a	docs: add instruction for installing cassandra-stress Signed-off-by: Dusan Malusev <dusan.malusev@scylladb.com> Closes scylladb/scylladb#21723 (cherry picked from commit `4e6ea232d2`) Closes scylladb/scylladb#22947	2025-03-07 11:48:46 +02:00
Anna Stuchlik	88a8d140b3	doc: add information about tablets limitation to the CQL page This commit adds a link to the Limitations section on the Tablets page to the CQL page, the tablets option. This is actually the place where the user will need the information: when creating a keyspace. In addition, I've reorganized the section for better readability (otherwise, the section about limitations was easy to miss) and moved the section up on the page. Note that I've moved the updated content from the `_common` folder (which I deleted) to the .rst page - we no longer split OSS and Enterprise, so there's no need to keep using the `scylladb_include_flag` directive to include OSS- and Ent-specific content. Fixes https://github.com/scylladb/scylladb/issues/22892 Fixes https://github.com/scylladb/scylladb/issues/22940 Closes scylladb/scylladb#22939 (cherry picked from commit `0999fad279`) Closes scylladb/scylladb#23091	2025-03-07 11:48:07 +02:00
Aleksandra Martyniuk	1957dac2b4	test: add new cases to tablet_repair tests Add tests for tablet repair with host and dc filters that select one or no replica. (cherry picked from commit `c7c6d820d7`)	2025-03-05 10:59:00 +01:00
Aleksandra Martyniuk	1091ef89e1	test: extract repiar check to function (cherry picked from commit `c40eaa0577`)	2025-03-05 10:59:00 +01:00
Aleksandra Martyniuk	b081e07ffa	locator: add round-robin selection of filtered replicas (cherry picked from commit `2b538d228c`)	2025-03-05 10:58:59 +01:00
Aleksandra Martyniuk	1f102ca2f7	locator: add tablet_task_info::selected_by_filters Extract dcs and hosts filters check to a method. (cherry picked from commit `fe4e99d7b3`)	2025-03-05 10:36:51 +01:00
Aleksandra Martyniuk	8a98f0d5b6	service: finish repair successfully if no matching replica found If hosts and/or dcs filters are specified for tablet repair and no replica matches these filters, an exception is thrown. The repair fails and tablet repair scheduler reschedules it forever. Such a repair should actually succeed (as all specified replicas were repaired) and the repair request should be removed. Treat the repair as successful if the filters were specified and selected no replica. (cherry picked from commit `9bce40d917`)	2025-03-05 10:36:50 +01:00
Anna Stuchlik	cdae92065b	doc: add the 2025.1 upgrade guides and reorganize the upgrade section This commit adds the upgrade guides relevant in version 2025.1: - From 6.2 to 2025.1 - From 2024.x to 2025.1 It also removes the upgrade guides that are not relevant in 2025.1 source available: - Open Source upgrade guides - From Open Source to Enterprise upgrade guides - Links to the Enterprise upgrade guides Also, as part of this PR, the remaining relevant content has been moved to the new About Upgrade page. WHAT NEEDS TO BE REVIEWED - Review the instructions in the 6.2-to-2025.1 guide - Review the instructions in the 2024.x-to-2025.1 guide - Verify that there are no references to Open Source and Enterprise. The scope of this PR does not have to include metrics - the info can be added in a follow-up PR. Fixes https://github.com/scylladb/scylladb/issues/22208 Fixes https://github.com/scylladb/scylladb/issues/22209 Fixes https://github.com/scylladb/scylladb/issues/23072 Fixes https://github.com/scylladb/scylladb/issues/22346 Closes scylladb/scylladb#22352 (cherry picked from commit `850aec58e0`) Closes scylladb/scylladb#23106	2025-03-04 08:15:08 +02:00
Jenkins Promoter	4813c48d64	Update pgo profiles - aarch64	2025-03-01 04:23:19 +02:00
Jenkins Promoter	b623b108c3	Update pgo profiles - x86_64	2025-03-01 04:05:24 +02:00
Aleksandra Martyniuk	7fdc7bdc4b	test: add test to check dcs and hosts repair filter (cherry picked from commit `e499f7c971`)	2025-02-27 12:14:47 +01:00
Aleksandra Martyniuk	c2e926850d	test: add repair dc selection to test_tablet_metadata_persistence (cherry picked from commit `1c8a41e2dd`)	2025-02-27 12:14:47 +01:00
Asias He	6d5b029812	repair: Introduce Host and DC filter support Currently, the tablet repair scheduler repairs all replicas of a tablet. It does not support hosts or DCs selection. It should be enough for most cases. However, users might still want to limit the repair to certain hosts or DCs in production. #21985 added the preparation work to add the config options for the selection. This patch adds the hosts or DCs selection support. Fixes #22417 (cherry picked from commit `5545289bfa`)	2025-02-27 12:14:44 +01:00
Aleksandra Martyniuk	ffeb55cf77	docs: locator: update the docs and formatter of tablet_task_info (cherry picked from commit `4c75701756`)	2025-02-26 23:49:50 +00:00
Jenkins Promoter	37aa7c216c	Update ScyllaDB version to: 2025.1.0-rc4	2025-02-25 21:33:18 +02:00
Gleb Natapov	0b0e9f0c32	treewide: include build_mode.hh for SCYLLA_BUILD_MODE_RELEASE where it is missing Fixes: #22914 Closes scylladb/scylladb#22915 (cherry picked from commit `914c9f1711`) Closes scylladb/scylladb#22962	2025-02-25 18:12:54 +03:00
Evgeniy Naydanov	871fabd60a	test.py: test_random_failures: improve handling of hung node In some cases the paused/unpaused node can hang not after 30s timeout. This makes the test flaky. Change the condition to always check the coordinator's log if there is a hung node. Add `stop_after_streaming` to the list of error injections which can cause a node to hang. Also add a wait for a new coordinator election in cluster events which cause such elections. Closes scylladb/scylladb#22825 (cherry picked from commit `99be9ac8d8`) Closes scylladb/scylladb#23007	2025-02-25 14:31:51 +03:00
Pavel Emelyanov	aa5cb15166	Merge 'Alternator: implement UpdateTable operation to add or delete GSI' from Nadav Har'El In this series we implement the UpdateTable operation to add a GSI to an existing table, or remove a GSI from a table. As the individual commit messages will explain, this required changing how Alternator stores materialized view keys - instead of insisting that these keys must be real columns (that is not the case when adding a GSI to an existing table), the materialized view can now take as its key any Alternator attribute serialized inside the ":attrs" map holding all non-key attributes. Fixes #11567. We also fix the IndexStatus and Backfilling attributes returned by DescribeTable - as DynamoDB API users use this API to discover when a newly added GSI completed its "backfilling" (what we call "view building") stage. Fixes #11471. This series should not be backported lightly - it's a new feature and required fairly large and intrusive changes that can introduce bugs to use cases that don't even use Alternator or its UpdateTable operations - every user of CQL materialized views or secondary indexes, as well as Alternator GSI or LSI, will use modified code. It should be backported to 2025.1, though - this version was actually branched long after this PR was sent, and it provides a feature that was promised for 2025.1.
Closes scylladb/scylladb#21989 * github.com:scylladb/scylladb: alternator: fix view build on oversized GSI key attribute mv: clean up do_delete_old_entry test/alternator: unflake test for IndexStatus test/alternator: work around unrelated bug causing test flakiness docs/alternator: adding a GSI is no longer an unimplemented feature test/alternator: remove xfail from all tests for issue 11567 alternator: overhaul implementation of GSIs and support UpdateTable mv: support regular_column_transformation key columns in view alternator: add new materialized-view computed column for item in map build: in cmake build, schema needs alternator build: build tests with Alternator alternator: add function serialized_value_if_type() mv: introduce regular_column_transformation, a new type of computed column alternator: add IndexStatus/Backfilling in DescribeTable alternator: add "LimitExceededException" error type docs/alternator: document two more unimplemented Alternator features (cherry picked from commit `529ff3efa5`) Closes scylladb/scylladb#22826 scylla-2025.1.0-rc3-candidate-20250224022223 scylla-2025.1.0-rc3	2025-02-18 19:05:21 +02:00
Jenkins Promoter	13d79ba990	Update ScyllaDB version to: 2025.1.0-rc3	2025-02-18 15:06:57 +02:00
Nadav Har'El	35b410326b	test/topology_custom: fix very slow test test_localnodes_broadcast_rpc_address The test topology_custom/test_alternator::test_localnodes_broadcast_rpc_address sets up nodes with a silly "broadcast rpc address" and checks that Alternator's "/localnodes" request returns it correctly. The problem is that although we don't use CQL in this test, the test framework does open a CQL connection when the test starts, and closes it when it ends. It turns out that when we set a silly "broadcast RPC address", the driver tends to try to connect to it when shutting down, I'm not even sure why. But the choice of 1.2.3.4 as the silly address is unfortunate, because this IP address is actually routable - and the driver hangs until it times out (in practice, in a bit over two minutes). This trivial patch changes 1.2.3.4 to 127.0.0.0 - an equally silly address but one to which connections fail immediately. Before this patch, the test often takes more than 2 minutes to finish on my laptop, after this patch, it always finishes in 4-5 seconds. Fixes #22744 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#22746 (cherry picked from commit `f89235517d`) Closes scylladb/scylladb#22875	2025-02-18 10:33:21 +02:00
Botond Dénes	12a3fcceae	Merge '[Backport 2025.1] sstable_loader: fix cross-shard resource cleanup in download_task_impl ' from Scylladb[bot] This PR addresses two related issues in our task system: 1. Prepares for asynchronous resource cleanup by converting release_resources() to a coroutine. This refactoring enables future improvements in how we handle resource cleanup. 2. Fixes a cross-shard resource cleanup issue in the SSTable loader where destruction of per-shard progress elements could trigger "shared_ptr accessed on non-owner cpu" errors in multi-shard environments. The fix uses coroutines to ensure resources are released on their owner shards. Fixes #22759 --- this change addresses a regression introduced by `d815d7013c`, which is contained by 2025.1 and master branches. so it should be backported to 2025.1 branch. - (cherry picked from commit `4c1f1baab4`) - (cherry picked from commit `b448fea260`) Parent PR: #22791 Closes scylladb/scylladb#22871 * github.com:scylladb/scylladb: sstable_loader: fix cross-shard resource cleanup in download_task_impl tasks: make release_resources() a coroutine	2025-02-18 10:32:48 +02:00
Gleb Natapov	040c59674a	api: initialize token metadata API after starting the gossiper Token metadata API now depends on gossiper to do ip to host id mappings, so initialize it after the gossiper is initialized and de-initialize it before gossiper is stopped. Fixes: scylladb/scylladb#22743 Closes scylladb/scylladb#22760 (cherry picked from commit `d288d79d78`) Closes scylladb/scylladb#22854	2025-02-18 10:32:24 +02:00
Asias He	b50a6657e8	repair: Add await_completion option for tablet_repair api Set true to wait for the repair to complete. Set false to skip waiting for the repair to complete. When the option is not provided, it defaults to false. It is useful for management tools that want the api to be async. Fixes #22418 Closes scylladb/scylladb#22436 (cherry picked from commit `fb318d0c81`) Closes scylladb/scylladb#22851	2025-02-18 10:31:53 +02:00
Botond Dénes	93479ffcf9	Merge '[Backport 2025.1] raft/group0_state_machine: load current RPC compression dict on startup' from Michał Chojnowski We are supposed to be loading the most recent RPC compression dictionary on startup, but we forgot to port the relevant piece of logic during the source-available port. This causes a restarted node not to use the dictionary for RPC compression until the next dictionary update. Fix that. Fixes #22738 This is more of a bugfix than an improvement, so it should be backported to 2025.1. * (cherry picked from commit `dd82b40186`) * (cherry picked from commit `8fb2ea61ba`) Additionally cherry picked https://github.com/scylladb/scylladb/pull/22836 to fix the timeout. Parent PR: #22739 Closes scylladb/scylladb#22837 * github.com:scylladb/scylladb: test_rpc_compression.py: fix an overly-short timeout test_rpc_compression.py: test the dictionaries are loaded on startup raft/group0_state_machine: load current RPC compression dict on startup	2025-02-18 10:31:23 +02:00
Botond Dénes	38bd74b2d4	tools/scylla-nodetool: netstats: don't assume both senders and receivers The code currently assumes that a session has both sender and receiver streams, but it is possible to have just one or the other. Change the test to include this scenario and remove this assumption from the code. Fixes: #22770 Closes scylladb/scylladb#22771 (cherry picked from commit `87e8e00de6`) Closes scylladb/scylladb#22874	2025-02-17 14:34:36 +02:00
Takuya ASADA	6ee1779578	dist: fix upgrade error from 2024.1 We need to allow replacing nodetool from scylla-enterprise-tools < 2024.2, just like we did for scylla-tools < 5.5. This is required to make packages able to upgrade from 2024.1. Fixes #22820 Closes scylladb/scylladb#22821 (cherry picked from commit `b5e306047f`) Closes scylladb/scylladb#22867 scylla-2025.1.0-rc2-candidate-20250217070033 scylla-2025.1.0-rc2	2025-02-16 14:47:48 +02:00
Kefu Chai	9fe2301647	sstable_loader: fix cross-shard resource cleanup in download_task_impl Previously, download_task_impl's destructor would destroy per-shard progress elements on whatever shard the task was destroyed on. In multi-shard environments, this caused "shared_ptr accessed on non-owner cpu" errors when attempting to free memory allocated on a different shard. Fix by: - Convert progress_per_shard into a sharded service - Stop the service on owner shards during cleanup using coroutines - Add operator+= to stream_progress to leverage seastar's built-in adder instead of a custom adder struct Alternative approaches considered: 1. Using foreign_ptr: Rejected as it would require interface changes that complicate stream delegation. foreign_ptr manages the underlying pointee with another smart pointer but does not expose the smart pointer instance in its APIs, making it impossible to use shared_ptr<stream_progress> in the interface. 2. Using vector<stream_progress>: Rejected for similar interface compatibility reasons. This solution maintains the existing interfaces while ensuring proper cross-shard cleanup. Fixes scylladb/scylladb#22759 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> (cherry picked from commit `b448fea260`)	2025-02-15 22:46:43 +00:00
Kefu Chai	6b27459de3	tasks: make release_resources() a coroutine Convert tasks::task_manager::task::impl::release_resources() to a coroutine to prepare for upcoming changes that will implement asynchronous resource release. This is a preparatory refactoring that enables future coroutine-based implementation of resource cleanup logic. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> (cherry picked from commit `4c1f1baab4`)	2025-02-15 22:46:43 +00:00
Jenkins Promoter	48130ca2e9	Update pgo profiles - aarch64	2025-02-15 04:20:15 +02:00
Jenkins Promoter	5054087f0b	Update pgo profiles - x86_64	2025-02-15 04:05:06 +02:00
Botond Dénes	889fb9c18b	Update tools/java submodule * tools/java 807e991d...6dfe728a (1): > dist: support smooth upgrade from enterprise to source availalbe Fixes: scylladb/scylladb#22820	2025-02-14 11:14:07 +02:00

46422 Commits