scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-24 10:30:38 +00:00

Author	SHA1	Message	Date
Andrzej Jackowski	2edd87f2e1	test: remove unneeded semicolons from python test (cherry picked from commit `e63cfc38b3`)	2026-02-16 16:23:36 +00:00
Jenkins Promoter	5abc2fea9f	Update pgo profiles - aarch64	2026-02-15 05:01:51 +02:00
Pawel Pery	f4b79c1b1d	Revert "Merge 'vector_search: add validator tests' from Pawel Pery" This reverts commit `bcd1758911`, reversing changes made to `b2c2a99741`. There is a design decision not to introduce an additional test orchestration tool for scylladb.git (see comments for #27499). One commit has already been reverted in `55c7bc7`. Recent CI runs made the validator test flaky, so it is time to remove all remaining validator tests. It needs a backport to 2026.1 to remove remaining validator tests from there. Fixes: VECTOR-497 Closes scylladb/scylladb#28568 (cherry picked from commit `81d11a23ce`) Closes scylladb/scylladb#28577	2026-02-09 15:16:40 +02:00
Michał Hudobski	f633f57163	auth: add CDC streams and timestamps to vector search permissions It turns out that the cdc driver requires permissions to two additional system tables. This patch adds them to VECTOR_SEARCH_INDEXING and modifies the unit tests. The integration with vector store was tested manually, integration tests will be added in vector-store repository in a follow up PR. Fixes: SCYLLADB-522 Closes scylladb/scylladb#28519 (cherry picked from commit `6b9fcc6ca3`) Closes scylladb/scylladb#28538	2026-02-05 10:31:39 +01:00
Jenkins Promoter	09ed4178a6	Update ScyllaDB version to: 2026.1.0-rc2	2026-02-03 18:15:05 +02:00
Patryk Jędrzejczak	2bf7a0f65e	Merge '[Backport 2026.1] storage_service: set up topology properly in maintenance mode' from Scylladb[bot] We currently make the local node the only token owner (that owns the whole ring) in maintenance mode, but we don't update the topology properly. The node is present in the topology, but in the `none` state. That's how it's inserted by `tm.get_topology().set_host_id_cfg(host_id);` in `scylla_main`. As a result, the node started in maintenance mode crashes in the following way in the presence of a vnodes-based keyspace with the NetworkTopologyStrategy: ``` scylla: locator/network_topology_strategy.cc:207: locator::natural_endpoints_tracker::natural_endpoints_tracker( const token_metadata &, const network_topology_strategy::dc_rep_factor_map &): Assertion `!_token_owners.empty() && !_racks.empty()' failed. ``` Both `_token_owners` and `_racks` are empty. The reason is that `_tm.get_datacenter_token_owners()` and `_tm.get_datacenter_racks_token_owners()` called above filter out nodes in the `none` state. This bug basically made maintenance mode unusable in customer clusters. We fix it by changing the node state to `normal`. We also extend `test_maintenance_mode` to provide a reproducer for Fixes #27988 This PR must be backported to all branches, as maintenance mode is currently unusable everywhere. 
- (cherry picked from commit `a08c53ae4b`) - (cherry picked from commit `9d4a5ade08`) - (cherry picked from commit `c92962ca45`) - (cherry picked from commit `408c6ea3ee`) - (cherry picked from commit `53f58b85b7`) - (cherry picked from commit `867a1ca346`) - (cherry picked from commit `6c547e1692`) - (cherry picked from commit `7e7b9977c5`) Parent PR: #28322 Closes scylladb/scylladb#28499 * https://github.com/scylladb/scylladb: test: test_maintenance_mode: enable maintenance mode properly test: test_maintenance_mode: shutdown cluster connections test: test_maintenance_mode: run with different keyspace options test: test_maintenance_mode: check that group0 is disabled by creating a keyspace test: test_maintenance_mode: get rid of the conditional skip test: test_maintenance_mode: remove the redundant value from the query result storage_proxy: skip validate_read_replica in maintenance mode storage_service: set up topology properly in maintenance mode	2026-02-03 10:39:41 +01:00
Patryk Jędrzejczak	b62e1b405b	test: test_maintenance_mode: enable maintenance mode properly The same issue as the one fixed in `394207fd69`. This one didn't cause real problems, but it's still cleaner to fix it. (cherry picked from commit `7e7b9977c5`)	2026-02-02 17:02:16 +00:00
Patryk Jędrzejczak	f3d2a16e66	test: test_maintenance_mode: shutdown cluster connections Leaked connections are known to cause inter-test issues. (cherry picked from commit `6c547e1692`)	2026-02-02 17:02:16 +00:00
Patryk Jędrzejczak	eee99ebb3d	test: test_maintenance_mode: run with different keyspace options We extend the test to provide a reproducer for #27988 and to avoid similar bugs in the future. The test slows down from ~14s to ~19s on my local machine in dev mode. It seems reasonable. (cherry picked from commit `867a1ca346`)	2026-02-02 17:02:16 +00:00
Patryk Jędrzejczak	c248744c5a	test: test_maintenance_mode: check that group0 is disabled by creating a keyspace In the following commit, we make the rest run with multiple keyspaces, and the old check becomes inconvenient. We also move it below to the part of the code that won't be executed for each keyspace. Additionally, we check if the error message is as expected. (cherry picked from commit `53f58b85b7`)	2026-02-02 17:02:16 +00:00
Patryk Jędrzejczak	4ba3c08d45	test: test_maintenance_mode: get rid of the conditional skip This skip has already caused trouble. After `0668c642a2`, the skip was always hit, and the test was silently doing nothing. This made us miss #26816 for a long time. The test was fixed in `222eab45f8`, but we should get rid of the skip anyway. We increase the number of writes from 256 to 1000 to make the chance of not finding the key on server A even lower. If that still happens, it must be due to a bug, so we fail the test. We also make the test insert rows until server A is a replica of one row. The expected number of inserted rows is a small constant, so it should, in theory, make the test faster and cleaner (we need one row on server A, so we insert exactly one such row). It's possible to make the test fully deterministic, by e.g., hardcoding the key and tokens of all nodes via `initial_token`, but I'm afraid it would make the test "too deterministic" and could hide a bug. (cherry picked from commit `408c6ea3ee`)	2026-02-02 17:02:16 +00:00
Patryk Jędrzejczak	c8c21cc29c	test: test_maintenance_mode: remove the redundant value from the query result (cherry picked from commit `c92962ca45`)	2026-02-02 17:02:16 +00:00
Patryk Jędrzejczak	e95689c96b	storage_proxy: skip validate_read_replica in maintenance mode In maintenance mode, the local node adds only itself to the topology. However, the effective replication map of a keyspace with tablets enabled contains all tablet replicas. It gets them from the tablets map, not the topology. Hence, `network_topology_strategy::sanity_check_read_replicas` hits ``` throw std::runtime_error(format("Requested location for node {} not in topology. backtrace {}", id, lazy_backtrace())); ``` for tablet replicas other than the local node. As a result, all requests to a keyspace with tablets enabled and RF > 1 fail in debug mode (`validate_read_replica` does nothing in other modes). We don't want to skip maintenance mode tests in debug mode, so we skip the check in maintenance mode. We move the `is_debug_build()` check because: - `validate_read_replicas` is a static function with no access to the config, - we want the `!_db.local().get_config().maintenance_mode()` check to be dropped by the compiler in non-debug builds. We also suppress `-Wunneeded-internal-declaration` with `[[maybe_unused]]`. (cherry picked from commit `9d4a5ade08`)	2026-02-02 17:02:15 +00:00
Patryk Jędrzejczak	6094f4b7b2	storage_service: set up topology properly in maintenance mode We currently make the local node the only token owner (that owns the whole ring) in maintenance mode, but we don't update the topology properly. The node is present in the topology, but in the `none` state. That's how it's inserted by `tm.get_topology().set_host_id_cfg(host_id);` in `scylla_main`. As a result, the node started in maintenance mode crashes in the following way in the presence of a vnodes-based keyspace with the NetworkTopologyStrategy: ``` scylla: locator/network_topology_strategy.cc:207: locator::natural_endpoints_tracker::natural_endpoints_tracker( const token_metadata &, const network_topology_strategy::dc_rep_factor_map &): Assertion `!_token_owners.empty() && !_racks.empty()' failed. ``` Both `_token_owners` and `_racks` are empty. The reason is that `_tm.get_datacenter_token_owners()` and `_tm.get_datacenter_racks_token_owners()` called above filter out nodes in the `none` state. This bug basically made maintenance mode unusable in customer clusters. We fix it by changing the node state to `normal`. We also update its rack, datacenter, and shards count. Rack and datacenter are present in the topology somehow, but there is nothing wrong with updating them again. The shard count is also missing, so we better update it to avoid other issues. Fixes #27988 (cherry picked from commit `a08c53ae4b`)	2026-02-02 17:02:15 +00:00
Avi Kivity	ad64dc7c01	Merge '[Backport 2026.1] load_stats: fix problem with load_stats refresh throwing no_such_column_family' from Scylladb[bot] When the topology coordinator refreshes load_stats, it caches load_stats for every node. In case the node becomes unresponsive, and fresh load_stats can not be read from the node, the cached version of load_stats will be used. This is to allow the load balancer to have at least some information about the table sizes and disk capacities of the host. During load_stats refresh, we aggregate the table sizes from all the nodes. This procedure calls db.find_column_family() for each table_id found in load_stats. This function will throw if the table is not found. This will cause load_stats refresh to fail. It is also possible for a table to have been dropped between the time load_stats has been prepared on the host, and the time it is processed on the topology coordinator. This would also cause an exception in the refresh procedure. This fixes this problem by checking if the table still exists. Fixes: #28359 - (cherry picked from commit `71be10b8d6`) - (cherry picked from commit `92dbde54a5`) Parent PR: #28440 Closes scylladb/scylladb#28471 * github.com:scylladb/scylladb: test: add test and reproducer for load_stats refresh exception load_stats: handle dropped tables when refreshing load_stats	2026-02-01 13:51:31 +02:00
Jenkins Promoter	bafd185087	Update pgo profiles - aarch64 scylla-2026.1.0-rc1-candidate-20260201021518 scylla-2026.1.0-rc1	2026-02-01 05:06:48 +02:00
Jenkins Promoter	07d1f8f48a	Update pgo profiles - x86_64	2026-02-01 04:20:45 +02:00
Ferenc Szili	523d529d27	test: add test and reproducer for load_stats refresh exception This patch adds a test and reproducer for the issue where the load_stats refresh procedure throws exceptions if any of the tables have been dropped since load_stats was produced. (cherry picked from commit `92dbde54a5`)	2026-02-01 00:34:26 +00:00
Ferenc Szili	c8dbd43ed5	load_stats: handle dropped tables when refreshing load_stats When the topology coordinator refreshes load_stats, it caches load_stats for every node. In case the node becomes unresponsive, and fresh load_stats can not be read from the node, the cached version of load_stats will be used. This is to allow the load balancer to have at least some information about the table sizes and disk capacities of the host. During load_stats refresh, we aggregate the table sizes from all the nodes. This procedure calls db.find_column_family() for each table_id found in load_stats. This function will throw if the table is not found. This will cause load_stats refresh to fail. It is also possible for a table to have been dropped between the time load_stats has been prepared on the host, and the time it is processed on the topology coordinator. This would also cause an exception in the refresh procedure. This patch fixes this problem by checking if the table still exists. (cherry picked from commit `71be10b8d6`)	2026-02-01 00:34:26 +00:00
Botond Dénes	0cf9f41649	Merge '[Backport 2026.1] docs: add documentation for automatic repair' from Scylladb[bot] Explain what automatic repair is and how to configure it. While at it, improve the existing repair documentation a bit. Fixes: SCYLLADB-130 This PR missed the 2026.1 branch date, so it needs backport to 2026.1, where the auto repair feature debuts. - (cherry picked from commit `a84b1b8b78`) - (cherry picked from commit `57b2cd2c16`) - (cherry picked from commit `1713d75c0d`) Parent PR: #28199 Closes scylladb/scylladb#28424 * github.com:scylladb/scylladb: docs: add feature page for automatic repair docs: inter-link incremental-repair and repair documents docs: incremental-repair: fix curl example	2026-01-30 16:01:03 +02:00
Botond Dénes	dc89e2ea37	Merge '[Backport 2026.1] test: test_alternator_proxy_protocol: fix race between node startup and test start' from Scylladb[bot] test_alternator_proxy_protocol starts a node and connects via the alternator ports. Starting a node, by default, waits until the CQL ports are up. This does not guarantee that the alternator ports are up (they will be up very soon after this), so there is a short window where a connection to the alternator ports will fail. Fix by adding a ServerUpState=SERVING mode, which waits for the node to report to its supervisor (systemd, which we are pretending to be) that its ports are open. The test is then adjusted to request this new ServerUpState. Fixes #28210 Fixes #28211 Flaky tests are only in master and branch-2026.1, so backporting there. - (cherry picked from commit `ebac810c4e`) - (cherry picked from commit `59f2a3ce72`) Parent PR: #28291 Closes scylladb/scylladb#28443 * github.com:scylladb/scylladb: test: test_alternator_proxy_protocol: wait for the node to report itself as serving test: cluster_manager: add ability to wait for supervisor STATUS=serving	2026-01-30 15:59:09 +02:00
Tomasz Grabiec	797f56cb45	Merge '[Backport 2026.1] Improve load balancer logging and other minor cleanups' from Scylladb[bot] Contains various improvements to tablet load balancer. Batched together to save on the bill for CI. Most notably: - Make plan summary more concise, and print info only about present elements. - Print rack name in addition to DC name when making a per-rack plan - Print "Not possible to achieve balance" only when this is the final plan with no active migrations - Print per-node stats when "Not possible to achieve balance" is printed - amortize metrics lookup cost - avoid spamming logs with per-node "Node {} does not have complete tablet stats, ignoring" Backport to 2026.1: since the changes enhance debuggability and are relatively low risk Fixes #28423 Fixes #28422 - (cherry picked from commit `32b336e062`) - (cherry picked from commit `df32318f66`) - (cherry picked from commit `f2b0146f0f`) - (cherry picked from commit `0d090aa47b`) - (cherry picked from commit `12fdd205d6`) - (cherry picked from commit `615b86e88b`) - (cherry picked from commit `7228bd1502`) - (cherry picked from commit `4a161bff2d`) - (cherry picked from commit `ef0e9ad34a`) - (cherry picked from commit `9715965d0c`) - (cherry picked from commit `8e831a7b6d`) Parent PR: #28337 Closes scylladb/scylladb#28428 * github.com:scylladb/scylladb: tablets: tablet_allocator.cc: Convert tabs to spaces tablets: load_balancer: Warn about incomplete stats once for all offending nodes tablets: load_balancer: Improve node stats printout tablets: load_balancer: Warn about imbalance only when there are no more active migrations tablets: load_balancer: Extract print_node_stats() tablet: load_balancer: Use empty() instead of size() where applicable tablets: Fix redundancy in migration_plan::empty() tablets: Cache pointer to stats during plan-making tablets: load_balancer: Print rack in addition to DC when giving context tablets: load_balancer: Make plan summary concise tablets: load_balancer: 
Move "tablet_migration_bypass" injection point to make_plan()	2026-01-30 14:08:34 +01:00
Pawel Pery	be1d418bc0	vector_search: allow full secondary indexes syntax while creating the vector index Vector Search feature needs to support creating vector indexes with additional filtering column. There will be two types of indexes: global which indexes vectors per table, and local which indexes vectors per partition key. The new syntaxes are based on ScyllaDB's Global Secondary Index and Local Secondary Index. Vector indexes don't use secondary indexes functionalities in any way - all indexing, filtering and processing data will be done on Vector Store side. This patch allows creating vector indexes using this CQL syntax: ``` CREATE TABLE IF NOT EXISTS cycling.comments_vs ( commenter text, comment text, comment_vector VECTOR <FLOAT, 5>, created_at timestamp, discussion_board_id int, country text, lang text, PRIMARY KEY ((commenter, discussion_board_id), created_at) ); CREATE CUSTOM INDEX IF NOT EXISTS global_ann_index ON cycling.comments_vs(comment_vector, country, lang) USING 'vector_index' WITH OPTIONS = { 'similarity_function': 'DOT_PRODUCT' }; CREATE CUSTOM INDEX IF NOT EXISTS local_ann_index ON cycling.comments_vs((commenter, discussion_board_id), comment_vector, country, lang) USING 'vector_index' WITH OPTIONS = { 'similarity_function': 'DOT_PRODUCT' }; ``` Currently, if we run these queries to create indexes we will receive such errors: ``` InvalidRequest: Error from server: code=2200 [Invalid query] message="Vector index can only be created on a single column" InvalidRequest: Error from server: code=2200 [Invalid query] message="Local index definition must contain full partition key only. Redundant column: XYZ" ``` This commit refactors `vector_index::check_target` to correctly validate columns building the index. Vector-store currently support filtering by native types, so the type of columns is checked. The first column from the list must be a vector (to build index based on these vectors), so it is also checked. 
Allowed types for columns are native types, excluding counter (it is not possible to create a table with counter and vector) and duration (durations cannot be compared correctly; this type is not even allowed in secondary indexes). This commit adds a cqlpy test to check errors while creating indexes. Fixes: SCYLLADB-298 This needs to be backported to version 2026.1 as this is a fix for filtering support. Closes scylladb/scylladb#28366 (cherry picked from commit `f49c9e896a`) Closes scylladb/scylladb#28448	2026-01-30 11:25:01 +01:00
Patryk Jędrzejczak	46923f7358	Merge '[Backport 2026.1] Introduce TTL and retries to address resolution' from Scylladb[bot] In production environments, we observed cases where the S3 client would repeatedly fail to connect due to DNS entries becoming stale. Because the existing logic only attempted the first resolved address and lacked a way to refresh DNS state, the client could get stuck in a failure loop. Introduce RR TTL and connection failure retry to - re-resolve the RR in a timely manner - forcefully reset and re-resolve addresses - add a special case when the TTL is 0 and the record must be resolved for every request Fixes: CUSTOMER-96 Fixes: CUSTOMER-139 Should be backported to 2025.3/4 and 2026.1 since we already encountered it in the production clusters for 2025.3 - (cherry picked from commit `bd9d5ad75b`) - (cherry picked from commit `359d0b7a3e`) - (cherry picked from commit `ce0c7b5896`) - (cherry picked from commit `5b3e513cba`) - (cherry picked from commit `66a33619da`) - (cherry picked from commit `6eb7dba352`) - (cherry picked from commit `a05a4593a6`) - (cherry picked from commit `3a31380b2c`) - (cherry picked from commit `912c48a806`) Parent PR: #27891 Closes scylladb/scylladb#28405 * https://github.com/scylladb/scylladb: connection_factory: includes cleanup dns_connection_factory: refine the move constructor connection_factory: retry on failure connection_factory: introduce TTL timer connection_factory: get rid of shared_future in dns_connection_factory connection_factory: extract connection logic into a member connection_factory: remove unnecessary `else` connection_factory: use all resolved DNS addresses s3_test: remove client double-close	2026-01-30 11:10:48 +01:00
Avi Kivity	4032e95715	test: test_alternator_proxy_protocol: wait for the node to report itself as serving Use the new ServerUpState=SERVING mechanism to wait for the alternator ports to be up, rather than relying on the default waiting for CQL, which happens earlier and therefore opens a window where a connection to the alternator ports will fail. (cherry picked from commit `59f2a3ce72`)	2026-01-29 22:46:11 +02:00
Avi Kivity	eab10c00b1	test: cluster_manager: add ability to wait for supervisor STATUS=serving When running under systemd, ScyllaDB sends a STATUS=serving message to systemd. Co-opt this mechanism by setting up NOTIFY_SOCKET, thus making the cluster manager pretend it is systemd. Users of the cluster manager can now wait for the node to report itself up, rather than having to parse log files or retry connections. (cherry picked from commit `ebac810c4e`)	2026-01-29 19:48:53 +00:00
Patryk Jędrzejczak	091c3b4e22	test: test_gossiper_orphan_remover: get host ID of the bootstrapping node before it crashes The test is currently flaky. It tries to get the host ID of the bootstrapping node via the REST API after the node crashes. This can obviously fail. The test usually doesn't fail, though, as it relies on the host ID being saved in `ScyllaServer._host_id` at this point by `ScyllaServer.try_get_host_id()` repeatedly called in `ScyllaServer.start()`. However, with a very fast crash and unlucky timings, no such call may succeed. We deflake the test by getting the host ID before the crash. Note that at this point, the bootstrapping node must be serving the REST API requests because `await log.wait_for("finished do_send_ack2_msg")` above guarantees that the node has started the gossip shadow round, which happens after starting the REST API. Fixes #28385 Closes scylladb/scylladb#28388 (cherry picked from commit `a2c1569e04`) Closes scylladb/scylladb#28417	2026-01-29 11:25:10 +01:00
Tomasz Grabiec	19eadafdef	tablets: tablet_allocator.cc: Convert tabs to spaces (cherry picked from commit `8e831a7b6d`)	2026-01-29 09:06:49 +00:00
Tomasz Grabiec	358fc15893	tablets: load_balancer: Warn about incomplete stats once for all offending nodes To reduce log spamming when all nodes are missing stats. (cherry picked from commit `9715965d0c`)	2026-01-29 09:06:49 +00:00
Tomasz Grabiec	32124d209e	tablets: load_balancer: Improve node stats printout Make it more concise: - reduce precision for load to 6 fractional digits - reduce precision for tablets/shard to 3 fractional digits - print "dc1/rack1" instead of "dc=dc1 rack=rack1", like in other places - print "rd=0 wr=0" instead of "stream_read=0 stream_write=0" Example: load_balancer - Node 477569c0-f937-11f0-ab6f-541ce4a00601: dc10/rack10c load=170.666667 tablets=1 shards=12 tablets/shard=0.083 state=normal cap=64424509440 stream: rd=0 wr=0 load_balancer - Node 47678711-f937-11f0-ab6f-541ce4a00601: dc10/rack10c load=0.000000 tablets=0 shards=12 tablets/shard=0.000 state=normal cap=64424509440 stream: rd=0 wr=0 load_balancer - Node 47832560-f937-11f0-ab6f-541ce4a00601: dc10/rack10c load=0.000000 tablets=0 shards=12 tablets/shard=0.000 state=normal cap=64424509440 stream: rd=0 wr=0 (cherry picked from commit `ef0e9ad34a`)	2026-01-29 09:06:49 +00:00
Tomasz Grabiec	c7f4bda459	tablets: load_balancer: Warn about imbalance only when there are no more active migrations Otherwise, it may be only a temporary situation due to lack of candidates, and may be unnecessarily alerting. Also, print node stats to allow assessing how bad the situation is on the spot. Those stats can hint to a cause of imbalance, if balancing is per-DC and racks have different capacity. (cherry picked from commit `4a161bff2d`)	2026-01-29 09:06:49 +00:00
Tomasz Grabiec	568af3cd8d	tablets: load_balancer: Extract print_node_stats() (cherry picked from commit `7228bd1502`)	2026-01-29 09:06:49 +00:00
Tomasz Grabiec	bd694dd1a1	tablet: load_balancer: Use empty() instead of size() where applicable (cherry picked from commit `615b86e88b`)	2026-01-29 09:06:49 +00:00
Tomasz Grabiec	9672e0171f	tablets: Fix redundancy in migration_plan::empty() (cherry picked from commit `12fdd205d6`)	2026-01-29 09:06:49 +00:00
Tomasz Grabiec	8cec41acf2	tablets: Cache pointer to stats during plan-making Saves on lookup cost, esp. for candidate evaluation. This showed up in perf profile in the past. Also, lays the ground for splitting stats per rack. (cherry picked from commit `0d090aa47b`)	2026-01-29 09:06:49 +00:00
Tomasz Grabiec	d207de0d76	tablets: load_balancer: Print rack in addition to DC when giving context Load-balancing can now be per-rack instead of per-DC, so just printing "in DC" is confusing. If we're balancing a rack, we should print which rack that is. (cherry picked from commit `f2b0146f0f`)	2026-01-29 09:06:49 +00:00
Tomasz Grabiec	edde4e878e	tablets: load_balancer: Make plan summary concise Before: load_balancer - Prepared 1 migration plans, out of which there were 1 tablet migration(s) and 0 resize decision(s) and 0 tablet repair(s) and 0 rack-list colocation(s) After: load_balancer - Prepared plan: migrations: 1 We print only stats about elements which are present. (cherry picked from commit `df32318f66`)	2026-01-29 09:06:49 +00:00
Tomasz Grabiec	be1c674f1a	tablets: load_balancer: Move "tablet_migration_bypass" injection point to make_plan() Just a cleanup. After this, we don't have a new scope in the outmost make_plan() just for injection handling. (cherry picked from commit `32b336e062`)	2026-01-29 09:06:49 +00:00
Botond Dénes	a7cff37024	docs: add feature page for automatic repair Explain what the feature is and how to configure it. Inter-link all the repair related pages, so one can discover all about repair, regardless of which page they land on. (cherry picked from commit `1713d75c0d`)	2026-01-29 00:25:21 +00:00
Botond Dénes	9431bc5628	docs: inter-link incremental-repair and repair documents The user can now discover the general explanation of repair when reading about incremental repair, useful if they don't know what repair is. The user can now discover incremental repair while reading the generic repair procedure document. (cherry picked from commit `57b2cd2c16`)	2026-01-29 00:25:21 +00:00
Botond Dénes	14db8375ac	docs: incremental-repair: fix curl example Currently it is regular text, make it a code block so it is easier to read and copy+paste. (cherry picked from commit `a84b1b8b78`)	2026-01-29 00:25:21 +00:00
Ernest Zaslavsky	614020b5d5	aws_error: handle all restartable nested exception types Previously we only inspected std::system_error inside std::nested_exception to support a specific TLS-related failure mode. However, nested exceptions may contain any type, including other restartable (retryable) errors. This change unwraps one nested exception per iteration and re-applies all known handlers until a match is found or the chain is exhausted. Closes scylladb/scylladb#28240 (cherry picked from commit `cb2aa85cf5`) Closes scylladb/scylladb#28345	2026-01-28 14:58:28 +02:00
Anna Stuchlik	e091afb400	doc: add the version name to the Install pages This is a follow-up to https://github.com/scylladb/scylladb/pull/28022 It adds the version name to more install pages. Closes scylladb/scylladb#28289 (cherry picked from commit `c25b770342`) Closes scylladb/scylladb#28362	2026-01-28 12:52:23 +02:00
Ernest Zaslavsky	edc46fe6a1	connection_factory: includes cleanup (cherry picked from commit `912c48a806`)	2026-01-27 22:43:08 +00:00
Ernest Zaslavsky	f8b9b767c2	dns_connection_factory: refine the move constructor Clean up the awkward move constructor that was declared in the header but defaulted in a separate compilation unit, improving clarity and consistency. (cherry picked from commit `3a31380b2c`)	2026-01-27 22:43:08 +00:00
Ernest Zaslavsky	23d038b385	connection_factory: retry on failure If connecting to a provided address throws, renew the address list and retry once (and only once) before giving up. (cherry picked from commit `a05a4593a6`)	2026-01-27 22:43:08 +00:00
Ernest Zaslavsky	3e2d1384bf	connection_factory: introduce TTL timer Add a TTL-based timer to connection_factory to automatically refresh resolved host name addresses when they expire. (cherry picked from commit `6eb7dba352`)	2026-01-27 22:43:08 +00:00
Ernest Zaslavsky	bd7481e30c	connection_factory: get rid of shared_future in dns_connection_factory Move state management from dns_connection_factory into state class itself to encapsulate its internal state and stop managing it from the `dns_connection_factory` (cherry picked from commit `66a33619da`)	2026-01-27 22:43:08 +00:00
Ernest Zaslavsky	16d7b65754	connection_factory: extract connection logic into a member extract connection logic into a private member function to make it reusable (cherry picked from commit `5b3e513cba`)	2026-01-27 22:43:08 +00:00
Ernest Zaslavsky	e30c01eae6	connection_factory: remove unnecessary `else` (cherry picked from commit `ce0c7b5896`)	2026-01-27 22:43:08 +00:00
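
The `connection_factory` commits listed above describe three behaviours: use all resolved DNS addresses, refresh them when their TTL expires (a TTL of 0 meaning every request re-resolves), and on a connection failure renew the address list and retry once. The following is a minimal Python sketch of that idea, not the C++ code in the repository. `RefreshingAddressList`, `connect_with_retry`, and the `resolve`/`connect` callables are hypothetical names introduced for illustration, and the TTL is checked lazily on each lookup here rather than with a timer.

```python
import time
from typing import Callable, List, Tuple


class RefreshingAddressList:
    """Hypothetical helper: caches resolved addresses until their TTL expires."""

    def __init__(self,
                 resolve: Callable[[], Tuple[List[str], float]],
                 clock: Callable[[], float] = time.monotonic):
        self._resolve = resolve            # assumed to return (addresses, ttl_seconds)
        self._clock = clock
        self._addresses: List[str] = []
        self._expires_at = float("-inf")   # forces a resolve on first use

    def reset(self) -> None:
        """Drop the cached addresses so the next get() re-resolves."""
        self._addresses = []
        self._expires_at = float("-inf")

    def get(self) -> List[str]:
        now = self._clock()
        if not self._addresses or now >= self._expires_at:
            self._addresses, ttl = self._resolve()
            # With ttl == 0 the entry is already expired, so the next
            # call to get() resolves again.
            self._expires_at = now + ttl
        return list(self._addresses)


def connect_with_retry(addresses: RefreshingAddressList,
                       connect: Callable[[str], object]):
    """Try every resolved address; if all fail, re-resolve and retry once."""
    last_error = None
    for _attempt in range(2):              # initial attempt plus one retry
        for addr in addresses.get():
            try:
                return connect(addr)
            except OSError as e:
                last_error = e
        addresses.reset()                  # force a fresh resolution
    raise ConnectionError("all resolved addresses failed") from last_error
```

Injecting `resolve`, `connect`, and `clock` keeps the sketch free of real network calls, so the stale-address case can be exercised with plain stubs.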


51736 Commits