scylladb

Author	SHA1	Message	Date
Ernest Zaslavsky	e56081d588	treewide: seastar module update and fix broken rest client start using `write_body` in `rest/client` to properly set headers due to changes applied to seastar's http client Seastar module update ``` b6be384e Merge 'http: generalize Content-Type setting' from Nadav Har'El 74472298 http: generalize request's Content-Type setting 9fd5a1cc http: generalize reply's Content-Type setting a2665f38 memory: Remove deprecated enable_abort_on_allocation_failure() d2a5a8a9 resource.cc: Remove some dead code 7ad9f424 http: Add support of multiple key repetitions for the request a636baca task: Move task::get_backtrace() definition in its class a0101efa Fixed "doxygen" spelling in error message db969482 Merge 'http/reply: introduce set_cookie()' from Botond Dénes 5357b434 http/reply: introduce set_cookie() 1ddcf05f http/reply: make write_reply*() public 4b782d73 http/connection: start_response(): fix indentation 720feca0 http/reply: encapsulate reply writing in write_reply() 3e19917d Merge 'exceptions: log thrown and propagated exception with distinct log levels' from Botond Dénes db9aea93 Merge 'Correctly wrap up abandoned yielding directory lister' from Pavel Emelyanov dbb2bf3f test: Add test for input_stream::read_exactly() a5308ec9 file/directory_lister: Correctly wrap up fallback generator 4f0811f4 file/directory_lister: Convert on-stack queue to shared pointer 59801da7 tests: Add directory lister early drop cases 33233032 http/reply: s/write_reply_to_connection/write_reply/ 69b93620 http/reply: write_reply_{to_connection,headers}(): pass output stream 56e9bda7 test: Convert directory_test into seastar test 96782358 Merge 'Improve io_tester's seqwrite and append workloads' from Pavel Emelyanov 8b46e3d4 SEASTAR_ASSERT: assert to stderr and flush stream 3370e22a tutorial.md: use current_exception_as_future() e977453a Add fixture support for seastar::testing 3e70d7f7 io_tester: Do not set append_is_unlikely unconditionally 2a4ae7b4 io_tester: Count 
file size overflows 5e678bb5 io_tester: Tuneup size overflow check d5dad8ce io_tester: Move position management code to io_class_data 5586a056 io_tester: Rename seqwrite -> overwrite 92df2fb2 io_tester: Relax return value of create_and_fill_file() 03d9500d io_tester: Dont fill file for APPEND d6844a7b io_tester: Indentation fix after previous patch fb9e0088 io_tester: Coroutinize create_and_fill_file() 2f802f57 exceptions: log thrown and propagated exception with distinct log levels 4971fa70 util: move log-level into own header 39448fc1 Merge 'Fix and tune http::request setup by client' from Pavel Emelyanov 52d0c4fb iostream: Move output_stream::write(scattered_message) lower 7a52f734 Merge 'read_first_line: Missing pragma and licence' from Ernest Zaslavsky d0881b7e read_first_line: Add missing license boilerplate 988a0e99 read_first_line:: Add missing `#pragma once` 42675266 http: Make client::make_request accept const request& c7709fb5 http: Make request making API return exceptional future not throw b68ed89b http: Move request content length header setup 1d96dac6 http: Move request version configuration 072e86f6 http: Setup request once ``` Closes scylladb/scylladb#25915 (cherry picked from commit `44d34663bc`) Closes scylladb/scylladb#26100	2025-09-19 11:40:59 +03:00
Avi Kivity	f6b6312cf4	Merge 'sstables/trie: prepare for integrating BTI indexes with sstable readers and writers' from Michał Chojnowski This is yet another part in the BTI index project. Overarching issue: https://github.com/scylladb/scylladb/issues/19191 Previous part: https://github.com/scylladb/scylladb/pull/25626 Next parts: introducing the new components, Partitions.db and Rows.db This is the preparatory, uncontroversial part of https://github.com/scylladb/scylladb/pull/26039, which has been split out to a separate PR to make the main part (which, after a revision, will be posted later) smaller. This series contains several small fixes and changes to BTI-related code added earlier, which either have to be done (i.e. propagating `reader_permit` to IO calls in index reads) or just deserved to be done. There's no single theme for the changes in this PR, refer to the individual commits for details. The changes are for the sake of new and unreleased code. No backporting should be done. 
Closes scylladb/scylladb#26075 * github.com:scylladb/scylladb: sstables/mx/reader: remove mx::make_reader_with_index_reader test/boost/bti_index_test: fix indentation sstables/trie/bti_index_reader: in last_block_offset(), return offset from the beginning of partition, not file sstables/trie: support reader_permit and trace_state properly sstables/trie/bti_node_reader: avoid calling into `cached_file` if the target position is already cached sstables/trie/bti_index_reader: get rid of the seastar::file wrapper in read_row_index_header sstables/trie/bti_index_reader: support BYPASS CACHE test/boost/bti_index_test: use read_bti_partitions_db_footer where appropriate sstables/trie: change the signature of bti_partition_index_writer::finish sstables/bti_index: improve signatures of special member functions in index writers streaming/stream_transfer_task: coroutinize `estimate_partitions()` types/comparable_bytes: add a missing implementation for date_type_impl sstables: remove an outdated FIXME storage_service: delete `get_splits()` sstables/trie: fix some comment typos in bti_index_reader.cc sstables/mx/writer: rename _pi_write_m.tomb to partition_tombstone	2025-09-18 12:10:27 +03:00
Pavel Emelyanov	65638232e8	Merge 'utils: azure: Catch system errors when probing IMDS and bump the verbosity of logs' from Nikos Dragazis This PR fixes a bug in the Azure default credential provider that would cause the `test_azure_provider_with_incomplete_creds` unit test to be flaky. The provider would assume that an unreachable IMDS endpoint would always result in a timeout, but network errors are also possible (e.g., ICMP "host unreachable"). The issue is triggered by this particular test because it sets the IMDS endpoint to a non-routable address. Some routers choose to silently drop such packets, while others return ICMP errors. To fix it, the default credential provider has been updated to catch system errors as well. This PR also raises the log level of the default credential provider from DEBUG to INFO, making it easier for operators to diagnose authentication issues. More details in the commit messages. Fixes #25641. Closes scylladb/scylladb#25696 * github.com:scylladb/scylladb: utils: azure: Catch system errors when detecting IMDS utils: azure: Bump default credential logs from DEBUG to INFO	2025-09-18 07:43:00 +03:00
Ernest Zaslavsky	c9c245c756	rest_client: set `version` on http::request to avoid invalid state Upcoming changes in Seastar cause `rest::simple_send` to move the `http::request` into `seastar::http::experimental::client::make_request` when called multiple times. This leaves the original request in an invalid state. Specifically, the `_version` field becomes empty, causing request validation to fail. This patch ensures `version` is explicitly set to prevent such failures. Fixes: https://github.com/scylladb/scylladb/issues/26018 Closes scylladb/scylladb#26066	2025-09-18 07:36:25 +03:00
Michał Chojnowski	1f85069389	sstables/trie: support reader_permit and trace_state properly Before this patch, the `reader_permit` taken by `bti_index_reader` wasn't actually being passed down to disk reads. In this patch, we fix this FIXME by propagating the permit down to the I/O operations on the `cached_file`. Also, `bti_index_reader` didn't take `trace_state_ptr` at all. In this patch, we add a `trace_state_ptr` argument and propagate it down to disk reads. (We combine the two changes because the permit and the trace state are passed together everywhere anyway).	2025-09-17 12:22:40 +02:00
Benny Halevy	3a6208b319	utils: stall_free: clear_gently: release wrapped objects As discussed in https://github.com/scylladb/scylladb/pull/24606#discussion_r2281870939 clear_gently of shared pointers should release the wrapped object reference and when the object's use_count reaches 1, the object itself would be cleared_gently, before it's destroyed. This behavior is similar to the way we clear gently containers like arrays or vectors, and so it is extended in this patch to smart pointers like unique_ptr and foreign_ptr. The unit tests are adjusted respectively to expect the smart pointers to be reset after clear_gently, plus the use of `reset()` for `foreign_ptr<shared_ptr<>>` was replaced by `clear_gently().get()` which now ensures the reference to a shared object is released, and awaited for, if it happens on a foreign owner shard, unlike reset of a foreign_ptr that kicks off destroy of that shared object in the background on the owner shard - causing flakiness. Fixes #25723 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#25759	2025-09-17 11:44:26 +03:00
Pavel Emelyanov	6fb66b796a	s3: Add metrics to show S3 prefetch bytes The chunked download source sends large GET requests and then consumes data as it arrives. Sometimes it can stop reading from the socket early and drop the in-flight data. The existing read-bytes metrics show only the number of consumed bytes, but we also want to know the number of requested bytes. Refs #25770 (accounting of read-bytes) Fixes #25876 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#25877	2025-09-16 23:40:47 +03:00
Nikos Dragazis	58e8142a06	utils: azure: Catch system errors when detecting IMDS When the default credential provider probes IMDS to check its availability, it assumes that application-level connection timeouts are the only error that can occur when the node is not an Azure VM, i.e., the packets will be silently dropped somewhere in the network. However, this has proven not always true for the `test_azure_provider_with_incomplete_creds` unit test, which overrides the default IMDS endpoint with a non-routeable IP from TEST-NET-1 [1]. This test has been reported to fail in some local setups where routers respond with ICMP "host unreachable" errors instead of silently dropping the packets. This error propagates to user space as an EHOSTUNREACH system error, which is not caught by the default credential provider, causing the test to fail. The reason we use a non-routeable address in this test is to ensure that IMDS probing will always fail, even if running the test on an Azure VM. Theoretically, the same problem applies to the default IMDS endpoint as well (169.254.169.254). The RFC 3927 [2] mandates that packets targeting link-local addresses (169.254/16) must not be forwarded, but the exact behavior is left to implementation. Since we cannot predict how routers will behave, fix this by catching all relevant system errors when probing IMDS. [1] https://datatracker.ietf.org/doc/html/rfc5735 [2] https://datatracker.ietf.org/doc/html/rfc3927 Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>	2025-09-16 15:27:59 +03:00
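The probing behaviour described in the commit above can be illustrated with a minimal, self-contained sketch. It is not the actual ScyllaDB code: `probe_timeout`, `imds_available_sketch` and the specific set of error codes treated as "not on Azure" are assumptions made for illustration; the commit only says that "all relevant system errors" are caught.

```cpp
#include <cerrno>
#include <functional>
#include <stdexcept>
#include <system_error>

// Hypothetical stand-in for an application-level connection timeout.
struct probe_timeout : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Returns true if the probe succeeded, false if the endpoint looks unreachable.
// `probe` is any callable that throws on failure (hypothetical interface).
bool imds_available_sketch(const std::function<void()>& probe) {
    try {
        probe();
        return true;
    } catch (const probe_timeout&) {
        // Packets were silently dropped somewhere in the network.
        return false;
    } catch (const std::system_error& e) {
        // A router answered with an ICMP error, which surfaces as a system error.
        // The list of codes below is an assumption, not taken from the patch.
        const auto& code = e.code();
        if (code == std::errc::host_unreachable ||
            code == std::errc::network_unreachable ||
            code == std::errc::connection_refused) {
            return false;
        }
        throw; // unrelated system errors still propagate
    }
}
```

With this shape, an EHOSTUNREACH from the network stack makes the provider move on to the next credential source instead of failing the whole chain.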
Nikos Dragazis	78bcecd570	utils: azure: Bump default credential logs from DEBUG to INFO The default credential provider produces diagnostic logs on each step as it walks through the credential chain. These logs are useful for operators to diagnose authentication problems as they expose information about which credential sources are being evaluated, in which order, why they fail, and which source is eventually selected. Promote them from DEBUG to INFO level. Additionally, concatenate the logs for environment credentials into a single log statement to avoid interleaving with other logs. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>	2025-09-16 15:20:52 +03:00
Botond Dénes	ee7c85919e	Revert "treewide: seastar module update and fix broken rest client" This reverts commit `44d34663bc` of PR https://github.com/scylladb/scylladb/pull/25915. Breaks artifact tests on ARM, blocking us from building new images from master.	2025-09-16 08:31:08 +03:00
Ernest Zaslavsky	44d34663bc	treewide: seastar module update and fix broken rest client start using `write_body` in `rest/client` to properly set headers due to changes applied to seastar's http client Seastar module update ``` b6be384e Merge 'http: generalize Content-Type setting' from Nadav Har'El 74472298 http: generalize request's Content-Type setting 9fd5a1cc http: generalize reply's Content-Type setting a2665f38 memory: Remove deprecated enable_abort_on_allocation_failure() d2a5a8a9 resource.cc: Remove some dead code 7ad9f424 http: Add support of multiple key repetitions for the request a636baca task: Move task::get_backtrace() definition in its class a0101efa Fixed "doxygen" spelling in error message db969482 Merge 'http/reply: introduce set_cookie()' from Botond Dénes 5357b434 http/reply: introduce set_cookie() 1ddcf05f http/reply: make write_reply*() public 4b782d73 http/connection: start_response(): fix indentation 720feca0 http/reply: encapsulate reply writing in write_reply() 3e19917d Merge 'exceptions: log thrown and propagated exception with distinct log levels' from Botond Dénes db9aea93 Merge 'Correctly wrap up abandoned yielding directory lister' from Pavel Emelyanov dbb2bf3f test: Add test for input_stream::read_exactly() a5308ec9 file/directory_lister: Correctly wrap up fallback generator 4f0811f4 file/directory_lister: Convert on-stack queue to shared pointer 59801da7 tests: Add directory lister early drop cases 33233032 http/reply: s/write_reply_to_connection/write_reply/ 69b93620 http/reply: write_reply_{to_connection,headers}(): pass output stream 56e9bda7 test: Convert directory_test into seastar test 96782358 Merge 'Improve io_tester's seqwrite and append workloads' from Pavel Emelyanov 8b46e3d4 SEASTAR_ASSERT: assert to stderr and flush stream 3370e22a tutorial.md: use current_exception_as_future() e977453a Add fixture support for seastar::testing 3e70d7f7 io_tester: Do not set append_is_unlikely unconditionally 2a4ae7b4 io_tester: Count 
file size overflows 5e678bb5 io_tester: Tuneup size overflow check d5dad8ce io_tester: Move position management code to io_class_data 5586a056 io_tester: Rename seqwrite -> overwrite 92df2fb2 io_tester: Relax return value of create_and_fill_file() 03d9500d io_tester: Dont fill file for APPEND d6844a7b io_tester: Indentation fix after previous patch fb9e0088 io_tester: Coroutinize create_and_fill_file() 2f802f57 exceptions: log thrown and propagated exception with distinct log levels 4971fa70 util: move log-level into own header 39448fc1 Merge 'Fix and tune http::request setup by client' from Pavel Emelyanov 52d0c4fb iostream: Move output_stream::write(scattered_message) lower 7a52f734 Merge 'read_first_line: Missing pragma and licence' from Ernest Zaslavsky d0881b7e read_first_line: Add missing license boilerplate 988a0e99 read_first_line:: Add missing `#pragma once` 42675266 http: Make client::make_request accept const request& c7709fb5 http: Make request making API return exceptional future not throw b68ed89b http: Move request content length header setup 1d96dac6 http: Move request version configuration 072e86f6 http: Setup request once ``` Closes scylladb/scylladb#25915	2025-09-13 17:14:28 +03:00
Radosław Cybulski	436150eb52	treewide: fix spelling errors Fix spelling errors reported by copilot on github. Remove single use namespace alias. Closes scylladb/scylladb#25960	2025-09-12 15:58:19 +03:00
Avi Kivity	c91b326d5a	Merge 'transport: replace throwing protocol_exception with returns' from Dario Mirovic Replace throwing `protocol_exception` with returning it as a result or an exceptional future in the transport server module. The goal is to improve performance. Most of the `protocol_exception` throws were made from `fragmented_temporary_buffer` module, by passing `exception_thrower()` to its `read` methods. `fragmented_temporary_buffer` is changed so that it now accepts an exception creator, not exception thrower. `fragmented_temporary_buffer_concepts::ExceptionCreator` concept replaced `fragmented_temporary_buffer_concepts::ExceptionThrower` and all methods that have been throwing now return failed result of type `utils::result_with_eptr`. This change is then propagated to the callers. The scope of this patch is `protocol_exception`, so commitlog just calls `.value()` method on the result. If the result failed, that will throw the exception from the result, as defined by `utils::result_with_eptr_throw_policy`. This means that the behavior of commitlog module stays the same. transport server module handles results gracefully. All the caller functions that return non-future value `T` now return `utils::result_with_eptr<T>`. When the caller is a function that returns a future, and it receives failed result, `make_exception_future(std::move(failed_result).value())` is returned. The rest of the callstack up to the transport server `handle_error` function is already working without throwing, and that's how zero throws is achieved. cql3 module changes do the same as transport server module. Benchmark that is not yet merged has commit `67fbe35833e2d23a8e9c2dcb5e04580231d8ec96`, [GitHub diff view](https://github.com/scylladb/scylladb/compare/master...nuivall:scylladb:perf_cql_raw). It uses either read or write query. 
Command line used: ``` ./build/release/scylla perf-cql-raw --workdir ~/tmp/scylladir --smp 1 --developer-mode 1 --workload write --duration 300 --concurrency 1000 --username cassandra --password cassandra 2>/dev/null ``` The only thing changed across runs is `--workload write`/`--workload read`. Built and run on `release` target. <details> ``` throughput: mean= 36946.04 standard-deviation=1831.28 median= 37515.49 median-absolute-deviation=1544.52 maximum=39748.41 minimum=28443.36 instructions_per_op: mean= 108105.70 standard-deviation=965.19 median= 108052.56 median-absolute-deviation=53.47 maximum=124735.92 minimum=107899.00 cpu_cycles_per_op: mean= 70065.73 standard-deviation=2328.50 median= 69755.89 median-absolute-deviation=1250.85 maximum=92631.48 minimum=66479.36 ⏱ real=5:11.08 user=2:00.20 sys=2:25.55 cpu=85% ``` ``` throughput: mean= 40718.30 standard-deviation=2237.16 median= 41194.39 median-absolute-deviation=1723.72 maximum=43974.56 minimum=34738.16 instructions_per_op: mean= 117083.62 standard-deviation=40.74 median= 117087.54 median-absolute-deviation=31.95 maximum=117215.34 minimum=116874.30 cpu_cycles_per_op: mean= 58777.43 standard-deviation=1225.70 median= 58724.65 median-absolute-deviation=776.03 maximum=64740.54 minimum=55922.58 ⏱ real=5:12.37 user=27.461 sys=3:54.53 cpu=83% ``` ``` throughput: mean= 37107.91 standard-deviation=1698.58 median= 37185.53 median-absolute-deviation=1300.99 maximum=40459.85 minimum=29224.83 instructions_per_op: mean= 108345.12 standard-deviation=931.33 median= 108289.82 median-absolute-deviation=55.97 maximum=124394.65 minimum=108188.37 cpu_cycles_per_op: mean= 70333.79 standard-deviation=2247.71 median= 69985.47 median-absolute-deviation=1212.65 maximum=92219.10 minimum=65881.72 ⏱ real=5:10.98 user=2:40.01 sys=1:45.84 cpu=85% ``` ``` throughput: mean= 38353.12 standard-deviation=1806.46 median= 38971.17 median-absolute-deviation=1365.79 maximum=41143.64 minimum=32967.57 instructions_per_op: mean= 117270.60 
standard-deviation=35.50 median= 117268.07 median-absolute-deviation=16.81 maximum=117475.89 minimum=117073.74 cpu_cycles_per_op: mean= 57256.00 standard-deviation=1039.17 median= 57341.93 median-absolute-deviation=634.50 maximum=61993.62 minimum=54670.77 ⏱ real=5:12.82 user=4:10.79 sys=11.530 cpu=83% ``` This shows ~240 instructions per op increase for reads and ~180 instructions per op increase for writes. Tests have been run multiple times, with almost identical results. Each run lasted 300 seconds. Number of operations executed is roughly 38k per second * 300 seconds = 11.4m ops. Update: I have repeated the benchmark with clean state - reboot computer, put in performance mode, rebuild, closed other apps that might affect CPU and disk usage. run count: 5 times before and 5 times after the patch duration: 300 seconds Average write throughput median before patch: 41155.99 Average write throughput median after patch: 42193.22 Median absolute deviation is also lower now, with values in range 350-550, while the previous runs' values were in range 750-1350. </details> Built and run on `release` target. 
<details> ./build/release/scylla perf-simple-query --smp 1 --duration 300 --concurrency 1000 --enable-cache false --bypass-cache 2>/dev/null ``` throughput: mean= 14910.90 standard-deviation=477.72 median= 14956.73 median-absolute-deviation=294.16 maximum=16061.18 minimum=13198.68 instructions_per_op: mean= 659591.63 standard-deviation=495.85 median= 659595.46 median-absolute-deviation=324.91 maximum=661184.94 minimum=658001.49 cpu_cycles_per_op: mean= 213301.49 standard-deviation=2724.27 median= 212768.64 median-absolute-deviation=1403.85 maximum=225837.15 minimum=208110.12 ⏱ real=5:19.26 user=5:00.22 sys=15.827 cpu=98% ``` ./build/release/scylla perf-simple-query --smp 1 --duration 300 --concurrency 1000 --enable-cache false 2>/dev/null ``` throughput: mean= 93345.45 standard-deviation=4499.00 median= 93915.52 median-absolute-deviation=2764.41 maximum=104343.64 minimum=79816.66 instructions_per_op: mean= 65556.11 standard-deviation=97.42 median= 65545.11 median-absolute-deviation=71.51 maximum=65806.75 minimum=65346.25 cpu_cycles_per_op: mean= 34160.75 standard-deviation=803.02 median= 33927.16 median-absolute-deviation=453.08 maximum=39285.19 minimum=32547.13 ⏱ real=5:03.23 user=4:29.46 sys=29.255 cpu=98% ``` ./build/release/scylla perf-simple-query --smp 1 --duration 300 --concurrency 1000 --enable-cache true 2>/dev/null ``` throughput: mean= 206982.18 standard-deviation=15894.64 median= 208893.79 median-absolute-deviation=9923.41 maximum=232630.14 minimum=127393.34 instructions_per_op: mean= 35983.27 standard-deviation=6.12 median= 35982.75 median-absolute-deviation=3.75 maximum=36008.24 minimum=35952.14 cpu_cycles_per_op: mean= 17374.87 standard-deviation=985.06 median= 17140.81 median-absolute-deviation=368.86 maximum=26125.38 minimum=16421.99 ⏱ real=5:01.23 user=4:57.88 sys=0.124 cpu=98% ``` ./build/release/scylla perf-simple-query --smp 1 --duration 300 --concurrency 1000 --enable-cache false --bypass-cache 2>/dev/null ``` throughput: mean= 16198.26 
standard-deviation=902.41 median= 16094.02 median-absolute-deviation=588.58 maximum=17890.10 minimum=13458.74 instructions_per_op: mean= 659752.73 standard-deviation=488.08 median= 659789.16 median-absolute-deviation=334.35 maximum=660881.69 minimum=658460.82 cpu_cycles_per_op: mean= 216070.70 standard-deviation=3491.26 median= 215320.37 median-absolute-deviation=1678.06 maximum=232396.48 minimum=209839.86 ⏱ real=5:17.33 user=4:55.87 sys=18.425 cpu=99% ``` ./build/release/scylla perf-simple-query --smp 1 --duration 300 --concurrency 1000 --enable-cache false 2>/dev/null ``` throughput: mean= 97067.79 standard-deviation=2637.79 median= 97058.93 median-absolute-deviation=1477.30 maximum=106338.97 minimum=87457.60 instructions_per_op: mean= 65695.66 standard-deviation=58.43 median= 65695.93 median-absolute-deviation=37.67 maximum=65947.76 minimum=65547.05 cpu_cycles_per_op: mean= 34300.20 standard-deviation=704.66 median= 34143.92 median-absolute-deviation=321.72 maximum=38203.68 minimum=33427.46 ⏱ real=5:03.22 user=4:31.56 sys=29.164 cpu=99% ``` ./build/release/scylla perf-simple-query --smp 1 --duration 300 --concurrency 1000 --enable-cache true 2>/dev/null ``` throughput: mean= 223495.91 standard-deviation=6134.95 median= 224825.90 median-absolute-deviation=3302.09 maximum=234859.90 minimum=193209.69 instructions_per_op: mean= 35981.41 standard-deviation=3.16 median= 35981.13 median-absolute-deviation=2.12 maximum=35991.46 minimum=35972.55 cpu_cycles_per_op: mean= 17482.26 standard-deviation=281.82 median= 17424.08 median-absolute-deviation=143.91 maximum=19120.68 minimum=16937.43 ⏱ real=5:01.23 user=4:58.54 sys=0.136 cpu=99% ``` </details> Fixes: #24567 This PR is a continuation of #24738 [transport: remove throwing protocol_exception on connection start](https://github.com/scylladb/scylladb/pull/24738). This PR does not solve a burning issue, but is rather an improvement in the same direction. As it is just an enhancement, it should not be backported. 
Closes scylladb/scylladb#25408 * github.com:scylladb/scylladb: test/cqlpy: add protocol exception tests test/cqlpy: `test_protocol_exceptions.py` refactor message frame building test/cqlpy: `test_protocol_exceptions.py` refactor duplicate code transport: replace `make_frame` throw with return result cql3: remove throwing `protocol_exception` transport: replace throw in validate_utf8 with result_with_exception_ptr return transport: replace throwing protocol_exception with returns utils: add result_with_exception_ptr test/cqlpy: add unknown compression algorithm test case	2025-09-10 21:54:15 +03:00
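The "return a failed result instead of throwing" pattern from the merge above can be sketched as follows. This is a simplified illustration, not the real `utils::result_with_eptr`: the class `result_sketch` and the function `read_byte` are hypothetical names, and the only behaviour borrowed from the commit message is that `.value()` rethrows the stored exception when the result has failed.

```cpp
#include <cstddef>
#include <cstdint>
#include <exception>
#include <stdexcept>
#include <string_view>
#include <utility>
#include <variant>

// Hypothetical result type: holds either a value or an exception_ptr.
template <typename T>
class result_sketch {
    std::variant<T, std::exception_ptr> _v;
public:
    result_sketch(T v) : _v(std::move(v)) {}
    result_sketch(std::exception_ptr e) : _v(std::move(e)) {}

    bool has_value() const { return _v.index() == 0; }
    std::exception_ptr error() const { return std::get<1>(_v); }

    // Callers that want the old throwing behaviour call value().
    T& value() {
        if (!has_value()) {
            std::rethrow_exception(std::get<1>(_v));
        }
        return std::get<0>(_v);
    }
};

// The reader creates the exception but never throws it on the hot path.
result_sketch<std::uint8_t> read_byte(std::string_view buf, std::size_t pos) {
    if (pos >= buf.size()) {
        return std::make_exception_ptr(std::runtime_error("truncated frame"));
    }
    return static_cast<std::uint8_t>(buf[pos]);
}
```

A caller on the non-throwing path checks `has_value()` and forwards `error()` upward; a caller that keeps the old behaviour, as the commit message describes for commitlog, simply calls `.value()`.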
Avi Kivity	fc64333040	Merge 'sstables/trie: add BTI index readers and writers' from Michał Chojnowski This is yet another part in the BTI index project. Overarching issue: https://github.com/scylladb/scylladb/issues/19191 Previous part: https://github.com/scylladb/scylladb/pull/25506/ Next part: plugging the BTI index readers and writers into sstable readers and writers. The new code added in this PR isn't used outside of tests yet, but it's posted as a separate PR for reviewability. This series implements, on top of the key translation logic, and abstract trie writing and traversal logic, a writer and a reader of sstable index files (which map primary keys to positions in Data.db), as described in `f16fb6765b/src/java/org/apache/cassandra/io/sstable/format/bti/BtiFormat.md`. Caveats: 1. I think the added test has reasonable coverage, but that depends on running it multiple times. (Though it shouldn't need more than a few runs to catch any bug it covers). It's somewhat awkward as a test meant for running in CI, it's better as something you run many times after a relevant change. 2. These readers and writers are intended to be compatible with Cassandra, but I did NOT do any compatibility testing. The writers and readers added here have only been tested against each other, not against Cassandra's readers and writers. 3. This didn't undergo any proper benchmarking and optimization work. I was doing some measurements in the past, but everything was rewritten so much since then that my old measurements are effectively invalidated. Frankly I have no idea what the performance of all this branchy-branchy logic is now. No backports needed, new functionality. 
Closes scylladb/scylladb#25626 * github.com:scylladb/scylladb: test/manual: add bti_cassandra_compatibility_test test/lib/random_schema: add some constraints for generated uuid and time/date values test/lib/random_utils: add a variant of get_bytes which takes an `engine&` test/boost: add bti_index_test sstables/writer: add an accessor for the current write position in Data.db sstables/trie: introduce bti_index_reader sstables/trie: add bti_partition_index_writer.cc sstables/trie: add bti_row_index_writer.cc utils/bit_cast: add a new overload of write_unaligned() sstables/trie: add trie_writer::add_partial() sstables/consumer: add read_56() sstables/trie: make bti_node_reader::page_ptr copy-constructible sstables: extract abstract_index_reader from index_reader.hh to its own header sstables/trie: add an accessor to the file_writer under bti_node_sink sstables/types: make `deletion_time::operator tombstone()` const sstables/types: add sstables::deletion_time::make_live() sstables/trie: fix a special case in max_offset_from_child sstables/trie: handle `partition_region`s other than `clustered` in BTI position encoding sstables/trie: rewrite lcb_mismatch to handle fragment invalidation test/boost/bti_key_translation_test: fix a compilation error hidden behind `if constexpr`	2025-09-10 21:48:52 +03:00
Pavel Emelyanov	9deea3655f	s3: Fix chunked download source metrics calculations In S3 client both read and write metrics have three counters -- number of requests made, number of bytes processed and request latency. In most of the cases all three counters are updated at once -- upon response arrival. However, in case of chunked download source this way of accounting metrics is misleading. In this code the request is made once, and then the obtained bytes are consumed eventually as the data arrive. Currently, each time a new portion of data is read from the socket the number of read requests is incremented. That's wrong, the request is made once, and this counter should also be incremented once, not for every data buffer that arrived in response. Same for read request latency -- it's "added" for every data buffer that arrives, but it's a lengthy process, the _request_ latency should be accounted once per response. Maybe later we'll want to have "data latency" metrics as well, but for what we have now it's request latency. The number of read bytes is accounted properly, so not touched here. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#25770	2025-09-08 09:49:03 +03:00
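The accounting rule from the commit above (requests and latency once per response, bytes per buffer) can be shown with a small sketch. The struct `read_stats_sketch` and its method names are hypothetical and do not mirror the S3 client's real metrics code.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>

// Hypothetical counters for one direction (read) of a chunked download.
struct read_stats_sketch {
    std::uint64_t requests = 0;
    std::uint64_t bytes = 0;
    std::chrono::steady_clock::duration latency{};

    // Called once, when the response to the single large GET arrives.
    void on_response(std::chrono::steady_clock::duration request_latency) {
        ++requests;
        latency += request_latency;
    }

    // Called for every data buffer consumed from the socket.
    void on_buffer(std::size_t n) {
        bytes += n;
    }
};
```

Three buffers consumed from one GET then show up as one request, one latency sample and the sum of the buffer sizes, rather than three requests.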
Michał Chojnowski	a800fef633	utils/bit_cast: add a new overload of write_unaligned() Does the same thing as the existing overload, but this one takes `std::byte` instead of `void`, and it additionally returns the pointer to the end position.	2025-09-07 00:30:15 +02:00
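A sketch of what such an overload might look like is given below. The real signature in `utils/bit_cast` is not shown in the commit message, so the name `write_unaligned_sketch` and the exact parameter types are assumptions; only the two stated properties (taking `std::byte` rather than `void`, returning the end position) come from the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Copies the object representation of `value` to a possibly unaligned
// destination and returns the position just past the written bytes.
template <typename T>
std::byte* write_unaligned_sketch(std::byte* out, const T& value) {
    static_assert(std::is_trivially_copyable_v<T>, "memcpy requires a trivially copyable type");
    assert(out != nullptr);
    std::memcpy(out, &value, sizeof(T));
    return out + sizeof(T);
}
```

Returning the end pointer lets consecutive writes be chained, for example `p = write_unaligned_sketch(p, header); p = write_unaligned_sketch(p, length);`, without tracking offsets by hand.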
Avi Kivity	ed483647a4	interval: specialize interval_data<T> for trivial types C++ data movement algorithms (std::uninitialized_copy() and friends) and the containers that use them optimize for trivially copyable and destructible types by calling memcpy instead of using a loop around constructors/destructors. Make intervals of trivially copyable and destructible types also trivially copyable and destructible by specializing interval_data<T> not to have user-defined special member functions. This requires that T have a default constructor since we can't skip construction when !_start_exists or !_end_exists. To choose whether we specialize or not, we look at default constructibility (see above) and trivial destructibility. This is wider than trivial copyability (a user-defined copy constructor can exist) but is still beneficial, since the generated copy constructor for interval_data<T> will be branch-free. We don't implement the poison words in debug mode; nor are they necessary, since we no longer manage the lifetime of _start_value and _end_value manually but let the compiler do that for us. Note [1] prevents full conversion to memcpy for now, but we still get branch-free code. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121789	2025-09-06 18:38:24 +03:00
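The specialization idea in the commit above can be illustrated with a reduced example that stores a single optional bound instead of a full interval. `bound_data` and `can_store_plainly` are hypothetical names; the selection criterion (default constructible and trivially destructible) is the one stated in the commit message, everything else is a simplification.

```cpp
#include <new>
#include <string>
#include <type_traits>
#include <utility>

template <typename T>
inline constexpr bool can_store_plainly =
    std::is_default_constructible_v<T> && std::is_trivially_destructible_v<T>;

// General case: the value lives in a union and its lifetime is managed by hand,
// so the special member functions must branch on `exists`.
template <typename T, bool Plain = can_store_plainly<T>>
struct bound_data {
    bool exists = false;
    union { T value; };

    bound_data() {}
    explicit bound_data(T v) : exists(true) { new (&value) T(std::move(v)); }
    bound_data(const bound_data& o) : exists(o.exists) {
        if (exists) { new (&value) T(o.value); }
    }
    bound_data& operator=(const bound_data&) = delete; // omitted for brevity
    ~bound_data() { if (exists) { value.~T(); } }
};

// Specialization: the value is always constructed, so no user-defined
// copy constructor or destructor is needed and the compiler generates
// branch-free ones.
template <typename T>
struct bound_data<T, true> {
    bool exists = false;
    T value{};

    bound_data() = default;
    explicit bound_data(T v) : exists(true), value(std::move(v)) {}
};

static_assert(std::is_trivially_copyable_v<bound_data<int>>);
static_assert(std::is_trivially_destructible_v<bound_data<int>>);
static_assert(!std::is_trivially_copyable_v<bound_data<std::string>>);
```

The `static_assert` lines capture the point of the change: for a type like `int` the wrapper itself becomes trivially copyable and destructible, while a type like `std::string` still takes the hand-managed path.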
Avi Kivity	20751517a4	interval: split data members into new interval_data class Prepare for specialized handling of trivial types by extracting the data members of wrapping_interval<T> and the special member functions (constructors/destructors/assignment) into a new interval_data<T> template. To avoid having to refer to data members with a this-> prefix, add using declarations in wrapping_interval<T>.	2025-09-06 18:31:58 +03:00
Radosław Cybulski	c242234552	Revert "build: add precompiled headers to CMakeLists.txt" This reverts commit `01bb7b629a`. Closes scylladb/scylladb#25735	2025-09-03 09:46:00 +03:00
Avi Kivity	7ed261fc52	Merge 'Inital GCP object storage support' from Calle Wilund Adds infrastructure and client for interaction with GCP object storage services. Note: this is just a client object usable for creating, listing, deleting and up/downloading of objects to/from said storage service. It makes no attempt at actually inserting it into the sstable storage flow. That can come later. This PR breaks out GCP auth and some general REST call functionality into shared routines. Not all code is 100% reused, but at least some. Test is added, though could be more comprehensive (feel free to suggest a test vector). Test can run in either local mock server mode (default), or against actual GCP. See `test/boost/gcp_object_storage_test.cc` for explanation on the config environment vars. Default is to run the test against a temporary docker daemon. Closes scylladb/scylladb#24629 * github.com:scylladb/scylladb: test::boost::gcp_object_storage_test: Initial unit tests for GCP obj storage proc-utils: Re-export waiting types from seastar proc-utils: Inherit environment from current process utils::gcp::object_storage: Add client for GCP object storage utils::http: Add optional external credentials to dns_connection_factory init utils::rest: Break out request wrapper and send logic encryption::gcp_host: Use shared gcp credentials + REST helpers utils::gcp: Move/add gcp credentials management to shared file utils::rest::client: Add formatter for seastar::http::reply utils::rest::client: Add helper routines for simple REST calls utils::http: Make shared system trust certificates public	2025-09-02 14:38:09 +03:00
Avi Kivity	fe308de8df	Merge 'treewide: Add missing `#pragma once`' from Ernest Zaslavsky Add missing #pragma once and license boilerplate to include headers. Consider adding a CI step to catch missing header guards early. It can be done easily by running `cpplint` like below ``` find . -path ./seastar -prune -o -path ./venv -prune -o -path ./idl -prune -o -type f \( -name "*.h" -o -name "*.hh" -o -name "*.hpp" \) -print0 | xargs -0 cpplint 2>&1 | grep "header guard found" ``` No backport is needed, the change is not "functional" Closes scylladb/scylladb#25768 * github.com:scylladb/scylladb: treewide: Add missing license boilerplate treewide: Add missing `#pragma once`	2025-09-02 13:18:04 +03:00
Calle Wilund	4a5b547a86	utils::gcp::object_storage: Add client for GCP object storage Adds a minimal client for GCP object storage operations: * Create buckets * Delete buckets * List bucket content * Copy/move bucket content * Delete bucket content * Upload bucket content * Download bucket content	2025-09-01 18:03:44 +00:00
Calle Wilund	8f54b709ce	utils::http: Add optional external credentials to dns_connection_factory init Also allow creating the object using an endpoint expression. Note: this moves code to the .cc file, because it introduces a few more lines, and I feel we have too much stuff in headers as is.	2025-09-01 18:03:44 +00:00
Calle Wilund	0e9e1f7738	utils::rest: Break out request wrapper and send logic Allows sharing some of the wrapping and logic outside the single-call object/routine paths, using it also with an external seastar::http::client, i.e. caching resources across several calls.	2025-09-01 18:03:44 +00:00
Calle Wilund	2b7ad605b3	utils::gcp: Move/add gcp credentials management to shared file Copied from encryption::gcp_host. Light-weight impl of gcp credentials management.	2025-09-01 18:03:44 +00:00
Calle Wilund	f6d7c7e300	utils::rest::client: Add formatter for seastar::http::reply	2025-09-01 18:03:44 +00:00
Calle Wilund	cc1e659abd	utils::rest::client: Add helper routines for simple REST calls Packing headers and unpacking response to json. Usable for esp. gcp interaction.	2025-09-01 18:03:43 +00:00
Calle Wilund	886fcf1759	utils::http: Make shared system trust certificates public So other clients/factories can share.	2025-09-01 18:03:43 +00:00
Ernest Zaslavsky	0e4292adb4	treewide: Add missing license boilerplate Add missing license boilerplate to include headers	2025-09-01 14:58:32 +03:00
Ernest Zaslavsky	19345e539f	treewide: Add missing `#pragma once` Add missing `#pragma once` to include headers	2025-09-01 14:58:21 +03:00
Nadav Har'El	6d1abc5b2c	utils/base64: fix misleading code and comment (no functional change) utils/base64.cc had some strange code with a strange comment in base64_begins_with(). The code had base.substr(operand.size() - 4, operand.size()) The comment claims that this is "last 4 bytes of base64-encoded string", but this comment is misleading - operand is typically shorter than base (this is the whole point of base64_begins_with()), so the real intention of the code is not to find the last 4 bytes of base, but rather the next four bytes after the (operand.size() - 4) which we already copied. These four bytes may need the full power of base64_decode_string() because they may or may not contain padding. But, if we really want the next 4 bytes, why pass operand.size() as the length of the substring? operand.size() is at least 4 (it's a multiple of 4, and if it's 0 we returned earlier), but it could be more. We don't need more, we just need 4. It's not really wrong to take more than 4 (so this patch doesn't fix any bug), but it can be wasteful. So this code should be: base.substr(operand.size() - 4, 4) We already have in test/boost/alternator_unit_test.cc a test, test_base64_begins_with, that takes encoded base64 strings up to 12 characters in length (corresponding to decoded strings up to 8 chars), and substrings from length 0 to the base string's length, and checks that base64_begins_with() succeeds. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#25712	2025-09-01 08:57:50 +03:00
Ernest Zaslavsky	05154e131a	cleanup: Add missing `#pragma once` Add missing `#pragma once` to include header Closes scylladb/scylladb#25761	2025-09-01 06:41:57 +03:00
Nadav Har'El	ff91027eac	utils, alternator: fix detection of invalid base-64 This patch fixes an error-path bug in the base-64 decoding code in utils/base64.cc, which among other things is used in Alternator to decode blobs in JSON requests. The base-64 decoding code has a lookup table, which was wrongly sized 255 bytes, but needed to be 256 bytes. This meant that if the byte 255 (0xFF) was included in an invalid base-64 string, instead of detecting that this is an invalid byte (since the only valid bytes in a base-64 string are A-Z,a-z,0-9,+,/ and =), the code would either think it's valid with a nonsense 6-bit part, or even crash on an out-of-bounds read. Besides the trivial fix, this patch also includes a reproducing test, which tries to write a blob as a supposedly base-64 encoded string with a 0xFF byte in it. The test fails before this patch (the write succeeds, unexpectedly), and passes after this patch (the write fails as expected). The test also passes on DynamoDB. Fixes #25701 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#25705	2025-08-31 15:38:01 +03:00
Avi Kivity	bf9a963582	utils: mark crc barrett tables const They're marked constinit, but constinit does not imply const. Since they're not supposed to be modified, mark them const too. Closes scylladb/scylladb#25539	2025-08-31 11:37:39 +03:00
Avi Kivity	bc5773f777	Merge 'Add out of space prevention mechanisms' from Łukasz Paszkowski When scaling out is delayed or fails, it is crucial to ensure that clusters remain operational and recoverable even under extreme conditions. To achieve this, the following proactive measures are implemented: - reject writes - includes: inserts, updates, deletes, counter updates, hints, read+repair and lwt writes - applicable to: user tables, views, CDC log, audit, cql tracing - stop running compactions/repairs and prevent new ones from starting - reject incoming tablet migrations The aforementioned mechanisms are automatically enabled when a node's disk utilization reaches the critical level (default: 98%) and disabled when the utilization drops below the threshold. Apart from that, the series adds tests that require mounted volumes to simulate out of space. The paths to the volumes can be provided using a pytest argument, i.e. `--space-limited-dirs`. When not provided, tests are skipped. Test scenarios: 1. Start a cluster and write data until one of the nodes reaches 90% of the disk utilization 2. Perform an operation that would take the nodes over 100% 3. The nodes should not exceed the critical disk utilization (98% by default) 4. Scale out the cluster by adding one node per rack 5. Retry or wait for the operation from step 2 The operation is: writing data, running compactions, building materialized views, running repair, migrating tablets (caused by RF change, decommission). The test is successful if no nodes run out of space, the operation from step 2 is aborted/paused/timed out and the operation from step 5 is successful. 
`perf-simple-query --smp 1 -m 1G` results obtained for fixed 400MHz frequency: Read path (before) ``` instructions_per_op: mean= 39661.51 standard-deviation=34.53 median= 39655.39 median-absolute-deviation=23.33 maximum=39708.71 minimum=39622.61 ``` Read path (after) ``` instructions_per_op: mean= 39691.68 standard-deviation=34.54 median= 39683.14 median-absolute-deviation=11.94 maximum=39749.32 minimum=39656.63 ``` Write path (before): ``` instructions_per_op: mean= 50942.86 standard-deviation=97.69 median= 50974.11 median-absolute-deviation=34.25 maximum=51019.23 minimum=50771.60 ``` Write path (after): ``` instructions_per_op: mean= 51000.15 standard-deviation=115.04 median= 51043.93 median-absolute-deviation=52.19 maximum=51065.81 minimum=50795.00 ``` Fixes: https://github.com/scylladb/scylladb/issues/14067 Refs: https://github.com/scylladb/scylladb/issues/2871 No backport, as it is a new feature. Closes scylladb/scylladb#23917 * github.com:scylladb/scylladb: tests/cluster: Add new storage tests test/scylla_cluster: Override workdir when passed via cmdline streaming: Reject incoming migrations storage_service: extend locator::load_stats to collect per-node critical disk utilization flag repair_service: Add a facility to disable the service compaction_manager: Subscribe to out of space controller compaction_manager: Replace enabled/disabled states with running state database: Add critical_disk_utilization mode database can be moved to disk_space_monitor: add subscription API for threshold-based disk space monitoring docs: Add feature documentation config: Add critical_disk_utilization_level option replica/exceptions: Add a new custom replica exception	2025-08-30 18:47:57 +03:00
Piotr Dulikowski	7ccb50514d	Merge 'Introduce view building coordinator' from Michał Jadwiszczak This patch introduces `view_building_coordinator`, a single entity within the whole cluster responsible for building tablet-based views. The view building coordinator takes a slightly different approach than the existing node-local view builder. The whole process is split into smaller view building tasks, one per each tablet replica of the base table. The coordinator builds one base table at a time and it can choose another when all views of the currently processed base table are built. The tasks are started by setting the `STARTED` state and they are executed by the node-local view building worker. The tasks are scheduled in such a way that each shard processes only one tablet at a time (multiple tasks can be started for a shard on a node because a table can have multiple views but then all tasks have the same base table and tablet (last_token)). Once the coordinator starts the tasks, it sends the `work_on_view_building_tasks` RPC to start the tasks and receive their results. This RPC is resilient to RPC failure or raft leader change, meaning if one RPC call started a batch of tasks but then failed (for instance the raft leader was changed and the caller aborted waiting for the response), the next RPC call will attach itself to the already started batch. The coordinator plugs into handling tablet operations (migration/resize/RF change) and adjusts its tasks accordingly. At the start of each tablet operation, the coordinator aborts necessary view building tasks to prevent https://github.com/scylladb/scylladb/issues/21564. Then, new adjusted tasks are created at the end of the operation. If the operation fails at any moment, aborted tasks are rolled back. The view building coordinator can also handle staging sstables using process_staging view building tasks. 
We do this because we don't want to start generating view updates from a staging sstable prematurely, before the writes are directed to the new replica (https://github.com/scylladb/scylladb/issues/19149). For detailed description check: `docs/dev/view-building-coordinator.md` Fixes https://github.com/scylladb/scylladb/issues/22288 Fixes https://github.com/scylladb/scylladb/issues/19149 Fixes https://github.com/scylladb/scylladb/issues/21564 Fixes https://github.com/scylladb/scylladb/issues/17603 Fixes https://github.com/scylladb/scylladb/issues/22586 Fixes https://github.com/scylladb/scylladb/issues/18826 Fixes https://github.com/scylladb/scylladb/issues/23930 --- This PR is reimplementation of https://github.com/scylladb/scylladb/pull/21942 Closes scylladb/scylladb#23760 * github.com:scylladb/scylladb: test/cluster: add view build status tests test/cluster: add view building coordinator tests utils/error_injection: allow to abort `injection_handler::wait_for_message()` test: adjust existing tests utils/error_injection: add injection with `sleep_abortable()` db/view/view_builder: ignore `no_such_keyspace` exception docs/dev: add view building coordinator documentation db/view/view_building_worker: work on `process_staging` tasks db/view/view_building_worker: register staging sstable to view building coordinator when needed db/view/view_building_worker: discover staging sstables db/view/view_building_worker: add method to register staging sstable db/view/view_update_generator: add method to process staging sstables instantly db/view/view_update_generator: extract generating updates from staging sstables to a method db/view/view_update_generator: ignore tablet-based sstables db/view/view_building_coordinator: update view build status on node join/left db/view/view_building_coordinator: handle tablet operations db/view: add view building task mutation builder service/topology_coordinator: run view building coordinator db/view: introduce `view_building_coordinator` 
db/view/view_building_worker: update built views locally db/view: introduce `view_building_worker` db/view: extract common view building functionalities db/view: prepare to create abstract `view_consumer` message/messaging_service: add `work_on_view_building_tasks` RPC service/topology_coordinator: make `term_changed_error` public db/schema_tables: create/cleanup tasks when an index is created/dropped service/migration_manager: cleanup view building state on drop keyspace service/migration_manager: cleanup view building state on drop view service/migration_manager: create view building tasks on create view test/boost: enable proxy remote in some tests service/migration_manager: pass `storage_proxy` to `prepare_keyspace_drop_announcement()` service/migration_manager: coroutinize `prepare_new_view_announcement()` service/storage_proxy: expose references to `system_keyspace` and `view_building_state_machine` service: reload `view_building_state_machine` on group0 apply() service/vb_coordinator: add currently processing base db/system_keyspace: move `get_scylla_local_mutation()` up db/system_keyspace: add `view_building_tasks` table db/view: add view_building_state and views_state db/system_keyspace: add method to get view build status map db/view: extract `system.view_build_status_v2` cql statements to system_keyspace db/system_keyspace: move `internal_system_query_state()` function earlier db/view: ignore tablet-based views in `view_builder` gms/feature_service: add VIEW_BUILDING_COORDINATOR feature	2025-08-29 17:28:44 +02:00
Dario Mirovic	51995af258	transport: replace throwing protocol_exception with returns Replace throwing `protocol_exception` with returning it as a result or an exceptional future in the transport server module. The goal is to improve performance. Most of the `protocol_exception` throws were made from `fragmented_temporary_buffer` module, by passing `exception_thrower()` to its `read*` methods. `fragmented_temporary_buffer` is changed so that it now accepts an exception creator, not exception thrower. `fragmented_temporary_buffer_concepts::ExceptionCreator` concept replaced `fragmented_temporary_buffer_concepts::ExceptionThrower` and all methods that have been throwing now return failed result of type `utils::result_with_exception_ptr`. This change is then propagated to the callers. The scope of this patch is `protocol_exception`, so commitlog just calls `.value()` method on the result. If the result failed, that will throw the exception from the result, as defined by `utils::result_with_exception_ptr_throw_policy`. This means that the behavior of commitlog module stays the same. transport server module handles results gracefully. All the caller functions that return non-future value `T` now return `utils::result_with_exception_ptr<T>`. When the caller is a function that returns a future, and it receives failed result, `make_exception_future(std::move(failed_result).value())` is returned. The rest of the callstack up to the transport server `handle_error` function is already working without throwing, and that's how zero throws is achieved. Fixes: #24567	2025-08-28 23:31:36 +02:00
Dario Mirovic	f01efd822e	utils: add result_with_exception_ptr Add `result_with_exception_ptr` result type. Successful result has user specified type. Failed result has std::exception_ptr. This approach is simpler than `result_with_exception`. It does not require user to pass exception types as variadic template through all the callstack. Specific exception type can still be accessed without costly std::rethrow_exception(eptr) by using `try_catch`, if configured so via `USE_OPTIMIZED_EXCEPTION_HANDLING`. This means no information loss, but less verbosity when writing result types. Refs: #24567	2025-08-28 23:31:04 +02:00
Łukasz Paszkowski	3e740d25b5	disk_space_monitor: add subscription API for threshold-based disk space monitoring Introduce the `subscribe` method to disk_space_monitor, allowing clients to register callbacks triggered when disk utilization crosses a configurable threshold. The API supports flexible trigger options, including notifications on threshold crossing and direction (above/below). This enables more granular and efficient disk space monitoring for consumers.	2025-08-28 18:06:37 +02:00
Radosław Cybulski	01bb7b629a	build: add precompiled headers to CMakeLists.txt Add precompiled header support to CMakeLists.txt and configure.py - it improves compilation time by approximately 10%. New header `stdafx.hh` is added, don't include it manually - the compiler will include it for you. The header contains includes from external libraries used by Scylla - seastar, standard library, linux headers and zlib. The feature is enabled by default, use CMake option `Scylla_USE_PRECOMPILED_HEADER` or configure.py --disable-precompiled-header to disable. The feature should be disabled, when trying to check headers - otherwise you might get false negatives on missing includes from seastar / abseil and so on. Note: following configuration needs to be added to ccache.conf: sloppiness = pch_defines,time_macros Closes #25182	2025-08-27 21:37:54 +03:00
Michał Jadwiszczak	90b5b2c5f5	utils/error_injection: allow to abort `injection_handler::wait_for_message()`	2025-08-27 10:23:04 +02:00
Michał Jadwiszczak	6056b55309	utils/error_injection: add injection with `sleep_abortable()`	2025-08-27 10:23:04 +02:00
Botond Dénes	f8b79d563a	Merge 's3: Minor refactoring and beautification of S3 client and tests' from Ernest Zaslavsky This pull request introduces minor code refactoring and aesthetic improvements to the S3 client and its associated test suite. The changes focus on enhancing readability, consistency, and maintainability without altering any functional behavior. No backport is required, as the modifications are purely cosmetic and do not impact functionality or compatibility. Closes scylladb/scylladb#25490 * github.com:scylladb/scylladb: s3_client: relocate `req` creation closer to usage s3_client: reformat long logging lines for readability s3_test: extract file writing code to a function	2025-08-18 18:48:42 +03:00
Avi Kivity	96956e48c4	Merge 'utils: stall_free: detect clear_gently method of const payload types' from Benny Halevy Currently, when a container or smart pointer holds a const payload type, utils::clear_gently does not detect the object's clear_gently method as the method is non-const and requires a mutable object, as in the following example in class tablet_metadata: ``` using tablet_map_ptr = foreign_ptr<lw_shared_ptr<const tablet_map>>; using table_to_tablet_map = std::unordered_map<table_id, tablet_map_ptr>; ``` That said, when a container is cleared gently the elements it holds are destroyed anyhow, so we'd like to allow to clear them gently before destruction. This change still doesn't allow directly calling utils::clear_gently on const objects. And respective unit tests. Fixes #24605 Fixed #25026 * This is an optimization that is not strictly required to backport (as https://github.com/scylladb/scylladb/pull/24618 dealt with clear_gently of `tablet_map_ptr = foreign_ptr<lw_shared_ptr<const tablet_map>>` well enough) Closes scylladb/scylladb#24606 * github.com:scylladb/scylladb: utils: stall_free: detect clear_gently method of const payload types utils: stall_free: clear gently a foreign shared ptr only when use_count==1	2025-08-18 12:52:02 +03:00
Ernest Zaslavsky	a0016bd0cc	s3_client: relocate `req` creation closer to usage Move the creation of the `req` object to the point where it is actually used, improving code clarity and reducing premature initialization.	2025-08-14 16:18:43 +03:00
Ernest Zaslavsky	6ef2b0b510	s3_client: reformat long logging lines for readability Break up excessively long logging statements to improve readability and maintain consistent formatting across the codebase.	2025-08-14 16:18:43 +03:00
Ernest Zaslavsky	dd51e50f60	s3_client: add memory fallback in `chunked_download_source` Introduce fallback logic in `chunked_download_source` to handle memory exhaustion. When memory is low, feed the `deque` with only one uncounted buffer at a time. This allows slow but steady progress without getting stuck on the memory semaphore. Fixes: https://github.com/scylladb/scylladb/issues/25453 Fixes: https://github.com/scylladb/scylladb/issues/25262 Closes scylladb/scylladb#25452	2025-08-14 09:52:10 +03:00
Ernest Zaslavsky	380c73ca03	s3_client: make memory semaphore acquisition abortable Add `abort_source` to the `get_units` call for the memory semaphore in the S3 client, allowing the acquisition process to be aborted. Fixes: https://github.com/scylladb/scylladb/issues/25454 Closes scylladb/scylladb#25469	2025-08-13 08:48:55 +03:00
Benny Halevy	23ac80fc6b	utils: stall_free: detect clear_gently method of const payload types Currently, when a container or smart pointer holds a const payload type, utils::clear_gently does not detect the object's clear_gently method as the method is non-const and requires a mutable object, as in the following example in class tablet_metadata: ``` using tablet_map_ptr = foreign_ptr<lw_shared_ptr<const tablet_map>>; using table_to_tablet_map = std::unordered_map<table_id, tablet_map_ptr>; ``` That said, when a container is cleared gently the elements it holds are destroyed anyhow, so we'd like to allow to clear them gently before destruction. This change still doesn't allow directly calling utils::clear_gently on const objects. And respective unit tests. Fixes #24605 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-08-11 14:22:01 +03:00
Benny Halevy	cb9db2f396	utils: stall_free: clear gently a foreign shared ptr only when use_count==1 Unlike clear_gently of SharedPtr, clear_gently of a `foreign_ptr<shared_ptr<T>>` calls clear_gently on the contained object even if it's still shared and may still be in use. This change examines the foreign shared pointer's use_count and calls clear_gently on the shared object only when its use_count reaches 1. Fixes #25026 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-08-11 14:21:32 +03:00

2047 Commits