93 Commits

Author SHA1 Message Date
Ernest Zaslavsky
aa9da87e97 encryption: fix deadlock in encrypted_data_source::get()
When encrypted_data_source::get() caches a trailing block in
_next, the next call takes it directly — bypassing
input_stream::read(), which checks _eof. It then calls
input_stream::read_exactly() on the already-drained stream.
Unlike read(), read_up_to(), and consume(), read_exactly()
does not check _eof when the buffer is empty, so it calls
_fd.get() on a source that already returned EOS.

In production this manifested as stuck encrypted SSTable
component downloads during tablet restore: the underlying
chunked_download_source hung forever on the post-EOS get(),
causing 4 tablets to never complete. The stuck files were
always block-aligned sizes (8k, 12k) where _next gets
populated and the source is fully consumed in the same call.

Fix by checking _input.eof() before calling read_exactly().
When the stream already reached EOF, buf2 is known to be
empty, so the call is skipped entirely.

A comprehensive test is added that uses a strict_memory_source
which fails on post-EOS get(), reproducing the exact code
path that caused the production deadlock.
2026-03-19 13:54:54 +02:00
Marcin Maliszkiewicz
975cd60e05 ldap: fix use-after-move crash in ldap_reuser::reap()
After stop() moved _reaper, in-flight with_connection() callbacks could
still call reap(), which accessed the moved-from future causing a
SIGSEGV in future_base::detach_promise(). Add a seastar::gate so
stop() waits for all in-flight operations before moving _reaper.

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1043

Closes scylladb/scylladb#29015
2026-03-12 16:48:45 +02:00
Calle Wilund
8e71a6f52a gcp: Add handling of 429 (too many requests) to exponential backoff
Fixes: SCYLLADB-611

Adds http error code 429 to codes handled by exponential backoff.

Closes scylladb/scylladb#28588
2026-02-19 09:42:39 +01:00
Botond Dénes
19efd7f6f9 Merge 'The system_replicated_keys should be mark as a system keyspace' from Amnon Heiman
This PR marks system_replicated_keys as a system keyspace.
It was missing when the keyspace was added.

A side effect of that is that metrics that are not supposed to be reported are.
Fixes #27903

Closes scylladb/scylladb#27954

* github.com:scylladb/scylladb:
  distributed_loader: system_replicated_keys as system keyspace
  replicated_key_provider: make KSNAME public
2026-01-19 09:37:41 +02:00
Botond Dénes
551eecab63 Merge 'EAR: deprecate the replicated key provider' from Calle Wilund
Refs #22733.

Adds runtime warning and docs info that replicated provider is deprecated and will be removed.

Fixes #27292

Closes scylladb/scylladb#27270

* github.com:scylladb/scylladb:
  docs::encryption: Add warning that replicated provider is deprecated
  ent::encryption: Switch default key provider from replicated to local
  replicated_key_provider: Add deprecation warning on usage
2026-01-14 13:47:23 +02:00
Amnon Heiman
83c1103917 replicated_key_provider: make KSNAME public
Move KSNAME constant from internal static to public member of
replicated_key_provider_factory class.

It will be used to identify it as a system keyspace.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2026-01-02 16:39:51 +02:00
Botond Dénes
85f05fbe1b Revert "Merge 'Add digests for all sstable components in scylla metadata' from Taras Veretilnyk"
This reverts commit 866c96f536, reversing
changes made to 367633270a.

This change caused all longevities to fail, with a crash in parsing
scylla-metadata. The investigation is still ongoing, with no quick fix
in sight yet.

Fixes: #27496

Closes scylladb/scylladb#27518
2025-12-16 11:34:40 +02:00
Botond Dénes
866c96f536 Merge 'Add digests for all sstable components in scylla metadata' from Taras Veretilnyk
This pull request adds support for calculation and storing CRC32 digests for all SSTable components.
This change replaces plain file_writer with crc32_digest_file_writer for all SSTable components that should be checksummed. The resulting component digests are stored in the sstable structure
and later persisted to disk as part of the Scylla metadata component during writer::consume_end_of_stream.
All important SSTable components (Index, Partitions, Rows, Summary, Filter, CompressionInfo, and TOC) are covered.
Several test cases where introduced to verify expected behaviour.

Backport is not required, it is a new feature

Fixes #20100

Closes scylladb/scylladb#27287

* github.com:scylladb/scylladb:
  sstable_test: add verification testcases of SSTable components digests persistance
  sstables: store digest of all sstable components in scylla metadata
  sstables: Add TemporaryScylla metadata component type
  sstables: Extract file writer closing logic into separate methods
  sstables: Add components_digests to scylla metadata components
  sstables: Implement CRC32 digest-only writer
2025-12-05 11:36:50 +02:00
Botond Dénes
367633270a Merge 'EAR: handle IPV6 hosts in KMIP and use shared (improved) http parser in AWS/Azure' from Calle Wilund
Fixes #27367
Fixes #27362
Fixes #27366

Makes http URL parser handle IPv6.
Makes KMIP host setup handle IPv6 hosts + use system trust if no truststore set
Moves Azure/KMS code to use shared http URL parser to avoid same regex everywhere.

Closes scylladb/scylladb#27368

* github.com:scylladb/scylladb:
  ear::kms/ear::azure: Use utils::http URL parsing
  ear::kmip_host: Handle ipv6 hosts + use system trust when not specified
  utils::http: Handle ipv6 numeric host part in URL:s
2025-12-05 10:43:07 +02:00
Calle Wilund
8dd69f02a8 ear::kms/ear::azure: Use utils::http URL parsing
Fixes #27367

Move to reuse shared code.
2025-12-04 11:38:41 +00:00
Calle Wilund
d000fa3335 ear::kmip_host: Handle ipv6 hosts + use system trust when not specified
Fixes #27362

The KMIP host connector should handle ipv4 connections (named or numeric).
It also should fall back to system trust when truststore is not specified.
2025-12-04 11:38:41 +00:00
Calle Wilund
4169bdb7a6 encryption::gcp_host: Add exponential retry for server errors
Fixes #27242

Similar to AWS, google services may at times simply return a 503,
more or less meaning "busy, please retry". We rely for most cases
higher up layers to handle said retry, but we cannot fully do so,
because both we reach this code sometimes through paths that do
no such thing, and also because it would be slightly inefficient,
since we'd like to for example control the back-off for auth etc.

This simply changes the existing retry loop in gcp_host to
be a little more forgiving, special case 503 errors and extend
the retry to the auth part, as well as re-use the
exponential_backoff_retry primitive.

v2:
* Avoid backoff if refreshing credentials. Should not add latency due to this.
* Only allow re-auth once per (non-service-failure-backoff) try.
* Add abort source to both request and retry
v3:
* Include timeout and other server errors in retry-backoff
v4:
* Reorder error code handling correctly

Closes scylladb/scylladb#27267
2025-12-04 10:13:37 +02:00
Taras Veretilnyk
d287b054b9 sstables: Add TemporaryScylla metadata component type
Add TemporaryScylla component type to make atomic updates of SSTable Scylla metadata using temporary files
and atomic rename operations possible. This will be needed in further commit to rewrite metadata together with
the statistics component.
2025-12-03 23:40:10 +01:00
Calle Wilund
52cc30e00c ent::encryption: Switch default key provider from replicated to local
Since we are deprecating the replicated provider, it makes little sense to
have it be default.
2025-12-01 09:44:44 +00:00
Calle Wilund
ba38d58539 replicated_key_provider: Add deprecation warning on usage
Warns user utilizing the provider that the provider is deprecated
and will be removed.
2025-12-01 09:44:44 +00:00
Radosław Cybulski
d589e68642 Add precompiled headers to CMakeLists.txt
Add precompiled header support to CMakeLists.txt and configure.py -
it improves compilation time by approximately 10%.

New header `stdafx.hh` is added, don't include it manually -
the compiler will include it for you. The header contains includes from
external libraries used by Scylla - seastar, standard library,
linux headers and zlib.

The feature is enabled by default, use CMake option `Scylla_USE_PRECOMPILED_HEADER`
or configure.py --disable-precompiled-header to disable.

The feature should be disabled, when trying to check headers - otherwise
you might get false negatives on missing includes from seastar / abseil and so on.

Note: following configuration needs to be added to ccache.conf:

    sloppiness = pch_defines,time_macros,include_file_mtime,include_file_ctime

Closes scylladb/scylladb#26617
2025-11-21 12:27:41 +02:00
Botond Dénes
6f6ee5581e Merge 'encryption::kms_host: Add exponential backoff-retry for 503 errors' from Calle Wilund
Refs #26822

AWS says to treat 503 errors, at least in the case of ec2 metadata query, as backoff-retry (generally, we do _not_ retry on provider level, but delegate this to higher levels). This patch adds special treatment for 503:s (service unavailable) for both ec2 meta and    actual endpoint, doing exponential backoff.

Note: we do _not_ retry forever.
Not tested as such, since I don't get any errors when testing (doh!). Should try to set up a mock ec2 meta with injected errors maybe.

Closes scylladb/scylladb#26934

* github.com:scylladb/scylladb:
  encryption::kms_host: Add exponential backoff-retry for 503 errors
  encryption::kms_host: Include http error code in kms_error
2025-11-12 08:33:33 +02:00
Calle Wilund
d22e0acf0b encryption::kms_host: Add exponential backoff-retry for 503 errors
Refs #26822

AWS says to treat 503 errors, at least in the case of ec2 metadata
query, as backoff-retry (generally, we do _not_ retry on provider
level, but delegate this to higher levels). This patch adds special
treatment for 503:s (service unavailable) for both ec2 meta and
actual endpoint, doing exponential backoff.

Note: we do _not_ retry forever.
Not tested as such, since I don't get any errors when testing
(doh!). Should try to set up a mock ec2 meta with injected errors
maybe.

v2:
* Use utils::exponential_backoff_retry
2025-11-11 21:02:32 +00:00
Calle Wilund
190e3666cb encryption::kms_host: Include http error code in kms_error
Keep track of actual HTTP failure.
2025-11-11 21:02:32 +00:00
Calle Wilund
af85b7f61b ent::encryption: Make text helper routines take std::string
Moving away from custom string type. Pure cosmetics.
2025-11-05 10:22:22 +00:00
Avi Kivity
ab488fbb3f Merge 'Switch to seastar API level 9 (no more packet-s in output_stream/data_sink API)' from Pavel Emelyanov
Other than patching Scylla sinks to implement new data_sink_impl::put(std::span<temporary_buffer>) overload, the PR changes transport write_response() method to stop using output_stream::write(scattered_message) because it's also gone.

Using newer seastar API, no need to backport

Closes scylladb/scylladb#26592

* github.com:scylladb/scylladb:
  code: Fix indentation after previous patch
  code: Switch to seastar API level 9
  transport: Open-code invoke_with_counting into counting_data_sink::put
  transport: Don't use scattered_message
  utils: Implement memory_data_sink::put(net::packet)
2025-10-22 01:51:43 +03:00
Pavel Emelyanov
44ed3bbb7c Merge 'RFC: Initial GCP storage backend for scylla (sstables + backup)' from Calle Wilund
Integrates GCP object storage as a working storage backend for scylla sstables as well as backup storage.

Adds an abstraction layer (atm very heavily designed around the s3 client interface and usage) to allow the "storage" etc layers of sstable management to pick transparently between "s3" and "gs" providers.

This modifies the scylla config such that endpoints can optionally (through a "type" param) ref a GS backend.
Similarly with storage_options.

Also adds some IO wrapping primitives to make it more feasible to place some logic at a mid level of the implementation stack (such as making networked storage files, ranged reading etc).

Test s3 fixture is replaced (where appropriate) with an `object_storage` fixture that multiplexes the test across both backends.
Unit tests are duplicated and for the GS versions use a boost test fixture for GCS, default local fake.

Fixes #25359
Fixes #26453

Closes scylladb/scylladb#26186

* github.com:scylladb/scylladb:
  docs::dev::object_storage: Add some initial info on GS storage
  docs/dev: Add mention of (nested) docker usage in testing.md
  sstables::object_storage_client: Forward memory limit semaphore to GS instance
  utils::gcp::object_storage: Add optional memory limits to up/download
  sstables::object_storage_client: Add multi-upload support for GS
  utils::gcp::storage: Add merge objects operation
  test_backup/test_basic: Make tests multiplex both s3 and gs backends
  test::cluster::conftest: Add support for multiple object storage backends
  boost::gcs_storage_test: reindent
  boost::gcs_storage_test: Convert to use fixture
  tests::boost: Add GS object storage cases to mirror S3 ones
  tests::lib::gcs_fixture: Add a reusable test fixture for real/fake GS/GCS
  tests::lib::test_utils: Add overloads/helpers for reading and (temp) writing env
  sstables::object_storage_client: Add google storage implementation
  test_services: Allow testing with GS object storage parameters
  utils::gcp::gcp_credentials: Add option to create uninitialized credentials
  utils::gcp::object_storage: Make create_download_source return seekable_data_source
  utils::gcp::object_storage: Add defensive copies of string_view params
  utils::gcp::object_storage: Add missing retry backoff increate
  utils::gcp::object_storage: Add timestamp to object listing
  utils::gcp::object_storage: Add paging support to list_objects
  object_storage_client: Add object_name wrapper type
  utils::gcp::object_storage: Add optional abort_source
  utils::rest::client: Add abort_source support
  sstables: Use object_storage_client for remote storage
  sstables::object_storage_client: Add abstraction layer for OS cliens (s3 initial)
  s3::upload_progress: Promote to general util type
  storage_options: Abstract s3 to "object_storage" and add gs as option
  sstables::file_io_extension: Change "creator" callback to just data_source
  utils::io-wrappers: Add ranged data_source
  utils::io-wrappers: Add file wrapper type for seekable_source
  utils::seekable_source: Add a seekable IO source type
  object_storage_endpoint_param: Add gs storage as option
  config: break out object_storage_endpoint_param preparing for multi storage
2025-10-20 13:14:53 +03:00
Pavel Emelyanov
a88a36f5b5 code: Switch to seastar API level 9
In the new API the biggest change is to implement the only
data_sink_impl::put(span<temporary_buffer>) overload.

Encrypted file impl and sstables compress sink use fallback_put() helper
that generates a chain of continuations each holding a buffer.

The counting_data_sink in transport had mostly been patched to correct
implementation by the previous patch, the change here is to replace
vector argument with span one.

Most other sinks just re-implement their put(vector<temporary_buffer>)
overload by iterating over span and non-preemptively grabbing buffers
from it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-10-17 10:26:50 +03:00
Gleb Natapov
c255740989 schema: Allow configuring consistency setting for a keyspace
We want to add strongly consistent tables as an option. We will have
two kind of strongly consistent tables: globally consistent and locally
consistent. The former means that requests from all DCs will be globally
linearisable while the later - only requests to the same DCs will be
linearisable.  To allow configuring all the possibilities the patch
adds new parameter to a keyspace definition "consistency" that can be
configured to be `eventual`, `global` or `local`. Non eventual setting
is supported for tablets enabled keyspaces only. Since we want to start
with implementing local consistency configuring global consistency will
result in an error for now.
2025-10-16 13:34:49 +03:00
Calle Wilund
ac438c61e6 sstables::file_io_extension: Change "creator" callback to just data_source
Because the concept of pushing reading range does not work for the wrapping
we do (i.e. encryption), there is no point having it here. We need to do
said range handling higher up.
Also, must allow multi-layered wrapping.
2025-10-13 08:53:25 +00:00
Pavel Emelyanov
8002ddf946 code: Use tls_options::bye_timeout instead of deprecated switch
Some code wants its TLS sockets to close immediately without sending BYE
message and waiting for the response. Recent seastar update changed the
way this functionality is requested (scylladb/seastar#2986)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#26253
2025-10-06 16:25:35 +03:00
Michał Chojnowski
b1984d6798 sstables: implement an alternative way to rebuild bloom filters for sstables without Index
For efficiency, the cardinality of the bloom filter
(i.e. the number of partition keys which will be written into the sstable)
has to be known before elements are inserted into the filter.

In some cases (e.g. memtables flush) this number is known exactly.
But in others (e.g. repair) it can only be estimated,
and the estimation might be very wrong, leading to an oversized filter.

Because of that, some time ago we added a piece of logic
(ran after the sstable is written, but before it's sealed)
which looks at the actual number of written partitions,
compares it to the initial estimate (on which the size of the bloom
filter was based on), and if the difference is unacceptably large,
it rewrites the bloom filter from partition keys contained in Index.db.

But the idea to rebuild the bloom filters from index files
isn't going to work with BTI indexes, because they don't store
whole partition keys. If we want sstables which don't have Index.db
files, we need some other way to deal with oversized filters.
Partition keys can be recovered from Data.db,
but that would often be way too expensive.

This patch adds another way. We introduce a new component file,
TemporaryHashes. This component, if written at all,
contains the 16-byte murmur hash for every partition key, in order,
and can be used in place of Index to reconstruct the bloom filter.

(Our bloom filters are actually built from the set of murmur hashes of
partition keys. The first step of inserting a partition key into a
filter is hashing the key. Remembering the hashes is sufficient
to build the filter later, without looking at partition keys again.)

As of this patch, if the Index component is not being written,
we don't allocate and populate a bloom filter during the Data.db write.
Instead, we write the murmur hashes to TemporaryHashes, and only
later, after the Data write finishes, we allocate the optimal-size,
bloom filter, we read the hashes back from TemporaryHashes,
and we populate the filter with them.

That is suboptimal.
Writing the hashes to disk (or worse, to S3) and reading
them back is more expensive than building the bloom filter
during the main Data pass.
So ideally it should be avoided in cases where we know
in advance that the partition key count estimate is good enough.
(Which should be the case in flushes and compactions).
But we defer that to a future patch.
(Such a change would involve passing some flag to the sstable writer
if the cardinality estimate is trustworthy, and not creating
TemporaryHashes if the estimate is trustworthy).
2025-09-29 13:01:21 +02:00
Michał Chojnowski
18875621e8 sstables: introduce Partition and Rows component types
BTI indexes are made up of Partition.db and Rows.db files.
In this patch we introduce the corresponding component types.

In Cassandra, BTI is a separate "sstable format", with a new set
of versions. (I.e. `bti-da`, as opposed to `big-me`).

In this patch series, we are doing something different:
we are introducing version `ms`, which is like `me`, except with
`Index.db` and `Summary.db` replaced with `Partitions.db` and `Rows.db`.

With a setup like that, Scylla won't yet be able to read Cassandra's
BTI (`da`) files, because this patch doesn't teach Scylla
about `da`.
(But the way to that is open. It would just require first implementing
several other things which changed between `me` and `da`).

(And, naturally Cassandra will reject `ms` sstables.
But this isn't the first time we are breaking file
compatibility with Cassandra to some degree.
Other examples include encryption and dictionary compression).

Note: Partitions.db and Rows.db contain prefixes of keys,
which is sensitive information, so they have to be encrypted.
2025-09-29 13:01:21 +02:00
Pavel Emelyanov
a1ea553fe1 code: Replace distributed<> with sharded<>
The latter is recommended in seastar, and the former was left as
compatibility alias. Latest seastar explicitly marks it as deprecated so
once the submodule is updated, compilation logs will explode.

Most of the patch is generated with

    for f in $(git grep -l '\<distributed<[A-Za-z0-9:_]*>') ; do sed -e 's/\<distributed<\([A-Za-z0-9:_]*\)>/sharded<\1>/g' -i $f; done
    for f in $(git grep -l distributed.hh); do sed -e 's/distributed.hh/sharded.hh/' -i $f ; done

and a small manual change in test/perf/perf.hh

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#26136
2025-09-19 12:22:51 +02:00
Ernest Zaslavsky
e56081d588 treewide: seastar module update and fix broken rest client
start using `write_body` in `rest/client` to properly set headers due to changes applied to seastar's http client

Seastar module update
```
b6be384e Merge 'http: generalize Content-Type setting' from Nadav Har'El
74472298 http: generalize request's Content-Type setting
9fd5a1cc http: generalize reply's Content-Type setting
a2665f38 memory: Remove deprecated enable_abort_on_allocation_failure()
d2a5a8a9 resource.cc: Remove some dead code
7ad9f424 http: Add support of multiple key repetitions for the request
a636baca task: Move task::get_backtrace() definition in its class
a0101efa Fixed "doxygen" spelling in error message
db969482 Merge 'http/reply: introduce set_cookie()' from Botond Dénes
5357b434 http/reply: introduce set_cookie()
1ddcf05f http/reply: make write_reply*() public
4b782d73 http/connection: start_response(): fix indentation
720feca0 http/reply: encapsulate reply writing in write_reply()
3e19917d Merge 'exceptions: log thrown and propagated exception with distinct log levels' from Botond Dénes
db9aea93 Merge 'Correctly wrap up abandoned yielding directory lister' from Pavel Emelyanov
dbb2bf3f test: Add test for input_stream::read_exactly()
a5308ec9 file/directory_lister: Correctly wrap up fallback generator
4f0811f4 file/directory_lister: Convert on-stack queue to shared pointer
59801da7 tests: Add directory lister early drop cases
33233032 http/reply: s/write_reply_to_connection/write_reply/
69b93620 http/reply: write_reply_{to_connection,headers}(): pass output stream
56e9bda7 test: Convert directory_test into seastar test
96782358 Merge 'Improve io_tester's seqwrite and append workloads' from Pavel Emelyanov
8b46e3d4 SEASTAR_ASSERT: assert to stderr and flush stream
3370e22a tutorial.md: use current_exception_as_future()
e977453a Add fixture support for seastar::testing
3e70d7f7 io_tester: Do not set append_is_unlikely unconditionally
2a4ae7b4 io_tester: Count file size overflows
5e678bb5 io_tester: Tuneup size overflow check
d5dad8ce io_tester: Move position management code to io_class_data
5586a056 io_tester: Rename seqwrite -> overwrite
92df2fb2 io_tester: Relax return value of create_and_fill_file()
03d9500d io_tester: Dont fill file for APPEND
d6844a7b io_tester: Indentation fix after previous patch
fb9e0088 io_tester: Coroutinize create_and_fill_file()
2f802f57 exceptions: log thrown and propagated exception with distinct log levels
4971fa70 util: move log-level into own header
39448fc1 Merge 'Fix and tune http::request setup by client' from Pavel Emelyanov
52d0c4fb iostream: Move output_stream::write(scattered_message) lower
7a52f734 Merge 'read_first_line: Missing pragma and licence' from Ernest Zaslavsky
d0881b7e read_first_line: Add missing license boilerplate
988a0e99 read_first_line:: Add missing `#pragma once`
42675266 http: Make client::make_request accept const request&
c7709fb5 http: Make request making API return exceptional future not throw
b68ed89b http: Move request content length header setup
1d96dac6 http: Move request version configuration
072e86f6 http: Setup request once
```

Closes scylladb/scylladb#25915

(cherry picked from commit 44d34663bc)

Closes scylladb/scylladb#26100
2025-09-19 11:40:59 +03:00
Nikos Dragazis
78bcecd570 utils: azure: Bump default credential logs from DEBUG to INFO
The default credential provider produces diagnostic logs on each step as
it walks through the credential chain. These logs are useful for
operators to diagnose authentication problems as they expose information
about which credential sources are being evaluated, in which order, why
they fail, and which source is eventually selected.

Promote them from DEBUG to INFO level.

Additionally, concatenate the logs for environment credentials into a
single log statement to avoid interleaving with other logs.

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
2025-09-16 15:20:52 +03:00
Botond Dénes
ee7c85919e Revert "treewide: seastar module update and fix broken rest client"
This reverts commit 44d34663bc of PR
https://github.com/scylladb/scylladb/pull/25915.

Breaks articact tests on ARM, blocking us from building new images from
master.
2025-09-16 08:31:08 +03:00
Ernest Zaslavsky
44d34663bc treewide: seastar module update and fix broken rest client
start using `write_body` in `rest/client` to properly set headers due to changes applied to seastar's http client

Seastar module update
```
b6be384e Merge 'http: generalize Content-Type setting' from Nadav Har'El
74472298 http: generalize request's Content-Type setting
9fd5a1cc http: generalize reply's Content-Type setting
a2665f38 memory: Remove deprecated enable_abort_on_allocation_failure()
d2a5a8a9 resource.cc: Remove some dead code
7ad9f424 http: Add support of multiple key repetitions for the request
a636baca task: Move task::get_backtrace() definition in its class
a0101efa Fixed "doxygen" spelling in error message
db969482 Merge 'http/reply: introduce set_cookie()' from Botond Dénes
5357b434 http/reply: introduce set_cookie()
1ddcf05f http/reply: make write_reply*() public
4b782d73 http/connection: start_response(): fix indentation
720feca0 http/reply: encapsulate reply writing in write_reply()
3e19917d Merge 'exceptions: log thrown and propagated exception with distinct log levels' from Botond Dénes
db9aea93 Merge 'Correctly wrap up abandoned yielding directory lister' from Pavel Emelyanov
dbb2bf3f test: Add test for input_stream::read_exactly()
a5308ec9 file/directory_lister: Correctly wrap up fallback generator
4f0811f4 file/directory_lister: Convert on-stack queue to shared pointer
59801da7 tests: Add directory lister early drop cases
33233032 http/reply: s/write_reply_to_connection/write_reply/
69b93620 http/reply: write_reply_{to_connection,headers}(): pass output stream
56e9bda7 test: Convert directory_test into seastar test
96782358 Merge 'Improve io_tester's seqwrite and append workloads' from Pavel Emelyanov
8b46e3d4 SEASTAR_ASSERT: assert to stderr and flush stream
3370e22a tutorial.md: use current_exception_as_future()
e977453a Add fixture support for seastar::testing
3e70d7f7 io_tester: Do not set append_is_unlikely unconditionally
2a4ae7b4 io_tester: Count file size overflows
5e678bb5 io_tester: Tuneup size overflow check
d5dad8ce io_tester: Move position management code to io_class_data
5586a056 io_tester: Rename seqwrite -> overwrite
92df2fb2 io_tester: Relax return value of create_and_fill_file()
03d9500d io_tester: Dont fill file for APPEND
d6844a7b io_tester: Indentation fix after previous patch
fb9e0088 io_tester: Coroutinize create_and_fill_file()
2f802f57 exceptions: log thrown and propagated exception with distinct log levels
4971fa70 util: move log-level into own header
39448fc1 Merge 'Fix and tune http::request setup by client' from Pavel Emelyanov
52d0c4fb iostream: Move output_stream::write(scattered_message) lower
7a52f734 Merge 'read_first_line: Missing pragma and licence' from Ernest Zaslavsky
d0881b7e read_first_line: Add missing license boilerplate
988a0e99 read_first_line:: Add missing `#pragma once`
42675266 http: Make client::make_request accept const request&
c7709fb5 http: Make request making API return exceptional future not throw
b68ed89b http: Move request content length header setup
1d96dac6 http: Move request version configuration
072e86f6 http: Setup request once
```

Closes scylladb/scylladb#25915
2025-09-13 17:14:28 +03:00
Radosław Cybulski
436150eb52 treewide: fix spelling errors
Fix spelling errors reported by copilot on github.
Remove single use namespace alias.

Closes scylladb/scylladb#25960
2025-09-12 15:58:19 +03:00
Radosław Cybulski
c242234552 Revert "build: add precompiled headers to CMakeLists.txt"
This reverts commit 01bb7b629a.

Closes scylladb/scylladb#25735
2025-09-03 09:46:00 +03:00
Avi Kivity
7ed261fc52 Merge 'Inital GCP object storage support' from Calle Wilund
Adds infrastructure and client for interaction with GCP object storage services.

Note: this is just a client object usable for creating, listing, deleting and up/downloading of objects to/from said storage service. It makes no attempt at actually inserting it into the sstable storage flow. That can come later.

This PR breaks out GCP auth and some general REST call functionality into shared routines. Not all code is 100% reused, but at least some.

Test is added, though could be more comprehensive (feel free to suggest a test vector).
Test can run in either local mock server mode (default), or against actual GCP.
See `test/boost/gcp_object_storage_test.cc` for explanation on the config environment vars.
Default is to run the test against a temporary docker deamon.

Closes scylladb/scylladb#24629

* github.com:scylladb/scylladb:
  test::boost::gcp_object_storage_test: Initial unit tests for GCP obj storage
  proc-utils: Re-export waiting types from seastar
  proc-utils: Inherit environment from current process
  utils::gcp::object_storage: Add client for GCP object storage
  utils::http: Add optional external credentials to dns_connection_factory init
  utils::rest: Break out request wrapper and send logic
  encryption::gcp_host: Use shared gcp credentials + REST helpers
  utils::gcp: Move/add gcp credentials management to shared file
  utils::rest::client: Add formatter for seastar::http::reply
  utils::rest::client: Add helper routines for simple REST calls
  utils::http: Make shared system trust certificates public
2025-09-02 14:38:09 +03:00
Calle Wilund
fe4ab7f7bf encryption::gcp_host: Use shared gcp credentials + REST helpers
Removes code in favour of transplanted shared util code.
2025-09-01 18:03:44 +00:00
Ernest Zaslavsky
19345e539f treewide: Add missing #pragma once
Add missing `#pragma once` to include headers
2025-09-01 14:58:21 +03:00
Radosław Cybulski
01bb7b629a build: add precompiled headers to CMakeLists.txt
Add precompiled header support to CMakeLists.txt and configure.py -
it improves compilation time by approximately 10%.

New header `stdafx.hh` is added, don't include it manually -
the compiler will include it for you. The header contains includes from
external libraries used by Scylla - seastar, standard library,
linux headers and zlib.

The feature is enabled by default, use CMake option `Scylla_USE_PRECOMPILED_HEADER`
or configure.py --disable-precompiled-header to disable.

The feature should be disabled, when trying to check headers - otherwise
you might get false negatives on missing includes from seastar / abseil and so on.

Note: following configuration needs to be added to ccache.conf:

    sloppiness = pch_defines,time_macros

Closes #25182
2025-08-27 21:37:54 +03:00
Nikos Dragazis
b1d5a67018 encryption: gcp: Fix the grant type for user credentials
Exchanging a refresh token for an access token requires the
"refresh_token" grant type [1].

[1] https://datatracker.ietf.org/doc/html/rfc6749#section-6

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
2025-08-06 10:39:17 +03:00
Nikos Dragazis
77cc6a7bad encryption: gcp: Expand tilde in pathnames for credentials file
The GCP host searches for application default credentials in known
locations within the user's home directory using
`seastar::file_exists()`. However, this function does not perform tilde
expansion in pathnames.

Replace tildes with the home directory from the HOME environment
variable.

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
2025-08-06 09:46:08 +03:00
Calle Wilund
43f7eecf9e compress: move compress.cc/hh to sstables/compressor
Fixes #22106

Moves the shared compress components to sstables, and rename to
match class type.

Adjust includes, removing redundant/unneeded ones where possible.

Closes scylladb/scylladb#25103
2025-07-31 13:10:41 +03:00
Botond Dénes
837424f7bb Merge 'Add Azure Key Provider for Encryption at Rest' from Nikos Dragazis
This PR introduces a new Key Provider to support Azure Key Vault as a Key Management System (KMS) for Encryption at Rest. The core design principle is the same as in the AWS and GCP key providers - an externally provided Vault key that is used to protect local data encryption keys (a process known as "key wrapping").

In more detail, this patch series consists of:
* Multiple Azure credential sources, offering a variety of authentication options (Service Principals, Managed Identities, environment variables, Azure CLI).
* The Azure host - the Key Vault endpoint bridge.
* The Azure Key Provider - the interface for the Azure host.
* Unit tests using real Azure resources (credentials and Vault keys).
* Log filtering logic to not expose sensitive data in the logs (plaintext keys, credentials, access tokens).

This is part of the overall effort to support Azure deployments.

Testing done:
* Unit tests.
* Manual test on an Azure VM with a Managed Identity.
* Manual test with credentials from Azure CLI.
* Manual test of `--azure-hosts` cmdline option.
* Manual test of log filtering.

Remaining items:
- [x] Create necessary Azure resources for CI.
- [x] Merge pipeline changes (https://github.com/scylladb/scylla-pkg/pull/5201).

Closes https://github.com/scylladb/scylla-enterprise/issues/1077.

New feature. No backport is needed.

Closes scylladb/scylladb#23920

* github.com:scylladb/scylladb:
  docs: Document the Azure Key Provider
  test: Add tests for Azure Key Provider
  pylib: Add mock server for Azure Key Vault
  encryption: Define and enable Azure Key Provider
  encryption: azure: Delegate hosts to shard 0
  encryption: Add Azure host cache
  encryption: Add config options for Azure hosts
  encryption: azure: Add override options
  encryption: azure: Add retries for transient errors
  encryption: azure: Implement init()
  encryption: azure: Implement get_key_by_id()
  encryption: azure: Add id-based key cache
  encryption: azure: Implement get_or_create_key()
  encryption: azure: Add credentials in Azure host
  encryption: azure: Add attribute-based key cache
  encryption: azure: Add skeleton for Azure host
  encryption: Templatize get_{kmip,kms,gcp}_host()
  encryption: gcp: Fix typo in docstring
  utils: azure: Get access token with default credentials
  utils: azure: Get access token from Azure CLI
  utils: azure: Get access token from IMDS
  utils: azure: Get access token with SP certificate
  utils: azure: Get access token with SP secret
  utils: rest: Add interface for request/response redaction logic
  utils: azure: Declare all Azure credential types
  utils: azure: Define interface for Azure credentials
  utils: Introduce base64url_{encode,decode}
2025-07-25 10:45:32 +03:00
Ernest Zaslavsky
0053a4f24a encryption: remove default case from component_type switch
Do not use default, instead list all fall-through components
explicitly, so if we add a new one, the developer doing that
will be forced to consider what to do here.

Eliminate the `default` case from the switch in
`encryption_file_io_extension::wrap_sink`, and explicitly
handle all `component_type` values within the switch statement.

fixes: https://github.com/scylladb/scylladb/issues/23724

Closes scylladb/scylladb#24987
2025-07-21 14:43:12 +03:00
Ernest Zaslavsky
408aa289fe treewide: Move misc files to utils directory
As requested in #22114, moved the files and fixed other includes and build system.

Moved files:
- interval.hh
- Map_difference.hh

Fixes: #22114

This is a cleanup, no need to backport

Closes scylladb/scylladb#25095
2025-07-21 11:56:40 +03:00
Nikos Dragazis
41b63469e1 encryption: Define and enable Azure Key Provider
Define the Azure Key Provider to connect the core EaR business logic
with the Azure-based Key Management implementation (Azure host).

Introduce "AzureKeyProviderFactory" as a new `key_provider` value in the
configuration.

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
2025-07-16 17:14:09 +03:00
Nikos Dragazis
f0927aac07 encryption: azure: Delegate hosts to shard 0
As in the AWS and GCP hosts, make all Azure hosts delegate their traffic
to shard 0 to avoid creating too many data encryption keys and API
calls to Key Vault.

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
2025-07-16 17:14:09 +03:00
Nikos Dragazis
339992539d encryption: Add Azure host cache
The encryption context maintains a cache per host type per thread.
Add a cache for the Azure host as well. Initialize the cache with Azure
hosts from the configuration, while registering the extensions for
encryption.

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
2025-07-16 17:14:09 +03:00
Nikos Dragazis
c98d3246b2 encryption: Add config options for Azure hosts
Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
2025-07-16 17:14:09 +03:00
Nikos Dragazis
a1aef456ac encryption: azure: Add override options
Extend `get_or_create_key()` to accept host options that override the
config options. This will be used to pass encryption options from the
table schema. Currently, only the master key can be overridden.

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
2025-07-16 17:14:09 +03:00