Commit Graph

24 Commits

Author SHA1 Message Date
Botond Dénes
85f05fbe1b Revert "Merge 'Add digests for all sstable components in scylla metadata' from Taras Veretilnyk"
This reverts commit 866c96f536, reversing
changes made to 367633270a.

This change caused all longevities to fail, with a crash in parsing
scylla-metadata. The investigation is still ongoing, with no quick fix
in sight yet.

Fixes: #27496

Closes scylladb/scylladb#27518
2025-12-16 11:34:40 +02:00
Taras Veretilnyk
d287b054b9 sstables: Add TemporaryScylla metadata component type
Add TemporaryScylla component type to make atomic updates of SSTable Scylla metadata using temporary files
and atomic rename operations possible. This will be needed in further commit to rewrite metadata together with
the statistics component.
2025-12-03 23:40:10 +01:00
Calle Wilund
af85b7f61b ent::encryption: Make text helper routines take std::string
Moving away from custom string type. Pure cosmetics.
2025-11-05 10:22:22 +00:00
Calle Wilund
ac438c61e6 sstables::file_io_extension: Change "creator" callback to just data_source
Because the concept of pushing reading range does not work for the wrapping
we do (i.e. encryption), there is no point having it here. We need to do
said range handling higher up.
Also, must allow multi-layered wrapping.
2025-10-13 08:53:25 +00:00
Michał Chojnowski
b1984d6798 sstables: implement an alternative way to rebuild bloom filters for sstables without Index
For efficiency, the cardinality of the bloom filter
(i.e. the number of partition keys which will be written into the sstable)
has to be known before elements are inserted into the filter.

In some cases (e.g. memtables flush) this number is known exactly.
But in others (e.g. repair) it can only be estimated,
and the estimation might be very wrong, leading to an oversized filter.

Because of that, some time ago we added a piece of logic
(ran after the sstable is written, but before it's sealed)
which looks at the actual number of written partitions,
compares it to the initial estimate (on which the size of the bloom
filter was based on), and if the difference is unacceptably large,
it rewrites the bloom filter from partition keys contained in Index.db.

But the idea to rebuild the bloom filters from index files
isn't going to work with BTI indexes, because they don't store
whole partition keys. If we want sstables which don't have Index.db
files, we need some other way to deal with oversized filters.
Partition keys can be recovered from Data.db,
but that would often be way too expensive.

This patch adds another way. We introduce a new component file,
TemporaryHashes. This component, if written at all,
contains the 16-byte murmur hash for every partition key, in order,
and can be used in place of Index to reconstruct the bloom filter.

(Our bloom filters are actually built from the set of murmur hashes of
partition keys. The first step of inserting a partition key into a
filter is hashing the key. Remembering the hashes is sufficient
to build the filter later, without looking at partition keys again.)

As of this patch, if the Index component is not being written,
we don't allocate and populate a bloom filter during the Data.db write.
Instead, we write the murmur hashes to TemporaryHashes, and only
later, after the Data write finishes, we allocate the optimal-size,
bloom filter, we read the hashes back from TemporaryHashes,
and we populate the filter with them.

That is suboptimal.
Writing the hashes to disk (or worse, to S3) and reading
them back is more expensive than building the bloom filter
during the main Data pass.
So ideally it should be avoided in cases where we know
in advance that the partition key count estimate is good enough.
(Which should be the case in flushes and compactions).
But we defer that to a future patch.
(Such a change would involve passing some flag to the sstable writer
if the cardinality estimate is trustworthy, and not creating
TemporaryHashes if the estimate is trustworthy).
2025-09-29 13:01:21 +02:00
Michał Chojnowski
18875621e8 sstables: introduce Partition and Rows component types
BTI indexes are made up of Partition.db and Rows.db files.
In this patch we introduce the corresponding component types.

In Cassandra, BTI is a separate "sstable format", with a new set
of versions. (I.e. `bti-da`, as opposed to `big-me`).

In this patch series, we are doing something different:
we are introducing version `ms`, which is like `me`, except with
`Index.db` and `Summary.db` replaced with `Partitions.db` and `Rows.db`.

With a setup like that, Scylla won't yet be able to read Cassandra's
BTI (`da`) files, because this patch doesn't teach Scylla
about `da`.
(But the way to that is open. It would just require first implementing
several other things which changed between `me` and `da`).

(And, naturally Cassandra will reject `ms` sstables.
But this isn't the first time we are breaking file
compatibility with Cassandra to some degree.
Other examples include encryption and dictionary compression).

Note: Partitions.db and Rows.db contain prefixes of keys,
which is sensitive information, so they have to be encrypted.
2025-09-29 13:01:21 +02:00
Pavel Emelyanov
a1ea553fe1 code: Replace distributed<> with sharded<>
The latter is recommended in seastar, and the former was left as
compatibility alias. Latest seastar explicitly marks it as deprecated so
once the submodule is updated, compilation logs will explode.

Most of the patch is generated with

    for f in $(git grep -l '\<distributed<[A-Za-z0-9:_]*>') ; do sed -e 's/\<distributed<\([A-Za-z0-9:_]*\)>/sharded<\1>/g' -i $f; done
    for f in $(git grep -l distributed.hh); do sed -e 's/distributed.hh/sharded.hh/' -i $f ; done

and a small manual change in test/perf/perf.hh

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#26136
2025-09-19 12:22:51 +02:00
Calle Wilund
43f7eecf9e compress: move compress.cc/hh to sstables/compressor
Fixes #22106

Moves the shared compress components to sstables, and rename to
match class type.

Adjust includes, removing redundant/unneeded ones where possible.

Closes scylladb/scylladb#25103
2025-07-31 13:10:41 +03:00
Botond Dénes
837424f7bb Merge 'Add Azure Key Provider for Encryption at Rest' from Nikos Dragazis
This PR introduces a new Key Provider to support Azure Key Vault as a Key Management System (KMS) for Encryption at Rest. The core design principle is the same as in the AWS and GCP key providers - an externally provided Vault key that is used to protect local data encryption keys (a process known as "key wrapping").

In more detail, this patch series consists of:
* Multiple Azure credential sources, offering a variety of authentication options (Service Principals, Managed Identities, environment variables, Azure CLI).
* The Azure host - the Key Vault endpoint bridge.
* The Azure Key Provider - the interface for the Azure host.
* Unit tests using real Azure resources (credentials and Vault keys).
* Log filtering logic to not expose sensitive data in the logs (plaintext keys, credentials, access tokens).

This is part of the overall effort to support Azure deployments.

Testing done:
* Unit tests.
* Manual test on an Azure VM with a Managed Identity.
* Manual test with credentials from Azure CLI.
* Manual test of `--azure-hosts` cmdline option.
* Manual test of log filtering.

Remaining items:
- [x] Create necessary Azure resources for CI.
- [x] Merge pipeline changes (https://github.com/scylladb/scylla-pkg/pull/5201).

Closes https://github.com/scylladb/scylla-enterprise/issues/1077.

New feature. No backport is needed.

Closes scylladb/scylladb#23920

* github.com:scylladb/scylladb:
  docs: Document the Azure Key Provider
  test: Add tests for Azure Key Provider
  pylib: Add mock server for Azure Key Vault
  encryption: Define and enable Azure Key Provider
  encryption: azure: Delegate hosts to shard 0
  encryption: Add Azure host cache
  encryption: Add config options for Azure hosts
  encryption: azure: Add override options
  encryption: azure: Add retries for transient errors
  encryption: azure: Implement init()
  encryption: azure: Implement get_key_by_id()
  encryption: azure: Add id-based key cache
  encryption: azure: Implement get_or_create_key()
  encryption: azure: Add credentials in Azure host
  encryption: azure: Add attribute-based key cache
  encryption: azure: Add skeleton for Azure host
  encryption: Templatize get_{kmip,kms,gcp}_host()
  encryption: gcp: Fix typo in docstring
  utils: azure: Get access token with default credentials
  utils: azure: Get access token from Azure CLI
  utils: azure: Get access token from IMDS
  utils: azure: Get access token with SP certificate
  utils: azure: Get access token with SP secret
  utils: rest: Add interface for request/response redaction logic
  utils: azure: Declare all Azure credential types
  utils: azure: Define interface for Azure credentials
  utils: Introduce base64url_{encode,decode}
2025-07-25 10:45:32 +03:00
Ernest Zaslavsky
0053a4f24a encryption: remove default case from component_type switch
Do not use default, instead list all fall-through components
explicitly, so if we add a new one, the developer doing that
will be forced to consider what to do here.

Eliminate the `default` case from the switch in
`encryption_file_io_extension::wrap_sink`, and explicitly
handle all `component_type` values within the switch statement.

fixes: https://github.com/scylladb/scylladb/issues/23724

Closes scylladb/scylladb#24987
2025-07-21 14:43:12 +03:00
Nikos Dragazis
41b63469e1 encryption: Define and enable Azure Key Provider
Define the Azure Key Provider to connect the core EaR business logic
with the Azure-based Key Management implementation (Azure host).

Introduce "AzureKeyProviderFactory" as a new `key_provider` value in the
configuration.

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
2025-07-16 17:14:09 +03:00
Nikos Dragazis
339992539d encryption: Add Azure host cache
The encryption context maintains a cache per host type per thread.
Add a cache for the Azure host as well. Initialize the cache with Azure
hosts from the configuration, while registering the extensions for
encryption.

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
2025-07-16 17:14:09 +03:00
Nikos Dragazis
e078abba57 encryption: Templatize get_{kmip,kms,gcp}_host()
For deduplication.

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
2025-07-16 17:14:08 +03:00
Ernest Zaslavsky
0de61f56a2 sstables: add make_data_or_index_source to the storage
Add `make_data_or_index_source` to the `storage` interface, implement it
for `filesystem_storage` storage which just creates `data_source` from a
file and for the `s3_storage` create a (maybe) decrypting source from s3
make_download_source.

This change should solve performance improvement for reading large objects
from S3 and should not affect anything for the `filesystem_storage`.
2025-07-06 09:18:39 +03:00
Ernest Zaslavsky
7e5e3c5569 encryption: refactor key retrieval
Get the encryption schema extension retrieval code out of
`wrap_file` method to make it reusable elsewhere
2025-07-06 09:18:39 +03:00
Calle Wilund
ee98f5d361 encryption: Ensure stopping timers in provider cache objects
utils::loading cache has a timer that can, if we're unlucky, be runnnig
while the encryption context/extensions referencing the various host
objects containing them are destroyed in the case of unit testing.

Add a stop phase in encryption context shutdown closing the caches.
2025-06-30 11:36:38 +00:00
Calle Wilund
5c6337b887 encryption: Add "wrap_sink" to encryption sstable extension
Creates a more efficient data_sink wrapper for encrypted output
stream (S3).
2025-03-20 14:54:24 +00:00
Avi Kivity
a62ab824e6 schema: deprecate schema_extension
schema_extension allows making invisible changes to system_schema
that evade upgrade rollback tests. They appear in system_schema
as an encoded blob which reduces serviceability, as they cannot
be read.

Deprecate it and point users to adding explicit columns in scylla_tables.

We could probably make use of the data structure, after we teach it
to encode its payload into proper named and typed columns instead of
using IDL.

Closes scylladb/scylladb#23151
2025-03-19 20:36:16 +02:00
Calle Wilund
83aa66da1a encryption: Add exception handler to context init (for tests)
Adds exception handler + cleanup for the case where we have a
bad config/env vars (hint minio) or similar, such that we fail
with exception during setting up the EAR context.
In a normal startup, this is ok. We will report the exception,
and the do a exit(1).

In tests however, we don't and active context will instead be
freed quite proper, in which case we need to call stop to ensure
we don't crash on shared pointer destruction on wrong shard.
Doing so will hide the real issue from whomever runs the test.
2025-02-17 13:49:42 +00:00
Kefu Chai
7ff0d7ba98 tree: Remove unused boost headers
This commit eliminates unused boost header includes from the tree.

Removing these unnecessary includes reduces dependencies on the
external Boost.Adapters library, leading to faster compile times
and a slightly cleaner codebase.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22857
2025-02-15 20:32:22 +02:00
Calle Wilund
7db14420b7 encryption: Fix encrypted components mask check in describe
Fixes #22401

In the fix for scylladb/scylla-enterprise#892, the extraction and check for sstable component encryption mask was copied
to a subroutine for description purposes, but a very important 1 << <value> shift was somehow
left on the floor.

Without this, the check for whether we actually contain a component encrypted can be wholly
broken for some components.

Closes scylladb/scylladb#22398
2025-01-30 11:29:13 +02:00
Kefu Chai
3b7a991f74 ent/encryption: rename "sie" to "get_opt"
"sie" is the short for "system info encryption". it is a wrapper around
a `opts` map so we can get the individual option by providing a default
value via an `optional<>` return value. but "sie" could be difficult to
understand without more context. and it is used like a function -- we
get the individual option using its operator().

so, in order to improve the readability, in this change, we rename it
to "get_opt".

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2025-01-14 21:08:17 +08:00
Kefu Chai
92c6c8a32f ent,main: fix misspellings
these misspellings are identified by codespell. they are either in
comment or logging messages. let's fix them.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2025-01-14 21:08:17 +08:00
Calle Wilund
723518c390 EAR: port the ear feature from enterprise
Bulk transfer of EAR functionality. Includes all providers etc.
Could maybe break up into smaller blocks, but once it gets down to
the core of it, would require messing with code instead of just moving.
So this is it.

Note: KMIP support is disabled unless you happen to have the kmipc
SDK in your scylla dir.

Adds optional encryption of sstables and commitlog, using block
level file encryption. Provides key sourcing from various sources,
such as local files or popular KMS systems.
2025-01-09 10:37:26 +00:00