Fixes#22106
Moves the shared compress components to sstables, and rename to
match class type.
Adjust includes, removing redundant/unneeded ones where possible.
Closesscylladb/scylladb#25103
This PR introduces a new Key Provider to support Azure Key Vault as a Key Management System (KMS) for Encryption at Rest. The core design principle is the same as in the AWS and GCP key providers - an externally provided Vault key that is used to protect local data encryption keys (a process known as "key wrapping").
In more detail, this patch series consists of:
* Multiple Azure credential sources, offering a variety of authentication options (Service Principals, Managed Identities, environment variables, Azure CLI).
* The Azure host - the Key Vault endpoint bridge.
* The Azure Key Provider - the interface for the Azure host.
* Unit tests using real Azure resources (credentials and Vault keys).
* Log filtering logic to not expose sensitive data in the logs (plaintext keys, credentials, access tokens).
This is part of the overall effort to support Azure deployments.
Testing done:
* Unit tests.
* Manual test on an Azure VM with a Managed Identity.
* Manual test with credentials from Azure CLI.
* Manual test of `--azure-hosts` cmdline option.
* Manual test of log filtering.
Remaining items:
- [x] Create necessary Azure resources for CI.
- [x] Merge pipeline changes (https://github.com/scylladb/scylla-pkg/pull/5201).
Closes https://github.com/scylladb/scylla-enterprise/issues/1077.
New feature. No backport is needed.
Closesscylladb/scylladb#23920
* github.com:scylladb/scylladb:
docs: Document the Azure Key Provider
test: Add tests for Azure Key Provider
pylib: Add mock server for Azure Key Vault
encryption: Define and enable Azure Key Provider
encryption: azure: Delegate hosts to shard 0
encryption: Add Azure host cache
encryption: Add config options for Azure hosts
encryption: azure: Add override options
encryption: azure: Add retries for transient errors
encryption: azure: Implement init()
encryption: azure: Implement get_key_by_id()
encryption: azure: Add id-based key cache
encryption: azure: Implement get_or_create_key()
encryption: azure: Add credentials in Azure host
encryption: azure: Add attribute-based key cache
encryption: azure: Add skeleton for Azure host
encryption: Templatize get_{kmip,kms,gcp}_host()
encryption: gcp: Fix typo in docstring
utils: azure: Get access token with default credentials
utils: azure: Get access token from Azure CLI
utils: azure: Get access token from IMDS
utils: azure: Get access token with SP certificate
utils: azure: Get access token with SP secret
utils: rest: Add interface for request/response redaction logic
utils: azure: Declare all Azure credential types
utils: azure: Define interface for Azure credentials
utils: Introduce base64url_{encode,decode}
Do not use default, instead list all fall-through components
explicitly, so if we add a new one, the developer doing that
will be forced to consider what to do here.
Eliminate the `default` case from the switch in
`encryption_file_io_extension::wrap_sink`, and explicitly
handle all `component_type` values within the switch statement.
fixes: https://github.com/scylladb/scylladb/issues/23724Closesscylladb/scylladb#24987
Define the Azure Key Provider to connect the core EaR business logic
with the Azure-based Key Management implementation (Azure host).
Introduce "AzureKeyProviderFactory" as a new `key_provider` value in the
configuration.
Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
The encryption context maintains a cache per host type per thread.
Add a cache for the Azure host as well. Initialize the cache with Azure
hosts from the configuration, while registering the extensions for
encryption.
Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
Add `make_data_or_index_source` to the `storage` interface, implement it
for `filesystem_storage` storage which just creates `data_source` from a
file and for the `s3_storage` create a (maybe) decrypting source from s3
make_download_source.
This change should solve performance improvement for reading large objects
from S3 and should not affect anything for the `filesystem_storage`.
utils::loading cache has a timer that can, if we're unlucky, be runnnig
while the encryption context/extensions referencing the various host
objects containing them are destroyed in the case of unit testing.
Add a stop phase in encryption context shutdown closing the caches.
schema_extension allows making invisible changes to system_schema
that evade upgrade rollback tests. They appear in system_schema
as an encoded blob which reduces serviceability, as they cannot
be read.
Deprecate it and point users to adding explicit columns in scylla_tables.
We could probably make use of the data structure, after we teach it
to encode its payload into proper named and typed columns instead of
using IDL.
Closesscylladb/scylladb#23151
Adds exception handler + cleanup for the case where we have a
bad config/env vars (hint minio) or similar, such that we fail
with exception during setting up the EAR context.
In a normal startup, this is ok. We will report the exception,
and the do a exit(1).
In tests however, we don't and active context will instead be
freed quite proper, in which case we need to call stop to ensure
we don't crash on shared pointer destruction on wrong shard.
Doing so will hide the real issue from whomever runs the test.
This commit eliminates unused boost header includes from the tree.
Removing these unnecessary includes reduces dependencies on the
external Boost.Adapters library, leading to faster compile times
and a slightly cleaner codebase.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#22857
Fixes#22401
In the fix for scylladb/scylla-enterprise#892, the extraction and check for sstable component encryption mask was copied
to a subroutine for description purposes, but a very important 1 << <value> shift was somehow
left on the floor.
Without this, the check for whether we actually contain a component encrypted can be wholly
broken for some components.
Closesscylladb/scylladb#22398
"sie" is the short for "system info encryption". it is a wrapper around
a `opts` map so we can get the individual option by providing a default
value via an `optional<>` return value. but "sie" could be difficult to
understand without more context. and it is used like a function -- we
get the individual option using its operator().
so, in order to improve the readability, in this change, we rename it
to "get_opt".
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
these misspellings are identified by codespell. they are either in
comment or logging messages. let's fix them.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Bulk transfer of EAR functionality. Includes all providers etc.
Could maybe break up into smaller blocks, but once it gets down to
the core of it, would require messing with code instead of just moving.
So this is it.
Note: KMIP support is disabled unless you happen to have the kmipc
SDK in your scylla dir.
Adds optional encryption of sstables and commitlog, using block
level file encryption. Provides key sourcing from various sources,
such as local files or popular KMS systems.