This patchset migrates: query_all_directly_granted, query_all,
get_attribute, query_attribute_for_all functions to use cache
instead of doing CQL queries. It also includes some preparatory
work which fixes cache update order and triggering.
Main motivation behind this is to make sure that all calls
from service_level_controller::auth_integration are cached,
which we achieve here.
Alternative implementation could move the whole auth_integration
data into auth cache but since auth_integration manages also lifetime
and contains service levels specific logic such solution would be
too complex for little (if any) gain.
Fixes https://scylladb.atlassian.net/browse/SCYLLADB-159
Backport: no, not a bug
Closesscylladb/scylladb#28791
* github.com:scylladb/scylladb:
auth: switch query_attribute_for_all to use cache
auth: switch get_attribute to use cache
auth: cache: add heterogeneous map lookups
auth: switch query_all to use cache
auth: switch query_all_directly_granted to use cache
auth: cache: add ability to go over all roles
raft: service: reload auth cache before service levels
service: raft: move update_service_levels_effective_cache check
GRANT/REVOKE fails on the maintenance socket connections,
because maintenance_auth_service uses allow_all_authorizer.
allow_all_authorizer allows all operations, but not GRANT/REVOKE,
because they make no sense in its context.
This has been observed during PGO run failure in operations from
./pgo/conf/auth.cql file.
This patch introduces maintenance_socket_authorizer that supports
the capabilities of default_authorizer ('CassandraAuthorizer')
without needing authorization.
Refs SCYLLADB-1070
Replace the hardcoded meta::DEFAULT_SUPERUSER_NAME comparison with
default_superuser(_qp) which reads from the auth_superuser_name
config option. This makes the IF NOT EXISTS clause in DESCRIBE
output correct for clusters with a non-default superuser name.
Move default_superuser() to auth::meta in common.{hh,cc} and remove the
cached _superuser member from both standard_role_manager and
password_authenticator. The superuser name comes from config which is
immutable at runtime, so caching it is unnecessary.
Removes auth-v1 hack for cassandra superuser as auth-v1
code no longer exists.
Also CL is not really used when quering raft replicated
tables (like auth ones), but LOCAL_ONE is the least confusing
one.
Replace get_auth_ks_name(qp) with db::system_keyspace::NAME directly.
The function always returned the constant "system" and its qp
parameter was unused.
This series adds a global read barrier to raft_group0_client, ensuring that Raft group0 mutations are applied on all live nodes before returning to the caller.
Currently, after a group0_batch::commit, the mutations are only guaranteed to be applied on the leader. Other nodes may still be catching up, leading to stale reads. This patch introduces a broadcast read barrier mechanism. Calling send_group0_read_barrier_to_live_members after committing will cause the coordinator to send a read barrier RPC to all live nodes (discovered via gossiper) and waits for them to complete. This is best effort attempt to get cluster-wide visibility of the committed state before the response is returned to the user.
Auth and service levels write paths are switched to use this new mechanism.
Fixes https://scylladb.atlassian.net/browse/SCYLLADB-650
Backport: no, new feature
Closesscylladb/scylladb#28731
* https://github.com/scylladb/scylladb:
test: add tests for global group0_batch barrier feature
qos: switch service levels write paths to use global group0_batch barrier
auth: switch write paths to use global group0_batch barrier
raft: add function to broadcast read barrier request
raft: add gossiper dependency to raft_group0_client
raft: add read barrier RPC
Changes the behavior of default superuser creation.
Previously, without configuration 'cassandra:cassandra' credentials
were used. Now default superuser creation is skipped if not configured.
The two ways to create default superuser are:
- Config file - auth_superuser_name and auth_superuser_salted_password fields
- Maintenance socket - connect over maintenance socket and CREATE/ALTER ROLE ...
Behavior changes:
Old behavior:
- No config - 'cassandra:cassandra' created
- auth_superuser_name only - <name>:cassandra created
- auth_superuser_salted_password only - 'cassandra:<password>' created
- Both specified - '<name>:<password>' created
New behavior:
- No config - no default superuser
- Requires maintenance socket setup
- auth_superuser_name only - '<name>:' created WITHOUT password
- Requires maintenance socket setup
- auth_superuser_salted_password only - no default superuser
- Both specified - '<name>:<password>' created
Fixes SCYLLADB-409
Introduce maintenance_socket_authenticator and rework
maintenance_socket_role_manager to support role management operations.
Maintenance auth service uses allow_all_authenticator. To allow
role modification statements over the maintenance socket connections,
we need to treat the maintenance socket connections as superusers and
give them proper access rights.
Possible approaches are:
1. Modify allow_all_authenticator with conditional logic that
password_authenticator already does
2. Modify password_authenticator with conditional logic specific
for the maintenance socket connections
3. Extend password_authenticator, overriding the methods that differ
Option 3 is chosen: maintenance_socket_authenticator extends
password_authenticator with authentication disabled.
The maintenance_socket_role_manager is reworked to lazily create a
standard_role_manager once the node joins the cluster, delegating role
operations to it. In maintenance mode role operations remain disabled.
Refs SCYLLADB-409
This patch is part of preparations for dropping 'cassandra::cassandra'
default superuser. When that is implemented, maintenance_socket_role_manager
will have two modes of work:
1. in maintenance mode, where role operations are forbidden
2. in normal mode, where role operations are allowed
To execute the role operations, the node has to join a cluster.
In maintenance mode the node does not join a cluster.
This patch lets maintenance_socket_role_manager know if it works under
maintenance mode and returns appropriate error message when role
operations execution is requested.
Refs SCYLLADB-409
This patch removes class registrator usage in auth module.
It is not used after switching to factory functor initialization
of auth service.
Several role manager, authenticator, and authorizer name variables
are returned as well, and hardcoded inside qualified_java_name method,
since that is the only place they are ever used.
Refs SCYLLADB-409
Auth service can be initialized:
- [current] by passing instantiated authorizer, authenticator, role manager
- [current] by passing service_config, which then uses class registrator to instantiate authorizer, authenticator, role manager
- This approach is easy to use with sharded services
- [new] by passing factory functors which instantiate authorizer, authenticator, role manager
- This approach is also easy to use with sharded services
Refs SCYLLADB-409
In a follow-up patch in this patch series class registrator will be removed.
Adding transitional.hh file will be necessary to expose the authenticator and authorizer.
Refs SCYLLADB-409
The inner loops in reload_all_permissions iterate role's permissions
and _anonymous_permissions maps across yield points. Concurrent
load_permissions calls (which don't hold _loading_sem) can emplace
into those same maps during a yield, potentially triggering a rehash
that invalidates the active iterator.
We want to avoid adding semaphore acquire in load_permissions
because it's on a common path (get_permissions).
Fixing by snapshotting the keys into a vector before iterating with
yields, so no long-lived map iterator is held across suspension
points.
distribute_role() modifies _roles on non-zero shards via
invoke_on_others() without holding _loading_sem. Similarly, load_all()'s
invoke_on_others() callback calls prune_all() without the semaphore.
When these run concurrently with reload_all_permissions(), which
iterates _roles across yield points, an insertion can trigger
absl::flat_hash_map::resize(), freeing the backing storage while
an iterator still references it.
Fix by acquiring _loading_sem on the target shard in both
distribute_role()'s and load_all()'s invoke_on_others callbacks,
serializing all _roles mutations with coroutines that iterate
the map.
The LDAP server may change role-chain assignments without notifying
Scylla. As a result, effective permissions can change, so some form of
polling is required.
Currently, this is handled via cache expiration. However, the unified
cache is designed to be consistent and does not support expiration.
To provide an equivalent mechanism for LDAP, we will periodically
reload the permissions portion of the new cache at intervals matching
the previously configured expiration time.
We want to get rid of loading cache because its periodic
refresh logic generates a lot of internal load when there
is many entries. Also our operation procedures involve tweaking
the config while new unified cache is supposed to work out
of the box.
In the following commit we'll need to add some
cache related logic (removing resource permissions).
This logic doesn't depend on authorizer so it should
be managed by the service itself.
Previously, we wanted to make minimal changes with regards to the new
unified auth cache. However, as a result, some calls on the hot path
were missed. Now we have switched the underlying find_record call
to use the cache. Since caching is now at a lower level, we bring
back the original code.
Since every write-type auth statement takes group0_guard at the beginning,
we hold read_apply_mutex and cannot have a running raft apply during our
operation. Therefore, the auth cache and internal CQL reads return the same,
consistent results. This makes it safe to read via cache instead of internal
CQL.
LDAP is an exception, but it is eventually consistent anyway.
Auth v2 migration uses non-paged queries via `execute_internal` API.
This commit changes it to use `query_internal` instead, which uses
paging under the hood.
Fixes: scylladb/scylladb#27577
Just move the touched code above so the next commit is more readable.
But this has a drawback: previously, if the returned rows were empty,
this code was not executed, but now is independently of the query
results. This shouldn't be a big deal, though, as auth shouldn't be
empty.
Auth cache loading at startup is racing between
auth service and raft code and it doesn't support
concurrency causing it to crash.
We can't easily remove any of the places as during
raft recovery snapshot is not loaded and it relies
on loading cache via auth service. Therefore we add
semaphore.
Fixes https://github.com/scylladb/scylladb/issues/27540Closesscylladb/scylladb#27573
This patch series contains the following changes:
- Incorporation of `crypt_sha512.c` from musl to out codebase
- Conversion of `crypt_sha512.c` to C++ and coroutinization
- Coroutinization of `auth::passwords::check`
- Enabling use of `__crypt_sha512` orignated from `crypt_sha512.c` for
computing SHA 512 passwords of length <=255
- Addition of yielding in the aforementioned hashing implementation.
The alien thread was a solution for reactor stalls caused by indivisible
password‑hashing tasks (https://github.com/scylladb/scylladb/issues/24524).
However, because there is only one alien thread, overall hashing throughput was reduced
(see, e.g., https://github.com/scylladb/scylla-enterprise/issues/5711). To address this,
the alien‑thread solution is reverted, and a hashing implementation
with yielding is introduced in this patch series.
Before this patch series, ScyllaDB used SHA-512 hashing provided
by the `crypt_r` function, which in our case meant using the implementation
from the `libxcrypt` library. Adding yielding to this `libxcrypt`
implementation is problematic, both due to licensing (LGPL) and because the
implementation is split into many functions across multiple files. In
contrast, the SHA-512 implementation from `musl libc` has a more
permissive license and is concise, which makes it easier to incorporate
into the ScyllaDB codebase.
The performance of this solution was compared with the previous
implementation that used one alien thread and the implementation
after the alien thread was reverted. The results (median) of
`perf-cql-raw` with `--connection-per-request 1 --smp 10` parameters
are as follows:
- Alien thread: 41.5 new connections/s per shard
- Reverted alien thread: 244.1 new connections/s per shard
- This commit (yielding in hashing): 198.4 new connections/s per shard
The roughly 20% performance deterioration compared to
the old implementation without the alien thread comes from the fact
that the new hashing algorithm implemented in `utils/crypt_sha512.cc`
performs an expensive self-verification and stack cleanup.
On the other hand, with smp=10 the current implementation achieves
roughly 5x higher throughput than the alien thread. In addition,
due to yielding added in this commit, the algorithm is expected
to provide similar protection from stalls as the alien thread did.
In a test that in parallel started a cassandra-stress workload and
created thousands of new connections using python-driver, the values of
`scylla_reactor_stalls_count` metric were as follows:
- Alien thread: 109 stalls/shard total
- Reverted alien thread: 13186 stalls/shard total
- This commit (yielding in hashing): 149 stalls/shard total
Similarly, the `scylla_scheduler_time_spent_on_task_quota_violations_ms`
values were:
- Alien thread: 1087 ms/shard total
- Reverted alien thread: 72839 ms/shard total
- This commit (yielding in hashing): 1623 ms/shard total
To summarize, yielding during hashing computations achieves similar
throughput to the old solution without the alien thread but also
prevents stalls similarly to the alien thread.
Fixes: scylladb/scylladb#26859
Refs: scylladb/scylla-enterprise#5711
No automatic backport. After this PR is completed, the alien thread should be rather reverted from older branches (2025.2-2025.4 because on 2025.1 it's already removed). Backporting of the other commits needs further discussion.
Closesscylladb/scylladb#26860
* github.com:scylladb/scylladb:
test/boost: add too_long_password to auth_passwords_test
test/boost: add same_hashes_as_crypt_r to auth_passwords_test
auth: utils: add yielding to crypt_sha512
auth: change return type of passwords::check to future
auth: remove code duplication in verify_scheme
test/boost: coroutinize auth_passwords_test
utils: coroutinize crypt_sha512
utils: make crypt_sha512.cc to compile
utils: license: import crypt_sha512.c from musl to the project
Revert "auth: move passwords::check call to alien thread"
This change allows yielding during hashing computations to prevent
stalls.
The performance of this solution was compared with the previous
implementation that used one alien thread and the implementation
after the alien thread was reverted. The results (median) of
`perf-cql-raw` with `--connection-per-request 1 --smp 10` parameters
are as follows:
- Alien thread: 41.5 new connections/s per shard
- Reverted alien thread: 244.1 new connections/s per shard
- This commit (yielding in hashing): 198.4 new connections/s per shard
The alien thread is limited by a single-core hashing throughput,
which is roughly 400-500 hashes/s in the test environment. Therefore,
with smp=10, the throughput is below 50 hashes/s, and the difference
between the alien thread and other solutions further increases with
higer smp.
The roughly 20% performance deterioration compared to
the old implementation without the alien thread comes from the fact
that the new hashing algorithm implemented in `utils/crypt_sha512.cc`
performs an expensive self-verification and stack cleanup.
On the other hand, with smp=10 the current implementation achieves
roughly 5x higher throughput than the alien thread. In addition,
due to yielding added in this commit, the algorithm is expected
to provide similar protection from stalls as the alien thread did.
In a test that in parallel started a cassandra-stress workload and
created thousands of new connections using python-driver, the values of
`scylla_reactor_stalls_count` metric were as follows:
- Alien thread: 109 stalls/shard total
- Reverted alien thread: 13186 stalls/shard total
- This commit (yielding in hashing): 149 stalls/shard total
Similarly, the `scylla_scheduler_time_spent_on_task_quota_violations_ms`
values were:
- Alien thread: 1087 ms/shard total
- Reverted alien thread: 72839 ms/shard total
- This commit (yielding in hashing): 1623 ms/shard total
To summarize, yielding during hashing computations achieves similar
throughput to the old solution without the alien thread but also
prevents stalls similarly to the alien thread.
Fixes: scylladb/scylladb#26859
Refs: scylladb/scylla-enterprise#5711
Introduce a new `passwords::hash_with_salt_async` and change the return
type of `passwords::check` to `future<bool>`. This enables yielding
during password computations later in this patch series.
The old method, `hash_with_salt`, is marked as deprecated because
new code should use the new `hash_with_salt_async` function.
We are not removing `hash_with_salt` now to reduce the regression risk
of changing the hashing implementation—at least the methods that change
persistent hashes (CREATE, ALTER) will continue to use the old hashing
method. However, in the future, `hash_with_salt` should be entirely
removed.
Refs: scylladb/scylladb#26859
Refactoring: create a new function `verify_hashing_output` to reuse
code in `hash_with_salt` and `verify_scheme`. The change is introduced
to facilitate verification of hashing output when the implementation
is extended later in this patch series.
Refs: scylladb/scylladb#26859
The alien thread was a solution for reactor stalls caused by indivisible
password‑hashing tasks (scylladb/scylladb#24524). However, because
there is only one alien thread, overall hashing throughput was reduced
(see, e.g., scylladb/scylla-enterprise#5711). To address this,
the alien‑thread solution is reverted, and a hashing implementation
with yielding will be introduced later in this patch series.
This reverts commit 9574513ec1.
During b9199e8b24
reivew it was suggested to use standard for loop
but when erasing element it causes increment on
invalid iterator, as role could have been erased
before.
This change brings back original code.
Fixes: https://github.com/scylladb/scylladb/issues/27422
Backport: no, offending commit not released yet
Closesscylladb/scylladb#27444