The inner loops in reload_all_permissions iterate role's permissions
and _anonymous_permissions maps across yield points. Concurrent
load_permissions calls (which don't hold _loading_sem) can emplace
into those same maps during a yield, potentially triggering a rehash
that invalidates the active iterator.
We want to avoid adding semaphore acquire in load_permissions
because it's on a common path (get_permissions).
Fixing by snapshotting the keys into a vector before iterating with
yields, so no long-lived map iterator is held across suspension
points.
distribute_role() modifies _roles on non-zero shards via
invoke_on_others() without holding _loading_sem. Similarly, load_all()'s
invoke_on_others() callback calls prune_all() without the semaphore.
When these run concurrently with reload_all_permissions(), which
iterates _roles across yield points, an insertion can trigger
absl::flat_hash_map::resize(), freeing the backing storage while
an iterator still references it.
Fix by acquiring _loading_sem on the target shard in both
distribute_role()'s and load_all()'s invoke_on_others callbacks,
serializing all _roles mutations with coroutines that iterate
the map.
The LDAP server may change role-chain assignments without notifying
Scylla. As a result, effective permissions can change, so some form of
polling is required.
Currently, this is handled via cache expiration. However, the unified
cache is designed to be consistent and does not support expiration.
To provide an equivalent mechanism for LDAP, we will periodically
reload the permissions portion of the new cache at intervals matching
the previously configured expiration time.
We want to get rid of loading cache because its periodic
refresh logic generates a lot of internal load when there
is many entries. Also our operation procedures involve tweaking
the config while new unified cache is supposed to work out
of the box.
Auth cache loading at startup is racing between
auth service and raft code and it doesn't support
concurrency causing it to crash.
We can't easily remove any of the places as during
raft recovery snapshot is not loaded and it relies
on loading cache via auth service. Therefore we add
semaphore.
Fixes https://github.com/scylladb/scylladb/issues/27540Closesscylladb/scylladb#27573
During b9199e8b24
reivew it was suggested to use standard for loop
but when erasing element it causes increment on
invalid iterator, as role could have been erased
before.
This change brings back original code.
Fixes: https://github.com/scylladb/scylladb/issues/27422
Backport: no, offending commit not released yet
Closesscylladb/scylladb#27444
It combines data from all underlying auth tables.
Supports gentle full load and per role reloads.
Loading is done on shard 0 and then deep copies data
to all shards.