This change allows yielding during hashing computations to prevent
stalls.
The performance of this solution was compared with the previous
implementation that used one alien thread and the implementation
after the alien thread was reverted. The results (median) of
`perf-cql-raw` with `--connection-per-request 1 --smp 10` parameters
are as follows:
- Alien thread: 41.5 new connections/s per shard
- Reverted alien thread: 244.1 new connections/s per shard
- This commit (yielding in hashing): 198.4 new connections/s per shard
The alien thread is limited by a single-core hashing throughput,
which is roughly 400-500 hashes/s in the test environment. Therefore,
with smp=10, the throughput is below 50 hashes/s, and the difference
between the alien thread and other solutions further increases with
higer smp.
The roughly 20% performance deterioration compared to
the old implementation without the alien thread comes from the fact
that the new hashing algorithm implemented in `utils/crypt_sha512.cc`
performs an expensive self-verification and stack cleanup.
On the other hand, with smp=10 the current implementation achieves
roughly 5x higher throughput than the alien thread. In addition,
due to yielding added in this commit, the algorithm is expected
to provide similar protection from stalls as the alien thread did.
In a test that in parallel started a cassandra-stress workload and
created thousands of new connections using python-driver, the values of
`scylla_reactor_stalls_count` metric were as follows:
- Alien thread: 109 stalls/shard total
- Reverted alien thread: 13186 stalls/shard total
- This commit (yielding in hashing): 149 stalls/shard total
Similarly, the `scylla_scheduler_time_spent_on_task_quota_violations_ms`
values were:
- Alien thread: 1087 ms/shard total
- Reverted alien thread: 72839 ms/shard total
- This commit (yielding in hashing): 1623 ms/shard total
To summarize, yielding during hashing computations achieves similar
throughput to the old solution without the alien thread but also
prevents stalls similarly to the alien thread.
Fixes: scylladb/scylladb#26859
Refs: scylladb/scylla-enterprise#5711
Introduce a new `passwords::hash_with_salt_async` and change the return
type of `passwords::check` to `future<bool>`. This enables yielding
during password computations later in this patch series.
The old method, `hash_with_salt`, is marked as deprecated because
new code should use the new `hash_with_salt_async` function.
We are not removing `hash_with_salt` now to reduce the regression risk
of changing the hashing implementation—at least the methods that change
persistent hashes (CREATE, ALTER) will continue to use the old hashing
method. However, in the future, `hash_with_salt` should be entirely
removed.
Refs: scylladb/scylladb#26859
Before this change, the following hashing schemes were supported by
`identify_best_supported_scheme()`: bcrypt_y, bcrypt_a, SHA-512,
SHA-256, and MD5. The reason for this was that the `crypt_r` function
used for password hashing comes from an external library (currently
`libxcrypt`), and the supported hashing algorithms vary depending
on the library in use.
However:
- The bcrypt algorithms do not work because their scheme
prefix lacks the required round count (e.g., it is `$2y$` instead of
`$2y$05$`). We suspect this never worked as intended. Moreover,
bcrypt tends to be slower than SHA-512, so we do not want to fix the
prefix and start using it.
- SHA-256 and SHA-512 are both part of the SHA-2 family, and libraries
that support one almost always support the other. It is not expected
to find a library that supports only SHA-256 but not SHA-512.
- MD5 is not considered secure for password hashing.
Therefore, this commit removes support for bcrypt_y, bcrypt_a, SHA-256,
and MD5 for hashing new passwords to ensure that the correct hashing
function (SHA-512) is used everywhere.
This commit does not change the behavior of `passwords::check`, so
it is still possible to use passwords hashed with the removed
algorithms.
Ref. scylladb/scylladb#24524
This is a refactoring commit that changes the `generate_salt` function
to require a password hashing scheme as a parameter. This change is
motivated by the upcoming removal of support for obsolete password
hashing schemes and removal of `identify_best_supported_scheme()`
function.
Ref. scylladb/scylladb#24524
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.
Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes we applied mechanically with a script, except to
licenses/README.md.
Closes#9937
On Fedora 28, creating an instance of `std::random_device` opens a file
descriptor for `/dev/urandom` (observed via `strace`).
By declaring static thread-local instances of `std::random_device`,
these descriptors will be open (barring optimization by the compiler)
for the entire duration of the Scylla process's life.
However, the `std::random_device` instance is only necessary for
initializing the `RandomNumberEngine` for generating salts. With this
change, the file-descriptor is closed immediately after the engine is
initialized.
I considered generalizing this pattern of initialization into a
function, but with only two uses (and simple ones) I think this would
only obscure things.
Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
Tests: unit (release)
Message-Id: <f1b985d99f66e5e64d714fd0f087e235b71557d2.1536697368.git.jhaberku@scylladb.com>
This makes the function useable in more contexts due to
flexibility (including in tests), since the state is not captured and
the characteristics of salt generation can be customized to the caller's
needs.
The `generate_salt` function invokes this function internally now.
This change means that `generate_salt` is now thread-safe and therefore
does not have to be invoked by a single thread only when starting the
`password_authenticator`.
This further means that `generate_salt` does not need to be part of the
public interface of the module, and can be moved to the implementation
file.
While the `password_authenticator` is a complex component with lots of
dependencies, password hashing and checking itself is a process with
limited logical state and dependencies, which makes it easy to isolate
and test.