Commit Graph

3 Commits

Author SHA1 Message Date
Avi Kivity
67c4d980dd Revert "Merge 'auth: move passwords::check call to alien thread' from Andrzej Jackowski"
This reverts commit 1fd82d32e0. It causes
connection storms to snowball into a node crash via this mechanism:

1. large node suffers mild connection storm
2. password hash requests queue up on alien hash thread
3. incoming hash requests queue faster than the alien thread can retire them.
4. auth latency grows without bounds
5. this encourages the clients to create new connections
6. problem grows

Reverting the patch restores the hash stall, but at least prevents node
crashes.

Fixes #26461 (2025.1)

Closes scylladb/scylladb#26462
2025-10-09 11:04:34 +03:00
Avi Kivity
1fd82d32e0 Merge 'auth: move passwords::check call to alien thread' from Andrzej Jackowski
Analysis of customer stalls revealed that the function `detail::hash_with_salt` (invoked by `passwords::check`) often blocks the reactor. Internally, this function uses the external `crypt_r` function to compute password hashes, which is CPU-intensive.

This PR addresses the issue in two ways:
1) `sha-512` is now the only password hashing scheme for new passwords (it was already the common-case).
2) `passwords::check` is moved to a dedicated alien thread.

Regarding point 1: before this change, the following hashing schemes were supported by     `identify_best_supported_scheme()`: bcrypt_y, bcrypt_a, SHA-512, SHA-256, and MD5. The reason for this was that the `crypt_r` function used for password hashing comes from an external library (currently `libxcrypt`), and the supported hashing algorithms vary depending on the library in use. However:
- The bcrypt schemes never worked properly because their prefixes lack the required round count (e.g. `$2y$` instead of `$2y$05$`). Moreover, bcrypt is slower than SHA-512, so it  not good idea to fix or use it.
- SHA-256 and SHA-512 both belong to the SHA-2 family. Libraries that support one almost always support the other, so it’s very unlikely to find SHA-256 without SHA-512.
- MD5 is no longer considered secure for password hashing.

Regarding point 2: the `passwords::check` call now runs on a shared alien thread created at database startup. An `std::mutex` synchronizes that thread with the shards. In theory this could introduce a frequent lock contention, but in practice each shard handles only a few hundred new connections per second—even during storms. There is already `_conns_cpu_concurrency_semaphore` in `generic_server` limits the number of concurrent connection handlers.

Fixes https://github.com/scylladb/scylladb/issues/24524

Backport not needed, as it is a new feature.

Closes scylladb/scylladb#24924

* github.com:scylladb/scylladb:
  main: utils: add thread names to alien workers
  auth: move passwords::check call to alien thread
  test: wait for 3 clients with given username in test_service_level_api
  auth: refactor password checking in password_authenticator
  auth: make SHA-512 the only password hashing scheme for new passwords
  auth: whitespace change in identify_best_supported_scheme()
  auth: require scheme as parameter for `generate_salt`
  auth: check password hashing scheme support on authenticator start

(cherry picked from commit c762425ea7)
2025-09-07 15:47:34 +03:00
Michał Chojnowski
d301c29af5 utils: introduce alien_worker
Introduces a util which launches a new OS thread and accepts
callables for concurrent execution.

Meant to be created once at startup and used until shutdown,
for running nonpreemptible, 3rd party, non-interactive code.

Note: this new utility is almost identical to wasm::alien_thread_runner.
Maybe we should unify them.
2024-12-23 23:37:02 +01:00