scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-02 06:05:53 +00:00

Author	SHA1	Message	Date
Pavel Emelyanov	3f7ee3ce5d	Merge 'batchlog: make replay (flush) faster' from Botond Dénes The batchlog table contains an entry for each logged batch that is processed by the local node as coordinator. These entries are typically very short lived, they are inserted when the batch is processed and deleted immediately after the batch is successfully applied. When a table has `tombstone_gc = {'mode': 'repair'}` enabled, every repair has to flush all hints and batchlogs, so that we can be certain that there is no live data in any of these, older than the last repair. Since batches can contain member queries from any number of tables, the whole batchlog has to be flushed, even if repair-mode tombstone-gc is enabled for a single table. Flushing the batchlog table happens by doing a batchlog replay. This involves reading the entire content of this table, and attempting to replay+delete any live entries (that are old enough to be replayed). Under normal operating circumstances, 99%+ of the content of the batchlog table is partition tombstones. Because of this, scanning the content of this table has to process thousands to millions of tombstones. This was observed to require up to 20 minutes to finish, causing repairs to slow down to a crawl, as the batchlog-flush has to be repeated at the end of the repair of each token-range. When trying to address this problem, the first idea was that we should expedite the garbage-collection of these accumulated tombstones. This experiment failed, see https://github.com/scylladb/scylladb/pull/23752. The commitlog proved to be an impossible-to-bypass barrier, preventing quick garbage-collection of tombstones. So long as a single commit-log segment is alive, holding content from the batchlog table, all tombstones written after are blocked from GC. The second approach, represented by this PR, is to not rely on tombstone GC to reduce the tombstone amount. 
Instead restructure the table such that a single higher-order tombstone can be used to shadow and allow for the eviction of the myriads of individual batchlog entry tombstones. This is realized by reorganizing the batchlog table such that individual batches are rows, not partitions. This new schema is introduced by the new `system.batchlog_v2` table, introduced by this PR:

    CREATE TABLE system.batchlog_v2 (
        version int,
        stage int,
        shard int,
        written_at timestamp,
        id uuid,
        data blob,
        PRIMARY KEY ((version, stage, shard), written_at, id));

The new schema organization has the following goals:
1) Make post-replay batchlog cleanup possible with a simple range-tombstone. This allows dropping the individual dead batchlog entries, as they are shadowed by a higher level tombstone. This enables dropping tombstones without tombstone GC.
2) To make the above possible, introduce the stage key component: batchlog entries that fail the first replay attempt, are moved to the failed_replay stage, so the initial stage can be cleaned up safely.
3) Spread out the data among Scylla shards, via the batchlog shard column.
4) Make batchlog entries ordered by the batchlog create time (id). This allows for selecting batchlogs to replay, without post-filtering of batchlogs that are too young to be replayed.

Fixes: https://github.com/scylladb/scylladb/issues/23358

This is an improvement, normally not a backport-candidate. We might override this and backport to allow wider use of `tombstone_gc: {'mode': 'repair'}`. 
Closes scylladb/scylladb#26671 * github.com:scylladb/scylladb: db/config: change batchlog_replay_cleanup_after_replays default to 1 test/boost/batchlog_manager_test: add test for batchlog cleanup replica/mutation_dump: always set position weight for clustering positions service/storage_proxy: s/batch_replay_throw/storage_proxy_fail_replay_batch/ test/lib: introduce error_injection.hh utils/error_injection: add debug log to disable() and disable_all() test/lib/cql_test_env: forward config to batchlog test/lib/cql_test_env: add batch type to execute_batch() test/lib/cql_assertions: add with_size(predicate) overload test/lib/cql_assertions: add source location to fail messages test/lib/cql_assertions: columns_assertions: add assert_for_columns_of_each_row() test/lib/cql_assertions: rows_assertions::assert_for_columns_of_row(): add index bound check test/lib/cql_assertions: columns_assertions: add T* with_typed_column() overload db/batchlog_manager: config: s/write_timeout/reply_timeot/ db,service: switch to system.batchlog_v2 db/system_keyspace: introduce system.batchlog_v2 service,db: extract generation of batchlog delete mutation service,db: extract get_batchlog_mutation_for() from storage-proxy db/batchlog_manager: only consider propagation delay with tombstone-gc=repair db/batchlog_manager: don't drop entire batch if one mutations' table was dropped data_dictionary: table: add get_truncation_time() db/batchlog_manager: batch(): replace map_reduce() with simple loop db/batchlog_manager: finish coroutinizing replay_all_failed_batches db/batchlog_manager: improve replayAllFailedBatches logs	2025-12-15 15:05:19 +03:00
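The cleanup idea in the merge above can be illustrated with a small in-memory sketch. This is not ScyllaDB code: `batchlog_partition` and `cleanup_before` are hypothetical names, and an ordered `std::map` stands in for one `(version, stage, shard)` partition whose rows are clustered by `(written_at, id)`. Because rows are ordered by write time, everything older than a cutoff forms one contiguous range that a single range operation can drop, which is roughly what a range tombstone achieves.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for one (version, stage, shard) partition of
// system.batchlog_v2: rows are ordered by (written_at, id).
using batch_key = std::pair<std::int64_t, std::uint64_t>; // (written_at in ms, id)
using batchlog_partition = std::map<batch_key, std::string>;

// Drop every row with written_at < cutoff in a single range erase,
// mimicking what one range tombstone covers after a replay.
std::size_t cleanup_before(batchlog_partition& p, std::int64_t cutoff) {
    // First row whose key is >= (cutoff, 0); all earlier rows are older than cutoff.
    auto end = p.lower_bound(batch_key{cutoff, 0});
    auto removed = static_cast<std::size_t>(std::distance(p.begin(), end));
    p.erase(p.begin(), end);
    return removed;
}
```

With the old layout (one partition per batch) there is no such contiguous range, so each entry needs its own tombstone.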
Dawid Mędrek	1e14c08eee	locator/token_metadata: Remove get_host_id() The function is declared, but it's not defined or used anywhere. Closes scylladb/scylladb#27374	2025-12-15 10:36:52 +01:00
Michael Litvak	b9ec1180f5	alternator: require rf_rack_valid_keyspaces when creating index When creating an alternator table with tablets, if it has an index, LSI or GSI, require the config option rf_rack_valid_keyspaces to be enabled. The option is required for materialized views in tablets keyspaces to function properly and avoid consistency issues that could happen due to cross-rack migrations and pairing switches when RF-rack validity is not enforced. Currently the option is validated when creating a materialized view via the CQL interface, but it's missing from the alternator interface. Since alternator indexes are based on materialized views, the same check should be added there as well. Fixes scylladb/scylladb#27612 Closes scylladb/scylladb#27622	2025-12-15 10:36:57 +02:00
Michał Hudobski	12483d8c3c	vector_search: throw an error when we restrict primary in vector search We currently allow restrictions on single column primary key, but we ignore the restriction and return all results. This can confuse the users. We change it so such a restriction will throw an error and add a test to validate it. Fixes: VECTOR-331 Closes scylladb/scylladb#27143	2025-12-15 09:45:56 +02:00
Jenkins Promoter	d5641398f5	Update pgo profiles - aarch64	2025-12-15 05:16:31 +02:00
Nadav Har'El	c06e63daed	Merge 'auth: start using SHA 512 hashing originated from musl with added yielding' from Andrzej Jackowski This patch series contains the following changes:
- Incorporation of `crypt_sha512.c` from musl into our codebase
- Conversion of `crypt_sha512.c` to C++ and coroutinization
- Coroutinization of `auth::passwords::check`
- Enabling use of `__crypt_sha512` originated from `crypt_sha512.c` for computing SHA 512 passwords of length <=255
- Addition of yielding in the aforementioned hashing implementation.

The alien thread was a solution for reactor stalls caused by indivisible password‑hashing tasks (https://github.com/scylladb/scylladb/issues/24524). However, because there is only one alien thread, overall hashing throughput was reduced (see, e.g., https://github.com/scylladb/scylla-enterprise/issues/5711). To address this, the alien‑thread solution is reverted, and a hashing implementation with yielding is introduced in this patch series. Before this patch series, ScyllaDB used SHA-512 hashing provided by the `crypt_r` function, which in our case meant using the implementation from the `libxcrypt` library. Adding yielding to this `libxcrypt` implementation is problematic, both due to licensing (LGPL) and because the implementation is split into many functions across multiple files. In contrast, the SHA-512 implementation from `musl libc` has a more permissive license and is concise, which makes it easier to incorporate into the ScyllaDB codebase. The performance of this solution was compared with the previous implementation that used one alien thread and the implementation after the alien thread was reverted. 
The results (median) of `perf-cql-raw` with `--connection-per-request 1 --smp 10` parameters are as follows: - Alien thread: 41.5 new connections/s per shard - Reverted alien thread: 244.1 new connections/s per shard - This commit (yielding in hashing): 198.4 new connections/s per shard The roughly 20% performance deterioration compared to the old implementation without the alien thread comes from the fact that the new hashing algorithm implemented in `utils/crypt_sha512.cc` performs an expensive self-verification and stack cleanup. On the other hand, with smp=10 the current implementation achieves roughly 5x higher throughput than the alien thread. In addition, due to yielding added in this commit, the algorithm is expected to provide similar protection from stalls as the alien thread did. In a test that in parallel started a cassandra-stress workload and created thousands of new connections using python-driver, the values of `scylla_reactor_stalls_count` metric were as follows: - Alien thread: 109 stalls/shard total - Reverted alien thread: 13186 stalls/shard total - This commit (yielding in hashing): 149 stalls/shard total Similarly, the `scylla_scheduler_time_spent_on_task_quota_violations_ms` values were: - Alien thread: 1087 ms/shard total - Reverted alien thread: 72839 ms/shard total - This commit (yielding in hashing): 1623 ms/shard total To summarize, yielding during hashing computations achieves similar throughput to the old solution without the alien thread but also prevents stalls similarly to the alien thread. Fixes: scylladb/scylladb#26859 Refs: scylladb/scylla-enterprise#5711 No automatic backport. After this PR is completed, the alien thread should be rather reverted from older branches (2025.2-2025.4 because on 2025.1 it's already removed). Backporting of the other commits needs further discussion. 
Closes scylladb/scylladb#26860 * github.com:scylladb/scylladb: test/boost: add too_long_password to auth_passwords_test test/boost: add same_hashes_as_crypt_r to auth_passwords_test auth: utils: add yielding to crypt_sha512 auth: change return type of passwords::check to future auth: remove code duplication in verify_scheme test/boost: coroutinize auth_passwords_test utils: coroutinize crypt_sha512 utils: make crypt_sha512.cc to compile utils: license: import crypt_sha512.c from musl to the project Revert "auth: move passwords::check call to alien thread"	2025-12-14 14:01:01 +02:00
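The "roughly 20%" and "roughly 5x" figures in the merge above follow directly from the quoted medians. A minimal sketch of that arithmetic, with helper names (`slowdown`, `speedup`) that are purely illustrative:

```cpp
#include <cassert>
#include <cmath>

// Medians quoted in the commit message (new connections/s per shard).
constexpr double alien_thread = 41.5;
constexpr double reverted = 244.1;
constexpr double yielding = 198.4;

// Relative slowdown of `slower` versus `baseline`, as a fraction of baseline.
constexpr double slowdown(double baseline, double slower) {
    return (baseline - slower) / baseline;
}

// How many times faster `faster` is than `baseline`.
constexpr double speedup(double baseline, double faster) {
    return faster / baseline;
}
```

Here `slowdown(reverted, yielding)` is about 0.187 (just under 20%), and `speedup(alien_thread, yielding)` is about 4.8.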
David Garcia	c1c3b2c5bb	docs: fix local build Prevents early exits in metrics docs generation from breaking the local build. Fixes #27497 Closes scylladb/scylladb#27615	2025-12-14 11:48:48 +02:00
Botond Dénes	7e7e378a4b	Merge 'Revert "Merge 'Add option to use sstable identifier in snapshot' from Benny Halevy"' from null Reverts commit `8192f45e84`. The merge exposed a critical bug where truncate operations during table drop with auto-snapshot fail, causing Raft applier fiber to stop with unhandled exceptions. This leads to schema inconsistencies across nodes and test failures with "Keyspace does not exist" errors.

Root Cause

Commit `19b6207f` modified `truncate_table_on_all_shards` to set `use_sstable_identifier = true`:

```cpp
// Before (working)
co_await table::snapshot_on_all_shards(sharded_db, table_shards, name);

// After (broken)
auto opts = db::snapshot_options{.use_sstable_identifier = true};
co_await table::snapshot_on_all_shards(sharded_db, table_shards, name, opts);
```

This triggers exceptions during snapshot that propagate through Raft state machine, causing:
- Raft applier stops: `raft::state_machine_error` at `raft/server.cc:1369`
- Schema changes fail to propagate
- Nodes report non-existent keyspaces for valid schemas

Changes

Reverts 15 files (200 deletions, 74 insertions):
- Removes `use_sstable_identifier` from truncate/snapshot code paths
- Reverts `snapshot_options` struct back to simple `skip_flush` boolean
- Removes REST API and nodetool `--use-sstable-identifier` parameter
- Removes feature tests from `test/boost/database_test.cc`

No backport required - the original feature was merged to master only and never released. 
Fixes scylladb/scylladb#27501 Closes scylladb/scylladb#27604 * github.com:scylladb/scylladb: Revert "Merge 'Add option to use sstable identifier in snapshot' from Benny Halevy" Initial plan	2025-12-12 13:20:49 +02:00
copilot-swe-agent[bot]	77ee7f3417	Revert "Merge 'Add option to use sstable identifier in snapshot' from Benny Halevy" This reverts commit `8192f45e84`. The merge exposed a bug where truncate (via drop) fails and causes Raft errors, leading to schema inconsistencies across nodes. This results in test_table_drop_with_auto_snapshot failures with 'Keyspace test does not exist' errors. The specific problematic change was in commit `19b6207f` which modified truncate_table_on_all_shards to set use_sstable_identifier = true. This causes exceptions during truncate that are not properly handled, leading to Raft applier fiber stopping and nodes losing schema synchronization.	2025-12-12 03:55:13 +00:00
copilot-swe-agent[bot]	0ff89a58be	Initial plan	2025-12-12 03:48:12 +00:00
Yaron Kaikov	f7ffa395a8	workflows: trigger CI automatically when conflicts label is removed Add pull_request_target event with unlabeled type to trigger-scylla-ci workflow. This allows automatic CI triggering when the 'conflicts' label is removed from a PR, in addition to the existing manual trigger via comment. The workflow now runs when: - A user posts a comment with '@scylladbbot trigger-ci' (existing) - The 'conflicts' label is removed from a PR (new) Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-84 Closes scylladb/scylladb#27521	2025-12-11 16:48:06 +02:00
Piotr Smaron	3fa3b920de	Update CODEOWNERS to remove redundant entries Removing myself as I have no maintainer's permissions to review the code Closes scylladb/scylladb#27576	2025-12-11 16:47:08 +02:00
Botond Dénes	e7ca52ee79	Merge 'api: storage_service/tablets/repair: disable incremental repair by default' from Benny Halevy Change the default incremental_mode to `disabled` due to https://github.com/scylladb/scylladb/issues/26041 and https://github.com/scylladb/scylladb/issues/27414 Backport to 2025.4 where `611918056a` was introduced Closes scylladb/scylladb#27530 * github.com:scylladb/scylladb: api: storage_service/tablets/repair: disable incremental repair by default docs: nodetool-commands: cluster: repair: fix incremental-mode example	2025-12-11 15:23:09 +02:00
Botond Dénes	730eca5dac	Merge 'Remove noexcept from storage_group and table functions to allow exception propagation' from null Fixed a critical bug where `storage_group::for_each_compaction_group()` was incorrectly marked `noexcept`, causing `std::terminate` when actions threw exceptions (e.g., `utils::memory_limit_reached` during memory-constrained reader creation). Changes made: 1. Removed `noexcept` from `storage_group::for_each_compaction_group()` declaration and implementation 2. Removed `noexcept` from `storage_group::compaction_groups()` overloads (they call for_each_compaction_group) 3. Removed `noexcept` from `storage_group::live_disk_space_used()` and `memtable_count()` (they call compaction_groups()) 4. Kept `noexcept` on `storage_group::flush()` - it's a coroutine that automatically captures exceptions and returns them as exceptional futures 5. Removed `noexcept` from `table_load_stats()` functions in base class, table, and storage group managers Rationale: As noted by reviewers, there's no reason to kill the server if these functions throw. For coroutines returning futures, `noexcept` is appropriate because Seastar automatically captures exceptions and returns them as exceptional futures. For other functions, proper exception handling allows the system to recover gracefully instead of terminating. Fixes #27475 Closes scylladb/scylladb#27476 * github.com:scylladb/scylladb: replica: Remove unnecessary noexcept replica: Remove noexcept from compaction_groups() functions replica: Remove noexcept from storage_group::for_each_compaction_group	2025-12-11 15:17:35 +02:00
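The rationale in the merge above can be shown with a minimal sketch. The types here (`group`, `storage_group_sketch`, `try_visit`) are hypothetical and are not the ScyllaDB classes. The point is only that once the iterating function is not `noexcept`, an exception thrown by the callback reaches the caller, who can handle it. If `for_each` were declared `noexcept`, the same throw would end in `std::terminate`.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <vector>

struct group { int id; };

struct storage_group_sketch {
    std::vector<group> groups;

    // Deliberately not noexcept: if `action` throws, the exception
    // propagates to the caller instead of terminating the process.
    void for_each(const std::function<void(const group&)>& action) const {
        for (const auto& g : groups) {
            action(g);
        }
    }
};

// Returns false if visiting the groups failed, true otherwise.
bool try_visit(const storage_group_sketch& sg, int fail_on) {
    try {
        sg.for_each([fail_on](const group& g) {
            if (g.id == fail_on) {
                throw std::runtime_error("memory limit reached");
            }
        });
        return true;
    } catch (const std::runtime_error&) {
        return false; // the caller recovers; nothing is aborted
    }
}
```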
Benny Halevy	c8cff94a5a	api: storage_service/tablets/repair: disable incremental repair by default Change the default incremental_mode to `disabled` due to https://github.com/scylladb/scylladb/issues/26041 and https://github.com/scylladb/scylladb/issues/27414 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-12-11 14:25:21 +02:00
Benny Halevy	5fae4cdf80	docs: nodetool-commands: cluster: repair: fix incremental-mode example There is no 'regular' incremental mode anymore. The example seems to have meant 'disabled'. Fixes #27587 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-12-11 14:25:11 +02:00
Marcin Maliszkiewicz	8bbcaacba1	auth: always catch by const reference This is best practice. Closes scylladb/scylladb#27525	2025-12-11 12:42:30 +01:00
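A short sketch of why catching by const reference is the usual recommendation. The exception types below are hypothetical and are not taken from the auth module. Catching the base class by const reference keeps the dynamic type of the thrown object, so virtual calls still dispatch to the derived class. Catching by value would copy only the base part (slicing).

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical exception hierarchy, for illustration only.
struct auth_error : std::runtime_error {
    explicit auth_error(const std::string& msg) : std::runtime_error(msg) {}
    virtual std::string kind() const { return "auth_error"; }
};

struct role_missing : auth_error {
    explicit role_missing(const std::string& msg) : auth_error(msg) {}
    std::string kind() const override { return "role_missing"; }
};

// The handler names the base class, but because it binds a const reference
// the object is not copied and kind() resolves to the derived override.
std::string kind_by_const_ref() {
    try {
        throw role_missing("no such role");
    } catch (const auth_error& e) {
        return e.kind();
    }
}
```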
Yaron Kaikov	3dfa5ebd7f	Add JIRA issue validation to backport PR fixes check Extend the Fixes validation pattern to also accept JIRA issue references (format: [A-Z]+-\d+) in addition to GitHub issue references. This allows backport PRs to reference JIRA issues in the format 'Fixes: PROJECT-123'. Fixes: https://github.com/scylladb/scylladb/issues/27571 Closes scylladb/scylladb#27572	2025-12-11 12:23:16 +02:00
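A rough sketch of the kind of check the commit above describes, written in C++ for consistency with the rest of this log. The actual workflow is not shown in the commit message, so the function name `has_valid_fixes_ref` and the GitHub part of the pattern are assumptions. Only the JIRA key format `[A-Z]+-\d+` comes from the commit text. Full issue URLs are left out for brevity.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Accepts "Fixes" followed by either a GitHub-style reference
// (#123 or owner/repo#123) or a JIRA key such as PROJECT-123.
bool has_valid_fixes_ref(const std::string& body) {
    static const std::regex re(
        R"(Fixes:?\s+(?:(?:[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+)?#[0-9]+|[A-Z]+-[0-9]+))");
    return std::regex_search(body, re);
}
```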
Avi Kivity	24264e24bb	Revert "repair: Add tablet repair progress report support" This reverts commit `faad0167d7`. It causes a regression in test_two_tablets_concurrent_repair_and_migration_repair_writer_level in debug mode (with ~5%-10% probability). Fixes #27510. Closes scylladb/scylladb#27560	2025-12-11 12:18:11 +02:00
Nadav Har'El	0c64e3be9a	Merge 'Unify and fix rjson string and string_view conversions' from Marcin Maliszkiewicz This patch-set consolidates and corrects rjson string conversion handling. It removes unnecessary string copies, ensures proper length usage and replaces ad-hoc conversions with consistent helper functions. Overall, the changes make rjson string handling safer, faster, and more uniform across the codebase. Backport: no, it's a refactor Closes scylladb/scylladb#27394 * github.com:scylladb/scylladb: fix rjson::value to bytes conversion with missing GetStringLength call alternator: change type from string to string_view in should_add_capacity fix rjson::value to string_view conversion with missing GetStringLength call use rjson::to_string_view when rjson::value gets converted using GetStringLength use rjson::to_sstring and rjson::to_string for various string conversions utils: use rjson document wrapper in instance_profile_credentials_provider::parse_creds utils: move rjson::to_string_view func to string related place utils: add to_sstring and to_string rjson helper	2025-12-11 12:05:41 +02:00
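The "missing GetStringLength call" fixes in the merge above concern a general pitfall that can be sketched without any JSON library. The helper names below are illustrative, not the rjson helpers. A view built from a bare character pointer stops at the first NUL byte, while a view built from pointer plus length keeps the whole payload.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Length is supplied explicitly, so embedded NUL bytes are preserved.
std::string_view view_with_length(const char* data, std::size_t len) {
    return std::string_view(data, len);
}

// Length is computed as for a C string, so the view ends at the first NUL.
std::string_view view_from_cstr(const char* data) {
    return std::string_view(data);
}
```

For a buffer holding `"ab\0cd"` (5 bytes of payload), the first helper yields a view of size 5 and the second a view of size 2.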
Marcin Maliszkiewicz	d5b63df46e	transport: remove redundant futurize_invoke from counted data sink and source Closes scylladb/scylladb#27526	2025-12-11 10:32:16 +03:00
Dario Mirovic	f545ed37bc	test: dtest: audit_test.py: fix audit error log detection `test_insert_failure_doesnt_report_success` test in `test/cluster/dtest/audit_test.py` has an insert statement that is expected to fail. Dtest environment uses `FlakyRetryPolicy`, which has `max_retries = 5`. 1 initial fail and 5 retry fails means we expect 6 error audit logs. The test failed because `create keyspace ks` failed once, then succeeded on retry. It allowed the test to proceed properly, but the last part of the test that expects exactly 6 failed queries actually had 7. The goal of this patch is to make sure there are exactly 6 = 1 + `max_retries` failed queries, counting only the query expected to fail. If other queries fail with successful retry, it's fine. If other queries fail without successful retry, the test will fail, as it should in such situations. They are not related to this expected failed insert statement. Fixes #27322 Closes scylladb/scylladb#27378	2025-12-11 10:17:07 +03:00
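The counting rule described in the commit above can be sketched as follows. The real test is a Python dtest, so `audit_entry` and `count_failures_of` are hypothetical names. The idea is to count error entries only for the statement that is expected to fail, so an unrelated query that failed once and then succeeded on retry does not inflate the total.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified audit log entry.
struct audit_entry {
    std::string query;
    bool error;
};

// Count error entries for one specific query, ignoring all other queries.
std::size_t count_failures_of(const std::vector<audit_entry>& log,
                              const std::string& query) {
    return static_cast<std::size_t>(std::count_if(
        log.begin(), log.end(), [&query](const audit_entry& e) {
            return e.error && e.query == query;
        }));
}
```

With `max_retries = 5`, the expected count for the failing insert is `1 + max_retries = 6`, regardless of how many other statements were retried.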
Benny Halevy	5f13880a91	utils: error_injection: wait_for_message: print injection_name and caller source_location on timeout When waiting for the condition variable times out we call on_internal_error, but unfortunately, the backtrace it generates is obfuscated by `coroutine_handle<seastar::internal::coroutine_traits_base<void>::promise_type>::resume`. To make the log more useful, print the error injection name and the caller's source_location in the timeout error message. Fixes #27531 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#27532	2025-12-10 23:25:54 +01:00
Andrzej Jackowski	11ad32c85e	test/boost: add too_long_password to auth_passwords_test The test documents the current behavior of hashing algorithms that fail if the passphrase has 512 bytes or more. Moreover, it documents the behavior of the current bcrypt implementation that compares only the first 72 bytes of the password. Although we don't typically use bcrypt for password hashing, it is possible to insert such a hash using `CREATE ROLE ... WITH HASHED PASSWORD ...`. Refs: scylladb/scylladb#26842	2025-12-10 15:36:18 +01:00
Andrzej Jackowski	4c8c9cd548	test/boost: add same_hashes_as_crypt_r to auth_passwords_test The test verifies that the old and new implementation of SHA-512 hashing returns exactly the same values. Refs: scylladb/scylladb#26859	2025-12-10 15:36:18 +01:00
Andrzej Jackowski	98f431dd81	auth: utils: add yielding to crypt_sha512 This change allows yielding during hashing computations to prevent stalls. The performance of this solution was compared with the previous implementation that used one alien thread and the implementation after the alien thread was reverted. The results (median) of `perf-cql-raw` with `--connection-per-request 1 --smp 10` parameters are as follows: - Alien thread: 41.5 new connections/s per shard - Reverted alien thread: 244.1 new connections/s per shard - This commit (yielding in hashing): 198.4 new connections/s per shard The alien thread is limited by a single-core hashing throughput, which is roughly 400-500 hashes/s in the test environment. Therefore, with smp=10, the throughput is below 50 hashes/s, and the difference between the alien thread and other solutions further increases with higher smp. The roughly 20% performance deterioration compared to the old implementation without the alien thread comes from the fact that the new hashing algorithm implemented in `utils/crypt_sha512.cc` performs an expensive self-verification and stack cleanup. On the other hand, with smp=10 the current implementation achieves roughly 5x higher throughput than the alien thread. In addition, due to yielding added in this commit, the algorithm is expected to provide similar protection from stalls as the alien thread did. 
In a test that in parallel started a cassandra-stress workload and created thousands of new connections using python-driver, the values of `scylla_reactor_stalls_count` metric were as follows: - Alien thread: 109 stalls/shard total - Reverted alien thread: 13186 stalls/shard total - This commit (yielding in hashing): 149 stalls/shard total Similarly, the `scylla_scheduler_time_spent_on_task_quota_violations_ms` values were: - Alien thread: 1087 ms/shard total - Reverted alien thread: 72839 ms/shard total - This commit (yielding in hashing): 1623 ms/shard total To summarize, yielding during hashing computations achieves similar throughput to the old solution without the alien thread but also prevents stalls similarly to the alien thread. Fixes: scylladb/scylladb#26859 Refs: scylladb/scylla-enterprise#5711	2025-12-10 15:36:18 +01:00
Andrzej Jackowski	4ffdb0721f	auth: change return type of passwords::check to future Introduce a new `passwords::hash_with_salt_async` and change the return type of `passwords::check` to `future<bool>`. This enables yielding during password computations later in this patch series. The old method, `hash_with_salt`, is marked as deprecated because new code should use the new `hash_with_salt_async` function. We are not removing `hash_with_salt` now to reduce the regression risk of changing the hashing implementation—at least the methods that change persistent hashes (CREATE, ALTER) will continue to use the old hashing method. However, in the future, `hash_with_salt` should be entirely removed. Refs: scylladb/scylladb#26859	2025-12-10 15:36:18 +01:00
Andrzej Jackowski	775906d749	auth: remove code duplication in verify_scheme Refactoring: create a new function `verify_hashing_output` to reuse code in `hash_with_salt` and `verify_scheme`. The change is introduced to facilitate verification of hashing output when the implementation is extended later in this patch series. Refs: scylladb/scylladb#26859	2025-12-10 15:36:18 +01:00
Andrzej Jackowski	11eca621b0	test/boost: coroutinize auth_passwords_test This commit prepares `auth_passwords_test` for using coroutines, because later in this patch series `auth::passwords::check` and other similar functions will return Seastar futures. Refs: scylladb/scylladb#26859	2025-12-10 15:36:18 +01:00
Andrzej Jackowski	d7818b56df	utils: coroutinize crypt_sha512 Change `sha512crypt` and `__crypt_sha512` to coroutines to allow yielding during hash computations later in this patch series. Refs: scylladb/scylladb#26859	2025-12-10 15:36:18 +01:00
Andrzej Jackowski	033fed5734	utils: make crypt_sha512.cc to compile The purpose of this change is to allow the usage of Seastar futures in crypt_sha512 later in this patch series. Refs: scylladb/scylladb#26859	2025-12-10 15:36:18 +01:00
Andrzej Jackowski	c6c30b7d0a	utils: license: import crypt_sha512.c from musl to the project This patch imports the `crypt_sha512.c` file from the musl library. We need it to incorporate yielding in the `crypt_r` function to avoid reactor stalls during long hashing computations. Before this patch series, ScyllaDB used SHA-512 hashing provided by the `crypt_r` function, which in our case meant using the implementation from the `libxcrypt` library. Adding yielding to this `libxcrypt` implementation is problematic, both due to licensing (LGPL) and because the implementation is split into many functions across multiple files. In contrast, the SHA-512 implementation from `musl libc` has a more permissive license and is concise, which makes it easier to incorporate into the ScyllaDB codebase. Both `crypt_sha512.c` and musl license are obtained from git.musl-libc.org: - https://git.musl-libc.org/cgit/musl/tree/src/crypt/crypt_sha512.c - https://git.musl-libc.org/cgit/musl/tree/COPYRIGHT Import commit: commit 1b76ff0767d01df72f692806ee5adee13c67ef88 Author: Alex Rønne Petersen <alex@alexrp.com> Date: Sun Oct 12 05:35:19 2025 +0200 s390x: shuffle register usage in __tls_get_offset to avoid r0 as address Refs: scylladb/scylladb#26859	2025-12-10 15:36:18 +01:00
Andrzej Jackowski	5afcec4a3d	Revert "auth: move passwords::check call to alien thread" The alien thread was a solution for reactor stalls caused by indivisible password‑hashing tasks (scylladb/scylladb#24524). However, because there is only one alien thread, overall hashing throughput was reduced (see, e.g., scylladb/scylla-enterprise#5711). To address this, the alien‑thread solution is reverted, and a hashing implementation with yielding will be introduced later in this patch series. This reverts commit `9574513ec1`.	2025-12-10 15:36:09 +01:00
Tomasz Grabiec	0e51a1f812	replica: Remove unnecessary noexcept Can potentially lead to unnecessary abort. compaction_groups() and for_each_compaction_group() can throw. Co-authored-by: bhalevy <20910904+bhalevy@users.noreply.github.com>	2025-12-10 14:51:35 +01:00
Tomasz Grabiec	8b807b299e	replica: Remove noexcept from compaction_groups() functions They can throw during merge, when the number of compaction groups is higher than 3. Callers can deal with that, so we shouldn't abort.	2025-12-10 14:48:23 +01:00
Tomasz Grabiec	07ff659849	replica: Remove noexcept from storage_group::for_each_compaction_group They don't really have to be noexcept. And "action" may actually throw, leading to abort. It was observed to throw when creating memtable readers: terminate called after throwing an instance of 'utils::memory_limit_reached' what(): kill limit triggered on semaphore sl:users by permit xxx Aborting on shard 4, in scheduling group sl:users. std::terminate() at ??:0 __clang_call_terminate at main.cc:0 replica::storage_group::for_each_compaction_group(std::function<void (seastar::lw_shared_ptr<replica::compaction_group> const&)>) const at ./replica/table.cc:920 (inlined by) replica::table::add_memtables_to_reader_list(std::vector<mutation_reader, std::allocator<mutation_reader>>&, seastar::lw_shared_ptr<schema const> const&, reader_permit const&, interval<dht::ring_position> const&, query::partition_slice const&, tracing::trace_state_ptr const&, seastar::bool_class<streamed_mutation::forwarding_tag>, seastar::bool_class<mutation_reader::partition_range_forwarding_tag>, std::function<void (unsigned long)>) const at ./replica/table.cc:196 (inlined by) replica::table::make_reader_v2(seastar::lw_shared_ptr<schema const>, reader_permit, interval<dht::ring_position> const&, query::partition_slice const&, tracing::trace_state_ptr, seastar::bool_class<streamed_mutation::forwarding_tag>, seastar::bool_class<mutation_reader::partition_range_forwarding_tag>) const at ./replica/table.cc:243 (inlined by) replica::table::as_mutation_source() const::$_0::operator()(seastar::lw_shared_ptr<schema const>, reader_permit, interval<dht::ring_position> const&, query::partition_slice const&, tracing::trace_state_ptr, seastar::bool_class<streamed_mutation::forwarding_tag>, seastar::bool_class<mutation_reader::partition_range_forwarding_tag>) const at ./replica/table.cc:3673 (inlined by) mutation_reader std::__invoke_impl<mutation_reader, replica::table::as_mutation_source() const::$_0&, 
seastar::lw_shared_ptr<schema const>, reader_permit, interval<dht::ring_position> const&, query::partition_slice const&, tracing::trace_state_ptr, seastar::bool_class<streamed_mutation::forwarding_tag>, seastar::bool_class<mutation_reader::partition_range_forwarding_tag>>(std::__invoke_other, replica::table::as_mutation_source() const::$_0&, seastar::lw_shared_ptr<schema const>&&, reader_permit&&, interval<dht::ring_position> const&, query::partition_slice const&, tracing::trace_state_ptr&&, seastar::bool_class<streamed_mutation::forwarding_tag>&&, seastar::bool_class<mutation_reader::partition_range_forwarding_tag>&&) at /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/invoke.h:61 (inlined by) std::enable_if<is_invocable_r_v<mutation_reader, replica::table::as_mutation_source() const::$_0&, seastar::lw_shared_ptr<schema const>, reader_permit, interval<dht::ring_position> const&, query::partition_slice const&, tracing::trace_state_ptr, seastar::bool_class<streamed_mutation::forwarding_tag>, seastar::bool_class<mutation_reader::partition_range_forwarding_tag>>, mutation_reader>::type std::__invoke_r<mutation_reader, replica::table::as_mutation_source() const::$_0&, seastar::lw_shared_ptr<schema const>, reader_permit, interval<dht::ring_position> const&, query::partition_slice const&, tracing::trace_state_ptr, seastar::bool_class<streamed_mutation::forwarding_tag>, seastar::bool_class<mutation_reader::partition_range_forwarding_tag>>(replica::table::as_mutation_source() const::$_0&, seastar::lw_shared_ptr<schema const>&&, reader_permit&&, interval<dht::ring_position> const&, query::partition_slice const&, tracing::trace_state_ptr&&, seastar::bool_class<streamed_mutation::forwarding_tag>&&, seastar::bool_class<mutation_reader::partition_range_forwarding_tag>&&) at /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/invoke.h:114 (inlined by) std::_Function_handler<mutation_reader (seastar::lw_shared_ptr<schema const>, reader_permit, 
interval<dht::ring_position> const&, query::partition_slice const&, tracing::trace_state_ptr, seastar::bool_class<streamed_mutation::forwarding_tag>, seastar::bool_class<mutation_reader::partition_range_forwarding_tag>), replica::table::as_mutation_source() const::$_0>::_M_invoke(std::_Any_data const&, seastar::lw_shared_ptr<schema const>&&, reader_permit&&, interval<dht::ring_position> const&, query::partition_slice const&, tracing::trace_state_ptr&&, seastar::bool_class<streamed_mutation::forwarding_tag>&&, seastar::bool_class<mutation_reader::partition_range_forwarding_tag>&&) at /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/std_function.h:290 (inlined by) std::function<mutation_reader (seastar::lw_shared_ptr<schema const>, reader_permit, interval<dht::ring_position> const&, query::partition_slice const&, tracing::trace_state_ptr, seastar::bool_class<streamed_mutation::forwarding_tag>, seastar::bool_class<mutation_reader::partition_range_forwarding_tag>)>::operator()(seastar::lw_shared_ptr<schema const>, reader_permit, interval<dht::ring_position> const&, query::partition_slice const&, tracing::trace_state_ptr, seastar::bool_class<streamed_mutation::forwarding_tag>, seastar::bool_class<mutation_reader::partition_range_forwarding_tag>) const at /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/std_function.h:591 (inlined by) mutation_source::make_reader_v2(seastar::lw_shared_ptr<schema const>, reader_permit, interval<dht::ring_position> const&, query::partition_slice const&, tracing::trace_state_ptr, seastar::bool_class<streamed_mutation::forwarding_tag>, seastar::bool_class<mutation_reader::partition_range_forwarding_tag>) const at ././readers/mutation_source.hh:143 query::querier_base::querier_base(seastar::lw_shared_ptr<schema const>, reader_permit, interval<dht::ring_position>, query::partition_slice, mutation_source const&, tracing::trace_state_ptr, query::querier_base::querier_config) at ././querier.hh:91 (inlined by) 
query::querier::querier(mutation_source const&, seastar::lw_shared_ptr<schema const>, reader_permit, interval<dht::ring_position>, query::partition_slice, tracing::trace_state_ptr, query::querier_base::querier_config) at ././querier.hh:164 (inlined by) replica::table::query(seastar::lw_shared_ptr<schema const>, reader_permit, query::read_command const&, query::result_options, std::vector<interval<dht::ring_position>, std::allocator<interval<dht::ring_position>>> const&, tracing::trace_state_ptr, query::result_memory_limiter&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l>>>, std::optional<query::querier>) at ./replica/table.cc:3583 replica::database::query(seastar::lw_shared_ptr<schema const>, query::read_command const&, query::result_options, std::vector<interval<dht::ring_position>, std::allocator<interval<dht::ring_position>>> const&, tracing::trace_state_ptr, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l>>>, std::variant<std::monostate, db::per_partition_rate_limit::account_only, db::per_partition_rate_limit::account_and_enforce>)::$_0::operator()(reader_permit) const at ./replica/database.cc:1533 (inlined by) seastar::noncopyable_function<seastar::future<void> (reader_permit)>::indirect_vtable_for<replica::database::query(seastar::lw_shared_ptr<schema const>, query::read_command const&, query::result_options, std::vector<interval<dht::ring_position>, std::allocator<interval<dht::ring_position>>> const&, tracing::trace_state_ptr, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l>>>, std::variant<std::monostate, db::per_partition_rate_limit::account_only, db::per_partition_rate_limit::account_and_enforce>)::$_0>::call(seastar::noncopyable_function<seastar::future<void> (reader_permit)> const, reader_permit) (.llvm.13537529942037499926) at ././seastar/include/seastar/util/noncopyable_function.hh:158 
seastar::noncopyable_function<seastar::future<void> (reader_permit)>::operator()(reader_permit) const at ././seastar/include/seastar/util/noncopyable_function.hh:215 (inlined by) reader_concurrency_semaphore::execution_loop() (.resume) at ./reader_concurrency_semaphore.cc:980 std::__n4861::coroutine_handle<seastar::internal::coroutine_traits_base<void>::promise_type>::resume() const at /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/coroutine:242 (inlined by) seastar::internal::coroutine_traits_base<void>::promise_type::run_and_dispose() at ./build/release/seastar/./seastar/include/seastar/core/coroutine.hh:122 (inlined by) seastar::reactor::run_tasks(seastar::reactor::task_queue&) at ./build/release/seastar/./seastar/src/core/reactor.cc:2627 (inlined by) seastar::reactor::run_some_tasks() at ./build/release/seastar/./seastar/src/core/reactor.cc:3099 seastar::reactor::do_run() at ./build/release/seastar/./seastar/src/core/reactor.cc:3267 seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_0::operator()() const at ./build/release/seastar/./seastar/src/core/reactor.cc:4591 (inlined by) void std::__invoke_impl<void, seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_0&>(std::__invoke_other, seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_0&) at /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/invoke.h:61 (inlined by) std::enable_if<is_invocable_r_v<void, seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_0&>, void>::type std::__invoke_r<void, seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_0&>(seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_0&) at /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/invoke.h:111 (inlined by) std::_Function_handler<void (), 
seastar::smp::configure(seastar::smp_options const&, seastar::reactor_options const&)::$_0>::_M_invoke(std::_Any_data const&) at /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/std_function.h:290 std::function<void ()>::operator()() const at /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/bits/std_function.h:591 Fixes #27475 Co-authored-by: bhalevy <20910904+bhalevy@users.noreply.github.com>	2025-12-10 14:48:11 +01:00
Yaron Kaikov	d3e199984e	auto-backport.py: modify instruction for making PR ready for review Update the comment sent when a PR has conflicts with clear instructions on how to make the PR ready for review Fixes: https://scylladb.atlassian.net/browse/RELENG-152 Closes scylladb/scylladb#27547	2025-12-10 14:53:38 +02:00
Nadav Har'El	8822c23ad4	Merge 'test: cqlpy: test_protocol_exceptions.py: increase cpp exceptions thr…' from Dario Mirovic …eshold The initial problem: Some of the tests in test_protocol_exceptions.py started failing. The failure is on the condition that no more than `cpp_exception_threshold` exceptions happened. Test logic: These tests assert that specific code paths do not throw an exception anymore. The initial implementation ran a code path once and asserted there were 0 exceptions. Sometimes one or several exceptions can occur that are not directly related to the code paths the tests check, but those would fail the tests. The solution was to run the tests multiple times. If there is a regression, there would be at least as many exceptions thrown as there are test runs. If there is no regression, a few exceptions might happen, up to 10 per 100 test runs. I have arbitrarily chosen the `run_count = 100` and `cpp_exception_threshold = 10` values. Note that the exceptions are counted per shard, not per code path. The new problem: Some parts of the server now occasionally throw a few more exceptions than before. Based on the logs linked on the issues, it is usually 12. There are possibly multiple ways to resolve the issue. I have considered logging exceptions and parsing them. I would have to filter exception logs only for wanted exceptions. However, if a new, different exception is introduced, it might not be counted. Another approach is to just increase the threshold a bit. The issue of throwing more exceptions than before in some other server modules should be addressed by a set of tests for that module, just like these tests check protocol exceptions, not caring who used protocol check code paths. For those reasons, the solution implemented here is to increase `cpp_exception_threshold` to `20`. 
It will not make the tests unreliable, because, as mentioned, if there is a regression, there would be at least `run_count` exceptions per `run_count` test runs (1 exception per single test run). Still, to make the occurrence of "background exceptions" a bit more normalized, `run_count` too is doubled, from `100` to `200`. At first glance this looks like nothing has changed, but actually doubling both run count and exception threshold here implies that the burst does not scale as much as run count; it is just that the "jitter" is bigger than the old threshold. Also, this patch series enables debug logging for the `exception` logger. This will allow us to inspect which exceptions happened if a protocol exceptions test fails again. Fixes #27247 Fixes #27325 Issue observed on master and branch-2025.4. The tests, in the same form, exist on master, branch-2025.4, branch-2025.3, branch-2025.2, and branch-2025.1. Code change is simple, and no issue is expected with backport automation. Thus, backports for all the aforementioned versions are requested. Closes scylladb/scylladb#27412 * github.com:scylladb/scylladb: test: cqlpy: test_protocol_exceptions.py: enable debug exception logging test: cqlpy: test_protocol_exceptions.py: increase cpp exceptions threshold	2025-12-10 10:53:30 +02:00
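The threshold reasoning in the merge message above can be sketched roughly as follows. This is only an illustration, not the code in test_protocol_exceptions.py: the function name `looks_like_regression` and its counter arguments are hypothetical, and only the values `200` and `20` come from the commit message.

```python
# Values taken from the commit message; everything else is a hypothetical sketch.
RUN_COUNT = 200                 # how many times the code path is exercised
CPP_EXCEPTION_THRESHOLD = 20    # tolerated "background" exceptions


def looks_like_regression(exceptions_before: int, exceptions_after: int,
                          threshold: int = CPP_EXCEPTION_THRESHOLD) -> bool:
    """Return True when more exceptions were thrown than background noise explains.

    A real regression throws at least once per run, so the delta would be
    at least RUN_COUNT, far above the threshold.
    """
    return exceptions_after - exceptions_before > threshold
```

With these numbers, a delta of 12 background exceptions passes, while a regression that throws once per run (a delta of at least 200) is still caught.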
Marcin Maliszkiewicz	be9992cfb3	fix rjson::value to bytes conversion with missing GetStringLength call	2025-12-09 19:27:22 +01:00
Marcin Maliszkiewicz	daf00a7f24	alternator: change type from string to string_view in should_add_capacity It avoids allocation.	2025-12-09 19:27:21 +01:00
Marcin Maliszkiewicz	62962f33bb	fix rjson::value to string_view conversion with missing GetStringLength call In some cases we unnecessarily convert to string, which causes a copy. In others we convert without calling GetStringLength, which causes an iteration to determine a length that is already known. In some cases we even do both. This commit fixes that.	2025-12-09 19:27:21 +01:00
Marcin Maliszkiewicz	060c2f7c0d	use rjson::to_string_view when rjson::value gets converted using GetStringLength This commit is only cosmetics, changes calls to GetStringLength into rjson::to_string_view with the same underlying implementation.	2025-12-09 19:27:21 +01:00
Marcin Maliszkiewicz	64149b57c3	use rjson::to_sstring and rjson::to_string for various string conversions In some cases we omit size checking, which is wrong because, according to the RapidJSON documentation, strings may contain a \0 byte in the middle.	2025-12-09 19:27:21 +01:00
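The embedded `\0` problem mentioned above can be illustrated with a small sketch. It uses Python's `ctypes` as a stand-in for C-string semantics and does not touch RapidJSON or the `rjson` helpers; the variable names are made up for the example.

```python
import ctypes

payload = b"ab\x00cd"                        # 5 bytes with a NUL in the middle
buf = ctypes.create_string_buffer(payload)   # copies payload and appends a trailing NUL

# C-string style read: scans for the first NUL, so the tail is silently lost.
c_style = buf.value

# Length-aware read: uses the size that is already known, as GetStringLength would supply.
sized = buf.raw[:len(payload)]
```

Here `c_style` holds only the bytes before the first NUL, while `sized` preserves the whole payload, which is why the conversions need the explicit length.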
Marcin Maliszkiewicz	4b004fcdfc	utils: use rjson document wrapper in instance_profile_credentials_provider::parse_creds So that we can use our common utility functions.	2025-12-09 19:27:21 +01:00
Marcin Maliszkiewicz	5e38b3071b	utils: move rjson::to_string_view func to string related place	2025-12-09 19:27:21 +01:00
Marcin Maliszkiewicz	225b3351fc	utils: add to_sstring and to_string rjson helper So that conversion code is common and it's easier to avoid accidental type conversions. Additionally according to rapid json library size must be checked explicitly, this also avoids extra iteration in char* to (s)string conversion.	2025-12-09 19:27:21 +01:00
Avi Kivity	80c6718ea8	build: update toolchain to Fedora 43 with clang 21.1.6 Rebase to Fedora 43 with clang 21.1 and libstdc++ 15. Fedora container image registry moved to registry.fedoraproject.org as it seems to be updated more regularly. Added python3-devel to the dependencies as some packages scylla-cqlsh depends on aren't yet available in the form of wheels for Python 3.14, and so have to be built locally. In any case it's better to reduce dependency on those wheels even if the ones currently missing appear eventually. Added libev-devel to the dependencies so that the python driver builds correctly even if "wheels" are not published. This reduces our dependency on the python driver's binary release schedule. Without libev-devel, TLS does not work correctly. We no longer remove the clang and clang-libs packages. Doxygen started depending on clang-libs, and removing them removes doxygen, breaking the build when it looks for that. The build will still pick up the optimized clang, since /usr/local/bin is earlier in the path. We keep the clang package, since it allows us to mess a little less with the directory structure. 
Optimized clang binaries generated and stored in https://devpkg.scylladb.com/clang/clang-21.1.6-Fedora-43-aarch64.tar.gz https://devpkg.scylladb.com/clang/clang-21.1.6-Fedora-43-x86_64.tar.gz With ./scripts/refresh-pgo-profiles.sh, the new compiler shows a small performance improvement (instructions_per_op) in perf-simple-query: clang 21: 259353.60 tps ( 64.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 35720 insns/op, 17427 cycles/op, 0 errors) 265940.08 tps ( 64.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 35725 insns/op, 17042 cycles/op, 0 errors) 262650.01 tps ( 64.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 35720 insns/op, 17240 cycles/op, 0 errors) 262881.22 tps ( 64.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 35675 insns/op, 17222 cycles/op, 0 errors) 264898.68 tps ( 64.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 35732 insns/op, 17070 cycles/op, 0 errors) throughput: mean= 263144.72 standard-deviation=2528.69 median= 262881.22 median-absolute-deviation=1753.96 maximum=265940.08 minimum=259353.60 instructions_per_op: mean= 35714.47 standard-deviation=22.34 median= 35720.38 median-absolute-deviation=10.20 maximum=35732.14 minimum=35675.50 cpu_cycles_per_op: mean= 17200.12 standard-deviation=154.62 median= 17221.70 median-absolute-deviation=129.77 maximum=17427.33 minimum=17041.57 clang 20: 254431.39 tps ( 64.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 35883 insns/op, 17708 cycles/op, 0 errors) 259701.02 tps ( 64.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 35883 insns/op, 17351 cycles/op, 0 errors) 261166.92 tps ( 64.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 35912 insns/op, 17270 cycles/op, 0 errors) 260656.31 tps ( 64.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 35869 insns/op, 17289 cycles/op, 0 errors) 259628.13 tps ( 64.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 35946 insns/op, 17370 cycles/op, 0 errors) throughput: mean= 259116.75 standard-deviation=2698.56 median= 259701.02 median-absolute-deviation=1539.55 maximum=261166.92 
minimum=254431.39 instructions_per_op: mean= 35898.42 standard-deviation=30.69 median= 35882.97 median-absolute-deviation=15.90 maximum=35945.63 minimum=35869.02 cpu_cycles_per_op: mean= 17397.49 standard-deviation=178.35 median= 17351.35 median-absolute-deviation=108.79 maximum=17707.63 minimum=17269.68 Closes scylladb/scylladb#26773	2025-12-09 15:16:31 +02:00
Pavel Emelyanov	855b91ec20	scripts: Make PR merging check more granular Currently we have 3 explicit checks, and some of them are configurable: - Jenkins job being stable. Can be disabled with --force - Whether a submodule update is happening. It's not allowed by default, and should be enabled with the --allow-submodule option - Target branch checking (recently merged #27249). Happens unconditionally This PR unifies all checks in two ways. First, each restriction can be lifted with --allow-foo options. The existing --allow-submodule stays and two options are added: - --allow-unstable to skip the Jenkins job check (like --force works now) - --allow-any-branch to skip the target branch check Second, the --force option lifts all the known restrictions. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#27294	2025-12-09 13:58:21 +02:00
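The option scheme described above (one `--allow-foo` flag per restriction, plus `--force` lifting all of them) could be modelled as in the sketch below. The merge script itself is not shown in this log, so its language and internals are unknown here; the sketch uses Python's `argparse` purely for illustration, and `parse_allowed` and `CHECKS` are hypothetical names. Only the four option names come from the commit message.

```python
import argparse

# One entry per restriction named in the commit message.
CHECKS = ("unstable", "submodule", "any_branch")


def parse_allowed(argv):
    """Return the set of restrictions the caller has lifted."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--allow-unstable", action="store_true")
    parser.add_argument("--allow-submodule", action="store_true")
    parser.add_argument("--allow-any-branch", action="store_true")
    parser.add_argument("--force", action="store_true")
    args = parser.parse_args(argv)
    # --force lifts every known restriction; otherwise honour each flag individually.
    return {name for name in CHECKS
            if args.force or getattr(args, f"allow_{name}")}
```

Keeping the restrictions in one tuple means a new check only needs a new `--allow-...` flag, and `--force` covers it automatically.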
Nadav Har'El	95e303faf3	Merge 'Refactor get_view_natural_endpoint' from Wojciech Mitros With the introduction of rack-lists and the reliance of materialized views on them, the `get_view_natural_endpoint` function can be greatly simplified. When using tablets, instead of doing any index-matching, we can now pair base tables with views only in the same rack. In this series we remove no longer needed code and reorganize the needed code for better clarity. After the changes, the `get_view_natural_endpoint` function goes down from 245 lines to 85 lines, while the whole pairing-related text goes down from 346 lines to 239 lines. Fixes https://github.com/scylladb/scylladb/issues/26313 Closes scylladb/scylladb#27383 * github.com:scylladb/scylladb: mv: replace the simple/complex rack-aware pairing with exact rack matching mv: split out vnode pairing code from get_view_natural_endpoint mv: unify self-pairing and rack-aware pairing into one bool mv: remove the workaround for left nodes when sending view updates	2025-12-09 13:19:13 +02:00
Nadav Har'El	8ba595e472	Merge 'alternator: fix batch writes during intranode tablet migrations' from Petr Gusev Scylla implements `LWT` in the `storage_proxy::cas` method. This method expects to be called on a specific shard, represented by the `cas_shard` parameter. Clients must create this object before calling `storage_proxy::cas`, check its `this_shard()` method, and jump to `cas_shard.shard()` if it returns false. The nuance is that by the time the request reaches the destination shard, the tablet may have already advanced in its migration state machine. For example, a client may acquire a `cas_shard` at the `streaming` tablet state, then submit a request to another shard via `smp::submit_to(cas_shard.shard())`. However, the new `cas_shard` created on that other shard might already be in the `write_both_read_new` state, and its `cas_shard.shard()` would not be equal to `this_shard_id()`. Such a broken invariant results in an `on_internal_error` in `storage_proxy::cas`. Clients of `storage_proxy::cas` are expected to check `cas_shard.this_shard()` and recursively jump to another shard if it returns false. Most calls to `storage_proxy::cas` already implement this logic. The only exception is `executor::do_batch_write`, which currently checks `cas_shard.this_shard()` only once. This can break the invariant if the tablet state changes more than once during the operation. This PR fixes the issue by implementing recursive `cas_shard.this_shard()` checks in `executor::do_batch_write`. It also adds a test that reproduces the problem. 
Fixes: scylladb/scylladb#27353 backport: need to be backported to 2025.4 Closes scylladb/scylladb#27396 * github.com:scylladb/scylladb: alternator/executor.cc: eliminate redundant dk copy alternator/executor.cc: release cas_shard on the original shard alternator/executor.cc: move shard check into cas_write alternator/executor.cc: make cas_write a private method alternator/executor.cc: make do_batch_write a private method alternator/executor.cc: fix indent test_alternator: add test_alternator_invalid_shard_for_lwt	2025-12-09 11:25:15 +02:00
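The "check `this_shard()`, jump, and check again" pattern described above can be modelled with a toy simulation. This is not Seastar or ScyllaDB code: `CasShard`, `run_on_owning_shard`, `lookup_owner` and `max_hops` are hypothetical stand-ins, the real code hops shards with `smp::submit_to`, and the recursion from the commit message is written here as a loop.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class CasShard:
    """Toy stand-in for cas_shard: remembers the owning shard and the current one."""
    owner: int
    current: int

    def this_shard(self) -> bool:
        return self.owner == self.current

    def shard(self) -> int:
        return self.owner


def run_on_owning_shard(start_shard: int,
                        lookup_owner: Callable[[], int],
                        max_hops: int = 16) -> Tuple[int, int]:
    """Re-check ownership after every hop; return (final shard, number of hops)."""
    shard, hops = start_shard, 0
    for _ in range(max_hops + 1):
        # A fresh CasShard is created on whichever shard we are on now.
        cas = CasShard(owner=lookup_owner(), current=shard)
        if cas.this_shard():
            return shard, hops          # invariant holds, safe to run the CAS here
        shard = cas.shard()             # the owner moved, so hop and check again
        hops += 1
    raise RuntimeError("tablet owner kept changing")


# Simulated migration: the owner changes between the first and second lookup.
owners = iter([1, 2, 2])
final_shard, hop_count = run_on_owning_shard(0, lambda: next(owners))
```

In the simulated run a single check would have stopped on shard 1 even though the owner had already moved to shard 2; re-checking after each hop is what lets the loop end on the owning shard.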

50941 Commits