scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-05 14:33:08 +00:00

Author	SHA1	Message	Date
Sergey Zolotukhin	e157e8577e	gossiper: check for a race condition in `do_apply_state_locally` In do_apply_state_locally, a race condition can occur if a task is suspended at a preemption point while the node entry is not locked. During this time, the host may be removed from _endpoint_state_map. When the task resumes, this can lead to inserting an entry with an empty host ID into the map, causing various errors, including a node crash. This change 1. adds a check after locking the map entry: if a gossip ACK update does not contain a host ID, we verify that an entry with that host ID still exists in the gossiper’s _endpoint_state_map. 2. Removes xfail from the test_gossiper_race test since the issue is now fixed. 3. Adds exception handling in `do_shadow_round` to skip responses from nodes that sent an empty host ID. This re-applies the commit `13392a40d4` that was reverted in `46aa59fe49`, after fixing the issues that caused the CI to fail. Fixes: scylladb/scylladb#25702 Fixes: scylladb/scylladb#25621 Ref: scylladb/scylla-enterprise#5613 (cherry picked from commit `f08df7c9d7`)	2025-09-08 21:54:43 +00:00
Emil Maskovsky	e8b903979e	test/gossiper: add reproducible test for race condition during node decommission This change introduces a targeted test that simulates the gossiper race condition observed during node decommissioning. The test delays gossip state application and host ID lookup to reliably reproduce the scenario where `gossiper::get_host_id()` is called on a removed endpoint, potentially triggering an abort in `apply_new_states`. There is a specific error injection added to widen the race window, in order to increase the likelihood of hitting the race condition. The error injection is designed to delay the application of gossip state updates, for the specific node that is being decommissioned. This should then result in the server abort in the gossiper. This re-applies the commit `5dac4b38fb` that was reverted in `dc44fca67c`, but modified to relax the check from "on_internal_error" to just a warning log. The stricter check can be re-introduced later once we are sure that all remaining problems are resolved and it will not break the CI. Refs: scylladb/scylladb#25621 Fixes: scylladb/scylladb#25721 (cherry picked from commit `28e0f42a83`)	2025-09-08 21:54:43 +00:00
Calle Wilund	71ef2caef9	system_keyspace: Prune dropped tables from truncation on start/drop Fixes #25683 Once a table drop is complete, there should be no reason to retain truncation records for it, as any replay should skip mutations anyway (no CF), and iff we somehow resurrect a dropped table, this replay-resurrected data is the least problem anyway. Adds a prune phase to the startup drop_truncation_rp_records run, which ignores updating, and instead deletes records for nonexistent tables (which should patch any existing servers with lingering data as well). Also does an explicit delete of records on actual table DROP, to ensure we don't grow this table more than needed even in long uptime nodes. Small unit test included. Closes scylladb/scylladb#25699 (cherry picked from commit `bc20861afb`) Closes scylladb/scylladb#25813	2025-09-08 16:26:04 +03:00
Avi Kivity	5de570c9ae	Merge 'auth: move passwords::check call to alien thread' from Andrzej Jackowski Analysis of customer stalls revealed that the function `detail::hash_with_salt` (invoked by `passwords::check`) often blocks the reactor. Internally, this function uses the external `crypt_r` function to compute password hashes, which is CPU-intensive. This PR addresses the issue in two ways: 1) `sha-512` is now the only password hashing scheme for new passwords (it was already the common-case). 2) `passwords::check` is moved to a dedicated alien thread. Regarding point 1: before this change, the following hashing schemes were supported by `identify_best_supported_scheme()`: bcrypt_y, bcrypt_a, SHA-512, SHA-256, and MD5. The reason for this was that the `crypt_r` function used for password hashing comes from an external library (currently `libxcrypt`), and the supported hashing algorithms vary depending on the library in use. However: - The bcrypt schemes never worked properly because their prefixes lack the required round count (e.g. `$2y$` instead of `$2y$05$`). Moreover, bcrypt is slower than SHA-512, so it is not a good idea to fix or use it. - SHA-256 and SHA-512 both belong to the SHA-2 family. Libraries that support one almost always support the other, so it’s very unlikely to find SHA-256 without SHA-512. - MD5 is no longer considered secure for password hashing. Regarding point 2: the `passwords::check` call now runs on a shared alien thread created at database startup. An `std::mutex` synchronizes that thread with the shards. In theory this could introduce frequent lock contention, but in practice each shard handles only a few hundred new connections per second, even during storms. There is already a `_conns_cpu_concurrency_semaphore` in `generic_server` that limits the number of concurrent connection handlers. Fixes https://github.com/scylladb/scylladb/issues/24524 Backport not needed, as it is a new feature. 
Closes scylladb/scylladb#24924 * github.com:scylladb/scylladb: main: utils: add thread names to alien workers auth: move passwords::check call to alien thread test: wait for 3 clients with given username in test_service_level_api auth: refactor password checking in password_authenticator auth: make SHA-512 the only password hashing scheme for new passwords auth: whitespace change in identify_best_supported_scheme() auth: require scheme as parameter for `generate_salt` auth: check password hashing scheme support on authenticator start (cherry picked from commit `c762425ea7`)	2025-09-07 14:30:26 +03:00
Pavel Emelyanov	112827b170	Revert "test/gossiper: add reproducible test for race condition during node decommission" This reverts commit `46f8404100` because parent PR had been reverted as per #25803	2025-09-05 10:07:10 +03:00
Pavel Emelyanov	ed3e671564	Merge '[Backport 2025.2] drop table: fix crash on drop table with concurrent cleanup' from Scylladb[bot] Consider the following scenario: - A tablet is migrated away from a shard - The tablet cleanup stage closes the storage group's async_gate - A drop table runs truncate which attempts to disable compaction on the tablet with its gate closed. This fails, because table::parallel_foreach_compaction_group() ultimately calls storage_group_manager::parallel_foreach_storage_group() which will not disable compaction if it can't hold the storage group's gate - Truncate calls table::discard_sstables() which checks if the compaction has been disabled, and because it hasn't, it then runs on_internal_error() with "compaction not disabled on table ks.cf during TRUNCATE" which causes a crash Fixes: #25706 This needs to be backported to all supported versions with tablets - (cherry picked from commit `a0934cf80d`) - (cherry picked from commit `1b8a44af75`) Parent PR: #25708 Closes scylladb/scylladb#25784 * github.com:scylladb/scylladb: test: reproducer and test for drop with concurrent cleanup truncate: check for closed storage group's gate in discard_sstables	2025-09-04 08:44:29 +03:00
Emil Maskovsky	46f8404100	test/gossiper: add reproducible test for race condition during node decommission This change introduces a targeted test that simulates the gossiper race condition observed during node decommissioning. The test delays gossip state application and host ID lookup to reliably reproduce the scenario where `gossiper::get_host_id()` is called on a removed endpoint, potentially triggering an abort in `apply_new_states`. There is a specific error injection added to widen the race window, in order to increase the likelihood of hitting the race condition. The error injection is designed to delay the application of gossip state updates, for the specific node that is being decommissioned. This should then result in the server abort in the gossiper. Refs: scylladb/scylladb#25621 Fixes: scylladb/scylladb#25721 Backport: The test is primarily for an issue found in 2025.1, so it needs to be backported to all the 2025.x branches. Closes scylladb/scylladb#25685 (cherry picked from commit `5dac4b38fb`) Closes scylladb/scylladb#25780	2025-09-02 20:54:09 +02:00
Ferenc Szili	1bf298c722	test: reproducer and test for drop with concurrent cleanup This change adds a reproducer and test for issue #25706 (cherry picked from commit `1b8a44af75`)	2025-09-02 02:18:21 +00:00
Nadav Har'El	c04b086929	alternator: avoid oversized allocation in Query/Scan This patch fixes one cause of oversized allocations - and therefore potentially stalls and increased tail latencies - in Alternator. Alternator's Scan or Query operation return a page of results. When the number of items is not limited by a "Limit" parameter, the default is to return a 1 MB page. If items are short, a large number of them can fit in that 1MB. The test test_query.py::test_query_large_page_small_rows has 30,000 items returned in a single page. In the response JSON, all these items are returned in a single array "Items". Before this patch, we build the full response as a RapidJSON object before sending it. The problem is that unfortunately, RapidJSON stores arrays as contiguous allocations. This results in large contiguous allocations in workloads that scan many small items, and large contiguous allocations can also cause stalls and high tail latencies. For example, before this patch, running test/alternator/run --runveryslow \ test_query.py::test_query_large_page_small_rows reports in the log: oversized allocation: 573440 bytes. After this patch, this warning no longer appears. The patch solves the problem by collecting the scanned items not in a RapidJSON array, but rather in a chunked_vector<rjson::value>, i.e, a chunked (non-contiguous) array of items (each a JSON value). After collecting this array separately from the response object, we need to print its content without actually inserting it into the object - we add a new function print_with_extra_array() to do that. The new separate-chunked-vector technique is used when a large number (currently, >256) of items were scanned. When there is a smaller number of items in a page (this is typical when each item is longer), we just insert those items in the object and print it as before. 
Beyond the original slow test that demonstrated the oversized allocation (which is now gone), this patch also includes a new test which exercises the new code with a scan of 700 (>256) items in a page - but this new test is fast enough to be permanently in our test suite and not a manual "veryslow" test as the other test. Fixes #23535 (cherry picked from commit `2385fba4b6`) Closes scylladb/scylladb#25654	2025-09-01 16:40:02 +03:00
Nadav Har'El	46bd9f2f27	utils, alternator: fix detection of invalid base-64 This patch fixes an error-path bug in the base-64 decoding code in utils/base64.cc, which among other things is used in Alternator to decode blobs in JSON requests. The base-64 decoding code has a lookup table, which was wrongly sized 255 bytes, but needed to be 256 bytes. This meant that if the byte 255 (0xFF) was included in an invalid base-64 string, instead of detecting that this is an invalid byte (since the only valid bytes in a base-64 string are A-Z,a-z,0-9,+,/ and =), the code would either think it's valid with a nonsense 6-bit part, or even crash on an out-of-bounds read. Besides the trivial fix, this patch also includes a reproducing test, which tries to write a blob as a supposedly base-64 encoded string with a 0xFF byte in it. The test fails before this patch (the write succeeds, unexpectedly), and passes after this patch (the write fails as expected). The test also passes on DynamoDB. Fixes #25701 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#25705 (cherry picked from commit `ff91027eac`) Closes scylladb/scylladb#25765	2025-09-01 09:07:00 +03:00
Calle Wilund	a2bc1a7c6b	commitlog: Ensure segment deletion is re-entrant Fixes #25709 If we have large allocations, spanning more than one segment, and the internal segment references from lead to secondary are the only thing keeping a segment alive, the implicit drop in discard_unused_segments and orphan_all can cause a recursive call to discard_unused_segments, which in turn can lead to vector corruption/crash, or even double free of segment (iterator confusion). Need to separate the modification of the vector (_segments) from actual releasing of objects. Using temporaries is the easiest solution. To further reduce recursion, we can also do an early clear of segment dependencies in callbacks from segment release (cf release). Closes scylladb/scylladb#25719 (cherry picked from commit `cc9eb321a1`) Closes scylladb/scylladb#25755	2025-08-30 18:51:35 +03:00
Avi Kivity	1ae593da2e	Merge 'token_range_vector: fragment' from Avi Kivity token_range_vector is a sequence of intervals of tokens. It is used to describe vnodes or token ranges owned by shards. Since tokens are bloated (16 bytes instead of 8), and intervals are bloated (40 byte of overhead instead of 8), and since we have plenty of token ranges, such vectors can exceed our allocation unit of 128 kB and cause allocation stalls. This series fixes that by first generalizing some helpers and then changing token_range_vector to use chunked_vector. Although this touches IDL, there is no compatibility problem since the encoding for vector and chunked_vector are identical. There is no performance concern since token_range_vector is never used on any hot path (hot paths always contain a partition key). Fixes #3335. Fixes #24115. Fixes #24156 Closes scylladb/scylladb#25659 * github.com:scylladb/scylladb: dht: fragment token_range_vector partition_range_compat: generalize wrap/unwrap helpers utils: chunked_vector: add swap() method utils: chunked_vector: add range insert() overloads	2025-08-26 22:43:08 +03:00
Taras Veretilnyk	c272dc7746	keys: from_nodetool_style_string don't split single partition keys Users with single-column partition keys that contain colon characters were unable to use certain REST APIs and 'nodetool' commands, because the API split key by colon regardless of the partition key schema. Affected commands: - 'nodetool getendpoints' - 'nodetool getsstables' Affected endpoints: - '/column_family/sstables/by_key' - '/storage_service/natural_endpoints' Refs: #16596 - This does not fully fix the issue, as users with compound keys will face the issue if any column of the partition key contains a colon character. Closes scylladb/scylladb#24829 Closes scylladb/scylladb#25564	2025-08-26 10:31:54 +03:00
Avi Kivity	6171da6fbc	utils: chunked_vector: add swap() method Following std::vector(), we implement swap(). It's a simple matter of swapping all the contents. A unit test is added. (cherry picked from commit `13a75ff835`)	2025-08-25 12:44:13 +03:00
Avi Kivity	faaec66be7	utils: chunked_vector: add range insert() overloads Inserts an iterator range at some position. Again we insert the range at the end and use std::rotate() to move the newly inserted elements into place, forgoing possible optimizations. Unit tests are added. (cherry picked from commit `24e0d17def`)	2025-08-25 12:44:13 +03:00
Dawid Mędrek	99b65be52e	main: Log RF-rack-invalid keyspaces at startup When the configuration option `rf_rack_valid_keyspaces` is enabled and there is an RF-rack-invalid keyspace, starting a node fails. However, when the configuration option is disabled, but there still is a keyspace that violates the condition, we'd like Scylla to print a warning informing the user about the fact. That's what happens in this commit. We provide a validation test. (cherry picked from commit `837d267cbf`)	2025-08-22 14:31:13 +00:00
Dawid Mędrek	4df1d35375	cql3: Warn when creating RF-rack-invalid keyspace Although RF-rack-valid keyspaces are not universally enforced yet (they're governed by the configuration option `rf_rack_valid_keyspaces`), we'd like to encourage the user to abide by the restriction. To that end, we're introducing a warning when creating or altering a keyspace. If the configuration option is disabled, but the user is trying to create an RF-rack-invalid keyspace, they'll receive a warning. We provide a validation test. (cherry picked from commit `60ea22d887`)	2025-08-22 14:31:13 +00:00
Pavel Emelyanov	dee53a0107	Merge '[Backport 2025.2] generic server: 2 step shutdown' from Scylladb[bot] This PR implements solution proposed in scylladb/scylladb#24481 Instead of terminating connections immediately, the shutdown now proceeds in two stages: first closing the receive (input) side to stop new requests, then waiting for all active requests to complete before fully closing the connections. The updated shutdown process is as follows: 1. Initial Shutdown Phase * Close the accept gate to block new incoming connections. * Abort all accept() calls. * For all active connections: * Close only the input side of the connection to prevent new requests. * Keep the output side open to allow responses to be sent. 2. Drain Phase * Wait for all in-progress requests to either complete or fail. 3. Final Shutdown Phase * Fully close all connections. Fixes scylladb/scylladb#24481 - (cherry picked from commit `122e940872`) - (cherry picked from commit `3848d10a8d`) - (cherry picked from commit `3610cf0bfd`) - (cherry picked from commit `27b3d5b415`) - (cherry picked from commit `061089389c`) - (cherry picked from commit `7334bf36a4`) - (cherry picked from commit `ea311be12b`) - (cherry picked from commit `4f63e1df58`) Parent PR: #24499 Closes scylladb/scylladb#25518 * github.com:scylladb/scylladb: test: Set `request_timeout_on_shutdown_in_seconds` to `request_timeout_in_ms`, decrease request timeout. generic_server: Two-step connection shutdown. transport: consmetic change, remove extra blanks. generic_server: replace empty destructor with `= default` generic_server: refactor connection::shutdown to use `shutdown_input` and `shutdown_output` generic_server: add `shutdown_input` and `shutdown_output` functions to `connection` class. test: Add test for query execution during CQL server shutdown	2025-08-19 17:11:22 +03:00
Wojciech Mitros	6f7f639f54	test: run mv tests depending on metrics on a standalone instance The test_base_partition_deletion_with_metrics test case (and the batch variant) uses the metric of view updates done during its runtime to check if we didn't perform too many of them. The test runs in the cqlpy suite, which runs all test cases sequentially on one Scylla instance. Because of this, if another test case starts a process which generates view updates and doesn't wait for it to finish before it exits, we may observe too many view updates in test_base_partition_deletion_with_metrics and fail the test. In all test cases we make sure that all tables that were created during the test are dropped at the end. However, that doesn't stop the view building process immediately, so the issue can happen even if we drop the view. I confirmed it by adding a test just before test_base_partition_deletion_with_metrics which builds a big materialized view and drops it at the end - the metrics check still failed. The issue could be caused by any of the existing test cases where we create a view and don't wait for it to be built. Note that even if we start adding rows after creating the view, some of them may still be included in the view building, as the view building process is started asynchronously. In such a scenario, the view building also doesn't cause any issues with the data in these tests - writes performed after view creation generate view updates synchronously when they're local (and we're running a single Scylla server), so the corresponding view updates generated during view building are redundant. Because we have many test cases which could be causing this issue, instead of waiting for the view building to finish in every single one of them, we move the susceptible test cases to be run on separate Scylla instances, in the "cluster" suite. There, no other test cases will influence the results. 
Fixes https://github.com/scylladb/scylladb/issues/20379 Closes scylladb/scylladb#25209 (cherry picked from commit `2ece08ba43`) Closes scylladb/scylladb#25503	2025-08-19 17:10:58 +03:00
Pavel Emelyanov	39a85df231	Merge '[Backport 2025.2] test: test_mv_backlog: fix to consider internal writes' from Scylladb[bot] The PR fixes a test flakiness issue in test_mv_backlog related to reading metrics. The first commit fixes a more general issue in the ScyllaMetrics helper class where it doesn't return the value of all matching lines when a specific shard is requested, but it breaks after the first match. The second commit fixes a test issue where it expects exactly one write to be throttled, not taking into account other internal writes that may be executed during this time. Fixes https://github.com/scylladb/scylladb/issues/23139 backport to improve CI stability - test only change - (cherry picked from commit `5c28cffdb4`) - (cherry picked from commit `276a09ac6e`) Parent PR: #25279 Closes scylladb/scylladb#25474 * github.com:scylladb/scylladb: test: test_mv_backlog: fix to consider internal writes test/pylib/rest_client: fix ScyllaMetrics filtering	2025-08-19 17:10:39 +03:00
Ferenc Szili	a0a346496e	test: remove test_tombstone_gc_disabled_on_pending_replica The test test_tombstone_gc_disabled_on_pending_replica was added when we fixed (#20788) the potential problem with data resurrection during file based streaming. The issue was occurring only in Enterprise, but we added the fix in OSS to limit code divergence. This test was added together with the fix in OSS with the idea to guard this change in OSS. The real reproducer and test for this fix was added later, after the fix was ported into Enterprise. It is in: test/cluster/test_resurrection.py Since Enterprise has been merged into OSS, there is no more need to keep the test test_tombstone_gc_disabled_on_pending_replica. Also, it is flaky with very low probability of failure, making it difficult to investigate the cause of failure. Fixes: #22182 Refs: scylladb/scylladb#25448 Closes scylladb/scylladb#25134 (cherry picked from commit `7ce96345bf`) Closes scylladb/scylladb#25572	2025-08-19 16:02:42 +03:00
Patryk Jędrzejczak	3113968380	test: test_maintenance_socket: use cluster_con for driver sessions The test creates all driver sessions by itself. As a consequence, all sessions use the default request timeout of 10s. This can be too low for the debug mode, as observed in scylladb/scylla-enterprise#5601. In this commit, we change the test to use `cluster_con`, so that the sessions have the request timeout set to 200s from now on. Fixes scylladb/scylla-enterprise#5601 This commit changes only the test and is a CI stability improvement, so it should be backported all the way to 2024.2. 2024.1 doesn't have this test. Closes scylladb/scylladb#25510 (cherry picked from commit `03cc34e3a0`) Closes scylladb/scylladb#25546	2025-08-18 16:43:31 +02:00
Sergey Zolotukhin	687439615f	test: Set `request_timeout_on_shutdown_in_seconds` to `request_timeout_in_ms`, decrease request timeout. In debug mode, queries may sometimes take longer than the default 30 seconds. To address this, the timeout value `request_timeout_on_shutdown_in_seconds` during tests is aligned with other request timeouts. Change request timeout for tests from 180s to 90s since we must keep the request timeout during shutdown significantly lower than the graceful shutdown timeout(2m), or else a request timeout would cause a graceful shutdown timeout and fail a test. (cherry picked from commit `4f63e1df58`)	2025-08-15 18:20:45 +02:00
Sergey Zolotukhin	44853b3bec	generic_server: Two-step connection shutdown. When shutting down in `generic_server`, connections are now closed in two steps. First, only the RX (receive) side is shut down. Then, after all ongoing requests have completed or a timeout has occurred, the connections are fully closed. Fixes scylladb/scylladb#24481 (cherry picked from commit `ea311be12b`)	2025-08-15 18:20:42 +02:00
Abhinav Jha	a0161ef67a	raft: replication test: change rpc_propose_conf_change test to SEASTAR_THREAD_TEST_CASE RAFT_TEST_CASE macro creates 2 test cases, one with random 20% packet loss named name_drops. The framework makes hard coded assumptions about leader which doesn't hold well in case of packet losses. This short term fix disables the packet drop variant of the specified test. It should be safe to re-enable it once the whole framework is re-worked to remove these hard coded assumptions. This PR fixes a bug. Hence we need to backport it. Fixes: scylladb/scylladb#23816 Closes scylladb/scylladb#25489 (cherry picked from commit `a0ee5e4b85`) Closes scylladb/scylladb#25527	2025-08-15 13:27:52 +03:00
Sergey Zolotukhin	aa8913f317	test: Add test for query execution during CQL server shutdown This test simulates a scenario where a query is being executed while the query coordinator begins shutting down the CQL server and client connections. The shutdown process should wait until the query execution is either completed or timed out. Test for scylladb/scylladb#24481 (cherry picked from commit `122e940872`)	2025-08-14 13:22:04 +00:00
Michael Litvak	a114e95798	test: test_mv_backlog: fix to consider internal writes The test executes a single write, fetching metrics before and after the write, and expects the total throttled writes count to be increased exactly by one. However, other internal writes (compaction for example) may be executed during this time and be throttled, causing the metrics to be increased by more than expected. To address this, we filter the metrics by the scheduling group label of the user write, to filter out the compaction writes that run in the compaction scheduling group. Fixes scylladb/scylladb#23139 (cherry picked from commit `276a09ac6e`)	2025-08-12 20:53:16 +03:00
Michael Litvak	5161342910	test/pylib/rest_client: fix ScyllaMetrics filtering In the ScyllaMetrics `get` function, when requesting the value for a specific shard, it is expected to return the sum of all values of metrics for that shard that match the labels. However, it would return the value of the first matching line it finds instead of summing all matching lines. For example, if we have two lines for one shard like: some_metric{scheduling_group_name="compaction",shard="0"} 1 some_metric{scheduling_group_name="sl:default",shard="0"} 2 The result of this call would be 1 instead of 3: get('some_metric', shard="0") We fix this to sum all matching lines. The filtering of lines by labels is fixed to allow specifying only some of the labels. Previously, for the line to match the filter, either the filter needs to be empty, or all the labels in the metric line had to be specified in the filter parameter and match its value, which is unexpected, and breaks when more labels are added. We also simplify the function signature and the implementation - instead of having the shard as a separate parameter, it can be specified as a label, like any other label. (cherry picked from commit `5c28cffdb4`)	2025-08-12 20:53:16 +03:00
Szymon Malewski	3791a6a4c5	test/alternator: enable more relevant logs in CI. This patch sets, for alternator test suite, all 'alternator-*' loggers and 'paxos' logger to trace level. This should significantly ease debugging of failed tests, while it has no effect on test time and increases log size only by 7%. This affects running alternator tests only with `test.py`, not with `test/alternator/run`. Closes #24645 Closes scylladb/scylladb#25327 (cherry picked from commit `eb11485969`) Closes scylladb/scylladb#25382	2025-08-11 06:59:34 +03:00
Benny Halevy	da0f9608d8	scylla-sstable: print_query_results_json: continue loop if row is disengaged Otherwise it is accessed right when exiting the if block. Add a unit test reproducing the issue and validating the fix. Fixes #25325 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#25326 (cherry picked from commit `5e5e63af10`) Closes scylladb/scylladb#25378	2025-08-10 18:54:29 +03:00
Botond Dénes	d845de84aa	Merge '[Backport 2025.2] truncate: change check for write during truncate into a log warning' from Scylladb[bot] TRUNCATE TABLE performs a memtable flush and then discards the sstables of the table being truncated. It collects the highest replay position for both of these. When the highest replay position of the discarded sstables is higher than the highest replay position of the flushed memtable, that means that we have had writes during truncate which have been flushed to disk independently of the truncate process. We check for this and trigger an on_internal_error() which throws an exception, informing the user that writing data concurrently with TRUNCATE TABLE is not advised. The problem with this is that truncate is also called from DROP KEYSPACE and DROP TABLE. These are raft operations and exceptions thrown by them are caught by the (...) exception handler in the raft applier fiber, which then exits leaving the node without the ability to execute subsequent raft commands. This commit changes the on_internal_error() into a warning log entry. It also outputs the keyspace/table names, and the offending replay positions which caused the check to fail. This PR also adds a test which validates that TRUNCATE works correctly with concurrent writes. More specifically, it checks that: - all data written before TRUNCATE starts is deleted - none of the data after TRUNCATE completes is deleted Fixes: #25173 Fixes: #25013 Backport is needed in versions which check for truncate with concurrent writes using `on_internal_error()`: 2025.3 2025.2 2025.1 - (cherry picked from commit `268ec72dc9`) - (cherry picked from commit `33488ba943`) Parent PR: #25174 Closes scylladb/scylladb#25349 * github.com:scylladb/scylladb: truncate: add test for truncate with concurrent writes truncate: change check for write during truncate into a log warning	2025-08-08 11:44:41 +03:00
Botond Dénes	cccf726b54	Merge '[Backport 2025.2] test: introduce upgrade tests to test.py, add a SSTable dict compression upgrade test' from Michał Chojnowski This PR adds an upgrade test for SSTable compression with shared dictionaries, and adds some bits to pylib and test.py to support that. In the series, we: 1. Mount $XDG_CACHE_DIR into dbuild. 2. Add a pylib function which downloads and installs a released ScyllaDB package into a subdirectory of $XDG_CACHE_DIR/scylladb/test.py, and returns the path to bin/scylla. 3. Add new methods and params to the cluster manager, which let the test start nodes with historical Scylla executables, and switch executables during the test. 4. Add a test which uses the above to run an upgrade test between the released package and the current build. 5. Add --run-internet-dependent-tests to test.py which lets the user of test.py skip this test (and potentially other internet-dependent tests in the future). (The patch modifying wait_for_cql_and_get_hosts is a part of the new test — the new test needs it to test how particular nodes in a mixed-version cluster react to some CQL queries.) This is a follow-up to https://github.com/scylladb/scylladb/pull/23025, split into a separate PR because the potential addition of upgrade tests to test.py deserved a separate thread. Needs backport to 2025.2, because that's where the tested feature is introduced. 
Fixes https://github.com/scylladb/scylladb/issues/24110 - (cherry picked from commit `63218bb094`) - (cherry picked from commit `cc7432888e`) - (cherry picked from commit `34098fbd1f`) - (cherry picked from commit `2ef0db0a6b`) - (cherry picked from commit `1ff7e09edc`) - (cherry picked from commit `5da19ff6a6`) - (cherry picked from commit `d3cb873532`) - (cherry picked from commit `dd878505ca`) Parent PR: https://github.com/scylladb/scylladb/pull/23538 Closes scylladb/scylladb#25158 * github.com:scylladb/scylladb: test: add test_sstable_compression_dictionaries_upgrade.py test.py: add --run-internet-dependent-tests pylib/manager_client: add server_switch_executable test/pylib: in add_server, give a way to specify the executable and version-specific config pylib: pass scylla_env environment variables to the topology suite test/pylib: add get_scylla_2025_1_executable() pylib/scylla_cluster: give a way to pass executable-specific options to nodes dbuild: mount "$XDG_CACHE_HOME/scylladb"	2025-08-07 06:26:25 +03:00
Nikos Dragazis	5ae00a3dab	test: kmip: Fix segfault from premature destruction of port_promise `kmip_test_helper()` is a utility function to spawn a dedicated PyKMIP server for a particular Boost test case. The function runs the server as an external process and uses a thread to parse the port from the server's logs. The thread communicates the port to the main thread via a promise. The current implementation has a bug where the thread may set a value to the promise after its destruction, causing a segfault. This happens when the server does not start within 20 seconds, in which case the port future throws and the stack unwinding machinery destroys the port promise before the thread that writes to it. Fix the bug by declaring the promise before the cleanup action. The bug has been encountered in CI runs on slow machines, where the PyKMIP server takes too long to create its internal tables (due to slow fdatasync calls from SQLite). This patch does not improve CI stability - it only ensures that the error condition is properly reflected in the test output. This patch is not a backport. The same bug has been fixed in master as part of a larger rewrite of the `kmip_test_helper()` (see `722e2bce96`). Refs #24747, #24842. Fixes #24574. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com> Closes scylladb/scylladb#25029	2025-08-06 11:58:56 +03:00
Raphael S. Carvalho	be94db3ace	replica: Fix take_storage_snapshot() running concurrently to merge completion Some background: When merge happens, a background fiber wakes up to merge compaction groups of sibling tablets into main one. It cannot happen when rebuilding the storage group list, since token metadata update is not preemptable. So a storage group, post merge, has the main compaction group and two other groups to be merged into the main. When the merge happens, those two groups are empty and will be freed. Consider this scenario: 1) merge happens, from 2 to 1 tablet 2) produces a single storage group, containing main and two other compaction groups to be merged into main. 3) take_storage_snapshot(), triggered by migration post merge, gets a list of pointer to all compaction groups. 4) t__s__s() iterates first on main group, yields. 5) background fiber wakes up, moves the data into main and frees the two groups 6) t__s__s() advances to other groups that are now freed, since step 5. 7) segmentation fault In addition to memory corruption, there's also a potential for data to escape the iteration in take_storage_snapshot(), since data can be moved across compaction groups in background, all belonging to the same storage group. That could result in data loss. Readers should all operate on storage group level since it can provide a view on all the data owned by a tablet replica. The movement of sstable from group A to B is atomic, but iteration first on A, then later on B, might miss data that was moved from B to A, before the iteration reached B. By switching to storage group in the interface that retrieves groups by token range, we guarantee that all data of a given replica can be found regardless of which compaction group they sit on. Fixes #23162. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb/scylladb#24058 (cherry picked from commit `28056344ba`) Closes scylladb/scylladb#25338	2025-08-06 09:41:56 +03:00
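The missed-data hazard described in the commit above (iterate group A, yield, data moves from B to A, then iterate B) can be modeled in a few lines. This is a minimal sketch, not ScyllaDB code: the function names, the use of Python sets for compaction groups, and the `between_groups` callback standing in for a preemption point are all assumptions made for illustration.

```python
# Hypothetical model of the race; none of these names are ScyllaDB APIs.

def snapshot_per_group(groups, between_groups):
    """Collect sstables one compaction group at a time.

    `between_groups` models a yield point after the first group,
    where a background merge fiber may run.
    """
    seen = []
    for index, group in enumerate(groups):
        seen.extend(sorted(group))
        if index == 0:
            between_groups()
    return seen


def snapshot_storage_group(groups):
    """Collect sstables from the storage group as a whole, in one step."""
    return sorted(sstable for group in groups for sstable in group)


def run():
    main, sibling = {"sst-1"}, {"sst-2"}

    def merge():
        # Background merge: move the sibling's data into main.
        main.update(sibling)
        sibling.clear()

    racy = snapshot_per_group([main, sibling], merge)
    whole = snapshot_storage_group([main, sibling])
    return racy, whole


racy, whole = run()
```

In this model `racy` misses `sst-2`, because it was moved into the already-visited main group before the iteration reached the sibling, while `whole` sees both sstables regardless of which group holds them.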
Botond Dénes	22942c0a85	Merge '[Backport 2025.2] Raft-based recovery procedure: simplify rolling restart with recovery_leader' from Scylladb[bot] The following steps are performed in sequence as part of the Raft-based recovery procedure: - set `recovery_leader` to the host ID of the recovery leader in `scylla.yaml` on all live nodes, - send the `SIGHUP` signal to all Scylla processes to reload the config, - perform a rolling restart (with the recovery leader being restarted first). These steps are not intuitive and more complicated than they could be. In this PR, we simplify these steps. From now on, we will be able to simply set `recovery_leader` on each node just before restarting it. Apart from making necessary changes in the code, we also update all tests of the Raft-based recovery procedure and the user-facing documentation. Fixes scylladb/scylladb#25015 The Raft-based procedure was added in 2025.2. This PR makes the procedure simpler and less error-prone, so it should be backported to 2025.2 and 2025.3. - (cherry picked from commit `ec69028907`) - (cherry picked from commit `445a15ff45`) - (cherry picked from commit `23f59483b6`) - (cherry picked from commit `ba5b5c7d2f`) - (cherry picked from commit `9e45e1159b`) - (cherry picked from commit `f408d1fa4f`) Parent PR: #25032 Closes scylladb/scylladb#25334 * github.com:scylladb/scylladb: docs: document the option to set recovery_leader later test: delay setting recovery_leader in the recovery procedure tests gossip: add recovery_leader to gossip_digest_syn db: system_keyspace: peers_table_read_fixup: remove rows with null host_id db/config, gms/gossiper: change recovery_leader to UUID db/config, utils: allow using UUID as a config option	2025-08-06 09:41:17 +03:00
Dawid Mędrek	8e96968fb7	test: Enable RF-rack-valid keyspaces in all Python suites We're enabling the configuration option `rf_rack_valid_keyspaces` in all Python test suites. All relevant tests have been adjusted to work with it enabled. That encompasses the following suites: * alternator, * broadcast_tables, * cluster (already enabled in scylladb/scylladb@ee96f8dcfc), * cql, * cqlpy (already enabled in scylladb/scylladb@be0877ce69), * nodetool, * rest_api. Two remaining suites that use tests written in Python, redis and scylla_gdb, are not affected, at least not directly. The redis suite requires creating an instance of Scylla manually, and the tests don't do anything that could violate the restriction. The scylla_gdb suite focuses on testing the capabilities of scylla-gdb.py, but even then it reuses the `run` file from the cqlpy suite. Fixes scylladb/scylladb#25126 Closes scylladb/scylladb#24617 (cherry picked from commit `b41151ff1a`) Closes scylladb/scylladb#25230	2025-08-06 09:35:34 +03:00
Ferenc Szili	932223414b	truncate: add test for truncate with concurrent writes test_validate_truncate_with_concurrent_writes checks if truncate deletes all the data written before the truncate starts, and does not delete any data after truncate completes. (cherry picked from commit `33488ba943`)	2025-08-06 00:51:43 +00:00
Patryk Jędrzejczak	beb28a6155	test: delay setting recovery_leader in the recovery procedure tests In the previous commit, we made it possible to set `recovery_leader` on each node just before restarting it. Here, we change all the tests of the Raft-based recovery procedure to use and test this option. (cherry picked from commit `9e45e1159b`)	2025-08-05 10:59:06 +00:00
Patryk Jędrzejczak	b79902baf8	gossip: add recovery_leader to gossip_digest_syn In the new Raft-based recovery procedure, live nodes join the new group 0 one by one during a rolling restart. There is a time window when some of them are in the old group 0, while others are in the new group 0. This causes a group 0 mismatch in `gossiper::handle_syn_msg`. The current solution for this problem is to ignore group 0 mismatches if `recovery_leader` is set on the local node and to ask the administrator to perform the rolling restart in the following way: - set `recovery_leader` in `scylla.yaml` on all live nodes, - send the `SIGHUP` signal to all Scylla processes to reload the config, - proceed with the rolling restart. This commit makes `gossiper::handle_syn_msg` ignore group 0 mismatches when exactly one of the two gossiping nodes has `recovery_leader` set. We achieve this by adding `recovery_leader` to `gossip_digest_syn`. This change makes setting `recovery_leader` earlier on all nodes and reloading the config unnecessary. From now on, the administrator can simply restart each node with `recovery_leader` set. However, note that nodes that join group 0 must have `recovery_leader` set until all nodes join the new group 0. For example, assume that we are in the middle of the rolling restart and one of the nodes in the new group 0 crashes. It must be restarted with `recovery_leader` set, or else it would reject `gossip_digest_syn` messages from nodes in the old group 0. To avoid problems in such cases, we will continue to recommend setting `recovery_leader` in `scylla.yaml` instead of passing it as a command line argument. (cherry picked from commit `ba5b5c7d2f`)	2025-08-05 10:59:06 +00:00
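The rule stated in the commit above (ignore a group 0 mismatch when exactly one of the two gossiping nodes has `recovery_leader` set) amounts to an exclusive-or check. The sketch below is only an illustration under that reading: `accept_syn` and its parameters are hypothetical names, and the real `gossiper::handle_syn_msg` may apply further conditions not modeled here.

```python
from typing import Optional
from uuid import UUID, uuid4


def accept_syn(
    local_group0: UUID,
    remote_group0: UUID,
    local_recovery_leader: Optional[UUID],
    remote_recovery_leader: Optional[UUID],
) -> bool:
    """Return True if a SYN message should be accepted (hypothetical helper)."""
    if local_group0 == remote_group0:
        return True
    # Mismatch: tolerate it only if exactly one side has recovery_leader set.
    local_set = local_recovery_leader is not None
    remote_set = remote_recovery_leader is not None
    return local_set != remote_set


old_group0, new_group0, leader = uuid4(), uuid4(), uuid4()
```

Under these assumptions, a restarted node in the new group 0 (with `recovery_leader` set) and a not-yet-restarted node in the old group 0 (without it) accept each other's messages, while two nodes in different groups that both lack the option do not.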
Nikos Dragazis	f53588813b	test: Use in-memory SQLite for PyKMIP server The PyKMIP server uses an SQLite database to store artifacts such as encryption keys. By default, SQLite performs a full journal and data flush to disk on every CREATE TABLE operation. Each operation triggers three fdatasync(2) calls. If we multiply this by 16, that is the number of tables created by the server, we get a significant number of file syncs, which can last for several seconds on slow machines. This behavior has led to CI stability issues from KMIP unit tests where the server failed to complete its schema creation within the 20-second timeout (observed on spider9 and spider11). Fix this by configuring the server to use an in-memory SQLite. Fixes #24842. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com> Closes scylladb/scylladb#24995 (cherry picked from commit `2656fca504`) Closes scylladb/scylladb#25299	2025-08-02 17:13:13 +03:00
Piotr Dulikowski	c4b4db62e3	Merge '[Backport 2025.2] qos: don't populate effective service level cache until auth is migrated to raft' from Scylladb[bot] Right now, service levels are migrated in one group0 command and auth is migrated in the next one. This has a bad effect on the group0 state reload logic - modifying service levels in group0 causes the effective service levels cache to be recalculated, and to do so we need to fetch information about all roles. If the reload happens after SL upgrade and before auth upgrade, the query for roles will be directed to the legacy auth tables in system_auth - and the query, being a potentially remote query, has a timeout. If the query times out, it will throw an exception which will break the group0 apply fiber and the node will need to be restarted to bring it back to work. In order to solve this issue, make sure that the service level module does not start populating and using the service level cache until both service levels and auth are migrated to raft. This is achieved by adding the check both to the cache population logic and the effective service level getter - they now look at service level's accessor new method, `can_use_effective_service_level_cache` which takes a look at the auth version. Fixes: scylladb/scylladb#24963 Should be backported to all versions which support upgrade to topology over raft - the issue described here may put the cluster into a state which is difficult to get out of (group0 apply fiber can break on multiple nodes, which necessitates their restart). - (cherry picked from commit `2bb800c004`) - (cherry picked from commit `3a082d314c`) Parent PR: #25188 Closes scylladb/scylladb#25284 * github.com:scylladb/scylladb: test: sl: verify that legacy auth is not queried in sl to raft upgrade qos: don't populate effective service level cache until auth is migrated to raft	2025-08-01 17:15:25 +02:00
Piotr Dulikowski	d9075e6160	test: sl: verify that legacy auth is not queried in sl to raft upgrade Adjust `test_service_levels_upgrade`: right before upgrade to topology on raft, enable an error injection which triggers when the standard role manager is about to query the legacy auth tables in the system_auth keyspace. The preceding commit which fixes scylladb/scylladb#24963 makes sure that the legacy tables are not queried during upgrade to topology on raft, so the error injection does not trigger and does not cause a problem; without that commit, the test fails. (cherry picked from commit `3a082d314c`)	2025-07-31 15:13:24 +00:00
Piotr Dulikowski	618459125d	qos: don't populate effective service level cache until auth is migrated to raft Right now, service levels are migrated in one group0 command and auth is migrated in the next one. This has a bad effect on the group0 state reload logic - modifying service levels in group0 causes the effective service levels cache to be recalculated, and to do so we need to fetch information about all roles. If the reload happens after SL upgrade and before auth upgrade, the query for roles will be directed to the legacy auth tables in system_auth - and the query, being a potentially remote query, has a timeout. If the query times out, it will throw an exception which will break the group0 apply fiber and the node will need to be restarted to bring it back to work. In order to solve this issue, make sure that the service level module does not start populating and using the service level cache until both service levels and auth are migrated to raft. This is achieved by adding the check both to the cache population logic and the effective service level getter - they now look at service level's accessor new method, `can_use_effective_service_level_cache` which takes a look at the auth version. Fixes: scylladb/scylladb#24963 (cherry picked from commit `2bb800c004`)	2025-07-31 15:13:23 +00:00
Pavel Emelyanov	45101e072e	Merge '[Backport 2025.2] transport: remove throwing protocol_exception on connection start' from Dario Mirovic Note: The simplest approach to resolving `process_request_one` merge issues, since it has been refactored, was to include the three commits from before, and then the commits that are actually being backported. `protocol_exception` is thrown in several places. This has become a performance issue, especially when starting/restarting a server. To alleviate this issue, throwing the exception has to be replaced with returning it as a result or an exceptional future. This PR replaces throws in the `transport/server` module. This is achieved by using result_with_exception, and in some places, where suitable, just by creating and returning an exceptional future. There are four commits in this PR. The first commit introduces tests in `test/cqlpy`. The second commit refactors transport server `handle_error` to not rethrow exceptions. The third commit refactors reusable buffer writer callbacks. The fourth commit replaces throwing `protocol_exception` with returning it. Based on the comments on an issue linked in https://github.com/scylladb/scylladb/issues/24567, the main culprit from the side of protocol exceptions is the invalid protocol version one, so I tested that exception for performance. In order to see if there is a measurable difference, a modified version of the `test_protocol_version_mismatch` Python test is used, with 100'000 runs across 10 processes (not threads, to avoid Python GIL). One test run consisted of 1 warm-up run and 5 measured runs. The first test run was executed on the current code, with throwing protocol exceptions. The second test run was executed on the new code, with returning protocol exceptions. The performance report is in https://github.com/scylladb/scylladb/pull/24738#issuecomment-3051611069. It shows ~10% gains in real, user, and sys time for this test. 
Testing Build: `release` Test file: `test/cqlpy/test_protocol_exceptions.py` Test name: `test_protocol_version_mismatch` (modified for mass connection requests) Test arguments: ``` max_attempts=100'000 num_parallel=10 ``` Throwing `protocol_exception` results: ``` real=1:26.97 user=10:00.27 sys=2:34.55 cpu=867% real=1:26.95 user=9:57.10 sys=2:32.50 cpu=862% real=1:26.93 user=9:56.54 sys=2:35.59 cpu=865% real=1:26.96 user=9:54.95 sys=2:32.33 cpu=859% real=1:26.96 user=9:53.39 sys=2:33.58 cpu=859% real=1:26.95 user=9:56.85 sys=2:34.11 cpu=862% # average ``` Returning `protocol_exception` as `result_with_exception` or an exceptional future: ``` real=1:18.46 user=9:12.21 sys=2:19.08 cpu=881% real=1:18.44 user=9:04.03 sys=2:17.91 cpu=869% real=1:18.47 user=9:12.94 sys=2:19.68 cpu=882% real=1:18.49 user=9:13.60 sys=2:19.88 cpu=883% real=1:18.48 user=9:11.76 sys=2:17.32 cpu=878% real=1:18.47 user=9:10.91 sys=2:18.77 cpu=879% # average ``` This PR replaced `transport/server` throws of `protocol_exception` with returns. There are a few other places where protocol exceptions are thrown, and there are many places where `invalid_request_exception` is thrown. That is out of scope of this single PR, so the PR just refs, and does not resolve issue #24567. Refs: #24567 This PR improves performance in cases when protocol exceptions happen, for example during connection storms. It will require backporting. 
* (cherry picked from commit `7aaeed012e`) * (cherry picked from commit `30d424e0d3`) * (cherry picked from commit `9f4344a435`) * (cherry picked from commit `5390f92afc`) * (cherry picked from commit `4a6f71df68`) Parent PR: #24738 Closes scylladb/scylladb#25239 * github.com:scylladb/scylladb: test/cqlpy: add cpp exception metric test conditions transport/server: replace protocol_exception throws with returns utils/reusable_buffer: accept non-throwing writer callbacks via result_with_exception transport/server: avoid exception-throw overhead in handle_error test/cqlpy: add protocol_exception tests transport: remove redundant references in process_request_one transport: fix the indentation in process_request_one transport: add futures in CQL server exception handling	2025-07-31 12:18:50 +03:00
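The relative gains can be recomputed from the two `# average` rows quoted in the merge message above. The sketch below only does that arithmetic; the helper names `to_seconds` and `gain_percent` are made up for this example, and the timings are copied from the message.

```python
def to_seconds(timestamp: str) -> float:
    """Convert a 'minutes:seconds' string such as '1:26.95' to seconds."""
    minutes, seconds = timestamp.split(":")
    return int(minutes) * 60 + float(seconds)


def gain_percent(before: str, after: str) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    b, a = to_seconds(before), to_seconds(after)
    return (b - a) / b * 100


# (throwing, returning) averages as quoted in the merge message.
averages = {
    "real": ("1:26.95", "1:18.47"),
    "user": ("9:56.85", "9:10.91"),
    "sys": ("2:34.11", "2:18.77"),
}

gains = {name: round(gain_percent(*pair), 1) for name, pair in averages.items()}
print(gains)
```

With these inputs the reduction comes out at roughly 9.8% for real time, 7.7% for user time, and 10.0% for sys time.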
Dario Mirovic	9708d9c4d7	test/cqlpy: add cpp exception metric test conditions Tested code paths should not throw exceptions. `scylla_reactor_cpp_exceptions` metric is used. This is a global metric. To address potential test flakiness, each test runs multiple times: - `run_count = 100` - `cpp_exception_threshold = 10` If a change in the code introduced an exception, expectation is that the number of registered exceptions will be > `cpp_exception_threshold` in `run_count` runs. In which case the test fails. Fixes: #25272 (cherry picked from commit `4a6f71df68`)	2025-07-30 21:54:47 +02:00
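The thresholding idea in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the test's actual code: `read_metric` stands in for whatever reads `scylla_reactor_cpp_exceptions`, `run_once` stands in for one execution of the tested scenario, and both are hypothetical callables; only `run_count` and `cpp_exception_threshold` come from the commit message.

```python
def stays_below_threshold(read_metric, run_once,
                          run_count=100, cpp_exception_threshold=10):
    """Run the scenario `run_count` times and compare the metric delta.

    Because the metric is global, unrelated activity may bump it a little;
    a code path that throws on every run pushes the delta well past the
    threshold.
    """
    before = read_metric()
    for _ in range(run_count):
        run_once()
    return read_metric() - before <= cpp_exception_threshold


class FakeCounter:
    """Stand-in for the global exception metric (illustration only)."""

    def __init__(self):
        self.value = 0

    def read(self):
        return self.value

    def bump(self):
        self.value += 1


quiet = FakeCounter()
quiet.value = 3  # pre-existing background exceptions
quiet_ok = stays_below_threshold(quiet.read, lambda: None)

noisy = FakeCounter()
noisy_ok = stays_below_threshold(noisy.read, noisy.bump)
```

Here `quiet_ok` is true because the counter does not move during the runs, and `noisy_ok` is false because one exception per run yields a delta of 100, above the threshold of 10.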
Dario Mirovic	a8d38882ff	utils/reusable_buffer: accept non-throwing writer callbacks via result_with_exception Make make_bytes_ostream and make_fragmented_temporary_buffer accept writer callbacks that return utils::result_with_exception instead of forcing them to throw on error. This lets callers propagate failures by returning an error result rather than throwing an exception. Introduce buffer_writer_for, bytes_ostream_writer, and fragmented_buffer_writer concepts to simplify and document the template requirements on writer callbacks. This patch does not modify the actual callbacks passed, except for the syntax changes needed for successful compilation, without changing the logic. Refs: #24567 Fixes: #25272 (cherry picked from commit `9f4344a435`)	2025-07-30 21:54:41 +02:00
Dario Mirovic	57a32e50d3	test/cqlpy: add protocol_exception tests Add a helper to fetch scylla_transport_cql_errors_total{type="protocol_error"} counter from Scylla's metrics endpoint. These metrics are used to track protocol error count before and after each test. Add cql_with_protocol context manager utility for session creation with parameterized protocol_version value. This is used for testing connection establishment with different protocol versions, and proper disposal of successfully established sessions. The tests cover two failure scenarios: - Protocol version mismatch in test_protocol_version_mismatch which tests both supported and unsupported protocol version - Malformed frames via raw socket in _protocol_error_impl, used by several test functions, and also test_no_protocol_exceptions test to assert that the error counters never decrease during test execution, catching unintended metric resets Refs: #24567 Fixes: #25272 (cherry picked from commit `7aaeed012e`)	2025-07-30 21:54:27 +02:00
Pavel Emelyanov	95b906bea9	Merge '[Backport 2025.2] storage_service: cancel all write requests after stopping transports' from Scylladb[bot] When a node shuts down, in storage service, after storage_proxy RPCs are stopped, some write handlers within storage_proxy may still be waiting for background writes to complete. These handlers hold appropriate ERMs to block schema changes before the write finishes. After the RPCs are stopped, these writes cannot receive the replies anymore. If, at the same time, there are RPC commands executing `barrier_and_drain`, they may get stuck waiting for these ERM holders to finish, potentially blocking node shutdown until the writes time out. This change introduces cancellation of all outstanding write handlers from storage_service after the storage proxy RPCs were stopped. Fixes scylladb/scylladb#23665 Backport: since this fixes an issue that frequently causes issues in CI, backport to 2025.1, 2025.2, and 2025.3. - (cherry picked from commit `bc934827bc`) - (cherry picked from commit `e0dc73f52a`) Parent PR: #24714 Closes scylladb/scylladb#25169 * github.com:scylladb/scylladb: storage_service: Cancel all write requests on storage_proxy shutdown test: Add test for unfinished writes during shutdown and topology change	2025-07-28 09:25:15 +03:00
Pavel Emelyanov	8622a07bdd	Merge '[Backport 2025.2] streaming: Avoid deadlock by running view checks in a separate scheduling group' from Scylladb[bot] This issue happens with removenode, when RBNO is disabled, so range streamer is used. The deadlock happens in a scenario like this: 1. Start 3 nodes: {A, B, C}, RF=2 2. Node A is lost 3. removenode A 4. Both B and C gain ownership of ranges. 5. Streaming sessions are started with crossed directions: B->C, C->B Readers created by sender side exhaust streaming semaphore on B and C. Receiver side attempts to obtain a permit indirectly by calling check_needs_view_update_path(), which reads local tables. That read is blocked and times-out, causing streaming to fail. The streaming writer is already using a tracking-only permit. Even if we didn't deadlock, and the streaming semaphore was simply exhausted by other receiving sessions (via tracking-only permit), the query may still time-out due to starvation. To avoid that, run the query under a different scheduling group, which translates to the system semaphore instead of the maintenance semaphore, to break the dependency. The gossip group was chosen because it shouldn't be contended and this change should not interfere with it much. Fixes #24807 Fixes #24925 - (cherry picked from commit `ee2fa58bd6`) - (cherry picked from commit `dff2b01237`) Parent PR: #24929 Closes scylladb/scylladb#25055 * github.com:scylladb/scylladb: streaming: Avoid deadlock by running view checks in a separate scheduling group service: migration_manager: Run group0 barrier in gossip scheduling group	2025-07-28 09:24:53 +03:00
Tomasz Grabiec	3991e4de28	streaming: Avoid deadlock by running view checks in a separate scheduling group This issue happens with removenode, when RBNO is disabled, so range streamer is used. The deadlock happens in a scenario like this: 1. Start 3 nodes: {A, B, C}, RF=2 2. Node A is lost 3. removenode A 4. Both B and C gain ownership of ranges. 5. Streaming sessions are started with crossed directions: B->C, C->B Readers created by sender side exhaust streaming semaphore on B and C. Receiver side attempts to obtain a permit indirectly by calling check_needs_view_update_path(), which reads local tables. That read is blocked and times-out, causing streaming to fail. The streaming writer is already using a tracking-only permit. To avoid that, run the query under a different scheduling group, which translates to the system semaphore instead of the maintenance semaphore, to break the dependency. The gossip group was chosen because it shouldn't be contended and this change should not interfere with it much. Fixes: #24807 (cherry picked from commit `dff2b01237`)	2025-07-27 22:52:56 +02:00


8918 Commits