From f5eb99f149fa7fa510ee8c86017018c275d9e596 Mon Sep 17 00:00:00 2001
From: Avi Kivity
Date: Mon, 20 Apr 2026 15:45:24 +0300
Subject: [PATCH] test: bump multishard_query_test querier_cache TTL to 60s to avoid flake
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Three test cases in multishard_query_test.cc set the querier_cache entry
TTL to 2s and then assert, between pages of a stateful paged query, that
cached queriers are still present (population >= 1) and that
time_based_evictions stays 0.

The 2s TTL is not load-bearing for what these tests exercise; they are
checking the paging-cache handoff, not TTL semantics. But on busy CI
runners (SCYLLADB-1642 was observed on aarch64 release), scheduling
jitter between saving a reader and sampling the population can exceed
2s. When that happens, the TTL fires, both saved queriers are
time-evicted, population drops to 0, and the assertion
`require_greater_equal(saved_readers, 1u)` fails. The trailing
`require_equal(time_based_evictions, 0)` check never runs because the
earlier assertion has already aborted the iteration, which is why the
Jenkins failure surfaces only as a bare "C++ failure at
seastar_test.cc:93".

Reproduced deterministically in test_read_with_partition_row_limits by
injecting a `seastar::sleep(2500ms)` between the save and the sample:
the hook then reports

    population=0 inserts=2 drops=0 time_based_evictions=2 resource_based_evictions=0

and the assertion fires, matching the Jenkins symptoms exactly.
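
For illustration, the race described above can be modelled with a toy
cache driven by a manual clock. This is only a sketch: `toy_ttl_cache`,
`sample` and `save_then_sample` are hypothetical names that exist in
neither querier_cache nor the test suite, and the eviction logic is a
deliberate simplification of whatever the real cache does.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

using namespace std::chrono_literals;
using ms = std::chrono::milliseconds;

// Hypothetical stand-in for a TTL-evicting cache; NOT the real querier_cache.
struct toy_ttl_cache {
    ms ttl;
    ms now{0};                          // manual clock, moved only by advance()
    std::vector<ms> deadlines;          // one expiry deadline per saved entry
    std::size_t time_based_evictions = 0;

    explicit toy_ttl_cache(ms t) : ttl(t) {}

    // Save an entry that expires `ttl` after the current clock value.
    void insert() { deadlines.push_back(now + ttl); }

    // Advance the clock and evict every entry whose deadline has passed.
    void advance(ms delay) {
        assert(delay >= ms{0});
        now += delay;
        std::size_t kept = 0;
        for (ms d : deadlines) {
            if (d > now) {
                deadlines[kept++] = d;  // still live, compact in place
            } else {
                ++time_based_evictions; // TTL fired for this entry
            }
        }
        deadlines.resize(kept);
    }

    std::size_t population() const { return deadlines.size(); }
};

// What the test samples between pages (assumed shape, for illustration).
struct sample {
    std::size_t population;
    std::size_t time_based_evictions;
};

// Save two entries, stall for `jitter`, then sample the cache.
inline sample save_then_sample(ms ttl, ms jitter) {
    toy_ttl_cache cache(ttl);
    cache.insert();
    cache.insert();
    cache.advance(jitter);
    return {cache.population(), cache.time_based_evictions};
}
```

In this model, `save_then_sample(2s, 2500ms)` yields population 0 with
two time-based evictions, while `save_then_sample(60s, 2500ms)` leaves
both entries in place with no evictions.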

Bump the TTL to 60s in all three affected tests:

  - test_read_with_partition_row_limits (confirmed repro for
    SCYLLADB-1642)
  - test_read_all (same pattern, same invariants; suspect)
  - test_read_all_multi_range (same pattern, same invariants; suspect)

Leave test_abandoned_read (1s TTL, actually tests TTL-driven eviction)
and test_evict_a_shard_reader_on_each_page (tests manual eviction via
evict_one(); its TTL is not load-bearing but the fix is deferred for a
separate review) unchanged.

Fixes: SCYLLADB-1642

Co-Authored-By: Claude Opus 4.7 (1M context)

Closes scylladb/scylladb#29564
---
 test/boost/multishard_query_test.cc | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/test/boost/multishard_query_test.cc b/test/boost/multishard_query_test.cc
index eaabc9ec8e..5e8d7f2b6a 100644
--- a/test/boost/multishard_query_test.cc
+++ b/test/boost/multishard_query_test.cc
@@ -548,7 +548,7 @@ SEASTAR_THREAD_TEST_CASE(test_read_all) {
         using namespace std::chrono_literals;
 
         env.db().invoke_on_all([] (replica::database& db) {
-            db.set_querier_cache_entry_ttl(2s);
+            db.set_querier_cache_entry_ttl(60s);
         }).get();
 
         const auto ks = create_vnodes_keyspace(env);
@@ -605,7 +605,7 @@ SEASTAR_THREAD_TEST_CASE(test_read_all_multi_range) {
         using namespace std::chrono_literals;
 
         env.db().invoke_on_all([] (replica::database& db) {
-            db.set_querier_cache_entry_ttl(2s);
+            db.set_querier_cache_entry_ttl(60s);
         }).get();
 
         const auto ks = create_vnodes_keyspace(env);
@@ -667,7 +667,7 @@ SEASTAR_THREAD_TEST_CASE(test_read_with_partition_row_limits) {
         using namespace std::chrono_literals;
 
         env.db().invoke_on_all([] (replica::database& db) {
-            db.set_querier_cache_entry_ttl(2s);
+            db.set_querier_cache_entry_ttl(60s);
         }).get();
 
         const auto ks = create_vnodes_keyspace(env);