From e39267b55f98858f906c48c8bae13dc15a323be6 Mon Sep 17 00:00:00 2001 From: Petr Gusev Date: Tue, 28 Apr 2026 10:34:50 +0200 Subject: [PATCH 1/2] paxos_state: keep prepared message alive across statement execution In do_execute_cql_with_timeout(), when the prepared statement was not found in the cache, we called qp.prepare() and stored the returned result_message::prepared in a local variable scoped to the 'if' block. We then extracted ps_ptr (a checked_weak_ptr to the prepared statement) from the message, let the message go out of scope at the end of the 'if', and used ps_ptr after a co_await on st->execute(). Since 3ac4e258e8 ("transport/messages: hold pinned prepared entry in PREPARE result"), result_message::prepared owns a strong pinned reference to the prepared cache entry. While qp.prepare() runs it also holds its own pin on the entry, so on return the entry has at least the pin owned by the returned message. As long as that message is alive, the cache entry cannot be purged and the weak handle inside ps_ptr remains promotable. The lifetime gap manifested only in debug builds. qp.prepare() returns a ready future on the cache-miss path, so in release builds the co_await resumes synchronously: control flows from the assignment of ps_ptr straight into st->execute() with no opportunity for any other task (in particular, prepared cache invalidation triggered by a concurrent schema change) to run in between. Debug builds, however, force a reactor preemption point on every co_await even when the awaited future is ready. With prepared_msg already destroyed at the end of the 'if' block, the only remaining handle on the cache entry was the weak ps_ptr, and the preemption gave a concurrent cache purge - triggered, for example, by Raft schema changes received during a node restart - the chance to drop the entry. The subsequent execute() then failed when promoting the weak pointer with checked_ptr_is_null_exception. The exception propagated out of the Paxos prepare path as a generic std::exception with no type information in the log, surfacing on the coordinator as: WriteFailure: Failed to prepare ballot ... Replica errors: host_id ... -> seastar::rpc::remote_verb_error (std::exception) Hoist the result_message::prepared into the outer scope so the pinned cache entry stays alive across co_await st->execute(...), closing the window in which a concurrent cache purge could invalidate the weak handle. Fixes SCYLLADB-1173 --- service/paxos/paxos_state.cc | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/service/paxos/paxos_state.cc b/service/paxos/paxos_state.cc index a051230cf3..2da4136e56 100644 --- a/service/paxos/paxos_state.cc +++ b/service/paxos/paxos_state.cc @@ -438,9 +438,10 @@ static future do_execute_cql_with_timeout(sstring req, const auto cache_key = qp.compute_id(req, "", cql3::internal_dialect()); auto ps_ptr = qp.get_prepared(cache_key); + shared_ptr prepared_msg; if (!ps_ptr) { - const auto msg_ptr = co_await qp.prepare(req, qs, cql3::internal_dialect()); - ps_ptr = msg_ptr->get_prepared(); + prepared_msg = co_await qp.prepare(req, qs, cql3::internal_dialect()); + ps_ptr = prepared_msg->get_prepared(); if (!ps_ptr) { on_internal_error(paxos_state::logger, "prepared statement is null"); } @@ -449,8 +450,8 @@ static future do_execute_cql_with_timeout(sstring req, -1, service::node_local_only::yes); const auto st = ps_ptr->statement; - const auto msg_ptr = co_await st->execute(qp, qs, qo, std::nullopt); - co_return cql3::untyped_result_set(msg_ptr); + const auto result_ptr = co_await st->execute(qp, qs, qo, std::nullopt); + co_return cql3::untyped_result_set(result_ptr); } template From e6137ab11b16e1d9eacbdc5240b13680ae933f2a Mon Sep 17 00:00:00 2001 From: Petr Gusev Date: Tue, 28 Apr 2026 16:01:21 +0200 Subject: [PATCH 2/2] table_helper: retry insert prepare on concurrent cache invalidation table_helper::insert() retrieves the prepared statement via cache_table_info() and then dereferences _prepared_stmt to read bound_names. _prepared_stmt is a checked_weak_ptr into the prepared statements cache and can be invalidated at any time by a concurrent purge (for example, on a schema change). cache_table_info() (re-)prepares the statement and assigns _prepared_stmt before returning, and the strong pin held by the result_message::prepared returned from qp.prepare() keeps the cache entry alive only for the duration of try_prepare(). After try_prepare() returns, the pin is gone and _prepared_stmt is the only remaining handle on the entry. In release builds this is fine: the chain of ready-future co_awaits between try_prepare() finishing and _prepared_stmt->bound_names being read resumes synchronously, so no other task -- in particular, no cache purge -- can run in that window. In debug builds, however, Seastar inserts a reactor preemption point on every co_await even when the awaited future is ready. That preemption window is wide enough for a concurrent invalidation to drop the freshly installed cache entry, turning _prepared_stmt into a null weak handle and crashing the subsequent dereference with checked_ptr_is_null_exception. Wrap the cache_table_info() call in a loop that re-attempts the preparation until a synchronous post-resume check finds _prepared_stmt still valid. The check runs in the same task immediately after the co_await resumes, with no co_await between the check and the dereference, so a purge cannot slip in. _insert_stmt is a strong shared_ptr to the statement object and is not affected by cache invalidation, so it remains safe to use across the final co_await on execute(). The other caller of cache_table_info(), trace_keyspace_helper::apply_events_mutation(), accesses only the strong _insert_stmt via insert_stmt() and never dereferences the weak _prepared_stmt, so it is unaffected. Refs SCYLLADB-1173 --- table_helper.cc | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/table_helper.cc b/table_helper.cc index 8f5a55096e..b96e97c977 100644 --- a/table_helper.cc +++ b/table_helper.cc @@ -135,7 +135,23 @@ future<> table_helper::cache_table_info(cql3::query_processor& qp, service::migr } future<> table_helper::insert(cql3::query_processor& qp, service::migration_manager& mm, service::query_state& qs, noncopyable_function opt_maker) { - co_await cache_table_info(qp, mm, qs); + // _prepared_stmt is a checked_weak_ptr into the prepared statements + // cache and can be invalidated by a concurrent purge (e.g. on a schema + // change). cache_table_info() (re-)prepares and assigns _prepared_stmt, + // but the pin protecting the entry is dropped when try_prepare() + // returns. In release the chain of ready-future co_awaits back to here + // resumes synchronously, but debug builds preempt on every co_await + // even for ready futures, opening a window for a purge to drop the + // entry and leave _prepared_stmt null. Loop until a synchronous + // post-resume check finds _prepared_stmt valid; nothing can run between + // that check and the dereference below. _insert_stmt is a strong + // shared_ptr and is not affected by cache invalidation. + while (true) { + co_await cache_table_info(qp, mm, qs); + if (_prepared_stmt) { + break; + } + } auto opts = opt_maker(); opts.prepare(_prepared_stmt->bound_names); co_await _insert_stmt->execute(qp, qs, opts, std::nullopt);