From 32058669784062b002bc75d26dc38f2b995b588f Mon Sep 17 00:00:00 2001
From: Tomasz Grabiec <tgrabiec@scylladb.com>
Date: Sun, 27 Oct 2024 23:28:12 +0100
Subject: [PATCH] utils: cached_file: Mark permit as awaiting on page miss

Otherwise, the read will be considered as on-cpu during promoted index
search, which will severely underutilize the disk because by default
on-cpu concurrency is 1.

I verified this patch on the worst case scenario, where the workload
reads missing rows from a large partition. So partition index is cached
(no IO) and there is no data file IO. But there is IO during promoted
index search (via cached_file).

Before the patch this workload was doing 4k req/s, after the patch it
does 30k req/s.

The problem is much less pronounced if there is data file or index file
IO involved because that IO will signal read concurrency semaphore to
invite more concurrency.

(cherry picked from commit 0f2101b0550a3476f0daf5c4229da8b07beae5f9)
---
 utils/cached_file.hh | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/utils/cached_file.hh b/utils/cached_file.hh
index fd64326e20..e570d1aae7 100644
--- a/utils/cached_file.hh
+++ b/utils/cached_file.hh
@@ -168,12 +168,14 @@ private:
             : read_ahead * page_size;

         std::optional<reader_permit::resource_units> units;
+        std::optional<reader_permit::awaits_guard> await_guard;
         if (permit) {
             units = permit->consume_memory(size);
+            await_guard.emplace(*permit);
         }

         return _file.dma_read_exactly<char>(idx * page_size, size)
-            .then([this, units = std::move(units), idx] (temporary_buffer<char>&& buf) mutable {
+            .then([this, ag = std::move(await_guard), units = std::move(units), idx] (temporary_buffer<char>&& buf) mutable {
                 cached_page::ptr_type first_page;
                 while (buf.size()) {
                     auto this_size = std::min(page_size, buf.size());