mirror of
https://github.com/scylladb/scylladb.git
synced 2026-06-04 14:03:06 +00:00
When the partition_index_cache is evicted, we yield for preemption between
pages, but not within a page.
Commit 3b2890e1db ("sstables: Switch index_list to chunked_vector
to avoid large allocations") recognized that index pages can be large enough
to overflow a 128k alignment block (this was before the index cache and
index entries were not stored in LSA then). However, it did not go as far as
to gently free individual entries; either the problem was not recognized
or wasn't as bad.
As the referenced issue shows, a fairly large stall can happen when freeing
the page. The workload had a large number of tombstones, so index selectivity
was poor.
Fix by evicting individual rows gently.
The fix ignores the case where rows are still references: it is unlikely
that all index pages will be referenced, and in any case skipping over
a referenced page takes an insignificant amount of time, compared to freeing
a page.
Fixes #17605
Closes scylladb/scylladb#17606