"
Terms
-----
querier: A class encapsulating all the logic and state needed to fill a
page. This includes the reader, the compact_mutation object and all
associated state.
Preamble
--------
Currently, paged queries throw away all readers, compactors and
associated state that contributed to filling a page, and recreate them
from scratch on the next page. Each page thus discards a considerable
amount of work, only to redo it on the next one. This has been one of
the major contributors to latency: from the point of view of a replica,
each page is as much work as a fresh query.
Solution
--------
The solution presented in this patch-series is to save queriers after
filling a page and reuse them on the next pages, thus doing the
considerable amount of work involved in creating them only once.
On each page the coordinator generates a UUID that identifies the
page. This UUID is the key under which the contributing queriers are
saved in the cache. On the next page, the UUID from the previous page is
used to look up the saved queriers, and the UUID of the current page is
used to save them afterwards (if the query isn't finished).
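The recall/save cycle described above can be sketched roughly as
follows. This is a minimal illustration, not the code from the series:
querier_cache_sketch, fill_page() and the integer page_uuid are
hypothetical stand-ins, and the row counter only simulates the position
a real querier would hold in its reader and compactor.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for a real UUID type.
using page_uuid = std::uint64_t;

// Hypothetical stand-in: a real querier wraps a reader and a compactor.
struct querier {
    std::string partition_key;       // partition this querier reads
    std::uint64_t rows_emitted = 0;  // simulated read position
};

class querier_cache_sketch {
    // Several queriers may be saved under one page UUID (e.g. IN queries).
    std::multimap<page_uuid, querier> _entries;
public:
    // Save a querier under the UUID of the page that was just filled.
    void save(page_uuid key, querier q) {
        _entries.emplace(key, std::move(q));
    }

    // Look up a querier saved by the previous page. A hit removes the
    // entry from the cache and hands ownership to the caller.
    std::optional<querier> lookup(page_uuid key, const std::string& partition_key) {
        auto [first, last] = _entries.equal_range(key);
        for (auto it = first; it != last; ++it) {
            if (it->second.partition_key == partition_key) {
                querier q = std::move(it->second);
                _entries.erase(it);
                return q;
            }
        }
        return std::nullopt;
    }

    std::size_t size() const { return _entries.size(); }
};

// Fill one page: reuse the querier saved under recall_uuid if there is
// one, otherwise create a fresh one. Save it under save_uuid unless the
// query is finished. Returns true if a saved querier was reused.
inline bool fill_page(querier_cache_sketch& cache, page_uuid recall_uuid,
        page_uuid save_uuid, const std::string& partition_key,
        std::uint64_t page_size, std::uint64_t total_rows) {
    auto saved = cache.lookup(recall_uuid, partition_key);
    const bool reused = saved.has_value();
    querier q = reused ? std::move(*saved) : querier{partition_key, 0};
    q.rows_emitted = std::min(q.rows_emitted + page_size, total_rows);
    if (q.rows_emitted < total_rows) {
        cache.save(save_uuid, std::move(q));
    }
    return reused;
}
```

In this sketch a cache miss is not an error: the page simply falls back
to creating a querier from scratch, which matches the behaviour before
this series.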
These UUIDs (reader_recall_uuid and reader_save_uuid) are attached to
the page state. Also attached to the page state is the list of replicas
hit on the last page. On the next page this list is consulted so that
the same replicas are hit again, thus reusing the queriers saved on them.
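As an illustration of how the page state could carry this information,
here is a small sketch. Only the names reader_recall_uuid,
reader_save_uuid and the notion of last replicas come from the text
above; page_state_sketch, choose_replicas() and next_page_state() are
hypothetical, and the selection policy (preferred replicas first, then
any other live replica) is an assumption made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

using page_uuid = std::uint64_t;  // hypothetical stand-in for a UUID
using replica = std::string;      // hypothetical stand-in for an endpoint

struct page_state_sketch {
    page_uuid reader_recall_uuid = 0;    // key the previous page saved under
    page_uuid reader_save_uuid = 0;      // key this page will save under
    std::vector<replica> last_replicas;  // replicas hit on the last page
};

// Pick up to `count` replicas, preferring those hit on the last page as
// long as they are still live, then filling up from the remaining ones.
inline std::vector<replica> choose_replicas(const std::vector<replica>& live,
        const std::vector<replica>& preferred, std::size_t count) {
    auto contains = [](const std::vector<replica>& v, const replica& r) {
        return std::find(v.begin(), v.end(), r) != v.end();
    };
    std::vector<replica> chosen;
    for (const auto& r : preferred) {
        if (chosen.size() < count && contains(live, r) && !contains(chosen, r)) {
            chosen.push_back(r);
        }
    }
    for (const auto& r : live) {
        if (chosen.size() < count && !contains(chosen, r)) {
            chosen.push_back(r);
        }
    }
    return chosen;
}

// Build the state for the next page: the UUID this page saved under
// becomes the recall UUID, and a fresh UUID becomes the save UUID.
inline page_state_sketch next_page_state(const page_state_sketch& prev,
        page_uuid fresh_uuid, std::vector<replica> used_replicas) {
    return page_state_sketch{prev.reader_save_uuid, fresh_uuid,
                             std::move(used_replicas)};
}
```

If a preferred replica is no longer live, the sketch falls back to
another replica; the page served by that replica would then miss the
cache and build its querier from scratch.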
Cached queriers will be evicted after a certain period of time to avoid
unnecessary resource consumption by abandoned reads.
Cached queriers may also be evicted when the shard faces
resource pressure, to free up resources.
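The two eviction triggers can be sketched together as below. This is
only an illustration under stated assumptions: evicting_cache_sketch is
a hypothetical name, the TTL and memory budget are made-up parameters,
and dropping the oldest entry first is one possible policy, not
necessarily the one used by the series. The current time is passed in
explicitly to keep the example deterministic.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <list>

using clock_type = std::chrono::steady_clock;

struct cached_entry {
    std::uint64_t key;               // stand-in for the page UUID
    std::size_t memory_bytes;        // memory held by the cached querier
    clock_type::time_point expires;  // when the entry becomes stale
};

class evicting_cache_sketch {
    // Entries are kept in insertion order. With a single TTL and
    // non-decreasing insertion times, the front is always the first
    // entry to expire.
    std::list<cached_entry> _entries;
    clock_type::duration _ttl;
    std::size_t _max_memory;
    std::size_t _memory = 0;

    void drop_oldest() {
        _memory -= _entries.front().memory_bytes;
        _entries.pop_front();
    }
public:
    evicting_cache_sketch(clock_type::duration ttl, std::size_t max_memory)
        : _ttl(ttl), _max_memory(max_memory) {}

    // Resource-based eviction: after inserting, drop the oldest entries
    // until the cache fits its memory budget again. An entry larger
    // than the whole budget ends up being dropped as well.
    void insert(std::uint64_t key, std::size_t memory_bytes,
            clock_type::time_point now) {
        _entries.push_back({key, memory_bytes, now + _ttl});
        _memory += memory_bytes;
        while (_memory > _max_memory && !_entries.empty()) {
            drop_oldest();
        }
    }

    // Time-based eviction: drop every entry whose TTL has elapsed and
    // return how many were dropped.
    std::size_t evict_expired(clock_type::time_point now) {
        std::size_t evicted = 0;
        while (!_entries.empty() && _entries.front().expires <= now) {
            drop_oldest();
            ++evicted;
        }
        return evicted;
    }

    std::size_t size() const { return _entries.size(); }
    std::size_t memory() const { return _memory; }
};
```

In a real server evict_expired() would presumably be driven by a timer
rather than called by hand; that wiring is left out here.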
Splitting up the work
---------------------
This series only fixes the singular-mutation query path, that is,
queries that fetch either a single partition or several single
partitions (IN queries). The fix for the scanning query path will be
done in a follow-up series; however, much of the infrastructure needed
for general querier reuse is already introduced by this series.
Ref #1865
Tests: unit-tests(debug, release), dtests(paging_test, paging_additional_test)
Benchmarking summary (read-from-disk)
-------------------------------------
1) Latency
BEFORE
latency mean : 58.0
latency median : 57.4
latency 95th percentile : 68.8
latency 99th percentile : 79.9
latency 99.9th percentile : 93.6
latency max : 93.6
AFTER
latency mean : 41.3
latency median : 40.5
latency 95th percentile : 50.8
latency 99th percentile : 68.9
latency 99.9th percentile : 89.2
latency max : 89.2
2) Throughput (single partition query)
sum(scylla_cql_reads):
BEFORE: 173'567
AFTER: 427'774
+146% (2.46x)
3) Throughput (IN query, 2 partitions)
sum(scylla_cql_reads):
BEFORE: 85'637
AFTER: 127'431
+49% (1.49x)
"
* '1865/singular-mutations/v8.2' of https://github.com/denesb/scylla: (23 commits)
Add unit test for resource based cache eviction
Add unit tests for querier_cache
Add counters to monitor querier-cache efficiency
Memory based cache eviction
Add buffer_size() to flat_mutation_reader
Resource-based cache eviction
Time-based cache eviction
Save and restore queriers in mutation_query() and data_query()
Add the querier_cache_context helper
Add querier_cache
Add querier
Add are_limits_reached() compact_mutation_state
Add start_new_page() to compact_mutation_state
Save last key of the page and method to query it
Make compact_mutation reusable
Add the CompactedFragmentsConsumer
Use the last_replicas stored in the page_state
query_singular(): return the used replicas
Consider preferred replicas when choosing endpoints for query_singular()
Add preferred and last replicas to the signature of query()
...