mirror of
https://github.com/scylladb/scylladb.git
synced 2026-04-19 16:15:07 +00:00
`query_partition_key_range()` does the final result merging and trimming (if necessary) to make sure we don't send more rows to the client than requested. This merging and trimming is done by a continuation attached to `query_partition_key_range_concurrent()`, which does the actual querying. The continuation captures by value the `row_limit` and `partition_limit` fields of the query's `query::read_command` object.

This has an unexpected consequence. The lambda object is constructed after the call to `query_partition_key_range_concurrent()` returns. If this call doesn't defer, any modifications made to the read command object by `query_partition_key_range_concurrent()` will be visible to the lambda. This is undesirable because `query_partition_key_range_concurrent()` updates the read command object directly as the vnodes are traversed. As a result, the lambda does the final trimming according to a decremented `row_limit`, and the paging logic declares the query exhausted prematurely because the page is not full.

To avoid all this, make a copy of the relevant limit fields before `query_partition_key_range_concurrent()` is called and pass these copies to the continuation, thus ensuring that the final trimming is done according to the original page limits.

Spotted while investigating a dtest failure on my 1865/range-scans/v2 branch. On that branch the way range scans are executed on replicas is completely refactored. These changes apparently reduce the number of continuations in the read path to the point where an entire page can be filled without deferring, which causes the problem to surface.

Fixes #3605.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <f11e80a6bf8089d49ba3c112b25a69edf1a92231.1531743940.git.bdenes@scylladb.com>
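
The capture-ordering hazard described in this message can be illustrated with a minimal, self-contained sketch. It does not use Seastar and is not the actual patch: `read_command`, `ready_result`, `query_concurrent()`, `limit_seen_buggy()` and `limit_seen_fixed()` are hypothetical stand-ins, and `ready_result::then()` simply runs its continuation immediately to mimic a call that does not defer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for query::read_command; only the relevant field.
struct read_command {
    uint32_t row_limit;
};

// Hypothetical stand-in for an already-resolved future: then() invokes the
// continuation right away, as happens when the query never defers.
struct ready_result {
    uint32_t rows_read;

    template <typename Func>
    auto then(Func&& f) { return f(rows_read); }
};

// Stand-in for query_partition_key_range_concurrent(): reads rows and
// decrements the limit on the shared command object in place.
ready_result query_concurrent(std::shared_ptr<read_command> cmd, uint32_t rows_available) {
    const uint32_t n = std::min(rows_available, cmd->row_limit);
    cmd->row_limit -= n;
    return ready_result{n};
}

// Problematic pattern: the capture initializer is evaluated when the lambda
// is constructed, which is after query_concurrent() has already returned
// (C++17 sequences the call before the argument expressions), so the
// lambda sees the decremented limit.
uint32_t limit_seen_buggy(std::shared_ptr<read_command> cmd, uint32_t rows_available) {
    return query_concurrent(cmd, rows_available)
        .then([row_limit = cmd->row_limit](uint32_t) { return row_limit; });
}

// Pattern described in the message: copy the limit before the call and
// hand that copy to the continuation.
uint32_t limit_seen_fixed(std::shared_ptr<read_command> cmd, uint32_t rows_available) {
    const uint32_t row_limit = cmd->row_limit;
    return query_concurrent(cmd, rows_available)
        .then([row_limit](uint32_t) { return row_limit; });
}
```

With a command whose `row_limit` starts at 100 and 40 rows available, `limit_seen_buggy()` observes 60 while `limit_seen_fixed()` observes the original 100, which is the value the final trimming needs.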