Some code paths were obtaining db_clock timestamp to only convert it
to gc_clock later. Avoid this. In the future we could make gc_clock
cheaper cause it has low precision.
Message-Id: <1482401190-2035-1-git-send-email-tgrabiec@scylladb.com>
This patch fixes a regression introduced in 0518895, where we counted
one extra row per partition when it contained live, non static rows.
We also simplify the visitor logic further, since now we don't need to
count rows one by one. Also remove a bunch of unused fields.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1482234083-2447-1-git-send-email-duarte@scylladb.com>
Since storage_proxy::query() now respects the read_command limits, we
can remove the trimming logic from query_pagers.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Currently, the paging implementation assumes that the server retunrs
either as many rows as it was asked for all reached the end. Soon,
that's not going to be true so instead of making any assumptions about
the number of the rows returned use the new "short read" flag to
determine whether there is going to be more data.
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Query pager needs to handle results that contain partitions with
possibly multiple clustering rows quite differently than results with
just one row per partition (for example a page may end in a middle of
partition). However, the logic dealing with partitions with clustering
rows doesn't work correctly for SELECT DISTINCT queries, which are
much more similar to the ones for schemas without clustering key.
The solution is to set _has_clustering_keys to false in case of SELECT
DISTINCT queries regardless of the schema which will make pager
correctly expect each partition to return at most one rows.
Fixes#1822.
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1478612486-13421-1-git-send-email-pdziepak@scylladb.com>
Currently, the code responsible for calculating ranges for the next
request could produce a wrap-around partition range. For example, if the
original range was (unimportant, A] and the last partition key A then
the output range would be (A, A].
This patch adds checks to make sure that in such cases the range is
removed.
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1475497244-2790-1-git-send-email-pdziepak@scylladb.com>
Paging code assumes that clustering row range [a, a] contains only one
row which may not be true. Another problem is that it tries to use
range<> interface for dealing with clustering key ranges which doesn't
work because of the lack of correct comparator.
Refs #1446.
Fixes#1684.
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1475236805-16223-1-git-send-email-pdziepak@scylladb.com>
Having a trace_state_ptr in the storage_proxy level is needed to trace code bits in this level.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Refs #752
Paged aggregate queries will re-use the partition_slice object,
thus when setting a specific ck range for "last pk", we will hit
an exception case.
Allow removing entries (actually only the one), and overwriting
(using schema equality for keys), so we maintain the interface
while allowing the pager code to re-set the ck range for previous
page pass.
[tgrabiec: commit log cleanup, fixed issue ref]
Message-Id: <1452616259-23751-1-git-send-email-calle@scylladb.com>
Fixes#752
We set row limit for query to be min of page size/remaining in limit,
but if we have a multinode query we might end up with more rows than asked
for, so must do this again in post-processing.
Message-Id: <1452606935-12899-2-git-send-email-calle@scylladb.com>
Refs #640
* Remove use of cluster key range for tables without CK
Checking CK existance once and use the info allows us to remove some
stupid complexity in checking for "last key" match
* With fix for #589 we can also remove some superfluous code to
compensate for that issue, and make "partition end" simper
* Remove extra row in CK case. Not needed anymore
End result is that pager now more or less only relies on adapted query
ranges.
If we get a partition with no row data, but statics, we should treat this as
a row (include in count), but also make sure we skip to next partition
if our page ends here.
The "end partition" with zero rows but static data can also happen if we
happen to resume paging by giving a column range exluding all data. In this
case we should _not_ include it, since we have already provided the
data in question in previous page.
Fixes#556
1.) Should not reset to input query state if run repeatedly
2.) And if run repeatedly without input state, likewise keep
the internal one active
Fixes#560
* Static query method to determine if paging might be required
(very conservative - almost all querys will be paged me thinks).
* Static factory method for pager
* Actual pager implementation
Pager object uses three variables to keep track of paging state:
1.) Last partition key - partition key of last partion processed
-> next partition to start process
2.) Last clustering key, i.e. row offset within last key partition,
i.e. how far we got last time
3.) Max remaining - max rows to process further, i.e. initial limit -
processed so far
Partition ranges are modified/removed so that we begin with "Last key",
if present. (Or end with, in the case of reversed processing)
A counting visitor then keeps count of rows to include in processing.
Basic interface for paging control objects.
We probably do not need virtual behaviour for paging, but on the other
hand it does not really cost much, and it keeps a nice symmetry with
origin.
Note: serial format blob is different compared to origin, due to scyllas
different internal architecture. I.e. we query actual rows.
But drivers etc ignore the content of the blob, it is opaque.