The idea behind caching is that when we have two index readers where
one is catching up with the other, each page will be read only
once. Currently that's not always the case. There is a case when
advance_to() may need to read two pages. That's when the target
position is not found in the first page as determined by the summary
index. The second reader which catches up would have to read the first
page as well, but it would not be in cache any more. To avoid this
extra I/O let's keep a reference to the two last pages touched by the
index.
To produce a streamed_mutation for the next partition, we need to read
its key and the tombstone. Currently we always do that by consuming
the partition header from the data file. In some cases that may cause
unnecessary IO.
It's better to obtain partition information from the index if we
already have it. We can save on IO if the user will skip past the
front of partition immediately after.
It is also better to pay the cost of reading the index if we know that
we will need to use the index anyway soon. This patch predicts that by
checking if there are any clustering restrictions. If there are any,
we will almost surely need_skip() and use the index anyway.
This change also lays the ground for unification of multi and single
partiton queries without loss of performance.
sstable_data_source holds a shared state between mutation_reader and
streamed_mutation for sstables. The information whether index is in
current partition will have to be accessed by both in the following
patches.
Before the change, the following scenario was happening:
1) we try to skip based on clustering restrictions
2) we find the page and fast forward to it, recording walker's
lower bound counter
3) we read the first fragment, it's not a tombstone, so we reset the walker,
and its lower bound counter too
4) the fragment is not in range (the range starts in the middle of the page)
5) needs_skip() is true, we redo the index lookup, which wastes some CPU
This change fixes the problem by avoiding resetting the walker. We can
do that because leading tombstones are checked with a non-mutable
contains_tombstone()
Allows detecting changes of lower_bound().
Result of advance_to() is not enough. When we get false from
advance_to() twice in a row, lower bound may or may not have changed.
Simplifies the code a bit, but also will make it easier to calculate
the next position we should skip to after forwarding, taking into
consideration both the position forwarded to as well as clustering
ranges of the query. That will be just calling
_ck_ranges_walker->lower_bound() after it is trimmed.
In general mp_row_consumer has better information about the next
position to read. It could be after the position we forward to if
there are clustering restrictions. This will be exploited in the
following patches.
Problem is that column family field of task wasn't being set for resharding,
so column family wasn't being properly removed from compaction manager.
In addition to fixing this issue, we'll also interrupt ongoing compactions
when dropping a column family, exactly like we do with shutdown.
Fixes#2291.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170418125807.7712-1-raphaelsc@scylladb.com>
streaming generates lots of small sstables with large token range,
which triggers O(N^2) in space in interval map.
level 0 sstables will now be stored in a structure that has O(N)
in space complexity and which will be included for every read.
Fixes#2287.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20170417185509.6633-1-raphaelsc@scylladb.com>
"This series fixes some errors found by clang, with the aim of enabling
clang/zapcc as a supported compiler. A few more fixes are needed to produce
a binary."
* tag 'clang/1/v1' of https://github.com/avikivity/scylla:
logalloc: avoid auto in function argument declaration
thrift: avoid auto in function argument declaration
streamed_mutation: fix non-POD argument to C-style variadic function
mutation_partition_serializer: avoid auto in function argument declaration
date: use correct casts for years
streaming: avoid auto in function argument declaration
repair: avoid auto in function argument declaration
gms: expose gms::inet_address streaming operator
murmur3_partitioner: fix build on clang
i_partitioner: remove unused function
byte_ordered_partitioner: fix bad operator precedence
result_set: pass comparator by reference to std::sort()
to_string: move standard container overloads of to_string to std:: namespace
cql_type: fix bad enum syntax on clang
build: disable more warnings for clang
build: fix detection of unsupported warnings on clang
'auto' in a non-lambda function argument is not legal C++, and is hard
to read besides. Replace with the right type.
Since the right type is private, add some friendship.
Clang warns that passing a non-POD to a C-style variadic function will
result in an abort(). That happens to be exactly what we want, but to
silence the warning, use a template instead. Since templates aren't
allowed in local classes, move the containing class to namespace scope.
The standard says, and clang enforces, that declaring a function via
a friend declaration is not sufficient for ADL to kick in. Add a namespace
level declaration so ADL works.