This series adds the ability for partition cache to keep information
whether partition size makes it uncacheable. During, reads these
entries save us IO operations since we already know that the partiiton
is too big to be put in the cache.
First part of the patchset makes all mutation_readers allow the
streamed_mutations they produce to outlive them, which is a guarantee
used later by the code handling reading large partitions.
(cherry picked from commit d2ed75c9ff)
Originally, streamed_mutations guaranteed that emitted tombstones are
disjoint. In order to achieve that two separate objects were produced
for each range tombstone: range_tombstone_begin and range_tombstone_end.
Unfortunately, this forced sstable writer to accumulate all clustering
rows between range_tombstone_begin and range_tombstone_end.
However, since there is no need to write disjoint tombstones to sstables
(see #1153 "Write range tombstones to sstables like Cassandra does") it
is also not necessary for streamed_mutations to produce disjoint range
tombstones.
This patch changes that by making streamed_mutation produce
range_tombstone objects directly.
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
With so many consumer concepts out there, it is confusing to name
parameters using genering "Consumer" name, let's name them after
(already defined) concepts: CompactedMutationsConsumer, FlattenedConsumer.
This is a version of consume_flattened() intended to be run inside a
thread. All consumer code is going to be invoked in the same thread
context.
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Since decorated keys are already computed it is better to pass more
information than less. Consumers interested just in partition key can
just drop token and the ones requiring full decorated key don't need to
recompute it.
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
While limiting the number of concurrently executing sstable readers reduces
our memory load, the queued readers, although consuming a small amount of
memory, can still grow without bounds.
To limit the damage, add two limits on the queue:
- a timeout, which is equal to the read timeout
- a queue length limit, which is equal to 2% of the shard memory divided
by an estimate of the queued request size (1kb)
Together, these limits bound the amount of memory needed by queued disk
requests in case the disk can't keep up.
Message-Id: <1467206055-30769-1-git-send-email-avi@scylladb.com>
A restricting_reader wraps a mutation_reader, and restricts it concurrency
using a provided semaphore; this allows controlling read concurrency, which
is important since reads can consume a lot of resources ((number of
participating sstables) * 128k after we have streaming mutations, and a lot
more before).
Mutation reader produces a stream of streamed_mutations. Each
streamed_mutation itself is a stream so basically we are dealing here
with a stream of streams.
consume_flattened() flattens such stream of streams making all its
elements consumable by a single consumer. It also allows reversing
the mutations before consumption using reverse_streamed_mutation().
Using a partition_key_view can save an allocation in some cases. We will
make use of it when we linearize a partition_key; during the process we
are given a simple byte pointer, and constructing a partition_key from that
requires an allocation.
SSTables already have a priority argument wired to their read path. However,
most of our reads do not call that interface directly, but employ the services
of a mutation reader instead.
Some of those readers will be used to read through a mutation_source, and those
have to patched as well.
Right now, whenever we need to pass a class, we pass Seastar's default priority
class.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Its definition as a lambda function is inconvenient, because it does not allow
us to use default values for parameters.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
The intent is to make data returned by queries always conform to a
single schema version, which is requested by the client. For CQL
queries, for example, we want to use the same schema which was used to
compile the query. The other node expects to receive data conforming
to the requested schema.
Interface on shard level accepts schema_ptr, across nodes we use
table_schema_version UUID. To transfer schema_ptr across shards, we
use global_schema_ptr.
Because schema is identified with UUID across nodes, requestors must
be prepared for being queried for the definition of the schema. They
must hold a live schema_ptr around the request. This guarantees that
schema_registry will always know about the requested version. This is
not an issue because for queries the requestor needs to hold on to the
schema anyway to be able to interpret the results. But care must be
taken to always use the same schema version for making the request and
parsing the results.
Schema requesting across nodes is currently stubbed (throws runtime
exception).
Using a lambda for implementing a mutation_reader is nifty, but does not
allow us to add methods.
Switch to a class-based implementation in anticipation of adding a close()
method.
Many mutation_reader implementations capture 'this', which, if copied,
becomes invalid. Protect against this error my making mutation_reader
a non-copyable object.
Fix inadvertant copied around the code base.
Similar to a mutation_reader, but limited: it only returns whether a key
is sure not to exist in some mutation source. Non-blocking and expected
to execute fast. Corresponds to an sstable bloom filter.
To avoid ambiguity, it doesn't return a bool, instead a longer but less
ambiguous "definitely_doesnt_exists" or "maybe_exists".
Currently column_family::for_all_partitions() relies on monotonicity
of keys. Adding strict monotonicity requirement doesn't hurt
implementaitons, but makes some consumers simpler.