This series adds the ability for partition cache to keep information
whether partition size makes it uncacheable. During, reads these
entries save us IO operations since we already know that the partiiton
is too big to be put in the cache.
First part of the patchset makes all mutation_readers allow the
streamed_mutations they produce to outlive them, which is a guarantee
used later by the code handling reading large partitions.
(cherry picked from commit d2ed75c9ff)
Add contiguity flag to cache entry and set it in scanning reader.
Partitions fetched during scanning are continuous
and we know there's nothing between them.
Clear contiguity flag on cache entries
when the succeeding entry is removed.
Use continuous flag in range queries.
Don't go do disk if we know that there's nothing
between two entries we have in cache. We know that
when continuous flag of the first one is set to true.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <72bae432717037e95d1ac9465deaccfa7c7da707.1466627603.git.piotr@scylladb.com>
Correctness of current uses of clear() and invalidate() relies on fact
that cache is not populated using readers created before
invalidation. Sstables are first modified and then cache is
invalidated. This is not guaranteed by current implementation
though. As pointed out by Avi, a populating read may race with the
call to clear(). If that read started before clear() and completed
after it, the cache may be populated with data which does not
correspond to the new sstable set.
To provide such guarantee, invalidate() variants were adjusted to
synchronize using _populate_phaser, similarly like row_cache::update()
does.
row_cache::update() does not explicitly invalidate the entries it failed
to update in case of a failure. This could lead to inconsistency between
row cache and sstables.
In paractice that's not a problem because before row_cache::update()
fails it will cause all entries in the cache to be invalidated during
memory reclaim, but it's better to be safe and explicitly remove entries
that should be updated but it was not possible to do so.
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1453829681-29239-1-git-send-email-pdziepak@scylladb.com>
Its definition as a lambda function is inconvenient, because it does not allow
us to use default values for parameters.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Its definition as a lambda function is inconvenient, because it does not allow
us to use default values for parameters.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
The intent is to make data returned by queries always conform to a
single schema version, which is requested by the client. For CQL
queries, for example, we want to use the same schema which was used to
compile the query. The other node expects to receive data conforming
to the requested schema.
Interface on shard level accepts schema_ptr, across nodes we use
table_schema_version UUID. To transfer schema_ptr across shards, we
use global_schema_ptr.
Because schema is identified with UUID across nodes, requestors must
be prepared for being queried for the definition of the schema. They
must hold a live schema_ptr around the request. This guarantees that
schema_registry will always know about the requested version. This is
not an issue because for queries the requestor needs to hold on to the
schema anyway to be able to interpret the results. But care must be
taken to always use the same schema version for making the request and
parsing the results.
Schema requesting across nodes is currently stubbed (throws runtime
exception).
The underlying data source for cache should not be the same memtable
which is later used to update the cache from. This fixes the following
assertion failure:
row_cache_test_g: utils/logalloc.hh:289: decltype(auto) logalloc::allocating_section::operator()(logalloc::region&, Func&&) [with Func = memtable::make_reader(schema_ptr, const partition_range&)::<lambda()>]: Assertion `r.reclaiming_enabled()' failed.
The problem is that when memtable is merged into cache their regions
are also merged, so locking cache's region locks the memtable region
as well.
Since bytes is a very generic value that is returned from many calls,
it is easy to pass it by mistake to a function expecting a data_value,
and to get a wrong result. It is impossible for the data_value constructor
to know if the argument is a genuine bytes variable, a data_value of another
type, but serialized, or some other serialized data type.
To prevent misuse, make the data_value(bytes) constructor
(and complementary data_value(optional<bytes>) explicit.