Commit Graph

37 Commits

Author SHA1 Message Date
Tomasz Grabiec
be24816c8a row_cache: Clear partitions with region locked
Since invalidate() may allocate, we need to take the region lock to
keep m.partitions references valid around the whole clear_and_dispose(),
which relies on that.
2016-02-26 16:57:31 +01:00
Glauber Costa
f6cfb04d61 add a priority class to mutation readers
SSTables already have a priority argument wired to their read path. However,
most of our reads do not call that interface directly, but employ the services
of a mutation reader instead.

Some of those readers will be used to read through a mutation_source, and those
have to be patched as well.

Right now, whenever we need to pass a class, we pass Seastar's default priority
class.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2016-01-25 15:20:38 -05:00
Tomasz Grabiec
6b059fd828 row_cache: Guard against wrap-around range in make_reader() 2016-01-13 17:50:55 +01:00
Tomasz Grabiec
50cc0c162e row_cache: Make invalidate() handle wrap-around ranges
Currently, for a wrap-around range the "begin" iterator would never meet the
"end" iterator, invoking undefined behavior in erase_and_dispose(),
which results in a crash.

Fixes #785
2016-01-13 17:50:55 +01:00
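
One common way to handle a wrap-around range is to split it into two non-wrapping ranges before erasing. The sketch below illustrates that idea only; it is not the code from this commit. It assumes integer keys in a std::map in place of ring positions in the cache's partition container, and the names partitions and invalidate are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the cache's ordered partition index.
using partitions = std::map<int, std::string>;

// Erase keys in [start, end). When start > end the range wraps around the
// end of the ring, so it is handled as two ranges: [start, last] and
// [first, end). Each erase() call then gets iterators in a valid order.
inline void invalidate(partitions& p, int start, int end) {
    if (start <= end) {
        p.erase(p.lower_bound(start), p.lower_bound(end));
    } else {
        p.erase(p.lower_bound(start), p.end());
        p.erase(p.begin(), p.lower_bound(end));
    }
}
```

With keys 1, 3, 5 and 7 present, invalidate(p, 6, 2) removes 7 and 1 and leaves 3 and 5 in place.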
Tomasz Grabiec
d81a46d7b5 column_family: Add schema setters
There is one current schema for given column_family. Entries in
memtables and cache can be at any of the previous schemas, but they're
always upgraded to current schema on access.
2016-01-11 10:34:52 +01:00
Tomasz Grabiec
4e5a52d6fa db: Make read interface schema version aware
The intent is to make data returned by queries always conform to a
single schema version, which is requested by the client. For CQL
queries, for example, we want to use the same schema which was used to
compile the query. The other node expects to receive data conforming
to the requested schema.

Interface on shard level accepts schema_ptr, across nodes we use
table_schema_version UUID. To transfer schema_ptr across shards, we
use global_schema_ptr.

Because schema is identified with UUID across nodes, requestors must
be prepared for being queried for the definition of the schema. They
must hold a live schema_ptr around the request. This guarantees that
schema_registry will always know about the requested version. This is
not an issue because for queries the requestor needs to hold on to the
schema anyway to be able to interpret the results. But care must be
taken to always use the same schema version for making the request and
parsing the results.

Schema requesting across nodes is currently stubbed (throws runtime
exception).
2016-01-11 10:34:52 +01:00
Tomasz Grabiec
036974e19b Make mutation interfaces support multiple versions
Schema is tracked in memtable and cache per-entry. Entries are
upgraded lazily on access. Incoming mutations are upgraded to table's
current schema on given shard.

Mutating nodes need to keep schema_ptr alive in case schema version is
requested by target node.
2016-01-11 10:34:51 +01:00
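
A rough sketch of per-entry schema tracking with lazy upgrade on access follows. It is a simplified model, not the actual memtable or cache code: a schema version is assumed to be just a column count, upgrading pads missing columns with empty cells, and the names entry, table, set_schema, apply and read are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical entry: remembers the schema version it was written with.
struct entry {
    unsigned version;
    std::vector<std::string> cells;
};

class table {
    unsigned _current_version = 1;
    std::unordered_map<std::string, entry> _entries;
public:
    // Switch the table to a newer schema; stored entries are left untouched.
    void set_schema(unsigned v) {
        assert(v >= _current_version);
        _current_version = v;
    }
    // Incoming data is upgraded to the current schema before it is stored.
    void apply(const std::string& key, entry e) {
        upgrade(e);
        _entries.insert_or_assign(key, std::move(e));
    }
    // Stored data is upgraded lazily, only when it is read.
    // Throws std::out_of_range if the key is absent.
    const entry& read(const std::string& key) {
        entry& e = _entries.at(key);
        upgrade(e);
        return e;
    }
private:
    void upgrade(entry& e) const {
        if (e.version != _current_version) {
            e.cells.resize(_current_version);  // assumption: one cell per column
            e.version = _current_version;
        }
    }
};
```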
Paweł Dziepak
59245e7913 row_cache: add functions for invalidating entries in cache
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-12-15 13:21:11 +01:00
Tomasz Grabiec
de75f3fa69 row_cache: Add default value for partition range in make_reader() 2015-11-29 16:25:21 +01:00
Tomasz Grabiec
7c3e6c306b row_cache: Wait for in-flight populations on update
Before this change, populations could race with update from flushed
memtable, which might result in cache being populated with older
data. Populations started before the flush consider neither the
memtable nor its sstable.

The fix employed here is to make update wait for populations which
were started before the flushed memtable's sstable was added to the
underlying data source. All populations started after that are
guaranteed to see the new data.
2015-11-29 16:25:21 +01:00
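
The waiting rule described above can be modelled with a phase counter. This is a synchronous sketch under stated assumptions, not the commit's implementation, which would wait asynchronously rather than poll. Each population is assumed to record the phase it started in, and the phase advances when the new sstable becomes visible. The names population_tracker, start_population, finish_population, advance_phase and safe_to_update are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

class population_tracker {
    std::uint64_t _phase = 0;
    std::map<std::uint64_t, unsigned> _in_flight;  // phase -> running populations
public:
    // A population records the phase it started in.
    std::uint64_t start_population() {
        ++_in_flight[_phase];
        return _phase;
    }
    void finish_population(std::uint64_t phase) {
        auto it = _in_flight.find(phase);
        assert(it != _in_flight.end());
        if (--it->second == 0) {
            _in_flight.erase(it);
        }
    }
    // Called once the flushed memtable's sstable is visible to new readers.
    // Returns the phase whose populations may have missed the new data.
    std::uint64_t advance_phase() {
        return _phase++;
    }
    // True once every population started in `phase` or earlier has finished.
    bool safe_to_update(std::uint64_t phase) const {
        auto it = _in_flight.begin();
        return it == _in_flight.end() || it->first > phase;
    }
};
```

Populations started after advance_phase() carry a later phase, so they never delay the update.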
Paweł Dziepak
b1b830bcbb row_cache: merge cache_entry::compare and ring_position_compare
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-10-22 12:25:02 +03:00
Paweł Dziepak
c2b53c5282 row_cache: add scanning_and_populating_reader
This reader enables range queries on row cache. An underlying key_reader
is used to obtain information about partitions that belong to the
specified range and if any of them isn't in the cache an underlying
mutation reader is used to read the missing data.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-10-20 20:27:53 +02:00
Paweł Dziepak
9abefbfa28 row_cache: add just_cache_scanning_reader
This mutation reader returns mutations from cache that are in a given
range. There may be other mutations in the system (e.g. in sstables)
that won't be returned, so this reader on its own cannot really satisfy
any query.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-10-20 20:27:53 +02:00
Paweł Dziepak
c765b38599 row_cache: add modification counter
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-10-20 20:27:53 +02:00
Paweł Dziepak
c1e95dd893 row_cache: pass underlying key_source to row_cache
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-10-20 20:27:53 +02:00
Paweł Dziepak
6b73d6bb36 row_cache: add cache_entry::ring_position_compare
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-10-20 20:27:53 +02:00
Pekka Enberg
ac4007153d row_cache: Implement clear() helper
We need to clear the row cache for column family truncate operation.
2015-09-30 09:09:42 +02:00
Tomasz Grabiec
1b1cfd2cbf tests: Introduce tests/memory_footprint_test 2015-09-23 21:27:44 -07:00
Tomasz Grabiec
4712af2c21 row_cache: Use allocating_section in row_cache::populate()
Cache has a tendency to eat up all available memory. It is evicted
on-demand, but this happens at certain points in time (during large
allocation requests). Small allocations which are served from small
object pools won't usually trigger this. Large allocations happen for
example when an LSA region needs a new segment, e.g. when the row cache is
populated. If large allocations happen for certain period only inside
row_cache::update(), then eviction will not be able to make forward
progress because cache's LSA region is locked inside
row_cache::update(). While it's locked, data can't be evicted from
it.

The solution is to use allocating_section.

Fixes #376.
2015-09-21 13:25:13 +03:00
Avi Kivity
d5cf0fb2b1 Add license notices 2015-09-20 10:43:39 +03:00
Avi Kivity
9dbe8ca1b5 row_cache: reduce cpu impact of memtable flush
Restrict the impact of flushing a memtable to row_cache to 20% of the
cpu.  This is accomplished by converting the code to a thread (with
bad indentation to improve patch readability) and using a thread
scheduling group.
2015-09-19 09:22:52 +03:00
Tomasz Grabiec
91e7dcfe10 row_cache: Don't count insertions and merges as hits and misses
Currently a cache update from a flushed memtable affects hits and
misses, which may be confusing. Let's reserve hits and misses for
reads. Cache update will affect counters called "insertions" and
"merges".
2015-09-10 12:41:27 +03:00
Tomasz Grabiec
f64ac3a80e row_cache: Extract scanning reader construction 2015-09-10 12:41:27 +03:00
Tomasz Grabiec
447e59eaf9 row_cache: Expose a metric for the number of cached partitions
Fixes #193.
2015-09-10 12:41:12 +03:00
Tomasz Grabiec
74603425ac mutation_partition: Introduce r-value version of apply() 2015-09-07 09:41:36 +02:00
Tomasz Grabiec
d1f89b4eab row_cache: Use allocation_section
See #259.

When transferring mutations between memtable and cache, lsa sometimes
runs out of memory. This solves the first two points, keeping reserve
filled up and adjusting the amount of reserve based on execution
history.
2015-09-06 21:25:44 +02:00
Tomasz Grabiec
7efcde12aa row_cache: Introduce row_cache::touch()
Useful in tests for ensuring that certain entries survive eviction.
2015-09-06 21:25:44 +02:00
Avi Kivity
4390be3956 Rename 'negative_mutation_reader' to 'partition_presence_checker'
Suggested by Tomek.
2015-08-24 18:03:22 +03:00
Amnon Heiman
0e1aa2e78b Expose the cache tracker and the num_entries in row_cache
This exposes the cache tracker and the number of entries in the row cache so
they can be used by the API.

And it adds a const getter for the region.

Both are const and are used for inspecting only.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-08-17 19:42:23 +03:00
Avi Kivity
1016b21089 cache: improve preloading of flushed memtable mutations
If a mutation definitely doesn't exist in all sstables, then we can
certainly load it into the cache.
2015-08-09 22:46:08 +03:00
Tomasz Grabiec
ef549ae5a5 lsa: Reclaim space from evictable regions incrementally
When LSA reclaimer cannot reclaim more space by compaction, it
will reclaim data by evicting from evictable regions.

Currently the only evictable region is the one owned by the row cache.
2015-08-08 09:59:24 +02:00
Tomasz Grabiec
7a8f1ef6c3 row_cache: Replace _lru_len counter with region occupancy
_lru_len may get stale when row_cache instance goes out of scope
purging all its partitions from cache. I'm assuming we're not really
interested in the number of partitions here, but rather a measure of
occupancy, so I applied a simple fix of using LSA region occupancy
instead.
2015-08-08 09:59:24 +02:00
Tomasz Grabiec
926509525f row_cache: Switch to using LSA 2015-08-06 14:05:16 +02:00
Avi Kivity
e577ed8459 row_cache: wire up collectd statistics 2015-07-28 09:48:27 +02:00
Tomasz Grabiec
f2502ae1ad row_cache: Value-initialize stats 2015-07-13 12:57:15 +03:00
Tomasz Grabiec
d4e0e5957b db: Integrate cache with the read path 2015-06-23 13:49:25 +02:00
Tomasz Grabiec
e40638823e db: Introduce mutation cache
row_cache class is meant to cache data for given table by wrapping
some underlying data source. It gives away a mutation_reader which
uses in-memory data if possible, or delegates to the underlying reader
and populates the cache on-the-fly.

Access to data in the cache is tracked for eviction purposes by a
separate entity, the cache_tracker. There is one such tracker for the
whole shard.
2015-06-23 13:49:24 +02:00
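
The read-through behaviour described in this message can be sketched as follows. This is a minimal illustration, not the row_cache implementation: it assumes integer keys and string values instead of partitions and mutations, uses a synchronous callback as the underlying source, and omits eviction tracking. The names source and read_through_cache, and the hits and misses counters, are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical underlying data source: returns the value for a key, if any.
using source = std::function<std::optional<std::string>(int)>;

class read_through_cache {
    source _underlying;
    std::map<int, std::string> _entries;
public:
    unsigned hits = 0;
    unsigned misses = 0;

    explicit read_through_cache(source s) : _underlying(std::move(s)) {}

    // Serve from memory when possible; otherwise delegate to the underlying
    // source and remember the result for later reads.
    std::optional<std::string> read(int key) {
        auto it = _entries.find(key);
        if (it != _entries.end()) {
            ++hits;
            return it->second;
        }
        ++misses;
        auto value = _underlying(key);
        if (value) {
            _entries.emplace(key, *value);
        }
        return value;
    }
};
```

Reading the same present key twice reaches the underlying source only once; the second read is counted as a hit.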