scylladb

Author	SHA1	Message	Date
Tomasz Grabiec	b224ff6ede	Merge 'pdziepak/row-cache-wide-entries/v4' from seastar-dev.git This series adds the ability for partition cache to keep information about whether partition size makes it uncacheable. During reads, these entries save us IO operations since we already know that the partition is too big to be put in the cache. First part of the patchset makes all mutation_readers allow the streamed_mutations they produce to outlive them, which is a guarantee used later by the code handling reading large partitions. (cherry picked from commit `d2ed75c9ff`)	2016-08-02 20:24:29 +02:00
Piotr Jastrzebski	6960fce9b2	Use continuity flag correctly with concurrent invalidations Invalidations can happen between reading a cache entry and actually using it, so we have to check whether the flag was cleared; if it was, we need to read the entry again. Fixes #1464. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <7856b0ded45e42774ccd6f402b5ee42175bd73cf.1469701026.git.piotr@scylladb.com> (cherry picked from commit `fdfd1af694`)	2016-08-02 20:24:22 +02:00
Piotr Jastrzebski	bf27379583	Add tests for wide partition handling in cache. They shouldn't be cached. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> (cherry picked from commit `7d29cdf81f`)	2016-07-27 14:09:45 +03:00
Piotr Jastrzebski	02cf5a517a	Add collectd counter for uncached wide partitions. Keep track of every read of wide partition that's not cached. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> (cherry picked from commit `37a7d49676`)	2016-07-27 14:09:40 +03:00
Piotr Jastrzebski	ec3d59bf13	Add flag to configure max size of a cached partition. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> (cherry picked from commit `636a4acfd0`)	2016-07-27 14:09:34 +03:00
Paweł Dziepak	81e4952c78	row_cache: fix marking last entry as continuous Range queries need to take special care when transitioning between ranges that are read from sstables and ranges that are already in the cache. Original code in such case just started a secondary reader and told it to unconditionally mark the last entry as continuous (primary reader has already returned an element that immediately follows the range that is going to be read from sstables). However, that information may get stale. For instance, by the time secondary reader finishes reading its range the element immediately following it may get evicted from the cache thus causing continuity flag to be incorrectly set. The solution is to ensure that the element immediately after the range read from sstables is still in the cache. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com> Message-Id: <1468586893-15266-1-git-send-email-pdziepak@scylladb.com>	2016-07-15 15:15:02 +02:00
Piotr Jastrzebski	59d0d9e666	Fix cache_tracker::clear Make sure that artificial entries for all column families are set to non continuous. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <f9e517fe40482c05f6c388faab7d6b9eca6b159e.1467103548.git.piotr@scylladb.com>	2016-06-28 11:18:23 +02:00
Piotr Jastrzebski	9b011bff18	row_cache: add contiguity flag to cache entry to reduce disk IO during scans Add contiguity flag to cache entry and set it in scanning reader. Partitions fetched during scanning are continuous and we know there's nothing between them. Clear contiguity flag on cache entries when the succeeding entry is removed. Use continuous flag in range queries. Don't go to disk if we know that there's nothing between two entries we have in cache. We know that when continuous flag of the first one is set to true. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <72bae432717037e95d1ac9465deaccfa7c7da707.1466627603.git.piotr@scylladb.com>	2016-06-23 09:43:15 +03:00
Paweł Dziepak	b2c37429e7	row_cache: drop slicing_reader Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-06-20 21:29:51 +01:00
Paweł Dziepak	f605499aec	row_cache: fully support streamed_mutations Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-06-20 21:29:51 +01:00
Avi Kivity	465c0a4ead	Merge "Make stronger guarantees in row_cache's clear/invalidate" from Tomasz "Correctness of current uses of clear() and invalidate() relies on fact that cache is not populated using readers created before invalidation. Sstables are first modified and then cache is invalidated. This is not guaranteed by current implementation though. As pointed out by Avi, a populating read may race with the call to clear(). If that read started before clear() and completed after it, the cache may be populated with data which does not correspond to the new sstable set. To provide such guarantee, invalidate() variants were adjusted to synchronize using _populate_phaser, similarly like row_cache::update() does. Fixes #1291."	2016-06-13 09:55:29 +03:00
Tomasz Grabiec	d5a2d7a88d	row_cache: Add eviction and removal counters Fixes #1273. Message-Id: <1465315433-8473-1-git-send-email-tgrabiec@scylladb.com>	2016-06-08 16:08:32 -04:00
Tomasz Grabiec	170a214628	row_cache: Make stronger guarantees in clear/invalidate Correctness of current uses of clear() and invalidate() relies on fact that cache is not populated using readers created before invalidation. Sstables are first modified and then cache is invalidated. This is not guaranteed by current implementation though. As pointed out by Avi, a populating read may race with the call to clear(). If that read started before clear() and completed after it, the cache may be populated with data which does not correspond to the new sstable set. To provide such guarantee, invalidate() variants were adjusted to synchronize using _populate_phaser, similarly like row_cache::update() does.	2016-06-06 13:21:06 +02:00
Avi Kivity	9637c2232c	Merge "Move the JMX timer polling logic to Scylla" from Amnon	2016-05-24 13:07:52 +03:00
Amnon Heiman	468bcfbf1f	row_cache: Change counter to timed_rate_moving_average_and_histogram As part of moving the derived statistics into Scylla, this replaces the counter in the row_cache stats with timed_rate_moving_average_and_histogram. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2016-05-17 11:53:15 +03:00
Piotr Jastrzebski	dcba6f5c45	Pass clustering_row_ranges to mutation readers. This will allow readers to reduce the amount of data read. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2016-05-16 14:36:57 +02:00
Pekka Enberg	38a54df863	Fix pre-ScyllaDB copyright statements People keep tripping over the old copyrights and copy-pasting them to new files. Search and replace "Cloudius Systems" with "ScyllaDB". Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>	2016-04-08 08:12:47 +03:00
Tomasz Grabiec	be24816c8a	row_cache: Clear partitions with region locked Since invalidate() may allocate, we need to take the region lock to keep m.partitions references valid around whole clear_and_dispose(), which relies on that.	2016-02-26 16:57:31 +01:00
Glauber Costa	f6cfb04d61	add a priority class to mutation readers SSTables already have a priority argument wired to their read path. However, most of our reads do not call that interface directly, but employ the services of a mutation reader instead. Some of those readers will be used to read through a mutation_source, and those have to be patched as well. Right now, whenever we need to pass a class, we pass Seastar's default priority class. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2016-01-25 15:20:38 -05:00
Tomasz Grabiec	6b059fd828	row_cache: Guard against wrap-around range in make_reader()	2016-01-13 17:50:55 +01:00
Tomasz Grabiec	50cc0c162e	row_cache: Make invalidate() handle wrap-around ranges Currently for wrap around the "begin" iterator would not meet with the "end" iterator, invoking undefined behavior in erase_and_dispose() which results in a crash. Fixes #785	2016-01-13 17:50:55 +01:00
Tomasz Grabiec	d81a46d7b5	column_family: Add schema setters There is one current schema for given column_family. Entries in memtables and cache can be at any of the previous schemas, but they're always upgraded to current schema on access.	2016-01-11 10:34:52 +01:00
Tomasz Grabiec	4e5a52d6fa	db: Make read interface schema version aware The intent is to make data returned by queries always conform to a single schema version, which is requested by the client. For CQL queries, for example, we want to use the same schema which was used to compile the query. The other node expects to receive data conforming to the requested schema. Interface on shard level accepts schema_ptr, across nodes we use table_schema_version UUID. To transfer schema_ptr across shards, we use global_schema_ptr. Because schema is identified with UUID across nodes, requestors must be prepared for being queried for the definition of the schema. They must hold a live schema_ptr around the request. This guarantees that schema_registry will always know about the requested version. This is not an issue because for queries the requestor needs to hold on to the schema anyway to be able to interpret the results. But care must be taken to always use the same schema version for making the request and parsing the results. Schema requesting across nodes is currently stubbed (throws runtime exception).	2016-01-11 10:34:52 +01:00
Tomasz Grabiec	036974e19b	Make mutation interfaces support multiple versions Schema is tracked in memtable and cache per-entry. Entries are upgraded lazily on access. Incoming mutations are upgraded to table's current schema on given shard. Mutating nodes need to keep schema_ptr alive in case schema version is requested by target node.	2016-01-11 10:34:51 +01:00
Paweł Dziepak	59245e7913	row_cache: add functions for invalidating entries in cache Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2015-12-15 13:21:11 +01:00
Tomasz Grabiec	de75f3fa69	row_cache: Add default value for partition range in make_reader()	2015-11-29 16:25:21 +01:00
Tomasz Grabiec	7c3e6c306b	row_cache: Wait for in-flight populations on update Before this change, populations could race with update from flushed memtable, which might result in cache being populated with older data. Populations started before the flush are not considering the memtable nor its sstable. The fix employed here is to make update wait for populations which were started before the flushed memtable's sstable was added to the underlying data source. All populations started after that are guaranteed to see the new data.	2015-11-29 16:25:21 +01:00
Paweł Dziepak	b1b830bcbb	row_cache: merge cache_entry::compare and ring_position_compare Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2015-10-22 12:25:02 +03:00
Paweł Dziepak	c2b53c5282	row_cache: add scanning_and_populating_reader This reader enables range queries on row cache. An underlying key_reader is used to obtain information about partitions that belong to the specified range and if any of them isn't in the cache an underlying mutation reader is used to read the missing data. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2015-10-20 20:27:53 +02:00
Paweł Dziepak	9abefbfa28	row_cache: add just_cache_scanning_reader This mutation reader returns mutations from cache that are in a given range. There may be other mutations in the system (e.g. in sstables) that won't be returned, so this reader on its own cannot really satisfy any query. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2015-10-20 20:27:53 +02:00
Paweł Dziepak	c765b38599	row_cache: add modification counter Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2015-10-20 20:27:53 +02:00
Paweł Dziepak	c1e95dd893	row_cache: pass underlying key_source to row_cache Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2015-10-20 20:27:53 +02:00
Paweł Dziepak	6b73d6bb36	row_cache: add cache_entry::ring_position_compare Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2015-10-20 20:27:53 +02:00
Pekka Enberg	ac4007153d	row_cache: Implement clear() helper We need to clear the row cache for column family truncate operation.	2015-09-30 09:09:42 +02:00
Tomasz Grabiec	1b1cfd2cbf	tests: Introduce tests/memory_footprint_test	2015-09-23 21:27:44 -07:00
Tomasz Grabiec	4712af2c21	row_cache: Use allocating_section in row_cache::populate() Cache has a tendency to eat up all available memory. It is evicted on-demand, but this happens at certain points in time (during large allocation requests). Small allocations which are served from small object pools won't usually trigger this. Large allocations happen for example when LSA region needs a new segment, eg. when row cache is populated. If large allocations happen for certain period only inside row_cache::update(), then eviction will not be able to make forward progress because cache's LSA region is locked inside row_cache::update(). While it's locked, data can't be evicted from it. The solution is to use allocating_section. Fixes #376.	2015-09-21 13:25:13 +03:00
Avi Kivity	d5cf0fb2b1	Add license notices	2015-09-20 10:43:39 +03:00
Avi Kivity	9dbe8ca1b5	row_cache: reduce cpu impact of memtable flush Restrict the impact of flushing a memtable to row_cache to 20% of the cpu. This is accomplished by converting the code to a thread (with bad indentation to improve patch readability) and using a thread scheduling group.	2015-09-19 09:22:52 +03:00
Tomasz Grabiec	91e7dcfe10	row_cache: Don't count insertions and merges as hits and misses Currently a cache update from a flushed memtable affects hits and misses, which may be confusing. Let's reserve hits and misses for reads. Cache update will affect counters called "insertions" and "merges".	2015-09-10 12:41:27 +03:00
Tomasz Grabiec	f64ac3a80e	row_cache: Extract scanning reader construction	2015-09-10 12:41:27 +03:00
Tomasz Grabiec	447e59eaf9	row_cache: Expose a metric for the number of cached partitions Fixes #193.	2015-09-10 12:41:12 +03:00
Tomasz Grabiec	74603425ac	mutation_partition: Introduce r-value version of apply()	2015-09-07 09:41:36 +02:00
Tomasz Grabiec	d1f89b4eab	row_cache: Use allocation_section See #259. When transferring mutations between memtable and cache, lsa sometimes runs out of memory. This solves the first two points, keeping reserve filled up and adjusting the amount of reserve based on execution history.	2015-09-06 21:25:44 +02:00
Tomasz Grabiec	7efcde12aa	row_cache: Introduce row_cache::touch() Useful in tests for ensuring that certain entries survive eviction.	2015-09-06 21:25:44 +02:00
Avi Kivity	4390be3956	Rename 'negative_mutation_reader' to 'partition_presence_checker' Suggested by Tomek.	2015-08-24 18:03:22 +03:00
Amnon Heiman	0e1aa2e78b	Expose the cache tracker and the num_entries in row_cache This exposes the cache tracker and the number of entries in the row cache so they can be used by the API. It also adds a const getter for the region. Both are const and are used for inspecting only. Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>	2015-08-17 19:42:23 +03:00
Avi Kivity	1016b21089	cache: improve preloading of flushed memtable mutations If a mutation definitely doesn't exist in all sstables, then we can certainly load it into the cache.	2015-08-09 22:46:08 +03:00
Tomasz Grabiec	ef549ae5a5	lsa: Reclaim space from evictable regions incrementally When LSA reclaimer cannot reclaim more space by compaction, it will reclaim data by evicting from evictable regions. Currently the only evictable region is the one owned by the row cache.	2015-08-08 09:59:24 +02:00
Tomasz Grabiec	7a8f1ef6c3	row_cache: Replace _lru_len counter with region occupancy _lru_len may get stale when row_cache instance goes out of scope purging all its partitions from cache. I'm assuming we're not really interested in the number of partitions here, but rather a measure of occupancy, so I applied a simple fix of using LSA region occupancy instead.	2015-08-08 09:59:24 +02:00
Tomasz Grabiec	926509525f	row_cache: Switch to using LSA	2015-08-06 14:05:16 +02:00
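
The continuity flag described in commits 9b011bff18 and 81e4952c78 can be illustrated with a toy model. This is only a sketch under stated assumptions: `toy_entry`, `toy_cache`, and all of their member functions are hypothetical names invented for illustration and are not ScyllaDB's actual row_cache types or API. Partition keys are plain integers, and "disk" is not modelled at all; the sketch only shows when a range query could skip going to the underlying source.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <vector>

// Hypothetical, simplified model; not ScyllaDB's real cache_entry.
struct toy_entry {
    // true: no partition exists between this entry and its successor in the cache
    bool continuous = false;
};

class toy_cache {
    std::map<int, toy_entry> _entries; // ordered by (toy) partition key
public:
    // Models a scanning reader: `keys` is every partition found in a range,
    // in sorted order, so consecutive keys have nothing between them.
    void populate_from_scan(const std::vector<int>& keys) {
        for (std::size_t i = 0; i < keys.size(); ++i) {
            toy_entry& e = _entries[keys[i]];
            if (i + 1 < keys.size()) {
                e.continuous = true;
            }
        }
    }

    // Removing an entry clears the flag on its predecessor, because the
    // predecessor's new successor may be separated from it by uncached data.
    void evict(int key) {
        auto it = _entries.find(key);
        if (it == _entries.end()) {
            return;
        }
        if (it != _entries.begin()) {
            std::prev(it)->second.continuous = false;
        }
        _entries.erase(it);
    }

    // A range query must consult the underlying source for the gap after
    // `key` unless the entry is cached and marked continuous.
    bool needs_disk_read_after(int key) const {
        auto it = _entries.find(key);
        return it == _entries.end() || !it->second.continuous;
    }
};
```

In this model, scanning keys 1, 5, 9 marks entries 1 and 5 as continuous, so a later range query over them would not need the underlying source; evicting 5 clears the flag on 1, which forces the next query to read the gap again. The real implementation also has to handle concurrent invalidation and eviction of the following element, as the later fixes in this log describe.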

54 Commits