scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-22 09:30:45 +00:00

Author	SHA1	Message	Date
Tomasz Grabiec	ac49b1def0	mutation_cleaner: Migrate partition_snapshots when queueing for background cleanup partition_snapshots created in the memtable will keep a reference to the memtable (as region*) and to memtable::_cleaner. As long as the reader is alive the memtable will be kept alive by partition_snapshot_flat_reader::_container_guard. But after that, nothing prevents it from being destroyed. The snapshot can outlive the read if mutation_cleaner::merge_and_destroy() defers its destruction for later. When the read ends after memtable was flushed, the snapshot will be queued in the cache's cleaner, but internally will reference memtable's region and cleaner. This will result in a use-after-free when the snapshot resumes destruction. The fix is to update the snapshot's region and cleaner references at the time of queueing to point to the cache's region and cleaner. When memtable is destroyed without being moved to cache there is no problem, because the snapshot would be queued into memtable's cleaner, which will be drained on destruction from all snapshots. Introduced in `f3da043`. Fixes #4030.	2018-12-27 18:08:50 +01:00
Tomasz Grabiec	67f9afbd1a	mutation_cleaner: impl: Store a back-reference to the owning mutation_cleaner	2018-12-27 18:08:50 +01:00
Tomasz Grabiec	074be4d4e8	memtable, cache: Run mutation_cleaner worker in its own scheduling group The worker is responsible for merging MVCC snapshots, which is similar to merging sstables, but in memory. The new scheduling group will be therefore called "memory compaction". We should run it in a separate scheduling group instead of main/memtables, so that it doesn't disrupt writes and other system activities. It's also nice for monitoring how much CPU time we spend on this.	2018-06-27 21:51:04 +02:00
Tomasz Grabiec	6c6ffaee71	mutation_cleaner: Make merge() redirect old instance to the new one If memtable snapshot goes away after memtable started merging to cache, it would enqueue the snapshots for cleaning on the memtable's cleaner, which will have to clean without deferring when the memtable is destroyed. That may stall the reactor. To avoid this, make merge() cause the old instance of the cleaner to redirect to the new instance (owned by cache), like we do for regions. This way the snapshots mentioned earlier can be cleaned after memtable is destroyed, gracefully.	2018-06-27 21:51:04 +02:00
Tomasz Grabiec	c26a304fbb	mvcc: Merge partition version versions gradually in the background When snapshots go away, typically when the last reader is destroyed, we used to merge adjacent versions atomically. This could induce reactor stalls if partitions were large. This is especially true for versions created on cache update from memtables. The solution is to allow this process to be preempted and move to the background. mutation_cleaner keeps a linked list of such unmerged snapshots and has a worker fiber which merges them incrementally and asynchronously with regard to reads. This reduces scheduling latency spikes in tests/perf_row_cache_update for the case of large partition with many rows. For -c1 -m1G I saw them dropping from 23ms to 2ms.	2018-06-27 12:48:30 +02:00
Paweł Dziepak	bdc299cc38	mutation_cleaner: add disclaimer about mutation_partition lifetime mutation_cleaner has already caused problems by extending lifetime of mutation_partition past the lifetime of LSA migrators that it uses (due to the fact that both the cleaner and migrators were thread-local globals). Since the long-term goal is to make mutation_partition internal representation depend more and more on schema, that lifetime extension may again cause problems in the future, so let's add a disclaimer that, hopefully, will help avoid them.	2018-06-25 09:37:43 +01:00
Tomasz Grabiec	e0803ff71e	Introduce mutation_cleaner Used for collecting unused partition_version objects and freeing them incrementally. Will be used for both cache and memtables.	2018-05-30 14:41:39 +02:00

7 Commits