16 Commits

Author SHA1 Message Date
Tomasz Grabiec
53026f3ba6 memtable: Subtract from flushed memory when cleaning
This patch prevents virtual dirty from going negative during memtable
flush in case partition version merging erases data previously
accounted by the flush reader. There is an assert in
~flush_memory_accounter which guards against this.

This will start happening after tombstones are compacted with rows on
partition version merging.

The patch prevents this by having the cleaner notify the memtable layer,
via a callback, about the amount of dirty memory released during merging,
so that the memtable layer can adjust its accounting.
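
The accounting adjustment described above can be illustrated with a minimal
sketch. It is not the actual ScyllaDB code: dirty_accounting_sketch,
cleaner_sketch and their members are hypothetical names, and the sketch
assumes that every byte released by merging had already been accounted by
the flush reader.

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical stand-in for the memtable-layer accounting.
// virtual dirty = real dirty - memory already accounted as flushed.
struct dirty_accounting_sketch {
    std::int64_t real_dirty = 0;
    std::int64_t flushed = 0;

    std::int64_t virtual_dirty() const { return real_dirty - flushed; }

    void on_write(std::int64_t bytes) { real_dirty += bytes; }
    void on_flushed(std::int64_t bytes) { flushed += bytes; }

    // Callback target: merging released `bytes` of dirty memory.
    // Assumption: those bytes were already counted in `flushed`, so both
    // counters shrink and virtual dirty cannot drop below zero here.
    void on_cleaned(std::int64_t bytes) {
        real_dirty -= bytes;
        flushed -= std::min(flushed, bytes);
    }
};

// Hypothetical cleaner that reports released memory through a callback.
class cleaner_sketch {
    std::function<void(std::int64_t)> _on_release;
public:
    explicit cleaner_sketch(std::function<void(std::int64_t)> cb)
        : _on_release(std::move(cb)) {}

    // Called when version merging erases data and frees `bytes`.
    void merge_released(std::int64_t bytes) {
        if (_on_release) {
            _on_release(bytes);
        }
    }
};
```

Without the subtraction in on_cleaned(), releasing 30 bytes after 100 bytes
were written and flushed would leave virtual dirty at -30 in this sketch.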
2022-06-15 11:30:25 +02:00
Tomasz Grabiec
a4e96960b8 mvcc: Apply mutations in memtable with preemption enabled
Prerequisite for eagerly applying tombstones, which we want to be
preemptible. Before the patch, the apply path to the memtable was not
preemptible.

Because merging can now be deferred, we need to involve snapshots to
kick off background merging in case of preemption. This requires us to
propagate region and cleaner objects, in order to create a snapshot.
2022-06-15 11:29:43 +02:00
Avi Kivity
5129280f45 Revert "Merge 'memtable, cache: Eagerly compact data with tombstones' from Tomasz Grabiec"
This reverts commit e0670f0bb5, reversing
changes made to 605ee74c39. It causes failures
in debug mode in
database_test.test_database_with_data_in_sstables_is_a_mutation_source_plain,
though with low probability.

Fixes #10780
Reopens #652.
2022-06-14 18:06:22 +03:00
Tomasz Grabiec
9135d1fd1f memtable: Subtract from flushed memory when cleaning
This patch prevents virtual dirty from going negative during memtable
flush in case partition version merging erases data previously
accounted by the flush reader. There is an assert in
~flush_memory_accounter which guards against this.

This will start happening after tombstones are compacted with rows on
partition version merging.

The patch prevents this by having the cleaner notify the memtable layer,
via a callback, about the amount of dirty memory released during merging,
so that the memtable layer can adjust its accounting.
2022-06-06 19:25:41 +02:00
Tomasz Grabiec
0e3c4fc641 mvcc: Apply mutations in memtable with preemption enabled
Prerequisite for eagerly applying tombstones, which we want to be
preemptible. Before the patch, the apply path to the memtable was not
preemptible.

Because merging can now be deferred, we need to involve snapshots to
kick off background merging in case of preemption. This requires us to
propagate region and cleaner objects, in order to create a snapshot.
2022-06-06 19:23:37 +02:00
Botond Dénes
4f77e74bd4 partition_snapshot_reader: convert implementation to native v2
The underlying mutation representation is still v1, so the
implementation still has to do conversion. This happens right above the
lsa reader component.
2022-04-28 14:12:12 +03:00
Avi Kivity
a9812166cd replica, partition_snapshot_reader, keys: replace boost::any with std::any
Reduce #include load by standardizing on std::any.

In keys.cc, we just drop the unneeded include.

One instance of boost::any remains in config_file, due to a tie-in with
other boost components.
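
As a small illustration of the std::any usage this standardizes on, here is
a sketch with a hypothetical helper, any_to_string_sketch, which is not
taken from the ScyllaDB sources. It uses the pointer form of std::any_cast,
which returns nullptr instead of throwing when the stored type differs.

```cpp
#include <any>
#include <string>

// Hypothetical helper: returns the stored std::string, or `fallback`
// when the std::any is empty or holds a different type.
std::string any_to_string_sketch(const std::any& value, const std::string& fallback) {
    if (const auto* s = std::any_cast<std::string>(&value)) {
        return *s;
    }
    return fallback;
}
```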

Closes #10441
2022-04-28 07:18:53 +03:00
Avi Kivity
e7fb71020b Merge 'replica: Optimize empty_flat_reader out of the hot path' from Michał Chojnowski
When row_cache::make_reader() and memtable::make_flat_reader() see that the query result is empty, they return empty_flat_reader, which is a trivial implementation of flat_mutation_reader.
Even though empty_flat_reader doesn't do anything meaningful, it still needs to be created, handled in merging_reader and destroyed. Turns out this is costly.

This patch series replaces hot path uses of empty_flat_reader with an empty optional.

Performance effects:

`perf_simple_query --smp 1`
TPS: 138k -> 168k
allocs/op: 80.2 -> 71.1
insns/op: 49.9k -> 45.1k

`perf_simple_query --smp 1 --enable-cache=1 --flush`
TPS: 125k -> 150k
allocs/op: 79.2 -> 71.1
insns/op: 51.7k -> 47.2k

For a cassandra-stress benchmark (localhost, 100% cache reads) this translates to a TPS increase from ~42k to ~48k per hyperthread.

Note that this optimization is effective for single-partition reads where the queried partition is only in cache/sstables or only in memtables. Other queries (e.g. where the partition is in both cache and memtables and needs to be merged) are unaffected.

Closes #10204

* github.com:scylladb/scylla:
  replica: Prefer row_cache::make_reader_opt() to row_cache::make_reader()
  row_cache: Add row_cache::make_reader_opt()
  replica: Prefer memtable::make_flat_reader_opt() to memtable::make_flat_reader()
  memtable: Add memtable::make_flat_reader_opt()

[avi: adjust #include for readers/ split]
2022-03-14 14:07:00 +02:00
Mikołaj Sielużycki
1d84a254c0 flat_mutation_reader: Split readers by file and remove unnecessary includes.
The flat_mutation_reader files were conflated and contained multiple
readers, which was not strictly necessary. Splitting them improves
incremental compilation times, as touching rarely used readers no longer
recompiles large chunks of the codebase. Total compilation times also
improve, as flat_mutation_reader.hh and flat_mutation_reader_v2.hh have
shrunk and those files are included by many files in the codebase.

With changes

real	29m14.051s
user	168m39.071s
sys	5m13.443s

Without changes

real	30m36.203s
user	175m43.354s
sys	5m26.376s

Closes #10194
2022-03-14 13:20:25 +02:00
Michał Chojnowski
f211ef9d71 replica: Prefer memtable::make_flat_reader_opt() to memtable::make_flat_reader()
The former is significantly cheaper when there is nothing to be read.
2022-03-14 12:02:49 +01:00
Michał Chojnowski
218f2b6e98 memtable: Add memtable::make_flat_reader_opt()
When there is nothing to read, make_flat_reader() returns an empty (no-op)
reader. But it turns out that constructing, combining and destroying that
empty reader is quite costly.
As an optimization, add an alternative version which returns an empty optional
instead.
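
The idea can be sketched as follows. This is an illustration only:
reader_sketch, make_reader_opt_sketch and collect_readers_sketch are
hypothetical names, and the real memtable::make_flat_reader_opt() signature
is not shown in this log. The caller skips sources that return an empty
optional, so no no-op reader is constructed, combined or destroyed.

```cpp
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical reader type standing in for a mutation reader.
struct reader_sketch {
    std::vector<std::string> rows;
};

// Returns an empty optional when there is nothing to read,
// instead of building a no-op reader.
std::optional<reader_sketch> make_reader_opt_sketch(const std::vector<std::string>& data) {
    if (data.empty()) {
        return std::nullopt;
    }
    return reader_sketch{data};
}

// Only sources that actually have data contribute a reader to be merged.
std::vector<reader_sketch> collect_readers_sketch(
        const std::vector<std::vector<std::string>>& sources) {
    std::vector<reader_sketch> readers;
    for (const auto& src : sources) {
        if (auto r = make_reader_opt_sketch(src)) {
            readers.push_back(std::move(*r));
        }
    }
    return readers;
}
```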
2022-03-14 12:02:49 +01:00
Mikołaj Sielużycki
f4c57cbe87 memtable: Convert partition_snapshot_flat_reader to v2.
This is a facade change only; the make_partition_snapshot_flat_reader
function calls upgrade_to_v2 internally.

Closes #10152
2022-03-02 15:07:36 +02:00
Michael Livshin
34ed752885 memtable::make_flush_reader(): return flat_mutation_reader_v2
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-02-28 17:11:54 +02:00
Michael Livshin
9bacce4359 memtable::make_flat_reader(): return flat_mutation_reader_v2
This is just a facade change.

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-02-28 17:11:54 +02:00
Michał Radwański
2a3bd40c69 memtable: upgrade scanning_reader and flush_reader to v2
This change is part of the effort to migrate existing readers from the old API
to the new one. The corresponding make_flush_reader and
make_flat_reader functions still return flat_mutation_reader.
2022-02-28 17:11:54 +02:00
Avi Kivity
cbba80914d memtable: move to replica module and namespace
Memtables are a replica-side entity, and so are moved to the
replica module and namespace.

Memtables are also used outside the replica, in two places:
 - in some virtual tables; this is also in some way inside the replica,
   (virtual readers are installed at the replica level, not the
   coordinator), so I don't consider it a layering violation
 - in many sstable unit tests, as a convenient way to create sstables
   with known input. This is a layering violation.

We could make memtables their own module, but I think this is wrong.
Memtables are deeply tied into replica memory management, and trying
to make them a low-level primitive (at a lower level than sstables) will
be difficult. Not least because memtables use sstables. Instead, we
should have a memtable-like thing that doesn't support merging and
doesn't have all other funky memtable stuff, and instead replace
the uses of memtables in sstable tests with some kind of
make_flat_mutation_reader_from_unsorted_mutations() that does
the sorting that is the reason for the use of memtables in tests (and
live with the layering violation meanwhile).

Test: unit (dev)

Closes #10120
2022-02-23 09:05:16 +02:00