Commit Graph

13507 Commits

Author SHA1 Message Date
Tomasz Grabiec
a968a84ec5 tests: mutation_source_tests: Always print the seed
BOOST_TEST_MESSAGE() is not logged by default, and for some tests we
don't want to enable that because it's too noisy. But we need to know
the seed to reproduce a failure, so we better to always print it.
2017-11-13 20:55:14 +01:00
Tomasz Grabiec
e868929faf tests: Disable alloc failure injection in test assertions
Injecting failures to assertions doesn't add much value but slows down
test execution by adding extra iterations.
2017-11-13 20:55:14 +01:00
Tomasz Grabiec
5cf7f9d1bb tests: Avoid needless copies 2017-11-13 20:55:14 +01:00
Tomasz Grabiec
1971332195 row_cache: Fix exception safety of cache_entry::read()
When we fail, we need to return streamed_mutation back, so that
the operation can be retried.

Causes SIGSEGV on nullptr otherwise on bad_alloc.
2017-11-13 20:55:14 +01:00
Tomasz Grabiec
11a195c403 row_cache: scanning_and_populating_reader: Fix exception unsafety causing read to skip data
If assignment to _lower_bound in the "_secondary_in_progress = true;"
case in do_read_from_primary() throws due to allocation failure, the
update section will be retried and we will take the not_moved path,
skipping the range which was discontinuous and was supposed to be read
from underlying.

Fix by redoing lookup using _lower_bound in case the section is
retried. When we retry, _primary.valid() will be false. We need to
ensure now that _lower_bound is always valid.

Fixes #2944.
2017-11-13 20:55:14 +01:00
Tomasz Grabiec
5dc1ee41e4 row_cache: partition_range_cursor: Extract valid() and advance_to() from refresh() 2017-11-13 20:55:14 +01:00
Tomasz Grabiec
09c49b2db3 cache_streamed_mutation: Add trace-level logging to cache_streamed_mutation 2017-11-13 20:55:14 +01:00
Tomasz Grabiec
f60cfa34f4 mvcc: Lift noexcept off partition_snapshot_row_weakref assignment/constructors
Assignment to _pos (position_in_partition) may throw. noexcept is a
remnant from the version which didn't have _pos.
2017-11-13 20:55:14 +01:00
Tomasz Grabiec
bd7b68f877 cache_streamed_mutation: Make advancing to the next range exception-safe
Changing _ck_ranges_curr and _lower_bound should be atomic, either
both fail or both succeed.  Currently it could happen that if
position_in_partition::for_range_start() fails, _ck_ranges_curr would
be advanced but _lower_bound not.
2017-11-13 20:55:14 +01:00
Tomasz Grabiec
081deec731 cache_streamed_mutation: Make add_clustering_row_to_buffer() exception-safe
We need to maintain the following invariants:
 (1) no fragment with position >= _lower_bound was pushed yet
 (2) If _lower_bound > mf.position(), mf was emitted

Before this patch (1) could be violated if drain_tombstones() failed
in the middle. (2) could be violated if push_mutation_fragment()
failed.
2017-11-13 20:55:14 +01:00
Tomasz Grabiec
d1b844737a cache_streamed_mutation: Make drain_tombstones() exception-safe
If push_mutation_fragment() failed, mfo which we got from get_next()
would be lost. Fix by making sure push_mutation_fragment() won't fail.
2017-11-13 20:55:14 +01:00
Tomasz Grabiec
875fc93956 cache_streamed_mutation: Return void from start_reading_from_underlying()
The return value is no longer used.
2017-11-13 20:55:14 +01:00
Tomasz Grabiec
5fb319bbb9 cache_streamed_mutation: Document invariants related to exception-safety 2017-11-13 20:55:14 +01:00
Tomasz Grabiec
53f4452b47 streamed_mutation: Add reserve_one() 2017-11-13 20:55:13 +01:00
Tomasz Grabiec
8d69d217af lsa: Guarantee invalidated references on allocating section retry
There is existing code (e.g. use of partition_snapshot_row_cursor in
cache_streamed_mutation) which assumes that references will be
invalidated when bad_alloc is thrown from allocating_section. That is
currently the case because on retry we will attempt memory reclamation
which will invalidate references either through compaction or
eviction. Make this guarantee explicit.
2017-11-13 20:55:13 +01:00
Tomasz Grabiec
6bf1c6014f mvcc: partition_snapshot_row_cursor: Mark allocation points
This marks places which may allocate but not always do as allocation
points to increase effectiveness of testing.
2017-11-13 20:55:13 +01:00
Avi Kivity
f8af4f507b Merge "Support for varint and decimal in aggregate functions" from Daniel
"This patch adds support for varint and decimal to aggregate functions.
Some other types (like byte or smallint) weren't supported and they are
supported by C*. So their aggregate functions were added as well.

To allow aggregate functions for big_decimal, following methods were added
to big_decimal type:
  * Division by int64_t that preservers number of decimal digits.
  * Operator += .
  * Comparison operators.

Fixes #2842."

* 'danfiala/scylla-2842-send-002' of https://github.com/hagrid-the-developer/scylla:
  tests: Add tests for aggregate functions.
  tests: Add tests for big_decimal type.
  cql3/functions: Add aggregate functions for big_decimal.
  utils/big_decimal: Added necessary operators and methods for aggregate functions.
  cql3/functions: Add aggregate functions for types for which it is trivial.
2017-11-12 17:11:33 +02:00
Daniel Fiala
bc20484c47 tests: Add tests for aggregate functions.
Signed-off-by: Daniel Fiala <daniel@scylladb.com>
2017-11-12 15:53:22 +01:00
Daniel Fiala
ee1d69502b tests: Add tests for big_decimal type.
Signed-off-by: Daniel Fiala <daniel@scylladb.com>
2017-11-12 15:53:22 +01:00
Daniel Fiala
74c5f70b0a cql3/functions: Add aggregate functions for big_decimal.
Signed-off-by: Daniel Fiala <daniel@scylladb.com>
2017-11-12 15:53:13 +01:00
Daniel Fiala
ce2f010859 utils/big_decimal: Added necessary operators and methods for aggregate functions.
Signed-off-by: Daniel Fiala <daniel@scylladb.com>
2017-11-12 15:51:29 +01:00
Daniel Fiala
115668fe70 cql3/functions: Add aggregate functions for types for which it is trivial.
Signed-off-by: Daniel Fiala <daniel@scylladb.com>
2017-11-12 13:56:20 +01:00
Tomasz Grabiec
484dde692f Merge "make sure that cache updates don't overflow dirty memory" from Glauber
Since we started accounting virtual dirty memory we no longer have a cap
on real dirty memory. In most situations that is not needed, since real
dirty will just be at most twice as much as virtual dirty (current
flushing memtable plus new memtable).

However, due to things like cache updates and component flushing we can
end up having a lot of memtables that are virtually freed but not yet
fully released, leading real dirty memory to explode using all the box'
memory.

This patch adds a cap on real dirty memory as well. Because of the
hierarchical nature of region_group, if the parent blocks due to memory
depletion, so will the child (virtual dirty region group).

After that is done, we need to make sure that dirty memory is not seen
as freed until the cache update is done. Until a particular partition is
moved to the cache it is not evictable. As a result we can OOM the
system if we have a lot of pending cache updates as the writes will not
be throttled and memory won't be made available.

This patch pins the memory used by the region as real dirty before the
cache update starts, and unpins it when it is over. In the mean time it
gradually releases memory of the partitions that are being moved to
cache.

I have verified in a couple of workloads that the amount of memory
accounted through this is the same amount of memory accounted through
the memtable flush procedure.

Fixes #1942

* git@github.com:glommer/scylla.git glommer/update-cache-v4:
  row_cache: modernize use of seastar threads
  mutation_partition: estimate size of partition
  memtable: factor out calculation of memtable_entry memory size
  memtable: add a method to export memtable's dirty memory manager
  dirty_memory_manager: block if we hit the real dirty limit
  dirty_memory_manager: add functions to manipulate real dirty
  partition: add method to calculate memory size of a partition
  row cache: pin real dirty during cache updates.
2017-11-10 13:55:12 +01:00
Piotr Jastrzebski
e7a0732f72 Add schema_ptr to flat_mutation_reader
It is usefull to have a schema inside a flat reader
the same way we had schema inside a streamed_mutation.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <b37e0dbf38810c00bd27fb876b69e1754c16a89f.1510312137.git.piotr@scylladb.com>
2017-11-10 13:54:55 +01:00
Pekka Enberg
0c192c835c cql3: Fix 'DROP INDEX' to also drop index view
This patch fixes 'DROP INDEX' CQL statement to also drop the underlying
index view automatically so that we don't leave unused materialized
views behind.
Message-Id: <1510303421-15945-1-git-send-email-penberg@scylladb.com>
2017-11-10 10:52:08 +01:00
Duarte Nunes
73f6c9a612 Merge seastar upstream
* seastar 8040cab...11ad0b1 (7):
  > alloc_failure_injector: Fix compilation error with gcc 7.1
  > core/gate: Add is_closed() function
  > doc: code formatting and fix function call
  > doc: tutoral code formatting
  > build: adjust -Wno-error=cpp for clang
  > build: don't error out on preprocessor #warning
  > Merge 'Enhancements of allocation failure injector' from Tomasz

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-11-09 14:42:06 +01:00
Takuya ASADA
f607a01cc5 dist/debian: link boost statically
Since we switched scylla-boost163 which isn't provided by distribution repo,
we need to link them statically.

Fixes #2946

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <1510229553-29801-1-git-send-email-syuu@scylladb.com>
2017-11-09 14:51:00 +02:00
Glauber Costa
1d7617723d row cache: pin real dirty during cache updates.
Right now, once a region is moved to the cache is no longer visible to
the dirty memory system. Not as real dirty nor virtual dirty.

The problem is that until a particular partition is moved to the cache
it is not evictable. As a result we can OOM the system if we have a lot
of pending cache updates as the writes will not be throttled and memory
won't be made available.

This patch pins the memory used by the region as real dirty before the
cache update starts, and unpins it when it is over. In the mean time it
gradually releases memory of the partitions that are being moved to
cache.

I have verified in a couple of workloads that the amount of memory
accounted through this is the same amount of memory accounted through
the memtable flush procedure.

Fixes #1942

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2017-11-08 19:46:36 -05:00
Glauber Costa
c2f49da609 partition: add method to calculate memory size of a partition
Once that is added, also add a method to a memtable entry to calculate
the entire size of a memtable entry. Right now we only have one method
to calculate the size minus rows.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2017-11-08 16:21:44 -05:00
Glauber Costa
b02ab991b9 dirty_memory_manager: add functions to manipulate real dirty
There are times in which we want to add and remove real dirty memory
without impacting virtual dirty. One such example is the cache update
process, where real dirty is the limiting factor.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2017-11-08 16:21:44 -05:00
Glauber Costa
a6b2226562 dirty_memory_manager: block if we hit the real dirty limit
Since we started accounting virtual dirty memory we no longer have a cap
on real dirty memory. In most situations that is not needed, since real
dirty will just be at most twice as much as virtual dirty (current
flushing memtable plus new memtable).

However, due to things like cache updates and component flushing we can
end up having a lot of memtables that are virtually freed but not yet
fully released, leading real dirty memory to explode using all the box'
memory.

This patch adds a cap on real dirty memory as well. Because of the
hierarchical nature of region_group, if the parent blocks due to memory
depletion, so will the child (virtual dirty region group).

A next step is to add a controller that will increase the priority of
the tasks involving in releasing real dirty memory if we get dangerously
close to the threshold.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2017-11-08 16:21:44 -05:00
Glauber Costa
b98a48657e memtable: add a method to export memtable's dirty memory manager
It will be used by the cache update process to gradually return real
dirty memory to the manager.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2017-11-08 16:21:44 -05:00
Glauber Costa
ec36b9eddc memtable: factor out calculation of memtable_entry memory size
The total size is the sum of two components. Add a method that
does that sum so this code gets easier to reuse.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2017-11-08 16:21:44 -05:00
Glauber Costa
d49ecae201 mutation_partition: estimate size of partition
In the memtable flusher, we account for the size of a partition as we
read them. However, there are other points in the architecture where we
would like to calculate the size of a partition in a point in which we
are not reading it. One such example is the cache update process.

This patch enhances the mutation_partition adding a method that returns
the total size for this partition.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2017-11-08 16:21:44 -05:00
Glauber Costa
b836005555 row_cache: modernize use of seastar threads
For a while now we have an async() function, that simplifies the code by not
needing to issue an explicit join. This patch converts the row cache to use
async() as well, which most of our code already does. Doing so will make
it easier to make changes to update_cache.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
2017-11-08 16:21:44 -05:00
Paweł Dziepak
b69f94fece Merge "Implement flat_mutation_reader::consume" from Piotr
"Implement flat_mutation_reader::consume and add tests for it.
For that implement flat_mutation_reader_from_mutations and
read_mutation_from_flat_mutation_reader."

* 'haaawk/flat_reader_consume_v3' of github.com:scylladb/seastar-dev:
  Add tests for flat_mutation_reader::consume
  Add tests for flat_mutation_reader utils
  Introduce read_mutation_from_flat_mutation_reader
  Make mutation_rebuilder streamed_mutation independent
  flat_mutation_reader_from_mutation: support multiple mutations
  Introduce flat_mutation_reader::consume
  Move FlattenedConsumer concept to flat_mutation_reader.hh
2017-11-08 15:08:47 +00:00
Paweł Dziepak
0373f357a8 Merge "Make memtable::make_reader return flat_mutation_reader" from Piotr
"This patchset introduces memtable::make_flat_reader that returns
flat_mutation_reader and converts internal memtable readers into
flat_mutation_readers.

It also introduces some utility methods like make_forwardable and
make_partition_snapshot_flat_reader."

* 'haaawk/flat_reader_memtable_v4' of github.com:scylladb/seastar-dev:
  Turn scanning_reader into flat_mutation_reader
  Change memtable_entry::read to return flat_mutation_reader
  Make iterator_reader independent from mutation_reader
  Introduce make_partition_snapshot_flat_reader
  Prepare partition_snapshot_flat_reader
  Introduce flat_mutation_reader_from_mutation
  Prepare flat_mutation_reader_from_mutation
  Introduce make_forwardable
  Prepare make_forwardable for flat_mutation_reader
  Introduce empty_flat_reader
  memtable: Introduce make_flat_reader
2017-11-08 14:24:26 +00:00
Piotr Jastrzebski
29d409de2f Add tests for flat_mutation_reader::consume
Make sure that flat_mutation_reader::consume stops
as it's asked by the consumer.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-11-08 14:26:10 +01:00
Piotr Jastrzebski
d42e53982d Add tests for flat_mutation_reader utils
Test flat_mutation_reader_from_mutations and
read_mutation_from_flat_mutation_reader.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-11-08 14:26:10 +01:00
Piotr Jastrzebski
4b58a05053 Introduce read_mutation_from_flat_mutation_reader
This helper method reads a single mutation from
a flat_mutation_reader.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-11-08 14:26:10 +01:00
Piotr Jastrzebski
6718ecab82 Make mutation_rebuilder streamed_mutation independent
mutation_rebuilder will be used not only with streamed_mutations
but also with flat_mutation_readers so it's better for it to be
independent from streamed_mutation.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-11-08 14:26:10 +01:00
Piotr Jastrzebski
aa16cd7eef flat_mutation_reader_from_mutation: support multiple mutations
Rename flat_mutation_reader_from_mutation to
flat_mutation_reader_from_mutations.

Make it work with std::vector<mutation> instead of a single
mutation.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-11-08 14:26:10 +01:00
Piotr Jastrzebski
bcd5415413 Introduce flat_mutation_reader::consume
This is equivalent to consume_flattened for old
mutation_reader.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-11-08 14:25:28 +01:00
Piotr Jastrzebski
9233ee7309 Move FlattenedConsumer concept to flat_mutation_reader.hh
This concept will be used both in flat_mutation_reader.hh
and mutation_reader.hh. mutation_reader.hh includes
flat_mutation_reader.hh so we have to move the concept to
make it accessible in both files.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-11-08 14:14:51 +01:00
Piotr Jastrzebski
864d02e795 Turn scanning_reader into flat_mutation_reader
This will make memtable::make_reader more efficient.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-11-08 14:08:53 +01:00
Tomasz Grabiec
9e115fb7e2 Merge addition of mutation_source::make_flat_mutation_reader() from Piotr
Make it possible for a mutation_source to be created both for sources
that use old mutation_reader and new flat_mutation_reader.

Add tests for flat_mutation_reader::next_partition to run_mutation_source_tests.

* seastar-dev.git 'dev/haaawk/flat_reader_mutation_source_v3':
  Remove mutation_reader.hh dependency from flat_mutation_reader.hh
  Prepare mutation_source for more than one implementation
  Add flat reader mutation source implementation
  Add mutation_source::make_flat_mutation_reader
  Use mutation_source::make_flat_mutation_reader in tests
  Add flat_mutation_reader_assertions
  Add test for flat_mutation_reader::next_partition
2017-11-08 14:05:25 +01:00
Piotr Jastrzebski
68505a5065 Change memtable_entry::read to return flat_mutation_reader
This is the first step to move scanning_reader to
be flat_mutation_reader.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-11-08 13:52:09 +01:00
Piotr Jastrzebski
7b016527bf Make iterator_reader independent from mutation_reader
iterator_reader will be used also in flat_mutation_reader.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-11-08 13:52:09 +01:00
Piotr Jastrzebski
f499949645 Introduce make_partition_snapshot_flat_reader
This allows creation of flat_mutation_reader from MVCC.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-11-08 13:52:09 +01:00
Piotr Jastrzebski
ced64d7571 Prepare partition_snapshot_flat_reader
This commit creates a copy of partition_snapshot_reader
and names it partition_snapshot_flat_reader.
This new class will be turned into a flat_mutation_reader
in the next commit.

The purpose of this commit is to make it easier to review the next
commit.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-11-08 13:52:09 +01:00