Commit Graph

2834 Commits

Author SHA1 Message Date
Tomasz Grabiec
7747f2dde3 Merge "nodetool toppartitions" from Rafi & Avi
Implementation of nodetool toppartiotion query, which samples most frequest PKs in read/write
operation over a period of time.

Content:
- data_listener classes: mechanism that interfaces with mutation readers in database and table classes,
- toppartition_query and toppartition_data_listener classes to implement toppartition-specific query (this
  interfaces with data_listeners and the REST api),
- REST api for toppartitions query.

Uses Top-k structure for handling stream summary statistics (based on implementation in C*, see #2811).

What's still missing:
- JMX interface to nodetool (interface customization may be required),
- Querying #rows and #bytes (currently, only #partitions is supported).

Fixes #2811

* https://github.com/avikivity/scylla rafie_toppartitions_v7.1:
  top_k: whitespace and minor fixes
  top_k: map template arguments
  top_k: std::list -> chunked_vector
  top_k: support for appending top_k results
  nodetool toppartitions: refactor table::config constructor
  nodetool toppartitions: data listeners
  nodetool toppartitions: add data_listeners to database/table
  nodetool toppartitions: fully_qualified_cf_name
  nodetool toppartitions: Toppartitions query implementation
  nodetool toppartitions: Toppartitions query REST API
  nodetool toppartitions: nodetool-toppartitions script
2018-12-28 16:31:24 +01:00
Rafi Einstein
0bffe5f83e nodetool toppartitions: add data_listeners to database/table
Add data_listeners member to database.
Adds data_listeners* to table::config, to be used by table methods to invoke listeners.
Install on_read() listener in table::make_reader().
Install on_write() listener in database::apply_in_memory().

Tests: Unit (release)
Signed-off-by: Rafi Einstein <rafie@scylladb.com>
2018-12-28 16:45:57 +02:00
Rafi Einstein
aeebe8e86b top_k: std::list -> chunked_vector
Replaced std::list with chunked_vector. Because chunked_vector requires
a noexcept move constructor from its value type, change the bad_boy type
in the unit test not to throw in the move constructor.

Signed-off-by: Rafi Einstein <rafie@scylladb.com>
2018-12-28 16:45:07 +02:00
Avi Kivity
8e2f6d0513 Merge "Fix use-after-free when destroying partition_snapshots in the background"from Tomasz
"
partition_snapshots created in the memtable will keep a reference to
the memtable (as region*) and to memtable::_cleaner. As long as the
reader is alive, the memtable will be kept alive by
partition_snapshot_flat_reader::_container_guard. But after that
nothing prevents it from being destroyed. The snapshot can outlive the
read if mutation_cleaner::merge_and_destroy() defers its destruction
for later. When the read ends after memtable was flushed, the snapshot
will be queued in the cache's cleaner, but internally will reference
memtable's region and cleaner. This will result in a use-after-free
when the snapshot resumes destruction.

The fix is to update snapshots's region and cleaner references at the
time of queueing to point to the cache's region and cleaner.

When memtable is destroyed without being moved to cache there is no
problem because the snapshot would be queued into memtable's cleaner,
which will be drained on destruction from all snapshots.

Introduced in f3da043 (in >= 3.0-rc1)

Fixes #4030.

Tests:

  - mvcc_test (debug)

"

* tag 'fix-snapshot-merging-use-after-free-v1.1' of github.com:tgrabiec/scylla:
  tests: mvcc: Add test_snapshot_merging_after_container_is_destroyed
  tests: mvcc: Introduce mvcc_container::migrate()
  tests: mvcc: Make mvcc_partition move-constructible
  tests: mvcc: Introduce mvcc_container::make_not_evictable()
  tests: mvcc: Allow constructing mvcc_container without a cache_tracker
  mutation_cleaner: Migrate partition_snapshots when queueing for background cleanup
  mvcc: partition_snapshot: Introduce migrate()
  mutation_cleaner: impl: Store a back-reference to the owning mutation_cleaner
2018-12-28 12:45:10 +02:00
Tomasz Grabiec
bb1c9cb6f3 tests: mvcc: Add test_snapshot_merging_after_container_is_destroyed 2018-12-28 10:32:39 +01:00
Tomasz Grabiec
4d13dea39a tests: mvcc: Introduce mvcc_container::migrate() 2018-12-28 10:32:39 +01:00
Tomasz Grabiec
676868ed31 tests: mvcc: Make mvcc_partition move-constructible 2018-12-28 10:32:39 +01:00
Tomasz Grabiec
c6798f7872 tests: mvcc: Introduce mvcc_container::make_not_evictable() 2018-12-28 10:32:39 +01:00
Tomasz Grabiec
1fa00656ea tests: mvcc: Allow constructing mvcc_container without a cache_tracker
Some test cases will need many containers to simulate memtable ->
cache transitions, but there can be only one cache_tracker per shard
due to metrics. Allow constructing a conatiner without a cache_tracker
(and thus non-evictable).
2018-12-28 10:32:39 +01:00
Benny Halevy
f104951928 sstable_test: read_file should open the file read-only
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20181218145156.12716-1-bhalevy@scylladb.com>
2018-12-25 12:02:46 +02:00
Rafael Ávila de Espíndola
f8c81d4d89 tests: sstables: mc: add tests with incompatible schemas
In one test the types in the schema don't match the types in the
statistics file. In another a column is missing.

The patch also updates the exceptions to have more human readable
messages.

Tests: unit (release)

Part of issue #3960.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20181219233046.74229-1-espindola@scylladb.com>
2018-12-25 11:11:54 +02:00
Rafi Einstein
75f21954d4 top_k: whitespace and minor fixes
Style and minor logic changes from code review.

Signed-off-by: Rafi Einstein <rafie@scylladb.com>
2018-12-20 16:41:33 +02:00
Tomasz Grabiec
2b55ab8c8e Merge "Add more extensive test for mutation reader fast-forwarding" from Paweł
Mutation readers allow fast-forwarding the ranges from which the data is
being read. The main user of this feature is cache which, when reading
from the underlying reader, may want to skip some data it already has.
Unsurprisingly, this adds more complexity to the implementation of the
readers and more edge cases the developers need to take care of.

While most of the readers were at least to some extent checked in this
area those test usually were quite isolated (e.g. one test doing
inter-partition fast-forwarding, another doing intra-partition
fast-forwarding) and as a consequence didn't cover many corner cases.

This patch adds a generic test for fast-forwarding and slicing that
covers more complicated scenarios when those operations are combined.
Needless to say that did uncover some problems, but fortunately none of
them is user-visible.

Fixes #3963.
Fixes #3997.

Tests: unit(release, debug)

* https://github.com/pdziepak/scylla.git test-fast-forwarding/v4.1:
  tests/flat_mutation_reader_assertions: accumulate received tombstones
  tests/flat_mutation_reader_assertions: add more test messages
  tests/flat_mutation_reader_assertions: relax has_monotonic_positions()
    check
  tests/mutation_readers: do not ignore streamed_mutation::forwarding
  Revert "mutation_source_test: add option to skip intra-partition
    fast-forwarding tests"
  memtable: it is not a single partition read if partition
    fast-forwaring is enabled
  sstables: add more tracing in mp_row_consumer_m
  row_cache: use make_forwardable() to implement
    streamed_mutation::forwarding
  row_cache: read is not single-partition if inter-partition forwarding
    is enabled
  row_cache: drop support for streamed_mutation::forwarding::yes
    entirely
  sstables/mp_row_consumer: position_range end bound is exclusive
  mutation_fragment_filter: handle streamed_mutation::forwarding::yes
    properly
  tests/mutation_reader: reduce sleeping time
  tests/memtable: fix partition_range use-after-free
  tests/mutation: fix partition range use-after-free
  flat_mutation_reader_from_mutations: add overload that accepts a slice
    and partition range
  flat_mutation_reader_from_mutations: fix empty range case
  flat_mutation_reader_from_mutations: destroy all remaining mutations
  tests/mutation_source: drop dropped column handling test
  tests/mutation_source: add test for complex fast_forwarding and
    slicing
2018-12-20 15:05:21 +01:00
Paweł Dziepak
3355d16938 tests/mutation_source: add test for complex fast_forwarding and slicing
While we already had tests that verified inter- and intra-partition
fast-forwarding as well as slicing, they had quite limited scope and
didn't combine those operations. The new test is meant to extensively
test these cases.
2018-12-20 13:27:25 +00:00
Paweł Dziepak
26a30375b1 tests/mutation_source: drop dropped column handling test
Schema changes are now covered by for_each_schema_change() function.
Having some additional tests in run_mutation_source_tests() is
problematic when it is used to test intermediate mutation readers
because schema changes may be irrelevant for them, which makes the test
a waste of time (might be a problem in debug mode) and requires those
intermediate reader to use more complex underlying reader that supports
schema changes (again, problem in a very slow debug mode).
2018-12-20 13:27:25 +00:00
Paweł Dziepak
93488209de tests/mutation: fix partition range use-after-free 2018-12-20 13:27:25 +00:00
Paweł Dziepak
e91165d929 tests/memtable: fix partition_range use-after-free 2018-12-20 13:27:25 +00:00
Paweł Dziepak
5db8dacd1f tests/mutation_reader: reduce sleeping time
It is a very bad taste to sleep anywhere in the code. The test should be
fixed to explicitly test various orderings between concurrent
operations, but before that happens let's at least readuce how much
those sleeps slow it down by changing it from milliseconds to
microseconds.
2018-12-20 13:27:25 +00:00
Paweł Dziepak
bcb5aed1ef Revert "mutation_source_test: add option to skip intra-partition fast-forwarding tests"
This reverts commit b36733971b. That commit made
run_mutation_reader_tests() support  mutation_sources that do not implement
streamed_mutation::forwarding::yes. This is wrong since mutation_sources
are not allowed to ignore or otherwise not support that mode. Moreover,
there is absolutely no reason for them to do so since there is a
make_forwardable() adapter that can make any mutation_reader a
forwardable one (at the cost of performance, but that's not always
important).
2018-12-20 13:27:25 +00:00
Paweł Dziepak
8706750b9b tests/mutation_readers: do not ignore streamed_mutation::forwarding
It is wrong to silently ignore streamed_mutation::forwarding option
which completely changes how the reader is supposed to operate. The best
solution is to use make_forwardable() adapter which changes
non-forwardable reader to a forwardable one.
2018-12-20 13:27:25 +00:00
Paweł Dziepak
edf2c71701 tests/flat_mutation_reader_assertions: relax has_monotonic_positions() check
Since 41ede08a1d "mutation_reader: Allow
range tombstones with same position in the fragment stream" mutation
readers emit fragments in non-decreasing order (as opposed to strictly
increasing), has_monotonic_posiitons() needs to be updated to allow
that.
2018-12-20 13:27:25 +00:00
Paweł Dziepak
787d1ba7b2 tests/flat_mutation_reader_assertions: add more test messages 2018-12-20 13:27:25 +00:00
Paweł Dziepak
593fb936c2 tests/flat_mutation_reader_assertions: accumulate received tombstones
Current data model employed by mutation readers doesn't have an unique
representation of range tombstones. This complicates testing by making
multiple ways of emitting range tombstones and rows equally valid.

This patch adds an option to verify mutation readers by checking whether
tombstones they emit properly affect the clustered rows regardless of how
exactly the tombstones are emitted. The interface of
flat_mutation_reader_assertions is extended by adding
may_produce_tombstones() that accepts any number of tombstones and
accumulates them. Then, produces_row_with_key() accepts an additional
argument which is the expected timestamp of the range tombstone that
affects that row.
2018-12-20 13:27:25 +00:00
Paweł Dziepak
e6d26a528f Merge "Optimize slicing sstable readers" from Tomasz
"
Contains several improvements for fast-forwarding and slicing readers. Mainly
for the MC format, but not only:

  - Exiting the parser early when going out of the fast-forwarding window [MC-format-only]
  - Avoiding reading of the head of the partition when slicing
  - Avoiding parsing rows which are going to be skipped [MC-format-only]
"

* 'sstable-mc-optimize-slicing-reads' of github.com:tgrabiec/scylla:
  sstables: mc: reader: Skip ignored rows before parsing them
  sstables: mc: reader: Call _cells.clear() when row ends rather than when it starts
  sstables: mc: mutation_fragment_filter: Take position_in_partition rather than a clustering_row
  sstables: mc: reader: Do not call consume_row_marker_and_tombstone() for static rows
  sstables: mc: parser: Allow the consumer to skip the whole row
  sstables: continuous_data_consumer: Introduce skip()
  sstables: continuous_data_consumer: Make position() meaningful inside state_processor::process_state()
  sstables: mc: parser: Allocate dynamic_bitset once per read instead of once per row
  sstables: reader: Do not read the head of the partition when index can be used
  sstables: mc: mutation_fragment_filter: Check the fast-forward window first
  sstables: mc: writer: Avoid calling unsigned_vint::serialized_size()
2018-12-20 12:48:22 +00:00
Duarte Nunes
776fdd4d1a service/storage_proxy: Expose local view update backlog
The local view update backlog is the max backlog out of the relative
memory backlog size and the relative hints backlog size.

We leverage the db::view::node_update_backlog class so we can send the
max backlog out of the node's shards.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2018-12-19 22:38:30 +00:00
Duarte Nunes
6662475dd9 tests/view_schema_test: Add simple test for db::view::node_update_backlog
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2018-12-19 22:38:30 +00:00
Rafael Ávila de Espíndola
ff18c837b7 tests: Add missing include in random-utils.hh
This file uses std::cout and so should include <iostream>.

Found with a patch to seastar that removes some redundant <iostream>
includes.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20181218183816.34504-1-espindola@scylladb.com>
2018-12-19 10:52:19 +00:00
Duarte Nunes
a7456db687 Merge 'Simplify natural endpoint calculation' from Calle
"
Implementation of origin change c000da13563907b99fe220a7c8bde3c1dec74ad5

Modifies network topology calculation, reducing the amount of
maps/sets used by applying the knowledge of how many replicas we
expect/need per dc and sharing endpoint and rack set (since we cannot have
overlaps).

Also includes a transposed origin test to ensure new calculation
matches the old one.

Fixes #2896
"

* 'calle/network_topology' of github.com:scylladb/seastar-dev:
  network_topology_test: Add test to verify new algorith results equals old
  network_topology_strategy: Simplify calculate_natural_endpoints
  token_metadata: Add "get_location" ip to dc+rack accessor
  sequenced_set: Add "insert" method, following std::set semantics
2018-12-19 09:39:29 +00:00
Rafael Ávila de Espíndola
b93d8d863d Add a test with mismatched timestamps.
Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20181218035931.3554-1-espindola@scylladb.com>
2018-12-18 11:30:56 +01:00
Tomasz Grabiec
fb15759934 sstables: reader: Do not read the head of the partition when index can be used
read_partition() was always called through read_next_partition(), even
if we're at the beginning of the read.  read_next_partition() is
supposed to skip to the next partition. It still works when we're
positioned before a partition, it doesn't advance the consumer, but it
clears _index_in_current_partition, because it (correctly) assumes it
corresponds to the partition we're about to leave, not the one we're
about to enter.

This means that index lookups we did in the read initializer will be
disregarded when reading starts, and we'll always start by reading
partition data from the data file. This is suboptimal for reads which
are slicing a large partition and don't need to read the front of the
partition.

Regression introduced in 4b9a34a854.

The fix is to call read_partition() directly when we're positioned at
the beginning of the partition. For that purpose a new flag was
introduced.

test_no_index_reads_when_rows_fall_into_range_boundaries has to be
relaxed, because it assumed that slicing reads will read the head of
the partition.

Refs #3984
Fixes #3992

Tested using:

 ./build/release/tests/perf/perf_fast_forward_g \
     --sstable-format=mc \
     --datasets large-part-ds1 \
     --run-tests=large-partition-slicing-clustering-keys

Before (focus on aio):

offset  read      time (s)     frags     frag/s    mad f/s    max f/s    min f/s    aio      (KiB) blocked dropped  idx hit idx miss  idx blk    c hit   c miss    c blk    cpu
4000000 1         0.001378         1        726          5        736        102      6        200       4       2        0        1        1        0        0        0  65.8%

After:

offset  read      time (s)     frags     frag/s    mad f/s    max f/s    min f/s    aio      (KiB) blocked dropped  idx hit idx miss  idx blk    c hit   c miss    c blk    cpu
4000000 1         0.001290         1        775          6        788        716      2        136       2       0        0        1        1        0        0        0  69.1%
2018-12-18 11:11:37 +01:00
Calle Wilund
e353a8633a network_topology_test: Add test to verify new algorith results equals old
Transposed from origin unit test.

Creates a semi-random topology of racks, dcs, tokens and replication
factors and verifies endpoint calculation equals old algo.
2018-12-17 13:10:59 +00:00
Botond Dénes
5780f2ce7a querier_cache: check that the query wasn't evicted during registering
The reader concurrency semaphore can evict the querier when it is
registered as an inactive read. Make the `querier_cache` aware of this
so that it doesn't continue to process the inserted querier when this
happens.
Also add a unit test for this.
2018-12-17 13:18:08 +02:00
Botond Dénes
e1d8237e6b reader_concurrency_semaphore: use the correct types in the constructor
Previously there was a type mismatch for `count` and `memory`, between
the actual type used to store them in the class (signed) and the type
of the parameters in the constructor (unsigned).
Although negative numbers are completely valid for these members,
initializing them to negative numbers don't make sense, this is why they
used unsigned types in the constructor. This restriction can backfire
however when someone intends to give these parameters the maximum
possible value, which, when interpreted as a signed value will be `-1`.
What's worse the caller might not even be aware of this unsigned->signed
conversion and be very suprised when they find out.
So to prevent surprises, expose the real type of these members, trusting
the clients of knowing what they are doing.

Also add a `no_limits` constructor, so clients don't have to make sure
they don't overflow internal types.
2018-12-17 13:18:08 +02:00
Rafael Ávila de Espíndola
4de14e6143 Add tests on broken mc range tombstones.
This tests that we diagnose both two consecutive range starts and two
consecutive range ends.
Message-Id: <20181214212608.95452-1-espindola@scylladb.com>
2018-12-15 13:53:25 +01:00
Avi Kivity
b023e8b45d Merge " Extract MC sstable writer to a separate compilation unit" from Tomasz
"
The motivation is to keep code related to each format separate, to make it
easier to comprehend and reduce incremental compilation times.

Also reduces dependency on sstable writer code by removing writer bits from
sstales.hh.

The ka/la format writers are still left in sstables.cc, they could be also extracted.
"

* 'extract-sstable-writer-code' of github.com:tgrabiec/scylla:
  sstables: Make variadic write() not picked on substitution error
  sstables: Extract MC format writer to mc/writer.cc
  sstables: Extract maybe_add_summary_entry() out of components_writer
  sstables: Publish functions used by writers in writer.hh
  sstables: Move common write functions to writer.hh
  sstables: Extract sstable_writer_impl to a header
  sstables: Do not include writer.hh from sstables.hh
  sstables: mc: Extract bound_kind_m related stuff into mc/types.hh
  sstables: types: Extract sstable_enabled_features::all()
  sstables: Move components_writer to .cc
  tests: sstable_datafile_test: Avoid dependency on components_writer
2018-12-14 15:05:00 +02:00
Rafael Ávila de Espíndola
f48d54543f Use read_rows_flat to test broken sstables.
The previous code was using mp_row_consumer_k_l to be as close to the
tested code as possible.

Given that it is testing for an unhandled exception, there is probably
more value in moving it to a higher level, easier to use, API.

This patch changes it to use read_rows_flat().

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20181210235016.41133-1-espindola@scylladb.com>
2018-12-14 10:14:28 +01:00
Tomasz Grabiec
245a0d953a tests: cql_test_env: Start the compaction manager
Broken in fee4d2e

Not doing this results in compaction requests being ignored.

One effect of this is that perf_fast_forward produces many sstables instead of one.

Refs #3984
Refs #3983

Message-Id: <1544719540-10178-1-git-send-email-tgrabiec@scylladb.com>
2018-12-13 18:58:50 +02:00
Rafael Ávila de Espíndola
51fd880892 Add tests for broken start and end composite markers. 2018-12-13 10:29:44 +01:00
Tomasz Grabiec
eff47a59ee tests: sstable_datafile_test: Avoid dependency on components_writer
It's LA format specific and it's going to become private to sstable.cc
2018-12-12 12:06:22 +01:00
Nadav Har'El
a0379209e6 secondary indexes: fail attempts to create a CUSTOM INDEX
Cassandra supports a "CREATE CUSTOM INDEX" to create a secondary index
with a custom implementation. The only custom implementation that Cassandra
supports is SASI. But Scylla doesn't support this, or any other custom
index implementation. If a CREATE CUSTOM INDEX statement is used, we
shouldn't silently ignore the "CUSTOM" tag, we should generate an error.

This patch also includes a regression test that "CREATE CUSTOM INDEX"
statements with valid syntax fail (before this patch, they succeeded).

Fixes #3977

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20181211224545.18349-2-nyh@scylladb.com>
2018-12-11 23:33:02 +00:00
Tomasz Grabiec
1dd2bf52ca Merge "Add a couple of tests of broken sstables" From Rafael
These are the current uninteresting cases I found when looking at
malformed_sstable_exception. The existing code is working, just not
being tested.

* https://github.com/espindola/scylla.git espindola/espindola/broken-sst:
  Add a broken sstable test.
  Add a test with mismatched schema.
2018-12-10 19:30:58 +01:00
Tomasz Grabiec
538e041f22 Merge "Remove some dependencies on db::config" from Avi
db::config is a global class; changes in any module can cause changes
in db::config. Therefore, it is a cause of needless recompilation.

Remove some of these dependencies by having consumers of db::config
declare an intermediate config struct that is contains only
configuration of interest to them, and have their caller fill it out
(in the case of auth, it already followed this scheme and the patchset
only moves the translation function).

In addition, some outright pointless inclusions of db/config.hh are
removed.

The result is somewhat shorter compile times, and fewer needless
recompiles.

* https://github.com/avikivity/scylla unconfig-1/v1:
  config: remove inclusions of db/config.hh from header files
  repair: remove unneeded config.hh inclusion
  batchlog_manager: remove dependency on db::config
  auth: remove permissions_cache dependency on db::config
  auth: remove auth::service dependency on db::config
  auth: remove unneeded db/config.hh includes
2018-12-10 14:53:14 +01:00
Avi Kivity
475b151c97 Merge "Use utils::small_vector more in read path" from Paweł
"
This series optimises the read path by replacing some usages of
std::vector by utils::small_vector. The motivation for this change was
an observation that memory allocation functions are pointed out by the
profiler as the ones where we spent most time and while they have a
large number of callers storage allocation for some vectors was close to
the top. The gains are not huge, since the problem is a lot of things
adding up and not a single slow thing, but we need to start with
something.

Unfortunately, the performance of boost::container::small_vector is
quite disappointing so a new implementation of a small_vector was
introduced.

perf_simple_query -c4 --duration 60, medians:

       ./perf_before  ./perf_after  diff
 read      343086.80     360720.53  5.1%

Tests: unit(release, small_vector in debug)
"

* tag 'small_vector/v2.1' of https://github.com/pdziepak/scylla:
  partition_slice: use small_vector for column_ids
  mutation_fragment_merger: use small_vector
  auth: use small_vector in resource
  auth: avoid list-initialisation of vectors
  idl: serialiser: add serialiser for utils::small_vector
  idl: serialiser: deduplicate vector serialisers
  utils: introduce small_vector
  intrusive_set_external_comparator: make iterator nothrow move constructible
  mutation_fragment_merger: value-initialise iterator
2018-12-10 13:50:59 +02:00
Avi Kivity
904db433d9 Merge "Re-use commitlog segments" from Calle
"
Refs #3929

Enables re-use of commitlog segments.

First, ensures we never succeed playing back a commitlog
segment with name not matching the ID:s in the actual
file data, by determining expected id based on file name.
This will also handle partially written re-used files, as
each chunk headers CRC is dependent on the ID, and will
fail once we hit any left-overs.

Second part renamed and puts files into a recycle list
instead of actually deleting them when finished.
Allocating new files will the prioritize this list
before creating a new file.

Note that since consumtion and release of segments can
be somewhat unbalanced, this does not really guarantee
we will use recycled files even in all cases when it
might be possible, simply because of timing. It does
however give a good chance of it.

We limit recycled files based on the max disk size
setting, thus we can potentially grow disk size
more than without depending on timing, but not
uncontrolled.

While all this theoretially might improve disk
writes in some cases, it is far from any magic bullet.
No real performance testing has been done yet, only
functional.
"

* 'calle/commitlog-reuse' of github.com:scylladb/seastar-dev:
  commitlog: Recycle used segments instead of delete + new file
  commitlog: Terminate all segments with a zero chunk
  commitlog_replay: Enforce file name based id matching
2018-12-10 11:15:02 +02:00
Calle Wilund
b35af84599 commitlog_replay: Enforce file name based id matching
When reading the header chunk of a commitlog file, check the stored id
value against the id derived from the file name, and ignore if
mismatched. This is a prerequisite for re-using renamed commitlog files,
as we can then fail-fast should one such be left on disk, instead of
trying to replay it.

We also check said id via the CRC check for each chunk parsed. If we
find a chunk with
mismatched id, we will get a CRC error for the chunk, and replay will
terminate (albeit not gracefully).
2018-12-10 09:09:07 +00:00
Avi Kivity
40677fae37 Merge "Compaction strategy aware major compaction" from Raphael
"
Make major compaction aware of compaction strategy, by using an
optimal approach which suits the strategy needs.

Refs #1431.
"

* 'compaction_strategy_aware_major_compaction_v2' of github.com:raphaelsc/scylla:
  tests: add test for compaction-strategy-aware major compaction
  compaction: implement major compaction heuristic for leveled strategy
  compaction: introduce notion of compaction-strategy-aware major compaction
2018-12-10 10:10:22 +02:00
Avi Kivity
89be47e291 batchlog_manager: remove dependency on db::config
Extract configuration into a new struct batchlog_manager_config and have the
callers populate it using db::config. This reduces dependencies on global objects.
2018-12-09 20:11:38 +02:00
Avi Kivity
864f55e745 config: remove inclusions of db/config.hh from header files
Instead, distribute those inclusions to .cc files that require them. This
reduces rebuilds when config.hh changes, and makes it easier to locate files
that need config disaggregation.
2018-12-09 20:11:38 +02:00
Tomasz Grabiec
b78d98a358 tests: perf_fast_forward: Fix result_collector::add() for multi-element results
The results vector should be populated vertically, not horizontally.

Responsible for assertion failure with --cache-enabled:

  void result_collector::add(test_result_vector): Assertion `rs.size() == results.size()' failed.

Introduced in 3fc78a25bf.
Message-Id: <1544105835-24530-2-git-send-email-tgrabiec@scylladb.com>
2018-12-07 12:44:32 +00:00
Tomasz Grabiec
10cde9ae50 tests: perf_fast_forward: Fix live_range not being initialized
Broken in 470552b7ab

Causes test failure when running with --cache-enabled
Message-Id: <1544105835-24530-1-git-send-email-tgrabiec@scylladb.com>
2018-12-07 12:38:01 +00:00