Commit Graph

900 Commits

Author SHA1 Message Date
Avi Kivity
d5e94ab224 test: partition_data_test: don't capture structured bindings in lambdas
Clang does not yet implement p1091r3, which allows lambdas
to capture structured bindings. To accommodate it, don't
use structured bindings for variables that are later
captured.
2020-10-16 15:24:45 +03:00
Avi Kivity
77d54410d0 test: querier_cache_test: don't capture structured bindings in lambdas
Clang does not yet implement p1091r3, which allows lambdas
to capture structured bindings. To accommodate it, don't
use structured bindings for variables that are later
captured.
2020-10-16 15:24:37 +03:00
Avi Kivity
b406af2556 test: mutation_test: don't capture structured bindings in lambdas
Clang does not yet implement p1091r3, which allows lambdas
to capture structured bindings. To accommodate it, don't
use structured bindings for variables that are later
captured.
2020-10-16 15:24:28 +03:00
Tomasz Grabiec
f893516e55 Merge "lwt: store column_mapping's for each table schema version upon a DDL change" from Pavel Solodovnikov
This patch introduces a new system table: `system.scylla_table_schema_history`,
which is used to keep track of column mappings for obsolete table
schema versions (i.e. schema becomes obsolete when it's being changed
by means of `CREATE TABLE` or `ALTER TABLE` DDL operations).

It is populated automatically when a new schema version is being
pulled from a remote in get_schema_definition() at migration_manager.cc
and also when a schema change is being propagated to system schema tables
in do_merge_schema() at schema_tables.cc.

The data referring to the most recent table schema version is always
present. Other entries are garbage-collected when the corresponding
table schema version is obsoleted (they will be updated with a TTL equal
to `DEFAULT_GC_GRACE_SECONDS` on `ALTER TABLE`).

In case we failed to persist column mapping after a schema change,
missing entries will be recreated on node boot.

Later, the information from this table is used in `paxos_state::learn`
callback in case we have a mismatch between the most recent schema
version and the one that is stored inside the `frozen_mutation`
for the accepted proposal.

Such a situation may arise under the following circumstances:
 1. The previous LWT operation crashed on the "accept" stage,
    leaving behind a stale accepted proposal, which waits to be
    repaired.
 2. The table affected by LWT operation is being altered, so that
    schema version is now different. Stored proposal now references
    obsolete schema.
 3. LWT query is retried, so that Scylla tries to repair the
    unfinished Paxos round and apply the mutation in the learn stage.

When such a mismatch happened, prior to this patch the stored
`frozen_mutation` could be applied only if we were lucky enough
and the column_mapping in the mutation was "compatible" with the new
table schema.

It wouldn't work if, for example, the columns are reordered, or
some columns, which are referenced by an LWT query, are dropped.

With this patch we try to look up the column mapping for
the obsolete schema version, then upgrade the stored mutation
using the obtained column mapping and apply the upgraded mutation instead.

* git@github.com:ManManson/scylla.git feature/table_schema_history_v7:
  lwt: add column_mapping history persistence tests
  schema: add equality operator for `column_mapping` class
  lwt: store column_mapping's for each table schema version upon a DDL change
  schema_tables: extract `fill_column_info` helper
  frozen_mutation: introduce `unfreeze_upgrading` method
2020-10-15 20:48:29 +02:00
Pavel Solodovnikov
b59ac032c9 lwt: add column_mapping history persistence tests
There are two basic tests, which verify that:
 * Column mappings are serialized and deserialized
   properly on both CREATE TABLE and ALTER TABLE
 * Column mappings for obsoleted schema versions are updated
   with a TTL value on schema change

Tests: unit(dev)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2020-10-15 19:25:24 +03:00
Tomasz Grabiec
62d2979888 Merge "raft: snapshot support" from Gleb
Support snapshotting for raft. The patch series only concerns itself
with raft logic, not with how a specific state machine implements
the take_snapshot() callback.

* scylla-dev/raft-snapshots-v2:
  raft: test: add tests for snapshot functionality
  raft: preserve trailing raft log entries during snapshotting
  raft: implement periodic snapshotting of a state machine
  raft: add snapshot transfer logic
2020-10-15 12:45:30 +02:00
Gleb Natapov
36c67aef8b raft: test: add tests for snapshot functionality
The patch adds two tests: one for snapshot transfer and another for
snapshot generation.
2020-10-15 11:50:27 +03:00
Nadav Har'El
509a41db04 alternator: change name of Alternator's SSL options
When Alternator is enabled over HTTPS - by setting the
"alternator_https_port" option - it needs to know some SSL-related options,
most importantly where to pick up the certificate and key.

Before this patch, we used the "server_encryption_options" option for that.
However, this was a mistake: Although it sounds like these are the "server's
options", in fact prior to Alternator this option was only used when
communicating with other servers - i.e., connections between Scylla nodes.
For CQL connections with the client, we used a different option -
"client_encryption_options".

This patch introduces a third option "alternator_encryption_options", which
controls only Alternator's HTTPS server. Making it separate from the
existing CQL "client_encryption_options" allows both Alternator and CQL to
be active at the same time but with different certificates (if the user
so wishes).

For backward compatibility, we temporarily continue to allow
server_encryption_options to control the Alternator HTTPS server if
alternator_encryption_options is not specified. However, this generates
a warning in the log, urging the user to switch. This temporary workaround
should be removed in a future version.

This patch also:
1. fixes the test run code (which has an "--https" option to test over
   https) to use the new name of the option.
2. Adds documentation of the new option in alternator.md and protocols.md -
   previously the information on how to control the location of the
   certificate was missing from these documents.

Fixes #7204.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200930123027.213587-1-nyh@scylladb.com>
2020-10-14 18:13:57 +03:00
Calle Wilund
83339f4bac Alternator::streams: Make SequenceNumber monotonically growing
Fixes #7424

AWS sdk (kinesis) assumes SequenceNumbers are monotonically
growing bigints. Since we sort on and use timeuuids, a "raw"
bit representation of these will _not_ fulfill the
requirement. However, we can "unwrap" the timestamp from the
uuid msb and give the value as timestamp<<64|lsb, which
ensures sort order == bigint order.
2020-10-14 16:45:21 +03:00
Calle Wilund
3f800d68c6 alternator::streams: Ensure shards are reported in string lexical order
Fixes #7409

AWS kinesis Java sdk requires/expects shards to be reported in
lexical order and, even worse, ignores lastevalshard. Thus, not
upholding said order will break their stream introspection badly.

Added asserts to unit tests.

v2:
* Added more comments
* use unsigned_cmp
* unconditional check in streams_test
2020-10-14 16:45:21 +03:00
Benny Halevy
b3f46e9cbf test: serialized_action_test: add test_serialized_action_exception
Tests that the exceptional future returned by the serialized action
is propagated to trigger, reproducing #7352.

The test fails without the previous patch:
"serialized_action: trigger: include also semaphore status to promise"

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-10-14 16:45:21 +03:00
Avi Kivity
86bbf1763d Merge "reader concurrency semaphore: dump permit diagnostics on timeout or queue overflow" from Botond
"
The reader concurrency semaphore timing out or its queue overflowing
are fairly common events both in production and in testing. At the same
time it is a hard to diagnose problem that often has a benign cause
(especially during testing), but it is equally possible that it points
to something serious. So when this error starts to appear in logs,
usually we want to investigate and the investigation is lengthy...
either involves looking at metrics or coredumps or both.

This patch intends to jumpstart this process by dumping diagnostics on
semaphore timeout or queue overflow. The diagnostics are printed to the
log at debug level to avoid excessive spamming. They contain a
histogram of all the permits associated with the problematic semaphore,
organized by table, operation and state.

Example:

DEBUG 2020-10-08 17:05:26,115 [shard 0] reader_concurrency_semaphore -
Semaphore _read_concurrency_sem: timed out, dumping permit diagnostics:
Permits with state admitted, sorted by memory
memory  count   name
3499M   27      ks.test:data-query

3499M   27      total

Permits with state waiting, sorted by count
count   memory  name
1       0B      ks.test:drain
7650    0B      ks.test:data-query

7651    0B      total

Permits with state registered, sorted by count
count   memory  name

0       0B      total

Total: permits: 7678, memory: 3499M

This allows determining several things at a glance:
* What are the tables involved
* What are the operations involved
* Where is the memory

This can speed up a follow-up investigation greatly, or it can even be
enough on its own to determine that the issue is benign.

Tests: unit(dev, debug)
"

* 'dump-diagnostics-on-semaphore-timeout/v2' of https://github.com/denesb/scylla:
  reader_concurrency_semaphore: dump permit diagnostics on timeout or queue overflow
  utils: add to_hr_size()
  reader_concurrency_semaphore: link permits into an intrusive list
  reader_concurrency_semaphore: move expiry_handler::operator()() out-of-line
  reader_concurrency_semaphore: move constructors out-of-line
  reader_concurrency_semaphore: add state to permits
  reader_concurrency_semaphore: name permits
  querier_cache_test: test_immediate_evict_on_insert: use two permits
  multishard_combining_reader: reader_lifecycle_policy: add permit param to create_reader()
  multishard_combining_reader: add permit parameter
  multishard_combining_reader: shard_reader: use multishard reader's permit
2020-10-13 12:44:23 +03:00
Botond Dénes
ff623e70b3 reader_concurrency_semaphore: name permits
Require a schema and an operation name to be given to each permit when
it is created. The schema is that of the table the read is executed
against, and the operation name identifies the operation the permit is
part of. Ideally the name should be different for each site the permit
is created at, to be able to discern not only different kinds of reads,
but also the different code paths a read took.

As not all reads can be associated with one schema, the schema is allowed
to be null.

The name will be used for debugging purposes, both for coredump
debugging and runtime logging of permit-related diagnostics.
2020-10-13 12:32:13 +03:00
Botond Dénes
40c5474022 querier_cache_test: test_immediate_evict_on_insert: use two permits
The test currently uses a single permit shared between two simulated
reads (to wait admission twice). This is not a supported way of using a
permit and will stop working soon as we make the states the permit is in
more pronounced.
2020-10-12 15:56:56 +03:00
Botond Dénes
307cdf1e0d multishard_combining_reader: reader_lifecycle_policy: add permit param to create_reader()
Allow the evictable reader managing the underlying reader to pass its
own permit to it when creating it, making sure they share the same
permit. Note that the two parts can still end up using different
permits, when the underlying reader is kept alive between two pages of a
paged read and thus keeps using the permit received on the previous
page.

Also adjust the `reader_context` in multishard_mutation_query.cc to use
the passed-in permit instead of creating a new one when creating a new
reader.
2020-10-12 15:56:56 +03:00
Botond Dénes
e09ab09fff multishard_combining_reader: add permit parameter
Don't create our own permit; take one as a parameter, like all other
readers do, so the permit can be provided by the higher layer, making
sure all parts of the logical read use the same permit.
2020-10-12 15:56:56 +03:00
Gleb Natapov
9d7c81c1b8 raft: fix boost/raft_fsm_test compilation
Message-Id: <20201011063802.GA2628121@scylladb.com>
2020-10-12 12:09:21 +02:00
Nadav Har'El
977da3567f Merge 'Alternator streams: Fix shard lengths, parenting, expiration, filter useless ones and improve paging' from Calle Wilund
The remains of the defunct #7246.

Fixes #7344
Fixes #7345
Fixes #7346
Fixes #7347

Shard ID length is now within limits.
Shard end sequence number should be set when appropriate.
Shard parent is selected a bit more carefully (sorting)
Shards are filtered by time to exclude cdc generations we cannot get data from (too old)
Shard paging improved

Closes #7348

* github.com:scylladb/scylla:
  test_streams: Add some more sanity asserts
  alternator::streams: Set dynamodb data TTL explicitly in cdc options
  alternator::streams: Improve paging and fix parent-child calculation
  alternator::streams: Remove table from shard_id
  alternator::streams: Filter out cdc streams older than data/table
  alternator::error: Add a few dynamo exception types
2020-10-12 09:43:12 +03:00
Avi Kivity
4d6739c2e6 Merge "Use max_concurrent_for_each" from Benny
"
max_concurrent_for_each was added to seastar for replacing
sstable_directory::parallel_for_each_restricted by using
more efficient concurrency control that doesn't create an
unlimited number of continuations.

The series replaces the use of sstable_directory::parallel_for_each_restricted
with max_concurrent_for_each and exposes the sstable_directory::do_for_each_sstable
via a static method.

This method is used here by table::snapshot to limit the concurrency
of snapshot operations, which suffer from the same unbounded
concurrency problem sstable_directory solved.

In addition, sstable_directory::_load_semaphore, which was used
across calls to do_for_each_sstable, was replaced by a static per-shard
semaphore that caps concurrency across all calls to `do_for_each_sstable`
on that shard.  This makes sense since the disk is a shared resource.

In the future, we may want to have a load semaphore per device rather than
a single global one.  We should experiment with that.

Test: unit(dev)
"

* tag 'max_concurrent_for_each-v5' of github.com:bhalevy/scylla:
  table: snapshot: use max_concurrent_for_each
  sstable_directory: use an external load_semaphore
  test: sstable_directory_test: extract sstable_directory creation into with_sstable_directory
  distributed_loader: process_upload_dir: use initial_sstable_loading_concurrency
  sstables: sstable_directory: use max_concurrent_for_each
2020-10-12 09:43:12 +03:00
Avi Kivity
610fa83f28 test: database_test: fix threading confusion
database_test contains several instances of calling do_with_cql_test_env()
with a function that expects to be called in a thread. This mostly works
because there is an internal thread in do_with_cql_test_env(), but is not
guaranteed to.

Fix by switching to the more appropriate do_with_cql_test_env_thread().

Closes #7333
2020-10-11 17:44:30 +03:00
Avi Kivity
58e02c216a test: sstable_datafile_test: sstable_run_based_compaction_test: prevent use of uninitialized variable observer
The variable 'observer' (an std::optional) may be left uninitialized
if 'incremental_enabled' is false. However, it is used afterwards
with a call to disconnect, accessing garbage.

Fix by accessing it via the optional wrapper. A call to optional::reset()
destroys the observable, which in turn calls disconnect().

Closes #7380
2020-10-11 17:36:08 +03:00
Avi Kivity
15ab6a3feb test: cql_repl: use boost::regex instead of std::regex to avoid stack overflow
libstdc++'s std::regex uses recursion[1], with a depth controlled by the
input. Together with clang's debug mode, this overflows the stack.

Use boost::regex instead, which is immune to the problem.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86164

Closes #7378
2020-10-11 17:12:21 +03:00
Avi Kivity
882ed2017a test: network_topology_strategy_test: fix overflow in d2t()
d2t() scales a fraction in the range [0, 1] to the range of
a biased token (same as unsigned long). But x86 doesn't support
conversion to unsigned, only signed, so this is a truncating
conversion. Clang's ubsan correctly warns about it.

Fix by reducing the range before converting, and expanding it
afterwards.

Closes #7376
2020-10-11 16:05:02 +03:00
Alejo Sanchez
5d408082b6 raft: log failed test case name
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2020-10-09 15:50:47 +02:00
Alejo Sanchez
664b3eddb1 raft: test add hasher
Values seen by nodes were so far simply added up, but this does not
guarantee that the order of these values was respected.

Use a digest to check output, implicitly checking order.

On the other hand, a sum or a simple positional checksum like Fletcher's
is easier to debug, as the rolling sum is evident.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2020-10-09 15:50:42 +02:00
Alejo Sanchez
670824c6fa raft: declarative tests
To make writing Raft tests convenient, use declarative structures.

Servers are set up and initialized and then updates are processed.
For now, updates are just adding entries to leader and change of leader.

Updates and leader changes can be specified to run after initial test setup.

An example test with 3 nodes: node 0 starts as leader with two entries
(0 and 1) for term 1, and the current term is 2; the test then adds 12
entries, changes the leader to node 1, and adds 12 more entries. The
test will automatically add more entries to the last leader until it
reaches the test limit of total_values (default 100).

    {.name = "test_name", .nodes = 3, .initial_term = 2,
     .initial_states = {{.le = {{1,0},{1,1}}}},
     .updates = {entries{12},new_leader{1},entries{12}}}

The leader is isolated before a change by having is_leader return false.
The initial leader (default: server 0) is set with this method, too.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2020-10-09 15:50:31 +02:00
Alejo Sanchez
7d4b33d834 raft: test make app return proper exit int value
Seastar app returns int result exit value.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2020-10-09 15:50:24 +02:00
Alejo Sanchez
093bc8fbb3 raft: test add support for disconnected server
Failure detector support of disconnected servers with a global set of
addresses.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2020-10-09 15:50:02 +02:00
Alejo Sanchez
21d7686766 raft: tests use custom server ids for easier debugging
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2020-10-09 15:49:57 +02:00
Alejo Sanchez
56683ae689 raft: test remove unnecessary header
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2020-10-09 15:49:45 +02:00
Alejo Sanchez
1bff357816 raft: fix typo snaphot snapshot
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2020-10-09 15:49:39 +02:00
Benny Halevy
57cc5f6ae1 sstable_directory: use an external load_semaphore
Although each sstable_directory limits concurrency using
max_concurrent_for_each, there could be a large number
of calls to do_for_each_sstable running in parallel
(e.g. per keyspace X per table in the distributed_loader).

To cap parallelism across sstable_directory instances and
concurrent calls to do_for_each_sstable, start a sharded<semaphore>
and pass a shared semaphore& to the sstable_directory instances.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-10-08 11:57:06 +03:00
Benny Halevy
dc46aaa3fd test: sstable_directory_test: extract sstable_directory creation into with_sstable_directory
Use common code to create, start, and stop the sharded<sstable_directory>
for each test.

This will be used in the next patch for creating a sharded semaphore
and passing it to the sstable_directory.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-10-08 11:57:06 +03:00
Gleb Natapov
0bff15a976 raft: Send multiple entries in one append_entry rpc
Send more than one entry in a single append_entry message, but
limit the packet size according to the append_request_threshold parameter.

Message-Id: <20201007142602.GA2496906@scylladb.com>
2020-10-07 16:43:33 +02:00
Calle Wilund
349c5ee21a test_streams: Add some more sanity asserts
Checking validity of returned shard sets etc.
2020-10-07 08:43:39 +00:00
Calle Wilund
3cdd7fe191 alternator::streams: Remove table from shard_id
Fixes #7344

This data is not really needed, as shard_ids are not required
to be unique across streams, and also because of the length limit
on the shard_id text representation.

As a side effect, the shard iterator now carries the stream arn instead.
2020-10-07 08:43:39 +00:00
Avi Kivity
c6a3fa5a49 Merge "querier_cache: use the querier's permit for memory accounting" from Botond
"
The querier cache has a memory based eviction mechanism, which starts
evicting freshly inserted queriers once their collective memory
consumption goes above the configured limit. For determining the memory
consumption of individual queriers, the querier cache uses
`flat_mutation_reader::buffer_size()`. But we now have a much more
comprehensive accounting of the memory used by queriers: the reader
permit, which also happens to be available in each querier. So use this
to determine the querier's memory consumption instead.

Tests: unit(dev)
"

* 'querier-cache-use-permit-for-memory-accounting/v1' of https://github.com/denesb/scylla:
  flat_mutation_reader: de-virtualize buffer_size()
  querier_cache: use the reader permit for memory accounting
  querier_cache_test: use local semaphore not the test global one
  reader_permit: add consumed_resources() accessor
2020-10-06 16:52:44 +03:00
Tomasz Grabiec
46b7ba8809 Merge "Bring memory footprint test back to work" from Pavel Emelyanov
The test was broken by the recent sstables manager rework. In the middle
of the test, the sstables::test_env is destroyed without being closed,
which trips the _closing assertion inside ~sstables_manager().

Fix is to use the test_env::do_with helper.

tests: perf.memory_footprint

* https://github.com/xemul/scylla/tree/br-memory-footprint-test-fix:
  test/perf/memory_footprint: Fix indentation after previous patch
  test/perf/memory_footprint: Don't forget to close sstables::test_env after usage
2020-10-06 11:49:03 +02:00
Pavel Emelyanov
8bceb916ea test/perf/memory_footprint: Fix indentation after previous patch
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-10-06 11:08:09 +03:00
Pavel Emelyanov
3e4de0f748 test/perf/memory_footprint: Don't forget to close sstables::test_env after usage
After the recent sstables manager rework, the sstables::test_env must be
.close()d after usage, otherwise ~sstables_manager() hits the _closing assertion.

Do it with the help of .do_with(). The execution context is already seastar::async
in this place, so .get() it explicitly.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-10-06 11:06:35 +03:00
Botond Dénes
dd372c8457 flat_mutation_reader: de-virtualize buffer_size()
The main user of this method, the one which required this method to
return the collective buffer size of the entire reader tree, is now
gone. The remaining two users just use it to check the size of the
reader instance they are working with.
So de-virtualize this method and reduce its responsibility to just
returning the buffer size of the current reader instance.
2020-10-06 08:22:56 +03:00
Botond Dénes
f7eea06f61 querier_cache_test: use local semaphore not the test global one
In the mutation source, which creates the reader for this test, the
global test semaphore's permit was passed to the created reader
(`tests::make_permit()`). This caused reader resources to be accounted
on the global test semaphore, instead of the local one the test creates.
Just forward the permit passed to the mutation sources to the reader to
fix this.
2020-10-06 08:22:56 +03:00
Nadav Har'El
421f0c729d merge: counters: Avoid signed integer overflow
Merged patch series by Tomasz Grabiec:

UBSAN complains in debug mode when the counter value overflows:

  counters.hh:184:16: runtime error: signed integer overflow: 1 + 9223372036854775807 cannot be represented in type 'long int'
  Aborting on shard 0.

Overflow is supposed to be supported. Let's silence it by using casts.

Fixes #7330.

Tests:

  - build/debug/test/tools/cql_repl --input test/cql/counters_test.cql

Tomasz Grabiec (2):
  counters: Avoid signed integer overflow
  test: cql: counters: Add tests reproducing signed integer overflow in
    debug mode

 counters.hh                   |  2 +-
 test/cql/counters_test.cql    |  9 ++++++++
 test/cql/counters_test.result | 48 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 58 insertions(+), 1 deletion(-)
2020-10-05 21:43:19 +03:00
Tomasz Grabiec
f01ffe063a test: cql: counters: Add tests reproducing signed integer overflow in debug mode
Reproduces #7330
2020-10-05 20:06:34 +02:00
Alejo Sanchez
bb67d15e2f Raft: disable boost tests for now
Disable raft fsm boost tests until raft is part of build.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2020-10-02 14:03:01 +02:00
Alejo Sanchez
4e26dad3a0 Raft: Remove tests for now
Remove raft C++ tests until raft is included in build process.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2020-10-02 12:26:05 +02:00
Tomasz Grabiec
ca7f0c61f0 Merge "raft: initial implementation" from Gleb
This is the beginning of raft protocol implementation. It only supports
log replication and voter state machine. The main difference between
this one and the RFC (besides having voter state machine) is that the
approach taken here is to implement raft as a deterministic state
machine and move all the IO processing away from the main logic.
To do that, some changes to the RPC interface were required: all verbs
are now one-way, meaning that sending a request does not wait for a
reply and the reply arrives as a separate message (or not at all; it is
safe to drop packets).

* scylla-dev/raft-v4:
  raft: add a short readme file
  raft: compile raft tests
  raft: add raft tests
  raft: Implement log replication and leader election
  raft: Introduce raft interface header
2020-10-01 17:09:52 +02:00
Gleb Natapov
4959609589 raft: add raft tests
Add tests for the currently implemented raft features. replication_test
tests replication functionality with various initial log configurations.
raft_fsm_test tests the voting state machine functionality.
2020-10-01 14:30:59 +03:00
Etienne Adam
98dc0dc03a redis: only create required keyspaces/tables
The 'redis_database_count' option already existed, but
was not used when initializing the keyspaces. This
patch merely uses it. It seems cleaner not to create
15 x 5 tables when we use only one redis database.

Also change a test to use a higher maximum number
of databases.

Signed-off-by: Etienne Adam <etienne.adam@gmail.com>
Message-Id: <20200930210256.4439-1-etienne.adam@gmail.com>
2020-10-01 10:27:03 +03:00
Avi Kivity
fd1dd0eac7 Merge "Track the memory consumption of reader buffers" from Botond
"
The last major untracked area of the reader pipeline is the reader
buffers. These scale with the number of readers as well as with the size
and shape of data, so their memory consumption is unpredictable and
varies wildly. For example, many small rows will trigger larger buffers
allocated within the `circular_buffer<mutation_fragment>`, while a few
larger rows will consume a lot of external memory.

This series covers this area by tracking the memory consumption of both
the buffer and its content. This is achieved by passing a tracking
allocator to `circular_buffer<mutation_fragment>` so that each
allocation it makes is tracked. Additionally, we now track the memory
consumption of each and every mutation fragment through its whole
lifetime. Initially I contemplated just tracking the `_buffer_size` of
`flat_mutation_reader::impl`, but concluded that as our reader trees are
typically quite deep, this would result in a lot of unnecessary
`signal()`/`consume()` calls, that scales with the number of mutation
fragments and hence adds to the already considerable per mutation
fragment overhead. The solution chosen in this series is to instead
track the memory consumption of the individual mutation fragments, with
the observation that these are typically always moved and very rarely
copied, so the number of `signal()`/`consume()` calls will be minimal.

This additional tracking introduces an interesting dilemma however:
readers will now have significant memory on their account even before
being admitted. So it may happen that they can prevent their own
admission via this memory consumption. To prevent this, memory
consumption is only forwarded to the semaphore upon admission. This
might be solved when the semaphore is moved to the front -- before the
cache.
Another consequence of this additional, more complete tracking is that
evictable readers now consume memory even when the underlying reader is
evicted. So it may happen that even though no reader is currently
admitted, all memory is consumed from the semaphore. To prevent any such
deadlocks, the semaphore now admits a reader unconditionally if no
reader is admitted, that is, if all count resources are available.

Refs: #4176

Tests: unit(dev, debug, release)
"

* 'track-reader-buffers/v2' of https://github.com/denesb/scylla: (37 commits)
  test/manual/sstable_scan_footprint_test: run test body in statement sched group
  test/manual/sstable_scan_footprint_test: move test main code into separate function
  test/manual/sstable_scan_footprint_test: sprinkle some thread::maybe_yield():s
  test/manual/sstable_scan_footprint_test: make clustering row size configurable
  test/manual/sstable_scan_footprint_test: document sstable related command line arguments
  mutation_fragment_test: add exception safety test for mutation_fragment::mutate_as_*()
  test: simple_schema: add make_static_row()
  reader_permit: reader_resources: add operator==
  mutation_fragment: memory_usage(): remove unused schema parameter
  mutation_fragment: track memory usage through the reader_permit
  reader_permit: resource_units: add permit() and resources() accessors
  mutation_fragment: add schema and permit
  partition_snapshot_row_cursor: row(): return clustering_row instead of mutation_fragment
  mutation_fragment: remove as_mutable_end_of_partition()
  mutation_fragment: s/as_mutable_partition_start/mutate_as_partition_start/
  mutation_fragment: s/as_mutable_range_tombstone/mutate_as_range_tombstone/
  mutation_fragment: s/as_mutable_clustering_row/mutate_as_clustering_row/
  mutation_fragment: s/as_mutable_static_row/mutate_as_static_row/
  flat_mutation_reader: make _buffer a tracked buffer
  mutation_reader: extract the two fill_buffer_result into a single one
  ...
2020-09-29 16:08:16 +03:00