29065 Commits

Author SHA1 Message Date
Konstantin Osipov
e3751068fe raft: (server) allow adding entries/modify config on a follower
Implement an RPC to forward add_entry calls from the follower
to leader. Bounce & retry in case of not_a_leader.
Do not retry in case of uncertainty - this can lead to adding
duplicate entries.

The feature is added to core Raft since it's needed by
all current clients - both topology and schema changes.

When forwarding an entry to a remote leader we may get back
a term/index pair that conflicts (same index, but a higher term)
with a local entry we're still waiting on.

This can happen, e.g. because there was a leader change and the
log was truncated, but we still haven't got the append_entries
RPC from the new leader, still haven't truncated the log locally,
still haven't aborted all the local waits for truncated entries.

Only remove the offending entry from the wait list and abort it.
There may be entries labeled with an older term to the right (with
higher commit index) of the conflicting entry. However, finding them
would require a linear scan. If we allow it, we may end up doing this
linear scan for *every* conflicting entry during the transition
period, which brings us to N^2 complexity of this step. At the
same time, as soon as append_entries that commits a higher-term
entry with the same index reaches the follower, the waits
for the respective truncated entry will be aborted anyway (see
notify_waiters() which sets dropped_entry exception), so the scan
is unnecessary.
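
The rule above (abort only the waiter at the conflicting index, without
scanning for others) might be modelled roughly as follows. This is a
simplified sketch, not the server code: wait_list, wait_for() and
on_forwarded_reply() are hypothetical names, and a counter stands in for
delivering the dropped_entry exception to the waiter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical, simplified model of a follower's wait list.
// Key: log index; value: term of the local entry being waited on.
struct wait_list {
    std::map<uint64_t, uint64_t> term_by_index;
    std::size_t aborted_count = 0;   // stands in for aborting the waiter

    void wait_for(uint64_t index, uint64_t term) {
        term_by_index.emplace(index, term);
    }

    // Called when a forwarded add_entry returns (term, index) from the leader.
    // Only the waiter at exactly this index is examined, so the cost is one
    // map lookup; entries to the right of it are deliberately not scanned.
    bool on_forwarded_reply(uint64_t term, uint64_t index) {
        auto it = term_by_index.find(index);
        if (it == term_by_index.end() || it->second >= term) {
            return false;            // no conflicting local wait
        }
        term_by_index.erase(it);     // drop the stale waiter from the list
        ++aborted_count;             // and abort it
        return true;
    }
};
```

In this model a waiter at index 6 with an older term stays in the list
after a conflict at index 5; it is left for the later append_entries
handling to abort.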

Similarly to being able to add entries, allow to modify
Raft group configuration on a follower. The implementation
works the same way as adding entries - forwards the command
to the leader.

Now that add_entry() and modify_config() never throw not_a_leader,
they are more likely to throw timed_out_error, e.g. in case the
network is partitioned. Previously it was only possible due to a
semaphore wait timeout, and this scenario was not tested.
Handle timed_out_error on RPC level to let the existing tests
(specifically the randomized nemesis test) pass.
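
The bounce-and-retry policy described in this message could look roughly
like the sketch below. It is an illustration under stated assumptions, not
the actual RPC: the reply type, send_fn, forward_add_entry() and the
max_bounces cap are all hypothetical, and status::uncertain stands for any
outcome (such as a timeout) where the caller cannot tell whether the entry
was appended.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

enum class status { ok, not_a_leader, uncertain };

struct reply {
    status st;
    int leader_hint = -1;   // meaningful only for not_a_leader
};

// Hypothetical transport: sends the entry to server `to`, returns its reply.
using send_fn = std::function<reply(int to)>;

// Returns the id of the server that accepted the entry.
int forward_add_entry(int first_target, const send_fn& send, int max_bounces = 8) {
    int target = first_target;
    for (int i = 0; i <= max_bounces; ++i) {
        reply r = send(target);
        switch (r.st) {
        case status::ok:
            return target;
        case status::not_a_leader:
            // Safe to retry: the entry was definitely not appended.
            target = r.leader_hint;
            break;
        case status::uncertain:
            // The entry may or may not have been appended; retrying could
            // add a duplicate, so report the error to the caller instead.
            throw std::runtime_error("add_entry outcome unknown");
        }
    }
    throw std::runtime_error("too many leader changes");
}
```

The asymmetry is the point of the design: a not_a_leader bounce carries no
risk of duplication, while an uncertain outcome does.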
2021-11-25 11:50:38 +03:00
Konstantin Osipov
ae5dc8e980 raft: (test) replace virtual with override in derived class
Clang 12 complains if use of override is inconsistent,
so stick to it everywhere.
2021-11-25 11:50:38 +03:00
Konstantin Osipov
8f303844df raft: (server) fix a typo in exception message 2021-11-25 11:50:38 +03:00
Konstantin Osipov
9cde1cdf71 raft: (server) implement id() helper
There is no easy way to get server id otherwise.
2021-11-25 11:50:38 +03:00
Konstantin Osipov
b9faf41513 raft: (server) remove apply_dummy_entry()
It's currently unused, and going forward we'd like to make
it work on the follower, which requires a new implementation.
2021-11-25 11:50:38 +03:00
Konstantin Osipov
2763fdd3b7 raft: (test) fix missing initialization in generator.hh
A missing initialization in poll_timeout of class interpreter
could manifest itself as a sporadically failing
randomized_nemesis_test.

The test would prematurely run out of allowed limit of virtual
clock ticks.
2021-11-25 11:50:38 +03:00
Mikołaj Sielużycki
44f4ea38c5 test: Future-proof reader conversions tests.
Query time must be fetched after populate. If compaction is executed
during populate it may be executed with timestamp later than query_time.
This would cause the test's expected compaction and compaction during
populate to be executed at different time points producing different
results. The result would be sporadic test failures depending on relative
timing of those operations. If no other mutations happen after populate,
and query_time is later than the compaction time during population, we're
guaranteed to have the same results.
Message-Id: <20211123134808.105068-1-mikolaj.sieluzycki@scylladb.com>
2021-11-24 21:01:57 +01:00
Michał Chojnowski
08f7b81b36 dist: scylla_io_setup: run iotune for supported but not preconfigured AWS instance types
Currently, for AWS instances in `is_supported_instance_class()` other than
i3* and *gd (for example: m5d), scylla_io_setup neither provides
preconfigured values for io_properties.yaml nor runs iotune nor fails.
This silently results in a broken io_properties.yaml, like so:

disks:
  - mountpoint: /var/lib/scylla

Fix that.

Closes #9660
2021-11-24 18:28:13 +02:00
Avi Kivity
f3faa48f8b Merge "Unglobal stream manager" from Pavel E
"
There's a nest of globals in streaming/ code. The stream_manager
itself and a whole lot of its dependencies (database, sys_dist_ks,
view_update_generator and messaging). Also streaming code gets
gossiper instance via global call.

The fix is, as usual, in keeping the sharded<stream_manager> in
the main() code and pushing its reference everywhere. Somewhere
in the middle the global pointers go away being replaced with
respective references pushed to the stream_manager ctor.

This reveals an implicit dependency:

  storage_service -> stream_manager

tests: unit(dev),
       dtest.cdc_tests.cluster_reduction_with_cdc(dev)
       v1: dtest.bootstrap_test.add_node(dev)
       v1: dtest.bootstrap_test.simple_bootstrap(dev)
"

* 'br-unglobal-stream-manager-3-rebase' of https://github.com/xemul/scylla: (26 commits)
  streaming, main: Remove global stream_manager
  stream_transfer_task: Get manager from session (result-future)
  stream_transfer_task: Keep Updater fn onboard
  stream_transfer_task: Remove unused database reference
  stream_session: Use manager reference from result-future
  stream_session: Capture container() in message handler
  stream_session: Keep stream_manager reference
  stream_session: Remove unused default contructor
  stream_result_future: Use local manager reference
  stream_result_future: Keep stream_manager reference
  stream_plan: Keep stream_manager onboard
  dht: Keep stream_manager on board
  streaming, api: Use captured manager in handlers
  streaming, api: Standardize the API start/stop
  storage_service: Sanitize streaming shutdown
  storage_service: Keep streaming_manager reference
  stream_manager: Use container() in notification code
  streaming: Move get_session into stream_manager
  streaming: Use container.invoke_on in rpc handlers
  streaming: Fix interaction with gossiper
  ...
2021-11-24 12:23:18 +02:00
Pavel Emelyanov
4a34226aa6 streaming, main: Remove global stream_manager
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:17:37 +03:00
Pavel Emelyanov
50e6d334a9 stream_transfer_task: Get manager from session (result-future)
When the task starts it needs the stream_manager to get messaging
service and database from. There's a session at hand and this
session is properly initialized thus it has the result-future.
Voila -- we have the manager!

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:17:37 +03:00
Pavel Emelyanov
95d26bc420 stream_transfer_task: Keep Updater fn onboard
The helper function called send_mutation_fragments needs the manager
to update stats about stream_transfer_task as it goes on. Carrying the
manager over its stack is quite boring, but there's a helper send_info
object that lives there. Equip the guy with the updating function and
capture the manager by it early to kill one more usage of the global
stream_manager call.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:17:37 +03:00
Pavel Emelyanov
9ee208de8d stream_transfer_task: Remove unused database reference
The send_info helper keeps it, but doesn't use it. Remove.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:17:37 +03:00
Pavel Emelyanov
a3b4d4d3cf stream_session: Use manager reference from result-future
When the stream_session initializes it's being equipped with
the shared-pointer on the stream_result_future very early. In
all the places where stream_session needs the manager this
pointer is alive and the session can get the manager from it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:17:37 +03:00
Pavel Emelyanov
56f5327450 stream_session: Capture container() in message handler
The stream_mutation_fragments handler needs to access the manager. Since
the handler is registered by the manager itself, it can capture the
local manager reference and use container() where appropriate.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:17:37 +03:00
Pavel Emelyanov
db33607eb2 stream_session: Keep stream_manager reference
The manager is needed to get messaging service and database from.
Actually, the database can be pushed through arguments in all the
places, so effectively session only needs the messaging. However,
the stream tasks need the manager badly and there's no other
place to get it from other than the session.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:17:37 +03:00
Pavel Emelyanov
f2ae080c63 stream_session: Remove unused default contructor
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:17:37 +03:00
Pavel Emelyanov
307a2583ee stream_result_future: Use local manager reference
The reference is present in all the required places already.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:17:37 +03:00
Pavel Emelyanov
5b748a72de stream_result_future: Keep stream_manager reference
The stream_result_future needs manager to register on it and to
unregister from it. Also the result-future is referenced from
stream_session that also needs the manager (see next patches).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:17:37 +03:00
Pavel Emelyanov
3087422d4d stream_plan: Keep stream_manager onboard
The plan itself doesn't need it, but it creates some lower level
objects that do. Next patches will use this reference.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:17:37 +03:00
Pavel Emelyanov
c593f8624d dht: Keep stream_manager on board
This is the preparation for the future patching. The stream_plan
creation will need the manager reference, so keep one on dht
object in advance. These are only created from the storage service
bootstrap code.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:17:37 +03:00
Pavel Emelyanov
5166a98ce4 streaming, api: Use captured manager in handlers
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:17:37 +03:00
Pavel Emelyanov
fd920e2420 streaming, api: Standardize the API start/stop
Today's idea of API reg/unreg is to carry the target service via
lambda captures down to the route handlers and unregister those
handlers before the target is about to stop.

This patch makes it so for the streaming API.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:17:37 +03:00
Pavel Emelyanov
390a971bd8 storage_service: Sanitize streaming shutdown
Use local reference and don't use 'is_stopped' boolean as the
whole stop_transport is guarded with its own lock.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:17:37 +03:00
Pavel Emelyanov
aaa58b7b89 storage_service: Keep streaming_manager reference
The manager is drained() on drain/decommission/isolate. Since now
it's storage_service who orchestrates all of the above, it needs
an explicit reference to the target.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:17:35 +03:00
Pavel Emelyanov
3a9eb6af28 stream_manager: Use container() in notification code
Continuation of the previous patch -- some native stream_manager methods
can enjoy using container() call. One nit -- the [] access to the map
of statistics now runs in const context and cannot create elements, so
switch this place into .at() method.
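
The nit about [] versus .at() is standard C++ behaviour and can be shown in
isolation. In the sketch below, stats_holder, bytes_sent and sent_to() are
made-up names for illustration and are not taken from stream_manager.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

struct stats_holder {
    std::map<std::string, long> bytes_sent;

    // operator[] is a non-const member because it inserts a default value
    // for a missing key, so it cannot be called here. at() only looks the
    // key up and throws std::out_of_range if it is absent.
    long sent_to(const std::string& peer) const {
        return bytes_sent.at(peer);
    }
};
```

One consequence worth keeping in mind: a lookup of a missing key now throws
instead of silently creating a zero-valued element.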

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:15:59 +03:00
Pavel Emelyanov
8ab96a8362 streaming: Move get_session into stream_manager
This makes the code a bit shorter and helps removing one more call
for global stream manager.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:15:59 +03:00
Pavel Emelyanov
228b4520a6 streaming: Use container.invoke_on in rpc handlers
This will help to reduce the usage of global manager instance.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:15:59 +03:00
Pavel Emelyanov
c2c676784a streaming: Fix interaction with gossiper
Streaming manager registers itself in gossiper, so it needs an explicit
dependency reference. Also it forgets to unregister itself, so do it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:15:59 +03:00
Pavel Emelyanov
73e10c7aed streaming: Move start/stop onto common rails
In case of streaming this mostly means dropping the global
init/uninit calls and replacing them with sharded<stream_manager>
instance. It's still global, but it's being fixed atm.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:15:58 +03:00
Pavel Emelyanov
08818ffe75 streaming: Rename .stop() into .shutdown()
The start/stop standard is becoming like

    sharded<foo> foo;
    foo.start();
    defer([] { foo.stop(); });
    foo.invoke_on_all(&foo::start);
    ...
    defer([] { foo.shutdown(); });
    wait_for_stop_signal();
    /* quit making the above defers self-unroll */

where .shutdown() for a service would mean "do whatever is
appropriate to start stopping, the real synchronous .stop() will
come some time later".

According to that, rename .stop() as it's really the mentioned
preparation, not real stopping.
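
The ordering this convention relies on (shutdown() runs before the real
stop() because scope guards unroll in reverse) can be checked with a
self-contained sketch. The deferred class below is a minimal stand-in for
the defer() helper mentioned above, and run_lifecycle() with its log
strings is purely hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Minimal scope guard standing in for defer(): runs fn on scope exit.
class deferred {
    std::function<void()> _fn;
public:
    explicit deferred(std::function<void()> fn) : _fn(std::move(fn)) {}
    deferred(const deferred&) = delete;
    deferred& operator=(const deferred&) = delete;
    ~deferred() { _fn(); }
};

// Records the order of lifecycle calls for a hypothetical service.
std::vector<std::string> run_lifecycle() {
    std::vector<std::string> log;
    {
        log.push_back("construct");
        deferred stop([&] { log.push_back("stop"); });
        log.push_back("start");
        deferred shutdown([&] { log.push_back("shutdown"); });
        log.push_back("wait_for_stop_signal");
        // Leaving this scope destroys the guards in reverse order of
        // declaration: shutdown() first, the real stop() last.
    }
    return log;
}
```

Calling run_lifecycle() yields the sequence construct, start,
wait_for_stop_signal, shutdown, stop.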

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:15:58 +03:00
Pavel Emelyanov
ba298bd5c6 streaming: Remove global dependency pointers
Now they are not needed.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:15:58 +03:00
Pavel Emelyanov
6d7eb76fad streaming: Use get_stream_manager to get dependencies
Currently streaming uses global pointers to save and get a
dependency. Now all the dependencies live on the manager,
this patch changes all the places in streaming/ to get the
needed dependencies from it, not from global pointer (next
patch will remove those globals).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:15:58 +03:00
Pavel Emelyanov
e448774588 streaming: Move rpc verbs reg/unreg into manager
As a part of streaming start/stop unification.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:15:58 +03:00
Pavel Emelyanov
165971fb7f streaming: Initialize stream manager with proper deps
The stream manager is going to become central point of control
for the streaming subsys. This patch makes its dependencies
explicit and prepares the ground for further patching.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-24 12:15:58 +03:00
Nadav Har'El
e71131091a cql-pytest: translate Cassandra's tests for user-defined types
This is a translation of Cassandra's CQL unit test source file
validation/entities/UserTypesTest.java into our cql-pytest
framework.

This test file includes 26 tests for various features and corners of
the user-defined type feature. Two additional tests which were more
involved to translate were dropped with a comment explaining why.

All 26 tests pass on Cassandra, and all but one pass on Scylla:
The test testUDTWithUnsetValues fails on Scylla, so it is marked xfail.
It reproduces a previously-unknown Scylla bug:

  Refs #9671: In some cases, trying to assign an UNSET value into part
              of a UDT is not detected

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20211124074001.708183-1-nyh@scylladb.com>
2021-11-24 10:37:15 +02:00
Avi Kivity
965ea4a3fa Merge "tools/scylla-sstable: add dumpers for all components" from Botond
"
Except for TOC, Filter, Digest and CRC32, these are trivial to read with
any text/binary editor.
"

* 'scylla-sstable-dump-components' of https://github.com/denesb/scylla:
  tools/scylla-sstable: add --dump-scylla-metadata
  tools/scylla-sstable: add --dump-statistics
  tools/scylla-sstable: add --dump-summary
  tools/scylla-sstable: add --dump-compression-info
  tools/scylla-sstable: extract unsupported flag checking into function
  sstables/sstable: add scylla metadata getter
  sstables/sstable: add statistics accessor
2021-11-23 16:13:02 +02:00
Michał Sala
27ff3e7de7 storage_proxy: check partition ranges contiguity
storage_proxy::query_partition_key_range_concurrent() iterates through
vnodes produced by its argument query_ranges_to_vnodes_generator&&
ranges_to_vnodes and tries to merge them. This commit introduces
checking if subsequent vnodes are contiguous with each other, before
merging them.
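
The idea of checking contiguity before merging can be sketched on a much
simpler model than the real one. The code below assumes half-open integer
ranges and hypothetical names (range, contiguous(), merge_contiguous()); the
actual partition ranges carry tokens and bound inclusiveness, which this
sketch ignores.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-in for a partition range: half-open [start, end).
struct range {
    int64_t start;
    int64_t end;
};

// Two ranges are contiguous when the second begins exactly where the
// first ends, leaving neither a gap nor an overlap.
bool contiguous(const range& a, const range& b) {
    return a.end == b.start;
}

// Merges runs of contiguous ranges; a gap starts a new output range.
std::vector<range> merge_contiguous(const std::vector<range>& in) {
    std::vector<range> out;
    for (const range& r : in) {
        if (!out.empty() && contiguous(out.back(), r)) {
            out.back().end = r.end;   // extend the current merged range
        } else {
            out.push_back(r);         // gap: do not merge across it
        }
    }
    return out;
}
```

Without the contiguity check, merging [0, 10) with [30, 40) would produce a
range that also covers [10, 30), which was never requested.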

Fixes #9167

Closes #9175
2021-11-23 15:48:55 +02:00
Botond Dénes
9746dbe20d Merge "Add --cpus option to test.py" from Pavel Emelyanov
"
When provided all the tests start from under the 'taskset -c $value'.
This is _not_ the same as just doing 'taskset -c ... ./test.py ...'
because in the latter case test.py will compete with all the tests
for the provided cpuset and may not be able to run at desired speed.
With this option it's possible to isolate the tests themselves on a
cpuset without affecting the test.py performance.

One of the examples when test.py speed can be critical is catching
flaky tests that reveal their buggy nature only when run in a tight
environment. The combination of --cpus, --repeat and --jobs creates
nice pressure on the cpu, and keeping the test.py out of the mincer
lets it fork and exec (and wait) the tests really fast.

tests: unit(dev, with and without --cpus)
"
* 'br-test-taskset-2' of https://github.com/xemul/scylla:
  test.py: Add --cpus option
  test.py: Lazily calculate args.jobs
2021-11-23 15:06:59 +02:00
Pavel Emelyanov
bd24c1eecf Merge "Deglobalize batchlog_manager" from Benny
This series gets rid of the global batchlog_manager instance.

It does so by first, allowing to set a global pointer
and instantiating stack-local instances in main and
cql_test_env.

Expose the cql_test_env batchlog_manager to tests
so they won't need the global `get_batchlog_manager()` as
used in batchlog_manager_test.test_execute_batch.

Then we pass a reference to the `sharded<db::batchlog_manager>` to
storage_service so it can be used instead of the global one.

Derive batchlog_manager from peering_sharded_service so it
gets its `container()` rather than relying on the global `get_batchlog_manager()`.

And finally, handle a circular dependency between the batchlog_manager,
that relies on the query_processor that, in turn, relies on the storage_proxy,
and the storage_proxy itself that depends on the batchlog_manager for
`mutate_atomically`.

Moved `endpoint_filter` to gossiper so `storage_proxy::mutate_atomically`
can call it via the `_gossiper` member it already has.
The function requires a gossiper object rather than a batchlog_manager
object.

Also moved `get_batch_log_mutation_for` to storage_proxy so it can be
called from `sync_write_to_batchlog` (also from the mutate_atomically path)

Test: unit(dev)
DTest: batch_test.py:TestBatch.test_batchlog_manager_issue(dev)

* git@github.com:bhalevy/scylla.git deglobalize-batchlog_manager-v2
  get rid of the global batchlog_manager
  batchlog_manager: get_batch_log_mutation_for: move to storage_proxy
  batchlog_manager: endpoint_filter: move to gossiper
  batchlog_manager: do_batch_log_replay: use lambda coroutine
  batchlog_manager: derive from peering_sharded_service
  storage_service: keep a reference to the batchlog_manager
  test: cql_test_env: expose batchlog_manager
  main: allow setting the global batchlog_manager
2021-11-23 15:10:50 +03:00
Benny Halevy
1740833324 test: sstable_compaction_test: autocompaction_control_test: use deferred_stop
To auto-stop the table and the compaction_manager, making the
test case exception-safe.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20211122204340.1020932-2-bhalevy@scylladb.com>
2021-11-23 12:10:12 +02:00
Benny Halevy
dfa6a494c2 test: sstable_compaction_test: require smp::count==1 where needed
These test cases may crash if running with more shards.
This is not required for test.py runs, but rather when
running the test manually using the command line.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20211122204340.1020932-1-bhalevy@scylladb.com>
2021-11-23 12:10:12 +02:00
Kamil Braun
a33b0649b1 Merge 'Block creation of MV on CDC Log' from Piotr Jastrzębski
Add a restriction in create_view_statement to disallow creation of MV for CDC Log table.

Also add a CQL test that checks the new restriction works.

Test: unit(dev)

Fixes #9233
Closes #9663

* 'fix9233' of https://github.com/haaawk/scylla:
  tests: Add cql test to verify it's impossible to create MV for CDC Log
  cql3: Make it impossible to create MV on CDC log
2021-11-23 10:51:02 +01:00
Nadav Har'El
3c0e7037be conf/scylla.yaml: change default Prometheus listen address
Developers often run Scylla with the default conf/scylla.yaml provided
with the source distribution. The existing default listens for all ports
but one (19042, 10000, 9042, 7000) on the *localhost* IP address (127.0.0.1).
But just one port - 9180 (Prometheus metrics) - is listened on 0.0.0.0.
This patch changes the default to be 127.0.0.1 for port 9180 as well.

Note that this just changes the default scylla.yaml - users can still
choose whatever listening address they want by changing scylla.yaml
and/or passing command line parameters.

The benefits of this patch are:
1. More consistent.
2. Better security for developers (don't open ports on external
   addresses while testing).
3. Allow test/cql-pytest/run to run in parallel with a default run of
   Scylla (currently, it fails to run Scylla on a random IP address,
   because the default run of Scylla already took port 9180 on all IP
   addresses).

The third benefit is what led me to write this patch. Fixes #8757.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210530130307.906051-1-nyh@scylladb.com>
2021-11-23 11:45:35 +02:00
Benny Halevy
ff18c0c14c messaging_service: remove unused include of db/system_keyspace.hh
As a followup to eba20c7e5d
"messaging_service: init_local_preferred_ip_cache: get preferred ips from caller".

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20211123080457.1247970-1-bhalevy@scylladb.com>
2021-11-23 11:12:36 +03:00
Pavel Emelyanov
dcefe98fbb test.py: Add --cpus option
The option accepts taskset-style cpulist and limits the launched tests
respectively. When specified, the default number of jobs is adjusted
accordingly; if --jobs is given, it overrides this "default" as expected.
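
How a cpulist might translate into a default job count can be sketched as
below. test.py itself is a Python script, so this is only an illustration
of the logic: count_cpus() and effective_jobs() are hypothetical names, only
plain numbers and inclusive a-b ranges are parsed, and it is an assumption
(not stated in the message) that the adjusted default equals the number of
CPUs in the list.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Counts the CPUs named by a list such as "0-3,8,10-11".
// Only plain numbers and inclusive a-b ranges are handled here.
int count_cpus(const std::string& cpulist) {
    std::set<int> cpus;
    std::istringstream items(cpulist);
    std::string item;
    while (std::getline(items, item, ',')) {
        auto dash = item.find('-');
        int first = std::stoi(item.substr(0, dash));
        int last = dash == std::string::npos
            ? first
            : std::stoi(item.substr(dash + 1));
        for (int cpu = first; cpu <= last; ++cpu) {
            cpus.insert(cpu);
        }
    }
    return static_cast<int>(cpus.size());
}

// jobs == 0 means "--jobs was not given", so a default is chosen here.
int effective_jobs(int jobs, const std::string& cpulist, int host_default) {
    if (jobs != 0) {
        return jobs;                   // an explicit --jobs always wins
    }
    if (!cpulist.empty()) {
        return count_cpus(cpulist);    // assumed: one job per listed CPU
    }
    return host_default;
}
```

Keeping 0 as the "not given" marker is what lets the explicit option
override the cpulist-derived default.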

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-23 11:08:41 +03:00
Pavel Emelyanov
0246841c5e test.py: Lazily calculate args.jobs
Next patch will need to know if the --jobs option was specified or the
caller is OK with the default. One way to achieve it is to keep 0 as the
default and set the default value afterwards.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-11-23 11:05:56 +03:00
Nadav Har'El
253387ea07 alternator: implement AttributeUpdates DELETE operation with Value
In the DynamoDB API, UpdateItem's AttributeUpdates parameter (the older
syntax, which was superseded by UpdateExpression) has a DELETE operation
that can do two different things: It can delete an attribute, or it can
delete elements from a set. Before this patch we only implemented the
first feature, and this patch implements the second.

Note that unlike the ordinary delete, the second feature - set subtraction -
is a read-modify-write operation. This is not only because of Alternator's
serialization (as JSON strings, not CRDTs) - but also fundamentally because
of the API's guarantees - e.g., the operation is supposed to fail if the
attribute's existing value is *not* a set of the correct type, so it
needs to read the old value.
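
Why set subtraction has to be a read-modify-write can be seen in a reduced
model. The sketch below is not Alternator's code: the attribute variant,
subtract() and delete_from_set() are hypothetical, only two set types are
modelled, and details such as what happens when the result is empty are
left out.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>
#include <variant>

using string_set = std::set<std::string>;
using number_set = std::set<long>;
// Hypothetical, reduced attribute model: a scalar string or one of two sets.
using attribute = std::variant<std::string, string_set, number_set>;

// Returns the elements of `from` that are not in `what`.
template <typename Set>
Set subtract(const Set& from, const Set& what) {
    Set out;
    for (const auto& v : from) {
        if (what.count(v) == 0) {
            out.insert(v);
        }
    }
    return out;
}

// The old value must be read first: the result depends on it, and the
// operation has to fail if it is not a set of the operand's type.
attribute delete_from_set(const attribute& old_value, const attribute& operand) {
    if (old_value.index() != operand.index()) {
        throw std::invalid_argument("type mismatch");
    }
    if (auto* s = std::get_if<string_set>(&old_value)) {
        return subtract(*s, std::get<string_set>(operand));
    }
    if (auto* n = std::get_if<number_set>(&old_value)) {
        return subtract(*n, std::get<number_set>(operand));
    }
    throw std::invalid_argument("operand is not a set");
}
```

Neither the type check nor the subtraction can be done without the stored
value, which is the contrast with a plain attribute delete.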

The test for this feature begins to pass, so its "xfail" mark is
removed. After this, all tests in test/alternator/test_item.py pass :-)

Fixes #5864.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20211103151206.157184-1-nyh@scylladb.com>
2021-11-23 08:51:06 +01:00
Benny Halevy
d344765ec6 get rid of the global batchlog_manager
Now that it's unused.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-23 08:27:30 +02:00
Benny Halevy
744275df73 batchlog_manager: get_batch_log_mutation_for: move to storage_proxy
And rename to get_batchlog_mutation_for while at it,
as it's about the batchlog, not batch_log.

This resolves a circular dependency between the
batchlog_manager and the storage_proxy that required
it in the case.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-23 08:27:30 +02:00