1787 Commits

Author SHA1 Message Date
Avi Kivity
3e3003fcc1 Merge 'cql3: limit the concurrency of indexed statements' from Piotr Sarna
Indexed select statements fetch primary key information from
their internal materialized views and then use it to query
the base table. Unfortunately, the current mechanism for retrieving
base table rows makes it easy to overwhelm the replicas with unbounded
concurrency - the number of concurrent ops is increased exponentially
until a short read is encountered, but it's not enough to cap the
concurrency - if data is fetched row-by-row, then short reads usually
don't occur and as a result it's easy to see concurrency of 1M or
higher. In order to avoid overloading the replicas, the concurrency
of indexed queries is now capped at 4096 and additionally throttled
if enough results are already fetched. For paged queries it means that
the query returns as soon as 1MB of data is ready, and for unpaged ones
the concurrency will no longer be doubled as soon as the previous
iteration fetched 1MB of results.
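
The growth rule described above can be sketched as follows. This is a minimal illustration, not the actual cql3 code: the function names, the one-row-per-read simulation, and reading 1MB as 2**20 bytes are all assumptions.

```python
MAX_CONCURRENCY = 4096            # cap named in the commit message
RESULT_SIZE_THRESHOLD = 1 << 20   # 1MB, assumed here to mean 2**20 bytes


def next_concurrency(current, bytes_fetched_last_iteration):
    """Return the concurrency to use for the next round of base-table reads."""
    if bytes_fetched_last_iteration >= RESULT_SIZE_THRESHOLD:
        # Enough data came back already: stop growing to avoid overload.
        return current
    # Otherwise keep doubling, but never beyond the fixed cap.
    return min(current * 2, MAX_CONCURRENCY)


def concurrency_schedule(row_size, iterations):
    """Simulate successive rounds where every read returns one row of row_size bytes."""
    concurrency, schedule = 1, []
    for _ in range(iterations):
        schedule.append(concurrency)
        concurrency = next_concurrency(concurrency, concurrency * row_size)
    return schedule
```

Under these assumptions, 2KiB rows stop doubling at a concurrency of 512 (512 * 2048 bytes reaches the threshold), while 200B rows keep doubling until they hit the 4096 cap.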

The fixed 4096 value can be subject to debate; its reasoning is as follows:
for 2KiB rows, which are moderately large but not huge, 4096 concurrent
reads fetch roughly 10MB of data, which is the granularity used by replicas.
For 200B rows, which is rather small, the result would still be
around 1MB.
At the same time, 4096 separate tasks also means 4096 allocations,
so increasing the number also strains the allocator.
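
The sizing argument above is simple arithmetic and can be checked directly. The helper name is hypothetical; the row sizes and the cap come from the message.

```python
CONCURRENCY_CAP = 4096  # cap named in the commit message


def bytes_in_flight(row_size_bytes, concurrency=CONCURRENCY_CAP):
    """Total bytes fetched when every concurrent read returns one row."""
    return row_size_bytes * concurrency


large = bytes_in_flight(2 * 1024)  # 2KiB rows
small = bytes_in_flight(200)       # 200B rows
print(f"2KiB rows: {large / 2**20:.1f} MiB")
print(f"200B rows: {small / 2**20:.2f} MiB")
```

This prints 8.0 MiB and 0.78 MiB, so the 10MB and 1MB figures quoted in the message are order-of-magnitude estimates.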

Fixes #8799

Tests: unit(release),
       manual: observing metrics of modified index_paging_test

Closes #8814

* github.com:scylladb/scylla:
  cql3: limit the transitional result size for indexed queries
  cql3: return indexed pages after 1MB worth of data
  cql3: limit the concurrency of indexed statements
2021-06-07 18:00:51 +03:00
Gleb Natapov
01b6a2eb38 raft: randomized_nemesis_test: tick virtual clock less aggressively
Currently each tick of the virtual clock immediately schedules the next one
at the end of the task queue, but this is too aggressive. If a tick
generates work that needs two tasks to be scheduled one after another,
such an implementation will make the task queue grow to infinity. Considering
that in debug mode even a ready future causes preemption, and that task
queue shuffling may cause two or more ticks to be executed without any
other work done in between, it is very easy to get into such a situation.

The patch changes the virtual clock to tick only when a shard is idle.
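
A toy model of the idle-only policy, assuming a single FIFO task queue; it does not reproduce the Seastar scheduler or the test's real clock, and every name in it is hypothetical. The clock advances only when the queue is empty, so work generated by one tick always drains before the next tick.

```python
from collections import deque


class VirtualClock:
    """Hypothetical virtual clock; each tick generates two chained tasks."""

    def __init__(self):
        self.now = 0

    def tick(self, queue):
        self.now += 1
        # Work from a tick: a first task that schedules a second one when it runs.
        queue.append(lambda: queue.append(lambda: None))


def run_shard(clock, max_ticks):
    """Run until max_ticks have elapsed; return the longest queue length seen."""
    queue = deque()
    longest = 0
    while clock.now < max_ticks or queue:
        if queue:
            queue.popleft()()   # the shard is busy: run pending work first
        else:
            clock.tick(queue)   # the shard is idle: only now advance the clock
        longest = max(longest, len(queue))
    return longest
```

In this model the queue never holds more than one task, no matter how many ticks elapse.
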
Message-Id: <20210606140305.2930189-1-gleb@scylladb.com>
2021-06-07 16:54:56 +02:00
Piotr Sarna
df0d44486a cql3: limit the transitional result size for indexed queries
Unpaged indexed queries already have a concurrency limit of 4096,
but now the concurrency is further limited by the number of bytes
fetched previously. Once this number reaches 1MB, the concurrency will not be
increased in consecutive queries to avoid overload.
2021-06-07 16:29:18 +02:00
Pavel Solodovnikov
76bea23174 treewide: reduce header interdependencies
Use forward declarations wherever possible.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>

Closes #8813
2021-06-07 15:58:35 +03:00
Tomasz Grabiec
50d64646cd Merge "raft: replication test fixes and OOP refactor" from Alejo
Feature requests, fixes, and OOP refactor of replication_test.

Note: all known bugs and hangs are now fixed.

A new helper class "raft_cluster" is created.
Each move of a helper function to the class has its own commit.
New helpers are provided.

To simplify code, for now only a single apply function can be set per
raft_cluster. No tests were using it in any other way. In the future,
there could be custom apply functions per server dynamically assigned,
if this becomes needed.

* alejo/raft-tests-replication-02-v3-30: (66 commits)
  raft: replication test: wait for log for both index and term
  raft: replication test: reset network at construction
  raft: replication test: use lambda visitor for updates
  raft: replication test: move structs into class
  raft: replication test: move data structures to cluster class
  raft: replication test: remove shared pointers
  raft: replication test: move get_states() to raft_cluster
  raft: replication test: test_server inside raft_cluster
  raft: replication test: rpc declarative tests
  raft: replication test: add wait_log
  raft: replication test: add stop and reset server
  raft: replication test: disconnect 2 support
  raft: replication test: explicit node_id naming
  raft: replication test: move definitions up
  raft: replication test: no append entries support
  raft: replication test: fix helper parameter
  raft: replication test: stop servers out of config
  raft: replication test: wait log when removing leader from configuration
  raft: replication test: only manipulate servers in configuration
  raft: replication test: only cancel rearm ticker for removed server
  ...
2021-06-06 19:18:49 +03:00
Piotr Sarna
cb17aa1e53 Merge 'test/alternator: rewrite run script to share code with cql-pytest's run script' from Nadav Har'El
In this small series, I rewrite test/alternator/run to Python using the utility
functions developed for test/cql-pytest. In the future, we should do the same to
test/redis/run and test/scylla-gdb/run.

The benefit of this rewrite is less code duplication (all run scripts start with
the same duplicate code to deal with temporary directories, to run Scylla IP
addresses, etc.), but most importantly - in the future fixes we do to cql-pytest
(e.g., parameters needed to start Scylla efficiently, how to shut down Scylla,
etc.) will appear automatically in alternator test without needing to remember
to change both.

Another benefit is that test/alternator/run will now be Python, not a shell
script. This should make it easier to integrate it into test.py (refs #6212) in
the future - if we want to.

Closes #8792

* github.com:scylladb/scylla:
  test/alternator: rewrite test/alternator/run script in Python
  test/cql-pytest: make test run code more general
2021-06-06 19:18:49 +03:00
Avi Kivity
872cd8f692 test: adjust copyright statement to use ScyllaDB rather than old name 2021-06-06 19:18:49 +03:00
Avi Kivity
a55b434a2b treewide: extend copyright statements to present day 2021-06-06 19:18:49 +03:00
Pavel Solodovnikov
2187a59089 treewide: move service::cas_request out from storage_proxy.hh
And remove all remaining inclusions of `storage_proxy.hh` in the
headers.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2021-06-06 19:18:49 +03:00
Pavel Solodovnikov
e0749d6264 treewide: some random header cleanups
Eliminate unused includes and replace some more includes
with forward declarations where appropriate.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2021-06-06 19:18:49 +03:00
Gleb Natapov
bb822c92ab raft: change raft::rpc api to return void for most sending functions
Most RAFT packets are sent very rarely during special phases of the
protocol (like election or leader stepdown). The protocol itself does
not care if a packet is sent or dropped, so returning futures from their
send function does not serve any purpose. Change the raft's rpc interface
to return void for all packet types but append_request. We still want to
get a future from sending append_request for backpressure purposes since
replication protocol is more efficient if there is no packet loss, so
it is better to pause a sender than to drop packets inside the rpc. Rpc
is still allowed to drop append_requests if overloaded.
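
The shape of the interface change can be illustrated with a small asyncio sketch. This is not Scylla's raft::rpc (which is C++ and future-based); the class, method names, and the semaphore-based limit are assumptions chosen to show the idea: rare messages are fire-and-forget, while append requests stay awaitable so the sender can be paused.

```python
import asyncio


class SketchRpc:
    """Hypothetical stand-in for a raft rpc implementation."""

    def __init__(self, max_in_flight):
        self._in_flight = asyncio.Semaphore(max_in_flight)
        self.delivered = []

    def send_vote_request(self, dst, request):
        # Fire-and-forget: returns None; the protocol tolerates a drop here.
        self.delivered.append((dst, request))

    async def send_append_entries(self, dst, request):
        # Awaitable: the caller pauses while too many requests are in flight.
        async with self._in_flight:
            await asyncio.sleep(0)  # placeholder for the actual network send
            self.delivered.append((dst, request))


async def demo():
    rpc = SketchRpc(max_in_flight=2)
    result = rpc.send_vote_request("B", "vote")
    await asyncio.gather(
        *(rpc.send_append_entries("B", f"entry-{i}") for i in range(5))
    )
    return result, len(rpc.delivered)
```

Running `asyncio.run(demo())` shows that the vote request returns nothing to wait on, while the five append requests are throttled to two at a time.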
2021-06-06 19:18:49 +03:00
Benny Halevy
3f9bad0f0a test: compound_test: use tests::random
For reproducibility.

Test: compound_test(dev)
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210602061910.286893-2-bhalevy@scylladb.com>
2021-06-06 09:21:23 +03:00
Benny Halevy
40e032ff8b test: compound_test: switch to seastar test framework
Prepare for using tests::random instead of std::rand
for reproducibility.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210602061910.286893-1-bhalevy@scylladb.com>
2021-06-06 09:21:23 +03:00
Calle Wilund
3b55ef36d1 cf_prop_defs: Fix extensions merge to handle removal
Fixes #8773

When refactored for cdc, properties -> extensions merge
was modified so it did not handle _removal_ (i.e. an
extension function returning null -> no entry in new map).

This causes certain enterprise extensions to not be able
to disable themselves.

Fixed by filtering existing extensions by property keywords.
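
One way to picture the fix, as a sketch under assumptions rather than the real cf_prop_defs code: existing extensions whose keyword appears in the new properties are filtered out first, so an extension function that returns None leaves no entry behind. The function signature and the dict-based data model are hypothetical.

```python
def merge_extensions(existing, properties, factories):
    """
    existing:   keyword -> extension currently attached to the table
    properties: keyword -> raw property value from the statement
    factories:  keyword -> callable(value) returning an extension or None
    """
    # Filter existing extensions by property keywords: anything the
    # statement mentions is rebuilt from scratch below.
    merged = {kw: ext for kw, ext in existing.items() if kw not in properties}
    for kw, value in properties.items():
        factory = factories.get(kw)
        if factory is None:
            continue  # not an extension keyword
        ext = factory(value)
        if ext is not None:
            merged[kw] = ext  # None means "removed": no entry in the new map
    return merged
```

With this ordering, an extension can disable itself by returning None for its keyword, while extensions not mentioned in the statement are carried over unchanged.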
Unit test added.

Closes #8774
2021-06-06 09:21:23 +03:00
Nadav Har'El
f22ed3ff5c test/alternator: reduce very high timeout in one tracing test
In test_tracing.py::test_slow_query_log, there was what looked like
an innocent 30-second timeout, but this was in fact an 8-minute
timeout - because it started with sleeping 1 second, then 2 seconds,
then 3, ... until 30 seconds. Such a high timeout is frustrating when
trying to debug failures in the test - which is only expected to take
2 seconds (and all of it because of an artificial timeout).

So fix the loop to stop iterating after 60 seconds (a compromise
between 30 seconds and 8 minutes...), sleeping a constant amount
between iterations.
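
The arithmetic behind the "8 minute" figure, and the shape of the fixed loop, can be sketched as below. The helper names and the 0.5-second interval are assumptions; only the 1..30 second sleeps and the 60-second limit come from the message.

```python
import time


def old_total_sleep(max_step=30):
    """Total seconds slept by a loop that sleeps 1, 2, ..., max_step seconds."""
    return sum(range(1, max_step + 1))


def wait_until(predicate, timeout=60.0, interval=0.5,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate at a constant interval until it holds or timeout expires."""
    deadline = clock() + timeout
    while clock() < deadline:
        if predicate():
            return True
        sleep(interval)  # constant sleep, so the total wait is bounded by timeout
    return False
```

`old_total_sleep()` returns 465 seconds, i.e. 7.75 minutes, which is the hidden timeout the message refers to.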

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210601150631.1037158-1-nyh@scylladb.com>
2021-06-06 09:21:23 +03:00
Avi Kivity
100d6f4094 build: enable -Wunused-function
Also drop a single violation in transport/server.cc. This helps
prevent dead code from piling up.

Three functions in row_cache_test that are not used in debug mode
are moved near their user, and under the same ifdef, to avoid triggering
the error.

Closes #8767
2021-06-06 09:21:23 +03:00
Alejo Sanchez
3e91a8ca0d raft: replication test: wait for log for both index and term
Waiting on index alone does not guarantee correct leader log
propagation. This patch also checks the term of the leader's last
log entry.

This was exposed with occasional problems with packet drops.
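
Why the index alone is not enough can be shown with a small sketch. Logs are modeled here as lists of (term, command) tuples, which is an assumption for illustration and not the test's actual data structure.

```python
def caught_up_by_index(follower_log, leader_log):
    """Index-only check: the follower has at least as many entries."""
    return len(follower_log) >= len(leader_log)


def caught_up_by_index_and_term(follower_log, leader_log):
    """Also require the follower's entry at the leader's last index to match its term."""
    if not leader_log:
        return True
    if len(follower_log) < len(leader_log):
        return False
    last = len(leader_log) - 1
    return follower_log[last][0] == leader_log[last][0]


leader = [(1, "a"), (2, "b")]
stale_follower = [(1, "a"), (1, "x")]  # same length, but a stale entry from term 1
```

The stale follower passes the index-only check but fails the index-and-term check, which is the kind of case that lost packets can produce.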

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-04 08:38:19 -04:00
Alejo Sanchez
545893145e raft: replication test: reset network at construction
Reset network in constructor, not in unrelated function.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-04 08:18:32 -04:00
Alejo Sanchez
294dcfb204 raft: replication test: use lambda visitor for updates
Process updates with a lambda visitor.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-04 08:18:31 -04:00
Nadav Har'El
0bb2e010f5 test/alternator: rewrite test/alternator/run script in Python
We already wrote the test/cql-pytest/run script in Python in a way
it can be reusable for the other test/*/run scripts.

So this patch replaces the test/alternator/run shell script with Python
code which does the same thing (safely runs Scylla with Alternator and
pytest on it in a temporary directory and IP address), but sharing most
of the code that cql-pytest uses.

The benefit of reusing the test/cql-pytest/run.py library goes beyond
shorter code - the main benefit will be that we can't forget to fix one
of the test/*/run scripts (e.g., add more command line options or fix a
bug) when fixing another one.

To make the test/cql-pytest/run.py library reusable for running
Alternator, I needed to generalize a few things in this patch (e.g.,
the way we check and wait for Scylla to boot with the different APIs we
intend to check). There is also one bug-fix on how interrupts are
handled (they are now better guaranteed to kill pytest) - and now fixing
this bug benefits all runners using run.py (cql-pytest/run,
cql-pytest/run-cassandra and alternator/run).

In the future, we can port the runners which are still duplicate shell
scripts - test/redis/run and test/scylla-gdb/run - to Python in a
similar manner to what we did here for test/alternator/run.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2021-06-03 11:23:00 +03:00
Nadav Har'El
ef45fccdae test/cql-pytest: make test run code more general
Change the cql-pytest-specific run_cql_pytest() function to a more
general function to run pytest in any directory. Will be useful for
reusing the same code for other test runners (e.g., Alternator), and
is also clearer.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2021-06-03 11:22:36 +03:00
Alejo Sanchez
a3fc974de9 raft: replication test: move structs into class
Move auxiliary classes connection and hash_connection out of
raft_cluster and into connected class.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:03 -04:00
Alejo Sanchez
5b688d42d7 raft: replication test: move data structures to cluster class
Move state_machine, persistence, connection, hash_connection, connected,
failure_detector, and rpc inside raft_cluster.

This commit moves declaration of class raft_cluster up.
(Minimize changed lines)
Moves apply_fn definition from state_machine to raft_cluster.
Fixes namespace in declarations
Keeps static rpc::net outside for now to keep this commit simple.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:03 -04:00
Alejo Sanchez
1250d910ee raft: replication test: remove shared pointers
Following gleb, tomek, and kamil's suggestion, remove unnecessary use of
lw_shared_ptr.

This also solves the problem of constructing a lw_shared_ptr from a
forward declaration (connected) in a subsequent patch.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:03 -04:00
Alejo Sanchez
aa1200ee50 raft: replication test: move get_states() to raft_cluster
Move get_states() helper inside raft cluster.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:03 -04:00
Alejo Sanchez
740545cdc5 raft: replication test: test_server inside raft_cluster
Since there are no more external users of test_server, move it to
raft_cluster and remove member access operator.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:03 -04:00
Alejo Sanchez
1ee4408869 raft: replication test: rpc declarative tests
Convert rpc replication tests to declarative form.

This will enable moving remaining parts inside raft_cluster.

For test stability, add support for checking rpc config of a node
eventually changes to the expected configuration.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:03 -04:00
Alejo Sanchez
f11ae18158 raft: replication test: add wait_log
Allow test cases to specify waiting for log for one or more servers.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:03 -04:00
Alejo Sanchez
fa84b15909 raft: replication test: add stop and reset server
Add stop and reset server support.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:03 -04:00
Alejo Sanchez
19d28e7e0f raft: replication test: disconnect 2 support
Support custom disconnection of 2 servers.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:03 -04:00
Alejo Sanchez
e2612e5327 raft: replication test: explicit node_id naming
Use node_id{x} for more expressive naming in tests.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:03 -04:00
Alejo Sanchez
bdfdd2da0b raft: replication test: move definitions up
Move definitions up for next patch.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:03 -04:00
Alejo Sanchez
14bd29f974 raft: replication test: no append entries support
Handle test cases not appending entries.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:03 -04:00
Alejo Sanchez
a73db881cb raft: replication test: fix helper parameter
Use vector instead of initializer_list for function helper parameter.
This is not a constructor and it complicates usage.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:03 -04:00
Alejo Sanchez
8468059d0e raft: replication test: stop servers out of config
As requested by @gleb-cloudious, stop servers taken out of
configuration.

Adjust other parts of code relying on all servers being active.

Remove temporary stop on rpc server.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:03 -04:00
Alejo Sanchez
51343d4de7 raft: replication test: wait log when removing leader from configuration
If leader is removed from configuration wait log first.

Remove wait_log_all for every case as it was too broad a fix.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:03 -04:00
Alejo Sanchez
e032d8446f raft: replication test: only manipulate servers in configuration
Only start/stop, init/start/rearm tickers, wait log, elapse_election,
run free election, check for leader, and verify servers in current
configuration.

This is necessary for having servers out of configuration not
present/stopped.

Temporarily stop a server in rpc test until we truly stop servers out of
configuration in next commit.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:03 -04:00
Alejo Sanchez
ec078ca55f raft: replication test: only cancel rearm ticker for removed server
When changing configuration, don't pause and restart all tickers.
Only do it for the specific server(s) being removed.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:03 -04:00
Alejo Sanchez
802f68317e raft: replication test: only pause restart tickers in config
Only pause and restart tickers for servers in configuration.

Currently when a server is taken out it's reset and a new one is set up,
but out of configuration. @gleb-cloudious requested to have fully
stopped servers when out of configuration, until they are re-added.
This change is needed to allow that or else restart would arm tickers on
servers no longer present.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:03 -04:00
Alejo Sanchez
85f299e39b raft: replication test: simplify calls to helpers
Pass test update directly to helpers.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:03 -04:00
Alejo Sanchez
27f50b3589 raft: replication test: persisted snapshots in raft_cluster
Move persisted snapshots inside raft_cluster, de-cluttering code.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:03 -04:00
Alejo Sanchez
5b601c133b raft: replication test: verify in raft_cluster
Do verifications in raft_cluster::verify().

This will enable having persisted snapshots inside the class and
de-clutter caller code.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:03 -04:00
Alejo Sanchez
ce6746b888 raft: replication test: connected inside raft_cluster
Keep connected inside raft_cluster.

Helpers are already provided to handle connectivity.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:02 -04:00
Alejo Sanchez
b41ce7084b raft: replication test: snapshots inside raft_cluster
Keep snapshots inside raft_cluster, removing this need outside.

If this is needed later, a const getter can be implemented.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:02 -04:00
Alejo Sanchez
4c2f8d84c5 raft: replication test: remove obsolete param
Since create_server() is in raft_cluster, there's no need for
change_configuration() to pass total values anymore. Remove it.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:02 -04:00
Alejo Sanchez
e9df914692 raft: replication test: elect_new_leader wait log and pause
Do wait_log() for the next leader always in elect_new_leader.

Only wait log for new leader if it's connected to the old leader.

Pause and restart tickers when creating a candidate to prevent another
node from becoming a dueling candidate.

Remove pause and restart tickers around calls to elect_new_leader.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:02 -04:00
Alejo Sanchez
52188016af raft: replication test: create_server in raft_cluster
Remove the global create_raft_server() and replace with a
create_server() helper in replication_test().

This will allow not requiring the user of raft_cluster to create special
objects.

Note this does not move(apply) anymore as it's kept in raft_cluster.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:47:02 -04:00
Alejo Sanchez
1edcb6e647 raft: replication test: reset snapshots
When stopping a server also delete snapshots and persisted snapshots.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 23:46:11 -04:00
Alejo Sanchez
453f19cf0e raft: replication test: reset server helper
Add a helper to reset a server in raft_cluster.

Besides simplifying code and preventing errors, this will help move
create_raft_server logic to raft_cluster.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 21:50:20 -04:00
Alejo Sanchez
d3b7f21b88 raft: replication test: pause tickers before stopping
Pause tickers before stopping servers.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-06-01 21:50:20 -04:00