Commit Graph

1357 Commits

Author SHA1 Message Date
Dejan Mircevski
8db24fc03b cql3/expr: Handle IN ? bound to null
Previously, we crashed when the IN marker is bound to null.  Throw
invalid_request_exception instead.

Fixes #8265

Tests: unit (dev)

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>

Closes #8287
2021-03-17 09:59:22 +02:00
Avi Kivity
972ea9900c Merge 'commitlog: Make pre-allocation drop O_DSYNC while pre-filling' from Calle Wilund
Refs #7794

Iff we need to pre-fill segment file ni O_DSYNC mode, we should
drop this for the pre-fill, to avoid issuing flushes until the file
is filled. Done by temporarily closing, re-opening in "normal" mode,
filling, then re-opening.

Closes #8250

* github.com:scylladb/scylla:
  commitlog: Make pre-allocation drop O_DSYNC while pre-filling
  commitlog: coroutinize allocate_segment_ex
2021-03-17 09:59:22 +02:00
Tomasz Grabiec
40121621f6 Merge "Kill some get_local_migration_manager() calls" from Pavel Emelyanov
There are a bunch of such calls in schema altering statements and
there's currently no way to obtain the migration manager for such
statements, so a relatively big rework needed.

The solution in this set is -- all statements' execute() methods are
called with query processor as first argument (now the storage proxy
is there), query processor references and provides migration manager
for statements. Those statements that need proxy can already get it
from the query processor.

Afterwards table_helper and thrift code can also stop using the global
migration manager instance, since they both have query processor in
needed places. While patching them a couple of calls to global storage
proxy also go away.

The new query processor -> migration manager dependency fits into
current start-stop sequence: the migration manager is started early,
the query processor is started after it. On stop the query processor
remains alive, but the migration manager stops. But since no code
currently (should) call get_local_migration_manager() it will _not_
call the query_processor::get_migration_manager() either, so this
dangling reference is ugly, but safe.

Another option could be to make storage proxy reference migration
manager, but this dependency doesn't look correct -- migration manager
is higher-level service than the storage proxy is, it is migration
manager who currently calls storage proxy, but not the vice versa.

* xemul/br-kill-some-migration-managers-2:
  cql3: Get database directly from query processor
  thrift: Use query_processor::get_migration_manager()
  table_helper: Use query_processor::get_migration_manager()
  cql3: Use query_processor::get_migration_manager() (lambda captures cases)
  cql3: Use query_processor::get_migration_manager() (alter_type statement)
  cql3: Use query_processor::get_migration_manager() (trivial cases)
  query_processor: Keep migration manager onboard
  cql3: Pass query processor to announce_migration:s
  cql3: Switch to qp (almost) in schema-altering-stmt
  cql3: Change execute()'s 1st arg to query_processor
2021-03-17 09:59:22 +02:00
Pavel Solodovnikov
93c565a1bf raft: allow raft server to start with initial term 0
Prior to the fix there was an assert to check in
`raft::server_impl::start` that the initial term is not 0.

This restriction is completely artificial and can be lifted
without any problems, which will be described below.

The only place that is dependent on this corner case is in
`server_impl::io_fiber`. Whenever term or vote has changed,
they will be both set in `fsm::get_output`. `io_fiber` checks
whether it needs to persist term and vote by validating that
the term field is set (by actually executing a `term != 0`
condition).

This particular check is based on an unobvious fact that the
term will never be 0 in case `fsm::get_output` saves
term and vote values, indicating that they need to be
persisted.

Vote and term can change independently of each other, so that
checking only for term obscures what is happening and why
even more.

In either case term will never be 0, because:

1. If the term has changed, then it's naturally greater than 0,
   since it's a monotonically increasing value.
2. If the vote has changed, it means that we received
   a vote request message. In such case we have already updated
   our term to the requester's term.

Switch to using an explicit optional in `fsm_output` so that
a reader don't have to think about the motivation behind this `if`
and just checks that `term_and_vote` optional is engaged.

Given the motivation described above, the corresponding

    assert(_fsm->get_current_term() != term_t(0));

in `server_impl::start` is removed.

Tests: unit(dev)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2021-03-17 09:59:21 +02:00
Nadav Har'El
e344f74858 Merge 'logalloc: improve background reclaim shares management' from Avi Kivity
The log structured allocator's background reclaimer tries to
allocate CPU power proportional to memory demand, but a
bug made that not happen. Fix the bug, add some logging,
and future-proof the timer. Also, harden the test against
overcommitted test machines.

Fixes #8234.

Test: logalloc_test(dev), 20 concurrent runs on 2 cores (1 hyperthread each)

Closes #8281

* github.com:scylladb/scylla:
  test: logalloc_test: harden background reclain test against cpu overcommit
  logalloc: background reclaim: use default scheduling group for adjusting shares
  logalloc: background reclaim: log shares adjustment under trace level
  logalloc: background reclaim: fix shares not updated by periodic timer
2021-03-17 09:59:21 +02:00
Pavel Emelyanov
1de235f4da query_processor: Keep migration manager onboard
The query processor sits upper than the migration manager,
in the services layering, it's started after and (will be)
stopped before the migration manager.

The migration manager is needed in schema altering statements
which are called with query processor argument. They will
later get the migration manager from the query processor.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-03-15 19:00:58 +03:00
Avi Kivity
65fea203d2 test: logalloc_test: harden background reclain test against cpu overcommit
Use thread CPU time instead of real time to avoid an overcommitted
machine from not being able to supply enough CPU for the test.
2021-03-15 13:54:49 +02:00
Alejo Sanchez
88063b6e3e raft: tests: move common helpers to header
Move common test helper functions and data structures to a common
helpers.hh header.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-03-15 06:16:58 -04:00
Alejo Sanchez
6139ad6337 raft: tests: move boost tests to tests/raft
Move raft boost tests to test/raft directory.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-03-15 06:16:58 -04:00
Calle Wilund
48ca01c3ab commitlog: Make pre-allocation drop O_DSYNC while pre-filling
Refs #7794

Iff we need to pre-fill segment file ni O_DSYNC mode, we should
drop this for the pre-fill, to avoid issuing flushes until the file
is filled. Done by temporarily closing, re-opening in "normal" mode,
filling, then re-opening.

v2:
* More comment
v3:
* Add missing flush
v4:
* comment
v5:
* Split coroutine and fix into separate patches
2021-03-15 09:35:45 +00:00
Tomasz Grabiec
f2ecb4617e Merge "raft: implement prevoting stage in leader election" from Gleb
This is how PhD explain the need for prevoting stage:

  One downside of Raft's leader election algorithm is that a server that
  has been partitioned from the cluster is likely to cause a disruption
  when it regains connectivity. When a server is partitioned, it will
  not receive heartbeats. It will soon increment its term to start
  an election, although it won't be able to collect enough votes to
  become leader. When the server regains connectivity sometime later, its
  larger term number will propagate to the rest of the cluster (either
  through the server's RequestVote requests or through its AppendEntries
  response). This will force the cluster leader to step down, and a new
  election will have to take place to select a new leader.

  Prevoting stage is addressing that. In the Prevote algorithm, a
  candidate only increments its term if it first learns from a majority of
  the cluster that they would be willing to grant the candidate their votes
  (if the candidate's log is sufficiently up-to-date, and the voters have
  not received heartbeats from a valid leader for at least a baseline
  election timeout).

  The Prevote algorithm solves the issue of a partitioned server disrupting
  the cluster when it rejoins. While a server is partitioned, it won't
  be able to increment its term, since it can't receive permission
  from a majority of the cluster. Then, when it rejoins the cluster, it
  still won't be able to increment its term, since the other servers
  will have been receiving regular heartbeats from the leader. Once the
  server receives a heartbeat from the leader itself, it will return to
  the follower state(in the same term).

In our implementation we have "stable leader" extension that prevents
spurious RequestVote to dispose an active leader, but AppendEntries with
higher term will still do that, so prevoting extension is also required.

* scylla-dev/raft-prevote-v5:
  raft: store leader and candidate state in state variant
  raft: add boost tests for prevoting
  raft: implement prevoting stage in leader election
  raft: reset the leader on entering candidate state
  raft: use modern unordered_set::contains instead of find in become_candidate
2021-03-12 11:15:51 +01:00
Gleb Natapov
e231186a7b raft: store leader and candidate state in state variant
We already have server state dependant state in fsm, so there is no need
to maintain "voters" and "tracker" optionals as well. The upside is that
optional and variant sates cannot drift apart now.
2021-03-12 11:12:57 +02:00
Gleb Natapov
e17e7d57bd raft: add boost tests for prevoting 2021-03-12 11:12:57 +02:00
Avi Kivity
486f6bf29c Merge "sstables: move format specific reader code to kl/, mx/" from Botond
"
Currently the sstable reader code is scattered across several source
files as following (paths are relative to sstables/):
* partition.cc - generic reader code;
* row.hh - format specific code related to building mutation fragments
  from cells;
* mp_row_consumer.hh - format specific code related to parsing the raw
  byte stream;

This is a strange organization scheme given that the generic sstable
reader is a template and as such it doesn't itself depend on the other
headers where the consumer and context implementations live. Yet these
are all included in partition.cc just so the reader factory function can
instantiate the sstable reader template with the format specific
objects.

This patchset reorganizes this code such that the generic sstable reader
is exposed in a header. Furthermore, format specific code is moved to
the kl/ and mx/ directories respectively. Each directory has a
reader.hh with a single factory function which creates the reader, all
the format specific code is hidden from sight. The added benefit is that
now reader code specific to a format is centralized in the format
specific folder, just like the writer code.

This patchset only moves code around, no logical changes are made.

Tests: unit(dev)
"

* 'sstable-reader-separation/v1' of https://github.com/denesb/scylla:
  sstables: get rid of mp_row_consumer.{hh,cc}
  sstables: get rid of row.hh
  sstables/mp_row_consumer.hh: remove unused struct new_mutation
  sstables: move mx specific context and consumer to mx/reader.cc
  sstables: move kl specific context and consumer to kl/reader.cc
  sstables: mv partition.cc sstable_mutation_reader.hh
2021-03-11 16:57:54 +02:00
Botond Dénes
3ba782bddd sstables: get rid of row.hh
Move stuff contained therein to `sstable_mutation_reader.{hh,cc}` which
will serve as the collection point of utility stuff needed by all reader
implementations.
2021-03-11 12:17:13 +02:00
Botond Dénes
4e3ae9d913 sstables: move kl specific context and consumer to kl/reader.cc
Move all the kl format specific context and consumer code to
kl/reader* and add a factory function `kl::make_reader()` which takes
over the job of instantiating the `sstable_mutation_reader` with the kl
specific context and consumer. Code which is used by test is moved to
kl/reader_impl.hh, while code that can be hidden us moved to
kl/reader.cc. Users who just want to create a reader only have to
include kl/reader.hh.
2021-03-11 12:17:13 +02:00
Avi Kivity
c8f692e526 Merge 'cql3: Rewrite get_clustering_bounds() using expressions' from Dejan Mircevski
Instead of using the `restrictions` class hierarchy, calculate the clustering slice using the `expr::expression` representation of the WHERE clause.  This will allow us to eventually drop the `restrictions` hierarchy altogether.

Tests: unit (dev, debug)

Closes #8227

* github.com:scylladb/scylla:
  cql3: Make get_clustering_bounds() use expressions
  cql3/expr: Add is_multi_column()
  cql3/expr: Add more operators to needs_filtering
  cql3: Replace CK-bound mode with comparison_order
  cql3/expr: Make to_range globally visible
  cql3: Gather slice-defining WHERE expressions
  cql3: Add statement_restrictions::_where
  test: Add unit tests for get_clustering_bounds
2021-03-11 11:46:52 +02:00
Dejan Mircevski
990de02d28 cql3: Make get_clustering_bounds() use expressions
Use expressions instead of _clustering_columns_restrictions.  This is
a step towards replacing the entire restrictions class hierarchy with
expressions.

Update some expected results in unit tests to reflect the new code.
These new results are equivalent to the old ones in how
storage_proxy::query() will process them (details:
bound_view::from_range() returns the same result for an empty-prefix
singular as for (-inf,+inf)).

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2021-03-10 21:25:43 -05:00
Dejan Mircevski
2525759027 test: Add unit tests for get_clustering_bounds
... as guardrails for the upcoming rewrite.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2021-03-10 21:17:26 -05:00
Benny Halevy
ff5b42a0fa bytes_ostream: max_chunk_size: account for chunk header
Currently, if the data_size is greater than
max_chunk_size - sizeof(chunk), we end up
allocating up to max_chunk_size + sizeof(chunk) bytes,
exceeding buf.max_chunk_size().

This may lead to allocation failures, as seen in
https://github.com/scylladb/scylla/issues/7950,
where we couldn't allocate 131088 (= 128K + 16) bytes.

This change adjusted the expose max_chunk_size()
to be max_alloc_size (128KB) - sizeof(chunk)
so that the allocated chunks would normally be allocated
in 128KB chunks in the write() path.

Added a unit test - test_large_placeholder that
stresses the chunk allocation path from the
write_place_holder(size) entry point to make
sure it handles large chunk allocations correctly.

Refs #7950
Refs #8081

Test: unit(release), bytes_ostream_test(debug)
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210303143413.902968-1-bhalevy@scylladb.com>
2021-03-10 19:54:12 +02:00
Avi Kivity
5342d79461 Merge "Preparatory work in sstable_set for the upcoming compound_sstable_set_impl" from Raphael
* 'preparatory_work_for_compound_set' of github.com:raphaelsc/scylla:
  sstable_set: move all() implementation into sstable_set_impl
  sstable_set: preparatory work to change sstable_set::all() api
  sstables: remove bag_sstable_set
2021-03-10 19:19:26 +02:00
Botond Dénes
cf28552357 mutation_test: test_mutation_diff_with_random_generator: compact input mutations
This test checks that `mutation_partition::difference()` works correctly.
One of the checks it does is: m1 + m2 == m1 + (m2 - m1).
If the two mutations are identical but have compactable data, e.g. a
shadowable tombstone shadowed by a row marker, the apply will collapse
these, causing the above equality check to fail (as m2 - m1 is null).
To prevent this, compact the two input mutations.

Fixes: #8221
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20210310141118.212538-1-bdenes@scylladb.com>
2021-03-10 16:28:14 +01:00
Raphael S. Carvalho
05b07c7161 sstable_set: preparatory work to change sstable_set::all() api
users of sstable_set::all() rely on the set itself keeping a reference
to the returned list, so user can iterate through the list assuming
that it is alive all the way through.

this will change in the future though, because there will be a
compound set impl which will have to merge the all() of multiple
managed sets, and the result is a temporary value.

so even range-based loops on all() have to keep a ref to the returned
list, to avoid the list from being prematurely destroyed.

so the following code
	for (auto& sst : *sstable_set.all()) { ...}
becomes
	for (auto sstables = sstable_set.all(); auto& sst : *sstables) { ... }

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2021-03-10 12:02:12 -03:00
Avi Kivity
746798fd56 Merge "sstables: get rid of data_consume_context" from Botond
"
This class is basically a wrapper around a unique pointer and a few
short convenience methods, but is otherwise a distraction in trying to
untangle the maze that is the sstable reader class hierachy.
So this patchset folds it into its only real user: the sstable reader.
"

* 'data_consume_context_bye' of https://github.com/denesb/scylla:
  sstable: move data_consume_* factory methods to row.hh
  sstables: fold data_consume_context: into its users
  sstables: partition.cc: remove data_consume_* forward declarations
2021-03-10 16:45:32 +02:00
Botond Dénes
1aa2424dcf sstable: move data_consume_* factory methods to row.hh 2021-03-10 15:40:50 +02:00
Botond Dénes
a06465a8f3 sstables: fold data_consume_context: into its users
`data_consume_context` is a thin wrapper over the real context object
and it does little more than forward method calls to it. The few
methods doing more then mere forwarding can be folded into its single
real user: `sstable_reader`.
2021-03-10 15:38:58 +02:00
Nadav Har'El
f41dac2a3a alternator: avoid large contiguous allocation for request body
Alternator request sizes can be up to 16 MB, but the current implementation
had the Seastar HTTP server read the entire request as a contiguous string,
and then processed it. We can't avoid reading the entire request up-front -
we want to verify its integrity before doing any additional processing on it.
But there is no reason why the entire request needs to be stored in one big
*contiguous* allocation. This always a bad idea. We should use a non-
contiguous buffer, and that's the goal of this patch.

We use a new Seastar HTTPD feature where we can ask for an input stream,
instead of a string, for the request's body. We then begin the request
handling by reading lthe content of this stream into a
vector<temporary_buffer<char>> (which we alias "chunked_content"). We then
use this non-contiguous buffer to verify the request's signature and
if successful - parse the request JSON and finally execute it.

Beyond avoiding contiguous allocations, another benefit of this patch is
that while parsing a long request composed of chunks, we free each chunk
as soon as its parsing completed. This reduces the peak amount of memory
used by the query - we no longer need to store both unparsed and parsed
versions of the request at the same time.

Although we already had tests with requests of different lengths, most
of them were short enough to only have one chunk, and only a few had
2 or 3 chunks. So we also add a test which makes a much longer request
(a BatchWriteItem with large items), which in my experiment had 17 chunks.
The goal of this test is to verify that the new signature and JSON parsing
code which needs to cross chunk boundaries work as expected.

Fixes #7213.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210309222525.1628234-1-nyh@scylladb.com>
2021-03-10 09:22:34 +01:00
Tomasz Grabiec
c9c2beabc0 Merge "raft: replication tests as individual boost tests" from Alejo
* alejo/raft-tests-replication-boost-5:
  raft: replication test: use Seastar random generator
  raft: replication test: rename drop_replication
  raft: replication test: change to Boost test
  raft: replication test: id helper functions
  raft: replication test: improve handling connectivity
  raft: replication test: parametrize snapshots
  raft: replication test: parametrize drop_replication
  raft: replication test: remove unused configuration
  raft: replication test: add license
2021-03-09 17:58:59 +01:00
Pavel Emelyanov
096e452db9 test: Fix exit condition of row_cache_test::test_eviction_from_invalidated
The test populates the cache, then invalidates it, then tries to push
huge (10x times the segment size) chunks into seastar memory hoping that
the invalid entries will be evicted. The exit condition on the last
stage is -- total memory of the region (sum of both -- used and free)
becomes less than the size of one chunk.

However, the condition is wrong, because cache usually contains a dummy
entry that's not necessarily on lru and on some test iteration it may
happen that

  evictable size < chunk size < evictable size + dummy size

In this case test fails with bad_alloc being unable to evict the memory
from under the dummy.

fixes: #7959
tests: unit(row_cache_test), unit(the failing case with the triggering
       seed from the issue + 200 times more with random seeds)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20210309134138.28099-1-xemul@scylladb.com>
2021-03-09 17:57:52 +01:00
Alejo Sanchez
f67b85e2b3 raft: replication test: use Seastar random generator
Use the random generator provided by Seastar test suite.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-03-09 12:52:07 -04:00
Alejo Sanchez
1bf10a87c6 raft: replication test: rename drop_replication
Rename drop_replication to packet_drops for readability.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-03-09 12:52:07 -04:00
Alejo Sanchez
6e193ee3bf raft: replication test: change to Boost test
Change test/raft directory to Boost test type.

Run replication_test cases with their own test.

RAFT_TEST_CASE macro creates 2 test cases, one with random 20% packet
loss named name_drops.

The directory test/raft is changed to host Boost tests instead of unit.

While there improve the documentation.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-03-09 12:52:07 -04:00
Alejo Sanchez
8d9c797954 raft: replication test: id helper functions
In raft the UUID 0 is a special case so server ids start at 1.
Add two helper functions. Convert local 0-based id to raft 1-based
UUID. And from UUID to raft_id.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-03-09 12:50:12 -04:00
Alejo Sanchez
0ffa450222 raft: replication test: improve handling connectivity
Change global map of disconnected servers to a more intuitive class
connected. The class is callable for the most common case
connected(id).

Methods connect(), disconnect(), and all() are provided for readability
instead of directly calling map methods (insert, erase, clear). They
also support both numerical (0 based) and server_id (UUID, 1 based) ids.

The actual shared map is kept in a lw_shared_ptr.

The class is passed around to be copy-constructed which is practically
just creating a new lw_shared_ptr.

Internally it tracks disconnected servers but externally it's more
intuitive to use connect instead of disconnect. So it reads
"connected id" and "not disconnected id", without double negatives.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-03-09 12:39:29 -04:00
Alejo Sanchez
7a644f37d3 raft: replication test: parametrize snapshots
Snapshots and persisted snapshots created per test instead of globals.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-03-09 11:58:20 -04:00
Alejo Sanchez
f72e89fcfe raft: replication test: parametrize drop_replication
Pass drop_replication down instead of keeping it global.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-03-09 11:58:20 -04:00
Alejo Sanchez
5a03670f91 raft: replication test: remove unused configuration
Remove test case configuration as it's not implemented yet.

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-03-09 11:58:20 -04:00
Alejo Sanchez
efc6681cd6 raft: replication test: add license
Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2021-03-09 11:58:20 -04:00
Gleb Natapov
2a41ad0b57 raft: add testing for non-voting members
Add tests to check if quorum (for leader election and commit index
purposes) is calculated correctly in the presence of non-voting members.
Message-Id: <20210304101158.1237480-3-gleb@scylladb.com>
2021-03-09 13:51:09 +01:00
Gleb Natapov
dd6ba3d507 raft: add non-voting member support
This patch adds a support for non-voting members. Non voting member is a
member which vote is not counted for leader election purposes and commit
index calculation purposes and it cannot become a leader. But otherwise
it is a normal raft node. The state is needed to let new nodes to catch
up their log without disturbing a cluster.

All kind of transitions are allowed. A node may be added as a voting member
directly or it may be added as non-voting and then changed to be voting
one through additional configuration change. A node can be demoted from
voting to non-voting member through a configuration change as well.
Message-Id: <20210304101158.1237480-2-gleb@scylladb.com>
2021-03-09 13:47:48 +01:00
Nadav Har'El
28804a50f7 alternator-test: test that index can't be a name reference (#xyz)
We already have a test which shows verify DynamoDB and Alternator
do not allow an index in an attribute path - like a[0].b - to be
a value reference - a[:xyz].b. We forgot to verify that the index
also can't be a name reference - a[#xyz].b is a syntax error. So here
we add a test which confirms that this is indeed the case - DynamoDB
doesn't allow it, and neither does Alternator.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210219123310.1240271-1-nyh@scylladb.com>
2021-03-08 10:17:19 +01:00
Avi Kivity
1287a5e1d0 test: index_reader_assertions: fix misuse of trichotomic comparator in has_monotonic_positions
has_monotonic_positions() wants to check for a greater-than-or-equal-to
relation, but actually tests for not-equal, since it treats a
trichotomic comparator as a less-than comparator. This is clearly seen
in the BOOST_FAIL message just below.

Fix by aligning the test with the intended invariant. Luckily, the tests
still pass.

Ref #1449.

Closes #8222
2021-03-07 13:44:37 +02:00
Nadav Har'El
acfa180766 cql-pytest: recognize when Scylla crashes
Before this patch, if Scylla crashes during some test in cql-pytest, all
tests after it will fail because they can't connect to Scylla - and we can
get a report on hundreds of failures without a clear sign of where the real
problem was.

This patch introduces an autouse fixture (i.e., a fixture automatically
used by every test) which tries to run a do-nothing CQL command after each
test.  If this CQL command fails, we conclude that Scylla crashed and
report the test in which this happened - and exist pytest instead of failing
a hundred more tests.

Fixes #8080

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210304132804.1527977-1-nyh@scylladb.com>
2021-03-04 16:00:00 +02:00
Tomasz Grabiec
3cb01f218f Merge "raft: add unit tests for log, tracker, votes and fix found bugs" from Kostja
Test log consistency after apply_snapshot() is called.
Ensure log::last_term() log::last_conf_index() and log::size()
work as expected.

Misc cleanups.

* scylla-dev.git/raft-confchange-test-v4:
  raft: fix spelling
  raft: add a unit test for voting
  raft: do not account for the same vote twice
  raft: remove fsm::set_configuration()
  raft: consistently use configuration from the log
  raft: add ostream serialization for enum vote_result
  raft: advance commit index right after leaving joint configuration
  raft: add tracker test
  raft: tidy up follower_progress API
  raft: update raft::log::apply_snapshot() assert
  raft: add a unit test for raft::log
  raft: rename log::non_snapshoted_length() to log::in_memory_size()
  raft: inline raft::log::truncate_tail()
  raft: ignore AppendEntries RPC with a very old term
  raft: remove log::start_idx()
  raft: return a correct last term on an empty log
  raft: do not use raft::log::start_idx() outside raft::log()
  raft: rename progress.hh to tracker.hh
  raft: extend single_node_is_quiet test
2021-03-03 16:29:40 +01:00
Tomasz Grabiec
0dc57db248 Revert "Merge "raft: add unit tests for log, tracker, votes and fix found bugs" from Kostja"
This reverts commit f94f70cda8, reversing
changes made to 5206a97915.

Not the latest version of the series was merged. Rvert prior to
merging the latest one.
2021-03-03 16:29:02 +01:00
Nadav Har'El
4e3db5297a cql-pytest: rework tests for filtering leaving out most rows
Previously, we had two tests demonstrating issue #7966. But since then,
our understanding of this issue has improved which resulted in issue #8203,
so this patch improves those tests and makes them reproduce the new issue.

Importantly, we now know that this problem is not specific to a full-table
scan, and also happens in a single-partition scan, so we fix the test to
demonstrate this (instead of the old test, which missed the problem so
the test passed).

Both tests pass on Cassandra, and fail on Scylla.

Refs #8203.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210302224020.1498868-1-nyh@scylladb.com>
2021-03-03 11:22:08 +01:00
Nadav Har'El
0fea089b37 Merge 'Fix reading whole requests during shedding' from Piotr Sarna
When shedding requests (e.g. due to their size or number exceeding the
limits), errors were returned right after parsing their headers, which
resulted in their bodies lingering in the socket. The server always
expects a correct request header when reading from the socket after the
processing of a single request is finished, so shedding the requests
should also take care of draining their bodies from the socket.

Fixes #8193

Closes #8194

* github.com:scylladb/scylla:
  cql-pytest: add a shedding test
  transport: return error on correct stream during size shedding
  transport: return error on correct stream during shedding
  transport: skip the whole request if it is too large
  transport: skip the whole request during shedding
2021-03-03 08:52:48 +02:00
Piotr Sarna
4499f89916 cql-pytest: add a shedding test
This scylla-only test case tries to push a too-large request
to Scylla, and then retries with a smaller request, expecting
a success this time.

Refs #8193
2021-03-03 07:08:55 +01:00
Nadav Har'El
d6335b7fda test/alternator: better tests of oversized requests
Like DynamoDB, Alternator rejects requests larger than some fixed maximum
size (16MB). We had a test for this feature - test_too_large_request,
but it was too blunt, and missed two issues:

Refs #8195
Refs #8196

So this patch adds two better tests that reproduce these two issues:

First, test_too_large_request_chunked verifies that an oversized request
is detected even if the body is sent with chunked encoding.

Second, both tests - test_too_large_request_chunked and
test_too_large_request_content_length - verify that the rather limited
(and arguably buggy) Python HTTP client is able to read the 413 status
code - and doesn't report some generic I/O error.

Both tests pass on DynamoDB, but fail on Alternator because of these two
open issues.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210302154555.1488812-1-nyh@scylladb.com>
2021-03-03 07:06:45 +01:00
Nadav Har'El
c6ca1ec643 cql-pytest: add reproducers for two filtering-related issues
The main goal of this patch is to add a reproducer for issue #7966, where
partition-range scan with filtering that begins with a long string of
non-matches aborts the query prematurely - but the same thing is fine with
a single-partition scan. The test, test_filtering_with_few_matches, is
marked as "xfail" because it still fails on Scylla. It passes on Cassandra.

I put a lot of effort into making this reproducer *fast* - the dev-build
test takes 0.4 seconds on my laptop. Earlier reproducers for the same
problem took as much as 30 seconds, but 0.4 seconds turns this test into
a viable regression test.

We also add a test, test_filter_on_unset, reproduces issue #6295 (or
the duplicate #8122), which was already solved so this test passes.

Refs #6295
Refs #7966
Refs #8122

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210301170451.1470824-1-nyh@scylladb.com>
2021-03-03 07:06:45 +01:00