"
This patchset fixes the bug in SSTables 3.x parser that did not properly
handle deleted counter cells.
A write test is enriched to validate read after write so that this case
is covered.
Tests: unit {release}
"
* 'projects/sstables-30/fix-deleted-counters-read/v1' of https://github.com/argenet/scylla:
tests: Read after write in test_write_counter_table.
sstables: Fix deleted counter cells processing in SSTables 3.x parser.
When a query with multicolumn inequality is issued on clustering columns
having mixed order (ASC and DESC together), if the ranges are not
broken to none overlapping lexicographically monotonic ones, the node
return incorrect rows. This is due to the search nature
(prefix comparison). The solution is to break the range imposed
by the restriction into several single column restrictions OR-ed
together that will be logically equivalent and preserve the
monotonicity assumption. This commit also fixes incorrect results
returned by a multicolumn query on an all descending columns.
A unit test have been added to account for both issues fixed.
Fixes#2050
Tests: Unit test, manual tests of the use case in the issue.
Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
Message-Id: <3b96620a3bd8b0614359a3b0757f324d45189dbb.1536478193.git.eliransin@scylladb.com>
scripts/create-relocatable-package.py:24:1: F401 'shutil' imported but unused
scripts/create-relocatable-package.py:24:1: F401 'tempfile' imported but unused
scripts/create-relocatable-package.py:24:16: E401 multiple imports on one line
scripts/create-relocatable-package.py:26:1: E302 expected 2 blank lines, found 1
scripts/create-relocatable-package.py:47:1: E305 expected 2 blank lines after class or function definition, found 1
scripts/create-relocatable-package.py:93:6: E225 missing whitespace around operator
Signed-off-by: Alexys Jacob <ultrabug@gentoo.org>
Message-Id: <20180917152520.5032-1-ultrabug@gentoo.org>
The test in question is `restricted_reader_timeout`.
Use `eventually_true()` instead of `sleep()` to wait on the timeout
expiring, making the test more robust on overloaded machines.
Also fix graceful failing, another longstanding issue with this test.
The readers created for the test need different destruction logic
depending whether the test failed or succeeded. Previously this was
dealt with by using the logic that worked in case of success and using
asserts to abort when the test failed, thus avoiding developers
investigating the invalid memory accesses happening due to the wrong
destruction logic.
The solution is to use BOOST_CHECK() macro in the check that validates
whether timeout works as expected. This allows for execution to continue
even if the test failed, and thus allows for running the proper cleanup
code even when the test failed.
Fixes: #3719
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <911921dffc924f1b0a3e86408757467e9be2b65b.1537169933.git.bdenes@scylladb.com>
test_partial_delete_selected_column() does a long string of various
updates and deletes, each specifies a different timestamp. In one
of these updates, the timestamp was forgotten. This means that the
server picks the current time, a large number.
As the test is currently written, it doesn't matter which timestamp
was chosen, the test would still succeed (if timestamp >= 15, and it
must be since the timestamp is the time from the epoch).
But the intention was probably to use timestamp = 15, so let's make
this intention clear.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20180905095552.11883-2-nyh@scylladb.com>
We recently saw a failure in test_partial_delete_selected_column() but
this is a very long test doing many operations and comparisons of their
results, and without BOOST_TEST_PASSPOINT() we can't know which of them
really failed.
So let's sprinkle BOOST_TEST_PASSPOINT() calls between the different parts
of test_partial_delete_selected_column(). If this test ever fails again,
we'll know where.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20180905095552.11883-1-nyh@scylladb.com>
On Fedora 28, creating an instance of `std::random_device` opens a file
descriptor for `/dev/urandom` (observed via `strace`).
By declaring static thread-local instances of `std::random_device`,
these descriptors will be open (barring optimization by the compiler)
for the entire duration of the Scylla process's life.
However, the `std::random_device` instance is only necessary for
initializing the `RandomNumberEngine` for generating salts. With this
change, the file-descriptor is closed immediately after the engine is
initialized.
I considered generalizing this pattern of initialization into a
function, but with only two uses (and simple ones) I think this would
only obscure things.
Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
Tests: unit (release)
Message-Id: <f1b985d99f66e5e64d714fd0f087e235b71557d2.1536697368.git.jhaberku@scylladb.com>
multishard_mutation_reader starts read-aheads on the
shards-to-be-read-soon. When doing this it didn't check whether the
respective shards had an ongoing read-ahead already. This lead to a
single shard executing multiple concurrent read-aheads. This is damaging
for multiple reasons:
* Can lead to concurrent access of the remote reader's data members.
* The `shard_reader` was designed around a single read-ahead and
thus will synchronise foreground reads with only the last one.
The practical implications of this seen so far was that queries reading
a large number of rows (large enough to reliably trigger the
bug) would stop the read early, due the `combined_mutation_reader`'s
internal accounting being messed up by concurrent access.
Also add a unit test. Instead of coming up with a very specific, and
very contrived unit test, use the test-case that detected this bug in
the first place: count(*) on a table with lots of rows (>1000). This
unit-test should serve well for detecting any similar bugs in the
future.
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <ff1c49be64e2fb443f9aa8c5c8d235e682442248.1536746388.git.bdenes@scylladb.com>
Currently the `trace_state` is moved into the `querier` object's
constructor when one has to be created. Since the trace_state is used
below this lines this had the effect that on the first page of the
query, when a querier object has to be created, tracing would not work
inside the `querier_cache` which received a move-from `trace_state` (a
nullptr effectively).
Change the move to a copy so the other half of the function doesn't use
a moved-from `trace_state`.
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <4987419781aa287141aa9dc8ce99c5068b564c84.1536739052.git.bdenes@scylladb.com>
As our support for reading SSTables 3.x rows is nearly complete, the
write tests can be extended to read data after write.
This patchset adds reading to a handful of write tests.
* https://github.com/argenet/scylla/tree/projects/sstables-30/enrich-write-tests/v6:
tests: Factor out the helper building SSTables path for write tests.
tests: Add validate_read() helper to use in SSTables 3.x write tests.
tests: Preserve tmpdir in SSTables 3.x write tests upon comparison.
tests: Read SSTables for write_static_row test after validating write.
tests: Read SSTables for write_composite_partition_key test after
validating write.
tests: Read SSTables for write_composite_clustering_key test after
validating write.
tests: Read SSTables for write_wide_partitions test after validating
write.
tests: Read SSTables for write_ttled_column test after validating
write.
tests: Read SSTables for write_collection_wide_update test after
validating write.
tests: Read SSTables for write_collection_incremental_update test
after validating write.
tests: Read SSTables for write_missing_columns_large_set test after
validating write.
tests: Read SSTables for write_multiple_partitions test after
validating write.
tests: Read SSTables for write_multiple_rows test after validating
write.
tests: Read SSTables for write_different_types test after validating
write.
tests: Read SSTables for write_empty_clustering_values test after
validating write.
tests: Read SSTables for write_large_clustering_keys test after
validating write.
tests: Read SSTables for write_user_defined_type_table test after
validating write.
tests: Read SSTables for write_deleted_row test after validating
write.
sstables: Fix SSTables 3.x parsing: check use_row_ttl() for TTLed
columns.
tests: Read SSTables for write_ttled_row test after validating write.
Read SSTables for write_compact_table test after validating write.
tests: Read SSTables for tests of many partitions after validating
write.
"
This series contains miscellaneous improvements to the stateful range
scans. These improvements are either things that I forgot to include in
the original series (tracing), was requested by other developers
(comments) or I discovered them while reading the code (lockup and
cleanup).
"
* 'multishard_mutation_query_fixes/v1' of https://github.com/denesb/scylla:
multishard_mutation_query: add some tracing
multishard_mutation_query: add comment to `read_context`
multishard_mutation_query: always cleanup readers properly
multishard_mutation_query: fix possible deadlock when creating a reader fails
Add tracing for the following events:
1) Dismantling of the combined buffer.
2) Dismantling of the compaction state.
3) Cleaning up the readers.
(1) and (2) can possibly have adverse effects on the performance of the
query and hence it is important that details about the dismantled
fragments is exposed in the tracing data.
(3) is less critical but still good to know how much readers were
created by the read (in case they aren't saved). Since normally (in
strateful queries) this will always be 0 only trace this when it is
non-zero (and is interesting).
Currently the reader cleanup code, which ensures the readers and their
dependent objects are destroyed in the corect order and a single
smp::submit_to() message, are only run when the readers are attempted to
be saved. However proper cleanup is needed not only then, but also when
the query is not stateful. Rename the current `cleanup()` method to
`stop()`, make it public and call it from a `finally()` block after the
page is finalized to ensure readers are properly cleaned up at all
times.
Also make sure that failures in `stop()` are never propagated so that
a failure in the cleanup doesn't fail the read itself.
This covers five tests, including three for compressed tables:
- write_many_partitions_deflate
- write_many_partitions_lz4
- write_many_partitions_snappy
- write_many_live_partitions
- write_many_deleted_partitions
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
Failing to create a reader (`do_make_remote_reader()`) can lead to a
deadlock if the reader is in any of the future_*_state states, as the
`then()` block is not executed and hence the promise of the first
future in the chain is not set. Avoid this by changing the `then()` to a
`then_wrapped()` and using `set_exception()` and `set_value()`
accordingly, such that the future is resolved on both the happy and
error path.
The non-TLS RPC server has an rpc::resource_limits configuration that limits
its memory consumption, but the TLS server does not. That means a many-node
TLS configuration can OOM if all nodes gang up on a single replica.
Fix by passing the limits to the TLS server too.
Fixes#3757.
Message-Id: <20180907192607.19802-1-avi@scylladb.com>