Commit Graph

16503 Commits

Author SHA1 Message Date
Alexys Jacob
cd74dfebfb scripts: coding style fixes
scripts/create-relocatable-package.py:24:1: F401 'shutil' imported but unused
scripts/create-relocatable-package.py:24:1: F401 'tempfile' imported but unused
scripts/create-relocatable-package.py:24:16: E401 multiple imports on one line
scripts/create-relocatable-package.py:26:1: E302 expected 2 blank lines, found 1
scripts/create-relocatable-package.py:47:1: E305 expected 2 blank lines after class or function definition, found 1
scripts/create-relocatable-package.py:93:6: E225 missing whitespace around operator
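
For readers unfamiliar with these flake8 codes, here is a minimal sketch of code that avoids each warning listed above. It is not the content of create-relocatable-package.py; the function name and the values are hypothetical.

```python
# E401: one import per line. F401: import only what is actually used.
import os
import sys


# E302: two blank lines before a top-level definition.
def package_name(version):
    # E225: whitespace around binary operators.
    return "scylla-" + version + ".tar.gz"


# E305: two blank lines after a function body before top-level code.
if __name__ == "__main__":
    version = sys.argv[1] if len(sys.argv) > 1 else "0.0"
    print(os.path.join("build", package_name(version)))
```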

Signed-off-by: Alexys Jacob <ultrabug@gentoo.org>
Message-Id: <20180917152520.5032-1-ultrabug@gentoo.org>
2018-09-17 18:40:23 +03:00
Alexys Jacob
c80d7b97cc scyllatop: more coding style fixes
tools/scyllatop/metric.py:2:1: F401 're' imported but unused
tools/scyllatop/metric.py:53:20: E221 multiple spaces before operator
tools/scyllatop/metric.py:69:20: E221 multiple spaces before operator

Signed-off-by: Alexys Jacob <ultrabug@gentoo.org>
Message-Id: <20180917153308.7240-1-ultrabug@gentoo.org>
2018-09-17 18:39:53 +03:00
Raphael S. Carvalho
5bc028f78b database: fix 2x increase in disk usage during cleanup compaction
Don't hold reference to sstables cleaned up, so that file descriptors
for their index and data files will be closed and consequently disk
space released.

Fixes #3735.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20180914194047.26288-1-raphaelsc@scylladb.com>
2018-09-17 17:26:46 +03:00
Alexys Jacob
46d101c1f2 scyllatop: coding style fixes
tools/scyllatop/prometheus.py:3:1: F401 'sys' imported but unused
tools/scyllatop/prometheus.py:7:1: E302 expected 2 blank lines, found 1
tools/scyllatop/prometheus.py:12:5: E301 expected 1 blank line, found 0
tools/scyllatop/prometheus.py:17:1: W293 blank line contains whitespace
tools/scyllatop/prometheus.py:22:82: E225 missing whitespace around operator

Signed-off-by: Alexys Jacob <ultrabug@gentoo.org>
Message-Id: <20180914110847.1862-1-ultrabug@gentoo.org>
2018-09-17 15:45:43 +03:00
Botond Dénes
a84c26799d tests/mutation_reader_test: fix flaky restricted reader timeout test
The test in question is `restricted_reader_timeout`.

Use `eventually_true()` instead of `sleep()` to wait on the timeout
expiring, making the test more robust on overloaded machines.
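
The idea of polling instead of sleeping can be sketched as follows. This is a Python illustration only; the real `eventually_true()` is a C++ test helper, and the signature, default timeout and polling interval below are assumptions.

```python
import time


def eventually_true(predicate, timeout=5.0, interval=0.01):
    """Poll predicate until it returns True or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Usage: wait for a condition that becomes true after a short delay.
start = time.monotonic()
became_true = eventually_true(lambda: time.monotonic() - start > 0.05)
```

Unlike a fixed `sleep()`, the wait ends as soon as the condition holds, and a slow machine only needs a generous upper bound rather than a precisely tuned delay.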

Also fix graceful failing, another longstanding issue with this test.
The readers created for the test need different destruction logic
depending on whether the test failed or succeeded. Previously this was
dealt with by using the logic that worked in case of success and using
asserts to abort when the test failed, thus avoiding developers
investigating the invalid memory accesses happening due to the wrong
destruction logic.
The solution is to use BOOST_CHECK() macro in the check that validates
whether timeout works as expected. This allows for execution to continue
even if the test failed, and thus allows for running the proper cleanup
code even when the test failed.

Fixes: #3719
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <911921dffc924f1b0a3e86408757467e9be2b65b.1537169933.git.bdenes@scylladb.com>
2018-09-17 09:40:45 +01:00
Nadav Har'El
0006e21c4d tests/view_complex_test: add missing timestamp
test_partial_delete_selected_column() does a long string of various
updates and deletes, each specifying a different timestamp. In one
of these updates, the timestamp was forgotten. This means that the
server picks the current time, a large number.

As the test is currently written, it doesn't matter which timestamp
is chosen: the test still succeeds (as long as timestamp >= 15, which
must hold since the timestamp is the time since the epoch).
But the intention was probably to use timestamp = 15, so let's make
this intention clear.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20180905095552.11883-2-nyh@scylladb.com>
2018-09-17 00:38:55 +01:00
Nadav Har'El
2ae4ed151e tests/view_complex_test - add test passpoints
We recently saw a failure in test_partial_delete_selected_column() but
this is a very long test doing many operations and comparisons of their
results, and without BOOST_TEST_PASSPOINT() we can't know which of them
really failed.

So let's sprinkle BOOST_TEST_PASSPOINT() calls between the different parts
of test_partial_delete_selected_column(). If this test ever fails again,
we'll know where.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20180905095552.11883-1-nyh@scylladb.com>
2018-09-17 00:38:55 +01:00
Jesse Haber-Kucharsky
9d27045c76 auth: Shorten random_device instance life-span
On Fedora 28, creating an instance of `std::random_device` opens a file
descriptor for `/dev/urandom` (observed via `strace`).

By declaring static thread-local instances of `std::random_device`,
these descriptors will be open (barring optimization by the compiler)
for the entire duration of the Scylla process's life.

However, the `std::random_device` instance is only necessary for
initializing the `RandomNumberEngine` for generating salts. With this
change, the file-descriptor is closed immediately after the engine is
initialized.
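
The scoping idea can be sketched in Python as an analogy. `EntropySource` is a hypothetical stand-in for `std::random_device`, and `make_salt_engine` is an invented name; the actual change is in C++ and may look quite different.

```python
import os
import random


class EntropySource:
    """Hypothetical stand-in for std::random_device; counts open handles."""

    open_handles = 0

    def __enter__(self):
        EntropySource.open_handles += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        EntropySource.open_handles -= 1
        return False

    def read(self, nbytes):
        return os.urandom(nbytes)


def make_salt_engine():
    # The source is alive only while the seed is read, then released.
    with EntropySource() as source:
        seed = int.from_bytes(source.read(8), "little")
    # Illustration only: random.Random is not a cryptographic generator.
    return random.Random(seed)


engine = make_salt_engine()
```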

I considered generalizing this pattern of initialization into a
function, but with only two uses (and simple ones) I think this would
only obscure things.

Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
Tests: unit (release)
Message-Id: <f1b985d99f66e5e64d714fd0f087e235b71557d2.1536697368.git.jhaberku@scylladb.com>
2018-09-12 12:14:21 +01:00
Botond Dénes
dfad223ea2 multishard_mutation_reader: shard_reader: don't do concurrent read-aheads
multishard_mutation_reader starts read-aheads on the
shards-to-be-read-soon. When doing this it didn't check whether the
respective shards had an ongoing read-ahead already. This led to a
single shard executing multiple concurrent read-aheads. This is damaging
for multiple reasons:
    * Can lead to concurrent access of the remote reader's data members.
    * The `shard_reader` was designed around a single read-ahead and
    thus will synchronise foreground reads with only the last one.

The practical implication seen so far is that queries reading
a large number of rows (large enough to reliably trigger the
bug) would stop the read early, due to the `combined_mutation_reader`'s
internal accounting being messed up by concurrent access.
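
The guard described above, starting a read-ahead only when none is in flight, can be sketched with asyncio. This is a simplified illustration, not the `shard_reader` implementation; the class and method names are hypothetical.

```python
import asyncio


class ShardReader:
    def __init__(self):
        self._read_ahead = None  # at most one in-flight read-ahead
        self.fills = 0

    async def _fill_buffer(self):
        await asyncio.sleep(0)  # stands in for the remote read
        self.fills += 1

    def read_ahead(self):
        # Do nothing if a read-ahead is already in flight.
        if self._read_ahead is not None and not self._read_ahead.done():
            return
        self._read_ahead = asyncio.create_task(self._fill_buffer())

    async def fill_buffer(self):
        # A foreground read waits for the single read-ahead, if any.
        if self._read_ahead is not None:
            task, self._read_ahead = self._read_ahead, None
            await task
        else:
            await self._fill_buffer()


async def demo():
    reader = ShardReader()
    reader.read_ahead()
    reader.read_ahead()  # ignored: one is already in flight
    await reader.fill_buffer()
    return reader.fills


fills = asyncio.run(demo())
```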

Also add a unit test. Instead of coming up with a very specific, and
very contrived unit test, use the test-case that detected this bug in
the first place: count(*) on a table with lots of rows (>1000). This
unit-test should serve well for detecting any similar bugs in the
future.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <ff1c49be64e2fb443f9aa8c5c8d235e682442248.1536746388.git.bdenes@scylladb.com>
2018-09-12 11:43:18 +01:00
Botond Dénes
6a07b8ae83 multishard_mutation_reader: update shard_reader's comment
The `abandoned` member was renamed to `stopped`. Update the comment
accordingly.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <1d655785f28fe1e5fa041f2f49852f0ad88be53e.1536743950.git.bdenes@scylladb.com>
2018-09-12 11:32:08 +02:00
Botond Dénes
d9a2ffad84 mutation_partition: don't move tracing_state early
Currently the `trace_state` is moved into the `querier` object's
constructor when one has to be created. Since the `trace_state` is also
used below this line, on the first page of the query, when a querier
object has to be created, tracing would not work inside the
`querier_cache`, which received a moved-from `trace_state` (effectively
a nullptr).
Change the move to a copy so the rest of the function doesn't use
a moved-from `trace_state`.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <4987419781aa287141aa9dc8ce99c5068b564c84.1536739052.git.bdenes@scylladb.com>
2018-09-12 11:32:08 +02:00
Botond Dénes
49704755b0 combined_mutation_reader: propagate timeout in fill_buffer()
All user reads go through the combined reader. Not propagating the
timeout down from there means that the storage layer's timeout
functionality is effectively disabled. Spotted while reading the code.
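
As a rough illustration of what propagating the timeout means, the sketch below passes the caller's timeout from a combining reader to each underlying reader. The classes are hypothetical Python stand-ins, not the `combined_mutation_reader` code.

```python
class RecordingReader:
    """Underlying reader that records the timeout it was given."""

    def __init__(self):
        self.seen_timeout = None

    def fill_buffer(self, timeout):
        self.seen_timeout = timeout


class CombinedReader:
    def __init__(self, readers):
        self._readers = readers

    def fill_buffer(self, timeout):
        for reader in self._readers:
            # Pass the caller's timeout down instead of dropping it.
            reader.fill_buffer(timeout)


children = [RecordingReader(), RecordingReader()]
CombinedReader(children).fill_buffer(timeout=2.5)
```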

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <7fc10eca1c231dd04ac433913d9e6a51b6b17139.1536657041.git.bdenes@scylladb.com>
2018-09-11 15:44:28 +02:00
Botond Dénes
99ab43a1cc flat_mutation_reader: add timeout parameter to operator()()
For consistency with fast_forward_to() and fill_buffer(), and for
correctness: operator()() calls fill_buffer() and thus should provide a
timeout for the storage layer.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <6e97552ac2372e5846c955d94400b5315dbd2a89.1536657041.git.bdenes@scylladb.com>
2018-09-11 15:44:12 +02:00
Tomasz Grabiec
eb321a0830 Merge "Enrich SSTables 3.x write tests with subsequent read" from Vladimir
As our support for reading SSTables 3.x rows is nearly complete, the
write tests can be extended to read data after write.
This patchset adds reading to a handful of write tests.

* https://github.com/argenet/scylla/tree/projects/sstables-30/enrich-write-tests/v6:
  tests: Factor out the helper building SSTables path for write tests.
  tests: Add validate_read() helper to use in SSTables 3.x write tests.
  tests: Preserve tmpdir in SSTables 3.x write tests upon comparison.
  tests: Read SSTables for write_static_row test after validating write.
  tests: Read SSTables for write_composite_partition_key test after
    validating write.
  tests: Read SSTables for write_composite_clustering_key test after
    validating write.
  tests: Read SSTables for write_wide_partitions test after validating
    write.
  tests: Read SSTables for write_ttled_column test after validating
    write.
  tests: Read SSTables for write_collection_wide_update test after
    validating write.
  tests: Read SSTables for write_collection_incremental_update test
    after validating write.
  tests: Read SSTables for write_missing_columns_large_set test after
    validating write.
  tests: Read SSTables for write_multiple_partitions test after
    validating write.
  tests: Read SSTables for write_multiple_rows test after validating
    write.
  tests: Read SSTables for write_different_types test after validating
    write.
  tests: Read SSTables for write_empty_clustering_values test after
    validating write.
  tests: Read SSTables for write_large_clustering_keys test after
    validating write.
  tests: Read SSTables for write_user_defined_type_table test after
    validating write.
  tests: Read SSTables for write_deleted_row test after validating
    write.
  sstables: Fix SSTables 3.x parsing: check use_row_ttl() for TTLed
    columns.
  tests: Read SSTables for write_ttled_row test after validating write.
  Read SSTables for write_compact_table test after validating write.
  tests: Read SSTables for tests of many partitions after validating
    write.
2018-09-11 15:42:43 +02:00
Duarte Nunes
3f0643f34f Merge 'Misc improvements to stateful range scans' from Botond
"
This series contains miscellaneous improvements to the stateful range
scans. These improvements are things that I forgot to include in
the original series (tracing), that were requested by other developers
(comments), or that I discovered while reading the code (lockup and
cleanup).
"

* 'multishard_mutation_query_fixes/v1' of https://github.com/denesb/scylla:
  multishard_mutation_query: add some tracing
  multishard_mutation_query: add comment to `read_context`
  multishard_mutation_query: always cleanup readers properly
  multishard_mutation_query: fix possible deadlock when creating a reader fails
2018-09-11 10:26:05 +01:00
Botond Dénes
7d71b42651 multishard_mutation_query: add some tracing
Add tracing for the following events:
1) Dismantling of the combined buffer.
2) Dismantling of the compaction state.
3) Cleaning up the readers.

(1) and (2) can possibly have adverse effects on the performance of the
query and hence it is important that details about the dismantled
fragments are exposed in the tracing data.
(3) is less critical, but it is still good to know how many readers were
created by the read (in case they aren't saved). Since normally (in
stateful queries) this will always be 0, only trace this when it is
non-zero (and thus interesting).
2018-09-11 08:18:16 +03:00
Botond Dénes
b41be7c8e5 multishard_mutation_query: add comment to read_context
Explain the purpose of the class and its intended usage and any gotchas
the reader/modifier of the code has to keep in mind.
2018-09-11 08:18:16 +03:00
Botond Dénes
b6e1a8f32d multishard_mutation_query: always cleanup readers properly
Currently the reader cleanup code, which ensures the readers and their
dependent objects are destroyed in the correct order and in a single
smp::submit_to() message, is only run when an attempt is made to save
the readers. However, proper cleanup is needed not only then, but also when
the query is not stateful. Rename the current `cleanup()` method to
`stop()`, make it public and call it from a `finally()` block after the
page is finalized to ensure readers are properly cleaned up at all
times.
Also make sure that failures in `stop()` are never propagated so that
a failure in the cleanup doesn't fail the read itself.
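
The shape of this change, always running cleanup and never letting a cleanup failure fail the read, can be sketched in Python with try/finally. `ReadContext` and `read_page` below are hypothetical and only mirror the names in the message loosely.

```python
import logging

logger = logging.getLogger("read_context_sketch")


class ReadContext:
    def __init__(self, fail_on_stop=False):
        self.stopped = False
        self._fail_on_stop = fail_on_stop

    def stop(self):
        self.stopped = True
        if self._fail_on_stop:
            raise RuntimeError("cleanup failed")


def read_page(ctx):
    try:
        return "page"  # stands in for building and finalizing the page
    finally:
        try:
            ctx.stop()  # runs for stateful and non-stateful queries alike
        except Exception:
            # A cleanup failure is logged, never propagated to the read.
            logger.warning("failed to stop readers", exc_info=True)


ctx = ReadContext(fail_on_stop=True)
page = read_page(ctx)
```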
2018-09-11 08:18:16 +03:00
Vladimir Krivopalov
c4a4ef6e3c tests: Read SSTables for tests of many partitions after validating write.
This covers five tests, including three for compressed tables:
  - write_many_partitions_deflate
  - write_many_partitions_lz4
  - write_many_partitions_snappy
  - write_many_live_partitions
  - write_many_deleted_partitions

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-09-10 17:28:48 -07:00
Vladimir Krivopalov
f1214bfceb Read SSTables for write_compact_table test after validating write.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-09-10 17:28:48 -07:00
Vladimir Krivopalov
a39638c0ba tests: Read SSTables for write_ttled_row test after validating write.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-09-10 17:28:48 -07:00
Vladimir Krivopalov
bcae761d72 sstables: Fix SSTables 3.x parsing: check use_row_ttl() for TTLed columns.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-09-10 17:28:48 -07:00
Vladimir Krivopalov
9b55f06456 tests: Read SSTables for write_deleted_row test after validating write.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-09-10 17:28:48 -07:00
Vladimir Krivopalov
8869f1a591 tests: Read SSTables for write_user_defined_type_table test after validating write.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-09-10 17:28:48 -07:00
Vladimir Krivopalov
dae49358d8 tests: Read SSTables for write_large_clustering_keys test after validating write.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-09-10 17:28:48 -07:00
Vladimir Krivopalov
8c2bc4a16a tests: Read SSTables for write_empty_clustering_values test after validating write.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-09-10 17:28:48 -07:00
Vladimir Krivopalov
6f23446962 tests: Read SSTables for write_different_types test after validating write.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-09-10 17:28:48 -07:00
Vladimir Krivopalov
4865f2f5a3 tests: Read SSTables for write_multiple_rows test after validating write.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-09-10 17:28:48 -07:00
Vladimir Krivopalov
3594b887df tests: Read SSTables for write_multiple_partitions test after validating write.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-09-10 17:28:48 -07:00
Vladimir Krivopalov
eee775dab7 tests: Read SSTables for write_missing_columns_large_set test after validating write.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-09-10 17:28:48 -07:00
Vladimir Krivopalov
2d764da415 tests: Read SSTables for write_collection_incremental_update test after validating write.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-09-10 17:28:48 -07:00
Vladimir Krivopalov
88a3b05210 tests: Read SSTables for write_collection_wide_update test after validating write.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-09-10 17:28:48 -07:00
Vladimir Krivopalov
abdae2dd9e tests: Read SSTables for write_ttled_column test after validating write.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-09-10 17:28:48 -07:00
Vladimir Krivopalov
cdf148dc67 tests: Read SSTables for write_wide_partitions test after validating write.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-09-10 17:28:48 -07:00
Vladimir Krivopalov
5b1a4686eb tests: Read SSTables for write_composite_clustering_key test after validating write.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-09-10 17:28:48 -07:00
Vladimir Krivopalov
e908d07fe7 tests: Read SSTables for write_composite_partition_key test after validating write.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-09-10 17:28:48 -07:00
Vladimir Krivopalov
aa5dc16dbb tests: Read SSTables for write_static_row test after validating write.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-09-10 17:28:48 -07:00
Vladimir Krivopalov
42ab8ed3cd tests: Preserve tmpdir in SSTables 3.x write tests upon comparison.
It can be used to do other checks on written files, like reading them
back.

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-09-10 17:28:48 -07:00
Vladimir Krivopalov
bc16304e99 tests: Add validate_read() helper to use in SSTables 3.x write tests.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-09-10 17:28:48 -07:00
Vladimir Krivopalov
6cddd7500a tests: Factor out the helper building SSTables path for write tests.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-09-10 17:28:48 -07:00
Botond Dénes
b3f1fe14e8 multishard_mutation_query: fix possible deadlock when creating a reader fails
Failing to create a reader (`do_make_remote_reader()`) can lead to a
deadlock if the reader is in any of the future_*_state states, as the
`then()` block is not executed and hence the promise of the first
future in the chain is not set. Avoid this by changing the `then()` to a
`then_wrapped()` and using `set_exception()` and `set_value()`
accordingly, such that the future is resolved on both the happy and
error path.
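
The pattern of resolving the promise on both paths can be illustrated with Python's `concurrent.futures.Future`. This is an analogy to `then_wrapped()` with `set_exception()`/`set_value()`, not Seastar code; `make_reader` and the callbacks are invented for the sketch.

```python
from concurrent.futures import Future


def make_reader(create, promise):
    """Resolve promise whether or not creating the reader succeeds."""
    try:
        reader = create()
    except Exception as exc:
        promise.set_exception(exc)  # error path: waiters are woken up
    else:
        promise.set_result(reader)  # happy path


def failing_create():
    raise RuntimeError("cannot create reader")


failed = Future()
make_reader(failing_create, failed)

succeeded = Future()
make_reader(lambda: "reader", succeeded)
```

If only the success branch set the promise, anyone waiting on `failed` would block forever, which is the deadlock the commit describes.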
2018-09-10 16:41:13 +03:00
Avi Kivity
4553238653 messaging: fix unbounded allocation in TLS RPC server
The non-TLS RPC server has an rpc::resource_limits configuration that limits
its memory consumption, but the TLS server does not. That means a many-node
TLS configuration can OOM if all nodes gang up on a single replica.

Fix by passing the limits to the TLS server too.

Fixes #3757.
Message-Id: <20180907192607.19802-1-avi@scylladb.com>
2018-09-10 12:11:16 +01:00
Gleb Natapov
9e438933a2 mutation_query_test: add test for result size calculation
Check that digest-only and digest+data queries calculate the same
result size.

Message-Id: <20180906153800.GK2326@scylladb.com>
2018-09-06 20:54:57 +03:00
Gleb Natapov
d7674288a9 mutation_partition: accurately account for result size in digest only queries
When measuring_output_stream is used to calculate a result element's size,
it incorrectly counts not only the serialized element size but also
a placeholder that the ser::qr_partition__rows/qr_partition__static_row__cells
constructors put at the beginning. Fix it by taking the starting position in
the stream before element serialization and subtracting it afterwards.
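
The fix can be pictured with a small Python sketch of a size-only stream. `MeasuringStream` and `element_size` are hypothetical names; the real code is C++ and uses `measuring_output_stream`.

```python
class MeasuringStream:
    """Counts bytes written without storing them."""

    def __init__(self):
        self.size = 0

    def write(self, data):
        self.size += len(data)


def element_size(out, serialize_element):
    start = out.size  # position before the element is serialized
    serialize_element(out)
    return out.size - start  # excludes anything written earlier


out = MeasuringStream()
out.write(b"\x00" * 4)  # placeholder written by a constructor
size = element_size(out, lambda stream: stream.write(b"abc"))
```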

Fixes #3755

Message-Id: <20180906153609.GJ2326@scylladb.com>
2018-09-06 20:52:44 +03:00
Takuya ASADA
2136479012 dist/debian: delete mounts.conf on scylla-server.postrm
Since we added mounts.conf in 687372bc48,
we need to delete the file when the package is uninstalled.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180905204631.9265-1-syuu@scylladb.com>
2018-09-06 16:50:14 +03:00
Gleb Natapov
98092353df mutation_partition: correctly measure static row size when doing digest calculation
The code uses the wrong output stream when only a digest is requested
and thus gets an incorrect data size. Failing to correctly account
for the static row size while calculating the digest may cause a mismatch
between digest and data queries.

Fixes #3753.

Message-Id: <20180905131219.GD2326@scylladb.com>
2018-09-06 13:09:41 +03:00
Takuya ASADA
ab361e9897 dist/redhat: add mounts.conf to ghost file
Since we added mounts.conf in 687372bc48,
we need to delete the file when the package is uninstalled.

Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Message-Id: <20180905191037.1570-1-syuu@scylladb.com>
2018-09-05 22:14:48 +03:00
Jesse Haber-Kucharsky
682805b22c auth: Use finite time-out for all QUORUM reads
Commit e664f9b0c6 transitioned internal
CQL queries in the auth. sub-system to be executed with finite time-outs
instead of infinite ones.

It should have also modified the functions in `auth/roles-metadata.cc`
to have finite time-outs.

This change fixes some previously failing dtests, particularly around
repair. Without this change, the QUORUM query fails to terminate when
the necessary consistency level cannot be achieved.

Fixes #3736.

Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
Message-Id: <e244dc3e731b4019f3be72c52a91f23ee4bb68d1.1536163859.git.jhaberku@scylladb.com>
2018-09-05 21:55:26 +03:00
Tomasz Grabiec
82270c8699 storage_proxy: Fix misqualification of reads as foreground or background in some cases
The foreground reads metric is derived from the number of live read
executors minus the number of background reads. Background reads are
counted down when their resolver times out. However, a read executor
may still be around for a while, resulting in such reads being
accounted as foreground.

Usually, the gap in which this happens is short, because executor
reference holders time out quickly as well. It's not always the case
though. For instance, local read executor doesn't time out quickly
when the target shard has an overloaded CPU, and it takes a while
before the request goes through all the queues, even if IO is not
involved. Observed in #3628.

Fixes #3734.

Another problem is that all reads which received CL responses are
accounted as background until all replicas respond, but if such a read
needs reconciliation, it's still practically a foreground read and
should be accounted as such. Found during code review.

Fixes #3745.

This patch fixes both issues by rearranging accounting to track
foreground reads instead of background reads, and considering all
reads as foreground until the resulting promise is resolved.
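
A minimal sketch of the rearranged accounting, assuming a counter that is incremented when a read starts and decremented once its result promise is resolved. The class names are hypothetical and this is not the storage_proxy code.

```python
class ReadStats:
    def __init__(self):
        self.foreground = 0


class ReadExecutor:
    def __init__(self, stats):
        self._stats = stats
        self._resolved = False
        stats.foreground += 1  # every read starts out as foreground

    def resolve(self):
        # Called when the result promise is resolved (value or error).
        if not self._resolved:
            self._resolved = True
            self._stats.foreground -= 1


stats = ReadStats()
first, second = ReadExecutor(stats), ReadExecutor(stats)
first.resolve()
first.resolve()  # idempotent: a second call does not double-count
```

Tracking foreground reads directly means the metric no longer depends on how long an executor object outlives its timeout.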

Message-Id: <1535999620-25784-1-git-send-email-tgrabiec@scylladb.com>
2018-09-05 20:42:51 +03:00
Avi Kivity
c168805ca6 Merge "Filtering and fast-forwarding of range tombstones in SSTables 3.x" from Vladimir
"
This patchset adds proper support for sliced reads of partitions
containing range tombstones.

Given the SSTables 3.x representation of range tombstones by separate
start and end markers, we refer to the index for the information about
the currently opened range tombstone, if any, when skipping to the next
promoted index block.

Note that for this we have to take the promoted index block immediately
preceding the one we are jumping to.

Tests: unit {release}
"

* 'projects/sstables-30/range-tombstones-slicing/v3' of https://github.com/argenet/scylla:
  tests: Test filtering and forwarding on a partition with interleaved rows and RTs.
  tests: Add tests for reading wide partitions with range tombstones.
  sstables: Support slicing for range tombstones.
  sstables: Set/reset range tombstone start from end open marker.
  sstables: Fix end_open_marker population in promoted index blocks.
  sstables: Add need_skip() helper to data_consume_context.
  sstables: For end_open_marker, return both position in partition and deletion time.
2018-09-05 20:38:39 +03:00