Commit Graph

23930 Commits

Author SHA1 Message Date
Dejan Mircevski
6773563d3d cql3: Drop unneeded filtering for continuous CK
Don't require filtering when a continuous slice of the clustering key
is requested, even if partition is unrestricted.  The read command we
generate will fetch just the selected data; filtering is unnecessary.

Some tests needed to update the expected results now that we're not
fetching the extra data needed for filtering.  (Because tests don't do
the final trim to match selectors and assert instead on all the data
read.)

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2020-10-19 14:46:43 -04:00
Avi Kivity
86bbf1763d Merge "reader concurrency semaphore: dump permit diagnostics on timeout or queue overflow" from Botond
"
The reader concurrency semaphore timing out or its queue being overflown
are fairly common events both in production and in testing. At the same
time it is a hard to diagnose problem that often has a benign cause
(especially during testing), but it is equally possible that it points
to something serious. So when this error starts to appear in logs,
usually we want to investigate and the investigation is lengthy...
either involves looking at metrics or coredumps or both.

This patch intends to jumpstart this process by dumping a diagnostics on
semaphore timeout or queue overflow. The diagnostics is printed to the
log with debug level to avoid excessive spamming. It contains a
histogram of all the permits associated with the problematic semaphore
organized by table, operation and state.

Example:

DEBUG 2020-10-08 17:05:26,115 [shard 0] reader_concurrency_semaphore -
Semaphore _read_concurrency_sem: timed out, dumping permit diagnostics:
Permits with state admitted, sorted by memory
memory  count   name
3499M   27      ks.test:data-query

3499M   27      total

Permits with state waiting, sorted by count
count   memory  name
1       0B      ks.test:drain
7650    0B      ks.test:data-query

7651    0B      total

Permits with state registered, sorted by count
count   memory  name

0       0B      total

Total: permits: 7678, memory: 3499M

This allows determining several things at glance:
* What are the tables involved
* What are the operations involved
* Where is the memory

This can speed up a follow-up investigation greatly, or it can even be
enough on its own to determine that the issue is benign.

Tests: unit(dev, debug)
"

* 'dump-diagnostics-on-semaphore-timeout/v2' of https://github.com/denesb/scylla:
  reader_concurrency_semaphore: dump permit diagnostics on timeout or queue overflow
  utils: add to_hr_size()
  reader_concurrency_semaphore: link permits into an intrusive list
  reader_concurrency_semaphore: move expiry_handler::operator()() out-of-line
  reader_concurrency_semaphore: move constructors out-of-line
  reader_concurrency_semaphore: add state to permits
  reader_concurrency_semaphore: name permits
  querier_cache_test: test_immediate_evict_on_insert: use two permits
  multishard_combining_reader: reader_lifecycle_policy: add permit param to create_reader()
  multishard_combining_reader: add permit parameter
  multishard_combining_reader: shard_reader: use multishard reader's permit
2020-10-13 12:44:23 +03:00
Botond Dénes
18454e4a80 reader_concurrency_semaphore: dump permit diagnostics on timeout or queue overflow
The reader concurrency semaphore timing out or its queue being overflown
are fairly common events both in production and in testing. At the same
time it is a hard to diagnose problem that often has a benign cause
(especially during testing), but it is equally possible that it points
to something serious. So when this error starts to appear in logs,
usually we want to investigate and the investigation is lengthy...
either involves looking at metrics or coredumps or both.

This patch intends to jumpstart this process by dumping a diagnostics on
semaphore timeout or queue overflow. The diagnostics is printed to the
log with debug level to avoid excessive spamming. It contains a
histogram of all the permits associated with the problematic semaphore
organized by table, operation and state.

Example:

DEBUG 2020-10-08 17:05:26,115 [shard 0] reader_concurrency_semaphore -
Semaphore _read_concurrency_sem: timed out, dumping permit diagnostics:
Permits with state admitted, sorted by memory
memory  count   name
3499M   27      ks.test:data-query

3499M   27      total

Permits with state waiting, sorted by count
count   memory  name
1       0B      ks.test:drain
7650    0B      ks.test:data-query

7651    0B      total

Permits with state registered, sorted by count
count   memory  name

0       0B      total

Total: permits: 7678, memory: 3499M

This allows determining several things at glance:
* What are the tables involved
* What are the operations involved
* Where is the memory

This can speed up a follow-up investigation greatly, or it can even be
enough on its own to determine that the issue is benign.
2020-10-13 12:32:14 +03:00
Botond Dénes
0994e8b5e2 utils: add to_hr_size()
This utility function converts a potentially large number to a compact
representation, composed of at most 4 digits and a letter appropriate to
the power of two the number has to multiplied with to arrive to the
original number (with some loss of precision).

The different powers of two are the conventional 2 ** (N * 10) variants:
* N=0: (B)ytes
* N=1: (K)bytes
* N=2: (M)bytes
* N=3: (G)bytes
* N=4: (T)bytes

Examples:
* 87665 will be converted to 87K
* 1024 will be converted to 1K
2020-10-13 12:32:14 +03:00
Botond Dénes
27bbf5566d reader_concurrency_semaphore: link permits into an intrusive list 2020-10-13 12:32:14 +03:00
Botond Dénes
fdb93ae0fd reader_concurrency_semaphore: move expiry_handler::operator()() out-of-line
Soon we will want to add more logic to this now simple handler, move it
out-of-line in preparation.
2020-10-13 12:32:14 +03:00
Botond Dénes
85bfd28f4e reader_concurrency_semaphore: move constructors out-of-line
Soon, the semaphore will have a field that will not have a publicly
available definition. Move the constructor out-of-line in preparation.
2020-10-13 12:32:13 +03:00
Botond Dénes
70fa543c31 reader_concurrency_semaphore: add state to permits
Instead of a simple boolean, designating whether the permit was already
admitted or not, add a proper state field with a value for all the
different states the permit can be in. Currently there are three such
states:
* registered - the permit was created and started accounting resource
  consumption.
* waiting - the permit was queued to wait for admission.
* admitted - the permit was successfully admitted.

The state will be used for debugging purposes, both during coredump
debugging as well as for dumping diagnostics data about permits.
2020-10-13 12:32:13 +03:00
Botond Dénes
ff623e70b3 reader_concurrency_semaphore: name permits
Require a schema and an operation name to be given to each permit when
created. The schema is of the table the read is executed against, and
the operation name, which is some name identifying the operation the
permit is part of. Ideally this should be different for each site the
permit is created at, to be able to discern not only different kind of
reads, but different code paths the read took.

As not all read can be associated with one schema, the schema is allowed
to be null.

The name will be used for debugging purposes, both for coredump
debugging and runtime logging of permit-related diagnostics.
2020-10-13 12:32:13 +03:00
Takuya ASADA
ff129ee030 install.sh: set LC_ALL=en_US.UTF-8 on python3 thunk
scylla-python3 causes segfault when non-default locale specified.
As workaround for this, we need to set LC_ALL=en_US.UTF_8 on python3 thunk.

Fixes #7408

Closes #7414
2020-10-13 09:38:25 +03:00
Vlad Zolotarov
aec70d9953 cql3/statements/batch_statement.cc: improve batch size warning message
Make the warning message clearer:
 * Include the number of partitions affected by the batch.
 * Be clear that the warning is about the batch size in bytes.

Fixes #7367

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>

Closes #7417
2020-10-13 09:02:51 +03:00
Avi Kivity
3451579d81 sstables: move component_type formatter to namespace sstables
Without this, clang complains that we violate argument dependent
lookup rules:

  note: 'operator<<' should be declared prior to the call site or in namespace 'sstables'
  std::ostream& operator<<(std::ostream&, const sstables::component_type&);

we can't enforce the #include order, but we can easily move it it
to namespace sstables (where it belongs anyway), so let's do that.

gcc is happy either way.

Closes #7413
2020-10-12 21:49:25 +02:00
Tomasz Grabiec
29cf7fde03 Merge 'sstables: prepare bound_kind_m formatter for clang' from Avi Kivity
bound_kind_m's formatter violates argument dependent lookup rules
according to clang, so fix that. Along the way improve the formatter a
little.

Closes #7412

* git://github.com/avikivity/scylla.git avikivity-bound_kind_m-formatter:
  sstables: move bound_kind_m formatter to namespace sstables
  sstables: move bound_kind_m formatter to its natural place
  sstables: deinline bound_kind_m formatter
2020-10-12 21:47:53 +02:00
Avi Kivity
5065ae835f sstables: move bound_kind_m formatter to namespace sstables
Without this, clang complains that we violate argument dependent
lookup rules:

  note: 'operator<<' should be declared prior to the call site or in namespace 'sstables'
  std::ostream& operator<<(std::ostream&, const sstables::bound_kind_m&);

we can't enforce the #include order, but we can easily move it it
to namespace sstables (where it belongs anyway), so let's do that.

gcc is happy either way.
2020-10-12 20:38:11 +03:00
Avi Kivity
a00fca1a69 sstables: move bound_kind_m formatter to its natural place
Move bound_kind_m's formatter to the same header file where
is is defined. This prevents cases where the compiler decays
the type (an enum) to the underlying integral type because it
does not see the formatter declaration, resulting in the wrong
output.
2020-10-12 20:36:10 +03:00
Avi Kivity
69c3533d97 sstables: deinline bound_kind_m formatter
The formatter is by no means hot code and should not be inlined.
2020-10-12 20:35:08 +03:00
Piotr Dulikowski
77a0f1a153 hints: don't read hint files when it's not allowed to send
When there are hint files to be sent and the target endpoint is DOWN,
end_point_hints_manager works in the following loop:

- It reads the first hint file in the queue,
- For each hint in the file it decides that it won't be sent because the
  target endpoint is DOWN,
- After realizing that there are some unsent hints, it decides to retry
  this operation after sleeping 1 second.

This causes the first segment to be wholly read over and over again,
with 1 second pauses, until the target endpoint becomes UP or leaves the
cluster. This causes unnecessary I/O load in the streaming scheduling
group.

This patch adds a check which prevents end_point_hints_manager from
reading the first hint file at all when it is not allowed to send hints.

First observed in #6964

Tests:
- unit(dev)
- hinted handoff dtests

Closes #7407
2020-10-12 19:09:57 +03:00
Botond Dénes
40c5474022 querier_cache_test: test_immediate_evict_on_insert: use two permits
The test currently uses a single permit shared between two simulated
reads (to wait admission twice). This is not a supported way of using a
permit and will stop working soon as we make the states the permit is in
more pronounced.
2020-10-12 15:56:56 +03:00
Botond Dénes
307cdf1e0d multishard_combining_reader: reader_lifecycle_policy: add permit param to create_reader()
Allow the evictable reader managing the underlying reader to pass its
own permit to it when creating it, making sure they share the same
permit. Note that the two parts can still end up using different
permits, when the underlying reader is kept alive between two pages of a
paged read and thus keeps using the permit received on the previous
page.

Also adjust the `reader_context` in multishard_mutation_query.cc to use
the passed-in permit instead of creating a new one when creating a new
reader.
2020-10-12 15:56:56 +03:00
Botond Dénes
e09ab09fff multishard_combining_reader: add permit parameter
Don't create an own permit, take one as a parameter, like all other
readers do, so the permit can be provided by the higher layer, making
sure all parts of the logical read use the same permit.
2020-10-12 15:56:56 +03:00
Botond Dénes
600f1c7853 multishard_combining_reader: shard_reader: use multishard reader's permit
Don't create a new permit per shard reader, pass down the multishard
reader's one to be used by each shard reader. They all belong to the
same read, they should use the same permit. Note that despite its name
the shard readers are the local representation of a reader living on a
remote shard and as such they live on the same shard the multishard
combining reader lives on.
2020-10-12 15:56:56 +03:00
Avi Kivity
73718414e3 data/cell: fix value_writer use before definition
Clang parses templates more eagerly than gcc, so it fails on
some forward-declared templates. In this case, value_writer
was forward-declared and then used in data::cell. As it also
uses some definitions local to data::cell, it cannot be
defined before it as well as after it.

To solve the problem, we define it as a nested class so
it can use other local definitions, yet be defined before it
is used. No code changes.

Closes #7401
2020-10-12 13:41:09 +03:00
Avi Kivity
da3e51d7b8 build: use c++20 for all C++ files, not just those that use the seastar flags
A few source files (like those generated by antlr) don't build with seastar,
and so don't inherit all of its flags. They then use the compiler default dialect,
not C++20. With gcc that's just fine, since gcc supports concepts in earlier dialects,
but clang requires C++20.

Fix by forcing --std=gnu++20 for all files (same as what Seastar chooses).

Closes #7392
2020-10-12 13:16:27 +03:00
Avi Kivity
affa234151 types: don't linearize ascii during validation
ascii has no inter-byte dependencies and so can
be validated fragment by fragment, reducing large
contiguous allocations.

Fixes #7393.

Closes #7394
2020-10-12 13:15:24 +03:00
Gleb Natapov
9d7c81c1b8 raft: fix boost/raft_fsm_test complication
Message-Id: <20201011063802.GA2628121@scylladb.com>
2020-10-12 12:09:21 +02:00
Takuya ASADA
d5ff82dc61 scylla_setup: skip iotune when developer_mode is enabled
When developer mode automatically enabled on nonroot mode, we should skip
iotune since the parameter won't be used.

Closes #7327
2020-10-12 11:08:10 +03:00
Botond Dénes
d35b0c06da configure.py: add space before appending -ffile-prefix-map to user cflags
Otherwise, it concatenates it to the last user provided cflag, creating
a gibberish flag that gcc will choke on.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20201012073523.305271-1-bdenes@scylladb.com>
2020-10-12 10:40:02 +03:00
Nadav Har'El
977da3567f Merge 'Alternator streams: Fix shard lengths, parenting, expiration, filter useless ones and improve paging' from Calle Wilund
The remains of the defunct #7246.

Fixes #7344
Fixes #7345
Fixes #7346
Fixes #7347

Shard ID length is now within limits.
Shard end sequence number should be set when appropriate.
Shard parent is selected a bit more carefully (sorting)
Shards are filtered by time to exclude cdc generations we cannot get data from (too old)
Shard paging improved

Closes #7348

* github.com:scylladb/scylla:
  test_streams: Add some more sanity asserts
  alternator::streams: Set dynamodb data TTL explicitly in cdc options
  alternator::streams: Improve paging and fix parent-child calculation
  alternator::streams: Remove table from shard_id
  alternator::streams: Filter our cdc streams older than data/table
  alternator::error: Add a few dynamo exception types
2020-10-12 09:43:12 +03:00
Avi Kivity
4d6739c2e6 Merge "Use max_concurrent_for_each" from Benny
"
max_concurrent_for_each was added to seastar for replacing
sstable_directory::parallel_for_each_restricted by using
more efficient concurrency control that doesn't create
unlimited number of continuations.

The series replaces the use of sstable_directory::parallel_for_each_restricted
with max_concurrent_for_each and exposes the sstable_directory::do_for_each_sstable
via a static method.

This method is used here by table::snapshot to limit concurrency
do snapshot operations that suffer from the same unbound
concurrency problem sstable_directory solved.

In addition sstable_directory::_load_semaphore that was used
across calls to do_for_each_sstable was replaced by a static per-shard
semaphore that caps concurrency across all calls to `do_for_each_sstable`
on that shard.  This makes sense since the disk is a shared resource.

In the future, we may want to have a load semaphore per device rather than
a single global one.  We should experiment with that.

Test: unit(dev)
"

* tag 'max_concurrent_for_each-v5' of github.com:bhalevy/scylla:
  table: snapshot: use max_concurrent_for_each
  sstable_directory: use a external load_semaphore
  test: sstable_directory_test: extract sstable_directory creation into with_sstable_directory
  distributed_loader: process_upload_dir: use initial_sstable_loading_concurrency
  sstables: sstable_directory: use max_concurrent_for_each
2020-10-12 09:43:12 +03:00
Avi Kivity
54386efe9e build: add libicui18n library for clang
The build with clang fails with

  ld.lld: error: undefined symbol: icu_65::Collator::createInstance(icu_65::Locale const&, UErrorCode&)
  >>> referenced by like_matcher.cc
  >>>               build/dev/utils/like_matcher.o:(boost::re_detail_106900::icu_regex_traits_implementation::icu_regex_traits_implementation(icu_65::Locale const&))
  >>> referenced by like_matcher.cc
  >>>               build/dev/utils/like_matcher.o:(boost::re_detail_106900::icu_regex_traits_implementation::icu_regex_traits_implementation(icu_65::Locale const&))

That symbol lives in libicui18n. It's not clear why clang fails to resolve it and gcc succeeds (after all,
both use lld as the linker) but it is easier to add the library than to attempt to figure out the
discrepancy.

Closes #7391
2020-10-11 22:14:00 +03:00
Avi Kivity
8d3fcdc600 serializer.hh: remove unneeded semicolon after function definition
Closes #7390
2020-10-11 22:12:04 +03:00
Avi Kivity
dfffa4dc71 utils: big_decimal: work around clang difficulty with boost::cpp_int(string_view) constructor
Clang has some difficulty with the boost::cpp_int constructor from string_view.
In fact it is a mess of enable_if<>s so a human would have trouble too.

Work around it by converting to std::string. This is bad for performance, but
this constructor is not going to be fast in any case.

Hopefully a fix will arrive in clang or boost.

Closes #7389
2020-10-11 22:09:19 +03:00
Bentsi Magidovich
7be252e929 dist: fix incorrect AWS user-data url
we used http://169.254.169.254/latest/meta-data/user-data
but correct one http://169.254.169.254/latest/user-data
Fixes: https://github.com/scylladb/scylla-machine-image/issues/63

Closes #7388
2020-10-11 18:20:54 +03:00
Avi Kivity
00864b26c3 query-result-writer: fix idl definition order related failures with clang
Following ad48d8b43c, fix a similar problem which popped up
with higher inlining thresholds in query-result-writer.hh. Since
idl/query depends on idl/keys, it must follow in definition order.

Closes #7384
2020-10-11 17:57:12 +03:00
Avi Kivity
1145462a05 cql3: select_statement: fix undefined pointer arithmetic
We add std::distance(...) + 1 to a vector iterator, but
the vector can be empty, so we're adding a non-zero value
to nullptr, which is undefined behavior.

Rearrange to perform the limit (std::min()) before adding
to the pointer.

Found by clang's ubsan.

Closes #7377
2020-10-11 17:54:08 +03:00
Avi Kivity
610fa83f28 test: database_test: fix threading confusion
database_test contains several instances of calling do_with_cql_test_env()
with a function that expects to be called in a thread. This mostly works
because there is an internal thread in do_with_cql_test_env(), but is not
guaranteed to.

Fix by switching to the more appropriate do_with_cql_test_env_thread().

Closes #7333
2020-10-11 17:44:30 +03:00
Avi Kivity
b172e4c2ce sstables: make index_bound a non-nested struct
Due to a longstanding bug in clang[1], the compiler doesn't think
that such a class is default-constructible. This causes
std::optional<index_bound>::optional() not to compile. Because it
depends on open_tt_marker, extract that too.

[1] https://stackoverflow.com/questions/47974898/clang-5-stdoptional-instantiation-screws-stdis-constructible-trait-of-the-p

Closes #7387
2020-10-11 17:40:01 +03:00
Avi Kivity
58e02c216a test: sstable_datafile_test: sstable_run_based_compaction_test: prevent use of uninitialized variable observer
The variable 'observer' (an std::optional) may be left uninitialized
if 'incremental_enabled' is false. However, it is used afterwards
with a call to disconnect, accessing garbage.

Fix by accessing it via the optional wrapper. A call to optional::reset()
destroys the observable, which in turn calls disconnect().

Closes #7380
2020-10-11 17:36:08 +03:00
Avi Kivity
af8fd8c8d8 utils: build_id: fix ubsan false positive on pointer arithmetic
get_nt_build_id() constructs a pointer by adding a base and an
offset, but if the base happens to be zero, that is undefined
under C++ rules (altough legal ELF).

Fix by performing the addition on integers, and only then
casting to a pointer.

Closes #7379
2020-10-11 17:23:40 +03:00
Avi Kivity
a36eb586ea cql3: selection: don't use gcc extension "typeof"
typeof is not recognized by clang. Use the modern equivalent "decltype"
instead.

Closes #7386
2020-10-11 17:21:15 +03:00
Avi Kivity
15ab6a3feb test: cql_repl: use boost::regex instead of std::regex to avoid stack overflow
libstdc++'s std::regex uses recursion[1], with a depth controlled by the
input. Together with clang's debug mode, this overflows the stack.

Use boost::regex instead, which is immune to the problem.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86164

Closes #7378
2020-10-11 17:12:21 +03:00
Avi Kivity
4fd0ba24ea Update seastar submodule
* seastar ebcb3aeec...35c255dcd (1):
  > append_challenged_posix_file_impl: allow destructing file with no queued work
Fixes #7285.
2020-10-11 16:49:03 +03:00
Avi Kivity
7d025b5cf4 utils: log_heap: relax check for clang's sanitizer
b1e78313fe added a check for ubsan to squelch a false positive,
but that check doesn't work with clang. Relax it to check for debug
mode, so clang doesn't hit the same false positive as gcc did.

Define a SANITIZE macro so we have a reliable way to detect if
we're running with a sanitizer.

Closes #7372
2020-10-11 16:07:16 +03:00
Avi Kivity
882ed2017a test: network_topology_strategy_test: fix overflow in d2t()
d2t() scales a fraction in the range [0, 1] to the range of
a biased token (same as unsigned long). But x86 doesn't support
conversion to unsigned, only signed, so this is a truncating
conversion. Clang's ubsan correctly warns about it.

Fix by reducing the range before converting, and expanding it
afterwards.

Closes #7376
2020-10-11 16:05:02 +03:00
Avi Kivity
8932c4e919 compaction: allow _max_sstable_size = 0
Some test (run_based_compaction_test at least) use _max_sstable_size = 0
in order to force one partition per sstable. That triggers an overflow
when calculating the expected bloom filter size. The overflow doesn't
matter for normal operation, because the result later appears on a
divisor, but does trigger a ubsan error.

Squelch the error by bot dividing by zero here.

I tried using _max_sstable_size = 1, but the test failed for other
reasons.

Closes #7375
2020-10-11 15:43:51 +03:00
Avi Kivity
fc1fcaa11e lua: expect overflow when selecting lua types
When converting a value to its Lua representation, we choose
an integer type if it fits. If it doesn't, we fall back to a
more expensive type. So we explicitly try to trigger an overflow.

However, clang's ubsan doesn't like the overflow, and kills the
test. Tell it that the overflow is expected here.

Closes #7374
2020-10-11 15:38:07 +03:00
Avi Kivity
6bc6db8037 utils/array-search: document restrictions
Our AVX2 implementation cannot load a partial vector,
or mask unused elements (that can be done with AVX-512/SVE2),
so it has some restrictions. Document them.

Closes #7385
2020-10-11 15:19:54 +03:00
Avi Kivity
3e2707c2bf utils: fragmented_temporary_buffer: don't add to potentially null pointers
Offsetting a null pointer is undefined, and clang's ubsan complains.

Rearrange the arithmetic so we never offset a null pointer. A function
is introduced for the remaining contiguous bytes so it can cast the result
to size_t, avoiding a compare-of-different-signedness warning from gcc.

Closes #7373
2020-10-11 15:05:15 +03:00
Benny Halevy
d55985bb7d build: Upgrade to seastar API level 6
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20201011105422.818623-2-bhalevy@scylladb.com>
2020-10-11 14:40:32 +03:00
Benny Halevy
064aae8ffa flush_queue: call_helper: support no variadic futures
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20201011105422.818623-1-bhalevy@scylladb.com>
2020-10-11 14:40:32 +03:00