Commit Graph

1243 Commits

Author SHA1 Message Date
Michał Chojnowski
b346136e98 utils: config_file: fix handling of workdir,W in the YAML file
Option names given in db/config.cc are handled for the command line by passing
them to boost::program_options, and for YAML by comparing them with YAML
keys.
boost::program_options has logic for understanding the
long_name,short_name syntax, so for a "workdir,W" option both --workdir and -W
worked, as intended. But our YAML config parsing doesn't have this logic
and expected "workdir,W" verbatim, which is obviously not intended. Fix that.
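A minimal sketch of the idea, assuming the YAML side only needs the long name. The helper names `long_option_name` and `yaml_key_matches` are hypothetical and are not the functions used in utils/config_file:

```cpp
#include <string_view>

// Hypothetical helper: extracts the long name from a
// boost::program_options-style "long_name,short_name" spec.
std::string_view long_option_name(std::string_view spec) {
    // Everything before the first comma is the long name. If there is no
    // comma, find() returns npos and substr() yields the whole spec.
    return spec.substr(0, spec.find(','));
}

// Hypothetical helper: a YAML key matches an option when it equals the
// option's long name, not the verbatim "long,short" spec.
bool yaml_key_matches(std::string_view yaml_key, std::string_view spec) {
    return yaml_key == long_option_name(spec);
}
```

With this, a YAML key `workdir` matches the spec `workdir,W`, while the verbatim key `workdir,W` no longer does.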

Fixes #7478
Fixes #9500
Fixes #11503

Closes #11506

(cherry picked from commit af7ace3926)
2023-02-22 21:33:04 +02:00
Pavel Emelyanov
3723713130 exceptions: Mark storage_io_error::code() with noexcept
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-10-13 15:36:38 +03:00
Nadav Har'El
abb6817261 cql: validate bloom_filter_fp_chance up-front
Scylla's Bloom filter implementation has a minimal false-positive rate
that it can support (6.71e-5). When setting bloom_filter_fp_chance any
lower than that, the compute_bloom_spec() function, which writes the bloom
filter, throws an exception. However, this is too late - it only happens
while flushing the memtable to disk, and a failure at that point causes
Scylla to crash.

Instead, we should refuse the table creation with the unsupported
bloom_filter_fp_chance. This is also what Cassandra did six years ago -
see CASSANDRA-11920.
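A rough sketch of such an up-front check, assuming it runs while the schema statement is processed. The function name, the exception type, and the upper bound of 1.0 are assumptions for illustration; only the 6.71e-5 minimum comes from the message above:

```cpp
#include <stdexcept>
#include <string>

// Minimum false-positive rate quoted in the commit message.
constexpr double min_supported_fp_chance = 6.71e-5;

// Hypothetical validation hook. Rejecting the value here means the error
// reaches the client at table-creation time instead of surfacing during a
// memtable flush.
void validate_bloom_filter_fp_chance(double fp_chance) {
    // Written as a negated range check so that NaN is rejected too.
    // The upper bound of 1.0 is an assumption of this sketch.
    if (!(fp_chance >= min_supported_fp_chance && fp_chance <= 1.0)) {
        throw std::invalid_argument(
            "bloom_filter_fp_chance must be between " +
            std::to_string(min_supported_fp_chance) + " and 1.0, got " +
            std::to_string(fp_chance));
    }
}
```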

This patch also includes a regression test, which crashes Scylla before
this patch but passes after the patch (and also passes on Cassandra).

Fixes #11524.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #11576

(cherry picked from commit 4c93a694b7)
2022-10-04 16:21:48 +03:00
Pavel Emelyanov
3e7c57d162 cross-shard-barrier: Capture shared barrier in complete
When a cross-shard barrier is abort()-ed, it spawns a background fiber
that will wake up other shards (if they are sleeping) with an exception.

This fiber is implicitly waited for by the owning sharded service's
.stop(), because barrier usage is like this:

    sharded<service> s;
    co_await s.invoke_on_all([] {
        ...
        barrier.abort();
    });
    ...
    co_await s.stop();

If abort happens, the invoke_on_all() will only resolve _after_ it
queues up the waking lambdas into smp queues, thus the subsequent stop
will queue its stopping lambdas after barrier's ones.

However, in debug mode the queue can be shuffled, so the owning service
can suddenly be freed from under the barrier's feet causing use after
free. Fortunately, this can be easily fixed by capturing the shared
pointer on the shared barrier instead of a regular pointer on the
shard-local barrier.
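The lifetime argument can be illustrated without Seastar. The following is a simplified sketch using std::shared_ptr and a plain task queue; `shared_state`, `local_barrier`, and the queue are hypothetical stand-ins, not the real cross-shard barrier types:

```cpp
#include <functional>
#include <memory>
#include <vector>

// Stand-in for the state shared by all shards' barriers.
struct shared_state {
    int wakeups = 0;
};

// Stand-in for the shard-local barrier owned by the service.
struct local_barrier {
    std::shared_ptr<shared_state> shared;

    // Queues a wake-up task. The lambda captures a copy of the shared
    // pointer rather than `this`, so the task stays valid even if this
    // shard-local object is destroyed before the task runs.
    void abort(std::vector<std::function<void()>>& queue) {
        queue.push_back([s = shared] { ++s->wakeups; });
    }
};
```

Because the queued task owns a reference to the shared state, reordering the task after the owner's destruction no longer touches freed memory.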

fixes: #11303

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #11553
2022-10-03 13:20:28 +03:00
Tomasz Grabiec
25d2da08d1 db: range_tombstone_list: Avoid quadratic behavior when applying
Range tombstones are kept in memory (cache/memtable) in
range_tombstone_list. It keeps them deoverlapped, so applying a range
tombstone which covers many range tombstones will erase existing range
tombstones from the list. This operation needs to be exception-safe,
so range_tombstone_list maintains an undo log. This undo log will
receive a record for each range tombstone which is removed. For
exception safety reasons, before pushing an undo log entry, we reserve
space in the log by calling std::vector::reserve(size() + 1). This is
O(N) where N is the number of undo log entries. Therefore, the whole
application is O(N^2).

This can cause reactor stalls and availability issues when replicas
apply such deletions.

This patch avoids the problem by reserving an exponentially increasing
amount of space. Also, to avoid large allocations, it switches the
container to chunked_vector.
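A minimal sketch of the reservation strategy, shown on std::vector for brevity (the patch itself uses chunked_vector). The helper name `reserve_for_one_more` is hypothetical:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper, not the actual range_tombstone_list code.
// After it returns, the next push_back() cannot allocate, and the total
// cost of N calls stays O(N) instead of O(N^2).
template <typename T>
void reserve_for_one_more(std::vector<T>& undo_log) {
    if (undo_log.size() < undo_log.capacity()) {
        return; // Spare room already exists.
    }
    // Double the capacity instead of asking for exactly size() + 1, which
    // may reallocate and copy every existing entry on each call.
    undo_log.reserve(std::max<std::size_t>(1, undo_log.capacity() * 2));
}
```

Appending 1000 entries this way triggers only about log2(1000) reallocations, while each push still has its space guaranteed in advance.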

Fixes #11211

Closes #11215

(cherry picked from commit 7f80602b01)
2022-09-30 00:01:26 +03:00
Avi Kivity
3d9800eb1c logalloc: don't crash while reporting reclaim stalls if --abort-on-seastar-bad-alloc is specified
The logger is proof against allocation failures, except if
--abort-on-seastar-bad-alloc is specified. If it is, it will crash.

The reclaim stall report is likely to be called in low memory conditions
(reclaim's job is to alleviate these conditions after all), so we're
likely to crash here if we're reclaiming a very low memory condition
and have a large stall simultaneously (AND we're running in a debug
environment).

Prevent all this by disabling --abort-on-seastar-bad-alloc temporarily.

Fixes #11549

Closes #11555

(cherry picked from commit d3b8c0c8a6)
2022-09-18 13:24:21 +03:00
Benny Halevy
1d9862dab3 logalloc: region_impl: add moved method
Don't open-code calling the region_impl
_listeners->moved() in region move-constructor
and move-assignment op.

The other._impl->_region might be different than &other
post region::merge so let the region_impl
decide which region* is moved from.

The new_region is also set to region_impl->_region
so no need to open-code that either at the said call sites.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-28 10:49:49 +03:00
Benny Halevy
cd4dbb1cae logalloc: region: merge: optimize getting other impl
The other _impl is presumed to be engaged already,
so just call other.get_impl() once for both use cases.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-28 10:49:36 +03:00
Benny Halevy
a547cb79e8 logalloc: region: merge: call region_impl::unlisten
We can't be sure that the other_impl->_region == &other
since it could be a result of a previous merge,
so don't decide for it which region to unlisten to,
let it use its current _region.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-28 10:49:27 +03:00
Benny Halevy
003216de59 logalloc: region: call unlisten rather than open coding it
Current ~region and region::operator= open-code
region_impl::unlisten.  Just call it so it will be
easier to maintain.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-28 10:49:11 +03:00
Benny Halevy
cff953535c logalloc: region move-ctor: initialize _impl
There's no need to default-initialize it
and then move-assign it.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-28 10:49:05 +03:00
Benny Halevy
c7d77e4076 logalloc: region: get_impl might be called on disengaged _impl when moved
First check if _impl is engaged before accessing it
to set its _region = this in the move constructor and
move assignment operator.

Add unit test for these odd corner cases.
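The pattern can be sketched as follows. This is a heavily simplified model, assuming a unique_ptr-held impl with a back-pointer; the real logalloc::region differs, and the `engaged()` and `owner()` accessors exist only for this illustration:

```cpp
#include <memory>
#include <utility>

class region;

// Simplified stand-in for the impl object, holding a back-pointer.
struct region_impl {
    region* _region = nullptr;
};

class region {
    std::unique_ptr<region_impl> _impl;
public:
    region() : _impl(std::make_unique<region_impl>()) { _impl->_region = this; }

    region(region&& other) noexcept : _impl(std::move(other._impl)) {
        // `other` may itself have been moved from, leaving _impl empty.
        if (_impl) {
            _impl->_region = this;
        }
    }

    region& operator=(region&& other) noexcept {
        if (this != &other) {
            _impl = std::move(other._impl);
            if (_impl) {
                _impl->_region = this;
            }
        }
        return *this;
    }

    // Illustration-only accessors.
    bool engaged() const noexcept { return static_cast<bool>(_impl); }
    const region* owner() const noexcept { return _impl ? _impl->_region : nullptr; }
};
```

Moving from an already moved-from region then yields another disengaged region instead of dereferencing a null impl.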

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-28 10:48:58 +03:00
Avi Kivity
2c0932cc41 Merge 'Reduce the amount of per-table metrics' from Amnon Heiman
This series is the first step in the effort to reduce the number of metrics reported by Scylla.
The series focuses on the per-table metrics.

The combination of histograms, per-tables, and per shard makes the number of metrics in a cluster explode.
The following series uses multiple tools to reduce the number of metrics.
1. Multiple metrics should only be reported for the user tables and the condition that checked it was not updated when more non-user keyspaces were added.
2. Second, instead of a histogram, per table, per shard, it will report a summary per table, per shard, and a single histogram per node.
3. Histograms, summaries, and counters will be reported only if they are used (for example, the cas-related metrics will not be reported for tables that are not using cas).

Closes #11058

* github.com:scylladb/scylla:
  Add summary_test
  database: Reduce the number of per-table metrics
  replica/table.cc: Do not register per-table metrics for system
  histogram_metrics_helper.hh: Add to_metrics_summary function
  Unified histogram, estimated_histogram, rates, and summaries
  Split the timed_rate_moving_average into data and timer
  utils/histogram.hh: should_sample should use a bitmask
  estimated_histogram: add missing getter method
2022-07-27 22:01:08 +03:00
Amnon Heiman
9a3e70adfb histogram_metrics_helper.hh: Add to_metrics_summary function
The to_metrics_summary is a helper function that creates a metrics type
summary from a timed_rate_moving_average_with_summary object.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2022-07-27 16:58:52 +03:00
Amnon Heiman
c220e3a00f Unified histogram, estimated_histogram, rates, and summaries
Currently, there are two metrics reporting mechanisms: the metrics layer
and the API. In most cases, they use the same data sources. The main
difference is around histograms and rate.

The API calculates an exponentially weighted moving average using a
timer that decays the average on each time tick. It calculates a
poor-man histogram by holding the last few entries (typically the last
256 entries). The caller to the API uses those last entries to build a
histogram.
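For readers unfamiliar with the decay-on-tick scheme, here is a simplified sketch of an exponentially weighted moving average of a rate. The class name, the `alpha` weight, and the first-tick initialization are assumptions of this sketch and are not taken from the Scylla sources:

```cpp
#include <cstdint>

// Hypothetical, simplified rate tracker; names are illustrative only.
class ewma_rate {
    double _alpha;              // Weight of the newest interval, in (0, 1].
    double _interval_seconds;   // Length of one tick.
    double _rate = 0;           // Smoothed events per second.
    std::uint64_t _pending = 0; // Events seen since the last tick.
    bool _initialized = false;
public:
    ewma_rate(double alpha, double interval_seconds) noexcept
        : _alpha(alpha), _interval_seconds(interval_seconds) {}

    // Records n events; cheap enough for the hot path.
    void mark(std::uint64_t n = 1) noexcept { _pending += n; }

    // Meant to be called by a periodic timer, once per interval.
    void tick() noexcept {
        const double instant = static_cast<double>(_pending) / _interval_seconds;
        _pending = 0;
        if (_initialized) {
            // Move the average toward the latest interval's rate.
            _rate += _alpha * (instant - _rate);
        } else {
            // Assumption: seed the average with the first observation.
            _rate = instant;
            _initialized = true;
        }
    }

    double rate() const noexcept { return _rate; }
};
```

An idle interval pulls the average toward zero by a factor of (1 - alpha), which is the decay the timer provides.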

We want to add summaries to Scylla. Similar to the API rate and
histogram, summaries are calculated per time interval.

This patch creates a unified mechanism by introducing an object that
would hold both the old-style histogram and the new
(estimated_histogram). On each time tick, a summary would be calculated.
In the future, we'll replace the API to report summaries instead of the
old-style histogram and deprecate the old style completely.

summary_calculator uses two estimated_histogram to calculate a summary.

timed_rate_moving_average_summary_and_histogram is a unified class for
ihistogram, rates, summary, and estimated_histogram and will replace
timed_rate_moving_average_and_histogram.

Follow-up patches would move code from using
timed_rate_moving_average_and_histogram to
timed_rate_moving_average_summary_and_histogram.  Keeping the API
makes the transition easy.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2022-07-27 16:58:25 +03:00
Benny Halevy
6e961ead3b logalloc: mark free functions noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:43:17 +03:00
Benny Halevy
705b42efe2 logalloc: allocating_section: mark functions noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:43:17 +03:00
Benny Halevy
f9db708376 logalloc: allocating_section: guard: mark constructor noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:43:17 +03:00
Benny Halevy
5416808367 logalloc: reclaim_lock: mark functions noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:43:17 +03:00
Benny Halevy
95b0e41abb logalloc: tracker_reclaimer_lock: mark constructor noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:43:17 +03:00
Benny Halevy
ed9e036509 logalloc: mark shard_tracker noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:43:17 +03:00
Benny Halevy
d6e6ffc741 logalloc: region: mark functions const/noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:43:17 +03:00
Benny Halevy
2beee4a6cd logalloc: basic_region_impl: mark functions noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:43:17 +03:00
Benny Halevy
3ba85c3bbd logalloc: region_impl: mark functions noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:43:17 +03:00
Benny Halevy
d838456be2 utils: log_heap: mark functions noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:43:17 +03:00
Benny Halevy
3f96818c03 logalloc: region_impl: object_descriptor: mark functions noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:43:17 +03:00
Benny Halevy
0866548b27 logalloc: region_group: mark functions noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:43:17 +03:00
Benny Halevy
fe50c76dbc logalloc: tracker: mark functions const/noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:40:50 +03:00
Benny Halevy
71c21a83ad logalloc: tracker::impl: make region_occupancy and friends const
Now that they don't modify the tracker impl.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:40:18 +03:00
Benny Halevy
1c0c01cc24 logalloc: tracker::impl: occupancy: get rid of reclaiming_lock
It was added in d20fae96a2
as a precaution not to invalidate iterators while
traversing _regions.  However it is not required as no allocation
is done on this synchronous path - therefore there is no
point in preventing reclaim.

This will allow making the respective functions const
as they merely return stats and do not modify the tracker impl.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:39:18 +03:00
Benny Halevy
888e225113 logalloc: tracker::impl: mark functions noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:39:16 +03:00
Benny Halevy
f0027f60d4 logalloc: segment: mark functions const / noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:34:56 +03:00
Benny Halevy
830912cfa0 logalloc: segment_pool: add const variant of descriptor method
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:34:48 +03:00
Benny Halevy
f318d1664e logalloc: segment_pool: move descriptor method to class definition
To make the implementation inline and to prepare
for the next patch that adds a const overload of
this method.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:34:37 +03:00
Benny Halevy
35899463d4 logalloc: segment_pool: mark functions const/noexcept
Some methods were also marked inline when declared in the class
definition and in their definition site to provide a hint to
the compiler to inline them.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:33:47 +03:00
Benny Halevy
02e74696f2 logalloc: segment_pool: delete unused free_or_restore_to_reserve method
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:33:21 +03:00
Benny Halevy
00dae56e19 utils: dynamic_bitset: mark functions noexcept
dynamic_bitset allocates only when constructed;
from then on it doesn't throw.

Note, though, that accessing bits out of range
is undefined behavior.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:32:36 +03:00
Benny Halevy
d911d03344 utils: dynamic_bitset: delete unused members
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:32:08 +03:00
Benny Halevy
da87a4a248 logalloc: segment_store, segment_pool: idx_from_segment: get a const segment* in const overload
To maintain the const chain from segment via segment_store to
segment_pool.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:28:21 +03:00
Benny Halevy
947f71ce91 logalloc: segment_store, segment_pool: return const segment* from segment_from_idx() const
Maintain the const chain by returning a const segment*
from segment_from_idx() const overload.

And add a respective mutable overload to return a mutable segment*.

This is done for a similar change in idx_from_segment.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:27:30 +03:00
Benny Halevy
17902da66c logalloc: segment_store: make can_allocate_more_segments const
Add a const noexcept overload of `find_empty()` so that
can_allocate_more_segments can be const noexcept as well.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:26:07 +03:00
Benny Halevy
2ae61d5209 logalloc: segment_store: mark functions noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:19:58 +03:00
Benny Halevy
852c23b97a logalloc: segment_descriptor: mark functions noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:18:15 +03:00
Benny Halevy
a49619a601 logalloc: occupancy_stats: mark functions noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:17:43 +03:00
Benny Halevy
721e94dcf1 min_max_tracker: mark functions noexcept
Based on tracked types being nothrow copy and move constructible.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:17:27 +03:00
Benny Halevy
a6356539bf logalloc: lsa_buffer: mark functions noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 10:22:35 +03:00
Amnon Heiman
72414b613b Split the timed_rate_moving_average into data and timer
This patch splits the timed_rate_moving_average functionality into two, a
data class: rates_moving_average, and a wrapper class
timed_rate_moving_average that uses a timer to update the rates
periodically.

To make the transition as simple as possible, timed_rate_moving_average
takes the original API.

A new helper class meter_timer was introduced to handle the timer update
functionality.

This change required minimal code adaptation in some other parts of the
code.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2022-07-26 15:59:33 +03:00
Amnon Heiman
5bf51ed4af utils/histogram.hh: should_sample should use a bitmask
This patch fixes a bug in should_sample that uses its bitmask
incorrectly.

basic_ihistogram has a feature that allows it to sample values instead
of taking a timer each time.

To decide if it should sample or not, it uses a bitmask. The bitmask
is of the form 2^n-1, which means 1 out of 2^n will be sampled.

For example, if the mask is 0x1 (2^1-1), 1 out of 2 will be sampled.
If the mask is 0x7 (2^3-1), 1 out of 8 will be sampled.

There was a bug in the should_sample() method.
The correct form is (value&mask) == mask
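A minimal sketch of the corrected check, written as a free function for illustration (in the source it is a method of basic_ihistogram, and the parameter names here are assumptions):

```cpp
#include <cstdint>

// Returns true for exactly 1 out of every 2^n consecutive counter values
// when mask == 2^n - 1: only values whose low n bits are all set pass.
// A mask of 0 samples every value.
inline bool should_sample(std::uint64_t count, std::uint64_t mask) noexcept {
    return (count & mask) == mask;
}
```

For instance, with mask 0x7 the counter values 7, 15, 23, ... are sampled, which is 1 out of 8.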

Ref #2747
It does not solve all of #2747, just the bug part of it.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2022-07-26 15:59:33 +03:00
Amnon Heiman
99bc6d882b estimated_histogram: add missing getter method
This patch adds the square bracket operator method that was missing.
2022-07-26 15:59:33 +03:00
Avi Kivity
5b541bed72 logalloc: drop region_impl public accessors
With the region heap handle removed from logalloc::region, there is
nothing remaining there that needs violation of the abstraction
boundary, so we can drop these hacks.
2022-07-26 11:12:10 +03:00