The test does the following:
- Enables an error injection which will cause further view updates to
get stuck, occupying space in memory and affecting the backlog,
- Performs a single, large write to the base table which causes a single
view update to be generated; the write is then followed with one more,
small write to make sure that the other write will be affected by the
first write's backlog,
- Reads relevant metrics in order to check the exact value of the delay
that was calculated for the base table write due to MV backpressure.
This is done for different values of the MV delay configuration
parameter (view_flow_control_delay_limit_in_ms) and the calculated
delays are collected into a list. Lastly, the test checks that the
relation between parameter value and the calculated delays is linear.
Information about view update backlog is propagated in two main ways:
- In RPCs that serve as responses to writes (MUTATION_DONE /
MUTATION_FAILED)
- Via gossip (application_state::VIEW_BACKLOG)
In tests, it can be benefical to disable the second mechanism. View
update backlog propagation via write responses happens synchronously
with respect to writes so it is easier to control and reason about,
while gossip is asynchronous and can overwrite the backlog that was
propagated via write responses.
Add `skip_updating_local_backlog_via_view_update_backlog_broker` error
injection which skips the logic that updates the local, per-endpoint
cache of view update backlogs from the gossip state.
Until this patch, the materialized view flow-control algorithm
(https://www.scylladb.com/2018/12/04/worry-free-ingestion-flow-control/)
used a constant delay_limit_us hard-coded to one second, which means
that when the size of view-update backlog reached the maximum (10%
of memory), we delay every request by an additional second - while
smaller amounts of backlog will result in smaller delays.
This hard-coded one maximum second delay was considered *huge* - it will
slow down a client with concurrency 1000 to just 1000 requests per
second - but we already saw some workloads where it was not enough -
such as a test workload running very slow reads at high concurrency
on a slow machine, where a latency of over one second was expected
for each read, so adding a one second latecy for writes wasn't having
any noticable affect on slowing down the client.
So this patch replaces the hard-coded default with a live-updateable
configuration parameter, `view_flow_control_delay_limit_in_ms`, which
defaults to 1000ms as before.
Another useful way in which the new `view_flow_control_delay_limit_in_ms`
can be used is to set it to 0. In that case, the view-update flow
control always adds zero delay, and in effect - does absolutely
nothing. This setting can be used in emergency situations where it
is suspected that the MV flow control is not behaving properly, and
the user wants to disable it.
The new parameter's help string mentions both these use cases of
the parameter.
Fixes#18187
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Remove the `--python` option which was originally added in 780d9a26b2 to support
CentOS's non-standard python3 path (`/usr/bin/python3.4`).
Since we now:
- Build using a Fedora-based container with standard python3 path
- Use properly configured shebangs in build scripts
- Set correct executable permissions on Python scripts
This change:
1. Removes the `--python` command line option
2. Updates build rules to execute Python scripts directly instead of via interpreter
This simplifies the build system and reduces differences between CMake and
configure.py-generated rules.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#21607
As reported by @Deexie, during the process of opening backport PRs in https://github.com/scylladb/scylladb/pull/21616, No invite emails were sent, causing a lack of permissions for the backport PR branch
The check if `has_in_collaborators(pr.user.login)` was pointing to `scylladb/scylladb` instead of `scylladbbot/scylladb`, fixing it
I also moved the collaborator check to an early stage, before trying to open a backport PR
Closesscylladb/scylladb#21645
This adds a new tablet migration kind: repair. It allows tablet repair
scheduler to use this migration kind to schedule repair jobs.
The current repair scheduler implementation does the following:
- A tablet is picked to be repaired when is requested by user
- The tablet repair can be scheduled along with tablet migration and
rebuild. It runs in the tablet_migration track.
- Repair jobs are scheduled in a smart way so that at any point in time,
there are no more than configured jobs per shard, which is similar to
scylla manager's control.
New feature. No backport is needed.
Closesscylladb/scylladb#21088
* github.com:scylladb/scylladb:
test: Add tests for tablet repair scheduler
repair: Add restful API for tablet repair
repair: Add tablet repair scheduler internal API support
docs: Update system_keyspace.md for tablet repair related info
docs: Add docs for tablet repair migration
repair: Add core tablet repair scheduler support
messaging_service: Introduce TABLET_REPAIR verb
tablet_allocator: Introduce stream_weight for tablet_migration_streaming_info
network_topology_strategy: Preserve fields of task_info in reallocate_tablets
now that we are allowed to use C++23. we now have the luxury of using `std::ranges::find_if`.
in this change, we:
- replace `boost::find_if` with `std::ranges::find_if`
- remove all `#include <boost/range/algorithm/find_if.hpp>`
to reduce the dependency to boost for better maintainability, and leverage standard library features for better long-term support.
this change is part of our ongoing effort to modernize our codebase and reduce external dependencies where possible.
---
it's a cleanup, hence no need to backport.
Closesscylladb/scylladb#21495
* github.com:scylladb/scylladb:
treewide: replace boost::find_if with std::ranges::find_if
counters: replace boost::find_if with std::ranges::find_if
combine.hh: use std::iter_const_reference_t when appropriate
Currently, PER PARTITION LIMIT is not implemented for aggregates and queries can result in more rows than expected from the same partition.
Instrument the result_set_builder class so that it can enforce PER PARTITION LIMIT for aggregate queries, specifically:
- add per_partition_limit to the result_set_builder
- expose the number of input rows in the selector
result_set_builder gets two new functions handling partition start and end:
- accept_partition_end for notifying that a partition has been finished. This is also called when a page ends, so we cannot simply flush here, as a naive implementation could do.
- accept_new_partition, where we flush_selectors() if it's indeed a new partition (and not a continuation of the previous) and the query has a grouping: we don't want to flush on new partition in a query like SELECT COUNT(*) FROM foo;
Fixes#5363Closesscylladb/scylladb#21125
* github.com:scylladb/scylladb:
test: enable PER PARTIION LIMIT + GROUP BY tests
cql3: respect PER PARTITION LIMIT for aggregates
cql3: selection: count input rows in the selector
cql3: selection: pass per partition limit to the result_set_builder
cql3: show different messages for LIMIT and PER PARTITION LIMIT in get_limit
Currently, task_manager_module::abort_all_repairs marks top-level repairs as aborted (but does not abort them) and aborts all existing shard tasks.
A running repair checks whether its id isn't contained in _aborted_pending_repairs and then proceeds to create shard tasks. If abort_all_repairs is executed after _aborted_pending_repairs is checked but before shard tasks are created, then those new tasks won't be aborted. The issue is the most severe for tablet_repair_task_impl that checks the _aborted_pending_repairs content from different shards, that do not see the top-level task. Hence the repair isn't stopped but it creates shard repair tasks on all shards but the one that initialized repair.
Abort top-level tasks in abort_all_repairs. Fix the shard on which the task abort is checked.
Fixes: #21612.
Needs backport to 6.1 and 6.2 as they contain the bug.
Closesscylladb/scylladb#21616
* github.com:scylladb/scylladb:
test: add test to check if repair is properly aborted
repair: add shard param to task_manager_module::is_aborted
repair: use task abort source to abort repair
repair: drop _aborted_pending_repairs and utilize tasks abort mechanism
repair: fix task_manager_module::abort_all_repairs
Those internal APIs allow to add / del a tablet repair request and
config the tablet repair scheduler.
It can be used by task manager or plain restful api.
This adds a new tablet migration kind: repair. It allows tablet repair
scheduler to use this migration kind to schedule repair jobs.
The current repair scheduler implementation does the following:
- A tablet is picked to be repaired when the time since last repair is
bigger than a threshold (auto repair mode) or it is requested by user
(manual repair mode)
- The tablet repair can be scheduled along with tablet migration and
rebuild. It runs in the tablet_migration track.
- Repair jobs are scheduled in a smart way so that at any point in time,
there are no more than configured jobs per shard, which is similar to
scylla manager's control.
In this patch, both the manual repair and the auto repair are not
enabled yet.
Before these changes, we didn't wait for the materialized views to
finish building before writing to the base table. That led to generating
an additional view update, which, in turn, led to test failures.
The scenario corresponding to the summary above looked like this:
1. The test creates an empty table and MVs on it.
2. The view builder starts, but it doesn't finish immediately.
3. The test performs mutations to the base table. Since the views
already exist, view updates are generated.
4. Finally, the view builder finishes. It notices that the base
table has a row, so it generates a view update for it because
it doesn't notice that we already have data in the view.
We solve it by explicitly waiting for both views to finish building
and only then start writing to the base table.
Additionally, we also fix a lifetime issue of the row the test revolves
around, further stabilizing CI.
Fixes https://github.com/scylladb/scylladb/issues/20889
Backport: These changes have no semantic effect on the codebase,
but they stabilize CI, so we want to backport them to the maintained
versions of Scylla.
Closesscylladb/scylladb#21632
* github.com:scylladb/scylladb:
test/boost/view_schema_test.cc: Increase TTL in test_view_update_generating_writetime
test/boost/view_schema_test.cc: Wait for views to build in test_view_update_generating_writetime
The auxiliary function `eventually()` (defined in `test/lib/eventually.hh`)
tries to execute a passed function. If it throws, `eventually()` sleeps
for `2^#previous_attempts` milliseconds and tries to perform it again.
The default limit of attempts is 17.
In `test_view_update_generating_writetime`, right before the last test
case, we perform:
```cql
UPDATE t USING TTL 10 AND TIMESTAMP 8 SET g=40 WHERE k=1 AND c=1;
```
The test case itself executes:
```cql
SELECT WRITETIME(g) FROM t;
```
and asserts that the result of the query is equal to 8, i.e. it
corresponds to the timestamp of the last write to the table `t`.
However, if the test case keeps failing, then during its 14th attempt
(so affter sleeping for at least `2^14 - 1` milliseconds, which amounts
to about 16 seconds), we'll observe the following error:
```
[Exception] - std::runtime_error: Expected row not found: [0000000000000008] not in {result_message::rows {row: null}}
```
The reason behind it is the specified TTL is too short. 10 seconds will
have already passed before the 14th attempt, so the value in the column
`g` will be `NULL` again. In particular, the `WRITETIME(g)` will no
longer be equal to `8`.
To solve that issue, we change the TTL in the CQL statement to 300.
The time spent on 17 loops of `eventually()` amounts to about
`2^18 - 1` milliseconds, which is about 263 seconds. That's why
setting the TTL to 300 seconds should be enough to prevent the error
from occurring.
Before these changes, we didn't wait for the materialized views to
finish building before writing to the base table. That led to generating
an additional view update, which, in turn, led to test failures.
The scenario corresponding to the summary above looked like this:
1. The test creates an empty table and MVs on it.
2. The view builder starts, but it doesn't finish immediately.
3. The test performs mutations to the base table. Since the views
already exist, view updates are generated.
4. Finally, the view builder finishes. It notices that the base
table has a row, so it generates a view update for it because
it doesn't notice that we already have data in the view.
We solve it by explicitly waiting for both views to finish building
and only then start writing to the base table.
Fixesscylladb/scylladb#20889
Alternator's "/localnodes" HTTP requests is supposed to return the list
of nodes in the local DC to which the user can send requests.
Before commit bac7c33313 we used the
gossiper is_alive() method to determine if a node should be returned.
That commit changed the check to is_normal() - because a node can be
alive but in non-normal (e.g., joining) state and not ready for
requests.
However, it turns out that checking is_normal() is not enough, because
if node is stopped abruptly, other nodes will still consider it "normal",
but down (this is so-called "DN" state). So we need to check **both**
is_alive() and is_normal().
This patch also adds a test reproducing this case, where a node is
shut down abruptly. Before this patch, the test failed ("/localnodes"
continued to return the dead node), and after it it passes.
Fixes#21538
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#21540
Adding `Fixes` validation to a PR when backport labels were added. When the auto backport process triggers (after promotion), we will ensure each PR with backport/x.y label also has in the PR body a `Fixes` reference to an issue
Fixes: https://github.com/scylladb/scylladb/issues/20021Closesscylladb/scylladb#21563
Change e3e8a94c9a changed
the semantics of the enable_tablets config option,
but updating that in the option documentation in scylla.yaml
was missed.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closesscylladb/scylladb#21614
This change was created in the same spirit of f8221b960f.
The S3ProxyServer (introduced in 8919e0abab) currently prints its
status directly to stdout, which can be distracting when reviewing test
results. For example:
```console
$ ./test.py --mode release object_store/test_backup::test_simple_backup_and_restore
Found 1 tests.
Setting minio proxy random seed to 1731924995
Starting S3 proxy server on ('127.193.179.2', 9002)
================================================================================
[N/TOTAL] SUITE MODE RESULT TEST
------------------------------------------------------------------------------
[1/1] object_store release [ PASS ] object_store.test_backup.1
Stopping S3 proxy server
------------------------------------------------------------------------------
CPU utilization: 3.1%
```
Move these messages to use proper logging to give developers more control
over their visibility:
- Make logger parameter mandatory in S3ProxyServer constructor
- Route "Stopping S3 proxy" message through the provided logger
- Add --log-level option to the standalone proxy server launcher
The message is now hidden:
```console
$ ./test.py --mode release object_store/test_backup::test_simple_backup_and_restore
Found 1 tests.
================================================================================
[N/TOTAL] SUITE MODE RESULT TEST
------------------------------------------------------------------------------
[1/1] object_store release [ PASS ] object_store.test_backup.1
------------------------------------------------------------------------------
CPU utilization: 4.1%
```
---
this change improves the developer experience, hence no need to backport.
Closesscylladb/scylladb#21610
* github.com:scylladb/scylladb:
test: route S3 Proxy server messages through logger
test: s3_proxy: remove unused method
now that we are allowed to use C++23. we now have the luxury of using
`std::ranges::find_if`.
in this change, we:
- replace `boost::find_if` with `std::ranges::find_if`
- remove all `#include <boost/range/algorithm/find_if.hpp>`
to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.
this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
std::ranges allows us to create a range from a pair of iterators.
but the iterator has to fulfill the concept of `std::semiregular`.
in order to reduce the header dependency on boost, we need to
make `basic_counter_cell_view::shard_iterator` to support
`std::semiregular`.
in this change:
- define a default constructor for
`basic_counter_cell_view::shard_iterator`, so that the iterator
satisfies the constraints of `std::semiregular`, as required by
C++20's forward_iterator concept. please note, despite that
the standard requires the iterator to be `std::semiregular`, but
the iterator created by default constructor is not evaluated in
production. sometimes, the standard algorithms just need to
store/create itermediate iterators or to represent a "singular"
state for iterator. a use case is an empty container.
- change `basic_counter_cell_view::shard_iterator::reference` so
its dereference returns a rvalue instead of a reference. because
per C++20 standard, the dereference of a forward_iterator should
be stable, but we were returning a reference / pointer referencing
a member variable of the iterator. so once the iterator is destructed,
the returned reference / pointer would be invalidated. so we have to
return a value to fulfill the requiremend of forward_iterator. this
change also fulfills the requirement of `same_as<iter_reference_t<It>,
iter_reference_t<const It>>`, which a part of the
`indirectly_readable` requirement.
- let `basic_counter_cell_view::shards()` return a subrange
- let `basic_counter_shard_view::swap_value_and_clock()` accepts
a plain value instead of a reference. because the dereference of
the iterator does not return a reference anymore. and the returned
type is a lightweighted "view", so the performance penality is
negligible.
- use ranges libraries when appropriate in this header.
this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
before this change, we assumed that the dereference types of
the given `InputIterator1` and `InputIterator2` are always
references. but this does not hold if the `operator*` returns
a rvalue, as in the C++20 standard, unlike the LegacyForwardIterator
requirement, `std::forward_iterator` does not requires
dereference to return a reference. so we should not assume this,
if we want to use `combine()` with iterators whose dereference
return a, for instance, rvalue.
in this change, we use `std::iter_const_reference_t` instead. this
type is deduced from the behavior of the iterator instead of hardwire
it to a reference type. this allows us to use a C++20 forward_iterator
with this generic function.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
The stream_weight for repair migration is set to 2, because it requires
more work than just moving the tablet around. The stream_weight for all
other migrations are set to 1.
For historic reasons, we have (in bytes.hh) a type sstring_view which is an alias for std::string_view - since the same standard type can hold a pointer into both a seastar::sstring and std::string.
This alias in unnecessary and misleading to new developers, who might be misled to believe it is assume it is somehow different from std::string_view - when it isn't.
This series removes all uses of sstring_view (changing them to use std::string_view), and in the last patch removes the alias itself. A few functions whose name referred to "sstring" but take a std::string_view were renamed.
The patches are fairly mechanical and trivial, with no functional changes intended. To ease the review the series was split to a few smaller patches that modify specific areas of the code.
Fixes#4062.
Closesscylladb/scylladb#21617
* github.com:scylladb/scylladb:
bytes: remove unused alias sstring_view
change remaining sstring_view to std::string_view
test: change sstring_view to std::string_view
cql3: change sstring_view to std::string_view
alternator: change sstring_view to std::string_view
type: change from_sstring() to from_string_view()
cross-tree: change to_sstring_view() to to_string_view()
Wean the mutation code (at least the headers) from boost ranges to std ranges,
in order to reduce the dependency load.
Cleanup, so no backport.
Closesscylladb/scylladb#21601
* github.com:scylladb/scylladb:
partition_snapshot_row_cursor.hh: switch from boost ranges to std ranges
mutation: mutation_partition_v2.hh: switch from boost ranges to std ranges
mutation: mutation_partition.hh: switch from boost ranges to std ranges
partition_snapshot_reader.hh: drop unused include boost/range/algorithm/heap_algorithm.hpp
This change adds support for PER PARTITION LIMIT for aggregate queries.
result_set_builder gets two new functions handling partition start and
end:
- accept_partition_end for notifying that a partition has been finished.
This is also called when a page ends, so we cannot simply flush here,
as a naive implementation could do.
- accept_new_partition, where we flush_selectors() if it's indeed a new
partition (and not a continuation of the previous) and the query has a
grouping: we don't want to flush on new partition in a query like
SELECT COUNT(*) FROM foo;
select_statement::get_limit is used to evaluate the LIMIT value for both
LIMIT and PER PARTITION LIMIT. This change fixes the error message for
incorrect values passed by the user.
Currently, task_manager_module::is_aborted checks whether a task
with given id was aborted on this shard.
In tablet_repair_task_impl::run, is_aborted method is called on all
shards to check if the parent task was aborted. However, even
for aborted parent, is_aborted will return true only on owner shard
of the parent.
Pass shard param to task_manager_module::is_aborted that indicates
which shard to check.
Currently, task_manager_module::abort_all_repairs marks top-level
repairs as aborted (but does not abort them) and aborts all shard
tasks. If after that a top-level repair creates a shard task,
the new shard repair won't be aborted.
Abort top-level repair tasks in abort_all_repairs. They will abort
their children and newly created shard tasks will be immediately
aborted.
Our "sstring_view" was an historic alias for the standard std::string_view.
All its uses were removed in the previous patches, so we can now finally
remove this unused alias.
Refs #4062.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Our "sstring_view" is an historic alias for the standard std::string_view.
The patch changes the last remaining random uses of this old alias across
our source directory to the standard type name.
After this patch, there are no more uses of the "sstring_view" alias.
It will be removed in the following patch.
Refs #4062.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Our "sstring_view" is an historic alias for the standard std::string_view.
The test/ directory used this old alias in a few of random places, let's
change them to use the standard type name.
Refs #4062.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Our "sstring_view" is an historic alias for the standard std::string_view.
The cql3/ directory used this old alias in a few of random places, let's
change them to use the standard type name.
Refs #4062.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Our "sstring_view" is an historic alias for the standard std::string_view.
Alternator only used this alias in a couple of random names, let's change
them to the standard type name.
Refs #4062.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
All CQL type implementations have a from_sstring(sstring_view) method.
The "sstring_view" type is just an historic alias for std::string_view,
so this patch switches to use the standard type as suggested in #4062,
and also renames these functions from_string_view() to emphesize they can
take any string view, and not necessarily a "sstring" as their old name
suggested.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
For historic reasons, we have (in bytes.hh) a type sstring_view which
is an alias for std::string_view - since the same standard type can hold
a pointer into both a seastar::sstring and std::string.
This alias in unnecessary and misleading to new developers (who might
assume it is somehow different from std::string_view). This patch doesn't
yet remove all occurances of sstring_view (the request in #4062), but
begins to do it by renaming one commonly-used function, to_sstring_view(bytes)
to to_string_view() and of course changes all its uses to the new name.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
This change was created in the same spirit of f8221b960f.
The S3ProxyServer (introduced in 8919e0abab) currently prints its
status directly to stdout, which can be distracting when reviewing test
results. For example:
```console
$ ./test.py --mode release object_store/test_backup::test_simple_backup_and_restore
Found 1 tests.
Setting minio proxy random seed to 1731924995
Starting S3 proxy server on ('127.193.179.2', 9002)
================================================================================
[N/TOTAL] SUITE MODE RESULT TEST
------------------------------------------------------------------------------
[1/1] object_store release [ PASS ] object_store.test_backup.1
Stopping S3 proxy server
------------------------------------------------------------------------------
CPU utilization: 3.1%
```
Move these messages to use proper logging to give developers more control
over their visibility:
- Make logger parameter mandatory in S3ProxyServer constructor
- Route "Stopping S3 proxy" message through the provided logger
- Add --log-level option to the standalone proxy server launcher
The message is now hidden:
```console
$ ./test.py --mode release object_store/test_backup::test_simple_backup_and_restore
Found 1 tests.
================================================================================
[N/TOTAL] SUITE MODE RESULT TEST
------------------------------------------------------------------------------
[1/1] object_store release [ PASS ] object_store.test_backup.1
------------------------------------------------------------------------------
CPU utilization: 4.1%
```
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
neither `InjectingHandler.log_error`, nor `InjectingHandler.log_message`
is used. so let's drop them.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>