Commit Graph

45472 Commits

Author SHA1 Message Date
Piotr Dulikowski
dcaf6582c4 materialized views: test for the MV delay configuration parameter
The test does the following:

- Enables an error injection which will cause further view updates to
  get stuck, occupying space in memory and affecting the backlog,
- Performs a single, large write to the base table which causes a single
  view update to be generated; the write is then followed with one more,
  small write to make sure that the other write will be affected by the
  first write's backlog,
- Reads relevant metrics in order to check the exact value of the delay
  that was calculated for the base table write due to MV backpressure.

This is done for different values of the MV delay configuration
parameter (view_flow_control_delay_limit_in_ms) and the calculated
delays are collected into a list. Lastly, the test checks that the
relation between parameter value and the calculated delays is linear.
2024-12-05 11:48:45 +01:00
Piotr Dulikowski
66afdc9b3c service: add injection for skipping view update backlog
Information about view update backlog is propagated in two main ways:

- In RPCs that serve as responses to writes (MUTATION_DONE /
  MUTATION_FAILED)
- Via gossip (application_state::VIEW_BACKLOG)

In tests, it can be benefical to disable the second mechanism. View
update backlog propagation via write responses happens synchronously
with respect to writes so it is easier to control and reason about,
while gossip is asynchronous and can overwrite the backlog that was
propagated via write responses.

Add `skip_updating_local_backlog_via_view_update_backlog_broker` error
injection which skips the logic that updates the local, per-endpoint
cache of view update backlogs from the gossip state.
2024-12-05 09:51:57 +01:00
Nadav Har'El
49f11f655c materialized view: make flow-control maximum delay configurable
Until this patch, the materialized view flow-control algorithm
(https://www.scylladb.com/2018/12/04/worry-free-ingestion-flow-control/)
used a constant delay_limit_us hard-coded to one second, which means
that when the size of view-update backlog reached the maximum (10%
of memory), we delay every request by an additional second - while
smaller amounts of backlog will result in smaller delays.

This hard-coded one maximum second delay was considered *huge* - it will
slow down a client with concurrency 1000 to just 1000 requests per
second - but we already saw some workloads where it was not enough -
such as a test workload running very slow reads at high concurrency
on a slow machine, where a latency of over one second was expected
for each read, so adding a one second latecy for writes wasn't having
any noticable affect on slowing down the client.

So this patch replaces the hard-coded default with a live-updateable
configuration parameter, `view_flow_control_delay_limit_in_ms`, which
defaults to 1000ms as before.

Another useful way in which the new `view_flow_control_delay_limit_in_ms`
can be used is to set it to 0. In that case, the view-update flow
control always adds zero delay, and in effect - does absolutely
nothing. This setting can be used in emergency situations where it
is suspected that the MV flow control is not behaving properly, and
the user wants to disable it.

The new parameter's help string mentions both these use cases of
the parameter.

Fixes #18187

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-12-05 09:51:56 +01:00
Kefu Chai
f69ebc1797 configure.py: remove --python command line option
Remove the `--python` option which was originally added in 780d9a26b2 to support
CentOS's non-standard python3 path (`/usr/bin/python3.4`).

Since we now:
- Build using a Fedora-based container with standard python3 path
- Use properly configured shebangs in build scripts
- Set correct executable permissions on Python scripts

This change:
1. Removes the `--python` command line option
2. Updates build rules to execute Python scripts directly instead of via interpreter

This simplifies the build system and reduces differences between CMake and
configure.py-generated rules.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21607
2024-11-21 06:30:56 +02:00
Yaron Kaikov
d9cde7cca5 .github/scripts/auto-backport.py: add user as collaborator to scylladbbot fork
As reported by @Deexie, during the process of opening backport PRs in https://github.com/scylladb/scylladb/pull/21616, No invite emails were sent, causing a lack of permissions for the backport PR branch

The check if `has_in_collaborators(pr.user.login)` was pointing to `scylladb/scylladb` instead of `scylladbbot/scylladb`, fixing it

I also moved the collaborator check to an early stage, before trying to open a backport PR

Closes scylladb/scylladb#21645
2024-11-20 14:34:38 +02:00
Tomasz Grabiec
0d2583600d Merge 'Add tablet repair scheduler support' from Asias He
This adds a new tablet migration kind: repair. It allows tablet repair
scheduler to use this migration kind to schedule repair jobs.

The current repair scheduler implementation does the following:

- A tablet is picked to be repaired when is requested by user

- The tablet repair can be scheduled along with tablet migration and
  rebuild. It runs in the tablet_migration track.

- Repair jobs are scheduled in a smart way so that at any point in time,
  there are no more than configured jobs per shard, which is similar to
  scylla manager's control.

New feature. No backport is needed.

Closes scylladb/scylladb#21088

* github.com:scylladb/scylladb:
  test: Add tests for tablet repair scheduler
  repair: Add restful API for tablet repair
  repair: Add tablet repair scheduler internal API support
  docs: Update system_keyspace.md for tablet repair related info
  docs: Add docs for tablet repair migration
  repair: Add core tablet repair scheduler support
  messaging_service: Introduce TABLET_REPAIR verb
  tablet_allocator: Introduce stream_weight for tablet_migration_streaming_info
  network_topology_strategy: Preserve fields of task_info in reallocate_tablets
2024-11-20 13:28:17 +01:00
Botond Dénes
d94591c260 Merge 'treewide: replace boost::find_if with std::ranges::find_if' from Kefu Chai
now that we are allowed to use C++23. we now have the luxury of using `std::ranges::find_if`.

in this change, we:

- replace `boost::find_if` with `std::ranges::find_if`
- remove all `#include <boost/range/algorithm/find_if.hpp>`

to reduce the dependency to boost for better maintainability, and leverage standard library features for better long-term support.

this change is part of our ongoing effort to modernize our codebase and reduce external dependencies where possible.

---

it's a cleanup, hence no need to backport.

Closes scylladb/scylladb#21495

* github.com:scylladb/scylladb:
  treewide: replace boost::find_if with std::ranges::find_if
  counters: replace boost::find_if with std::ranges::find_if
  combine.hh: use std::iter_const_reference_t when appropriate
2024-11-20 09:58:13 +02:00
Botond Dénes
075ca6cc02 Merge 'cql3: respect PER PARTITION LIMIT for aggregate queries' from Paweł Zakrzewski
Currently, PER PARTITION LIMIT is not implemented for aggregates and queries can result in more rows than expected from the same partition.

Instrument the result_set_builder class so that it can enforce PER PARTITION LIMIT for aggregate queries, specifically:
- add per_partition_limit to the result_set_builder
- expose the number of input rows in the selector

result_set_builder gets two new functions handling partition start and end:
- accept_partition_end for notifying that a partition has been finished. This is also called when a page ends, so we cannot simply flush here, as a naive implementation could do.
- accept_new_partition, where we flush_selectors() if it's indeed a new partition (and not a continuation of the previous) and the query has a grouping: we don't want to flush on new partition in a query like SELECT COUNT(*) FROM foo;

Fixes #5363

Closes scylladb/scylladb#21125

* github.com:scylladb/scylladb:
  test: enable PER PARTIION LIMIT + GROUP BY tests
  cql3: respect PER PARTITION LIMIT for aggregates
  cql3: selection: count input rows in the selector
  cql3: selection: pass per partition limit to the result_set_builder
  cql3: show different messages for LIMIT and PER PARTITION LIMIT in get_limit
2024-11-20 09:54:28 +02:00
Botond Dénes
5ccbd500e0 Merge 'repair: fix task_manager_module::abort_all_repairs' from Aleksandra Martyniuk
Currently, task_manager_module::abort_all_repairs marks top-level repairs as aborted (but does not abort them) and aborts all existing shard tasks.

A running repair checks whether its id isn't contained in _aborted_pending_repairs and then proceeds to create shard tasks. If abort_all_repairs is executed after _aborted_pending_repairs is checked but before shard tasks are created, then those new tasks won't be aborted. The issue is the most severe for tablet_repair_task_impl that checks the _aborted_pending_repairs content from different shards, that do not see the top-level task. Hence the repair isn't stopped but it creates shard repair tasks on all shards but the one that initialized repair.

Abort top-level tasks in abort_all_repairs. Fix the shard on which the task abort is checked.

Fixes: #21612.

Needs backport to 6.1 and 6.2 as they contain the bug.

Closes scylladb/scylladb#21616

* github.com:scylladb/scylladb:
  test: add test to check if repair is properly aborted
  repair: add shard param to task_manager_module::is_aborted
  repair: use task abort source to abort repair
  repair: drop _aborted_pending_repairs and utilize tasks abort mechanism
  repair: fix task_manager_module::abort_all_repairs
2024-11-20 06:43:01 +02:00
Asias He
ddfec068d0 test: Add tests for tablet repair scheduler 2024-11-20 09:42:41 +08:00
Asias He
844129227e repair: Add restful API for tablet repair
It allows user to add and del a tablet repair request. The request is
executed by the tablet repair scheduler.
2024-11-20 09:42:41 +08:00
Asias He
ca1fc28605 repair: Add tablet repair scheduler internal API support
Those internal APIs allow to add / del a tablet repair request and
config the tablet repair scheduler.

It can be used by task manager or plain restful api.
2024-11-20 09:42:41 +08:00
Asias He
9d58a911f1 docs: Update system_keyspace.md for tablet repair related info 2024-11-20 09:42:41 +08:00
Asias He
afd356ea9a docs: Add docs for tablet repair migration 2024-11-20 09:42:41 +08:00
Asias He
b71a563030 repair: Add core tablet repair scheduler support
This adds a new tablet migration kind: repair. It allows tablet repair
scheduler to use this migration kind to schedule repair jobs.

The current repair scheduler implementation does the following:

- A tablet is picked to be repaired when the time since last repair is
  bigger than a threshold (auto repair mode) or it is requested by user
  (manual repair mode)

- The tablet repair can be scheduled along with tablet migration and
  rebuild. It runs in the tablet_migration track.

- Repair jobs are scheduled in a smart way so that at any point in time,
  there are no more than configured jobs per shard, which is similar to
  scylla manager's control.

In this patch, both the manual repair and the auto repair are not
enabled yet.
2024-11-20 09:42:41 +08:00
Nadav Har'El
733a4f94c7 Merge 'test/boost/view_schema_test.cc: Wait for views to build in test_view_update_generating_writetime' from Dawid Mędrek
Before these changes, we didn't wait for the materialized views to
finish building before writing to the base table. That led to generating
an additional view update, which, in turn, led to test failures.

The scenario corresponding to the summary above looked like this:

1. The test creates an empty table and MVs on it.
2. The view builder starts, but it doesn't finish immediately.
3. The test performs mutations to the base table. Since the views
   already exist, view updates are generated.
4. Finally, the view builder finishes. It notices that the base
   table has a row, so it generates a view update for it because
   it doesn't notice that we already have data in the view.

We solve it by explicitly waiting for both views to finish building
and only then start writing to the base table.

Additionally, we also fix a lifetime issue of the row the test revolves
around, further stabilizing CI.

Fixes https://github.com/scylladb/scylladb/issues/20889

Backport: These changes have no semantic effect on the codebase,
but they stabilize CI, so we want to backport them to the maintained
versions of Scylla.

Closes scylladb/scylladb#21632

* github.com:scylladb/scylladb:
  test/boost/view_schema_test.cc: Increase TTL in test_view_update_generating_writetime
  test/boost/view_schema_test.cc: Wait for views to build in test_view_update_generating_writetime
2024-11-19 18:10:52 +02:00
Dawid Mędrek
af4afc84ec test/boost/view_schema_test.cc: Increase TTL in test_view_update_generating_writetime
The auxiliary function `eventually()` (defined in `test/lib/eventually.hh`)
tries to execute a passed function. If it throws, `eventually()` sleeps
for `2^#previous_attempts` milliseconds and tries to perform it again.
The default limit of attempts is 17.

In `test_view_update_generating_writetime`, right before the last test
case, we perform:

```cql
UPDATE t USING TTL 10 AND TIMESTAMP 8 SET g=40 WHERE k=1 AND c=1;
```

The test case itself executes:

```cql
SELECT WRITETIME(g) FROM t;
```

and asserts that the result of the query is equal to 8, i.e. it
corresponds to the timestamp of the last write to the table `t`.

However, if the test case keeps failing, then during its 14th attempt
(so affter sleeping for at least `2^14 - 1` milliseconds, which amounts
to about 16 seconds), we'll observe the following error:

```
[Exception] - std::runtime_error: Expected row not found: [0000000000000008] not in {result_message::rows {row:  null}}
```

The reason behind it is the specified TTL is too short. 10 seconds will
have already passed before the 14th attempt, so the value in the column
`g` will be `NULL` again. In particular, the `WRITETIME(g)` will no
longer be equal to `8`.

To solve that issue, we change the TTL in the CQL statement to 300.
The time spent on 17 loops of `eventually()` amounts to about
`2^18 - 1` milliseconds, which is about 263 seconds. That's why
setting the TTL to 300 seconds should be enough to prevent the error
from occurring.
2024-11-19 13:02:34 +01:00
Dawid Mędrek
5ca0cc4e85 test/boost/view_schema_test.cc: Wait for views to build in test_view_update_generating_writetime
Before these changes, we didn't wait for the materialized views to
finish building before writing to the base table. That led to generating
an additional view update, which, in turn, led to test failures.

The scenario corresponding to the summary above looked like this:

1. The test creates an empty table and MVs on it.
2. The view builder starts, but it doesn't finish immediately.
3. The test performs mutations to the base table. Since the views
   already exist, view updates are generated.
4. Finally, the view builder finishes. It notices that the base
   table has a row, so it generates a view update for it because
   it doesn't notice that we already have data in the view.

We solve it by explicitly waiting for both views to finish building
and only then start writing to the base table.

Fixes scylladb/scylladb#20889
2024-11-19 12:51:22 +01:00
Aleksandra Martyniuk
f5795e8aa4 test: add test to check if repair is properly aborted 2024-11-19 11:59:29 +01:00
Paweł Zakrzewski
b893e63b4a test: enable PER PARTIION LIMIT + GROUP BY tests 2024-11-19 09:28:01 +01:00
Nadav Har'El
7607f5e33e alternator: fix "/localnodes" to not return down nodes
Alternator's "/localnodes" HTTP requests is supposed to return the list
of nodes in the local DC to which the user can send requests.

Before commit bac7c33313 we used the
gossiper is_alive() method to determine if a node should be returned.
That commit changed the check to is_normal() - because a node can be
alive but in non-normal (e.g., joining) state and not ready for
requests.

However, it turns out that checking is_normal() is not enough, because
if node is stopped abruptly, other nodes will still consider it "normal",
but down (this is so-called "DN" state). So we need to check **both**
is_alive() and is_normal().

This patch also adds a test reproducing this case, where a node is
shut down abruptly. Before this patch, the test failed ("/localnodes"
continued to return the dead node), and after it it passes.

Fixes #21538

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#21540
2024-11-19 10:04:59 +02:00
Yaron Kaikov
980f6a48ab .github/scripts/auto-backport.py: validate backport candidate with Fixes prefix
Adding `Fixes` validation to a PR when backport labels were added. When the auto backport process triggers (after promotion), we will ensure each PR with backport/x.y label also has in the PR body a `Fixes` reference to an issue

Fixes: https://github.com/scylladb/scylladb/issues/20021

Closes scylladb/scylladb#21563
2024-11-19 09:48:34 +02:00
Benny Halevy
165902b951 conf/scylla.yaml: update documentation for enable_tablets
Change e3e8a94c9a changed
the semantics of the enable_tablets config option,
but updating that in the option documentation in scylla.yaml
was missed.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#21614
2024-11-19 09:44:53 +02:00
Botond Dénes
36870feb29 Merge 'test: route S3 Proxy server messages through logger' from Kefu Chai
This change was created in the same spirit of f8221b960f.
The S3ProxyServer (introduced in 8919e0abab) currently prints its
status directly to stdout, which can be distracting when reviewing test
results. For example:

```console
$ ./test.py --mode release object_store/test_backup::test_simple_backup_and_restore
Found 1 tests.
Setting minio proxy random seed to 1731924995
Starting S3 proxy server on ('127.193.179.2', 9002)
================================================================================
[N/TOTAL]   SUITE    MODE   RESULT   TEST
------------------------------------------------------------------------------
[1/1]      object_store release [ PASS ] object_store.test_backup.1
Stopping S3 proxy server
------------------------------------------------------------------------------
CPU utilization: 3.1%
```

Move these messages to use proper logging to give developers more control
over their visibility:

- Make logger parameter mandatory in S3ProxyServer constructor
- Route "Stopping S3 proxy" message through the provided logger
- Add --log-level option to the standalone proxy server launcher

The message is now hidden:

```console
$ ./test.py --mode release object_store/test_backup::test_simple_backup_and_restore
Found 1 tests.
================================================================================
[N/TOTAL]   SUITE    MODE   RESULT   TEST
------------------------------------------------------------------------------
[1/1]      object_store release [ PASS ] object_store.test_backup.1
------------------------------------------------------------------------------
CPU utilization: 4.1%
```

---

this change improves the developer experience, hence no need to backport.

Closes scylladb/scylladb#21610

* github.com:scylladb/scylladb:
  test: route S3 Proxy server messages through logger
  test: s3_proxy: remove unused method
2024-11-19 06:42:28 +02:00
Kefu Chai
33a0e5b892 treewide: replace boost::find_if with std::ranges::find_if
now that we are allowed to use C++23. we now have the luxury of using
`std::ranges::find_if`.

in this change, we:

- replace `boost::find_if` with `std::ranges::find_if`
- remove all `#include <boost/range/algorithm/find_if.hpp>`

to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-11-19 10:50:01 +08:00
Kefu Chai
3e75fbd9d3 counters: replace boost::find_if with std::ranges::find_if
std::ranges allows us to create a range from a pair of iterators.
but the iterator has to fulfill the concept of `std::semiregular`.
in order to reduce the header dependency on boost, we need to
make `basic_counter_cell_view::shard_iterator` to support
`std::semiregular`.

in this change:

- define a default constructor for
  `basic_counter_cell_view::shard_iterator`, so that the iterator
  satisfies the constraints of `std::semiregular`, as required by
  C++20's forward_iterator concept. please note, despite that
  the standard requires the iterator to be `std::semiregular`, but
  the iterator created by default constructor is not evaluated in
  production. sometimes, the standard algorithms just need to
  store/create itermediate iterators or to represent a "singular"
  state for iterator. a use case is an empty container.
- change `basic_counter_cell_view::shard_iterator::reference` so
  its dereference returns a rvalue instead of a reference. because
  per C++20 standard, the dereference of a forward_iterator should
  be stable, but we were returning a reference / pointer referencing
  a member variable of the iterator. so once the iterator is destructed,
  the returned reference / pointer would be invalidated. so we have to
  return a value to fulfill the requiremend of forward_iterator. this
  change also fulfills the requirement of `same_as<iter_reference_t<It>,
  iter_reference_t<const It>>`, which a part of the
  `indirectly_readable` requirement.
- let `basic_counter_cell_view::shards()` return a subrange
- let `basic_counter_shard_view::swap_value_and_clock()` accepts
  a plain value instead of a reference. because the dereference of
  the iterator does not return a reference anymore. and the returned
  type is a lightweighted "view", so the performance penality is
  negligible.
- use ranges libraries when appropriate in this header.

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-11-19 10:50:01 +08:00
Kefu Chai
69939ee653 combine.hh: use std::iter_const_reference_t when appropriate
before this change, we assumed that the dereference types of
the given `InputIterator1` and `InputIterator2` are always
references. but this does not hold if the `operator*` returns
a rvalue, as in the C++20 standard, unlike the LegacyForwardIterator
requirement, `std::forward_iterator` does not requires
dereference to return a reference. so we should not assume this,
if we want to use `combine()` with iterators whose dereference
return a, for instance, rvalue.

in this change, we use `std::iter_const_reference_t` instead. this
type is deduced from the behavior of the iterator instead of hardwire
it to a reference type. this allows us to use a C++20 forward_iterator
with this generic function.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-11-19 10:50:01 +08:00
Asias He
5b17be6494 messaging_service: Introduce TABLET_REPAIR verb
It is used by the tablet repair scheduler.
2024-11-19 10:04:41 +08:00
Asias He
82a10eca55 tablet_allocator: Introduce stream_weight for tablet_migration_streaming_info
The stream_weight for repair migration is set to 2, because it requires
more work than just moving the tablet around. The stream_weight for all
other migrations are set to 1.
2024-11-19 10:04:41 +08:00
Asias He
c975882e03 network_topology_strategy: Preserve fields of task_info in reallocate_tablets
So other fields will not be dropped when the new tablet is created.
2024-11-19 10:04:41 +08:00
Avi Kivity
b14871ad3f Merge 'code cleanup: remove "sstring_view" and replace its usages by std::string_view' from Nadav Har'El
For historic reasons, we have (in bytes.hh) a type sstring_view which is an alias for std::string_view - since the same standard type can hold a pointer into both a seastar::sstring and std::string.

This alias in unnecessary and misleading to new developers, who might be misled to believe it is assume it is somehow different from std::string_view - when it isn't.

This series removes all uses of sstring_view (changing them to use std::string_view), and in the last patch removes the alias itself. A few functions whose name referred to "sstring" but take a std::string_view were renamed.

The patches are fairly mechanical and trivial, with no functional changes intended. To ease the review the series was split to a few smaller patches that modify specific areas of the code.

Fixes #4062.

Closes scylladb/scylladb#21617

* github.com:scylladb/scylladb:
  bytes: remove unused alias sstring_view
  change remaining sstring_view to std::string_view
  test: change sstring_view to std::string_view
  cql3: change sstring_view to std::string_view
  alternator: change sstring_view to std::string_view
  type: change from_sstring() to from_string_view()
  cross-tree: change to_sstring_view() to to_string_view()
2024-11-18 22:43:46 +02:00
Tomasz Grabiec
06d478793d Merge 'mutation: switch from boost ranges to std ranges' from Avi Kivity
Wean the mutation code (at least the headers) from boost ranges to std ranges,
in order to reduce the dependency load.

Cleanup, so no backport.

Closes scylladb/scylladb#21601

* github.com:scylladb/scylladb:
  partition_snapshot_row_cursor.hh: switch from boost ranges to std ranges
  mutation: mutation_partition_v2.hh: switch from boost ranges to std ranges
  mutation: mutation_partition.hh: switch from boost ranges to std ranges
  partition_snapshot_reader.hh: drop unused include boost/range/algorithm/heap_algorithm.hpp
2024-11-18 21:23:29 +01:00
Luis Freitas
34d7a4401d ./github/workflows/conflict_reminder.yaml: fix assignee object
References the login property of object assignee

Closes scylladb/scylladb#21615
2024-11-18 19:42:58 +02:00
Paweł Zakrzewski
08eb853a96 cql3: respect PER PARTITION LIMIT for aggregates
This change adds support for PER PARTITION LIMIT for aggregate queries.
result_set_builder gets two new functions handling partition start and
end:
- accept_partition_end for notifying that a partition has been finished.
  This is also called when a page ends, so we cannot simply flush here,
  as a naive implementation could do.
- accept_new_partition, where we flush_selectors() if it's indeed a new
  partition (and not a continuation of the previous) and the query has a
  grouping: we don't want to flush on new partition in a query like
  SELECT COUNT(*) FROM foo;
2024-11-18 17:56:53 +01:00
Paweł Zakrzewski
8190d76dd6 cql3: selection: count input rows in the selector
This will allow result_set_builder::flush_selectors() to only flush when
there are input rows.
2024-11-18 17:56:53 +01:00
Paweł Zakrzewski
aea3c3851e cql3: selection: pass per partition limit to the result_set_builder
Aggregates require the limit to be applied from within the builder
class, so it needs to be passed to it.
2024-11-18 17:56:53 +01:00
Paweł Zakrzewski
cb1483037c cql3: show different messages for LIMIT and PER PARTITION LIMIT in get_limit
select_statement::get_limit is used to evaluate the LIMIT value for both
LIMIT and PER PARTITION LIMIT. This change fixes the error message for
incorrect values passed by the user.
2024-11-18 17:56:53 +01:00
Aleksandra Martyniuk
ca14167b20 repair: add shard param to task_manager_module::is_aborted
Currently, task_manager_module::is_aborted checks whether a task
with given id was aborted on this shard.

In tablet_repair_task_impl::run, is_aborted method is called on all
shards to check if the parent task was aborted. However, even
for aborted parent, is_aborted will return true only on owner shard
of the parent.

Pass shard param to task_manager_module::is_aborted that indicates
which shard to check.
2024-11-18 16:26:08 +01:00
Aleksandra Martyniuk
de8d59172a repair: use task abort source to abort repair
Aborting of a top-level repair does not need task_mananger_module
anymore. Use task's abort source wherever possible.
2024-11-18 16:26:08 +01:00
Aleksandra Martyniuk
a6d9931705 repair: drop _aborted_pending_repairs and utilize tasks abort mechanism 2024-11-18 16:26:08 +01:00
Aleksandra Martyniuk
db8308fe93 repair: fix task_manager_module::abort_all_repairs
Currently, task_manager_module::abort_all_repairs marks top-level
repairs as aborted (but does not abort them) and aborts all shard
tasks. If after that a top-level repair creates a shard task,
the new shard repair won't be aborted.

Abort top-level repair tasks in abort_all_repairs. They will abort
their children and newly created shard tasks will be immediately
aborted.
2024-11-18 16:25:57 +01:00
Nadav Har'El
5e20cb8c66 bytes: remove unused alias sstring_view
Our "sstring_view" was an historic alias for the standard std::string_view.
All its uses were removed in the previous patches, so we can now finally
remove this unused alias.

Refs #4062.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-11-18 16:51:15 +02:00
Nadav Har'El
e639434a89 change remaining sstring_view to std::string_view
Our "sstring_view" is an historic alias for the standard std::string_view.
The patch changes the last remaining random uses of this old alias across
our source directory to the standard type name.

After this patch, there are no more uses of the "sstring_view" alias.
It will be removed in the following patch.

Refs #4062.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-11-18 16:48:57 +02:00
Nadav Har'El
e72aabae7f test: change sstring_view to std::string_view
Our "sstring_view" is an historic alias for the standard std::string_view.
The test/ directory used this old alias in a few of random places, let's
change them to use the standard type name.

Refs #4062.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-11-18 16:26:20 +02:00
Nadav Har'El
b778ce08a9 cql3: change sstring_view to std::string_view
Our "sstring_view" is an historic alias for the standard std::string_view.
The cql3/ directory used this old alias in a few of random places, let's
change them to use the standard type name.

Refs #4062.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-11-18 15:57:20 +02:00
Nadav Har'El
f2b4a59ec7 alternator: change sstring_view to std::string_view
Our "sstring_view" is an historic alias for the standard std::string_view.
Alternator only used this alias in a couple of random names, let's change
them to the standard type name.

Refs #4062.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-11-18 15:44:49 +02:00
Nadav Har'El
766ee56536 type: change from_sstring() to from_string_view()
All CQL type implementations have a from_sstring(sstring_view) method.
The "sstring_view" type is just an historic alias for std::string_view,
so this patch switches to use the standard type as suggested in #4062,
and also renames these functions from_string_view() to emphesize they can
take any string view, and not necessarily a "sstring" as their old name
suggested.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-11-18 15:33:04 +02:00
Nadav Har'El
da99dc3a7f cross-tree: change to_sstring_view() to to_string_view()
For historic reasons, we have (in bytes.hh) a type sstring_view which
is an alias for std::string_view - since the same standard type can hold
a pointer into both a seastar::sstring and std::string.

This alias in unnecessary and misleading to new developers (who might
assume it is somehow different from std::string_view). This patch doesn't
yet remove all occurances of sstring_view (the request in #4062), but
begins to do it by renaming one commonly-used function, to_sstring_view(bytes)
to to_string_view() and of course changes all its uses to the new name.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2024-11-18 14:57:49 +02:00
Kefu Chai
cb24022b54 test: route S3 Proxy server messages through logger
This change was created in the same spirit of f8221b960f.
The S3ProxyServer (introduced in 8919e0abab) currently prints its
status directly to stdout, which can be distracting when reviewing test
results. For example:
```console
$ ./test.py --mode release object_store/test_backup::test_simple_backup_and_restore
Found 1 tests.
Setting minio proxy random seed to 1731924995
Starting S3 proxy server on ('127.193.179.2', 9002)
================================================================================
[N/TOTAL]   SUITE    MODE   RESULT   TEST
------------------------------------------------------------------------------
[1/1]      object_store release [ PASS ] object_store.test_backup.1
Stopping S3 proxy server
------------------------------------------------------------------------------
CPU utilization: 3.1%
```

Move these messages to use proper logging to give developers more control
over their visibility:

- Make logger parameter mandatory in S3ProxyServer constructor
- Route "Stopping S3 proxy" message through the provided logger
- Add --log-level option to the standalone proxy server launcher

The message is now hidden:

```console
$ ./test.py --mode release object_store/test_backup::test_simple_backup_and_restore
Found 1 tests.
================================================================================
[N/TOTAL]   SUITE    MODE   RESULT   TEST
------------------------------------------------------------------------------
[1/1]      object_store release [ PASS ] object_store.test_backup.1
------------------------------------------------------------------------------
CPU utilization: 4.1%
```

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-11-18 18:41:17 +08:00
Kefu Chai
0dff187b7a test: s3_proxy: remove unused method
neither `InjectingHandler.log_error`, nor `InjectingHandler.log_message`
is used. so let's drop them.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-11-18 18:39:15 +08:00