Commit Graph

45501 Commits

Author SHA1 Message Date
Dawid Mędrek
fb62fc6061 test/boost/view_schema_test.cc: Split test cases in test_view_update_generating_writetime
We split some of the test cases so it's clearer what's going on in the
test. Also, if a bug happens in the future, it should be easier to
reason about it when it corresponds to exactly one CQL statement
instead of possibly two.
2024-11-24 22:47:27 +01:00
Nadav Har'El
7014aec452 Merge 'Alternator measuring RCU and WCU' from Amnon Heiman
Read and Write Consumed Capacity units are an abstract way of measuring Alternator actions. In general, they correspond to the amount of data read or written.

In the long run, the RCU/WCU adds a way of charging an operation and limiting usage.

This series addresses two issues: consume capacity request API and metering.

The Alternator (and DynamoDB) API has an optional parameter allowing users to check the number of units an operation consumes. When a user adds that parameter, the response will contain the number of units used for the operation.

This series adds the consume capacity support to the get_item and put_item, adds a metric to collect the overall RCU and WCU used, and adds a test for the new functionality.

Follow-up PRs will add support for more operations and GSI.

Replaces #19811
Partially implement: #5027

Closes scylladb/scylladb#21543

* github.com:scylladb/scylladb:
  alternator/test_metrics: Add tests for table consumption units
  test_returnconsumedcapacity.py: Add putItem tests
  Alternator: add WCU support
  Add test/alternator/test_returnconsumedcapacity.py
  alternator/executor: Add consume capacity for get_item
  alsternator/stats: Add rcu and wcu metrics to stats
  alternator/executor.hh: white-space cleanup
  Add the consume_capacity helper class
2024-11-24 19:27:03 +02:00
Dawid Mędrek
f913ae571f db/view: Don't generate view updates for unselected columns
The semantics of Scylla's materialized views may vary depending on how their
primary keys correspond to the base table's one. One of the differences is
how we handle writes to columns in the base table that are not selected by
a view:

* Case 1: The view's PK is a permutation of the base table's PK:

  Since the view's primary key cannot be changed in an update, a row in
  the view remains alive as long as the corresponding row in the base table
  is alive.

  The tricky part comes when the base table has columns that are NOT selected
  by the view. CQL3 used to not allow for defining a table that didn't have
  any other columns besides its primary key. Also, when inserting a row into
  a table, it was mandatory to provide at least one value aside from the
  primary key. At some point it changed [1] and the implementation of the
  solution relied on the notion of the row marker.

  Putting the details aside, consider the following scenario:

  (i)   the base table has a primary key consisting of columns
        c_1, ..., c_k, and it has regular columns rc_1, ..., rc_n,
  (ii)  the primary key of an MV defined on that table consists of
        a permutation of c_1, ..., c_k. The MV doesn't select at least
        one of the regular columns of the base table. Without loss of
        generality, let that unselected column be rc_1.
  (iii) the base table has a row R whose only non-null value is the one
        in the regular column rc_1.

  Now, what will R correspond to in the MV? The base table doesn't have a row
  marker, but all of its regular columns in the MV will be NULLs. That's NOT
  allowed.

  To solve that problem, all unselected columns have corresponding virtual
  columns in the MV; the only information they provide is whether there is
  a value in the base table or not. This way, the MV knows if a row is still
  alive or not.

  For that reason, we send view updates to virtual columns in the following
  cases:

  (i)  the value in the column changes from NULL to a value, i.e. it's
       created,
  (ii) the value in the column exists, but its TTL has been updated.

* Case 2: The view's PK has one more column than the base table's one:

  Since the primary key of the view has a regular column C from the base
  table, it is guaranteed that if there's a row in the MV, the corresponding
  row in the base table can remain alive: since C is part of the view's PK,
  it must have a value, so the row in the base table has a value in C too.
  The problem with virtual columns from the previous case doesn't manifest
  in this one. The liveness of the cell in C determines the liveness of
  the whole row in the view.

The semantics gets more complex, but the conclusion is this: in case 1,
virtual columns exist and we may need to generate view updates for them,
while in case 2 virtual columns do NOT exist and so we don't generate
view updates for them.

This patch adjusts the code to match these semantics: if a view has
a regular column from the base table as part of its primary key, we
no longer emit view updates when we change a column unselected by that
view. It is purely an OPTIMIZATION change.
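
The rule above can be sketched as a small predicate. This is only an illustration, not the code in db/view: the function names and the representation of primary keys as lists of column names are assumptions made for the example.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper, not ScyllaDB's actual code: returns true when the
// view's primary key is a permutation of the base table's primary key
// (case 1), i.e. both keys consist of exactly the same columns.
inline bool view_pk_is_permutation_of_base_pk(
        const std::vector<std::string>& base_pk,
        const std::vector<std::string>& view_pk) {
    return base_pk.size() == view_pk.size()
        && std::is_permutation(base_pk.begin(), base_pk.end(), view_pk.begin());
}

// A write to a base column that the view does not select may need a view
// update only in case 1, where virtual columns track row liveness.
// In case 2 the view key contains a regular base column, virtual columns
// do not exist, and the update can be skipped.
inline bool needs_view_update_for_unselected_column(
        const std::vector<std::string>& base_pk,
        const std::vector<std::string>& view_pk) {
    return view_pk_is_permutation_of_base_pk(base_pk, view_pk);
}
```

For a base key (k, c), a view keyed by (c, k) falls into case 1, while a view keyed by (v, k, c) falls into case 2.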

[1]: https://issues.apache.org/jira/browse/CASSANDRA-4361

Fixes scylladb/scylladb#21652

Closes scylladb/scylladb#21653
2024-11-24 19:01:28 +02:00
Avi Kivity
29497f8c5d Merge 'Automatically compute schema version of system tables' from Tomasz Grabiec
Schema of system tables is defined statically and table_schema_version needs to be explicitly set in code like this:

```
builder.with_version(system_keyspace::generate_schema_version(table_id, version_offset));
```

Whenever schema is changed, the schema version needs to change, otherwise we hit undefined behavior when trying to interpret mutation data created with the old schema using the new schema.

This requirement is not obvious, and developers often forget it. There were several mistakes of omission, some caught during review and some not, e.g. 31ea74b96e.

This patch changes definitions to call the new `schema_builder::with_hash_version()`, which will make the schema builder compute version from schema definition so that changes of the schema will automatically change the version. This way we no longer rely on the developer to remember to bump the version offset.

All nodes should arrive at the same version, which is verified by existing `test_group0_schema_versioning` and a new unit test: `test_system_schema_version_is_stable`.
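
The idea of deriving the version from the definition can be sketched as follows. This is a simplified illustration, not the implementation of `schema_builder::with_hash_version()`: the `schema_definition` struct, the choice of FNV-1a, and the 64-bit result are all assumptions of the example (the real version is a `table_schema_version`).

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified schema: a table name plus (column, type) pairs.
struct schema_definition {
    std::string table_name;
    std::vector<std::pair<std::string, std::string>> columns;
};

// Feeds one string into a 64-bit FNV-1a hash. FNV-1a is used here because
// it is deterministic across processes and platforms, which std::hash does
// not guarantee, and every node must arrive at the same version.
inline void feed(std::uint64_t& h, const std::string& s) {
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ULL;  // FNV prime
    }
    h ^= 0xff;                  // separator so field boundaries affect the hash
    h *= 1099511628211ULL;
}

// Computes a version from the definition alone, so any change to the
// columns changes the version without a manually bumped offset.
inline std::uint64_t compute_schema_version(const schema_definition& s) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    feed(h, s.table_name);
    for (const auto& [name, type] : s.columns) {
        feed(h, name);
        feed(h, type);
    }
    return h;
}
```

Two nodes that build the same definition compute the same value, and changing a column type yields a different one.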

Closes scylladb/scylladb#21602

* github.com:scylladb/scylladb:
  system_tables: Compute schema version automatically
  schema_builder: Introduce with_hash_version()
  schema: Store raw_view_info in schema::raw_schema
  schema: Remove dead comment
  hashing: Add hasher for unordered_map
  hashing: Add hasher for unique_ptr
  hashing: Add hasher for double

[avi: add missing include <memory> to hashing.hh]
2024-11-24 18:44:32 +02:00
Amnon Heiman
1f688bc670 cql3/query_processor.cc: Add skip_when_empty to metrics
This patch introduces the skip_when_empty flag to all CQL counters that
previously lacked this setting.

The skip_when_empty flag is a metric optimization that prevents
reporting on counters that have never been used. Once a counter has been
used (i.e., it holds a positive value), it will continue to be reported
consistently from that point onward.
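
The reporting rule described above can be sketched like this. The `counter_metric` struct and the function names are hypothetical and exist only for illustration; the real metrics layer is considerably richer.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical metric model used only for this sketch.
struct counter_metric {
    std::string name;
    std::uint64_t value = 0;
    bool skip_when_empty = false;
};

// A counter is hidden only while it is flagged skip_when_empty and has
// never been incremented. Counters only grow, so once reported they stay
// reported.
inline bool should_report(const counter_metric& m) {
    return !(m.skip_when_empty && m.value == 0);
}

inline std::vector<std::string> reported_names(
        const std::vector<counter_metric>& metrics) {
    std::vector<std::string> out;
    for (const auto& m : metrics) {
        if (should_report(m)) {
            out.push_back(m.name);
        }
    }
    return out;
}
```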

Fixes #21046

Signed-off-by: Amnon Heiman <amnon@scylladb.com>

Closes scylladb/scylladb#21565
2024-11-24 17:30:46 +02:00
Kefu Chai
e2e6f4f441 repair: s/Exceute/Execute/ in logging message
fix a typo in the logging message.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21661
2024-11-22 15:21:56 +02:00
muthu90tech
0ea0234a7a Avoid unnecessary copy in query_processor::execute_direct_without_checking_exception_message
Instead of copying the warnings vector, make the warnings non-const in prepared_statement
and move the warnings vector into execute_maybe_with_guard.

Closes scylladb/scylladb#20361

Closes scylladb/scylladb#21083
2024-11-22 13:34:31 +02:00
Alexander Turetskiy
e83ab28d2d Improve compation on read of expired tombstones
Compact expired tombstones in the cache even if they are blocked by
the commitlog.

fixes #16781

Closes scylladb/scylladb#21613
2024-11-22 10:31:21 +02:00
Kamil Braun
8d52f30b74 Merge 'more gossiper code cleanups' from Gleb
More gossiper cleanups that accumulated since the previous one.

* 'gleb/more-gossip-cleanup-v2' of github.com:scylladb/scylla-dev:
  gossiper: replace milliseconds with seconds where appropriate
  gossiper: simplify failure_detector_loop loop a bit
  gossiper: use fmt library to format time
  gossiper: drop on_success callback from mutate_live_and_unreachable_endpoints
  gossiper: remove code duplication between shadow round and regular path when state is applied
  gossiper: remove remnants of old shadow round
  gossiper: fix indentation after the last patch
  gossiper: co-routinize do_shadow_round
2024-11-21 11:10:23 +01:00
Kefu Chai
f69ebc1797 configure.py: remove --python command line option
Remove the `--python` option which was originally added in 780d9a26b2 to support
CentOS's non-standard python3 path (`/usr/bin/python3.4`).

Since we now:
- Build using a Fedora-based container with standard python3 path
- Use properly configured shebangs in build scripts
- Set correct executable permissions on Python scripts

This change:
1. Removes the `--python` command line option
2. Updates build rules to execute Python scripts directly instead of via interpreter

This simplifies the build system and reduces differences between CMake and
configure.py-generated rules.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21607
2024-11-21 06:30:56 +02:00
Yaron Kaikov
d9cde7cca5 .github/scripts/auto-backport.py: add user as collaborator to scylladbbot fork
As reported by @Deexie, during the process of opening backport PRs in https://github.com/scylladb/scylladb/pull/21616, no invite emails were sent, causing a lack of permissions for the backport PR branch.

The `has_in_collaborators(pr.user.login)` check was pointing to `scylladb/scylladb` instead of `scylladbbot/scylladb`; this patch fixes it.

I also moved the collaborator check to an earlier stage, before trying to open a backport PR.

Closes scylladb/scylladb#21645
2024-11-20 14:34:38 +02:00
Tomasz Grabiec
0d2583600d Merge 'Add tablet repair scheduler support' from Asias He
This adds a new tablet migration kind: repair. It allows tablet repair
scheduler to use this migration kind to schedule repair jobs.

The current repair scheduler implementation does the following:

- A tablet is picked to be repaired when it is requested by the user

- The tablet repair can be scheduled along with tablet migration and
  rebuild. It runs in the tablet_migration track.

- Repair jobs are scheduled in a smart way so that at any point in time,
  there are no more than configured jobs per shard, which is similar to
  scylla manager's control.

New feature. No backport is needed.

Closes scylladb/scylladb#21088

* github.com:scylladb/scylladb:
  test: Add tests for tablet repair scheduler
  repair: Add restful API for tablet repair
  repair: Add tablet repair scheduler internal API support
  docs: Update system_keyspace.md for tablet repair related info
  docs: Add docs for tablet repair migration
  repair: Add core tablet repair scheduler support
  messaging_service: Introduce TABLET_REPAIR verb
  tablet_allocator: Introduce stream_weight for tablet_migration_streaming_info
  network_topology_strategy: Preserve fields of task_info in reallocate_tablets
2024-11-20 13:28:17 +01:00
Amnon Heiman
1e4fb2442a alternator/test_metrics: Add tests for table consumption units
Adding tests to verify the RCU and WCU metrics.

A new helper function, check_increases_metric_exact, checks that a given
metric increased by exactly a given number.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2024-11-20 11:28:53 +02:00
Amnon Heiman
95c45ca269 test_returnconsumedcapacity.py: Add putItem tests
This patch adds testing for putItem consume capacity.

There is an additional test for number support. Numbers are encoded
differently in Alternator and DynamoDB, so the test allows some flexibility
in the result so that it passes on both DynamoDB and Alternator.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2024-11-20 11:27:43 +02:00
Gleb Natapov
812a90bfe3 gossiper: replace milliseconds with seconds where appropriate 2024-11-20 10:52:19 +02:00
Gleb Natapov
18caa6b22f gossiper: simplify failure_detector_loop loop a bit 2024-11-20 10:52:19 +02:00
Gleb Natapov
9489ad0d2f gossiper: use fmt library to format time 2024-11-20 10:52:19 +02:00
Gleb Natapov
39e44db01f gossiper: drop on_success callback from mutate_live_and_unreachable_endpoints
There is only one user of it and it can just execute its code after
calling mutate_live_and_unreachable_endpoints.
2024-11-20 10:52:18 +02:00
Gleb Natapov
0116704226 gossiper: remove code duplication between shadow round and regular path when state is applied
The only difference is in the notifications, so move the notification check
into the functions that handle the state change.
2024-11-20 10:52:10 +02:00
Botond Dénes
d94591c260 Merge 'treewide: replace boost::find_if with std::ranges::find_if' from Kefu Chai
now that we are allowed to use C++23, we have the luxury of using `std::ranges::find_if`.

in this change, we:

- replace `boost::find_if` with `std::ranges::find_if`
- remove all `#include <boost/range/algorithm/find_if.hpp>`

to reduce the dependency to boost for better maintainability, and leverage standard library features for better long-term support.

this change is part of our ongoing effort to modernize our codebase and reduce external dependencies where possible.

---

it's a cleanup, hence no need to backport.

Closes scylladb/scylladb#21495

* github.com:scylladb/scylladb:
  treewide: replace boost::find_if with std::ranges::find_if
  counters: replace boost::find_if with std::ranges::find_if
  combine.hh: use std::iter_const_reference_t when appropriate
2024-11-20 09:58:13 +02:00
Botond Dénes
075ca6cc02 Merge 'cql3: respect PER PARTITION LIMIT for aggregate queries' from Paweł Zakrzewski
Currently, PER PARTITION LIMIT is not implemented for aggregates and queries can result in more rows than expected from the same partition.

Instrument the result_set_builder class so that it can enforce PER PARTITION LIMIT for aggregate queries, specifically:
- add per_partition_limit to the result_set_builder
- expose the number of input rows in the selector

result_set_builder gets two new functions handling partition start and end:
- accept_partition_end for notifying that a partition has been finished. This is also called when a page ends, so we cannot simply flush here, as a naive implementation could do.
- accept_new_partition, where we flush_selectors() if it's indeed a new partition (and not a continuation of the previous) and the query has a grouping: we don't want to flush on new partition in a query like SELECT COUNT(*) FROM foo;

Fixes #5363

Closes scylladb/scylladb#21125

* github.com:scylladb/scylladb:
  test: enable PER PARTIION LIMIT + GROUP BY tests
  cql3: respect PER PARTITION LIMIT for aggregates
  cql3: selection: count input rows in the selector
  cql3: selection: pass per partition limit to the result_set_builder
  cql3: show different messages for LIMIT and PER PARTITION LIMIT in get_limit
2024-11-20 09:54:28 +02:00
Botond Dénes
5ccbd500e0 Merge 'repair: fix task_manager_module::abort_all_repairs' from Aleksandra Martyniuk
Currently, task_manager_module::abort_all_repairs marks top-level repairs as aborted (but does not abort them) and aborts all existing shard tasks.

A running repair checks that its id isn't contained in _aborted_pending_repairs and then proceeds to create shard tasks. If abort_all_repairs is executed after _aborted_pending_repairs is checked but before the shard tasks are created, those new tasks won't be aborted. The issue is most severe for tablet_repair_task_impl, which checks the _aborted_pending_repairs content from different shards that do not see the top-level task. Hence the repair isn't stopped, and it creates shard repair tasks on all shards except the one that initialized the repair.

Abort top-level tasks in abort_all_repairs. Fix the shard on which the task abort is checked.

Fixes: #21612.

Needs backport to 6.1 and 6.2 as they contain the bug.

Closes scylladb/scylladb#21616

* github.com:scylladb/scylladb:
  test: add test to check if repair is properly aborted
  repair: add shard param to task_manager_module::is_aborted
  repair: use task abort source to abort repair
  repair: drop _aborted_pending_repairs and utilize tasks abort mechanism
  repair: fix task_manager_module::abort_all_repairs
2024-11-20 06:43:01 +02:00
Asias He
ddfec068d0 test: Add tests for tablet repair scheduler 2024-11-20 09:42:41 +08:00
Asias He
844129227e repair: Add restful API for tablet repair
It allows the user to add and delete a tablet repair request. The request is
executed by the tablet repair scheduler.
2024-11-20 09:42:41 +08:00
Asias He
ca1fc28605 repair: Add tablet repair scheduler internal API support
These internal APIs allow adding and deleting a tablet repair request and
configuring the tablet repair scheduler.

They can be used by the task manager or the plain RESTful API.
2024-11-20 09:42:41 +08:00
Asias He
9d58a911f1 docs: Update system_keyspace.md for tablet repair related info 2024-11-20 09:42:41 +08:00
Asias He
afd356ea9a docs: Add docs for tablet repair migration 2024-11-20 09:42:41 +08:00
Asias He
b71a563030 repair: Add core tablet repair scheduler support
This adds a new tablet migration kind: repair. It allows tablet repair
scheduler to use this migration kind to schedule repair jobs.

The current repair scheduler implementation does the following:

- A tablet is picked to be repaired when the time since last repair is
  bigger than a threshold (auto repair mode) or it is requested by user
  (manual repair mode)

- The tablet repair can be scheduled along with tablet migration and
  rebuild. It runs in the tablet_migration track.

- Repair jobs are scheduled in a smart way so that at any point in time,
  there are no more than configured jobs per shard, which is similar to
  scylla manager's control.
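
The selection rules in the list above can be sketched as follows. This is an illustration under stated assumptions, not the scheduler's code: the `tablet` struct, the function name, and the way running jobs are counted per shard are all hypothetical.

```cpp
#include <chrono>
#include <cstddef>
#include <map>
#include <vector>

using seconds = std::chrono::seconds;

// Hypothetical tablet descriptor for illustration only.
struct tablet {
    int id;
    unsigned shard;
    seconds since_last_repair;
    bool requested_by_user;
};

// Picks tablets to repair. A tablet qualifies if the user asked for it
// (manual mode) or if its last repair is older than `threshold` (auto
// mode). At most `max_jobs_per_shard` jobs may run on each shard, counting
// the jobs already running there.
inline std::vector<int> pick_tablets_to_repair(
        const std::vector<tablet>& tablets,
        seconds threshold,
        std::size_t max_jobs_per_shard,
        std::map<unsigned, std::size_t> running_per_shard = {}) {
    std::vector<int> picked;
    for (const auto& t : tablets) {
        const bool due = t.requested_by_user || t.since_last_repair > threshold;
        if (!due) {
            continue;
        }
        auto& running = running_per_shard[t.shard];
        if (running >= max_jobs_per_shard) {
            continue;  // shard is saturated, retry in a later round
        }
        ++running;
        picked.push_back(t.id);
    }
    return picked;
}
```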

In this patch, neither the manual repair nor the auto repair is
enabled yet.
2024-11-20 09:42:41 +08:00
Amnon Heiman
56dce5fe8a Alternator: add WCU support
This patch adds functionality to track Write Capacity Units (WCU),
currently for the put_item operation only.

This enhancement allows for standardized measurement of write
operations, aligning with DynamoDB-like metrics.

Additionally, the WCU value is now optionally included in the response to provide
immediate feedback on the write capacity usage.

The implementation adds a consumed_capacity_counter member to
rmw_operation, which will make it possible to add WCU functionality to
update_item and delete_item.
2024-11-19 18:43:28 +02:00
Amnon Heiman
3c46d78e6a Add test/alternator/test_returnconsumedcapacity.py
This patch adds testing for the consumedCapacity header.
It currently tests only get_item.

The test works with both AWS and alternator.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2024-11-19 18:43:28 +02:00
Amnon Heiman
b8f7b2eb52 alternator/executor: Add consume capacity for get_item
This patch adds functionality to track Read Capacity Units (RCU) for the
get_item operation. This enhancement allows for standardized measurement
of read operations, aligning with DynamoDB-like metrics.

Additionally, the RCU value can now be included in the response to
provide immediate feedback on the read capacity usage.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2024-11-19 18:43:28 +02:00
Amnon Heiman
2b10296a82 alsternator/stats: Add rcu and wcu metrics to stats
Introduced `rcu` (Read Capacity Units) and `wcu` (Write Capacity Units)
metrics to the `stats` object for enhanced capacity tracking.

`rcu` and `wcu` provide a simplified way of measuring reads and writes,
respectively, by representing capacity usage in standardized units.

This patch adds these metrics to the existing alternator stats, enabling
monitoring of the total consumed units.
2024-11-19 18:43:28 +02:00
Amnon Heiman
b0e699e7ec alternator/executor.hh: white-space cleanup
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2024-11-19 18:43:28 +02:00
Amnon Heiman
eedf390196 Add the consume_capacity helper class
Alternator API should support returning WCU and RCU when requested.
The consumed capacity helper class serves multiple purposes:
1. Break the logic of calculating the RCU and WCU from the main code.
2. Add a helper class consumed_capacity_counter that can accumulate bytes.
3. Optionally update counters for RCU and WCU that will be used by the
   metric layer.
4. Update the response with the consumed units if needed.

The consumed_capacity_counter is a base class with two implementations:
a read and a write implementation.
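
The conversion from accumulated bytes to units can be sketched as below. The block sizes follow DynamoDB's documented rules (4 KB per read unit, 1 KB per write unit, half price for eventually consistent reads); whether Alternator rounds in exactly this way is an assumption, and the function names are hypothetical rather than the helper class's real interface.

```cpp
#include <cstdint>

// Block sizes follow DynamoDB's documented capacity-unit rules. Whether
// Alternator rounds exactly this way is an assumption of this sketch.
constexpr std::uint64_t rcu_block_bytes = 4 * 1024;
constexpr std::uint64_t wcu_block_bytes = 1024;

// Number of whole blocks needed for `bytes`, charging at least one block.
constexpr std::uint64_t blocks(std::uint64_t bytes, std::uint64_t block_size) {
    const std::uint64_t n = (bytes + block_size - 1) / block_size;
    return n == 0 ? 1 : n;
}

// Hypothetical helper: an eventually consistent read costs half as much
// as a strongly consistent one.
constexpr double read_capacity_units(std::uint64_t bytes, bool strongly_consistent) {
    const double units = static_cast<double>(blocks(bytes, rcu_block_bytes));
    return strongly_consistent ? units : units / 2;
}

// Hypothetical helper: writes are charged per started 1 KB block.
constexpr std::uint64_t write_capacity_units(std::uint64_t bytes) {
    return blocks(bytes, wcu_block_bytes);
}
```

Under these rules a strongly consistent read of 4097 bytes costs 2 RCU, and a write of 1025 bytes costs 2 WCU.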

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2024-11-19 18:42:56 +02:00
Nadav Har'El
733a4f94c7 Merge 'test/boost/view_schema_test.cc: Wait for views to build in test_view_update_generating_writetime' from Dawid Mędrek
Before these changes, we didn't wait for the materialized views to
finish building before writing to the base table. That led to generating
an additional view update, which, in turn, led to test failures.

The scenario corresponding to the summary above looked like this:

1. The test creates an empty table and MVs on it.
2. The view builder starts, but it doesn't finish immediately.
3. The test performs mutations to the base table. Since the views
   already exist, view updates are generated.
4. Finally, the view builder finishes. It notices that the base
   table has a row, so it generates a view update for it because
   it doesn't notice that we already have data in the view.

We solve it by explicitly waiting for both views to finish building
and only then start writing to the base table.

Additionally, we also fix a lifetime issue of the row the test revolves
around, further stabilizing CI.

Fixes https://github.com/scylladb/scylladb/issues/20889

Backport: These changes have no semantic effect on the codebase,
but they stabilize CI, so we want to backport them to the maintained
versions of Scylla.

Closes scylladb/scylladb#21632

* github.com:scylladb/scylladb:
  test/boost/view_schema_test.cc: Increase TTL in test_view_update_generating_writetime
  test/boost/view_schema_test.cc: Wait for views to build in test_view_update_generating_writetime
2024-11-19 18:10:52 +02:00
Gleb Natapov
8204bbf547 gossiper: remove remnants of old shadow round
Starting from 108aae09c5 new way of doing
shadow round is mandatory.
2024-11-19 14:43:51 +02:00
Gleb Natapov
4b3d160f34 gossiper: fix indentation after the last patch 2024-11-19 14:43:50 +02:00
Gleb Natapov
edeb8b0e46 gossiper: co-routinize do_shadow_round 2024-11-19 14:43:50 +02:00
Dawid Mędrek
af4afc84ec test/boost/view_schema_test.cc: Increase TTL in test_view_update_generating_writetime
The auxiliary function `eventually()` (defined in `test/lib/eventually.hh`)
tries to execute a passed function. If it throws, `eventually()` sleeps
for `2^#previous_attempts` milliseconds and tries to perform it again.
The default limit of attempts is 17.

In `test_view_update_generating_writetime`, right before the last test
case, we perform:

```cql
UPDATE t USING TTL 10 AND TIMESTAMP 8 SET g=40 WHERE k=1 AND c=1;
```

The test case itself executes:

```cql
SELECT WRITETIME(g) FROM t;
```

and asserts that the result of the query is equal to 8, i.e. it
corresponds to the timestamp of the last write to the table `t`.

However, if the test case keeps failing, then during its 14th attempt
(so after sleeping for at least `2^14 - 1` milliseconds, which amounts
to about 16 seconds), we'll observe the following error:

```
[Exception] - std::runtime_error: Expected row not found: [0000000000000008] not in {result_message::rows {row:  null}}
```

The reason is that the specified TTL is too short. 10 seconds will
have already passed before the 14th attempt, so the value in the column
`g` will be `NULL` again. In particular, the `WRITETIME(g)` will no
longer be equal to `8`.

To solve that issue, we change the TTL in the CQL statement to 300.
The time spent on 17 loops of `eventually()` amounts to about
`2^18 - 1` milliseconds, which is about 263 seconds. That's why
setting the TTL to 300 seconds should be enough to prevent the error
from occurring.
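
The arithmetic can be checked with a short sketch. It assumes that the k-th failed attempt is followed by a sleep of 2^k milliseconds (k = 1, 2, ...) and ignores the time the attempts themselves take; the exact indexing used by `eventually()` is not confirmed here, so the totals match the figures above only to within a few milliseconds.

```cpp
#include <cstdint>

// Total time slept (in ms) after `failed_attempts` consecutive failures,
// assuming the k-th failure is followed by a sleep of 2^k ms.
constexpr std::uint64_t total_sleep_ms(unsigned failed_attempts) {
    std::uint64_t total = 0;
    for (unsigned k = 1; k <= failed_attempts; ++k) {
        total += std::uint64_t{1} << k;
    }
    return total;
}

// True if a value written with the given TTL is still alive after the
// given number of failed attempts (and their sleeps).
constexpr bool ttl_outlasts_retries(std::uint64_t ttl_seconds, unsigned failed_attempts) {
    return ttl_seconds * 1000 > total_sleep_ms(failed_attempts);
}

// 13 failures precede the 14th attempt: 2^14 - 2 ms, about 16.4 s.
static_assert(total_sleep_ms(13) == 16382);
static_assert(!ttl_outlasts_retries(10, 13));   // TTL 10 has expired by then
// All 17 attempts failing: 2^18 - 2 ms, about 262 s.
static_assert(total_sleep_ms(17) == 262142);
static_assert(ttl_outlasts_retries(300, 17));   // TTL 300 is still alive
```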
2024-11-19 13:02:34 +01:00
Dawid Mędrek
5ca0cc4e85 test/boost/view_schema_test.cc: Wait for views to build in test_view_update_generating_writetime
Before these changes, we didn't wait for the materialized views to
finish building before writing to the base table. That led to generating
an additional view update, which, in turn, led to test failures.

The scenario corresponding to the summary above looked like this:

1. The test creates an empty table and MVs on it.
2. The view builder starts, but it doesn't finish immediately.
3. The test performs mutations to the base table. Since the views
   already exist, view updates are generated.
4. Finally, the view builder finishes. It notices that the base
   table has a row, so it generates a view update for it because
   it doesn't notice that we already have data in the view.

We solve it by explicitly waiting for both views to finish building
and only then start writing to the base table.

Fixes scylladb/scylladb#20889
2024-11-19 12:51:22 +01:00
Aleksandra Martyniuk
f5795e8aa4 test: add test to check if repair is properly aborted 2024-11-19 11:59:29 +01:00
Paweł Zakrzewski
b893e63b4a test: enable PER PARTIION LIMIT + GROUP BY tests 2024-11-19 09:28:01 +01:00
Nadav Har'El
7607f5e33e alternator: fix "/localnodes" to not return down nodes
Alternator's "/localnodes" HTTP request is supposed to return the list
of nodes in the local DC to which the user can send requests.

Before commit bac7c33313 we used the
gossiper is_alive() method to determine if a node should be returned.
That commit changed the check to is_normal() - because a node can be
alive but in non-normal (e.g., joining) state and not ready for
requests.

However, it turns out that checking is_normal() is not enough, because
if a node is stopped abruptly, other nodes will still consider it "normal",
but down (this is the so-called "DN" state). So we need to check **both**
is_alive() and is_normal().

This patch also adds a test reproducing this case, where a node is
shut down abruptly. Before this patch, the test failed ("/localnodes"
continued to return the dead node), and after it it passes.
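
The combined check can be sketched as a simple filter. The `node_state` struct and `local_nodes()` are hypothetical; in the real code the two flags come from the gossiper's is_alive() and is_normal().

```cpp
#include <string>
#include <vector>

// Hypothetical node state; in the real code these flags come from the
// gossiper's is_alive() and is_normal().
struct node_state {
    std::string address;
    bool alive;   // currently reachable
    bool normal;  // in the normal state, e.g. not joining
};

// Returns only the nodes that can serve requests: a node must be both
// normal and alive, which excludes joining nodes and "DN" nodes alike.
inline std::vector<std::string> local_nodes(const std::vector<node_state>& nodes) {
    std::vector<std::string> out;
    for (const auto& n : nodes) {
        if (n.alive && n.normal) {
            out.push_back(n.address);
        }
    }
    return out;
}
```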

Fixes #21538

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#21540
2024-11-19 10:04:59 +02:00
Yaron Kaikov
980f6a48ab .github/scripts/auto-backport.py: validate backport candidate with Fixes prefix
Add `Fixes` validation to PRs that have backport labels. When the auto backport process triggers (after promotion), we ensure that each PR with a backport/x.y label also has a `Fixes` reference to an issue in the PR body.

Fixes: https://github.com/scylladb/scylladb/issues/20021

Closes scylladb/scylladb#21563
2024-11-19 09:48:34 +02:00
Benny Halevy
165902b951 conf/scylla.yaml: update documentation for enable_tablets
Commit e3e8a94c9a changed
the semantics of the enable_tablets config option,
but the option's documentation in scylla.yaml
was not updated.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#21614
2024-11-19 09:44:53 +02:00
Botond Dénes
36870feb29 Merge 'test: route S3 Proxy server messages through logger' from Kefu Chai
This change was created in the same spirit as f8221b960f.
The S3ProxyServer (introduced in 8919e0abab) currently prints its
status directly to stdout, which can be distracting when reviewing test
results. For example:

```console
$ ./test.py --mode release object_store/test_backup::test_simple_backup_and_restore
Found 1 tests.
Setting minio proxy random seed to 1731924995
Starting S3 proxy server on ('127.193.179.2', 9002)
================================================================================
[N/TOTAL]   SUITE    MODE   RESULT   TEST
------------------------------------------------------------------------------
[1/1]      object_store release [ PASS ] object_store.test_backup.1
Stopping S3 proxy server
------------------------------------------------------------------------------
CPU utilization: 3.1%
```

Move these messages to use proper logging to give developers more control
over their visibility:

- Make logger parameter mandatory in S3ProxyServer constructor
- Route "Stopping S3 proxy" message through the provided logger
- Add --log-level option to the standalone proxy server launcher

The message is now hidden:

```console
$ ./test.py --mode release object_store/test_backup::test_simple_backup_and_restore
Found 1 tests.
================================================================================
[N/TOTAL]   SUITE    MODE   RESULT   TEST
------------------------------------------------------------------------------
[1/1]      object_store release [ PASS ] object_store.test_backup.1
------------------------------------------------------------------------------
CPU utilization: 4.1%
```

---

this change improves the developer experience, hence no need to backport.

Closes scylladb/scylladb#21610

* github.com:scylladb/scylladb:
  test: route S3 Proxy server messages through logger
  test: s3_proxy: remove unused method
2024-11-19 06:42:28 +02:00
Kefu Chai
33a0e5b892 treewide: replace boost::find_if with std::ranges::find_if
now that we are allowed to use C++23, we have the luxury of using
`std::ranges::find_if`.

in this change, we:

- replace `boost::find_if` with `std::ranges::find_if`
- remove all `#include <boost/range/algorithm/find_if.hpp>`

to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-11-19 10:50:01 +08:00
Kefu Chai
3e75fbd9d3 counters: replace boost::find_if with std::ranges::find_if
std::ranges allows us to create a range from a pair of iterators.
but the iterator has to fulfill the concept of `std::semiregular`.
in order to reduce the header dependency on boost, we need to
make `basic_counter_cell_view::shard_iterator` satisfy
`std::semiregular`.

in this change:

- define a default constructor for
  `basic_counter_cell_view::shard_iterator`, so that the iterator
  satisfies the constraints of `std::semiregular`, as required by
  C++20's forward_iterator concept. please note that although
  the standard requires the iterator to be `std::semiregular`,
  the iterator created by the default constructor is not evaluated in
  production. sometimes, the standard algorithms just need to
  store/create intermediate iterators or to represent a "singular"
  state for an iterator. a use case is an empty container.
- change `basic_counter_cell_view::shard_iterator::reference` so
  its dereference returns an rvalue instead of a reference, because
  per the C++20 standard, the dereference of a forward_iterator should
  be stable, but we were returning a reference / pointer referencing
  a member variable of the iterator. so once the iterator is destructed,
  the returned reference / pointer would be invalidated. so we have to
  return a value to fulfill the requirement of forward_iterator. this
  change also fulfills the requirement of `same_as<iter_reference_t<It>,
  iter_reference_t<const It>>`, which is a part of the
  `indirectly_readable` requirement.
- let `basic_counter_cell_view::shards()` return a subrange
- let `basic_counter_shard_view::swap_value_and_clock()` accept
  a plain value instead of a reference, because the dereference of
  the iterator does not return a reference anymore. the returned
  type is a lightweight "view", so the performance penalty is
  negligible.
- use ranges libraries when appropriate in this header.

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-11-19 10:50:01 +08:00
Kefu Chai
69939ee653 combine.hh: use std::iter_const_reference_t when appropriate
before this change, we assumed that the dereference types of
the given `InputIterator1` and `InputIterator2` are always
references. but this does not hold if the `operator*` returns
an rvalue, as in the C++20 standard, unlike the LegacyForwardIterator
requirement, `std::forward_iterator` does not require
dereference to return a reference. so we should not assume this,
if we want to use `combine()` with iterators whose dereference
returns, for instance, an rvalue.

in this change, we use `std::iter_const_reference_t` instead. this
type is deduced from the behavior of the iterator instead of being
hardwired to a reference type. this allows us to use a C++20 forward_iterator
with this generic function.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-11-19 10:50:01 +08:00
Asias He
5b17be6494 messaging_service: Introduce TABLET_REPAIR verb
It is used by the tablet repair scheduler.
2024-11-19 10:04:41 +08:00