45802 Commits

Author SHA1 Message Date
Pavel Emelyanov
67089fd5a1 nodetool: Implement [gs]etcompactionthroughput commands
They exist in the original documentation, but are not yet implemented.
Now it's possible to do it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-12-13 14:39:47 +03:00
Pavel Emelyanov
eb29d6f4b0 test: Add validation of how IO-updating endpoints work
There are now four of those and these are all the same in the way they
interpret the value parameter (though it's named differently)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-12-13 13:02:44 +03:00
Pavel Emelyanov
fa1ad5ecfd api: Implement /storage_service/(stream|compaction)_throughput endpoints
Both values are in fact db::config named values. They are observed by,
respectively, compaction manager and stream manager: when changed, the
observer kicks corresponding sched group's update_io_bandwidth() method.

Despite being referenced by managers, those values can only be changed
by updating the config's named values themselves.
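
A minimal sketch of the observer wiring described above, written in Python for illustration rather than in the project's C++. `NamedValue`, `SchedGroup`, the option name and the value 64 are hypothetical stand-ins; only the idea that an observer on the config value calls `update_io_bandwidth()` comes from the commit message.

```python
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")


class NamedValue(Generic[T]):
    """Hypothetical stand-in for a db::config named value with observers."""

    def __init__(self, name: str, value: T) -> None:
        self.name = name
        self._value = value
        self._observers: List[Callable[[T], None]] = []

    def observe(self, callback: Callable[[T], None]) -> None:
        self._observers.append(callback)

    def get(self) -> T:
        return self._value

    def set(self, value: T) -> None:
        # Updating the value notifies every registered observer.
        self._value = value
        for callback in self._observers:
            callback(value)


class SchedGroup:
    """Hypothetical scheduling group that tracks its I/O bandwidth limit."""

    def __init__(self) -> None:
        self.bandwidth_mb_per_sec = 0

    def update_io_bandwidth(self, mb_per_sec: int) -> None:
        self.bandwidth_mb_per_sec = mb_per_sec


# Assumed option name, used here only as a label.
compaction_throughput = NamedValue("compaction_throughput_mb_per_sec", 0)
compaction_group = SchedGroup()
# The manager observes the config value; an API handler only calls set().
compaction_throughput.observe(compaction_group.update_io_bandwidth)
compaction_throughput.set(64)
```

In this shape an endpoint handler needs a mutable reference to the config object and nothing else, since the observer does the rest.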

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-12-13 11:51:52 +03:00
Pavel Emelyanov
6659ceca4f api: Disqualify const config reference
Some endpoints in the config block will need to actually _update_ values on
config (see the next patches for why), and the const reference stands in the way.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-12-13 11:51:52 +03:00
Pavel Emelyanov
f3775ba957 api: Implement /storage_service/stream_throughput endpoint
The value can be obtained from the stream_manager

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-12-13 11:51:52 +03:00
Pavel Emelyanov
b8bd170212 api: Move stream throughput set/get endpoints from storage service block
In order to get stream throughput, the API will need stream_manager.
In order to set stream throughput, the API will need db::config to
update the corresponding named value on it.

That said, move the endpoints to the relevant blocks.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-12-13 11:51:52 +03:00
Pavel Emelyanov
d2c9c2abe8 api: Move set_compaction_throughput_mb_per_sec to config block
In order to update compaction throughput, the API would need to update the
db::config value, so the endpoint in question should sit in the block
that has db::config at hand.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-12-13 11:51:52 +03:00
Pavel Emelyanov
7d6f8d728b util: Include fmt/ranges.h in config_file.hh
The operator() of named_value() prints the allowed values on error which
can be a vector, so the ranges formatting should be there.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-12-13 11:51:52 +03:00
Avi Kivity
0114e4c2ae Update seastar submodule
* seastar 72c7ac575...3133ecdd6 (12):
  > util/backtrace: Optimize formatter to reduce memory allocation overhead
  > scheduler: Report long queue stall
  > log: drop specialization of boost::lexical_cast for log_level
  > stall-detector: Remove unused _stall_detector_reports_per_minute
  > Merge 'when_all: add Sentinel support to when_all_succeed() ' from Kefu Chai
  > scripts/perftune.py: Implement AWS IMDSv2 call
  > net/tls: Add a way to disable certificate validation
  > tests: Improve websocket parser tests
  > scripts/stall-analyser: improve error messages on invalid input
  > reserve-memory: document that seastar just doesnt use the reserves
  > Merge 'Minor metrics memory optimizations' from Stephan Dollberg
  > json_formatter: Add support for standard range containers

Closes scylladb/scylladb#21869
2024-12-12 18:30:54 +02:00
Gleb Natapov
34a4144a17 messaging_service: do not rely on address map to find an IP rpc client is connected to
Store the endpoint IP address together with the client (note that it may
differ from the address the client is connected to if the preferred
address is different). This allows us to drop the lookup in the
address map, which may eventually fail if an endpoint was already
deleted.

Fixes: scylladb/scylladb#21840

Message-ID: <Z1mpMMe-o0ggBU_F@scylladb.com>
2024-12-12 18:10:58 +02:00
Avi Kivity
ecd78c88bf Merge "move more verbs to idl" from Gleb
"
The series moves node ops, repair and streaming verbs to IDL. Also
contains IDL related cleanups.

In addition to CI, tested manually by bootstrapping a node with the
series into a cluster of old nodes, with repair and streaming in both
gossiper and raft mode. This exercises the repair, streaming and node_ops
paths.

"

* 'gleb/move-more-rpcs-to-idl-v3' of github.com:scylladb/scylla-dev:
  repair: repair_flush_hints_batchlog_request::target_nodes is not used any more, so mark it as such
  streaming: move streaming verbs to IDL
  messaging_service: move repair verbs to IDL
  node_ops: move node_ops_cmd to IDL
  idl: rename partition_checksum.dist.hh to repair.dist.hh
  idl: move node_ops related stuff from the repair related IDL
2024-12-12 17:19:43 +02:00
muthu90tech
e49381119d locator: topology: use node& instead of node*
This change goes through locator::topology to use node&
instead of node* where nullptr is not possible. In places
where the node object is stored in an unordered_set, the
node is wrapped in std::reference_wrapper.

Fixes scylladb/scylladb#20357

Closes scylladb/scylladb#21863
2024-12-12 13:22:55 +01:00
Botond Dénes
05246e123d Merge 'sstables: Avoid computing column_values_fixed_lengths on each read' from Tomasz Grabiec
Reads which need sstable index were computing
column_values_fixed_lengths each time. This showed up in perf profile
for a sstable-read heavy workload, and amounted to about 1-2% of time.

Computing it involves type name parsing.

Avoid by using cached per-sstable mapping. There is already
sstable::_column_translation which can be used for this. It caches the
mapping for the most recently used schema. Since the cursor uses the
mapping only for primary key columns, which are stable, any schema
will do, so we can use the last _column_translation. We only need to
make sure that it's always armed, so sstable loading is augmented with
arming with sstable's schema.

Also, fixes a potential use-after-free on schema in column_translation.

Closes scylladb/scylladb#21347

* github.com:scylladb/scylladb:
  sstables: Fix potential use-after-free on column_translation::column_info::name
  sstables: Avoid computing column_values_fixed_lengths on each read
2024-12-12 12:22:32 +02:00
Kefu Chai
714d12014e sstable/mx: use subrange.advance() when appropriate
Replace manual subrange advancement with the more concise and readable
`subrange.advance()` method. This change:

- Eliminates unnecessary subrange instance creation
- Improves code readability
- Reduces potential for unnecessary object allocation
- Leverages the built-in `advance()` method for cleaner iterator handling

The modification simplifies the iteration logic while maintaining the
same functional behavior.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21865
2024-12-12 10:04:12 +02:00
Gleb Natapov
c095f63ea5 repair: repair_flush_hints_batchlog_request::target_nodes is not used any more, so mark it as such
After b3b3e880d3 target_nodes is not used
by the receiver, so we can skip setting it on sender as well.
2024-12-11 18:26:57 +02:00
Gleb Natapov
92c2558a83 streaming: move streaming verbs to IDL 2024-12-11 18:26:50 +02:00
Anna Stuchlik
98860905d8 doc: remove wrong image upgrade info (5.2-to-2023.1)
This commit removes the information about the recommended way of upgrading
ScyllaDB images - by updating ScyllaDB and OS packages in one step. This upgrade
procedure is not supported (it was implemented, but then reverted).

Refs https://github.com/scylladb/scylladb/issues/15733

Closes scylladb/scylladb#21876
2024-12-11 14:00:30 +02:00
Kefu Chai
03599477af dht: include a smaller header file
Replace `dht/sharder.hh` with a "smaller" header, which provides
just enough dependencies.

In f744007e, we traded `database.hh` for a smaller set of headers,
but it turns out `dht/sharder.hh` can be replaced with an even smaller
one, because `dht::sharder` is defined by `dht/token-sharding.hh`, and
what we need from `dht/sharder.hh` is this class's declaration.
`clang-include-cleaner` identified this issue.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21881
2024-12-11 13:53:01 +02:00
Michael Litvak
373855b493 service/qos/service_level_controller: update cache on startup
Update the service level cache in the node startup sequence, after the
service level and auth service are initialized.

The cache update depends on the service level data accessor being set
and the auth service being initialized. Before the commit, it could happen
that a cache update was not triggered after the initialization. The commit adds
an explicit call to update the cache where it is guaranteed to be ready.

Fixes scylladb/scylladb#21763

Closes scylladb/scylladb#21773
2024-12-11 12:05:28 +01:00
Tomasz Grabiec
440a96605f Merge 'topology_custom/test_tablets: add remove/replace tests for edge cases' from Benny Halevy
Test cases related to #21826:

1. test_remove_failure_with_no_normal_token_owners_in_dc: attempts to
    remove a node with another node down in the datacenter, leaving
    no normal token owners in that dc (reproducing #21826).
    Removenode is expected to fail in this case since it
    should have no place to rebuild the removed node replicas,
    yet it currently succeeds unexpectedly.

2. test_remove_failure_then_replace: verify that removenode
    fails as expected when there are not enough nodes to
    rebuild its replicas on, with and without additional zero-token nodes.

3. test_replace_with_no_normal_token_owners_in_dc: verify that
    nodes can be replaced in a datacenter that has no live
    token owners, with and without additional zero-token nodes.
    Tablet replace uses all replicas to rebuild the lost replicas
    and therefore should succeed in the edge case.
    The restored data is verified as well.

Refs #21826

* New tests, no backport needed

Closes scylladb/scylladb#21827

* github.com:scylladb/scylladb:
  topology_custom/test_tablets: add remove/replace tests for edge cases
  test: pylib: _cluster_remove_node: log message on successful paths
  test: pylib: _cluster_remove_node: mark server as removed only when removenode succeeded
2024-12-11 12:04:14 +01:00
Kefu Chai
9f749487cd main.cc: fix typos in comment
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21868
2024-12-11 08:42:41 +02:00
Benny Halevy
b95312064f topology_custom/test_tablets: add remove/replace tests for edge cases
Test cases related to #21826:

1. test_remove_failure_with_no_normal_token_owners_in_dc: attempts to
remove a node with another node down in the datacenter, leaving
no normal token owners in that dc (reproducing #21826).
Removenode is expected to fail in this case since it
should have no place to rebuild the removed node replicas,
yet it currently succeeds unexpectedly.

2. test_remove_failure_then_replace: verify that removenode
fails as expected when there are not enough nodes to
rebuild its replicas on, with and without additional zero-token nodes.

3. test_replace_with_no_normal_token_owners_in_dc: verify that
nodes can be replaced in a datacenter that has no live
token owners, with and without additional zero-token nodes.
Tablet replace uses all replicas to rebuild the lost replicas
and therefore should succeed in the edge case.
The restored data is verified as well.

Refs #21826

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-12-10 21:39:15 +02:00
Tomasz Grabiec
8e60a0b831 Merge 'truncate: make TRUNCATE TABLE safe with tablets' from Ferenc Szili
Currently truncating a table works by issuing an RPC to all the nodes which call `database::truncate_table_on_all_shards()`, which makes sure that older writes are dropped.

It works with tablets, but is not safe. A concurrent replication process may bring back old data.

This change makes TRUNCATE TABLE a topology operation, so that it is mutually exclusive with other processes in the system which could interfere with it. More specifically, it makes TRUNCATE a global topology request.

Backporting is not needed.

Fixes #16411

Closes scylladb/scylladb#19789

* github.com:scylladb/scylladb:
  docs: docs: topology-over-raft: Document truncate_table request
  storage_proxy: fix indentation and remove empty catch/rethrow
  test: add tests for truncate with tablets
  storage_proxy: use new TRUNCATE for tablets
  truncate: make TRUNCATE a global topology operation
  storage_service: move logic of wait_for_topology_request_completion()
  RPC: add truncate_with_tablets RPC with frozen_topology_guard
  feature_service: added cluster feature for system.topology schema change
  system.topology_requests: change schema
  storage_proxy: propagate group0 client and TSM dependency
2024-12-10 17:50:50 +01:00
Kefu Chai
8d63d31e57 service: fix a typo in comment
s/contraints/constraints/

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21851
2024-12-10 15:58:49 +02:00
Gleb Natapov
fbfee9666e locator: put real host id into the replication map for everywhere replication strategy
The everywhere replication strategy returns a zero host id in the replica set
instead of the real one if no tokens are configured yet in token metadata. It
worked because the code that translates ids to IPs knows that the zero host id
is a special one, so putting zero there was equivalent to allowing local
access. But now we use host ids directly, so we need to return the real host
id here to allow local access before token metadata is populated.

Message-ID: <Z1hBHsEo4wYzzgvJ@scylladb.com>
2024-12-10 15:36:00 +02:00
Patryk Jędrzejczak
74dad7d1eb raft: improve logs for abort while waiting for apply
New logs allow us to easily distinguish two cases in which
waiting for apply times out:
- the node didn't receive the entry it was waiting for,
- the node received the entry but didn't apply it in time.

Distinguishing these cases simplifies reasoning about failures.
The first case indicates that something went wrong on the leader.
The second case indicates that something went wrong on the node
on which waiting for apply timed out.
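
The two cases above can be told apart by comparing the awaited index with what the node has received and applied. The sketch below assumes the node tracks a last-received and a last-applied log index; the function name, parameters and message wording are hypothetical and are not the log lines this commit adds.

```python
def describe_apply_timeout(wanted_index: int, last_received: int, last_applied: int) -> str:
    """Classify why waiting for `wanted_index` to be applied timed out."""
    if last_applied >= wanted_index:
        raise ValueError("entry was applied; nothing timed out")
    if last_received < wanted_index:
        # Case 1: the entry never arrived, so the leader is the suspect.
        return f"entry {wanted_index} not received (last received {last_received})"
    # Case 2: the entry arrived, but applying it locally is lagging.
    return f"entry {wanted_index} received but not applied (last applied {last_applied})"
```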

As it turns out, many different bugs result in the `read_barrier`
(which calls `wait_for_apply`) timeout. This change should help
us in debugging bugs like these.

We want to backport this change to all supported branches so that
it helps us in all tests.

Closes scylladb/scylladb#21855
2024-12-10 14:23:39 +01:00
Tomasz Grabiec
bf18a17bd6 tablets: scheduler: Fix temporary imbalance in a mixed-capacity cluster on decommission
When the tablet scheduler drains nodes, it chooses the target location based
on a "badness" metric. Nodes with the lowest score are preferred. Before the
patch, the score was the number of tablets on that node
post-movement. This way we populate the least-loaded node first. But this
works only if nodes have an equal number of shards. If nodes have different
capacity, then the number of tablets is not a good metric, because we don't
aim to equalize the per-node count, but the per-shard count. We assume that each
shard has equal capacity.

Because of this bug, during decommission, the nodes with fewer shards
would be preferred to receive replicas, which may lead to overloading
of those nodes. This imbalance would be later fixed by the normal load
balancing logic, but it's still problematic.
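
To make the difference concrete, here is a small sketch that assumes the corrected score is the post-movement tablet count divided by the shard count. The commit message does not spell out the exact formula, so `badness_by_shard_load`, the `Node` type and the numbers are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    shards: int
    tablets: int


def badness_by_count(node: Node) -> float:
    # Pre-patch metric as described: tablet count on the node after the move.
    return node.tablets + 1


def badness_by_shard_load(node: Node) -> float:
    # Assumed capacity-aware metric: tablets per shard after the move.
    return (node.tablets + 1) / node.shards


def pick_target(nodes, badness):
    # The node with the lowest score is preferred.
    return min(nodes, key=badness)


nodes = [Node("small", shards=2, tablets=8), Node("large", shards=8, tablets=16)]
old_choice = pick_target(nodes, badness_by_count)       # picks "small"
new_choice = pick_target(nodes, badness_by_shard_load)  # picks "large"
```

With these numbers the count-based score sends the replica to the 2-shard node (9 vs. 17 tablets), although its shards would then hold 4.5 tablets each against roughly 2.1 on the 8-shard node.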

Fixes #21783

Closes scylladb/scylladb#21860
2024-12-10 14:18:03 +02:00
Benny Halevy
eeb6d3dd74 test: pylib: _cluster_remove_node: log message on successful paths
Log a message when removenode succeeded as expected
or when it failed as expected with the `expected_error`.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-12-10 11:55:27 +02:00
Benny Halevy
cd566924b9 test: pylib: _cluster_remove_node: mark server as removed only when removenode succeeded
Currently, we call server_mark_removed also when removenode
failed with the `expected_error`, where the function returns success
but the server is not supposed to be in a removed state.
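
A rough sketch of the intended control flow, not the actual pylib code: `RemoveNodeFailed`, the `remove` callable and the `removed` set are hypothetical stand-ins for the test library's own error type, REST call and server bookkeeping.

```python
import logging

logger = logging.getLogger("cluster_sketch")


class RemoveNodeFailed(Exception):
    """Hypothetical error raised when the removenode request fails."""


def cluster_remove_node(remove, removed, server_id, expected_error=None):
    """Run `remove(server_id)`; record the server in `removed` only on success."""
    try:
        remove(server_id)
    except RemoveNodeFailed as exc:
        if expected_error is not None and expected_error in str(exc):
            logger.info("removenode of %s failed as expected: %s", server_id, exc)
            return  # success for the caller, but the server is not removed
        raise
    logger.info("removenode of %s succeeded", server_id)
    removed.add(server_id)


def always_fails(server_id):
    raise RemoveNodeFailed("not enough nodes to rebuild replicas")


removed_servers = set()
cluster_remove_node(lambda server_id: None, removed_servers, 1)
cluster_remove_node(always_fails, removed_servers, 2, expected_error="not enough nodes")
```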

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-12-10 11:55:27 +02:00
Botond Dénes
5d040e0206 Merge 'truncate: commit log replay positions are not saved correctly' from Ferenc Szili
TRUNCATE TABLE saves the current commit log replay positions in case there is a crash so that replay knows where to begin replaying the mutations. These are collected and saved per shard into `system.truncated`. In case a shard received no mutations, its replay position will be an empty, default-constructed object of type `db::replay_position` with its members set to 0. Truncate will incorrectly interpret these empty replay positions as if they were coming from shard 0, and save them as such, potentially overwriting an actual valid replay position coming from the actual shard 0. In the case of a crash, this will cause the commit log on shard 0 to be replayed from the beginning, and result in data resurrection.
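
The failure mode can be reproduced in a few lines. This is a sketch under assumptions: `ReplayPosition` is a hypothetical stand-in for `db::replay_position`, and skipping empty positions is only one possible fix, not necessarily the one the series implements.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ReplayPosition:
    """Hypothetical stand-in for db::replay_position; defaults are all zero."""
    shard_id: int = 0
    segment: int = 0
    position: int = 0


def save_buggy(positions: List[ReplayPosition]) -> Dict[int, ReplayPosition]:
    # Keys each entry by the shard id stored inside it, so an empty
    # position (shard_id == 0) lands in shard 0's slot.
    saved: Dict[int, ReplayPosition] = {}
    for rp in positions:
        saved[rp.shard_id] = rp
    return saved


def save_fixed(positions: List[ReplayPosition]) -> Dict[int, ReplayPosition]:
    # Skips empty positions so they cannot overwrite a real one.
    saved: Dict[int, ReplayPosition] = {}
    for rp in positions:
        if rp != ReplayPosition():
            saved[rp.shard_id] = rp
    return saved


# Shard 0 has a valid position; shard 1 received no mutations.
collected = [ReplayPosition(shard_id=0, segment=7, position=4096), ReplayPosition()]
```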

Fixes #21719

Closes scylladb/scylladb#21722

* github.com:scylladb/scylladb:
  test: add test for truncate saving replay positions
  database: correctly save replay position for truncate
2024-12-10 10:05:30 +02:00
Botond Dénes
924189c50e Merge 'replica/table: improve error message when encountering orphaned sstables' from Lakshmi Narayanan Sreethar
On startup, if a server reads an sstable that belongs to a tablet that
doesn't have any local replica, it throws an error in the following
format and refuses to start:

```
Storage wasn't found for tablet 1 of table test.test
```

This patch updates the code path to throw a nicer error that includes
the sstable name that caused the problem.

This patch also adds a testcase to verify the error being thrown.

Fixes https://github.com/scylladb/scylladb/issues/18038

PR improves an error message - no need to backport.

Closes scylladb/scylladb#21805

* github.com:scylladb/scylladb:
  replica/table: fix indent in compaction_group_for_sstable
  replica/table: improve error message when encountering orphaned sstables
2024-12-10 06:34:12 +02:00
Kefu Chai
ce2f80c227 treewide: migrate from boost::make_iterator_range to ranges::subrange
Replace boost::make_iterator_range() with std::ranges::subrange.

This change improves code modernization and reduces external dependencies:

- Replace boost::make_iterator_range() with std::ranges::subrange
- Remove boost/range/iterator_range.hpp include
- Improve iterator type detection in interval.hh using std::ranges::const_iterator_t<Range>

This is part of ongoing efforts to modernize our codebase and minimize
external dependencies.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21787
2024-12-09 21:31:53 +02:00
Pavel Emelyanov
6eb6b96456 dirty-memory-manager: Brush up "blocked" state check
One of the run_when_memory_available() checks mirrors the one done by the
execution_permitted() helper, so it's worth re-using it. Since the former
helper is a header template, the latter is worth moving to the header too.
And, once re-used, the `bool blocking` variable becomes redundant, and
the `if (blocking)` check can also be expressed with fewer LOCs.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#21812
2024-12-09 20:44:22 +02:00
Kefu Chai
48c8d24345 treewide: drop support for fmt < v10
Fedora 38 is EOL and Fedora 39 comes with fmt v10.0.0. Also,
we've switched to a build image based on Fedora 40, which ships
fmt-devel v10.2.1, so there is no need to support fmt < 10.

In this change, we drop support for fmt < 10.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21847
2024-12-09 20:42:38 +02:00
Avi Kivity
1bac6b75dc Merge 'Reserve IOCBs for tool applications' from Botond Dénes
Artifact tests have been failing since the switch to the native nodetool, because ScyllaDB doesn't leave any IOCBs for tools. On some setups it will consume all of them and then nodetool and any other native app will refuse to start because it will fail to allocate IOCBs.
This PR fixes this by making use of the freshly introduced `--reserve-io-control-blocks` seastar option, to reserve IOCBs for tool applications. Since the `linux-aio` and `epoll` reactor backends require quite a bit of these, we enable the `io_uring` reactor backend and switch tools to use this backend instead. The `io_uring` reactor backend needs just 2 IOCBs to function, so the reserve of 10 IOCBs set up in this PR is good for running 5 tool applications in parallel, which should be more than enough.

Fixes: https://github.com/scylladb/scylladb/issues/19185

The problem this PR fixes has a manual workaround (and is rare to begin with), no backport needed.

Closes scylladb/scylladb#21527

* github.com:scylladb/scylladb:
  main: configure a reserve IOCB for scylla-nodetool and friends
  configure: enable the io_uring backend
  main: use configure seastar defaults via app_template::seastar_options
2024-12-09 19:22:19 +02:00
Kefu Chai
a9c244ddf7 dist: scylla_io_setup: use raw string to avoid invalid escape sequence
Use raw string literals to prevent syntax warnings when using regular
expressions with backslash-based patterns.

The original code triggered a SyntaxWarning in developer mode (`python3 -Xdev`)
due to unescaped backslash characters in regex patterns like '\s'. While
CPython typically interprets these silently, strict Python parsing modes
raise warnings about potentially unintended escape sequences.

This change adds the `r` prefix to string literals containing regex patterns,
ensuring consistent behavior across different Python runtime configurations
and eliminating unnecessary syntax warnings like:

```
/opt/scylladb/scripts/libexec/scylla_io_setup:41: SyntaxWarning: invalid escape sequence '\s'
  pattern = re.compile(_nocomment + r"CPUSET=\s*\"" + _reopt(_cpuset) + _reopt(_smp) + "\s*\"")
```
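
The effect of the `r` prefix can be checked without touching the script itself. The sketch below compiles two one-line sources and records the warnings. The helper name `escape_warnings` is made up for this example, and the warning category depends on the interpreter version (SyntaxWarning on recent CPython, DeprecationWarning on older releases).

```python
import re
import warnings


def escape_warnings(source: str) -> list:
    """Compile `source` and return the categories of any warnings raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "exec")
    return [w.category for w in caught]


# Source text of two assignments, built so that this file itself stays clean.
plain_source = 'pattern = "\\s*"'   # source text: pattern = "\s*"
raw_source = 'pattern = r"\\s*"'    # source text: pattern = r"\s*"

plain_warnings = escape_warnings(plain_source)  # invalid escape sequence '\s'
raw_warnings = escape_warnings(raw_source)      # no warning expected

# The raw literal and the doubled backslash denote the same regex.
matcher = re.compile(r"CPUSET=\s*\"")
same_pattern = r"\s*" == "\\s*"
```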

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21839
2024-12-09 19:18:39 +03:00
Emil Maskovsky
969b396699 gossiper: fix the backward incompatible change
The cleanup commit a840949ea0 introduced
a regression that caused backward-incompatible changes
in the gossiper application state name strings.

In e486e0f759, the value
`application_state::CDC_STREAMS_TIMESTAMP` was changed to
`application_state::CDC_GENERATION_ID`, but the name string
"CDC_STREAMS_TIMESTAMP" was kept for backward compatibility.

The cleanup commit a840949ea0, however,
changed the name string to "CDC_GENERATION_ID" by omission (not noticing
the difference), which caused a backward-incompatible change.

Another case was found: "IGNOR_MSB_BITS" (which has a typo,
missing the "E" in "IGNORE") was changed to "IGNORE_MSB_BITS". This also
needs to be reverted to keep backward compatibility.
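
One way to keep such names from drifting is to separate the in-code identifier from the string sent on the wire. The Python sketch below is illustrative: the enum members and helper names are assumptions, and only the two legacy strings are taken from this message.

```python
from enum import Enum, auto


class ApplicationState(Enum):
    CDC_GENERATION_ID = auto()
    IGNORE_MSB_BITS = auto()
    STATUS = auto()


# Wire names are part of the protocol: renamed or misspelled entries keep
# the string that older nodes already send and expect.
LEGACY_WIRE_NAMES = {
    ApplicationState.CDC_GENERATION_ID: "CDC_STREAMS_TIMESTAMP",
    ApplicationState.IGNORE_MSB_BITS: "IGNOR_MSB_BITS",
}


def wire_name(state: ApplicationState) -> str:
    return LEGACY_WIRE_NAMES.get(state, state.name)


def from_wire_name(name: str) -> ApplicationState:
    for state in ApplicationState:
        if wire_name(state) == name:
            return state
    raise ValueError(f"unknown application state name: {name}")
```

A round-trip test over every enum member against a frozen list of strings would catch an accidental rename like the one fixed here.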

Fixes: scylladb/scylladb#21811

Closes scylladb/scylladb#21813
2024-12-09 16:46:25 +01:00
Ferenc Szili
49cc771bda docs: docs: topology-over-raft: Document truncate_table request 2024-12-09 16:38:50 +01:00
Ferenc Szili
781f0a2397 storage_proxy: fix indentation and remove empty catch/rethrow
This change fixes code indentation in storage_proxy::remote::send_truncate_blocking().
It also removes an empty catch and rethrow block.
2024-12-09 16:38:50 +01:00
Ferenc Szili
e65a235fd5 test: add tests for truncate with tablets
This patch adds the unit tests for truncate with tablets.

test_truncate_while_migration() triggers a tablet migration, then runs
a TRUNCATE TABLE for the table containing the tablet being migrated.
test_truncate_with_concurrent_drop() starts a truncate, then attempts to
drop the table while it is being truncated.
test_truncate_while_node_restart() validates the case where a replica
node is restarted while truncate is running.
test_truncate_with_coordinator_crash() validates if truncate is
correctly completed in cases where the topology coordinator has crashed
or restarted after the truncate session is cleared, but before the
truncate request is finalized.
2024-12-09 16:38:50 +01:00
Ferenc Szili
4cd7a1acab storage_proxy: use new TRUNCATE for tablets
This change adds branching based on keyspace replication method, and
uses the new TRUNCATE for keyspaces with tablets.
2024-12-09 16:38:50 +01:00
Ferenc Szili
93cfeb9160 truncate: make TRUNCATE a global topology operation
This commit adds the code needed to create a TRUNCATE global topology
request. It also adds the handler for this request to the topology
coordinator.
The execution of the truncate operation is not canceled on a timeout,
but the query coordinator side will return a timeout error.
2024-12-09 16:38:37 +01:00
Gleb Natapov
ed7ea1dc71 feature_service: fix typo in address_nodes_by_host_ids feature name
Message-ID: <Z1WYaYuQuPP8lNAX@scylladb.com>
2024-12-09 17:27:27 +02:00
Tomasz Grabiec
2b16428b4f sstables: Fix potential use-after-free on column_translation::column_info::name
column_translation::state is storing pointers to column names, which
are stable only as long as schema_ptr is alive. sstable object caches
last used column_translation, and reuses column_translation::state if
the schema version matches. But this doesn't guarantee that the schema
object was not destroyed and recreated in between. This can happen if
the schema version expired in registry and then was pulled again from a
different node via get_schema_for_read().

Spotted by reading the code.

Fix by storing schema_ptr in column_translation. This can pin old
schema in memory until a newer schema is used to read the sstable, or
until sstable is compacted away. I think this shouldn't be a problem
in practice.
2024-12-09 14:05:37 +01:00
Tomasz Grabiec
b0a5bf8b4a sstables: Avoid computing column_values_fixed_lengths on each read
Reads which need clustering index cursor were computing
column_values_fixed_lengths each time. This showed up in perf profile
for a sstable-read heavy workload, and amounted to about 1%.

Avoid by using cached per-sstable mapping. There is already
sstable::_column_translation which can be used for this. It caches the
mapping for the most recently used schema. Since the cursor uses the
mapping only for primary key columns, which are stable, any schema
will do, so we can use the last _column_translation. We only need to
make sure that it's always armed, so sstable loading is augmented with
arming with sstable's schema.
2024-12-09 14:05:37 +01:00
Gleb Natapov
bfee93c747 messaging_service: move repair verbs to IDL 2024-12-09 14:50:52 +02:00
Gleb Natapov
5f6007f6ec node_ops: move node_ops_cmd to IDL 2024-12-09 14:50:52 +02:00
Gleb Natapov
39c75d3add idl: rename partition_checksum.dist.hh to repair.dist.hh
The file has many more things than partition_checksum. All of them
are repair related now.
2024-12-09 14:49:59 +02:00
Michael Litvak
53224d90be service/qos: increase timeout of internal get_service_levels queries
The function get_service_levels is used to retrieve all service levels
and it is called from multiple different contexts.
Importantly, it is called internally from the context of group0 state reload,
where it should be executed with a long timeout, similarly to other
internal queries, because a failure of this function affects the entire
group0 client, and a longer timeout can be tolerated.
The function is also called in the context of the user command LIST
SERVICE LEVELS, and perhaps other contexts, where a shorter timeout is
preferred.

The commit introduces a function parameter to indicate whether the
context is internal or not. For internal context, a long timeout is
chosen for the query. Otherwise, the timeout is shorter, the same as
before. When the distinction is not important, a default value is
chosen which maintains the same behavior.
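
The shape of the change can be sketched as a context parameter with a default. Everything below is hypothetical: the commit message does not give the timeout values or the parameter's name and type, and the real function is C++ that runs an actual query.

```python
from datetime import timedelta
from enum import Enum


class QueryContext(Enum):
    USER = "user"
    INTERNAL = "internal"


# Hypothetical values; the commit message does not state the real timeouts.
USER_TIMEOUT = timedelta(seconds=10)
INTERNAL_TIMEOUT = timedelta(minutes=5)


def timeout_for(context: QueryContext) -> timedelta:
    return INTERNAL_TIMEOUT if context is QueryContext.INTERNAL else USER_TIMEOUT


def get_service_levels(context: QueryContext = QueryContext.USER) -> dict:
    # A real implementation would run the query; this sketch only reports
    # which timeout the query would be issued with.
    return {"timeout": timeout_for(context)}
```

Callers that do not pass a context keep the shorter timeout, which matches the "same behavior as before" default described above.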

The main purpose is to fix the case where the timeout is too short and causes
a failure that propagates and fails the group0 client.

Fixes scylladb/scylladb#20483

Closes scylladb/scylladb#21748
2024-12-09 13:20:32 +01:00
Kefu Chai
6a18db0aea node_ops: switch from boost::join() to std::ranges::join_view()
Replace boost::join() with std::ranges::join_view() as an interim solution
before C++26's std::views::concat becomes available. This change:

- Reduces dependencies on the Boost Ranges library
- Moves closer to standard library implementations
- Improves code maintainability and future compatibility

This is part of ongoing efforts to modernize our codebase and minimize
external dependencies.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21786
2024-12-09 13:46:44 +03:00