Commit Graph

48949 Commits

Author SHA1 Message Date
Petr Gusev
ed6bec2cac storage_proxy: node_local_only: always use my_host_id
The previous implementation did not handle topology changes well:
* In node_local_only mode with CL=1, if the current node is pending,
  the CL is raised to 2, causing unavailable_exception.
* If the current tablet is in write_both_read_old and we read with
  node_local_only on the new node, the replica list is empty.

This patch changes node_local_only mode to always use my_host_id as
the replica list. An explicit check ensures the current node is a
replica for the operation; otherwise on_internal_error is called.
2025-08-19 16:11:49 +02:00
Pavel Emelyanov
f689d41747 Merge 'db/hints: Improve logs' from Dawid Mędrek
Before these changes, the logs in hinted handoff often didn't provide
crucial information like the identifier of the node that hints were
being sent to. Also, some of the logs were misleading and referred to
other places in the code than the one where an exception or some other
situation really occurred.

We modify those logs, extending them by more valuable information
and fixing existing issues. What's more, all of the logs in
`hint_endpoint_manager` and `hint_sender` follow a consistent format
now:

```
<class_name>[<destination host ID>]:<function_name>: <message>
```

This way, we should always have AT LEAST the basic information.

Fixes scylladb/scylladb#25466

Backport:
There is no risk in backporting these changes. They only have
impact on the logs. On the other hand, they might prove helpful
when debugging an issue in hinted handoff.

Closes scylladb/scylladb#25470

* github.com:scylladb/scylladb:
  db/hints: Add new logs
  db/hints: Adjust log levels
  db/hints: Improve logs
2025-08-15 09:34:29 +03:00
Patryk Jędrzejczak
03cc34e3a0 test: test_maintenance_socket: use cluster_con for driver sessions
The test creates all driver sessions by itself. As a consequence, all
sessions use the default request timeout of 10s. This can be too low for
the debug mode, as observed in scylladb/scylla-enterprise#5601.

In this commit, we change the test to use `cluster_con`, so that the
sessions have the request timeout set to 200s from now on.

Fixes scylladb/scylla-enterprise#5601

This commit changes only the test and is a CI stability improvement,
so it should be backported all the way to 2024.2. 2024.1 doesn't have
this test.

Closes scylladb/scylladb#25510
2025-08-15 09:32:20 +03:00
Pavel Emelyanov
05d8d94257 Merge 'test.py: Add -k=EXPRESSION pytest argument support for boost tests.' from Artsiom Mishuta
follow-up PR after fast fix https://github.com/scylladb/scylladb/pull/25394
should be merged only after  - https://github.com/scylladb/scylla-pkg/pull/5414

Since boost tests run via pure pytest, we can finally run tests using
-k=EXPRESSION pytest argument. This expression will be applied to the "test
function". So it will be possible to run: subset of test functions that match patterns across all boosts tests(functions)

arguments --skip and -k are mutually exclusive
due to -k extends --skip functionality

examples:
```
./build/release/test/boost/auth_passwords_test --list_content
passwords_are_salted*
correct_passwords_authenticate*
incorrect_passwords_do_not_authenticate*

./test.py --mode=dev  -k="correct" -vv test/boost/auth_passwords_test.cc
PASSED test/boost/auth_passwords_test.cc::incorrect_passwords_do_not_authenticate.dev.1
PASSED test/boost/auth_passwords_test.cc::correct_passwords_authenticate.dev.1

./test.py --mode=dev  -k="not incorrect and not passwords_are_salted" -vv test/boost/auth_passwords_test.cc
PASSED test/boost/auth_passwords_test.cc::correct_passwords_authenticate.dev.1

./test.py --mode=dev  --skip=incorrect --skip=passwords_are_salted -vv test/boost/auth_passwords_test.cc
PASSED test/boost/auth_passwords_test.cc::correct_passwords_authenticate.dev.1

./test.py --mode=dev  -k="correct and not incorrect" -vv test/boost/auth_passwords_test.cc
ASSED test/boost/auth_passwords_test.cc::correct_passwords_authenticate.dev.1
```

Closes scylladb/scylladb#25400

* github.com:scylladb/scylladb:
  test.py: add -k=EXPRESSION pytest argument support for boost tests.
  test.py: small refactoring of how boost test arguments make
2025-08-15 09:24:56 +03:00
Jenkins Promoter
d4ce070168 Update pgo profiles - aarch64 2025-08-15 05:03:28 +03:00
Jenkins Promoter
c0f691f4d9 Update pgo profiles - x86_64 2025-08-15 04:56:11 +03:00
Taras Veretilnyk
30ff5942c6 database_test: fix race in test_drop_quarantined_sstables
The test_drop_quarantined_sstables test could fail due to a race between
compaction and quarantining of SSTables. If compaction selects
an SSTable before it is moved to quarantine, and change_state is called during
compaction, the SSTable may already be removed, resulting in a
std::filesystem_error due to missing files.

This patch resolves the issue by wrapping the quarantine operation inside
run_with_compaction_disabled(). This ensures compaction is paused on the
compaction group view while SSTables are being quarantined, preventing the
race.

Additionally, updates the test to quarantine up to 1/5 SSTables instead
of one randomly and increases the number of sstables genereted to improve
test scenario.

Fixes scylladb/scylladb#25487

Closes scylladb/scylladb#25494
2025-08-14 20:23:42 +03:00
Taras Veretilnyk
367eaf46c5 keys: from_nodetool_style_string don't split single partition keys
Users with single-column partition keys that contain colon characters
were unable to use certain REST APIs and 'nodetool' commands, because the
API split key by colon regardless of the partition key schema.

Affected commands:
- 'nodetool getendpoints'
- 'nodetool getsstables'
Affected endpoints:
- '/column_family/sstables/by_key'
- '/storage_service/natural_endpoints'

Refs: #16596 - This does not fully fix the issue, as users with compound
keys will face the issue if any column of the partition key contains
a colon character.

Closes scylladb/scylladb#24829
2025-08-14 19:52:04 +03:00
Avi Kivity
1ef6697949 Merge 'service/vector_store_client: Add live configuration update support' from Karol Nowacki
Enable runtime updates of vector_store_uri configuration without
requiring server restart.
This allows to dynamically enable, disable, or switch the vector search service endpoint on the fly.

To improve the clarity the seastar::experimental::http::client is now wrapped in a private http_client class that also holds the host, address, and port information.

Tests have been added to verify that the client correctly handles transitions between enabled/disabled states and successfully switches traffic to a new endpoint after a configuration update.

Closes: VECTOR-102

No backport is needed as this is a new feature.

Closes scylladb/scylladb#25208

* github.com:scylladb/scylladb:
  service/vector_store_client: Add live configuration update support
  test/boost/vector_store_client_test.cc: Refactor vector store client test
  service/vector_store_client: Refactor host_port struct created
  service/vector_store_client: Refactor HTTP request creation
2025-08-14 19:45:06 +03:00
Avi Kivity
fe6e1071d3 Merge 'locator: util: optimize describe_ring' from Benny Halevy
This change includes basic optimizations to
locator::describe_ring, mainly caching the per-endpoint information in an unordered_map instead of looking them up in every inner-loop.

This yields an improvement of 20% in cpu time.
With 45 nodes organized as 3 dcs, 3 racks per dc, 5 nodes per rack, 256 tokens per node, yielding 11520 ranges and 9 replicas per range, describe_ring took Before: 30 milliseconds (2.6 microseconds per range) After:  24 milliseconds (2.1 microseconds per range)

Add respective unit test for vnode keyspace
and for tablets.

Fixes #24887

* backport up to 2025.1 as describe_ring slowness was hit in the field with large clusters

Closes scylladb/scylladb#24889

* github.com:scylladb/scylladb:
  locator: util: optimize describe_ring
  locator: util: construct_range_to_endpoint_map: pass is_vnode=true to get_natural_replicas
  vnode_effective_replication_map: do_get_replicas: throw internal error if token not found in map
  locator: effective_replication_map: get_natural_replicas: get is_vnode param
  test: cluster: test_repair: add test_vnode_keyspace_describe_ring
2025-08-14 19:39:17 +03:00
Artsiom Mishuta
fcd511a531 test.py: add -k=EXPRESSION pytest argument support for boost tests.
Since boost tests run via pure pytest, we can finally run tests using
-k=EXPRESSION pytest argument. This expression will be applied to the "test
function". So it will be possible to run: subset of test functions that match patterns across all boosts tests(functions)

arguments --skip and -k are mutually exclusive
due to -k extends --skip functionality

examples:
./build/release/test/boost/auth_passwords_test --list_content
passwords_are_salted*
correct_passwords_authenticate*
incorrect_passwords_do_not_authenticate*

./test.py --mode=dev  -k="correct" -vv test/boost/auth_passwords_test.cc
PASSED test/boost/auth_passwords_test.cc::incorrect_passwords_do_not_authenticate.dev.1
PASSED test/boost/auth_passwords_test.cc::correct_passwords_authenticate.dev.1

./test.py --mode=dev  -k="not incorrect and not passwords_are_salted" -vv test/boost/auth_passwords_test.cc
PASSED test/boost/auth_passwords_test.cc::correct_passwords_authenticate.dev.1

./test.py --mode=dev  --skip=incorrect --skip=passwords_are_salted -vv test/boost/auth_passwords_test.cc
PASSED test/boost/auth_passwords_test.cc::correct_passwords_authenticate.dev.1

./test.py --mode=dev  -k="correct and not incorrect" -vv test/boost/auth_passwords_test.cc
ASSED test/boost/auth_passwords_test.cc::correct_passwords_authenticate.dev.1
2025-08-14 14:45:40 +02:00
Artsiom Mishuta
d589f36645 test.py: small refactoring of how boost test arguments make
During migration, boost tests to pytest, a big portion of the logic was
used "as is" with bad code and bugs

This PR refactors the function that makes an argument for the pytest command:

1)refactor how modes are provided
2)refactor how --skip provided
3)remove shlex.split woraround
2025-08-14 14:45:28 +02:00
Abhinav Jha
a0ee5e4b85 raft: replication test: change rpc_propose_conf_change test to SEASTAR_THREAD_TEST_CASE
RAFT_TEST_CASE macro creates 2 test cases, one with random 20% packet
loss named name_drops. The framework makes hard coded assumptions about
leader which doesn't hold well in case of packet losses.

This short term fix disables the packet drop variant of the specified test.
It should be safe to re-enable it once the whole framework is re-worked to
remove these hard coded assumptions.

This PR fixes a bug. Hence we need to backport it.

Fixes: scylladb/scylladb#23816

Closes scylladb/scylladb#25489
2025-08-14 13:15:16 +02:00
Dawid Mędrek
6f1fb7cfb5 db/hints: Add new logs
We're adding new logs in just a few places that may however prove
important when debugging issues in hinted handoff in the future.
2025-08-14 11:45:24 +02:00
Dawid Mędrek
d7bc9edc6c db/hints: Adjust log levels
Some of the logs could be clogging Scylla's logs, so we demote their
level to a lower one.

On the other hand, some of the logs would most likely not do that,
and they could be useful when debugging -- we promote them to debug
level.
2025-08-14 11:45:24 +02:00
Dawid Mędrek
2327d4dfa3 db/hints: Improve logs
Before these changes, the logs in hinted handoff often didn't provide
crucial information like the identifier of the node that hints were
being sent to. Also, some of the logs were misleading and referred to
other places in the code than the one where an exception or some other
situation really occurred.

We modify those logs, extending them by more valuable information
and fixing existing issues. What's more, all of the logs in
`hint_endpoint_manager` and `hint_sender` follow a consistent format
now:

```
<class_name>[<destination host ID>]:<function_name>: <message>
```

This way, we should always have AT LEAST the basic information.
2025-08-14 11:45:04 +02:00
Anna Stuchlik
841ba86609 doc: document support for new z3 instance types
This commit adds new z3 instances we now support to the list of GCP instance types.

Fixes https://github.com/scylladb/scylladb/issues/25438

Closes scylladb/scylladb#25446
2025-08-14 10:59:45 +02:00
Avi Kivity
66173c06a3 Merge 'Eradicate the ability to create new sstables with numerical sstable generation' from Benny Halevy
Remove support for generating numerical sstable generation for new sstables.
Loading such sstables is still supported but new sstables are always created with a uuid generation.
This is possible since:
* All live versions (since 5.4 / f014ccf369) now support uuid sstable generations.
* The `uuid_sstable_identifiers_enabled` config option (that is unused from version 2025.2 / 6da758d74c) controls only the use of uuid generations when creating new sstables. SSTables with uuid generations should still be properly loaded by older versions, even if `uuid_sstable_identifiers_enabled` is set to `false`.

Fixes #24248

* Enhancement, no backport needed

Closes scylladb/scylladb#24512

* github.com:scylladb/scylladb:
  streaming: stream_blob: use the table sstable_generation_generator
  replica: distributed_loader: process_upload_dir: use the table sstable_generation_generator
  sstables: sstable_generation_generator: stop tracking highest generation
  replica: table: get rid of update_sstables_known_generation
  sstables: sstable_directory: stop tracking highest_generation
  replica: distributed_loader: stop tracking highest_generation
  sstables: sstable_generation: get rid of uuid_identifiers bool class
  sstables_manager: drop uuid_sstable_identifiers
  feature_service: move UUID_SSTABLE_IDENTIFIERS to supported_feature_set
  test: cql_query_test: add test_sstable_load_mixed_generation_type
  test: sstable_datafile_test: move copy_directory helper to test/lib/test_utils
  test: database_test: move table_dir helper to test/lib/test_utils
2025-08-14 11:54:33 +03:00
Anna Stuchlik
1e5659ac30 doc: add the information about ScyllaDB C# Driver
This commit adds the driver to the list of ScyllaDB drivers,
including the information about:
- CDC integration (not available)
- Tablets (supported)

Fixes https://github.com/scylladb/scylladb/issues/25495

Closes scylladb/scylladb#25498
2025-08-14 11:29:52 +03:00
Patryk Jędrzejczak
6ad2b71d04 Merge 'LWT: communicate RPC errors to the user' from Petr Gusev
Currently, if the accept or prepare verbs fail on the replica side, the user only receives a generic error message of the form "something went wrong for this table", which provides no insight into the root cause. Additionally, these error messages are not logged by default, requiring the user to restart the node with trace or debug logging to investigate the issue.

This PR improves error handling for the accept and prepare verbs by preserving and propagating the original error messages, making it easier to diagnose failures.

backport: not needed, not a bug

Closes scylladb/scylladb#25318

* https://github.com/scylladb/scylladb:
  test_tablets_lwt: add test_error_message_for_timeout_due_to_uncertainty
  storage_proxy: preserve accept error messages
  storage_proxy: preserve prepare error message
  storage_proxy: fix log message
  exceptions.hh: fix message argument passing
  exceptions: add constructors that accept explicit error messages
2025-08-14 10:23:32 +02:00
Nadav Har'El
2d3c0eb25a test/alternator: speed up test_ttl_expiration_lsi_key
The Alternator test test_ttl.py::test_ttl_expiration_lsi_key is
currently the second-slowest test/alternator test, run a "whopping"
2.6 seconds (the total of two parameterizations - with vnodes and
tables).

This patch reduces it to 0.9 seconds.

The fix is simple: Unfortunately, tests that need to wait for actual
TTL expiration take time, but the test framework configures the TTL
scanner to have a period of half a second, so the wait should be on
average around 0.25 seconds. But the test code by mistake slept 1.2
seconds between retries. We even had a good "sleep" variable for the
amount of time we should sleep between retries, but forgot to use it.

So after lowering the sleep between retries, this test is still not
instantenous - it still needs to wait up to 0.5 seconds for the
expirations to occur - but it's almost 3 times faster than before.

While working on this test, I also used the opportunity to update its
comment which excused why we are testing LSI and not GSI. Its
suggestions of what is planned for GSI have already become a reality,
so let's update the comment to say so.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#25386
2025-08-14 11:21:52 +03:00
Pavel Emelyanov
eaec7c9b2e Merge 'cql3: add default replication strategy to create_keyspace_statement' from Dario Mirovic
When creating a new keyspace, both replication strategy and replication
factor must be stated. For example:
`CREATE KEYSPACE ks WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'replication_factor' : 3 };`

This syntax is verbose, and in all but some testing scenarios
`NetworkTopologyStrategy` is used.

This patch allows skipping replication strategy name, filling it with
`NetworkTopologyStrategy` when that happens. The following syntax is now
valid:
`CREATE KEYSPACE ks WITH REPLICATION = { 'replication_factor' : 3 };`
and will give the same result as the previous, more explicit one.

Fixes https://github.com/scylladb/scylladb/issues/16029

Backport is not needed. This is an enhancement for future releases.

Closes scylladb/scylladb#25236

* github.com:scylladb/scylladb:
  docs/cql: update documentation for default replication strategy
  test/cqlpy: add keyspace creation default strategy test
  cql3: add default replication strategy to `create_keyspace_statement`
2025-08-14 11:18:36 +03:00
Andrzej Jackowski
bf8be01086 test: audit: add logging of get_audit_log_list and set_of_rows_before
Without those logs, analysing some test failures is difficult.

Refs: scylladb/scylladb#25442

Closes scylladb/scylladb#25485
2025-08-14 09:53:05 +03:00
Ernest Zaslavsky
dd51e50f60 s3_client: add memory fallback in chunked_download_source
Introduce fallback logic in `chunked_download_source` to handle
memory exhaustion. When memory is low, feed the `deque` with only
one uncounted buffer at a time. This allows slow but steady progress
without getting stuck on the memory semaphore.

Fixes: https://github.com/scylladb/scylladb/issues/25453
Fixes: https://github.com/scylladb/scylladb/issues/25262

Closes scylladb/scylladb#25452
2025-08-14 09:52:10 +03:00
Wojciech Mitros
2ece08ba43 test: run mv tests depending on metrics on a standalone instance
The test_base_partition_deletion_with_metrics test case (and the batch
variant) uses the metric of view updates done during its runtime to check
if we didn't perform too many of them. The test runs in the cqlpy suite,
which  runs all test cases sequentially on one Scylla instance. Because
of this, if another test case starts a process which generates view
updates and doesn't wait for it to finish before it exists, we may
observe too many view updates in test_base_partition_deletion_with_metrics
and fail the test.
In all test cases we make sure that all tables that were created
during the test are dropped at the end. However, that doesn't
stop the view building process immediately, so the issue can happen
even if we drop the view. I confirmed it by adding a test just before
test_base_partition_deletion_with_metrics which builds a big
materialized view and drops it at the end - the metrics check still failed.

The issue could be caused by any of the existing test cases where we create
a view and don't wait for it to be built. Note that even if we start adding
rows after creating the view, some of them may still be included in the view
building, as the view building process is started asynchronously. In such
a scenario, the view building also doesn't cause any issues with the data in
these tests - writes performed after view creation generate view updates
synchronously when they're local (and we're running a single Scylla server),
the corresponding view udpates generated during view building are redundant.

Because we have many test cases which could be causing this issue, instead
of waiting for the view building to finish in every single one of them, we
move the susceptible test cases to be run on separate Scylla instances, in
the "cluster" suite. There, no other test cases will influence the results.

Fixes https://github.com/scylladb/scylladb/issues/20379

Closes scylladb/scylladb#25209
2025-08-13 15:08:50 +03:00
Petr Gusev
3f287275b8 test_tablets_lwt: add test_error_message_for_timeout_due_to_uncertainty 2025-08-13 14:03:57 +02:00
Petr Gusev
8bd936b72c storage_proxy: preserve accept error messages 2025-08-13 13:43:12 +02:00
Petr Gusev
00c25d396f storage_proxy: preserve prepare error message 2025-08-13 13:43:12 +02:00
Petr Gusev
0724fafe47 storage_proxy: fix log message 2025-08-13 13:40:09 +02:00
Petr Gusev
ffaee20b62 exceptions.hh: fix message argument passing
The message argument is usually taken from a temporary variable
constructed with the format() function. It is more efficient to
pass it by value and move it along the constructor chain.
2025-08-13 13:39:52 +02:00
Benny Halevy
50abeb1270 locator: util: optimize describe_ring
This change includes basic optimizations to
locator::describe_ring, mainly caching the per-endpoint
information in an unordered_map instead of looking
them up in every inner-loop.

This yields an improvement of 20% in cpu time.
With 45 nodes organized as 3 dcs, 3 racks per dc, 5 nodes per rack, 256 tokens per
node, yielding 11520 ranges and 9 replicas per range, describe_ring took
Before: 30 milliseconds (2.6 microseconds per range)
After:  24 milliseconds (2.1 microseconds per range)

Add respective unit test of describe_ring for tablets.
A unit test for vnodes already exists in
test/nodetool/test_describering.py

Fixes #24887

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-13 12:42:25 +03:00
Benny Halevy
60d2cc886a locator: util: construct_range_to_endpoint_map: pass is_vnode=true to get_natural_replicas
First, let get_all_ranges return all vnode ranges
with a corrected wrapping range covering the [last token, first token)
range, such that all ranges start tokens are vndoe tokens
and must be in the vnode replication map.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-13 12:42:23 +03:00
Benny Halevy
195d02d64e vnode_effective_replication_map: do_get_replicas: throw internal error if token not found in map
Prevent a crash, especially in the is_vnode=true case,
if the key_token is not found in the map.
Rather than the undefined behavior when dereferencing the
end() iterator, throw an internal error with additional
logging about the search logic and parameters.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-13 12:41:03 +03:00
Benny Halevy
4d646636f2 locator: effective_replication_map: get_natural_replicas: get is_vnode param
Some callers, like `construct_range_to_endpoint_map` for describe_ring,
or `get_secondary_ranges` for alternator ttl pass vnode tokens (the
vnodes' start token), and therefore can benefit from the fast lookup
path in `vnode_effective_replication_map::do_get_replicas`.
Otherwise the vnode token is binary-searched in sorted_tokens using
token_metadata::first_token().

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-13 12:41:00 +03:00
Benny Halevy
f22a870a04 test: cluster: test_repair: add test_vnode_keyspace_describe_ring
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-13 12:39:40 +03:00
Yaniv Michael Kaul
b75799c21c skip instead of xfail test_change_replication_factor_1_to_0
It's a waste of good machine time to xfail this rather than just skip.
It takes >3m just to run the test and xfail.
We have a marker for it, we know why we skip it.

Fixes: https://github.com/scylladb/scylladb/issues/25310
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>

Closes scylladb/scylladb#25311
2025-08-13 10:32:22 +02:00
Ernest Zaslavsky
380c73ca03 s3_client: make memory semaphore acquisition abortable
Add `abort_source` to the `get_units` call for the memory semaphore
in the S3 client, allowing the acquisition process to be aborted.

Fixes: https://github.com/scylladb/scylladb/issues/25454

Closes scylladb/scylladb#25469
2025-08-13 08:48:55 +03:00
Jenkins Promoter
2de91d43d5 Update pgo profiles - x86_64 2025-08-13 07:52:17 +03:00
Jenkins Promoter
647d9fe45d Update pgo profiles - aarch64 2025-08-13 07:43:38 +03:00
Dario Mirovic
2ac37b4fde docs/cql: update documentation for default replication strategy
Update create-keyspace-statement section of ddl.rst since `class` is no longer mandatory.
Add an example for keyspace creation without specifying `class`.

Refs: #16029
2025-08-13 01:52:00 +02:00
Dario Mirovic
ef63d343ba test/cqlpy: add keyspace creation default strategy test
Add a test case for create keyspace default replication strategy.
It is expected that the default replication strategy is `NetworkTopologyStrategy`.

Refs: #16029
2025-08-13 01:52:00 +02:00
Dario Mirovic
bc8bb0873d cql3: add default replication strategy to create_keyspace_statement
When creating a new keyspace, both replication strategy and replication
factor must be stated. For example:
`CREATE KEYSPACE ks WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'replication_factor' : 3 };`

This syntax is verbose, and in all but some testing scenarios
`NetworkTopologyStrategy` is used.

This patch allows skipping replication strategy name, filling it with
`NetworkTopologyStrategy` when that happens. The following syntax is now
valid:
`CREATE KEYSPACE ks WITH REPLICATION = { 'replication_factor' : 3 };`
and will give the same result as the previous, more explicit one.

Fixes #16029
2025-08-13 01:51:53 +02:00
Botond Dénes
72b2bbac4f pgo/pgo.py: use tablet repair API for repair
Since a1d7722 tablet keyspaces are not allowed to be repaired via the
old /storage_service/repair_async/{keyspace} API, instead the new
/storage_service/tablets/repair API has to be used. Adjust the repair
code and also add await_completion=true: the script just waits
for the repair to finish immediately after starting it.

Closes scylladb/scylladb#25455
2025-08-12 20:32:19 +03:00
Petr Gusev
ff89c03c7f exceptions: add constructors that accept explicit error messages
To improve debuggability, we need to propagate original error messages
from Paxos verbs to the user. This change adds constructors that take
an error message directly, enabling better error reporting.

Additionally, functions such as write_timeout_to_read,
write_failure_to_read etc are updated to use these message-based
constructors. These functions are used in storage_proxy::cas to
convert between different error types, and without this change,
they could lose the original error message during conversion.
2025-08-12 16:31:05 +02:00
Taras Veretilnyk
b7097b2993 database_test: fix abandoned futures in test_drop_quarantined_sstables
The lambda passed to do_with_cql_env_thread() in test_drop_quarantined_sstables
was mistakenly written as a coroutine.
This change replaces co_await with .get() calls on futures
and changes lambda return type to void.

Fixes scylladb/scylladb#25427

Closes scylladb/scylladb#25431
2025-08-12 13:31:06 +03:00
Patryk Jędrzejczak
a1b2f99dee Merge 'test: test_mv_backlog: fix to consider internal writes' from Michael Litvak
The PR fixes a test flakiness issue in test_mv_backlog related to reading metrics.

The first commit fixes a more general issue in the ScyllaMetrics helper class where it doesn't return the value of all matching lines when a specific shard is requested, but it breaks after the first match.

The second commit fixes a test issue where it expects exactly one write to be throttled, not taking into account other internal writes that may be executed during this time.

Fixes https://github.com/scylladb/scylladb/issues/23139

backport to improve CI stability - test only change

Closes scylladb/scylladb#25279

* https://github.com/scylladb/scylladb:
  test: test_mv_backlog: fix to consider internal writes
  test/pylib/rest_client: fix ScyllaMetrics filtering
2025-08-12 10:05:15 +02:00
Wojciech Przytuła
7600ccfb20 Fix link to ScyllaDB manual
The link would point to outdated OS docs. I fixed it to point to up-to-date Enterprise docs.

Closes scylladb/scylladb#25328
2025-08-12 10:33:06 +03:00
Avi Kivity
ac1f6aa0de auth: resource: simplify some range transformations
Supply the member function directly to std::views::transform,
rather than going through a lambda.

Closes scylladb/scylladb#25419
2025-08-12 10:30:06 +03:00
Karol Nowacki
22a133df9b service/vector_store_client: Add live configuration update support
Enable runtime updates of vector_store_uri configuration without
requiring server restart.
This allows to dynamically enable, disable, or switch the vector search node endpoint on the fly.
2025-08-12 08:12:53 +02:00
Karol Nowacki
152274735e test/boost/vector_store_client_test.cc: Refactor vector store client test
Consolidate consecutive setup functions into a dedicated helper.
Extract test table creation into a separate function.
Remove redundant assertions to improve clarity.
2025-08-12 08:12:53 +02:00