Right now, service levels are migrated in one group0 command and auth
is migrated in the next one. This has a bad effect on the group0 state
reload logic - modifying service levels in group0 causes the effective
service levels cache to be recalculated, and to do so we need to fetch
information about all roles. If the reload happens after SL upgrade and
before auth upgrade, the query for roles will be directed to the legacy
auth tables in system_auth - and the query, being a potentially remote
query, has a timeout. If the query times out, it will throw an
exception which breaks the group0 apply fiber, and the node will need
to be restarted to bring it back to a working state.
In order to solve this issue, make sure that the service level module
does not start populating and using the service level cache until both
service levels and auth are migrated to raft. This is achieved by adding
the check both to the cache population logic and the effective service
level getter - they now consult the service level accessor's new method,
`can_use_effective_service_level_cache`, which checks the auth
version.
Fixes: scylladb/scylladb#24963
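The gating described above can be sketched in Python. This is a hypothetical model (not Scylla's actual C++ API; the class names, version constants, and `fetch_roles` callback are illustrative): the cache refuses to populate or serve results until both the service level and auth migrations are complete.

```python
# Hypothetical sketch: the effective service level cache is gated on a
# check mirroring `can_use_effective_service_level_cache`, so a reload
# never queries the legacy system_auth tables mid-migration.

SL_VERSION_RAFT = 2    # illustrative version constants
AUTH_VERSION_RAFT = 2

class ServiceLevelAccessor:
    def __init__(self, sl_version, auth_version):
        self.sl_version = sl_version
        self.auth_version = auth_version

    def can_use_effective_service_level_cache(self):
        # Both migrations must be complete; otherwise a cache reload
        # would hit the legacy auth tables and could time out.
        return (self.sl_version >= SL_VERSION_RAFT
                and self.auth_version >= AUTH_VERSION_RAFT)

class EffectiveServiceLevelCache:
    def __init__(self, accessor):
        self.accessor = accessor
        self.cache = None

    def reload(self, fetch_roles):
        if not self.accessor.can_use_effective_service_level_cache():
            return  # skip: auth is still on the legacy tables
        self.cache = fetch_roles()

    def get(self, role):
        if not self.accessor.can_use_effective_service_level_cache():
            return None
        return (self.cache or {}).get(role)
```

With this shape, a reload triggered between the SL upgrade and the auth upgrade is a no-op instead of a potentially failing remote query.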
Currently we do a token metadata barrier before accepting a replacing
node. It was needed for the "replace with the same IP" case, to make sure
old requests would not contact the new node by mistake. But now that we
address nodes by id this is no longer possible: old requests will
use the old id and will be rejected.
Closes scylladb/scylladb#25047
Skip removing any artifacts when -s is provided between test.py invocations.
Logs from the previous run will be overwritten if the tests are executed one
more time. For example:
1. Execute tests A, B, C with parameter -s
2. All logs are present even if the tests passed
3. Execute test B with parameter -s
4. Logs for A and C are from the first run
5. Logs for B are from the most recent run
Backport is not needed, since this is a framework enhancement.
Closes scylladb/scylladb#24838
* github.com:scylladb/scylladb:
test.py: skip cleaning artifacts when -s provided
test.py: move deleting directory to prepare_dir
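The retention behavior in the numbered example above can be sketched as a tiny run-index model (hypothetical, not test.py's actual code): each test's log directory keeps the run number whose logs it holds, and only re-executed tests get overwritten.

```python
# Hypothetical model of -s log retention: map test name -> index of the
# run whose logs currently occupy that test's log directory.

def surviving_logs(previous_runs, rerun_tests):
    """previous_runs: {test: run_index}; rerun_tests: tests run again
    with -s. Returns the resulting {test: run_index} mapping."""
    logs = dict(previous_runs)                    # logs kept from before
    for t in rerun_tests:
        logs[t] = previous_runs.get(t, 0) + 1     # overwritten by new run
    return logs
```

For the example in the commit message: after running A, B, C and then re-running only B, the logs for A and C still come from run 1 while B's come from run 2.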
Currently it grows dynamically and triggers an oversized allocation
warning. Also, it may be hard to find a sufficiently large contiguous
memory chunk after the system has been running for a while. This patch
pre-allocates enough memory for ~1M outstanding writes per shard.
Fixes #24660
Fixes #24217
Closes scylladb/scylladb#25098
When a node shuts down, in storage service, after storage_proxy RPCs are stopped, some write handlers within storage_proxy may still be waiting for background writes to complete. These handlers hold appropriate ERMs to block schema changes before the write finishes. After the RPCs are stopped, these writes cannot receive the replies anymore.
If, at the same time, there are RPC commands executing `barrier_and_drain`, they may get stuck waiting for these ERM holders to finish, potentially blocking node shutdown until the writes time out.
This change introduces cancellation of all outstanding write handlers from storage_service after the storage proxy RPCs were stopped.
Fixes scylladb/scylladb#23665
Backport: since this fixes an issue that frequently causes failures in CI, backport to 2025.1, 2025.2, and 2025.3.
Closes scylladb/scylladb#24714
* https://github.com/scylladb/scylladb:
storage_service: Cancel all write requests on storage_proxy shutdown
test: Add test for unfinished writes during shutdown and topology change
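The shutdown fix above can be illustrated with an asyncio sketch (hypothetical names; Scylla's storage_proxy is C++): background writes whose replies can never arrive after RPC shutdown are cancelled explicitly instead of being left to hit the write timeout.

```python
# Hypothetical asyncio model: cancel outstanding write handlers once
# RPCs are stopped, rather than waiting for replies that cannot come.
import asyncio

class StorageProxySketch:
    def __init__(self):
        self.background_writes = set()

    def start_write(self, coro):
        task = asyncio.ensure_future(coro)
        self.background_writes.add(task)
        task.add_done_callback(self.background_writes.discard)
        return task

    def cancel_write_handlers(self):
        # Called after RPCs are stopped: no reply can complete these,
        # so cancelling releases the ERMs they hold.
        for task in list(self.background_writes):
            task.cancel()

async def demo():
    proxy = StorageProxySketch()
    # A very long sleep stands in for a write whose remaining replica
    # replies will never arrive (i.e. waiting out the write timeout).
    t = proxy.start_write(asyncio.sleep(3600))
    await asyncio.sleep(0)          # let the task start
    proxy.cancel_write_handlers()   # shutdown path
    try:
        await t
    except asyncio.CancelledError:
        return "cancelled"
    return "finished"
```

Without the cancellation step, `demo()` would block for the full simulated timeout; with it, shutdown proceeds immediately.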
If schema pulls are disabled, group0 is used to bring the schema up to
date by calling start_group0_operation(), which executes a raft read
barrier internally. But if group0 is still in use_pre_raft_procedures,
start_group0_operation() silently does nothing. Later, the code that
assumes the schema is already up-to-date will fail and print warnings
to the log. Since getting queries while a node is in raft-enabled mode
but group0 is still not configured is illegal, it is better to make
those errors more visible by asserting on them during testing.
Closes scylladb/scylladb#25112
* seastar 26badcb1...60b2e7da (42):
> Revert "Fix incorrect defaults for io queue iops/bandwidth"
> fair_queue: Ditch queue-wide accumulator reset on overflow
> addr2line, scripts/stall-analyser: change the default tool to llvm-addr2line
> Fix incorrect defaults for io queue iops/bandwidth
> core/reactor: add cxx_exceptions() getter
> gate: make destructor virtual
> scripts/seastar-addr2line: change the default addr2line utility to llvm-addr2line
> coding-style: Align example return types
> reactor: Remove min_vruntime() declaration
> reactor: Move enable_timer() method to private section
> smp: fix missing span include
> core: Don't keep internal errors counter on reactor
> pollable_fd: Untangle shutdown()
> io_queue: Remove deprecated statistics getters
> fair_queue: Remove queued/executing resource counters
> reactor: Move set_current_task() from public reactor API
> util: make SEASTAR_ASSERT() failure generate SIGABRT
> core: fix high CPU use at idle on high core count machines
> Merge 'Move output IO throttler to IO queue level' from Pavel Emelyanov
fair_queue: Move io_throttler to io_queue.hh
fair_queue: Move metrics from to io_queue::stream
fair_queue: Remove io_throttler from tests
fair_queue_test: Remove io-throttler from fair-queue
fair_queue: Remove capacity getters
fair_queue: Move grab_result into io_queue::stream too
fair_queue: Move throtting code to io_queue.cc
fair_queue: Move throttling code to io_queue::stream class
fair_queue: Open-code dispatch_requests() into users
fair_queue: Split dispatch_requests() into top() and pop_front()
fair_queue: Swap class push back and dispatch
fair_queue: Configure forgiving factor externally
fair_queue: Move replenisher kick to dispatch caller
io_queue: Introduce io_queue::stream
fair_queue: Merge two grab_capacity overloads
fair_queue: Detatch outcoming capacity grabbing from main dispatch loop
fair_queue: Move available tokens update into if branch
io_queue: Rename make_fair_group_config into configure_throttler
io_queue: Rename get_fair_group into get_throttler
fair_queue: Rename fair_group -> io_throttler
> http::reply: Add 308 (permanent redirect) and make pretty-print handle unknown values
> Merge 'Relax reactor coupling with file_data_source_impl' from Pavel Emelyanov
reactor: Relax friendship with file_data_source_impl
fstream: Use direct io_stats reference
> thread_pool: Relax coupling with reactor
> reactor: Mark some IO classes management methods private
> http: Deprecate json_exception
> io_tester: Collect and report disk queue length samples
> test/perf: Add context-switch measurer
> http/client: Zero-copy forward content-length body into the underlying stream
> json2code: Genrate move constructor and move-assignment operator
> Merge 'Semi-mixed mode for output_stream' from Pavel Emelyanov
output_stream: Support semi-mixed mode writing
output_stream: Complete write(temporary_buffer) piggy-back-ing write(packet)
iostream: Add friends for iostream tests
packet: Mark bool cast operator const
iostream: Document output_stream::write() methods
> io_tester: Show metrics about requests split
> reactor: add counter for internal errors
> iotune: Print correct throughput units
> core: add label to io_threaded_fallbacks to categorize operations
> slab: correct allocation logic and enforce memory limits
> Merge 'Fix for non-json http function_handlers' from Travis Downs
httpd_test: add test for non-JSON function handler
function_handlers: avoid implicit conversions
http: do not always treat plain text reply as json
> Merge 'tls: add ALPN support' from Łukasz Kurowski
tls: add server-side ALPN support
tls: add client-side ALPN support
> Merge 'coroutine: experimental: generator: implement move and swap' from Benny Halevy
coroutine: experimental: generator: implement move and swap
coroutine: experimental: generator: unconstify buffer capacity
> future: downgrade asserts
> output_stream: Remove unused bits
> Merge 'Upstream a couple of minor reactor optimizations' from Travis Downs
Match type for pure_check_for_work
Do not use std::function for check_for_work()
> Handle ENOENT in getgrnam
Includes scylla-gdb.py update by Pavel Emelyanov.
Closes scylladb/scylladb#25094
During a graceful node shutdown, RPC listeners are stopped in `storage_service::drain_on_shutdown`
as one of the first steps. However, even after RPCs are shut down, some write handlers in
`storage_proxy` may still be waiting for background writes to complete. These handlers retain the ERM.
Since the RPC subsystem is no longer active, replies cannot be received, and if any RPC commands are
concurrently executing `barrier_and_drain`, they may get stuck waiting for those writes. This can block
the messaging server shutdown and delay the entire shutdown process until the write timeout occurs.
This change introduces the cancellation of all outstanding write handlers in `storage_proxy`
during shutdown to prevent unnecessary delays.
Fixes scylladb/scylladb#23665
This test reproduces an issue where a topology change and an ongoing write query
during query coordinator shutdown can cause the node to get stuck.
When a node receives a write request, it creates a write handler that holds
a copy of the current table's ERM (Effective Replication Map). The ERM ensures
that no topology or schema changes occur while the request is being processed.
After the query coordinator receives the required number of replica write ACKs
to satisfy the consistency level (CL), it sends a reply to the client. However,
the write response handler remains alive until all replicas respond — the remaining
writes are handled in the background.
During shutdown, when all network connections are closed, these responses can no longer
be received. As a result, the write response handler is only destroyed once the write
timeout is reached.
This becomes problematic because the ERM held by the handler blocks topology or schema
change commands from executing. Since shutdown waits for these commands to complete,
this can lead to unnecessary delays in node shutdown and restarts, and occasional
test case failures.
Test for: scylladb/scylladb#23665
- Change schedule from twice weekly (Mon/Thu) to once weekly (Mon only)
- Extend notification cooldown period from 3 days to 1 week
- Prevent notification spam while maintaining immediate conflict detection on pushes
Fixes: https://github.com/scylladb/scylladb/issues/25130
Closes scylladb/scylladb#25131
If a test fails very early (we still have to find out why), test.py
crashes while flushing a non-existent log_file, as shown below.
To fix, initialize the property to None and check it during
cleanup.
```
================================================================================
[N/TOTAL] SUITE MODE RESULT TEST
------------------------------------------------------------------------------
'ScyllaServer' object has no attribute 'log_file'
test_cluster_features Traceback (most recent call last):
File "/home/avi/scylla-maint/./test.py", line 816, in <module>
sys.exit(asyncio.run(main()))
~~~~~~~~~~~^^^^^^^^
File "/usr/lib64/python3.13/asyncio/runners.py", line 195, in run
return runner.run(main)
~~~~~~~~~~^^^^^^
File "/usr/lib64/python3.13/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
File "/usr/lib64/python3.13/asyncio/base_events.py", line 725, in run_until_complete
return future.result()
~~~~~~~~~~~~~^^
File "/home/avi/scylla-maint/./test.py", line 523, in main
total_tests_pytest, failed_pytest_tests = await run_all_tests(signaled, options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/avi/scylla-maint/./test.py", line 452, in run_all_tests
failed += await reap(done, pending, signaled)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/avi/scylla-maint/./test.py", line 418, in reap
result = coro.result()
File "/home/avi/scylla-maint/test/pylib/suite/python.py", line 143, in run
return await super().run(test, options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/avi/scylla-maint/test/pylib/suite/base.py", line 216, in run
await test.run(options)
File "/home/avi/scylla-maint/test/pylib/suite/topology.py", line 48, in run
async with get_cluster_manager(self.uname, self.suite.clusters, str(self.suite.log_dir)) as manager:
~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.13/contextlib.py", line 221, in __aexit__
await anext(self.gen)
File "/home/avi/scylla-maint/test/pylib/scylla_cluster.py", line 2006, in get_cluster_manager
await manager.stop()
File "/home/avi/scylla-maint/test/pylib/scylla_cluster.py", line 1539, in stop
await self.clusters.put(self.cluster, is_dirty=True)
File "/home/avi/scylla-maint/test/pylib/pool.py", line 104, in put
await self.destroy(obj)
File "/home/avi/scylla-maint/test/pylib/suite/python.py", line 65, in recycle_cluster
srv.log_file.close()
^^^^^^^^^^^^
AttributeError: 'ScyllaServer' object has no attribute 'log_file'
```
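A minimal sketch of the fix described above (the class is a hypothetical stand-in for `ScyllaServer`): the `log_file` attribute is initialized to `None` eagerly, so cleanup can check it instead of raising `AttributeError` when a test dies before the server ever opens its log.

```python
# Hypothetical stand-in for ScyllaServer showing the None-guard fix.
class ScyllaServerSketch:
    def __init__(self):
        self.log_file = None        # set eagerly; opened later in start()

    def start(self, path):
        self.log_file = open(path, "w")

    def cleanup(self):
        # Before the fix, cleanup called self.log_file.close()
        # unconditionally and crashed if start() never ran.
        if self.log_file is not None:
            self.log_file.close()
            self.log_file = None
```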
Closes scylladb/scylladb#24885
CPU scheduling has been with us since 641aaba12c
(2017), and no one ever disables it. Likely nothing really works without
it.
Make it mandatory and mark the option unused.
Closes scylladb/scylladb#24894
Rather than calling std::views::transform with a lambda that extracts
a member from a class, call std::views::transform with a pointer-to-member
to do the same thing. This results in more concise code.
Closes scylladb/scylladb#25012
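A rough Python analogue of this C++ change (the `Endpoint`/`host_id` names are illustrative, not Scylla's): `operator.attrgetter` plays the role of the pointer-to-member, replacing an explicit member-extracting lambda.

```python
# Analogue of std::views::transform(&T::member) vs a lambda:
# operator.attrgetter extracts the member without spelling out a lambda.
import operator
from dataclasses import dataclass

@dataclass
class Endpoint:          # hypothetical element type
    host_id: str

endpoints = [Endpoint("a"), Endpoint("b")]

# lambda form
ids_lambda = list(map(lambda e: e.host_id, endpoints))
# attrgetter form - the more concise equivalent
ids_getter = list(map(operator.attrgetter("host_id"), endpoints))
```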
test/cqlpy/README.md explains how to run the cqlpy tests against
Cassandra, and mentions that if you don't have "nodetool" in your path
you need to set the NODETOOL variable. However, when giving a simple
example how to use the run-cassandra script, we forgot to remind the
user to set NODETOOL in addition to CASSANDRA, causing confusion for
users who didn't know why tests were failing.
So this patch fixes the section in test/cqlpy/README.md with the
run-cassandra example to also set the NODETOOL environment variable,
not just CASSANDRA.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes scylladb/scylladb#25051
Parent task keeps a vector of statuses (task_essentials) of its finished
children. When the number of children is large - for example because we
have many tables and a child task is created for each table - we may hit
an oversized allocation while adding new child essentials to the vector.
Keep task_essentials of children in a chunked_vector.
Fixes: #25040.
Closes scylladb/scylladb#25064
Audit tests are vulnerable to noise from LOGIN queries (because AUTH
audit logs can appear at any time). Most tests already use the
`filter_out_noise` mechanism to remove this noise, but tests
focused on AUTH verification did not, leading to sporadic failures.
This change adds a filter to ignore AUTH logs generated by the default
"cassandra" user, so tests only verify logs from the user created
specifically for each test.
Additionally, this PR:
- Adds missing `nonlocal new_rows` statement that prevented some checks from being called
- Adds a testcase for audit logs of `cassandra` user
Fixes: https://github.com/scylladb/scylladb/issues/25069
We should backport these test changes to 2025.3; 2025.2 and earlier don't have `./cluster/dtest/audit_test.py`.
Closes scylladb/scylladb#25111
* github.com:scylladb/scylladb:
test: audit: add cassandra user test case
test: audit: ignore cassandra user audit logs in AUTH tests
test: audit: change names of `filter_out_noise` parameters
test: audit: add missing `nonlocal new_rows` statement
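The added filter can be sketched as follows (the row shape is hypothetical; the real tests work on audit log rows with more fields): AUTH rows produced by the default `cassandra` superuser are dropped as noise, so only rows from the per-test user are verified.

```python
# Hypothetical sketch of the AUTH-noise filter described above.
def filter_out_noise(rows):
    """Drop AUTH audit rows produced by the default 'cassandra' user,
    which may appear at any time due to background LOGIN queries."""
    return [r for r in rows
            if not (r["category"] == "AUTH" and r["user"] == "cassandra")]
```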
Enhance and fix error handling in the `chunked_download_source` to prevent errors seeping from the request callback. Also stop retrying on Seastar's side, since that would break the integrity of data which may be downloaded more than once for the same range.
Fixes: https://github.com/scylladb/scylladb/issues/25043
Should be backported to 2025.3, since we intend to release the native backup/restore feature.
Closes scylladb/scylladb#24883
* github.com:scylladb/scylladb:
s3_client: Disable Seastar-level retries in HTTP client creation
s3_test: Validate handling of non-`aws_error` exceptions
s3_client: Improve error handling in chunked_download_source
aws_error: Add factory method for `aws_error` from exception
Fixes: #25045
Added the ability to supply the list of files to
restore from a given file.
Mainly required for local testing.
Signed-off-by: Ran Regev <ran.regev@scylladb.com>
Closes scylladb/scylladb#25077
Prevent Seastar from retrying HTTP requests to avoid buffer double-feed
issues when an entire request is retried. This could cause data
corruption in `chunked_download_source`. The change is global for every
instance of `s3_client`, but it is still safe because:
* Seastar's `http_client` resets connections regardless of retry behavior
* `s3_client` retry logic handles all error types—exceptions, HTTP errors,
and AWS-specific errors—via `http_retryable_client`
Create aws_error from raised exceptions when possible and respond
appropriately. Previously, non-aws_exception types leaked from the
request handler and were treated as non-retryable, causing potential
data corruption during download.
Audit tests use the `filter_out_noise` function to remove noise from
audit logs generated by user authentication. As a result, none of the
existing tests covered audit logs for the default `cassandra` user.
This change adds a test case for that user.
Refs: scylladb/scylladb#25069
Audit tests are vulnerable to noise from LOGIN queries (because AUTH
audit logs can appear at any time). Most tests already use the
`filter_out_noise` mechanism to remove this noise, but tests
focused on AUTH verification did not, leading to sporadic failures.
This change adds a filter to ignore AUTH logs generated by the default
"cassandra" user, so tests only verify logs from the user created
specifically for each test.
Fixes: scylladb/scylladb#25069
This is a refactoring commit that changes the names of the parameters
of the `filter_out_noise` function, as well as names of related
variables. The motivation for the change is the introduction of more
complex filtering logic in the next commit of this patch series.
Refs: scylladb/scylladb#25069
The variable `new_rows` was not updated by the inner function
`is_number_of_new_rows_correct` because the `nonlocal new_rows`
statement was missing. As a result, `sorted_new_rows` was empty and
certain checks were skipped.
This change:
- Introduces the missing `nonlocal new_rows` declaration
- Adds an assertion verifying that the number of new rows matches
the expected count
- Fixes the incorrect variable name in the lambda used for row sorting
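The `nonlocal` bug can be reproduced in miniature: without the declaration, assignment inside the inner function binds a new local name, and the enclosing variable silently stays empty.

```python
# The bug in miniature (illustrative function names).
def fetch_rows_buggy():
    new_rows = []
    def check():
        new_rows = ["row"]       # creates a *local*; outer list untouched
        return len(new_rows) == 1
    check()
    return new_rows              # still [] - later checks get skipped

def fetch_rows_fixed():
    new_rows = []
    def check():
        nonlocal new_rows        # rebind the enclosing variable
        new_rows = ["row"]
        return len(new_rows) == 1
    check()
    return new_rows
```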
Currently, when refreshing a submodule, the script puts a plain list of
non-merge commits into the commit message. The resulting summary contains
everything, but is hard to understand. E.g. if updating seastar today,
the summary would start with
* seastar 26badcb1...86c4893b (55):
> util: make SEASTAR_ASSERT() failure generate SIGABRT
> core: fix high CPU use at idle on high core count machines
> http::reply: Add 308 (permanent redirect) and make pretty-print handle unknown values
> reactor: Relax friendship with file_data_source_impl
> fstream: Use direct io_stats reference
> thread_pool: Relax coupling with reactor
> reactor: Mark some IO classes management methods private
> http: Deprecate json_exception
> fair_queue: Move io_throttler to io_queue.hh
> fair_queue: Move metrics from to io_queue::stream
> fair_queue: Remove io_throttler from tests
> fair_queue_test: Remove io-throttler from fair-queue
> fair_queue: Remove capacity getters
> fair_queue: Move grab_result into io_queue::stream too
> fair_queue: Move throtting code to io_queue.cc
> fair_queue: Move throttling code to io_queue::stream class
> fair_queue: Open-code dispatch_requests() into users
> fair_queue: Split dispatch_requests() into top() and pop_front()
> fair_queue: Swap class push back and dispatch
> fair_queue: Configure forgiving factor externally
...
That's not very informative, because the update includes several large
"merges" whose own summaries are missing here. This update changes the
way the summary is generated: merges and their summaries are included,
and all merged commits are listed as sub-lines, like this
* seastar 26badcb1...86c4893b (26):
> util: make SEASTAR_ASSERT() failure generate SIGABRT
> core: fix high CPU use at idle on high core count machines
> Merge 'Move output IO throttler to IO queue level' from Pavel Emelyanov
fair_queue: Move io_throttler to io_queue.hh
fair_queue: Move metrics from to io_queue::stream
fair_queue: Remove io_throttler from tests
fair_queue_test: Remove io-throttler from fair-queue
fair_queue: Remove capacity getters
fair_queue: Move grab_result into io_queue::stream too
fair_queue: Move throtting code to io_queue.cc
fair_queue: Move throttling code to io_queue::stream class
fair_queue: Open-code dispatch_requests() into users
fair_queue: Split dispatch_requests() into top() and pop_front()
fair_queue: Swap class push back and dispatch
fair_queue: Configure forgiving factor externally
fair_queue: Move replenisher kick to dispatch caller
io_queue: Introduce io_queue::stream
fair_queue: Merge two grab_capacity overloads
fair_queue: Detatch outcoming capacity grabbing from main dispatch loop
fair_queue: Move available tokens update into if branch
io_queue: Rename make_fair_group_config into configure_throttler
io_queue: Rename get_fair_group into get_throttler
fair_queue: Rename fair_group -> io_throttler
> http::reply: Add 308 (permanent redirect) and make pretty-print handle unknown values
> Merge 'Relax reactor coupling with file_data_source_impl' from Pavel Emelyanov
reactor: Relax friendship with file_data_source_impl
fstream: Use direct io_stats reference
> thread_pool: Relax coupling with reactor
> reactor: Mark some IO classes management methods private
...
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes scylladb/scylladb#24834
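The formatting step can be sketched with a simplified model (hypothetical; the real script walks git history to decide which commits belong to which merge): each top-level commit gets a `> ` prefix, and commits folded into a merge appear as indented sub-lines under the merge's subject.

```python
# Simplified model of the new summary format: input is pre-grouped as
# (subject, merged_subjects) pairs; merges carry their merged commits.
def summarize(commits):
    out = []
    for subject, merged in commits:
        out.append("> " + subject)
        out.extend("    " + sub for sub in merged)
    return out
```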
Do not use default; instead, list all fall-through components
explicitly, so that if we add a new one, the developer doing so
will be forced to consider what to do here.
Eliminate the `default` case from the switch in
`encryption_file_io_extension::wrap_sink`, and explicitly
handle all `component_type` values within the switch statement.
Fixes: https://github.com/scylladb/scylladb/issues/23724
Closes scylladb/scylladb#24987
As requested in #22114, moved the files and fixed the other includes and the build system.
Moved files:
- interval.hh
- map_difference.hh
Fixes: #22114
This is a cleanup, no need to backport
Closes scylladb/scylladb#25095
The set of columns of a CDC log table should be managed automatically
by Scylla, and the user should not have the ability to manipulate them
directly. That could lead to disastrous consequences such as a
segmentation fault.
In this commit, we're restricting those operations. We also provide two
validation tests.
One of the existing tests had to be adjusted as it modified the type
of a column in a CDC log table. Since the test simply verifies that
the user has sufficient permissions to perform `ALTER TABLE` on the log
table, the test is still valid.
Fixes scylladb/scylladb#24643
Backport: we should backport the change to all affected
branches to prevent consequences that may affect users.
Closes scylladb/scylladb#25008
* github.com:scylladb/scylladb:
cdc: Forbid altering columns of inactive CDC log table
cdc: Forbid altering columns of CDC log tables directly
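The validation can be sketched as follows. This is a hypothetical model, not Scylla's C++ implementation: CDC log tables are named with the `_scylla_cdc_log` suffix, and any ALTER that changes their column set is rejected, while option-only ALTERs (and ALTERs on ordinary tables) pass.

```python
# Hypothetical sketch of the ALTER validation on CDC log tables.
CDC_LOG_SUFFIX = "_scylla_cdc_log"

def validate_alter(table_name, operation):
    """operation: 'add_column', 'drop_column', 'rename_column',
    or 'alter_options'. Raises ValueError for forbidden operations."""
    column_ops = {"add_column", "drop_column", "rename_column"}
    if table_name.endswith(CDC_LOG_SUFFIX) and operation in column_ops:
        raise ValueError(
            f"cannot {operation} on CDC log table {table_name}: "
            "its columns are managed automatically by Scylla")
```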
`protocol_exception` is thrown in several places. This has become a performance issue, especially when starting/restarting a server. To alleviate this issue, throwing the exception has to be replaced with returning it as a result or an exceptional future.
This PR replaces throws in the `transport/server` module. This is achieved by using result_with_exception, and in some places, where suitable, just by creating and returning an exceptional future.
There are four commits in this PR. The first commit introduces tests in `test/cqlpy`. The second commit refactors transport server `handle_error` to not rethrow exceptions. The third commit refactors reusable buffer writer callbacks. The fourth commit replaces throwing `protocol_exception` to returning it.
Based on the comments on an issue linked in https://github.com/scylladb/scylladb/issues/24567, the main culprit among protocol exceptions is the invalid protocol version exception, so I tested that exception for performance.
To see whether there is a measurable difference, a modified version of the `test_protocol_version_mismatch` Python test is used, with 100'000 runs across 10 processes (not threads, to avoid the Python GIL). One test run consisted of 1 warm-up run and 5 measured runs. The first test run was executed on the current code, which throws protocol exceptions. The second test run was executed on the new code, which returns protocol exceptions. The performance report is in https://github.com/scylladb/scylladb/pull/24738#issuecomment-3051611069. It shows ~10% gains in real, user, and sys time for this test.
Testing
Build: `release`
Test file: `test/cqlpy/test_protocol_exceptions.py`
Test name: `test_protocol_version_mismatch` (modified for mass connection requests)
Test arguments:
```
max_attempts=100'000
num_parallel=10
```
Throwing `protocol_exception` results:
```
real=1:26.97 user=10:00.27 sys=2:34.55 cpu=867%
real=1:26.95 user=9:57.10 sys=2:32.50 cpu=862%
real=1:26.93 user=9:56.54 sys=2:35.59 cpu=865%
real=1:26.96 user=9:54.95 sys=2:32.33 cpu=859%
real=1:26.96 user=9:53.39 sys=2:33.58 cpu=859%
real=1:26.95 user=9:56.85 sys=2:34.11 cpu=862% # average
```
Returning `protocol_exception` as `result_with_exception` or an exceptional future:
```
real=1:18.46 user=9:12.21 sys=2:19.08 cpu=881%
real=1:18.44 user=9:04.03 sys=2:17.91 cpu=869%
real=1:18.47 user=9:12.94 sys=2:19.68 cpu=882%
real=1:18.49 user=9:13.60 sys=2:19.88 cpu=883%
real=1:18.48 user=9:11.76 sys=2:17.32 cpu=878%
real=1:18.47 user=9:10.91 sys=2:18.77 cpu=879% # average
```
This PR replaced `transport/server` throws of `protocol_exception` with returns. There are a few other places where protocol exceptions are thrown, and there are many places where `invalid_request_exception` is thrown. That is out of scope of this single PR, so the PR only references, and does not resolve, issue #24567.
Refs: #24567
This PR improves performance in cases where protocol exceptions happen, for example during connection storms. It will require backporting.
Closes scylladb/scylladb#24738
* github.com:scylladb/scylladb:
test/cqlpy: add cpp exception metric test conditions
transport/server: replace protocol_exception throws with returns
utils/reusable_buffer: accept non-throwing writer callbacks via result_with_exception
transport/server: avoid exception-throw overhead in handle_error
test/cqlpy: add protocol_exception tests
When CDC becomes disabled on the base table, the CDC log table
still exists (cf. scylladb/scylladb@adda43edc7).
If it continues to exist up to the point when CDC is re-enabled
on the base table, no new log table will be created -- instead,
the old log table will be *re-attached*.
Since we want to avoid situations when the definition of the log
table has become misaligned with the definition of the base table
due to actions of the user, we forbid modifying the set of columns
or renaming them in CDC log tables, even when they're inactive.
Validation tests are provided.
Quit the repeat loop if the test is under the pytest runner directory and its name has a typo or is absent. This avoids going through discovery several times and stops execution early.
Print a warning at the end of the run when no tests were selected by the provided name.
Fixes: scylladb/scylladb#24892
Closes scylladb/scylladb#24918
* github.com:scylladb/scylladb:
test.py: print warning in case no tests were found
test.py: break the loop when there is no tests for pytest
In the CDC log transformer, when creating a CDC mutation based on some
base table mutation, for each value of a base column we set the value in
the CDC column with the same name.
When looking up the column in the CDC schema by name, we may get a null
pointer if a column by that name is not found. This shouldn't happen
normally because the base schema and CDC schema should be compatible,
and for each base column there should be a CDC column with the same
name.
However, there are scenarios where the base schema and CDC schema are
incompatible for a short period of time when they are being altered.
When a base column is being added or dropped, we could get a base
mutation with this column set, and then the CDC transformer picks up the
latest CDC schema which doesn't have this column.
If this happens, we now throw an exception instead of crashing on a
null pointer dereference. Currently we don't have a safer approach to
handle this, but this might change in the future. The other alternative
is dropping that data silently, which we prefer not to do.
Throwing an error is acceptable because this scenario most likely
indicates this behavior by the user:
* The user adds a new column, and starts writing values to the column
  before the ALTER is complete, or
* The user drops a column, and continues writing values to the column
while it's being dropped.
Both cases might as well fail with an error because the column is not
found in the base table.
Fixes scylladb/scylladb#24952
Backport needed - a simple fix for a node crash.
Closes scylladb/scylladb#24986
* github.com:scylladb/scylladb:
test: cdc: add test_cdc_with_alter
cdc: throw error if column doesn't exist
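The core of the fix can be sketched like this (hypothetical function and column naming; the real transformer works on C++ schema objects): the column lookup may come back empty during a concurrent ALTER, and that case now raises a clear error rather than dereferencing null.

```python
# Hypothetical sketch of the CDC transformer's column lookup fix.
def transform_value(cdc_schema_columns, base_column, value):
    """cdc_schema_columns: {name: cdc_column}; may briefly lack a base
    column while base and log schemas are being altered."""
    cdc_column = cdc_schema_columns.get(base_column)   # may be None
    if cdc_column is None:
        raise RuntimeError(
            f"column {base_column} not found in CDC log schema; "
            "base and log schemas are momentarily incompatible")
    return (cdc_column, value)
```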
Tested code paths should not throw exceptions. `scylla_reactor_cpp_exceptions`
metric is used. This is a global metric. To address potential test flakiness,
each test runs multiple times:
- `run_count = 100`
- `cpp_exception_threshold = 10`
If a change in the code introduces an exception, the expectation is that the number
of registered exceptions will exceed `cpp_exception_threshold` over `run_count` runs,
in which case the test fails.
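The flakiness guard amounts to a simple threshold check (sketch; the real test reads the `scylla_reactor_cpp_exceptions` metric, which is global and may drift slightly due to unrelated activity):

```python
# Sketch of the threshold check: the test only fails when the global
# exception counter grows past the threshold across all runs.
RUN_COUNT = 100
CPP_EXCEPTION_THRESHOLD = 10

def exceptions_acceptable(counter_before, counter_after):
    """True if the code under test is considered exception-free over
    RUN_COUNT runs, tolerating a little unrelated counter drift."""
    return (counter_after - counter_before) <= CPP_EXCEPTION_THRESHOLD
```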
Replace throwing protocol_exception with returning it as a result
or an exceptional future in the transport server module. This
improves performance, for example during connection storms and
server restarts, where protocol exceptions are more frequent.
In functions already returning a future, protocol exceptions are
propagated using an exceptional future. In functions not already
returning a future, result_with_exception is used.
A notable change is checking v.failed() before calling v.get() in the
process_request function, to avoid throwing in the case of an
exceptional future.
Refs: #24567
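The return-instead-of-throw pattern can be illustrated with a minimal stand-in for a result-with-exception type (the real code uses `utils::result_with_exception` and seastar futures; the names below are simplified assumptions). The error is built and returned as a value, and the caller checks `failed()` before `get()`, mirroring the `process_request` change.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <utility>
#include <variant>

// Minimal result type: holds either a value or a captured
// std::exception_ptr, so errors travel as values, with no throw on the
// hot path.
template <typename T>
class result {
    std::variant<T, std::exception_ptr> _v;
public:
    result(T v) : _v(std::move(v)) {}
    result(std::exception_ptr e) : _v(std::move(e)) {}
    bool failed() const { return std::holds_alternative<std::exception_ptr>(_v); }
    const T& get() const { return std::get<T>(_v); }
    std::exception_ptr get_exception() const { return std::get<std::exception_ptr>(_v); }
};

struct protocol_exception : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Instead of `throw protocol_exception(...)`, build the exception and
// return it inside the result.
result<int> parse_version(int raw) {
    if (raw < 3 || raw > 5) {
        return std::make_exception_ptr(protocol_exception("unsupported version"));
    }
    return raw;
}
```

During a connection storm, every malformed handshake takes the error branch; making that branch throw-free is what buys the performance win.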
Make make_bytes_ostream and make_fragmented_temporary_buffer accept
writer callbacks that return utils::result_with_exception instead of
forcing them to throw on error. This lets callers propagate failures
by returning an error result rather than throwing an exception.
Introduce buffer_writer_for, bytes_ostream_writer, and fragmented_buffer_writer
concepts to simplify and document the template requirements on writer callbacks.
This patch does not modify the actual callbacks passed, except for the syntax
changes needed for successful compilation, without changing the logic.
Refs: #24567
Previously, connection::handle_error always called f.get() inside a try/catch,
forcing every failed future to throw and immediately catch an exception just to
classify it. This change eliminates that extra throw/catch cycle by first checking
f.failed(), getting the stored std::exception_ptr via f.get_exception(), and
then dispatching on its type via utils::try_catch<T>(eptr).
The error-response logic is not changed - cassandra_exception, std::exception,
and unknown exceptions are caught and processed, and any exceptions thrown by
write_response while handling those exceptions continue to escape handle_error.
Refs: #24567
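A portable approximation of the dispatch in handle_error is sketched below. Standard C++ can only inspect an `exception_ptr` by rethrowing it; seastar's `utils::try_catch<T>()` classifies the pointer without that cost, which is the whole point of the change. The control flow, though, is the same: obtain the stored pointer, then dispatch on its type. `std::invalid_argument` stands in for `cassandra_exception` here.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Dispatch on the type of a stored exception_ptr. In seastar this is
// done via utils::try_catch<T>(eptr) with no rethrow; the portable
// version below must rethrow, but the three-way classification matches.
std::string classify(std::exception_ptr eptr) {
    try {
        std::rethrow_exception(eptr);
    } catch (const std::invalid_argument&) {
        return "cassandra-level error";   // stand-in for cassandra_exception
    } catch (const std::exception&) {
        return "generic error";
    } catch (...) {
        return "unknown error";
    }
}
```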
Add a helper to fetch scylla_transport_cql_errors_total{type="protocol_error"} counter
from Scylla's metrics endpoint. These metrics are used to track protocol error
count before and after each test.
Add cql_with_protocol context manager utility for session creation with parameterized
protocol_version value. This is used for testing connection establishment with
different protocol versions, and proper disposal of successfully established sessions.
The tests cover two failure scenarios:
- Protocol version mismatch in test_protocol_version_mismatch which tests both supported
and unsupported protocol version
- Malformed frames via raw socket in _protocol_error_impl, which is used by
several test functions, and also by the test_no_protocol_exceptions test,
which asserts that the error counters never decrease during test execution,
catching unintended metric resets
Refs: #24567
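The metric-snapshot idea can be illustrated with a small parser (the actual test helper is Python; this C++ sketch only shows the shape of the check, and `protocol_error_count` is a hypothetical name). It sums the `scylla_transport_cql_errors_total` samples labelled `type="protocol_error"` across shards from a Prometheus text exposition, the way the tests snapshot the counter before and after each scenario.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Sum all samples of scylla_transport_cql_errors_total with
// type="protocol_error" in a Prometheus text exposition. Comment and
// HELP lines start with '#' and are skipped by the prefix check.
double protocol_error_count(const std::string& metrics_text) {
    std::istringstream in(metrics_text);
    std::string line;
    double total = 0;
    while (std::getline(in, line)) {
        if (line.rfind("scylla_transport_cql_errors_total", 0) == 0 &&
            line.find("type=\"protocol_error\"") != std::string::npos) {
            auto pos = line.find_last_of(' ');
            total += std::stod(line.substr(pos + 1));
        }
    }
    return total;
}
```

A test takes this count before the scenario, runs it, takes it again, and asserts the delta matches the expected number of protocol errors.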
This reverts commit 45f5efb9ba.
The load_and_repair_paxos_state function was introduced in
scylladb/scylladb#24478, but it has never been tested or proven useful.
One set of problems stems from its use of local data structures
from a remote shard. In particular, system_keyspace and schema_ptr
cannot be directly accessed from another shard — doing so is a bug.
More importantly, load_paxos_state on different shards can't ever
return different values. The actual shard from which data is read is
determined by sharder.shard_for_reads, and storage_proxy will jump
back to the appropriate shard if the current one doesn't match. This
means load_and_repair_paxos_state can't observe paxos state from
write-but-not-read shard, and therefore will never be able to
repair anything.
We believe this explicit Paxos state read-repair is not needed at all.
Any paxos state read which drives some paxos round forward is already
accompanied by a paxos state write. Suppose we wrote the state to the
old shard but not to the new shard (because of some error) while
streaming is already finished. The RPC call (prepare or accept) will
return an error to the coordinator, so such a replica response won't affect
the current round. This write won't affect any subsequent paxos rounds
either, unless in those rounds the write actually succeeds on both
shards, effectively 'auto-repairing' paxos state.
The same holds if we managed to write to the new shard but not to the old shard.
Any subsequent reads will observe either the old state or the new
state (if the tablet already switched reads to the new shard). In any
case, we'll have to write the state to all relevant shards
from sharder.shard_for_writes (one or two) before sending rpc
response, making this state visible for all subsequent reads.
Thus, the monotonicity property ("once observed, the state must always
be observed") appears to hold without requiring explicit read-repair
and load_and_repair_paxos_state is not needed.
Closes scylladb/scylladb#24926
This is a refactoring patch in preparation for BTI indexes. It contains no functional changes (or at least none are intended).
In this patch, we modify the sstable readers to use index readers through a new virtual `abstract_index_readers` interface.
Later, we will add BTI indexes which will also implement this interface.
This interface contains the methods of `index_reader` which are needed by sstable readers, and leaves out all other methods, such as `current_clustered_cursor`.
Not all methods of this interface will be implementable by a trie-based index later. For example, a trie-based index can't provide a reliable `get_partition_key()`, because — unlike the current index — it only stores partition keys for partitions which have a row index. So the interface will have to be further restricted later. We don't do that in this patch because that will require changes to sstable reader logic, and this patch is supposed to only include cosmetic changes.
No backports needed, this is a preparation for new functionality.
Closes scylladb/scylladb#25000
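The refactoring pattern can be sketched with a toy interface. All names below are simplified stand-ins, not Scylla's real signatures: the point is that readers stop depending on the concrete `index_reader` and instead work against a narrow virtual interface that a future BTI/trie-based reader can also implement.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>

// The narrow interface: only what the sstable readers actually need.
// Methods like current_clustered_cursor() are deliberately left out.
struct abstract_index_reader {
    virtual ~abstract_index_reader() = default;
    virtual bool advance_to_partition(const std::string& key) = 0;
    virtual uint64_t data_file_position() const = 0;
};

// The existing (BIG-format) reader becomes one implementation of it.
class big_index_reader final : public abstract_index_reader {
    uint64_t _pos = 0;
public:
    bool advance_to_partition(const std::string& key) override {
        _pos += key.size();   // stand-in for a real index lookup
        return true;
    }
    uint64_t data_file_position() const override { return _pos; }
};

// Reader code is written against the interface only, so a trie-based
// implementation can be dropped in later without touching it.
uint64_t position_of(abstract_index_reader& idx, const std::string& key) {
    idx.advance_to_partition(key);
    return idx.data_file_position();
}
```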
* github.com:scylladb/scylladb:
sstables: add sstable::make_index_reader() and use where appropriate
sstables/mx: in readers, use abstract_index_reader instead of index_reader
sstables: in validate(), use abstract_index_reader instead of index_reader where possible
test/lib/index_reader_assertions: accept abstract_index_reader instead of index_reader
sstables/index_reader: introduce abstract_index_reader
sstables/index_reader: extract a prefetch_lower_bound() method
This PR adds a way for custom indexes to decide whether a view should be created for them. For the vector_index, the view is not needed because the index is stored in an external service. To allow this, custom logic for describing indexes using custom classes was added (describe used to depend on the view corresponding to an index).
Fixes: VECTOR-10
Closes scylladb/scylladb#24438
* github.com:scylladb/scylladb:
custom_index: do not create view when creating a custom index
custom_index: refactor describe for custom indexes
custom_index: remove unneeded duplicate of a static string
If we add multiple index implementations, users of index readers won't
easily know which concrete index reader type is the right one to construct.
We also don't want pieces of code to depend on functionality specific to
certain concrete types, if that's not necessary.
So instead of constructing the readers by themselves, they can use a helper
function, which will return an abstract (virtual) index reader.
This patch adds such a function, as a method of `sstable`.
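The factory idea can be sketched as follows, with simplified names that are assumptions rather than Scylla's actual signatures: instead of callers constructing a concrete reader, a method on `sstable` picks the right implementation for the file's format and returns it behind the abstract interface.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct abstract_index_reader {
    virtual ~abstract_index_reader() = default;
    virtual const char* format() const = 0;
};
struct big_reader final : abstract_index_reader {
    const char* format() const override { return "big"; }
};
struct bti_reader final : abstract_index_reader {
    const char* format() const override { return "bti"; }
};

enum class sstable_format { big, bti };

struct sstable {
    sstable_format fmt;
    // Callers never name a concrete reader type; the sstable knows
    // which index format it carries and constructs accordingly.
    std::unique_ptr<abstract_index_reader> make_index_reader() const {
        if (fmt == sstable_format::bti) {
            return std::make_unique<bti_reader>();
        }
        return std::make_unique<big_reader>();
    }
};
```

This keeps format-specific knowledge in one place, so adding BTI support later does not require touching every reader call site.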