scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-22 09:30:45 +00:00

Asias He 6f04de3efd streaming: Fail stream plan on stream_mutation_fragments handler in case of error

The following is observed in pytest:

1) node1, stream master, tried to pull data from node3

2) node3, stream follower, found node1 restarted

3) node3 killed the rpc stream

4) node1 did not get the stream session failure message from node3. This
failure message was supposed to kill the stream plan on node1. That's the
reason node1 failed the stream session much later at "2024-08-19 21:07:45,539".
Note, node3 failed the stream on its side, so it should have sent the stream
session failure message.

```
$ cat node1.log |grep f890bea0-5e68-11ef-99ae-e5bca04385fc
INFO  2024-08-19 20:24:01,162 [shard 0:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Executing streaming plan for Tablet migration-ks-index-0 with peers={127.0.34.3}, master
ERROR 2024-08-19 20:24:01,190 [shard 1:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Failed to handle STREAM_MUTATION_FRAGMENTS (receive and distribute phase) for ks=ks, cf=cf, peer=127.0.34.3: seastar::nested_exception: seastar::rpc::stream_closed (rpc stream was closed by peer) (while cleaning up after seastar::rpc::stream_closed (rpc stream was closed by peer))
WARN  2024-08-19 21:07:45,539 [shard 0:main] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Streaming plan for Tablet migration-ks-index-0 failed, peers={127.0.34.3}, tx=0 KiB, 0.00 KiB/s, rx=484 KiB, 0.18 KiB/s

$ cat node3.log |grep f890bea0-5e68-11ef-99ae-e5bca04385fc
INFO  2024-08-19 20:24:01,163 [shard 0:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Executing streaming plan for Tablet migration-ks-index-0 with peers=127.0.34.1, slave
INFO  2024-08-19 20:24:01,164 [shard 1:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Start sending ks=ks, cf=cf, estimated_partitions=2560, with new rpc streaming
WARN  2024-08-19 20:24:01,187 [shard 0: gms] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Streaming plan for Tablet migration-ks-index-0 failed, peers={127.0.34.1}, tx=633 KiB, 26506.81 KiB/s, rx=0 KiB, 0.00 KiB/s
WARN  2024-08-19 20:24:01,188 [shard 0:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] stream_transfer_task: Fail to send to 127.0.34.1:0: seastar::rpc::stream_closed (rpc stream was closed by peer)
WARN  2024-08-19 20:24:01,189 [shard 0:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Failed to send: seastar::rpc::stream_closed (rpc stream was closed by peer)
WARN  2024-08-19 20:24:01,189 [shard 0:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Streaming error occurred, peer=127.0.34.1
```
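
To put a number on "much later", the gap on node1 between the handler error and the eventual plan failure can be computed from the two timestamps in the log above. This is only a small sketch; the variable names are illustrative and not part of any Scylla tooling.

```python
from datetime import datetime

# Timestamp layout used in the log lines above, e.g. "2024-08-19 20:24:01,190".
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"

# node1: STREAM_MUTATION_FRAGMENTS handler reports the closed rpc stream.
handler_error = datetime.strptime("2024-08-19 20:24:01,190", LOG_TIME_FORMAT)
# node1: the streaming plan is finally reported as failed.
plan_failed = datetime.strptime("2024-08-19 21:07:45,539", LOG_TIME_FORMAT)

# How long the stream plan stayed alive after the handler had already failed.
delay = plan_failed - handler_error
print(f"plan failed {delay.total_seconds():.3f} s after the handler error")
```

The delay is roughly 43 minutes and 44 seconds, during which node1 kept the plan alive even though its handler had already seen the closed stream.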

To be safe in case the stream fail message is not received, node1 could fail
the stream plan as soon as the rpc stream is aborted in the
stream_mutation_fragments handler.
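
The idea can be illustrated with a toy model in Python. This is not the Scylla implementation (which is C++ on Seastar); `StreamPlan`, `StreamClosed`, and the handler below are hypothetical stand-ins that only show the control flow: the receiving handler marks the plan as failed as soon as the stream is closed, instead of waiting for a failure message from the peer.

```python
import asyncio


class StreamClosed(Exception):
    """Hypothetical stand-in for seastar::rpc::stream_closed."""


class StreamPlan:
    """Hypothetical, minimal model of a stream plan on the receiving node."""

    def __init__(self, plan_id):
        self.plan_id = plan_id
        self.failed = False
        self.reason = None

    def fail(self, reason):
        # Record only the first failure; later reports are ignored.
        if not self.failed:
            self.failed = True
            self.reason = reason


async def handle_stream_mutation_fragments(plan, source):
    """Consume fragments; fail the plan immediately if the stream breaks."""
    try:
        async for _fragment in source:
            pass  # a real handler would apply the fragment here
    except StreamClosed as exc:
        # Do not rely on the peer's failure message arriving later.
        plan.fail(f"handler error: {exc}")
        raise


async def closed_by_peer():
    """Yield one fragment, then simulate the peer closing the rpc stream."""
    yield b"fragment"
    raise StreamClosed("rpc stream was closed by peer")


async def demo():
    plan = StreamPlan("demo-plan")
    try:
        await handle_stream_mutation_fragments(plan, closed_by_peer())
    except StreamClosed:
        pass  # the error has already been recorded on the plan
    return plan


plan = asyncio.run(demo())
print(plan.failed, plan.reason)
```

In this model the plan is already marked failed when the handler returns, so nothing depends on a separate failure message being delivered.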

Fixes #20227

Closes scylladb/scylladb#21960

2025-02-10 16:32:12 +01:00

alternator

test/alternator: fix running against installation blocking CQL

2025-02-05 19:01:31 +03:00

auth_cluster

test: implement test_auth_password_ensured

2025-02-06 10:30:55 +01:00

boost

Merge 'test: tablets_test: Create proper schema in load balancer tests' from Tomasz Grabiec

2025-02-10 16:08:41 +02:00

broadcast_tables

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

cql

cql: restore validating replication strategies options

2025-02-04 12:27:33 +01:00

cqlpy

cql: remove expansion of "SELECT *" in DESC MATERIALIZED VIEW

2025-02-10 15:01:23 +02:00

ldap

test.py: Add possibility to run ldap tests from pytest

2025-02-07 21:40:28 +01:00

lib

Merge 'test: tablets_test: Create proper schema in load balancer tests' from Tomasz Grabiec

2025-02-10 16:08:41 +02:00

manual

moved cache files to db

2025-02-04 12:21:31 +03:00

nodetool

api: task_manager: add /task_manager/drain

2025-01-27 11:23:45 +01:00

object_store

backup_task: remove a component once it is uploaded

2025-01-22 11:17:01 +08:00

perf

Merge 'Add per-table tablet options in schema' from Benny Halevy

2025-02-08 20:32:19 +02:00

pylib

streaming: Fail stream plan on stream_mutation_fragments handler in case of error

2025-02-10 16:32:12 +01:00

pylib_test

test.py: Create central conftest.

2024-11-24 20:09:48 +02:00

raft

test: Use linux-aio backend again on seastar-based tests

2025-02-05 15:19:24 +02:00

redis

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

resource

build: cmake: use wasm32-wasip1 as an alternative of wasm32-wasi

2025-01-16 16:28:29 +03:00

rest_api

api: task_manager: do not unregister tasks on get_status

2025-01-27 11:23:45 +01:00

scylla_gdb

test/scylla_gdb: add more checks to coro_task()

2025-01-29 11:02:24 +02:00

topology

test.py: Add discovery for C++ tests for pytest

2025-02-07 19:44:06 +01:00

topology_custom

streaming: Fail stream plan on stream_mutation_fragments handler in case of error

2025-02-10 16:32:12 +01:00

topology_random_failures

test.py:topology_random_failures: enable tests deselected for #21534

2025-01-30 12:12:19 +01:00

topology_tasks

tasks: add shard, start_time, and end_time to task_stats

2025-02-04 12:11:24 +02:00

unit

test.py: Add the possibility to run unit tests from pytest

2025-02-07 21:40:28 +01:00

__init__.py

…

CMakeLists.txt

Introduce LDAP role manager & saslauthd authenticator

2025-01-12 14:50:29 +02:00

conftest.py

test.py: Add discovery for C++ tests for pytest

2025-02-07 19:44:06 +01:00

pytest.ini

test.py: Add the possibility to run boost test from pytest

2025-02-07 21:40:25 +01:00

README.md

test: rename "cql-pytest" to "cqlpy"

2024-11-06 16:48:36 +02:00

README.md

Scylla in-source tests.

For details on how to run the tests, see docs/dev/testing.md.

Shared C++ utilities and libraries are in lib/; shared Python code is in pylib/.

alternator - Python tests which connect to a single server and use the DynamoDB API

unit, boost, raft - unit tests in C++

cqlpy - Python tests which connect to a single server and use CQL

topology* - tests that set up clusters and add/remove nodes

cql - approval tests that use CQL and pre-recorded output

rest_api - tests for Scylla REST API Port 9000

scylla-gdb - tests for scylla-gdb.py helper script

nodetool - tests for C++ implementation of nodetool

If an existing folder fits, consider adding your test to it. Create a new folder only for a new large category or subsystem, or when the test environment differs significantly from any existing suite, e.g. you plan to start scylladb with a different configuration, you intend to add many tests, and you would like them to reuse an existing Scylla cluster (clusters can be reused for tests within the same folder).

To add a new folder, create a new directory, then copy a suite.ini from an existing folder into it and edit the copy.
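
As a rough sketch of that step, the helper below creates the new directory and seeds it with a suite.ini copied from an existing folder. The function name `add_suite` and its parameters are hypothetical, and it assumes the suite.ini is taken from a folder that already has one; the copied file still has to be edited by hand.

```python
import shutil
from pathlib import Path


def add_suite(test_root: Path, existing: str, new: str) -> Path:
    """Create test_root/new and copy suite.ini from test_root/existing.

    Returns the path of the copied suite.ini, which still needs editing.
    """
    src = test_root / existing / "suite.ini"
    if not src.is_file():
        raise FileNotFoundError(f"no suite.ini found in {src.parent}")

    dst_dir = test_root / new
    # Refuse to overwrite an existing folder by accident.
    dst_dir.mkdir(exist_ok=False)

    dst = dst_dir / "suite.ini"
    shutil.copyfile(src, dst)
    return dst
```

For example, `add_suite(Path("test"), "existing_folder", "my_new_folder")` would leave a copy at `test/my_new_folder/suite.ini`; both folder names here are placeholders.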