scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-09 16:33:35 +00:00

Files

Botond Dénes 9190d42863 Merge 'repair: Fix rwlock in compaction_state and lock holder lifecycle' from Raphael Raph Carvalho

Consider this:

- repair takes the lock holder
- tablet merge fiber destroys the compaction group and the compaction state
- repair fails
- repair destroys the lock holder

This is observed in the test:

```
repair - repair[5d73d094-72ee-4570-a3cc-1cd479b2a036] Repair 1 out of 1 tablets: table=sec_index.users range=(432345564227567615,504403158265495551] replicas=[0e9d51a5-9c99-4d6e-b9db-ad36a148b0ea:15, 498e354c-1254-4d8d-a565-2f5c6523845a:9, 5208598c-84f0-4526-bb7f-573728592172:28]

...

repair - repair[5d73d094-72ee-4570-a3cc-1cd479b2a036]: Started to repair 1 out of 1 tables in keyspace=sec_index, table=users, table_id=ea2072d0-ccd9-11f0-8dba-c5ab01bffb77, repair_reason=repair
repair - Enable incremental repair for table=sec_index.users range=(432345564227567615,504403158265495551]
table - Disabled compaction for range=(432345564227567615,504403158265495551] session_id=a13a72cc-cd2d-11f0-8e9b-76d54580ab09 for incremental repair
table - Got unrepaired compaction and repair lock for range=(432345564227567615,504403158265495551] session_id=a13a72cc-cd2d-11f0-8e9b-76d54580ab09 for incremental repair
table - Disabled compaction for range=(432345564227567615,504403158265495551] session_id=a13a72cc-cd2d-11f0-8e9b-76d54580ab09 for incremental repair
table - Got unrepaired compaction and repair lock for range=(432345564227567615,504403158265495551] session_id=a13a72cc-cd2d-11f0-8e9b-76d54580ab09 for incremental repair
repair - repair[5d73d094-72ee-4570-a3cc-1cd479b2a036]: get_sync_boundary: got error from node=0e9d51a5-9c99-4d6e-b9db-ad36a148b0ea, keyspace=sec_index, table=users, range=(432345564227567615,504403158265495551], error=seastar::rpc::remote_verb_error (Compaction state for table [0x60f008fa34c0] not found)
compaction_manager - Stopping 1 tasks for 1 ongoing compactions for table sec_index.users compaction_group=238 due to tablet merge
compaction_manager - Stopping 1 tasks for 1 ongoing compactions for table sec_index.users compaction_group=238 due to tablet merge

....

scylla[10793] Segmentation fault on shard 28, in scheduling group streaming
```

The rwlock in compaction_state could be destroyed before the lock holder
of the rwlock is destroyed. This causes a use-after-free when the lock
holder is destroyed.

To fix it, stopping a compaction group now waits for all users of the
repair lock.
That way, the compaction group, which controls the lifetime of the rwlock,
cannot be destroyed while the lock is held.
Additionally, the merge completion fiber, which might remove groups,
is properly serialized with incremental repair.
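
The ordering that the fix enforces can be illustrated with a small, std-only C++ sketch. This is not the ScyllaDB implementation: `lock_owner`, `holder`, `try_hold`, `stop` and `replay_merge_during_repair` are hypothetical names, and a plain callback stands in for the future that a Seastar fiber would wait on. The sketch only shows the idea that the owner of the lock is destroyed after the last holder has been released, never before.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the object that owns the repair lock
// (compaction_state in the text above). Illustration only.
class lock_owner {
    int _holders = 0;
    bool _stopping = false;
    std::function<void()> _on_stopped;

    void release() {
        --_holders;
        maybe_finish_stop();
    }
    void maybe_finish_stop() {
        if (_stopping && _holders == 0 && _on_stopped) {
            // Move the callback out first: running it may destroy *this.
            auto cb = std::move(_on_stopped);
            cb();
        }
    }
public:
    // RAII handle; while one exists, the owner must stay alive.
    class holder {
        lock_owner* _owner;
    public:
        explicit holder(lock_owner& o) : _owner(&o) { ++o._holders; }
        holder(holder&& h) noexcept : _owner(std::exchange(h._owner, nullptr)) {}
        holder(const holder&) = delete;
        holder& operator=(const holder&) = delete;
        ~holder() { if (_owner) _owner->release(); }
    };

    // Returns nullptr once stop() was requested, so no new holder can
    // appear while the owner is shutting down.
    std::unique_ptr<holder> try_hold() {
        if (_stopping) return nullptr;
        return std::make_unique<holder>(*this);
    }

    // Requests shutdown; on_stopped runs only after the last holder is gone.
    void stop(std::function<void()> on_stopped) {
        _stopping = true;
        _on_stopped = std::move(on_stopped);
        maybe_finish_stop();
    }
};

// Replays the sequence from the scenario above and records the event order.
std::vector<std::string> replay_merge_during_repair() {
    std::vector<std::string> events;
    auto owner = std::make_unique<lock_owner>();

    auto repair_lock = owner->try_hold();   // repair takes the lock holder
    events.push_back("repair holds lock");

    // The merge wants to remove the group; destruction is deferred.
    owner->stop([&] {
        owner.reset();
        events.push_back("owner destroyed");
    });
    events.push_back("merge requested stop");

    events.push_back("repair drops holder");
    repair_lock.reset();                    // repair fails, holder is released
    return events;
}
```

In this sketch `replay_merge_during_repair()` records the owner being destroyed only after the repair side has dropped its holder, which is the reverse of the crashing order listed at the top of this message.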

The issue reproduces consistently with a sanitize build and no longer
reproduces after the fix.

Fixes #27365

Closes scylladb/scylladb#28823

* github.com:scylladb/scylladb:
  repair: Fix rwlock in compaction_state and lock holder lifecycle
  repair: Prevent repair lock holder leakage after table drop

(cherry picked from commit 509f2af8db)

Closes scylladb/scylladb#28934

2026-03-09 10:25:47 +02:00

alternator

schema: Add initializer for compression defaults

2026-01-13 20:45:59 +02:00

boost

Merge '[Backport 2026.1] Fix regression in Alternator TTL with tablets and node going down' from Scylladb[bot]

2026-03-04 14:21:44 +02:00

broadcast_tables

test.py: switch of execution of several test directories by test.py runner

2026-01-09 11:59:25 +01:00

cluster

Merge 'repair: Fix rwlock in compaction_state and lock holder lifecycle' from Raphael Raph Carvalho

2026-03-09 10:25:47 +02:00

cql

test.py: switch of execution of several test directories by test.py runner

2026-01-09 11:59:25 +01:00

cqlpy

test: vector_similarity: Fix similarity value checks

2026-03-05 20:48:36 +02:00

ldap

main: auth: add auth cache dependency to auth service

2025-11-26 12:01:31 +01:00

lib

Merge 'db/batchlog_manager: re-add v1 support for mixed clusters' from Botond Dénes

2026-03-04 08:28:39 +02:00

manual

Populate all sl:* groups into dedicated top-level supergroup

2026-01-21 14:14:48 +02:00

nodetool

tools/scylla-nodetool: Increase precision of compression ratio from 1 to 2 decimal places

2026-01-05 07:07:06 +02:00

perf

tablets: Cache pointer to stats during plan-making

2026-01-29 09:06:49 +00:00

pylib

tests: pylib: util: Add exponential backoff to wait_for

2026-02-20 16:35:39 +00:00

pylib_test

…

raft

test/raft: use valid sentinel in liveness check to prevent digest errors

2026-01-06 14:34:02 +01:00

resource

build: apply sccache to rust builds too

2025-12-22 15:36:15 +02:00

rest_api

test: Keep test_gossiper_live_endpoints checks togethger

2026-01-23 16:53:48 +02:00

scylla_gdb

test/scylla_gdb: fix coro_task request usage, rename duplicate test

2026-01-23 15:25:58 +02:00

storage

topology_coordinator, tablets: Fail draining operations when tablet migration fails due to critical disk utilization

2026-01-18 15:36:07 +01:00

unit

code: Replace distributed<> with sharded<>

2025-09-19 12:22:51 +02:00

vector_search

Merge 'vector_search: test: fix HTTPS client test flakiness' from Karol Nowacki

2026-03-04 18:09:04 +01:00

__init__.py

test.py: introduce new environment variable TESTPY_PREPARED_ENVIRONMENT

2026-01-09 11:59:25 +01:00

CMakeLists.txt

Revert "Merge 'vector_search: add validator tests' from Pawel Pery"

2026-02-09 15:16:40 +02:00

conftest.py

test.py: refactor: move framework-related code to test.pylib.runner

2025-08-17 12:32:35 +00:00

pytest.ini

test.py: convert skip_mode function to pytest.mark

2026-01-08 21:55:16 +02:00

README.md

…

README.md

Scylla in-source tests.

For details on how to run the tests, see docs/dev/testing.md

Shared C++ utilities and libraries are in lib/; shared Python code is in pylib/.

- alternator - Python tests which connect to a single server and use the DynamoDB API
- unit, boost, raft - unit tests in C++
- cqlpy - Python tests which connect to a single server and use CQL
- topology* - tests that set up clusters and add/remove nodes
- cql - approval tests that use CQL and pre-recorded output
- rest_api - tests for Scylla REST API Port 9000
- scylla-gdb - tests for scylla-gdb.py helper script
- nodetool - tests for C++ implementation of nodetool

If an existing folder fits, consider adding your test to it. Create a new folder for a new large category or subsystem, or when the test environment differs significantly from every existing suite, e.g. you plan to start ScyllaDB with a different configuration, intend to add many tests, and want them to reuse an existing Scylla cluster (clusters can be reused for tests within the same folder).

To add a new folder, create a new directory, then copy a suite.ini from an existing folder into it and edit it.