This allows the user of `raft::server` to cause it to create a snapshot
and truncate the Raft log (leaving no trailing entries; in the future we
may extend the API to specify number of trailing entries left if
needed). In a later commit we'll add a REST endpoint to Scylla to
trigger group 0 snapshots.
One use case for this API is to create group 0 snapshots in Scylla
deployments which upgraded to Raft in version 5.2 and started with an
empty Raft log with no snapshot at the beginning. This causes problems,
e.g. when a new node bootstraps to the cluster, it will not receive a
snapshot that would contain both schema and group 0 history, which would
then lead to inconsistent schema state and trigger assertion failures as
observed in scylladb/scylladb#16683.
In 5.4 the logic of initial group 0 setup was changed to start the Raft
log with a snapshot at index 1 (ff386e7a44)
but a problem remains with these existing deployments coming from 5.2,
we need a way to trigger a snapshot in them (other than performing 1000
arbitrary schema changes).
Another potential use case in the future would be to trigger snapshots
based on external memory pressure in tablet Raft groups (for strongly
consistent tables).
The PR adds the API to `raft::server` and a HTTP endpoint that uses it.
In a follow-up PR, we plan to modify group 0 server startup logic to automatically
call this API if it sees that no snapshot is present yet (to automatically
fix the aforementioned 5.2 deployments once they upgrade.)
Closes scylladb/scylladb#16816
* github.com:scylladb/scylladb:
raft: remove `empty()` from `fsm_output`
test: add test for manual triggering of Raft snapshots
api: add HTTP endpoint to trigger Raft snapshots
raft: server: add `trigger_snapshot` API
raft: server: track last persisted snapshot descriptor index
raft: server: framework for handling server requests
raft: server: inline `poll_fsm_output`
raft: server: fix indentation
raft: server: move `io_fiber`'s processing of `batch` to a separate function
raft: move `poll_output()` from `fsm` to `server`
raft: move `_sm_events` from `fsm` to `server`
raft: fsm: remove constructor used only in tests
raft: fsm: move trace message from `poll_output` to `has_output`
raft: fsm: extract `has_output()`
raft: pass `max_trailing_entries` through `fsm_output` to `store_snapshot_descriptor`
raft: server: pass `*_aborted` to `set_exception` call
Scylla in-source tests.
For details on how to run the tests, see docs/dev/testing.md
Shared C++ utils, libraries are in lib/, for Python - pylib/
alternator - Python tests which connect to a single server and use the DynamoDB API unit, boost, raft - unit tests in C++ cql-pytest - Python tests which connect to a single server and use CQL topology* - tests that set up clusters and add/remove nodes cql - approval tests that use CQL and pre-recorded output rest_api - tests for Scylla REST API Port 9000 scylla-gdb - tests for scylla-gdb.py helper script nodetool - tests for C++ implementation of nodetool
If you can use an existing folder, consider adding your test to it. New folders should be used for new large categories/subsystems, or when the test environment is significantly different from some existing suite, e.g. you plan to start scylladb with different configuration, and you intend to add many tests and would like them to reuse an existing Scylla cluster (clusters can be reused for tests within the same folder).
To add a new folder, create a new directory, and then
copy & edit its suite.ini.