Files
scylladb/replica
Raphael S. Carvalho cc5b1acadf Improve log when sstable load fails due to missing tablet replica
A bug or some bad operator intervention can lead to a sstable existing in a
node after the tablet replica was moved to a different node.

This will result sstable loading during boot failing, requiring operator
intervention.
The log today just dumps the name of the "orphaned" sstable, but one
investigating it might want to know which process (repair, memtable, whatever)
generated that sstable, if the sstable was created locally or remotely,
and the current replica set of the underlying tablet.
From the original identifier, we can know the exact time the sstable was
created on its original node. From the current id, we know the time it
was created on the current node.
All this info can help the investigator to correlate with events in other nodes
(includes actions from the coordinator) to get closer to the root cause.

The new log will look like this:

"Unable to load SSTable .../me-3gyg_1fsw_2u0u826b00b71vc46o-big-Data.db
(originated from compaction with id 913f41c0-18c2-11f1-8f08-cb8521b3f330
on host e483238c-2287-4022-8bc4-b4f1c4cb2b0d)
of tablet 6 (replica set: [e483238c-2287-4022-8bc4-b4f1c4cb2b0d:0])"

Refs https://scylladb.atlassian.net/browse/SCYLLADB-788.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#28921
2026-03-11 06:20:34 +02:00
..