mirror of
https://github.com/scylladb/scylladb.git
synced 2026-04-22 17:40:34 +00:00
This patch adds the beginning of node repair support. Repair is initiated on a node using the REST API. For example, to repair all the column families in the "try1" keyspace, you can use:

    curl -X GET --header "Content-Type: application/json" --header "Accept: application/json" "http://127.0.0.1:10000/storage_service/repair_async/try1"

I tested that repair already works (it exchanges mutations with all other replicas and successfully repairs them), so I think it can be committed, but it will need more work to be complete:

1. Repair options are not yet supported (range repair, sequential/parallel repair, choice of hosts, datacenters and column families, etc.).

2. *All* the data of the keyspace is exchanged - Merkle trees (or an alternative optimization) and partial data exchange haven't been implemented yet.

3. Full repair for nodes with multiple separate ranges is not yet implemented correctly. E.g., consider 10 nodes with vnodes and RF=2: each vnode's range has a different host as a replica, so we need to exchange each key range separately with a different remote host.

4. Our repair operation returns a numeric operation id (like Origin), but we don't yet provide any means to use this id to check on ongoing repairs, as Origin allows.

5. Error handling, logging, etc., need to be improved.

6. SMP nodes (with multiple shards) should work correctly (thanks to Asias's latest patch for SMP mutation streaming) but haven't been tested.

7. Incremental repair is not supported (see http://www.datastax.com/dev/blog/more-efficient-repairs).

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
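
To illustrate limitation 3, here is a minimal Python sketch of why a node with several vnode ranges has to pair each range with a different remote host. It is not code from this patch. It assumes a simplified placement rule (the replicas of a range are the next RF distinct nodes clockwise on the ring), and the names `replicas_for_range`, `ranges_by_peer` and the toy six-token ring are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict


def replicas_for_range(ring, end_token, rf):
    """Return the rf distinct nodes holding the range that ends at end_token.

    ring is a list of (token, node) pairs sorted by token. Assumption: the
    replicas are the next rf distinct nodes found walking clockwise.
    """
    if rf > len({node for _, node in ring}):
        raise ValueError("rf exceeds the number of distinct nodes")
    tokens = [token for token, _ in ring]
    i = bisect_left(tokens, end_token) % len(ring)
    owners = []
    while len(owners) < rf:
        node = ring[i % len(ring)][1]
        if node not in owners:
            owners.append(node)
        i += 1
    return owners


def ranges_by_peer(ring, local, rf):
    """Group the (start, end] ranges replicated on local by the peer sharing them."""
    plan = defaultdict(list)
    for idx, (token, _) in enumerate(ring):
        start = ring[idx - 1][0]  # idx - 1 == -1 wraps to the last token
        owners = replicas_for_range(ring, token, rf)
        if local in owners:
            for peer in owners:
                if peer != local:
                    plan[peer].append((start, token))
    return dict(plan)


# Toy ring: three nodes with two vnodes each (hypothetical tokens).
ring = [(10, "A"), (20, "B"), (30, "C"), (40, "B"), (50, "A"), (60, "C")]
plan = ranges_by_peer(ring, "A", rf=2)
print(plan)
```

With RF=2, node A ends up with two ranges to exchange with B and two with C, so a repair of A cannot treat "all of A's data" as one stream to one set of replicas.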