Files
scylladb/api
Avi Kivity 3ffe93b6ae Merge 'Enhance load-and-stream with "scope"' from Pavel Emelyanov
The main purpose of this change is to enhance the restore from object storage usage.

Currently, restore uses the load-and-stream facility. When triggered, the restoring task opens the provided list of sstables directory from the remote bucket and then feeds the list of sstables to load_and_stream() method. The method, in turn, iterates over this list, reads mutations and for each mutation decides where to send one by checking the replication map (it's pretty much the same for both vnodes and tablets, but for tablets that are "fully contained" by a range there's the plan to stream faster).

As described above, restore is governed by a single node and this single node reads all sstables from the object store, which can be very slow. This PR allows speeding things up. For that, the load-and-stream code is equipped with the "scope" filter which limits where mutations can be streamed to. There are four options for that -- all, dc, rack and node. The "all" is how things work currently, "dc" and "rack" filter out target nodes that don't belong to this node's dc/rack respectively. The "node" scope only streams mutations to local node.

With the "node" scope it's possible to make all nodes in the cluster load mutations that belong to them in parallel, without re-sending them to peers. The last patch in this PR is the test that shows how it can be possible.

Closes scylladb/scylladb#21169

* github.com:scylladb/scylladb:
  test: Add scope-streaming test (for restore from backup)
  api: New "scope" API param to load-and-stream calls
  sstables_loader: Propagate scope from API down
  sstables_loader: Filter tablets based on scope
  streamer: Disable scoped streaming of primary replica only
  sstables_loader: Introduce streaming scope
  sstables_loader: Wrap get_endpoints()
2024-12-25 13:52:51 +02:00
..