For example, to bootstrap a 50th node in a cluster:

    [shard 0] range_streamer - Bootstrap with [127.0.0.8, 127.0.0.2, 127.0.0.24, 127.0.0.21, 127.0.0.49, 127.0.0.44, 127.0.0.9, 127.0.0.7, 127.0.0.47, 127.0.0.15, 127.0.0.5, 127.0.0.30, 127.0.0.14, 127.0.0.12, 127.0.0.36, 127.0.0.11, 127.0.0.48, 127.0.0.28, 127.0.0.33, 127.0.0.10, 127.0.0.41, 127.0.0.4, 127.0.0.40, 127.0.0.3, 127.0.0.6, 127.0.0.43, 127.0.0.22, 127.0.0.26, 127.0.0.42, 127.0.0.25, 127.0.0.17, 127.0.0.37, 127.0.0.23, 127.0.0.13, 127.0.0.38, 127.0.0.1, 127.0.0.18, 127.0.0.20, 127.0.0.39, 127.0.0.27, 127.0.0.34, 127.0.0.32, 127.0.0.19, 127.0.0.16, 127.0.0.31, 127.0.0.45, 127.0.0.29, 127.0.0.35, 127.0.0.46] for keyspace=keyspace1 started, nodes_to_stream=49, nodes_in_parallel=49

The new node gets data from the 49 existing nodes. Currently, it streams
from all 49 existing nodes at the same time.

Streaming from all the nodes in parallel is not a good idea, because it
can overwhelm the bootstrapping node: 49 nodes sending, 1 node receiving.

To fix this, limit the number of nodes to stream from in parallel. We
should have better control over memory usage and parallelism, but for
now, limit the number of nodes to a maximum of 16 as a start. With this
limit, each shard can work with as many as 16 remote nodes in parallel,
which I think provides enough parallelism for streaming in terms of
performance.

This change affects the bootstrap, decommission, and removenode node
operations, and does not affect repair.

Refs #2782
Message-Id: <980610dc97490d4f16281a0c3203b9bee73e04e4.1531989557.git.asias@scylladb.com>
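
The effect of the cap can be illustrated with a minimal, standalone sketch. It is not the code from this change: the names max_nodes_in_parallel, nodes_in_parallel() and split_into_batches() are hypothetical, and the sketch uses plain standard C++ with fixed batches, whereas the real range_streamer may instead keep up to 16 peers in flight at any time.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical constant mirroring the "maximum of 16" described above.
constexpr std::size_t max_nodes_in_parallel = 16;

// Number of peers a shard would stream from at once, given how many
// source nodes there are in total. (Illustrative helper, not Scylla API.)
std::size_t nodes_in_parallel(std::size_t nodes_to_stream,
                              std::size_t limit = max_nodes_in_parallel) {
    assert(limit > 0);
    return std::min(nodes_to_stream, limit);
}

// Split the peer list into consecutive batches of at most `limit` nodes.
// Processing one batch at a time bounds the number of concurrent senders.
std::vector<std::vector<std::string>>
split_into_batches(const std::vector<std::string>& nodes,
                   std::size_t limit = max_nodes_in_parallel) {
    assert(limit > 0);
    std::vector<std::vector<std::string>> batches;
    for (std::size_t i = 0; i < nodes.size(); i += limit) {
        const std::size_t end = std::min(i + limit, nodes.size());
        batches.emplace_back(
            nodes.begin() + static_cast<std::ptrdiff_t>(i),
            nodes.begin() + static_cast<std::ptrdiff_t>(end));
    }
    return batches;
}
```

Under these assumptions, the 49-node example above would report nodes_in_parallel=16 instead of 49, and the peers would be handled in four groups of 16, 16, 16 and 1.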