mirror of
https://github.com/tendermint/tendermint.git
synced 2026-05-29 10:30:20 +00:00
Added communication section
66
spec/blocksync/communication.md
Normal file
@@ -0,0 +1,66 @@
## Communication between peers and components within the blocksync reactor

Each peer has an open p2p channel. The total number of requests in flight is limited (`maxPendingRequests`, initially set to `maxTotalRequesters`). Additionally, there is an upper limit on the number of requests **per** peer (20).

Once a node receives messages via the p2p channel, they are propagated further via the reactor's Go channels. This section details each of the open communication channels, its capacity, and when it is activated.

On startup, the reactor starts four goroutines:

1. Process requests
2. Pool routine
3. Handle block sync channel messages
4. Process peer updates

The pool routine picks blocks out of the block pool and processes them. It also checks whether we should switch to consensus if we are caught up. ToDo - change wording (remove "we", discuss whether the pool routine should be the one checking the condition for consensus).

### Communication channels

`BlockSyncCh`: a p2p channel for sending/receiving requests to/from peers.

- Channel ID: 0x40
- Size of receive buffer: 1024 messages
- Size of send queue: 1000 messages
- Message size: maximum size of a block + size of proto block messages (response message prefix and key size)

Messages processed via the channel:

- `BlockRequest {height int64, peerID types.NodeID}`: Requests the block at height `height` from peer `peerID`.
- `BlockResponse {block types.Block}`: Sends `block` to the peer that requested it.
- `NoBlockResponse {height int64}`: Indicates that a peer does not have a block at `height`.
- `StatusRequest {}`: Sent to a peer to request its status.
- `StatusResponse {height int64, base int64}`: Sends to a peer the highest and lowest heights of the blocks within the sender's store (`store.Height()`, `store.Base()`).
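
The request/response pairing above can be illustrated with a minimal Go sketch. It is not the reactor's code: `Block`, `store`, `respond` and `status` are hypothetical stand-ins, the store is a plain in-memory map, and the real messages are protobuf types rather than these structs.

```go
package main

import "fmt"

// Block stands in for types.Block; only the height is modelled here.
type Block struct{ Height int64 }

// Simplified message types mirroring the fields listed above.
type BlockRequest struct{ Height int64 }
type BlockResponse struct{ Block *Block }
type NoBlockResponse struct{ Height int64 }
type StatusResponse struct{ Height, Base int64 }

// store is a hypothetical in-memory block store keyed by height.
type store map[int64]*Block

// respond returns the message a peer would send back for a block request:
// the block if it is stored, otherwise a NoBlockResponse for that height.
func respond(s store, req BlockRequest) interface{} {
	if b, ok := s[req.Height]; ok {
		return BlockResponse{Block: b}
	}
	return NoBlockResponse{Height: req.Height}
}

// status reports the highest (Height) and lowest (Base) stored heights.
// An empty store yields the zero value.
func status(s store) StatusResponse {
	var st StatusResponse
	first := true
	for h := range s {
		if first || h > st.Height {
			st.Height = h
		}
		if first || h < st.Base {
			st.Base = h
		}
		first = false
	}
	return st
}

func main() {
	s := store{3: &Block{Height: 3}, 4: &Block{Height: 4}}
	fmt.Printf("%T\n", respond(s, BlockRequest{Height: 4}))
	fmt.Printf("%T\n", respond(s, BlockRequest{Height: 9}))
	fmt.Printf("%+v\n", status(s))
}
```

In this sketch a request for a missing height never goes unanswered: the requesting side always receives either the block or an explicit `NoBlockResponse`.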

### Reactor channels

This section describes, per reactor component, the open channels and the information they process.

#### `BlockPool`

`requestsCh chan<- BlockRequest`: The number of requests is capped by a fixed parameter `maxPendingRequestsPerPeer` (initially set to 20).

`errorsCh chan<- peerError`: The channel buffer size is limited.

The reactor sends a p2p block request once it receives a signal via `requestsCh` from the block pool. The block pool first picks a peer (in round robin) and assigns it this particular height. Once this is done, the reactor can request the block.
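
As an illustration of the peer selection step, the following Go sketch picks peers in round-robin order and skips any peer that has reached the per-peer limit or cannot serve the requested height. The `peer` and `pool` types, the `next` cursor and `pickPeer` are assumptions made for this example; the actual pool may iterate over its peers differently.

```go
package main

import "fmt"

// Per-peer cap on in-flight requests, as stated in the text above.
const maxPendingRequestsPerPeer = 20

// peer is a simplified, hypothetical view of a connected peer.
type peer struct {
	id         string
	base       int64 // lowest height the peer can serve
	height     int64 // highest height the peer reported
	numPending int   // requests currently assigned to the peer
}

// pool keeps the peers and a cursor for the round-robin scan.
type pool struct {
	peers []*peer
	next  int
}

// pickPeer returns the next peer, in round-robin order, that can serve
// `height` and is below its pending-request limit, and counts the new
// request against it. It returns nil if no peer qualifies.
func (p *pool) pickPeer(height int64) *peer {
	for i := 0; i < len(p.peers); i++ {
		idx := (p.next + i) % len(p.peers)
		c := p.peers[idx]
		if c.numPending >= maxPendingRequestsPerPeer || height < c.base || height > c.height {
			continue
		}
		c.numPending++
		p.next = (idx + 1) % len(p.peers)
		return c
	}
	return nil
}

func main() {
	p := &pool{peers: []*peer{
		{id: "a", height: 10},
		{id: "b", height: 5},
		{id: "c", height: 10, numPending: maxPendingRequestsPerPeer},
	}}
	for _, h := range []int64{3, 4, 8} {
		if c := p.pickPeer(h); c != nil {
			fmt.Printf("height %d -> peer %s\n", h, c.id)
		}
	}
}
```

Only after `pickPeer` has returned a peer would a `BlockRequest` for that height and peer be pushed onto `requestsCh` for the reactor to send.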

#### `bpRequester`

`gotBlock chan struct{}`: capacity 1. Here we simply register a received block and keep waiting for the reactor to terminate or for a redo request.

**Note**. It is not clear why we need this.

`redoCh chan types.NodeID`: capacity 1. Signals the requester to redo a request for a particular block after replacing the peer for this height.

When a block is received by a requester, the requester performs a number of checks on the received block. Before marking the block as available, the requester verifies that:

- a block was expected at the particular height,
- the block came from the peer assigned to it,
- the block is not nil.

In the code there is the following ToDo listed:
`// TODO: ensure that blocks come in order for each peer.` This needs further specification.

If the checks pass, the `block` field of the requester is populated with the new block and is thus made available to the blocksync reactor.
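
The three checks can be sketched in Go as follows. This is a simplified illustration, not the reactor's code: `Block`, `requester`, `pool` and `addBlock` are hypothetical names, peer IDs are plain strings, and locking is omitted.

```go
package main

import (
	"errors"
	"fmt"
)

// Block stands in for types.Block; only the height is modelled.
type Block struct{ Height int64 }

// requester tracks one height assigned to one peer (simplified).
type requester struct {
	peerID string
	block  *Block
}

// pool maps each requested height to its requester.
type pool struct{ requesters map[int64]*requester }

// addBlock applies the three checks and, if they all pass, populates the
// requester's block field so the block becomes available to the reactor.
func (p *pool) addBlock(from string, b *Block) error {
	if b == nil {
		return errors.New("nil block")
	}
	r, ok := p.requesters[b.Height]
	if !ok {
		return fmt.Errorf("no block expected at height %d", b.Height)
	}
	if r.peerID != from {
		return fmt.Errorf("block at height %d came from %q, expected %q", b.Height, from, r.peerID)
	}
	r.block = b
	return nil
}

func main() {
	p := &pool{requesters: map[int64]*requester{7: {peerID: "peer-a"}}}
	fmt.Println(p.addBlock("peer-b", &Block{Height: 7})) // wrong peer
	fmt.Println(p.addBlock("peer-a", &Block{Height: 8})) // unexpected height
	fmt.Println(p.addBlock("peer-a", &Block{Height: 7})) // accepted
}
```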

#### `Reactor`

`requestsCh chan BlockRequest`: size `maxTotalRequesters`.

`errorsCh chan peerError`: size `maxPeerErrBuffer`.

`didProcessCh chan struct{}`: size `1`.
The channel is created within the pool routine of the reactor and is used to signal that the reactor should check the block pool for new blocks. A message is sent to the channel after a fixed timeout (`trySyncTicker`). As we need two blocks to verify one of them (this is more clearly defined in [verification](./verification.md)), if we miss only one of them, we do not wait for the sync timer to time out, but rather retry quickly until we fetch both.

`switchToConsensusTicker`: In addition to the sync timeout, in the same routine, the reactor periodically checks whether the conditions to switch to consensus are fulfilled.
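
The interplay of the two tickers and `didProcessCh` can be sketched as a single `select` loop. The sketch below is an assumption-laden simplification: `haveTwoBlocks`, `caughtUp` and `process` are hypothetical callbacks standing in for the pool and execution logic, the intervals are arbitrary, and the quick-retry behaviour for a single missing block is not modelled. It only shows how a size-1 channel coalesces sync signals and how processing re-arms the signal.

```go
package main

import (
	"fmt"
	"time"
)

// poolRoutine is a simplified stand-in for the reactor's pool routine.
func poolRoutine(trySync, switchTick <-chan time.Time, haveTwoBlocks, caughtUp func() bool, process func()) {
	didProcessCh := make(chan struct{}, 1)
	signal := func() {
		select {
		case didProcessCh <- struct{}{}:
		default: // a check is already queued, so signals coalesce
		}
	}
	for {
		select {
		case <-switchTick:
			if caughtUp() {
				return // hand over to consensus
			}
		case <-trySync:
			signal()
		case <-didProcessCh:
			if !haveTwoBlocks() {
				continue // nothing to verify yet; wait for the next signal
			}
			process()
			signal() // more blocks may be ready, so check again right away
		}
	}
}

// runDemo drives the loop with short tickers until three blocks have been
// processed, then returns the number of processed blocks.
func runDemo() int {
	trySyncTicker := time.NewTicker(time.Millisecond)
	defer trySyncTicker.Stop()
	switchToConsensusTicker := time.NewTicker(5 * time.Millisecond)
	defer switchToConsensusTicker.Stop()

	const target = 3
	processed := 0
	poolRoutine(trySyncTicker.C, switchToConsensusTicker.C,
		func() bool { return processed < target },
		func() bool { return processed == target },
		func() { processed++ })
	return processed
}

func main() {
	fmt.Println("processed blocks:", runDemo())
}
```

Because the channel has capacity 1 and sends are non-blocking, a burst of ticks while the loop is busy results in at most one queued check.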
@@ -1,20 +1,11 @@
### Data structures

These are the core data structures necessary to provide the Blocksync Reactor logic.
There are four core components of the blocksync reactor: the reactor itself, a block pool, requesters and peers.

The requester data structure is used to track the assignment of a request for a `block` at position `height` to a peer whose ID equals `peerID`.
The reactor verifies received blocks, executes them against the application and commits them into the blockstore of the node. It also sends out requests to peers asking for more blocks and contains the logic to switch from blocksync to consensus.
It contains a pointer to the block pool.

```go
type bpRequester struct {
	mtx         Mutex
	block       *types.Block
	height      int64
	peerID      types.NodeID
	redoChannel chan types.NodeID // redo may be sent multiple times; the peer ID is used to identify repeats
	gotBlockCh  chan struct{}
}
```
Pool is a core data structure that stores the last executed block (`height`), the assignment of requests to peers (`requesters`), the current height and number of pending requests for each peer (`peers`), the maximum peer height, etc.
The block pool stores the last executed block (`height`), keeps track of the peers connected to a node, assigns requests to peers (by creating `requesters`), and tracks the current height of each peer along with its number of pending requests.

```go
type BlockPool {
@@ -30,7 +21,20 @@ type BlockPool {
}
```

The `Peer` data structure stores, for each peer, its current `height` and the number of pending requests sent to the peer (`numPending`), etc.
Each requester is used to track the assignment of a request for a `block` at position `height` to a peer whose ID equals `peerID`.

```go
type bpRequester struct {
	mtx         Mutex
	block       *types.Block
	height      int64
	peerID      types.NodeID
	redoChannel chan types.NodeID // redo may be sent multiple times; the peer ID is used to identify repeats
	gotBlockCh  chan struct{}
}
```

The `Peer` data structure stores, for each peer, its current `height` and the number of pending requests sent to the peer (`numPending`), etc. When a block is processed, this number is decremented.

```go
type bpPeer struct {
@@ -44,12 +48,3 @@ type bpPeer struct {
}
```

BlockRequest is an internal data structure used to denote the current mapping of a request for a block at some `height` to a peer (`PeerID`).

```go
type BlockRequest struct {
	Height int64
	PeerID p2p.ID
}
```
@@ -25,10 +25,18 @@ using the Consensus Reactor.

A node can switch to blocksync directly on start-up or after completing `state-sync`. Currently, switching back to blocksync from consensus is not possible. It is expected to be handled in [Issue #129](https://github.com/tendermint/tendermint/issues/129).

### Switching from state sync to blocksync
The blocksync reactor service is started at the same time as all the other services in Tendermint. However, block syncing is initially disabled (the `blockSync` boolean flag is false), so the block pool and the routine that processes blocks from the pool are not launched until the reactor is actually activated.

The reactor is activated after state sync, at which point the pool and request processing routines are launched.

However, receiving messages via the p2p channel and sending status updates to other nodes is enabled regardless of whether the blocksync reactor is started. This makes sense, as a node should be able to send updates to other peers regardless of whether it is itself blocksyncing.

**Note**. In the current version, if we start from state sync and blocksync has not previously been launched as a service, the internal channels used by the reactor will not be created. We need to be careful to launch the blocksync *service* before we call the function to switch from statesync to blocksync.

### Switching from blocksync to consensus

Ideally, the switch to consensus is done either once we have caught up to the maximum height reported by a peer or once our height has not advanced for more than 60s.

The former is checked by periodically calling `isCaughtUp` inside `poolRoutine`. The period is set with `switchToConsensusTicker`. We consider a node to be caught up if it is 1 height away from the maximum height reported by its peers. The reason we **do not catch up until the maximum height** (`pool.maxPeerHeight`) is that we cannot verify the block at `pool.maxPeerHeight` without the `lastCommit` of the block at `pool.maxPeerHeight + 1`.

BlockSync **does not** switch to consensus until we have synced at least one block, as we need vote extensions in order to participate in consensus. Vote extensions are not provided to the blocksync reactor after state sync, so we need to receive them from one of our peers.
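
Taken together, the conditions in this section can be expressed as a small predicate. The Go sketch below is illustrative only: `shouldSwitchToConsensus`, `stallTimeout` and the parameter names are hypothetical, and the actual `isCaughtUp` in the code may take additional state into account.

```go
package main

import (
	"fmt"
	"time"
)

// stallTimeout is a hypothetical name for the 60s limit mentioned above.
const stallTimeout = 60 * time.Second

// isCaughtUp reports whether the node is at most one height below the
// maximum height reported by its peers.
func isCaughtUp(height, maxPeerHeight int64) bool {
	return height >= maxPeerHeight-1
}

// shouldSwitchToConsensus combines the conditions described above: at least
// one block must have been synced, and the node must either be caught up or
// have made no progress for longer than stallTimeout.
func shouldSwitchToConsensus(height, maxPeerHeight int64, blocksSynced int, sinceLastAdvance time.Duration) bool {
	if blocksSynced == 0 {
		return false // vote extensions must first be received from a peer
	}
	return isCaughtUp(height, maxPeerHeight) || sinceLastAdvance > stallTimeout
}

func main() {
	fmt.Println(shouldSwitchToConsensus(99, 100, 0, 0))
	fmt.Println(shouldSwitchToConsensus(99, 100, 1, 0))
	fmt.Println(shouldSwitchToConsensus(50, 100, 5, 61*time.Second))
}
```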
@@ -38,7 +46,7 @@ The Blocksync reactor is organised as a set of concurrent tasks:

- Receive routine of Blocksync Reactor
- Task for creating Requesters
- Set of Requester tasks
- Controller task



@@ -49,3 +57,5 @@ This section describes the Blocksync reactor and its internals including:
- [Block Verification](./block_verification.md)

More details on how to use the Blocksync reactor and configure it when running Tendermint can be found [here](./../docs/tendermint-core/block-sync/README.md).