From b39af911aedd0125b238662caa0ea2e0109c2e11 Mon Sep 17 00:00:00 2001 From: Callum Waters Date: Mon, 19 Apr 2021 17:02:09 +0200 Subject: [PATCH] RFC: ReverseSync - fetching historical data (#224) --- rfc/005-reverse-sync.md | 197 ++++++++++++++++++++++++++++++++++++++++ rfc/README.md | 2 + 2 files changed, 199 insertions(+) create mode 100644 rfc/005-reverse-sync.md diff --git a/rfc/005-reverse-sync.md b/rfc/005-reverse-sync.md new file mode 100644 index 000000000..339b9d9fb --- /dev/null +++ b/rfc/005-reverse-sync.md @@ -0,0 +1,197 @@ +# RFC 004: ReverseSync - fetching historical data + +## Changelog + +- 2021-04-19: Use P2P to gossip necessary data for reverse sync. +- 2021-03-03: Simplify proposal to the state sync case. +- 2021-02-17: Add notes on asynchronicity of processes. +- 2020-12-10: Rename backfill blocks to reverse sync. +- 2020-11-25: Initial draft. + +## Author(s) + +- Callum Waters (@cmwaters) + +## Context + +Two new features: [Block pruning](https://github.com/tendermint/tendermint/issues/3652) +and [State sync](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-042-state-sync.md) +meant nodes no longer needed a complete history of the blockchain. This +introduced some challenges of its own which were covered and subsequently +tackled with [RFC-001](https://github.com/tendermint/spec/blob/master/rfc/001-block-retention.md). +The RFC allowed applications to set a block retention height; an upper bound on +what blocks would be pruned. However nodes who state sync past this upper bound +(which is necessary as snapshots must be saved within the trusting period for +the assisting light client to verify) have no means of backfilling the blocks +to meet the retention limit. This could be a problem as nodes who state sync and +then eventually switch to consensus (or fast sync) may not have the block and +validator history to verify evidence causing them to panic if they see 2/3 +commit on what the node believes to be an invalid block. + +Thus, this RFC sets out to instil a minimum block history invariant amongst +honest nodes. + +## Proposal + +A backfill mechanism can simply be defined as an algorithm for fetching, +verifying and storing, headers and validator sets of a height prior to the +current base of the node's blockchain. In matching the terminology used for +other data retrieving protocols (i.e. fast sync and state sync), we +call this method **ReverseSync**. + +We will define the mechanism in four sections: +- Usage +- Design +- Verification +- Termination + +### Usage + +For now, we focus purely on the case of a state syncing node, whom after +syncing to a height will need to verify historical data in order to be capable +of processing new blocks. We can denote the earliest height that the node will +need to verify and store in order to be able to verify any evidence that might +arise as the `max_historical_height`/`time`. Both height and time are necessary +as this maps to the BFT time used for evidence expiration. After acquiring +`State`, we calculate these parameters as: + +```go +max_historical_height = max(state.InitialHeight, state.LastBlockHeight - state.ConsensusParams.EvidenceAgeHeight) +max_historical_time = max(GenesisTime, state.LastBlockTime.Sub(state.ConsensusParams.EvidenceAgeTime)) +``` + +Before starting either fast sync or consensus, we then run the following +synchronous process: + +```go +func ReverseSync(max_historical_height int64, max_historical_time time.Time) error +``` + +Where we fetch and verify blocks until a block `A` where +`A.Height <= max_historical_height` and `A.Time <= max_historical_time`. + +Upon successfully reverse syncing, a node can now safely continue. As this +feature is only used as part of state sync, one can think of this as merely an +extension to it. + +In the future we may want to extend this functionality to allow nodes to fetch +historical blocks for reasons of accountability or data accessibility. + +### Design + +This section will provide a high level overview of some of the more important +characteristics of the design, saving the more tedious details as an ADR. + +#### P2P + +Implementation of this RFC will require the addition of a new channel and two +new messages. + +```proto +message LightBlockRequest { + uint64 height = 1; +} +``` + +```proto +message LightBlockResponse { + Header header = 1; + Commit commit = 2; + ValidatorSet validator_set = 3; +} +``` + +The P2P path may also enable P2P networked light clients and a state sync that +also doesn't need to rely on RPC. + +### Verification + +ReverseSync is used to fetch the following data structures: +- `Header` +- `Commit` +- `ValidatorSet` + +Nodes will also need to be able to verify these. This can be achieved by first +retrieving the header at the base height from the block store. From this trusted +header, the node hashes each of the three data structures and checks that they are correct. + +1. The trusted header's last block ID matches the hash of the new header +```go +header[height].LastBlockID == hash(header[height-1]) +``` + +2. The trusted header's last commit hash matches the hash of the new commit +```go +header[height].LastCommitHash == hash(commit[height-1]) +``` + +3. Given that the node now trusts the new header, check that the header's validator set +hash matches the hash of the validator set +```go +header[height-1].ValidatorsHash == hash(validatorSet[height-1]) +``` + +### Termination + +ReverseSync draws a lot of parallels with fast sync. An important consideration +for fast sync that also extends to ReverseSync is termination. ReverseSync will +finish it's task when one of the following conditions have been met: + +1. It reaches a block `A` where `A.Height <= max_historical_height` and +`A.Time <= max_historical_time`. +2. None of it's peers reports to have the block at the height below the +processes current block. +3. A global timeout. + +This implies that we can't guarantee adequate history and thus the term +"invariant" can't be used in the strictest sense. In the case that the first +condition isn't met, the node will log an error and optimistically attempt +to continue with either fast sync or consensus. + +## Alternative Solutions + +The need for a minimum block history invariant stems purely from the need to +validate evidence (although there may be some application relevant needs as +well). Because of this, an alternative, could be to simply trust whatever the +2/3+ majority has agreed upon and in the case where a node is at the head of the +blockchain, you simply abstain from voting. + +As it stands, if 2/3+ vote on evidence you can't verify, in the same manner if +2/3+ vote on a header that a node sees as invalid (perhaps due to a different +app hash), the node will halt. + +Another alternative is the method with which the relevant data is retrieved. +Instead of introducing new messages to the P2P layer, RPC could have been used +instead. + +The aforementioned data is already available via the following RPC endpoints: +`/commit` for `Header`'s' and `/validators` for `ValidatorSet`'s'. It was +decided predominantly due to the instability of the current RPC infrastructure +that P2P be used instead. + +## Status + +Proposed + +## Consequences + +### Positive + +- Ensures a minimum block history invariant for honest nodes. This will allow + nodes to verify evidence. + +### Negative + +- Statesync will be slower as more processing is required. + +### Neutral + +- By having validator sets served through p2p, this would make it easier to +extend p2p support to light clients and state sync. +- In the future, it may also be possible to extend this feature to allow for +nodes to freely fetch and verify prior blocks + +## References + +- [RFC-001: Block retention](https://github.com/tendermint/spec/blob/master/rfc/001-block-retention.md) +- [Original issue](https://github.com/tendermint/tendermint/issues/4629) diff --git a/rfc/README.md b/rfc/README.md index b07a2c37f..c9932b749 100644 --- a/rfc/README.md +++ b/rfc/README.md @@ -25,3 +25,5 @@ Some RFC's will be presented at a Tendermint Dev Session. If you are an outside - [001-block-retention](./001-block-retention.md) - [002-nonzero-genesis](./002-nonzero-genesis.md) - [003-ed25519-verification](./003-ed25519-verification.md) +- [004-abci++](./004-abci++.md) +- [005-reverse-sync](./005-reverse-sync.md)