This PR makes vote extensions optional within Tendermint. A new ConsensusParams field, called ABCIParams.VoteExtensionsEnableHeight, has been added to toggle whether or not extensions should be enabled or disabled depending on the current height of the consensus engine. Related to: #8453
Closes: #8575
This PR aims to fix the `LastCommitRound can only be negative for initial height 0` issue we see in the e2e tests by initializing the `state` object before starting the receive routines in the consensus reactor. This is somewhat inelegant, but it should fix the issue.
This pull request adds an additional set of metrics targeted at providing more visibility into `abci++`.
The following set of metrics are added and exposed through the `metrics` endpoint:
```
tendermint_consensus_proposal_receive_count{chain_id="test-chain-IrF74Y",status="accepted"} 34
tendermint_consensus_proposal_create_count{chain_id="test-chain-IrF74Y"} 34
tendermint_consensus_vote_extension_receive_count{chain_id="test-chain-IrF74Y",status="accepted"} 34
tendermint_consensus_round_voting_power_percent{chain_id="test-chain-IrF74Y",vote_type="precommit"} 1
tendermint_consensus_round_voting_power_percent{chain_id="test-chain-IrF74Y",vote_type="prevote"} 1
tendermint_state_consensus_param_updates{chain_id="test-chain-IrF74Y"} 0
tendermint_state_validator_set_updates{chain_id="test-chain-IrF74Y"} 0
tendermint_consensus_late_votes{chain_id="test-chain-IrF74Y",vote_type="precommit"} 16
```
This pull request also updates the `metrics.md` file to include some metrics that were previously missed. My hope is to generate the `metrics.md` file with a future version of the tool being architected in #8479
* Refactor so building and linting works
This is the first step towards implementing vote extensions: generating
the relevant proto stubs and getting the build and linter to pass.
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Fix typo
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Better describe method given vote extensions
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Fix types tests
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Move CanonicalVoteExtension to canonical types proto defs
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Regenerate protos including latest PBTS synchrony params update
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Inject vote extensions into proposal
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Thread vote extensions through code and fix tests
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Remove extraneous empty value initialization
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Fix lint
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Fix missing VerifyVoteExtension request data
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Explicitly ensure length > 0 to sign vote extension
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Explicitly ensure length > 0 to sign vote extension
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Remove extraneous comment
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Update privval/file.go
Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
* Update types/vote_test.go
Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
* Format
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Fix ABCI proto generation scripts for Linux
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Sync intermediate and goal protos
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Update internal/consensus/common_test.go
Co-authored-by: Sergio Mena <sergio@informal.systems>
* Use dummy value with clearer meaning
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Rewrite loop for clarity
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Panic on ABCI++ method call failure
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Add strong correctness guarantees when constructing extended commit info for ABCI++
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Add strong guarantee in extendedCommitInfo that the number of votes corresponds
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Make extendedCommitInfo function more robust
At first extendedCommitInfo expected votes to be in the same order as
their corresponding validators in the supplied CommitInfo struct, but
this proved to be rather difficult since when a validator set's loaded
from state it's first sorted by voting power and then by address.
Instead of sorting the votes in the same way, this approach simply maps
votes to their corresponding validator's address prior to constructing
the extended commit info. This way it's easy to look up the
corresponding vote and we don't need to care about vote order.
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Remove extraneous validator address assignment
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Sign over canonical vote extension
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Validate vote extension signature against canonical vote extension
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Update privval tests for more meaningful dummy value
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Add vote extension capability to E2E test app
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Disable lint for weak RNG usage for test app
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Use parseVoteExtension instead of custom parsing in PrepareProposal
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Only include extension if we have received txs
It's unclear at this point why this is necessary to ensure that the
application's local app_hash matches that committed in the previous
block.
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Require app_hash from app to match that from last block
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Add contrived (possibly flaky) test to check that vote extensions code works
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Remove workaround for problem now solved by #8229
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* add tests for vote extension cases
* Fix spelling mistake to appease linter
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Collapse redundant if statement
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Formatting
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Always expect an extension signature, regardless of whether an extension is present
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Votes constructed from commits cannot include extensions or signatures
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Pass through vote extension in test helpers
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Temporarily disable vote extension signature requirement
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Expand on vote equality test errors for clarity
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Expand on vote matching error messages in testing
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Allow for selective subscription by vote type
This is an attempt to fix the intermittently failing
`TestPrepareProposalReceivesVoteExtensions` test in the internal
consensus package.
Occasionally we get prevote messages via the subscription channel, and
we're not interested in those. This change allows us to specify what
types of votes we're interested in (i.e. precommits) and discard the
rest.
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Read lock consensus state mutex in test helper to avoid data race
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Revert BlockIDFlag parameter in node test
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Perform additional check in ProcessProposal for special txs generated by vote extensions
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* e2e: check that our added tx does not cause all txs to exceed req.MaxTxBytes
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Only set vote extension signatures when signing is successful
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Remove channel capacity constraint in test helper to avoid missing messages
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Add TODO to always require extension signatures in vote validation
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* e2e: reject vote extensions if the request height does not match what we expect
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* types: remove extraneous call to voteWithoutExtension in test
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Remove unnecessary address parameter from CanonicalVoteExtension
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* privval: change test vote type to precommit since we use an extension
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* privval: update signing logic to cater for vote extensions
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* proto: update field descriptions for vote message
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* proto: update field description for vote extension sig in vote message
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* proto/types: use fixed-length 64-bit integers for rounds in CanonicalVoteExtension
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* consensus: fix flaky TestPrepareProposalReceivesVoteExtensions
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* consensus: remove previously added test helper functionality
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* e2e: add error logs when we get an unexpected height in ExtendVote or VerifyVoteExtension requests
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* node_test: get validator addresses from privvals
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* privval/file_test: optimize filepv creation in tests
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* privval: add test to check that vote extensions are always signed
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Add a script to check documentation for ToC entries. (#8356)
This script verifies that each document in the docs and architecture directory
has a corresponding table-of-contents entry in its README file. It can be run
manually from the command line.
- Hook up this script to run in CI (optional workflow).
- Update ADR ToC to include missing entries this script found.
* build(deps): Bump async from 2.6.3 to 2.6.4 in /docs (#8357)
Bumps [async](https://github.com/caolan/async) from 2.6.3 to 2.6.4.
- [Release notes](https://github.com/caolan/async/releases)
- [Changelog](https://github.com/caolan/async/blob/v2.6.4/CHANGELOG.md)
- [Commits](https://github.com/caolan/async/compare/v2.6.3...v2.6.4)
---
updated-dependencies:
- dependency-name: async
dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* privval/file_test: reset vote ext sig before signing
Signed-off-by: Thane Thomson <connect@thanethomson.com>
Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
Co-authored-by: Sergio Mena <sergio@informal.systems>
Co-authored-by: William Banfield <wbanfield@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Outstanding abci-gen changes to 'pb.go' files
* Removed modified_tx_status from spec and protobufs
* Fix sed for OSX
* Regenerated abci protobufs with 'abci-proto-gen'
* Code changes. UTs e2e tests passing
* Recovered UT: TestPrepareProposalModifiedTxStatusFalse
* Adapted UT
* Fixed UT
* Revert "Fix sed for OSX"
This reverts commit e576708c61.
* Update internal/state/execution_test.go
Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
* Update abci/example/kvstore/kvstore.go
Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
* Update internal/state/execution_test.go
Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
* Update spec/abci++/abci++_tmint_expected_behavior_002_draft.md
Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
* Addressed some comments
* Added one test that tests error at the ABCI client + Fixed some mock calls
* Addressed remaining comments
* Update abci/example/kvstore/kvstore.go
Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
* Update abci/example/kvstore/kvstore.go
Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
* Update abci/example/kvstore/kvstore.go
Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
* Update spec/abci++/abci++_tmint_expected_behavior_002_draft.md
Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
* Addressed William's latest comments
* Adressed Michael's comment
* Fixed UT
* Some md fixes
* More md fixes
* gofmt
Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
Replaces the set of timeout parameters in the config.toml file with unsafe-*override versions of the corresponding ConsensusParams.Timeout field. These fields can be used for the duration of v0.36 to override the consensus param in case of emergency.
Adds a set to the ./internal/consensus/State type for correctly calculating the value of each timeout based on the set of overrides specified.
I see this panic in tests occasionally, and I don't think there's any
need to close this channel:
- it's only sent to in one place which has a select case with a
default clause, so there's no chance of deadlocks.
- the only place we recieve from it thas a timeout.
closes: #7756
# What does this pull request change?
This pull request adds a new runbook for operators enountering errors related to the new Proposer-Based Timestamps algorithm. The goal of this runbook is to give operators a set of clear steps that they can follow if they are having issues producing blocks because of clock synchronization problems.
This pull request also renames the `*PrevoteDelay` metrics to drop the term `MessageDelay`. These metrics provide a combined view of `message_delay` + `synchrony` so the name may be confusing.
# Questions to reviewers
* Are there ways to make the set of steps clearer or are there any pieces that seem confusing?
This change implements the logic for the PrepareProposal ABCI++ method call. The main logic for creating and issuing the PrepareProposal request lives in execution.go and is tested in a set of new tests in execution_test.go. This change also updates the mempool mock to use a mockery generated version and removes much of the plumbing for the no longer used ABCIResponses.
Updates #8077. The panic handler for consensus currently attempts to effect a
clean shutdown, but this can leave a failed node running in an unknown state
for an arbitrary amount of time after the failure.
Since a panic at this point means consensus is already irrecoverably broken, we
should not allow the node to continue executing. After making a best effort to
shut down the writeahead log, re-panic to ensure the node will terminate before
any further state transitions are processed.
Even with this change, it is possible some transitions may occur while the
cleanup is happening. It might be preferable to abort unconditionally without
any attempt at cleanup.
Related changes:
- Clean up the creation of WAL directories.
- Filter WAL close errors at rethrow.
This change implements the spec for `ProcessProposal`. It first calls the Tendermint block validation logic to check that all of the proposed block fields are well formed and do not violate any of the rules for Tendermint to consider the block valid and then passes the validated block the `ProcessProposal`.
This change also adds additional fixtures to test the change. It adds the `baseMock` types that holds a mock as well as a reference to `BaseApplication`. If the function was not setup by the test on the contained mock Application, the type delegates to the `BaseApplication` and returns what `BaseApplication` returns.
The change also switches the `makeState` helper to take an arg struct so that an ABCI application can be plumbed through when needed.
closes: #7656
* event: Added Events after evidence validation; evidence: refactored AddEvidence
Added context and Metrics as parameter for the pool constructor
* evidence: pushed event firing into evidence pool and added metrics to represent the size of the evpool
* state: fixed parameters of evpool mock functions
* evidence: added test to confirm events are generated
* Removed obsolete EvidenceEventPublisher interface
* evidence: pool removed error on missing eventbus
This change adds logic to double the message delay bound after every 10 rounds. Alternatives to this somewhat magic number were discussed. Specifically, whether or not to make '10' modifiable as a parameter was discussed. Since this behavior only exists to ensure liveness in the case that these values were poorly chosen to begin with, a method to configure this value was not created. Chains that notice many 'untimely' rounds per the [relevant metric](https://github.com/tendermint/tendermint/pull/7709) are expected to take action to increase the configured message delay to more accurately match the conditions of the network.
closes: https://github.com/tendermint/spec/issues/371
This pull request merges in the changes for implementing Proposer-based timestamps into `master`. The power was primarily being done in the `wb/proposer-based-timestamps` branch, with changes being merged into that branch during development. This pull request represents an amalgamation of the changes made into that development branch. All of the changes that were placed into that branch have been cleanly rebased on top of the latest `master`. The changes compile and the tests pass insofar as our tests in general pass.
### Note To Reviewers
These changes have been extensively reviewed during development. There is not much new here. In the interest of making effective use of time, I would recommend against trying to perform a complete audit of the changes presented and instead examine for mistakes that may have occurred during the process of rebasing the changes. I gave the complete change set a first pass for any issues, but additional eyes would be very appreciated.
In sum, this change set does the following:
closes#6942
merges in #6849
There are no further uses of this package anywhere in Tendermint.
All the uses in the Cosmos SDK are for types that now work correctly with the
standard encoding/json package.
## What does this pull request do?
This pull requests adds two metrics intended for use in calculating an experimental value for `MessageDelay`.
The metrics are as follows:
```
# HELP tendermint_consensus_complete_prevote_message_delay Difference in seconds between the proposal timestamp and the timestamp of the prevote that achieved 100% of the voting power in the prevote step.
# TYPE tendermint_consensus_complete_prevote_message_delay gauge
tendermint_consensus_complete_prevote_message_delay{chain_id="test-chain-aZbwF1"} 0.013025505
# HELP tendermint_consensus_quorum_prevote_message_delay Difference in seconds between the proposal timestamp and the timestamp of the prevote that achieved a quorum in the prevote step.
# TYPE tendermint_consensus_quorum_prevote_message_delay gauge
tendermint_consensus_quorum_prevote_message_delay{chain_id="test-chain-aZbwF1"} 0.013025505
```
## Why this change?
For more information on what these metrics are calculating, see #7202. The aim is to merge to backport these metrics to v0.34 and run nodes on a few popular chains with these metrics to determine the experimental values for `MessageDelay` on these popular chains and use these to select our default `SynchronyParams.MessageDelay` value.
## Why Gauges for the metrics?
Gauges allow us to overwrite the metric on each successive observation. We can then capture these metrics over time to track the highest and lowest observed value.