The dial routines perform network i/o, which is a blocking call into the kernel. These routines are completely unable to do anything else while the dial occurs, so for most of their lifecycle they are sitting idle waiting for the tcp stack to hand them data. We should increase this value by _a lot_ to enable more concurrent dials. This is unlikely to cause CPU starvation because these routines sit idle most of the time. The current value causes dials to occur _way_ too slowly.
Below is a graph demonstrating the before and after of this change in a testnetwork with many dead peers. You can observe that the rate that we connect to new, valid peers, is _much_ higher than previously. Change was deployed around the 31 minute mark on the graph.

Closes#8069
* Type `ABCIResponses` was just wrapping type `ResponseFinalizeBlock`. This patch removes the former.
* Did some renaming to avoid confusion on the data structure we are working with.
* We also remove any stale ABCIResponses we may have in the state store at the time of pruning
**IMPORTANT**: There is an undesirable side-effect of the unwrapping. An empty `ResponseFinalizeBlock` yields a 0-length proto-buf serialized buffer. This was not the case with `ABCIResponses`. I have added an interim solution, but open for suggestions on more elegant ones.
* abci:mempoolError from ResponseCheckTx
* responseCheckTx returns an error if Tendermint decides not to accept an app after CheckTx
*updated spec, upgrading.md and changelog.md
This pull requests adds the protocol buffer field for the `ABCI.VoteExtensionsEnableHeight` parameter. This proto field is threaded throughout all of the relevant places where consensus params are used and referenced.
This PR also adds validation of the consensus param updates. Previous consensus param changes didn't depend on _previous_ versions of the params, so this change adds a method for validating against the old params as well.
closes: #8453
This PR makes vote extensions optional within Tendermint. A new ConsensusParams field, called ABCIParams.VoteExtensionsEnableHeight, has been added to toggle whether or not extensions should be enabled or disabled depending on the current height of the consensus engine. Related to: #8453
Closes: #8575
This PR aims to fix the `LastCommitRound can only be negative for initial height 0` issue we see in the e2e tests by initializing the `state` object before starting the receive routines in the consensus reactor. This is somewhat inelegant, but it should fix the issue.
* Fix lock sequencing in socket client request tracking.
It is not safe to check base service state (IsRunning) while holding the lock
for the client state. If we do, then during shutdown we may deadlock with the
invocation of the OnStop handler, which the base service executes while holding
the service lock.
* Enqueue pending requests before sending them to the server.
If we don't do this, the server can reply before the request lands in the
queue. That will cause the receiver to terminate early for an unsolicited
response. So enqueue first: This is safe because we're doing it in the same
routine as services the channel, so we won't take another message till we are
safely past that point.
* Document what we did.
* Fix socket paths in tests.
In #3435 we allowed this timeout to override the global write timeout.
But after #8570 this meant we were applying a shorter timeout by default.
Don't do the patch if the timeout is already unlimited.
This is a temporary workaround; in light of #8561 I plan to get rid of this
option entirely during the v0.37 cycle, but meanwhile we should keep existing
use more or less coherent.
* rpc: rework timeouts to be per-method instead of global
Prior to this change, we set a 10-second global timeout for all RPC methods
using the net/http Server type's WriteTimeout. This meant that any request
whose handler did not return within that period would simply drop the
connection to the client.
This timeout is too short for a default, as evidenced by issues like [1] and
[2]. In addition, the mode of failure on the client side is confusing; it
shows up as a dropped connection (EOF) rather than a meaningful error from the
service. More importantly, various methods have diffent constraints: Some
should be able to return quickly, others may need to adjust based on the
application workload.
This is a first step toward supporting configurable timeouts. This change:
- Removes the server-wide default global timeout, and instead:
- Wires up a default context timeout for all RPC handlers.
- Increases the default timeout from 10s to 60s.
- Adds a hook to override this per-method as needed.
This does NOT expose the timeouts in the configuration file (yet).
[1] https://github.com/osmosis-labs/osmosis/issues/1391
[2] https://github.com/tendermint/tendermint/issues/8465
## What does this change do?
This pull request completes the change to the `metricsgen` metrics. It adds `go generate` directives to all of the files containing the `Metrics` structs.
Using the outputs of `metricsdiff` between these generated metrics and `master`, we can see that there is not a diff between the two sets of metrics when run locally.
```
[william@sidewinder] tendermint[wb/metrics-gen-transition]:. ◆ ./scripts/metricsgen/metricsdiff/metricsdiff metrics_master metrics_generated
[william@sidewinder] tendermint[wb/metrics-gen-transition]:. ◆
```
This change also adds parsing for a `metrics:` key in a field comment. If a comment line begins with `//metrics:` the rest of the line is interpreted to be the metric help text. Additionally, a bug where lists of labels were not properly quoted in the `metricsgen` rendered output was fixed.
Although we index block.height for blocks in the KV indexer, this reserved
attribute was not previously exposed to the event subscription API. Despite
being advertised in the OpenAPI spec, neither the old (websocket) nor new
(events) query interface could see it. This change exposes block.height to the
/events API.
In addition: Remove a non-public constant from types (finalize_block). This
value is used only as an internal tag by the indexer, and should not be exposed
to users of the public interface. (We could probably drop it entirely, as it
was previously a disambiguator for BeginBlock vs. EndBlock events, but keeping
a tag here simplifies the cleanup).
This pull request adds an additional set of metrics targeted at providing more visibility into `abci++`.
The following set of metrics are added and exposed through the `metrics` endpoint:
```
tendermint_consensus_proposal_receive_count{chain_id="test-chain-IrF74Y",status="accepted"} 34
tendermint_consensus_proposal_create_count{chain_id="test-chain-IrF74Y"} 34
tendermint_consensus_vote_extension_receive_count{chain_id="test-chain-IrF74Y",status="accepted"} 34
tendermint_consensus_round_voting_power_percent{chain_id="test-chain-IrF74Y",vote_type="precommit"} 1
tendermint_consensus_round_voting_power_percent{chain_id="test-chain-IrF74Y",vote_type="prevote"} 1
tendermint_state_consensus_param_updates{chain_id="test-chain-IrF74Y"} 0
tendermint_state_validator_set_updates{chain_id="test-chain-IrF74Y"} 0
tendermint_consensus_late_votes{chain_id="test-chain-IrF74Y",vote_type="precommit"} 16
```
This pull request also updates the `metrics.md` file to include some metrics that were previously missed. My hope is to generate the `metrics.md` file with a future version of the tool being architected in #8479
This pull requests adds a new tool, metricsgen, for generating Tendermint metrics constructors from `Metrics` struct definitions. This tool aims to reduce the amount of boilerplate required to add additional metrics to Tendermint.
Its working is fairly simple, it parses the go ast, extracts field information, and uses this field information to execute a go template.
This pull request also adds a proof-of-concept of the tool's output and working by using it to generate the [indexer metrics](https://github.com/tendermint/tendermint/pull/8479/files#diff-4b0c597b6fa05332a2f9a8e0ce079e360602942fae99dc5485f1edfe71c0a29e) using `//go:generate` directives and a simple `make` target.
The next steps for this tool are documented in https://github.com/tendermint/tendermint/issues/8485 and https://github.com/tendermint/tendermint/issues/8486, which detail using the tool to generate the `metrics.md` documentation file and using the tool to migrate away from `go-kit`.
* Split vote verification/validation based on vote extensions
Some parts of the code need vote extensions to be verified and
validated (mostly in consensus), and other parts of the code don't
because its possible that, in some cases (as per RFC 017), we won't have
vote extensions.
This explicitly facilitates that split.
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Only sign extensions in precommits, not prevotes
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Update privval/file.go
Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
* Apply suggestions from code review
Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
* Temporarily disable extension requirement again for E2E testing
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Reorganize comment for clarity
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Leave vote validation and pre-call nil check up to caller of VoteToProto
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Split complex vote validation test into multiple tests
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Universally enforce no vote extensions on any vote type but precommits
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Make error messages more generic
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Verify with vote extensions when constructing a VoteSet
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Expand comment for clarity
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Add extension check for prevotes prior to signing votes
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Fix supporting test code to only inject extensions into precommits
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Separate vote malleation from signing in vote tests for clarity
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Add extension signature length check and corresponding test
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Perform basic vote validation in CommitToVoteSet
Signed-off-by: Thane Thomson <connect@thanethomson.com>
Co-authored-by: M. J. Fromberger <fromberger@interchain.io>