consensus: additional timing metrics (port of #7849) (#9168)

This change introduces an additional set of metrics aimed at helping operators understand the timing for consensus.

This change adds the following metrics:

```
tendermint_consensus_round_duration_seconds_bucket{chain_id="test-chain-IrF74Y",le="0.1"} 29
tendermint_consensus_round_duration_seconds_bucket{chain_id="test-chain-IrF74Y",le="0.2682695795279726"} 29
tendermint_consensus_round_duration_seconds_bucket{chain_id="test-chain-IrF74Y",le="0.7196856730011522"} 29
tendermint_consensus_round_duration_seconds_bucket{chain_id="test-chain-IrF74Y",le="1.9306977288832508"} 29
tendermint_consensus_round_duration_seconds_bucket{chain_id="test-chain-IrF74Y",le="5.1794746792312125"} 29
tendermint_consensus_round_duration_seconds_bucket{chain_id="test-chain-IrF74Y",le="13.894954943731381"} 29
tendermint_consensus_round_duration_seconds_bucket{chain_id="test-chain-IrF74Y",le="37.27593720314942"} 29
tendermint_consensus_round_duration_seconds_bucket{chain_id="test-chain-IrF74Y",le="100.00000000000006"} 29
tendermint_consensus_round_duration_seconds_bucket{chain_id="test-chain-IrF74Y",le="+Inf"} 29
tendermint_consensus_round_duration_seconds_sum{chain_id="test-chain-IrF74Y"} 0.028651869999999996
tendermint_consensus_round_duration_seconds_count{chain_id="test-chain-IrF74Y"} 29
```

```
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Commit",le="0.1"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Commit",le="0.2682695795279726"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Commit",le="0.7196856730011522"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Commit",le="1.9306977288832508"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Commit",le="5.1794746792312125"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Commit",le="13.894954943731381"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Commit",le="37.27593720314942"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Commit",le="100.00000000000006"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Commit",le="+Inf"} 29
tendermint_consensus_step_duration_seconds_sum{chain_id="test-chain-IrF74Y",step="Commit"} 0.26650875
tendermint_consensus_step_duration_seconds_count{chain_id="test-chain-IrF74Y",step="Commit"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewHeight",le="0.1"} 0
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewHeight",le="0.2682695795279726"} 0
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewHeight",le="0.7196856730011522"} 0
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewHeight",le="1.9306977288832508"} 28
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewHeight",le="5.1794746792312125"} 28
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewHeight",le="13.894954943731381"} 28
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewHeight",le="37.27593720314942"} 28
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewHeight",le="100.00000000000006"} 28
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewHeight",le="+Inf"} 28
tendermint_consensus_step_duration_seconds_sum{chain_id="test-chain-IrF74Y",step="NewHeight"} 27.773921702
tendermint_consensus_step_duration_seconds_count{chain_id="test-chain-IrF74Y",step="NewHeight"} 28
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewRound",le="0.1"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewRound",le="0.2682695795279726"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewRound",le="0.7196856730011522"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewRound",le="1.9306977288832508"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewRound",le="5.1794746792312125"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewRound",le="13.894954943731381"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewRound",le="37.27593720314942"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewRound",le="100.00000000000006"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewRound",le="+Inf"} 29
tendermint_consensus_step_duration_seconds_sum{chain_id="test-chain-IrF74Y",step="NewRound"} 0.168961052
tendermint_consensus_step_duration_seconds_count{chain_id="test-chain-IrF74Y",step="NewRound"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Precommit",le="0.1"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Precommit",le="0.2682695795279726"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Precommit",le="0.7196856730011522"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Precommit",le="1.9306977288832508"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Precommit",le="5.1794746792312125"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Precommit",le="13.894954943731381"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Precommit",le="37.27593720314942"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Precommit",le="100.00000000000006"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Precommit",le="+Inf"} 29
tendermint_consensus_step_duration_seconds_sum{chain_id="test-chain-IrF74Y",step="Precommit"} 0.06414115999999999
tendermint_consensus_step_duration_seconds_count{chain_id="test-chain-IrF74Y",step="Precommit"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Prevote",le="0.1"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Prevote",le="0.2682695795279726"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Prevote",le="0.7196856730011522"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Prevote",le="1.9306977288832508"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Prevote",le="5.1794746792312125"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Prevote",le="13.894954943731381"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Prevote",le="37.27593720314942"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Prevote",le="100.00000000000006"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Prevote",le="+Inf"} 29
tendermint_consensus_step_duration_seconds_sum{chain_id="test-chain-IrF74Y",step="Prevote"} 0.177714525
tendermint_consensus_step_duration_seconds_count{chain_id="test-chain-IrF74Y",step="Prevote"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Propose",le="0.1"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Propose",le="0.2682695795279726"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Propose",le="0.7196856730011522"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Propose",le="1.9306977288832508"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Propose",le="5.1794746792312125"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Propose",le="13.894954943731381"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Propose",le="37.27593720314942"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Propose",le="100.00000000000006"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Propose",le="+Inf"} 29
tendermint_consensus_step_duration_seconds_sum{chain_id="test-chain-IrF74Y",step="Propose"} 0.221851927
tendermint_consensus_step_duration_seconds_count{chain_id="test-chain-IrF74Y",step="Propose"} 29
```


```
tendermint_consensus_block_gossip_parts_received{chain_id="test-chain-IrF74Y",matches_current="true"} 29
```


---

#### PR checklist

- [x] Tests written/updated, or no tests needed
- [ ] `CHANGELOG_PENDING.md` updated, or no changelog entry needed
- [x] Updated relevant documentation (`docs/`) and code comments, or no
      documentation updates needed

Closes: #9166
This commit is contained in:
William Banfield
2022-08-05 17:24:02 -04:00
committed by Sam Ricotta
parent 14d3683dc8
commit e39f49dde2
3 changed files with 118 additions and 40 deletions

View File

@@ -18,39 +18,42 @@ Listen address can be changed in the config file (see
The following metrics are available:
| **Name** | **Type** | **Tags** | **Description** |
|----------------------------------------|-----------|---------------|------------------------------------------------------------------------|
| abci_connection_method_timing_seconds | Histogram | method, type | Timings for each of the ABCI methods |
| consensus_height | Gauge | | Height of the chain |
| consensus_validators | Gauge | | Number of validators |
| consensus_validators_power | Gauge | | Total voting power of all validators |
| consensus_validator_power | Gauge | | Voting power of the node if in the validator set |
| consensus_validator_last_signed_height | Gauge | | Last height the node signed a block, if the node is a validator |
| consensus_validator_missed_blocks | Gauge | | Total amount of blocks missed for the node, if the node is a validator |
| consensus_missing_validators | Gauge | | Number of validators who did not sign |
| consensus_missing_validators_power | Gauge | | Total voting power of the missing validators |
| consensus_byzantine_validators | Gauge | | Number of validators who tried to double sign |
| consensus_byzantine_validators_power | Gauge | | Total voting power of the byzantine validators |
| consensus_block_interval_seconds | Histogram | | Time between this and last block (Block.Header.Time) in seconds |
| consensus_rounds | Gauge | | Number of rounds |
| consensus_num_txs | Gauge | | Number of transactions |
| consensus_total_txs | Gauge | | Total number of transactions committed |
| consensus_block_parts | counter | peer_id | number of blockparts transmitted by peer |
| consensus_latest_block_height | gauge | | /status sync_info number |
| consensus_fast_syncing | gauge | | either 0 (not fast syncing) or 1 (syncing) |
| consensus_state_syncing | gauge | | either 0 (not state syncing) or 1 (syncing) |
| consensus_block_size_bytes | Gauge | | Block size in bytes |
| p2p_peers | Gauge | | Number of peers node's connected to |
| p2p_peer_receive_bytes_total | counter | peer_id, chID | number of bytes per channel received from a given peer |
| p2p_peer_send_bytes_total | counter | peer_id, chID | number of bytes per channel sent to a given peer |
| p2p_peer_pending_send_bytes | gauge | peer_id | number of pending bytes to be sent to a given peer |
| p2p_num_txs | gauge | peer_id | number of transactions submitted by each peer_id |
| p2p_pending_send_bytes | gauge | peer_id | amount of data pending to be sent to peer |
| mempool_size | Gauge | | Number of uncommitted transactions |
| mempool_tx_size_bytes | histogram | | transaction sizes in bytes |
| mempool_failed_txs | counter | | number of failed transactions |
| mempool_recheck_times | counter | | number of transactions rechecked in the mempool |
| state_block_processing_time | histogram | | time between BeginBlock and EndBlock in ms |
| **Name** | **Type** | **Tags** | **Description** |
|----------------------------------------|-----------|-----------------|--------------------------------------------------------------------------------------------------------------------------------------------|
| abci_connection_method_timing_seconds | Histogram | method, type | Timings for each of the ABCI methods |
| consensus_height | Gauge | | Height of the chain |
| consensus_validators | Gauge | | Number of validators |
| consensus_validators_power | Gauge | | Total voting power of all validators |
| consensus_validator_power | Gauge | | Voting power of the node if in the validator set |
| consensus_validator_last_signed_height | Gauge | | Last height the node signed a block, if the node is a validator |
| consensus_validator_missed_blocks | Gauge | | Total amount of blocks missed for the node, if the node is a validator |
| consensus_missing_validators | Gauge | | Number of validators who did not sign |
| consensus_missing_validators_power | Gauge | | Total voting power of the missing validators |
| consensus_byzantine_validators | Gauge | | Number of validators who tried to double sign |
| consensus_byzantine_validators_power | Gauge | | Total voting power of the byzantine validators |
| consensus_block_interval_seconds | Histogram | | Time between this and last block (Block.Header.Time) in seconds |
| consensus_rounds | Gauge | | Number of rounds |
| consensus_num_txs | Gauge | | Number of transactions |
| consensus_total_txs | Gauge | | Total number of transactions committed |
| consensus_block_parts | counter | peer_id | number of blockparts transmitted by peer |
| consensus_latest_block_height | gauge | | /status sync_info number |
| consensus_fast_syncing | gauge | | either 0 (not fast syncing) or 1 (syncing) |
| consensus_state_syncing | gauge | | either 0 (not state syncing) or 1 (syncing) |
| consensus_block_size_bytes | Gauge | | Block size in bytes |
| consensus_step_duration | Histogram | step | Histogram of durations for each step in the consensus protocol |
| consensus_round_duration | Histogram | | Histogram of durations for all the rounds that have occurred since the process started |
| consensus_block_gossip_parts_received | Counter | matches_current | Number of block parts received by the node |
| p2p_peers | Gauge | | Number of peers node's connected to |
| p2p_peer_receive_bytes_total | counter | peer_id, chID | number of bytes per channel received from a given peer |
| p2p_peer_send_bytes_total | counter | peer_id, chID | number of bytes per channel sent to a given peer |
| p2p_peer_pending_send_bytes | gauge | peer_id | number of pending bytes to be sent to a given peer |
| p2p_num_txs | gauge | peer_id | number of transactions submitted by each peer_id |
| p2p_pending_send_bytes | gauge | peer_id | amount of data pending to be sent to peer |
| mempool_size | Gauge | | Number of uncommitted transactions |
| mempool_tx_size_bytes | histogram | | transaction sizes in bytes |
| mempool_failed_txs | counter | | number of failed transactions |
| mempool_recheck_times | counter | | number of transactions rechecked in the mempool |
| state_block_processing_time | histogram | | time between BeginBlock and EndBlock in ms |
## Useful queries