Commit Graph

29 Commits

Author SHA1 Message Date
Callum Waters
99a7ac84dc metrics: add separate statesync and blocksync metrics (#9682) 2022-11-10 19:13:15 +01:00
William Banfield
f2c32c9b3e metrics: fix panic because of absent prometheus label (#9455)
Absence of this label causes a panic because the setters try to access the label despite it never being added to the metric. This PR adds the label to the metrics, thus preventing the panic.

#### PR checklist

- [ ] Tests written/updated, or no tests needed
- [ ] `CHANGELOG_PENDING.md` updated, or no changelog entry needed
- [ ] Updated relevant documentation (`docs/`) and code comments, or no
      documentation updates needed
2022-09-21 17:07:07 +00:00
Thane Thomson
cceea4de22 chore: Format and fix lints (#9336)
* make format

Signed-off-by: Thane Thomson <connect@thanethomson.com>

* Fix linting directives

Signed-off-by: Thane Thomson <connect@thanethomson.com>

* make mockery

Signed-off-by: Thane Thomson <connect@thanethomson.com>

* Appease CI linter

Signed-off-by: Thane Thomson <connect@thanethomson.com>

* Appease CI linter

Signed-off-by: Thane Thomson <connect@thanethomson.com>

Signed-off-by: Thane Thomson <connect@thanethomson.com>
2022-08-30 12:28:46 -04:00
Sergio Mena
50b5c23d88 Merge branch 'feature/abci++ppp' 2022-08-22 17:16:17 +02:00
William Banfield
1069ffc6aa config: backport the rename of fastsync to blocksync (#9259)
This is largely a cherry pick of #6755 with some additional fixups added where detected. 
This change moves the blockchain package to a package called blocksync. Additionally, it renames the relevant uses of the term `fastsync` to `blocksync`.

closes: #9227 

#### PR checklist

- [ ] Tests written/updated, or no tests needed
- [x] `CHANGELOG_PENDING.md` updated, or no changelog entry needed
- [x] Updated relevant documentation (`docs/`) and code comments, or no
      documentation updates needed
2022-08-17 15:19:20 +00:00
William Banfield
7bd86ec004 consensus: backport abci and consensus metrics (#9273)
Partial backport of #8480
2022-08-17 09:37:45 -04:00
William Banfield
69845bb44e metrics: fixup after cherry-pick (#9195)
* fixup after cherry-pick

* cherry-pick fixups
2022-08-08 15:55:01 -04:00
William Banfield
10e1ac8fea metrics: transition all metrics to using metricsgen generated constructors. (port of #8488) (#9178)
This pull request completes the change to the `metricsgen` metrics. It adds `go generate` directives to all of the files containing the `Metrics` structs.

Using the outputs of `metricsdiff` between these generated metrics and `main`, we can see that there is a minimal diff between the two sets of metrics when run locally. The diff here stems from removal of the word 'message' which was done in v0.36+ and is ultimately a better phrasing. This metric has not yet been released, so this phrasing is preferred.

```
./metricsdiff old new
Metric changes:
+++ tendermint_consensus_full_prevote_delay
+++ tendermint_consensus_quorum_prevote_delay
--- tendermint_consensus_full_prevote_message_delay
--- tendermint_consensus_quorum_prevote_message_delay
```

This change also adds parsing for a `metrics:` key in a field comment. If a comment line begins with `//metrics:` the rest of the line is interpreted to be the metric help text. Additionally, a bug where lists of labels were not properly quoted in the `metricsgen` rendered output was fixed.

In my view, docs and tests are not needed for this internal only change.

---
#### PR checklist

- [ ] Tests written/updated, or no tests needed
- [ ] `CHANGELOG_PENDING.md` updated, or no changelog entry needed
- [ ] Updated relevant documentation (`docs/`) and code comments, or no
      documentation updates needed

Ports #8488 to main
2022-08-08 16:53:00 +00:00
William Banfield
608933b73e consensus: additional timing metrics (port of #7849) (#9168)
This change introduces an additional set of metrics aimed at helping operators understand the timing for consensus.

This change adds the following metrics:

```
tendermint_consensus_round_duration_seconds_bucket{chain_id="test-chain-IrF74Y",le="0.1"} 29
tendermint_consensus_round_duration_seconds_bucket{chain_id="test-chain-IrF74Y",le="0.2682695795279726"} 29
tendermint_consensus_round_duration_seconds_bucket{chain_id="test-chain-IrF74Y",le="0.7196856730011522"} 29
tendermint_consensus_round_duration_seconds_bucket{chain_id="test-chain-IrF74Y",le="1.9306977288832508"} 29
tendermint_consensus_round_duration_seconds_bucket{chain_id="test-chain-IrF74Y",le="5.1794746792312125"} 29
tendermint_consensus_round_duration_seconds_bucket{chain_id="test-chain-IrF74Y",le="13.894954943731381"} 29
tendermint_consensus_round_duration_seconds_bucket{chain_id="test-chain-IrF74Y",le="37.27593720314942"} 29
tendermint_consensus_round_duration_seconds_bucket{chain_id="test-chain-IrF74Y",le="100.00000000000006"} 29
tendermint_consensus_round_duration_seconds_bucket{chain_id="test-chain-IrF74Y",le="+Inf"} 29
tendermint_consensus_round_duration_seconds_sum{chain_id="test-chain-IrF74Y"} 0.028651869999999996
tendermint_consensus_round_duration_seconds_count{chain_id="test-chain-IrF74Y"} 29
```

```
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Commit",le="0.1"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Commit",le="0.2682695795279726"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Commit",le="0.7196856730011522"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Commit",le="1.9306977288832508"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Commit",le="5.1794746792312125"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Commit",le="13.894954943731381"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Commit",le="37.27593720314942"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Commit",le="100.00000000000006"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Commit",le="+Inf"} 29
tendermint_consensus_step_duration_seconds_sum{chain_id="test-chain-IrF74Y",step="Commit"} 0.26650875
tendermint_consensus_step_duration_seconds_count{chain_id="test-chain-IrF74Y",step="Commit"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewHeight",le="0.1"} 0
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewHeight",le="0.2682695795279726"} 0
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewHeight",le="0.7196856730011522"} 0
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewHeight",le="1.9306977288832508"} 28
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewHeight",le="5.1794746792312125"} 28
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewHeight",le="13.894954943731381"} 28
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewHeight",le="37.27593720314942"} 28
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewHeight",le="100.00000000000006"} 28
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewHeight",le="+Inf"} 28
tendermint_consensus_step_duration_seconds_sum{chain_id="test-chain-IrF74Y",step="NewHeight"} 27.773921702
tendermint_consensus_step_duration_seconds_count{chain_id="test-chain-IrF74Y",step="NewHeight"} 28
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewRound",le="0.1"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewRound",le="0.2682695795279726"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewRound",le="0.7196856730011522"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewRound",le="1.9306977288832508"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewRound",le="5.1794746792312125"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewRound",le="13.894954943731381"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewRound",le="37.27593720314942"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewRound",le="100.00000000000006"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="NewRound",le="+Inf"} 29
tendermint_consensus_step_duration_seconds_sum{chain_id="test-chain-IrF74Y",step="NewRound"} 0.168961052
tendermint_consensus_step_duration_seconds_count{chain_id="test-chain-IrF74Y",step="NewRound"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Precommit",le="0.1"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Precommit",le="0.2682695795279726"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Precommit",le="0.7196856730011522"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Precommit",le="1.9306977288832508"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Precommit",le="5.1794746792312125"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Precommit",le="13.894954943731381"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Precommit",le="37.27593720314942"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Precommit",le="100.00000000000006"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Precommit",le="+Inf"} 29
tendermint_consensus_step_duration_seconds_sum{chain_id="test-chain-IrF74Y",step="Precommit"} 0.06414115999999999
tendermint_consensus_step_duration_seconds_count{chain_id="test-chain-IrF74Y",step="Precommit"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Prevote",le="0.1"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Prevote",le="0.2682695795279726"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Prevote",le="0.7196856730011522"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Prevote",le="1.9306977288832508"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Prevote",le="5.1794746792312125"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Prevote",le="13.894954943731381"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Prevote",le="37.27593720314942"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Prevote",le="100.00000000000006"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Prevote",le="+Inf"} 29
tendermint_consensus_step_duration_seconds_sum{chain_id="test-chain-IrF74Y",step="Prevote"} 0.177714525
tendermint_consensus_step_duration_seconds_count{chain_id="test-chain-IrF74Y",step="Prevote"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Propose",le="0.1"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Propose",le="0.2682695795279726"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Propose",le="0.7196856730011522"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Propose",le="1.9306977288832508"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Propose",le="5.1794746792312125"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Propose",le="13.894954943731381"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Propose",le="37.27593720314942"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Propose",le="100.00000000000006"} 29
tendermint_consensus_step_duration_seconds_bucket{chain_id="test-chain-IrF74Y",step="Propose",le="+Inf"} 29
tendermint_consensus_step_duration_seconds_sum{chain_id="test-chain-IrF74Y",step="Propose"} 0.221851927
tendermint_consensus_step_duration_seconds_count{chain_id="test-chain-IrF74Y",step="Propose"} 29
```


```
tendermint_consensus_block_gossip_parts_received{chain_id="test-chain-IrF74Y",matches_current="true"} 29
```


---

#### PR checklist

- [x] Tests written/updated, or no tests needed
- [ ] `CHANGELOG_PENDING.md` updated, or no changelog entry needed
- [x] Updated relevant documentation (`docs/`) and code comments, or no
      documentation updates needed

Closes: #9166
2022-08-05 21:24:02 +00:00
mergify[bot]
f36cc80568 consensus: calculate prevote message delay metric (backport #7551) (#7617)
* consensus: calculate prevote message delay metric (#7551)

## What does this pull request do?
This pull requests adds two metrics intended for use in calculating an experimental value for `MessageDelay`.

The metrics are as follows:
```
# HELP tendermint_consensus_complete_prevote_message_delay Difference in seconds between the proposal timestamp and the timestamp of the prevote that achieved 100% of the voting power in the prevote step.
# TYPE tendermint_consensus_complete_prevote_message_delay gauge
tendermint_consensus_complete_prevote_message_delay{chain_id="test-chain-aZbwF1"} 0.013025505

# HELP tendermint_consensus_quorum_prevote_message_delay Difference in seconds between the proposal timestamp and the timestamp of the prevote that achieved a quorum in the prevote step.
# TYPE tendermint_consensus_quorum_prevote_message_delay gauge
tendermint_consensus_quorum_prevote_message_delay{chain_id="test-chain-aZbwF1"} 0.013025505
```

## Why this change?

 For more information on what these metrics are calculating, see #7202. The aim is to merge to backport these metrics to v0.34 and run nodes on a few popular chains with these metrics to determine the experimental values for `MessageDelay` on these popular chains and use these to select our default `SynchronyParams.MessageDelay` value.

## Why Gauges for the metrics?
Gauges allow us to overwrite the metric on each successive observation. We can then capture these metrics over time to track the highest and lowest observed value.

(cherry picked from commit 0c82ceaa5f)

# Conflicts:
#	consensus/metrics.go
#	consensus/state.go

* fix merge conflicts

Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
Co-authored-by: William Banfield <wbanfield@gmail.com>
2022-01-19 12:10:18 -05:00
Marko
4787c5b61a metrics: switch from gauge to histogram (#5326)
## Description

Part of the issue is to add metrics to the websocket connection. It seems this would require some moving around of things in the node pkg. I opted to not make this change now, and wait for when we do a node pkg refactor. 

If someone disagrees with this appraoch please let me know, I can attempt to get metrics into the rpc layer.

There is not a need to update documentation as it already states this metric is a histogram..

Closes: #1791
2020-09-03 13:08:13 +00:00
Marko
f7d4fafa73 docs: add missing metrics (#5325)
## Description

Add missing metrics. 

`Blockchain/v2` exposes metrics for events. I don't find these as something a node operator should utilize as it does not bring insight. 

Closes: #XXX
2020-09-03 07:50:31 +00:00
Erik Grinaker
511ab6717c add state sync reactor (#4705)
Fixes #828. Adds state sync, as outlined in [ADR-053](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-053-state-sync-prototype.md). See related PRs in Cosmos SDK (https://github.com/cosmos/cosmos-sdk/pull/5803) and Gaia (https://github.com/cosmos/gaia/pull/327).

This is split out of the previous PR #4645, and branched off of the ABCI interface in #4704. 

* Adds a new P2P reactor which exchanges snapshots with peers, and bootstraps an empty local node from remote snapshots when requested.

* Adds a new configuration section `[statesync]` that enables state sync and configures the light client. Also enables `statesync:info` logging by default.

* Integrates state sync into node startup. Does not support the v2 blockchain reactor, since it needs some reorganization to defer startup.
2020-04-29 10:47:00 +02:00
Marko
7368ba358b prometheus/metrics: three new metrics for consensus (#4263)
* prometheus/metrics: two new metrics for consensus

- add consensus_validator_power metric so if a node is a validator it can see its own power in prometheus
- add last_signed_height metric so if a node is a validator it can be aware of at which height the most recent time it signed.

- closes: #3773
- closes: #3083

Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>

* check if signature is present

* minor change and check if sig is not absent

* add changelog entry

* Update consensus/state.go

Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>

* Update CHANGELOG_PENDING.md

Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>

* Update CHANGELOG_PENDING.md

* add label with validator address

* change address to vali_address

* add metric missed blocks

* add changelog entry for missed blocks metric

* address naming & add docs
2019-12-19 18:04:57 +01:00
Anton Kaliaev
dcb8f88525 add "chain_id" label for all metrics (#3123)
* add "chain_id" label for all metrics

Refs #3082

* fix labels extraction
2019-01-15 12:16:33 -05:00
Matthew Slipper
92343ef484 Add additional metrics (#2500)
* Add additional metrics
Continues addressing https://github.com/cosmos/cosmos-sdk/issues/2169.
* Add nop metrics to fix NPE
* Tweak buckets, code review
* Update buckets
* Update docs with new metrics
* Code review updates
2018-10-10 18:27:43 +02:00
Matthew Slipper
587116dae1 metrics: Add additional metrics to p2p and consensus (#2425)
* Add additional metrics to p2p and consensus
Partially addresses https://github.com/cosmos/cosmos-sdk/issues/2169.
* WIP
* Updates from code review
* Updates from code review
* Add instrumentation namespace to configuration
* Fix test failure
* Updates from code review
* Add quotes
* Add atomic load
* Use storeint64
* Use addInt64 in writePacketMsgTo
2018-09-25 13:14:38 +02:00
Zarko Milosevic
f99e4010f2 Add stats related channel between consensus state and reactor (#2388) 2018-09-21 14:36:48 -04:00
Anton Kaliaev
788474d08d change consensus_block_interval_seconds metric type to gauge
Otherwise, it's impossible to see outliers
https://github.com/tendermint/tendermint/issues/1835#issuecomment-402054099
2018-09-18 12:16:01 +04:00
Anton Kaliaev
205d8b8062 fixes after @xla review
- move prometheus metrics into internal packages
- *Option structs
- misc. format changes
2018-06-20 12:40:25 +04:00
Anton Kaliaev
e4bb3566a0 move metrics constructors to a separate package 2018-06-20 12:40:25 +04:00
Anton Kaliaev
b10b0da3fd bundle imports 2018-06-20 12:40:11 +04:00
Anton Kaliaev
19699d644f p2p metric, make height and totalTxs gauges 2018-06-20 12:38:45 +04:00
Anton Kaliaev
0cb50c05fc add rounds metric 2018-06-20 12:38:45 +04:00
Anton Kaliaev
e58d674f4c add validators power gauges 2018-06-20 12:38:45 +04:00
Anton Kaliaev
fad76e103b extract metrics to provider, remove height label 2018-06-20 12:38:45 +04:00
Anton Kaliaev
489d9b9184 more metrics 2018-06-20 12:38:45 +04:00
Anton Kaliaev
5c869b5888 validator metrics 2018-06-20 12:38:45 +04:00
Anton Kaliaev
5c7093cc9f go-kit metrics plus prometheus: one metric 2018-06-20 12:38:45 +04:00