9245 Commits

Author SHA1 Message Date
William Banfield
29c075f722 cherry-pick fixup 2021-11-30 10:16:02 -05:00
William Banfield
095df5cc1c docs: add abci timing metrics to the metrics docs (#7311) 2021-11-30 10:12:12 -05:00
William Banfield
f6de2f5851 lint++ 2021-11-30 10:11:44 -05:00
William Banfield
07d12d9dfe internal/proxy: add initial set of abci metrics (#7115)
This PR adds an initial set of metrics for use with ABCI. The initial metrics enable the calculation of timing histograms and call counts for each of the ABCI methods. The metrics are also labeled as either 'sync' or 'async' to indicate whether the method call was performed using ABCI's `*Async` methods.

An example of these metrics is included here for reference:
```
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.0001"} 0
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.0004"} 5
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.002"} 12
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.009"} 13
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.02"} 13
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.1"} 13
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.65"} 13
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="2"} 13
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="6"} 13
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="25"} 13
tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="+Inf"} 13
tendermint_abci_connection_method_timing_sum{chain_id="ci",method="commit",type="sync"} 0.007802058000000001
tendermint_abci_connection_method_timing_count{chain_id="ci",method="commit",type="sync"} 13
```

These metrics can easily be graphed using Prometheus's `histogram_quantile(...)` function to pick out a particular quantile to graph or examine. I chose buckets that roughly estimate the expected range of times for ABCI operations. They start at 0.0001 seconds and range up to 25 seconds. The hope is that this range captures enough possible times to be useful for us and operators.
2021-11-30 09:55:09 -05:00
M. J. Fromberger
1c1ce83e2d Performance improvements for the event query API (#7338)
A manual backport of #7319 and #7336.
2021-11-30 06:23:27 -08:00
mergify[bot]
3e41def0eb docs: go tutorial fixed for 0.35.0 version (#7329) (#7330) (#7331)
(cherry picked from commit a36dd49eae)

Co-authored-by: Piotr Pędziwiatr <84311757+ppedziwiatr@users.noreply.github.com>
2021-11-27 09:27:16 -08:00
mergify[bot]
333d7f7068 Update example code in OpenAPI docs. (#7318) (#7322)
The event examples for the query filter language were not updated after the
change of key and value types from []byte to string. Also, the attributes need
to be a slice not a bare value.

(cherry picked from commit da3449599f)

Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
2021-11-25 08:33:07 -08:00
M. J. Fromberger
778be06de8 pubsub: Report a non-nil error when shutting down. (#7310)
If a subscriber arrives while the pubsub service is shutting down, the existing
code will return a nil subscription without error. With unlucky timing, this
may lead to a nil indirection panic in the RPC service.

To avoid that problem, make sure that when a subscription fails for this
reason, we report a non-nil error so that the client will detect it and give up
gracefully.
2021-11-23 12:26:22 -08:00
M. J. Fromberger
a97b081df1 Partial backport of protobuf generation changes. (#7302)
This is a manual backport of the changes to how we build and run the protobuf
toolchain images in Docker. The main effect here is to point to the new image
from ghcr.io/tendermint/docker-proto-builder, but to make that work it is also
necessary to update some of the branch pointers.

This change does NOT include the changes from #7269 and #7291 to point to the
proto files in the spec repo. To do that, we will need to create a branch or
tag on the spec that has the released version, which does not exist in the spec
history as it currently stands.
2021-11-22 12:03:24 -08:00
dependabot[bot]
b506801b2a build(deps): Bump github.com/tendermint/tm-db from 0.6.4 to 0.6.6 (#7285) 2021-11-16 17:29:09 +01:00
mergify[bot]
97f888ea30 Add v0.35 to the configs for building the docs website. (#7055) (#7278)
(cherry picked from commit b3b1279d1f)

Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
2021-11-14 20:46:18 -08:00
Thane Thomson
035da42a91 rpc: backport experimental buffer size control parameters from #7230 (tm v0.35.x) (#7276)
* Update error message to correspond to changes in v0.34.x
* Add buffer size and client-close config parameters

Signed-off-by: Thane Thomson <connect@thanethomson.com>
2021-11-13 13:48:45 -08:00
mergify[bot]
37e0779d6d p2p: reduce peer score for dial failures (backport #7265) (#7271)
* p2p: reduce peer score for dial failures (#7265)

When dialing fails, we should reduce the score of the peer,
which (potentially) increases the chance of the peer being removed
from the peer manager, and reduces the chance of the peer being
gossiped by the PEX reactor.

(cherry picked from commit 27560cf7a4)

Co-authored-by: Sam Kleinman <garen@tychoish.com>
2021-11-10 11:10:06 -05:00
mergify[bot]
38f9078435 evidence: remove source of non-determinism from test (#7266) (#7268)
The evidence test produces a set of mock evidence in the evidence pool of the 'Primary' node. The test then fills the evidence pools of secondaries with half of this mock evidence. Finally, the test waits until the secondary has an evidence pool as full as the primary.

The assertions that are removed here were checking that the primary and secondaries' evidence channels were empty. However, nothing in the test actually ensures that the channels are empty. The test only waits for the secondaries to have received the complete set of evidence, and the secondaries already received half of the evidence at the beginning. It's more than possible that the secondaries can receive the complete set of evidence and not finish reading the duplicate evidence off the channels.

(cherry picked from commit 4acd117b5e)

Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
2021-11-09 16:36:00 -05:00
mergify[bot]
052b08160a Set a cap on the length of subscription queries. (backport #7263) (#7264)
As a safety measure, don't allow a query string to be unreasonably
long. The query filter is not especially efficient, so a query that
needs more than basic detail should filter coarsely in the subscriber
and refine on the client side.

This affects Subscribe and TxSearch queries.

(cherry picked from commit 9dc3d7f9a2)
2021-11-09 19:53:44 +01:00
mergify[bot]
4a664931b4 consensus: add some more checks to vote counting (#7253) (#7262)
(cherry picked from commit b3b90f820c)

Co-authored-by: Callum Waters <cmwaters19@gmail.com>
2021-11-09 10:47:38 -05:00
dependabot[bot]
cf018baa88 build(deps): Bump github.com/lib/pq from 1.10.3 to 1.10.4 (#7260)
Bumps [github.com/lib/pq](https://github.com/lib/pq) from 1.10.3 to 1.10.4.
- [Release notes](https://github.com/lib/pq/releases)
- [Commits](https://github.com/lib/pq/compare/v1.10.3...v1.10.4)

---
updated-dependencies:
- dependency-name: github.com/lib/pq
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-09 08:09:35 -05:00
M. J. Fromberger
643a3f56f6 backport: Add basic metrics to the indexer package. (#7250) (#7252) 2021-11-08 06:48:38 -08:00
dependabot[bot]
1c5bb6e921 build(deps): Bump google.golang.org/grpc from 1.41.0 to 1.42.0 (#7218)
Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.41.0 to 1.42.0.
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.41.0...v1.42.0)

This required a patch to Unix-domain socket addresses.
Per https://github.com/grpc/grpc/blob/master/doc/naming.md, the socket path
using the unix://... address format must be absolute. This recently started
being enforced in the library. That change was not documented.

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
2021-11-05 19:11:09 -07:00
M. J. Fromberger
97a3e44e07 Prepare CHANGELOG and config settings for v0.35.0 release. (#7228) v0.35.0 2021-11-04 08:22:39 -07:00
mergify[bot]
d021d068da docs: add upgrading info about node service (#7241) (#7242) 2021-11-04 15:10:15 +01:00
Sam Kleinman
d59565d050 pex: avoid starting reactor twice (#7239) 2021-11-04 07:58:26 -04:00
Sam Kleinman
003d15fa4b lint: cleanup branch lint errors (#7238) 2021-11-04 07:44:13 -04:00
dependabot[bot]
8629d31de3 build(deps): Bump github.com/rs/zerolog from 1.25.0 to 1.26.0 (#7222)
Bumps [github.com/rs/zerolog](https://github.com/rs/zerolog) from 1.25.0 to 1.26.0.
- [Release notes](https://github.com/rs/zerolog/releases)
- [Commits](https://github.com/rs/zerolog/compare/v1.25.0...v1.26.0)

---
updated-dependencies:
- dependency-name: github.com/rs/zerolog
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-03 12:28:52 -04:00
dependabot[bot]
113bebb314 build(deps): Bump github.com/golangci/golangci-lint (#7224)
Bumps [github.com/golangci/golangci-lint](https://github.com/golangci/golangci-lint) from 1.42.1 to 1.43.0.
- [Release notes](https://github.com/golangci/golangci-lint/releases)
- [Changelog](https://github.com/golangci/golangci-lint/blob/master/CHANGELOG.md)
- [Commits](https://github.com/golangci/golangci-lint/compare/v1.42.1...v1.43.0)

---
updated-dependencies:
- dependency-name: github.com/golangci/golangci-lint
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-03 11:19:06 -04:00
Sam Kleinman
5591d08b4a ci: backport lint configuration changes (#7226)
(cherry picked from commit 6f66c60397)
2021-11-03 15:41:41 +01:00
dependabot[bot]
56607d406b build(deps): Bump github.com/adlio/schema from 1.1.13 to 1.1.14 (#7217)
Bumps [github.com/adlio/schema](https://github.com/adlio/schema) from 1.1.13 to 1.1.14.
- [Release notes](https://github.com/adlio/schema/releases)
- [Commits](https://github.com/adlio/schema/compare/v1.1.13...v1.1.14)

---
updated-dependencies:
- dependency-name: github.com/adlio/schema
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-03 10:22:33 -04:00
mergify[bot]
5fca090e6a pex: allow disabled pex reactor (backport #7198) (#7201)
This ensures the implementation respects disabling the pex reactor.

(cherry picked from commit ffcd347ef6)

Co-authored-by: Sam Kleinman <garen@tychoish.com>
2021-11-03 09:41:11 -04:00
M. J. Fromberger
3e9ecd8197 Prepare changelog for v0.35.0-rc4. (#7181) v0.35.0-rc4 2021-10-29 09:57:23 -07:00
Sam Kleinman
e40a8468a4 config: backport file writing changes (#7182) 2021-10-29 06:38:52 -04:00
M. J. Fromberger
85086d7452 Fix metric cardinality left over from backport (#7180)
One of the patched uses in #7161 missed the message type field,
triggering panic failures from Prometheus.
2021-10-28 15:29:53 -07:00
mergify[bot]
8314f24d79 pubsub: Use distinct client IDs for test subscriptions. (#7178) (#7179)
Fixes #7176. Some of the benchmarks create a bunch of different subscriptions all sharing the same query. These were all using the same client ID, which violates one of the subscriber rules. Ensure each subscriber gets a unique ID.

This has been broken as long as this library has been in the repo—I tracked it back to bb9aa85d and it was already failing there, so I think this never really worked. I'm not sure these test anything useful, but at least now they run.

(cherry picked from commit 1fd7060542)

Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
2021-10-28 05:59:39 -04:00
mergify[bot]
dd1471da91 p2p: add message type into the send/recv bytes metrics (backport #7155) (#7161)
* p2p: add message type into the send/recv bytes metrics (#7155)

This pull request adds a new "message_type" label to the send/recv bytes metrics calculated in the p2p code.

Below is a snippet of the updated metrics that includes the updated label:
```
tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_HasVote",peer_id="2551a13ed720101b271a5df4816d1e4b3d3bd133"} 652
tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_HasVote",peer_id="4b1068420ef739db63377250553562b9a978708a"} 631
tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_HasVote",peer_id="927c50a5e508c747830ce3ba64a3f70fdda58ef2"} 631
tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_NewRoundStep",peer_id="2551a13ed720101b271a5df4816d1e4b3d3bd133"} 393
tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_NewRoundStep",peer_id="4b1068420ef739db63377250553562b9a978708a"} 357
tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_NewRoundStep",peer_id="927c50a5e508c747830ce3ba64a3f70fdda58ef2"} 386
```

(cherry picked from commit b4bc6bb4e8)
2021-10-27 07:34:24 -04:00
mergify[bot]
c6d62cc8b2 docs: fix broken links and layout (#7154) (#7163)
This PR makes a few minor touch-ups to the docs.

(cherry picked from commit ce89292712)

Co-authored-by: Callum Waters <cmwaters19@gmail.com>
2021-10-27 05:14:35 -04:00
mergify[bot]
ce6014ddf5 docs: add reactor sections (backport #6510) (#7151) 2021-10-22 18:29:33 +02:00
mergify[bot]
e62a75b627 state: add height assertion to rollback function (#7143) (#7148)
(cherry picked from commit a8ff617773)

Co-authored-by: Callum Waters <cmwaters19@gmail.com>
2021-10-21 18:07:51 +02:00
mergify[bot]
dbc72e0d69 mempool: remove panic when recheck-tx was not sent to ABCI application (#7134) (#7142)
This pull request fixes a panic that exists in both mempools. The panic occurs when the ABCI client misses a response from the ABCI application. This happens when the ABCI client drops the request as a result of a full client queue. The fix here was to loop through the ordered list of recheck-tx in the callback until one matches the currently observed recheck request.
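
The matching loop can be sketched as follows, assuming the pending recheck transactions are kept as an ordered slice and a cursor marks the next expected one. The function `advanceRecheck` and the string-typed transactions are simplifications for illustration, not the mempool's actual data structures.

```go
package main

import "fmt"

// advanceRecheck scans the ordered pending list from cursor for the
// transaction named in a recheck response. Entries that are skipped are
// assumed to be requests that were dropped and will never get a response.
// It returns the position after the match, or the unchanged cursor and
// false when the response matches nothing that is still pending.
func advanceRecheck(pending []string, cursor int, respTx string) (int, bool) {
	for i := cursor; i < len(pending); i++ {
		if pending[i] == respTx {
			return i + 1, true
		}
	}
	return cursor, false
}

func main() {
	pending := []string{"tx-a", "tx-b", "tx-c", "tx-d"}
	cursor := 0
	// Responses for tx-a and tx-b never arrive; tx-c is the first one seen.
	for _, resp := range []string{"tx-c", "tx-d", "tx-unknown"} {
		next, ok := advanceRecheck(pending, cursor, resp)
		fmt.Printf("response %s: matched=%v cursor %d -> %d\n", resp, ok, cursor, next)
		cursor = next
	}
}
```

Skipping forward instead of insisting that the response match the entry at the cursor is what removes the need to panic on a mismatch.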

(cherry picked from commit b0130c88fb)

Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
2021-10-19 10:21:47 -04:00
mergify[bot]
57e4e18ba3 build: Fix build-docker to include the full context. (#7114) (#7116)
Fixes #7068. The build-docker rule relies on being able to run make
build-linux, but did not pull the Makefile into the build context.
There are various ways to fix this, but this was probably the smallest.

(cherry picked from commit 6538776e6a)

Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
2021-10-12 16:50:34 -07:00
mergify[bot]
b7fe214b81 Revert "abci: change client to use multi-reader mutexes (#6306)" (backport #7106) (#7110)
* Revert "abci: change client to use multi-reader mutexes (#6306)" (#7106)

This reverts commit 1c4dbe30d4.

(cherry picked from commit 34a3fcd8fc)
2021-10-12 12:03:00 -04:00
mergify[bot]
66e8eec194 light: Update links in package docs. (#7099) (#7101)
Fixes #7098. The light client documentation moved to the spec repository.

I was not able to figure out what happened to light-client-protocol.md: it was removed in #5252, but no corresponding file exists in the spec repository. Since the spec also discusses the protocol, this change simply links to the spec and removes the non-functional reference.

Alternatively we could link to the top-level [light client doc](https://docs.tendermint.com/master/tendermint-core/light-client.html) if you think that's better.

(cherry picked from commit 48295955ed)

Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
2021-10-11 19:49:25 -07:00
mergify[bot]
22e33aba98 e2e: light nodes should use builtin abci app (#7095) (#7097)
(cherry picked from commit befd669794)

Co-authored-by: Sam Kleinman <garen@tychoish.com>
2021-10-09 00:32:53 -04:00
mergify[bot]
af85f7e917 e2e: abci protocol should be consistent across networks (#7078) (#7086)
It seems weird in retrospect that we allow networks to contain
applications that use different ABCI protocols.

(cherry picked from commit f2a8f5e054)

Co-authored-by: Sam Kleinman <garen@tychoish.com>
2021-10-08 10:15:15 -04:00
mergify[bot]
f0cd54825f cli: allow node operator to rollback last state (backport #7033) (#7081) 2021-10-08 09:56:18 +02:00
M. J. Fromberger
98bc4f0e2b Update changelog for v0.35.0-rc3. (#7074) v0.35.0-rc3 2021-10-06 11:19:26 -07:00
mergify[bot]
bff85fc07b mempool,rpc: add removetx rpc method (#7047) (#7065)
Addresses one of the concerns with #7041.

Provides a mechanism (via the RPC interface) to delete a single transaction, identified by its hash, from the mempool. The method returns an error if the transaction cannot be found. Once the transaction is removed, it remains in the cache and cannot be resubmitted until the cache is cleared or the entry expires from the cache.

(cherry picked from commit 851d2e3bde)

Co-authored-by: Sam Kleinman <garen@tychoish.com>
2021-10-05 16:36:21 -04:00
mergify[bot]
4a952885c5 e2e: automatically prune old app snapshots (#7034) (#7063)
This PR tackles the case of using the e2e application in a long-lived testnet. The application continually saves snapshots (usually every 100 blocks), which over time bloats the size of the application. This PR prunes older snapshots so that only the 10 most recent snapshots remain.
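
The retention rule can be sketched as follows, assuming snapshots are identified by block height. The function `pruneSnapshots` is a hypothetical helper for illustration; the e2e application's actual snapshot store is not shown in this message.

```go
package main

import (
	"fmt"
	"sort"
)

// pruneSnapshots splits snapshot heights into the keep most recent ones and
// the older ones that should be deleted. The input slice is not modified.
func pruneSnapshots(heights []uint64, keep int) (kept, pruned []uint64) {
	if keep < 0 {
		keep = 0
	}
	sorted := append([]uint64(nil), heights...)
	// Newest (highest) first.
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] > sorted[j] })
	if len(sorted) <= keep {
		return sorted, nil
	}
	return sorted[:keep], sorted[keep:]
}

func main() {
	var heights []uint64
	for h := uint64(100); h <= 1200; h += 100 {
		heights = append(heights, h) // one snapshot every 100 blocks
	}
	kept, pruned := pruneSnapshots(heights, 10)
	fmt.Println("kept:  ", kept)
	fmt.Println("pruned:", pruned)
}
```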

(cherry picked from commit 5703ae2fb3)

Co-authored-by: Callum Waters <cmwaters19@gmail.com>
2021-10-05 15:26:08 -04:00
William Banfield
42ed5d75a5 consensus: wait until peerUpdates channel is closed to close remaining peers (#7058) (#7060)
The race occurred as a result of a goroutine launched by `processPeerUpdate` racing with the `OnStop` method. The `processPeerUpdates` goroutine deletes from the map as `OnStop` is reading from it. This change updates the `OnStop` method to wait for the peer updates channel to be done before closing the peers. It also copies the map contents to a new map so that it will not conflict with the view of the map that the goroutine created in `processPeerUpdate` sees.
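
A simplified sketch of the shutdown ordering described above. The `reactor` type here is a hypothetical stand-in that owns its own update channel; the real consensus reactor and its peer-update subscription are more involved.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// reactor is a hypothetical stand-in for the consensus reactor.
type reactor struct {
	mu      sync.Mutex
	peers   map[string]struct{}
	updates chan string   // IDs of peers that went down
	done    chan struct{} // closed once processPeerUpdates has returned
}

func newReactor(ids ...string) *reactor {
	r := &reactor{
		peers:   make(map[string]struct{}),
		updates: make(chan string),
		done:    make(chan struct{}),
	}
	for _, id := range ids {
		r.peers[id] = struct{}{}
	}
	go r.processPeerUpdates()
	return r
}

// processPeerUpdates removes peers from the map as updates arrive.
func (r *reactor) processPeerUpdates() {
	defer close(r.done)
	for id := range r.updates {
		r.mu.Lock()
		delete(r.peers, id)
		r.mu.Unlock()
	}
}

// OnStop first waits for the update goroutine to finish, then works on a
// copy of the peer map so that nothing else can mutate what it iterates.
func (r *reactor) OnStop() []string {
	close(r.updates)
	<-r.done

	r.mu.Lock()
	snapshot := make(map[string]struct{}, len(r.peers))
	for id := range r.peers {
		snapshot[id] = struct{}{}
	}
	r.mu.Unlock()

	closed := make([]string, 0, len(snapshot))
	for id := range snapshot {
		closed = append(closed, id) // a real reactor would stop the peer here
	}
	sort.Strings(closed)
	return closed
}

func main() {
	r := newReactor("a", "b", "c")
	r.updates <- "b" // peer b goes down before shutdown
	fmt.Println("peers closed on stop:", r.OnStop())
}
```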
2021-10-05 10:49:26 -04:00
M. J. Fromberger
be684091ae Revert "Consolidate related changelog entries. (#7056)" (#7061)
This reverts commits:
  c16cd72c0a
  6ef847fdfe

We decided on another release candidate to sort out SDK merge issues.
2021-10-05 06:38:23 -07:00
M. J. Fromberger
c16cd72c0a Consolidate related changelog entries. (#7056) 2021-10-04 14:05:25 -07:00
M. J. Fromberger
6ef847fdfe Consolidate release candidate changelogs for v0.35. (#7052) 2021-10-04 12:22:32 -07:00