Compare commits

39 Commits

Author SHA1 Message Date
William Banfield
750f709b42 wip 2021-09-28 16:42:43 -04:00
William Banfield
c9b775d2f0 wip 2021-09-28 16:10:46 -04:00
William Banfield
a3889ee2cb consensus: remove logic to unlock block on 2/3 prevote for nil (#6954) 2021-09-24 11:19:57 -04:00
William Banfield
87f4beb374 consensus: remove panics from test helper functions (#6969) 2021-09-22 08:56:42 -04:00
Sam Kleinman
b0423e2445 e2e: allow load generator to succeed for short tests (#6952)
This should address last night's failure. We've taken the perspective
of "the load generator shouldn't cause tests to fail" in recent
days/weeks, and I think this is just a next step along that line. The
e2e tests shouldn't test performance. 

I included some comments indicating the ways that this isn't ideal (it
is perhaps not), and I think that if test networks could make
assertions about the required rate, that might be a cool future
improvement (and good, perhaps, for system benchmarking.)
2021-09-16 15:45:51 +00:00
dependabot[bot]
b0684bd300 build(deps): Bump github.com/vektra/mockery/v2 from 2.9.0 to 2.9.3 (#6951)
Bumps [github.com/vektra/mockery/v2](https://github.com/vektra/mockery) from 2.9.0 to 2.9.3.
- [Release notes](https://github.com/vektra/mockery/releases)
- [Changelog](https://github.com/vektra/mockery/blob/master/.goreleaser.yml)
- [Commits](https://github.com/vektra/mockery/compare/v2.9.0...v2.9.3)

---
updated-dependencies:
- dependency-name: github.com/vektra/mockery/v2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-09-16 08:41:46 -04:00
William Banfield
382947ce93 rfc: add performance taxonomy rfc (#6921)
This document attempts to capture and discuss some of the areas of Tendermint that seem to be cited as causing performance issues. I'm hoping to continue to gather feedback and input on this document to better understand what issues Tendermint performance may cause for our users.

The overall goal of this document is to allow the maintainers and community to get a better sense of these issues and to be better able to discuss them and weigh trade-offs about any proposed performance-focused changes. This document does not aim to propose any performance improvements. It does suggest useful places for benchmarks and places where additional metrics would be useful for diagnosing and further understanding Tendermint performance.

Please comment with areas where my reasoning seems off or with additional areas where Tendermint performance may be causing user pain.
2021-09-16 06:13:27 +00:00
Callum Waters
9a7ce08e3e statesync: shut down node when statesync fails (#6944) 2021-09-16 07:43:23 +02:00
Sam Kleinman
55f6d20977 e2e: skip broadcastTxCommit check (#6949)
I think the `Sync` check covers our primary use case, and perhaps we
can turn this back on in the future after some kind of event-system
rewrite, or RPC rewrite that will avoid the serverside timeout.
2021-09-15 21:24:35 +00:00
Sam Kleinman
b9c35c1263 docs: fix openapi yaml lint (#6948)
saw this in the super lint.
2021-09-15 19:29:25 +00:00
Sam Kleinman
f08f72e334 rfc: e2e improvements (#6941) 2021-09-15 15:26:39 -04:00
Callum Waters
e932b469ed e2e: tweak semantics of waitForHeight (#6943) 2021-09-15 20:49:24 +02:00
Callum Waters
5db2a39643 docs: add documentation of unsafe_flush_mempool to openapi (#6947) 2021-09-15 17:28:01 +02:00
Sam Kleinman
6909158933 e2e: reduce load pressure (#6939) 2021-09-14 10:44:30 -04:00
dependabot[bot]
de2cffe7a4 build(deps): Bump codecov/codecov-action from 2.0.3 to 2.1.0 (#6938) 2021-09-14 08:31:41 -04:00
Sam Kleinman
c257cda212 e2e: slow load processes with longer evidence timeouts (#6936)
These are mostly the timeouts that I think we're still hitting in CI. 

At this point, the tests (on master) pass on my local machine (which is quite beefy) so I think this is just the first in (perhaps?) a sequence of changes that attempt to change timeouts and load patterns so that the tests pass in CI more reliably.
2021-09-13 20:57:25 +00:00
M. J. Fromberger
5a49d1b997 RFC 002 Interprocess Communication in Tendermint (#6913)
Communication in Tendermint among consensus nodes, applications, and operator
tools all use different message formats and transport mechanisms.  In some
cases there are multiple options. Having all these options complicates both the
code and the developer experience, and hides bugs. To support a more robust,
trustworthy, and usable system, we should document which communication paths
are essential, which could be removed or reduced in scope, and what we can
improve for the most important use cases.

This document proposes a variety of possible improvements of varying size and
scope. Specific design proposals should get their own documentation.
2021-09-13 15:41:21 -04:00
M. J. Fromberger
e4feb56813 Update CHANGELOG.md for release v0.34.13. (#6935)
Fixes #6933. I forgot to update master after I did the v0.34.13 release.
2021-09-13 18:58:04 +00:00
Sam Kleinman
abbe8209b5 e2e: reduce load volume (#6932) 2021-09-13 13:45:01 -04:00
Sam Kleinman
723bf92ebb ci: skip coverage tasks for test infrastructure (#6934) 2021-09-13 13:23:16 -04:00
Sam Kleinman
ef79241f79 ci: skip coverage for non-go changes (#6927) 2021-09-13 09:06:56 -04:00
Sam Kleinman
3bf0c7a712 e2e: improve p2p mode selection (#6929)
The previous implementation of hybrid set testing, which was entirely my
own creation, was a bit peculiar, and I think this probably clears things up.

The previous implementation had far fewer legacy nodes in hybrid
networks, *and* also for some reason that I can't quite explain,
caused a test case to fail.
2021-09-12 06:30:58 +00:00
Sam Kleinman
055f1b3279 ci: disable codecov patch status check (#6930) 2021-09-10 19:08:04 -04:00
Sam Kleinman
1998cf7e77 e2e: compile tests (#6926) 2021-09-10 13:34:26 -04:00
Sam Kleinman
c3bcf9b180 e2e: test multiple broadcast tx methods (#6925) 2021-09-10 12:03:41 -04:00
Callum Waters
f1b9613301 e2e: increase retain height to at least twice evidence age (#6924) 2021-09-10 16:18:01 +02:00
dependabot[bot]
5d279c93db build(deps): Bump github.com/rs/zerolog from 1.24.0 to 1.25.0 (#6923)
Bumps [github.com/rs/zerolog](https://github.com/rs/zerolog) from 1.24.0 to 1.25.0.
<details>
<summary>Commits</summary>
<ul>
<li><a href="65adfd88ec"><code>65adfd8</code></a> Make Fields method accept both map and slice (<a href="https://github-redirect.dependabot.com/rs/zerolog/issues/352">#352</a>)</li>
<li>See full diff in <a href="https://github.com/rs/zerolog/compare/v1.24.0...v1.25.0">compare view</a></li>
</ul>
</details>
2021-09-10 13:38:55 +00:00
Sam Kleinman
af71f1cbcb e2e: load generation and logging changes (#6912) 2021-09-10 09:26:17 -04:00
Sam Kleinman
1a9bad9dd3 ci: tweak code coverage settings (#6920) 2021-09-09 15:58:52 -04:00
M. J. Fromberger
db690c3b68 rpc: fix hash encoding in JSON parameters (#6813)
The responses from node RPCs encode hash values as hexadecimal strings. This
behaviour is stipulated in our OpenAPI documentation. In some cases, however,
hashes received as JSON parameters were being decoded as byte buffers, as is
the convention for JSON.

This resulted in the confusing situation that a hash reported by one request
(e.g., broadcast_tx_commit) could not be passed as a parameter to another
(e.g., tx) via JSON, without translating the hex-encoded output hash into the
base64 encoding used by JSON for opaque bytes.

Fixes #6802.
2021-09-09 15:51:17 -04:00
Sam Kleinman
0c3601bcac e2e: introduce canonical ordering of manifests (#6918) 2021-09-09 14:31:17 -04:00
Sam Kleinman
816e9b0b49 ci: drop codecov bot (#6917) 2021-09-09 14:04:56 -04:00
Sam Kleinman
2a224fb2bd rfc: database storage engine (#6897) 2021-09-09 12:42:15 -04:00
William Banfield
2a74c9c498 update readme to more accurately reflect the tendermint public api (#6916) 2021-09-08 17:49:36 -04:00
William Banfield
dc0e04d243 rename configuration parameters to use the new blocksync nomenclature (#6896)
The 0.35 release cycle renamed the 'fastsync' functionality to 'blocksync'. This change brings the configuration parameters in line with that change. Namely, it updates the configuration file `[fastsync]` field to be `[blocksync]` and changes the command line flag and config file parameters `--fast-sync` and `fast-sync` to `--enable-block-sync` and `enable-block-sync` respectively.

Error messages were added to help users encountering these changes be able to quickly make the needed update to their files/scripts.

When using the old command line argument for fast-sync, the following is printed:

```
./build/tendermint start --proxy-app=kvstore --consensus.create-empty-blocks=false --fast-sync=false
ERROR: invalid argument "false" for "--fast-sync" flag: --fast-sync has been deprecated, please use --enable-block-sync
```

When using one of the old config file parameters, the following is printed:
```
./build/tendermint start --proxy-app=kvstore --consensus.create-empty-blocks=false
ERROR: error in config file: a configuration parameter named 'fast-sync' was found in the configuration file. The 'fast-sync' parameter has been renamed to 'enable-block-sync', please update the 'fast-sync' field in your configuration file to 'enable-block-sync'
```
2021-09-08 13:58:12 +00:00
William Banfield
63aeb50665 upgrading: add information into the UPGRADING.md for users of the codebase wishing to upgrade (#6898)
* add information on upgrading to the new p2p library

* clarify p2p backwards compatibility

* reorder p2p queue list

* add demo for p2p selection

* fix spacing in upgrading
2021-09-08 09:41:12 -04:00
William Banfield
9b458a1c43 update changelog ahead of v0.35 release (#6893)
This change moves the changelog entries from CHANGELOG_PENDING.md to CHANGELOG.md ahead of the 0.35 release.
2021-09-07 22:38:45 +00:00
M. J. Fromberger
cfe64ed8b6 cleanup: fix order of linters in the golangci-lint config (#6910)
This is a cosmetic change that restores lexicographic order to the selected
linters in the CI config. No change to which linters we run, only putting them
back in order so it's easier to spot the one you care about.
2021-09-07 20:08:46 +00:00
M. J. Fromberger
db6e031a16 doc: fix a typo in the indexing section (#6909) 2021-09-07 18:44:23 +00:00
74 changed files with 3096 additions and 1172 deletions

14
.github/codecov.yml vendored
View File

@@ -5,19 +5,14 @@ coverage:
status:
project:
default:
threshold: 1%
patch: on
threshold: 20%
patch: off
changes: off
github_checks:
annotations: false
comment:
layout: "diff, files"
behavior: default
require_changes: no
require_base: no
require_head: yes
comment: false
ignore:
- "docs"
@@ -25,3 +20,6 @@ ignore:
- "scripts"
- "**/*.pb.go"
- "libs/pubsub/query/query.peg.go"
- "*.md"
- "*.rst"
- "*.yml"

View File

@@ -2,6 +2,9 @@ name: Test Coverage
on:
pull_request:
push:
paths:
- "**.go"
- "!test/"
branches:
- master
- release/**
@@ -50,6 +53,7 @@ jobs:
with:
PATTERNS: |
**/**.go
"!test/"
go.mod
go.sum
- name: install
@@ -72,6 +76,7 @@ jobs:
with:
PATTERNS: |
**/**.go
"!test/"
go.mod
go.sum
- uses: actions/download-artifact@v2
@@ -100,6 +105,7 @@ jobs:
with:
PATTERNS: |
**/**.go
"!test/"
go.mod
go.sum
- uses: actions/download-artifact@v2
@@ -121,7 +127,7 @@ jobs:
- run: |
cat ./*profile.out | grep -v "mode: atomic" >> coverage.txt
if: env.GIT_DIFF
- uses: codecov/codecov-action@v2.0.3
- uses: codecov/codecov-action@v2.1.0
with:
file: ./coverage.txt
if: env.GIT_DIFF

View File

@@ -30,7 +30,7 @@ jobs:
- name: Build
working-directory: test/e2e
# Run make jobs in parallel, since we can't run steps in parallel.
run: make -j2 docker generator runner
run: make -j2 docker generator runner tests
- name: Generate testnets
working-directory: test/e2e

View File

@@ -28,7 +28,7 @@ jobs:
- name: Build
working-directory: test/e2e
# Run two make jobs in parallel, since we can't run steps in parallel.
run: make -j2 docker runner
run: make -j2 docker runner tests
if: "env.GIT_DIFF != ''"
- name: Run CI testnet

View File

@@ -1,14 +1,17 @@
linters:
enable:
- asciicheck
- bodyclose
- deadcode
- depguard
- dogsled
- dupl
- errcheck
- exportloopref
# - funlen
# - gochecknoglobals
# - gochecknoinits
# - gocognit
- goconst
- gocritic
# - gocyclo
@@ -22,11 +25,11 @@ linters:
- ineffassign
# - interfacer
- lll
- misspell
# - maligned
- misspell
- nakedret
- nolintlint
- prealloc
- exportloopref
- staticcheck
- structcheck
- stylecheck
@@ -37,9 +40,6 @@ linters:
- varcheck
# - whitespace
# - wsl
# - gocognit
- nolintlint
- asciicheck
issues:
exclude-rules:

View File

@@ -1,6 +1,173 @@
# Changelog
Friendly reminder, we have a [bug bounty program](https://hackerone.com/tendermint).
Friendly reminder: We have a [bug bounty program](https://hackerone.com/tendermint).
## v0.35
Special thanks to external contributors on this release: @JayT106, @bipulprasad, @alessio, @Yawning, @silasdavis,
@cuonglm, @tanyabouman, @JoeKash, @githubsands, @jeebster, @crypto-facs, @liamsi, and @gotjoshua
### BREAKING CHANGES
- CLI/RPC/Config
- [pubsub/events] \#6634 The `ResultEvent.Events` field is now of type `[]abci.Event` preserving event order instead of `map[string][]string`. (@alexanderbez)
- [config] \#5598 The `test_fuzz` and `test_fuzz_config` P2P settings have been removed. (@erikgrinaker)
- [config] \#5728 `fastsync.version = "v1"` is no longer supported (@melekes)
- [cli] \#5772 `gen_node_key` prints JSON-encoded `NodeKey` rather than ID and does not save it to `node_key.json` (@melekes)
- [cli] \#5777 use hyphen-case instead of snake_case for all cli commands and config parameters (@cmwaters)
- [rpc] \#6019 standardise RPC errors and return the correct status code (@bipulprasad & @cmwaters)
- [rpc] \#6168 Change default sorting to desc for `/tx_search` results (@melekes)
- [cli] \#6282 User must specify the node mode when using `tendermint init` (@cmwaters)
- [state/indexer] \#6382 reconstruct indexer, move txindex into the indexer package (@JayT106)
- [cli] \#6372 Introduce `BootstrapPeers` as part of the new p2p stack. Peers to be connected on startup (@cmwaters)
- [config] \#6462 Move `PrivValidator` configuration out of `BaseConfig` into its own section. (@tychoish)
- [rpc] \#6610 Add MaxPeerBlockHeight into /status rpc call (@JayT106)
- [blocksync/rpc] \#6620 Add TotalSyncedTime & RemainingTime to SyncInfo in /status RPC (@JayT106)
- [rpc/grpc] \#6725 Mark gRPC in the RPC layer as deprecated.
- [blocksync/v2] \#6730 Fast Sync v2 is deprecated, please use v0
- [rpc] Add genesis_chunked method to support paginated and parallel fetching of large genesis documents.
- [rpc/jsonrpc/server] \#6785 `Listen` function updated to take an `int` argument, `maxOpenConnections`, instead of an entire config object. (@williambanfield)
- [rpc] \#6820 Update RPC methods to reflect changes in the p2p layer, disabling support for `UnsafeDialPeers` and `UnsafeDialPeers` when used with the new p2p layer, and changing the response format of the peer list in `NetInfo` for all users.
- [cli] \#6854 Remove deprecated snake case commands. (@tychoish)
- Apps
- [ABCI] \#6408 Change the `key` and `value` fields from `[]byte` to `string` in the `EventAttribute` type. (@alexanderbez)
- [ABCI] \#5447 Remove `SetOption` method from `ABCI.Client` interface
- [ABCI] \#5447 Reset `Oneof` indexes for `Request` and `Response`.
- [ABCI] \#5818 Use protoio for msg length delimitation. Migrates from int64 to uint64 length delimiters.
- [ABCI] \#3546 Add `mempool_error` field to `ResponseCheckTx`. This field will contain an error string if Tendermint encountered an error while adding a transaction to the mempool. (@williambanfield)
- [Version] \#6494 `TMCoreSemVer` has been renamed to `TMVersion`.
- It is not required any longer to set ldflags to set version strings
- [abci/counter] \#6684 Delete counter example app
- Go API
- [pubsub] \#6634 The `Query#Matches` method along with other pubsub methods, now accepts a `[]abci.Event` instead of `map[string][]string`. (@alexanderbez)
- [p2p] \#6618 \#6583 Move `p2p.NodeInfo`, `p2p.NodeID` and `p2p.NetAddress` into `types` to support use in external packages. (@tychoish)
- [node] \#6540 Reduce surface area of the `node` package by making most of the implementation details private. (@tychoish)
- [p2p] \#6547 Move the entire `p2p` package and all reactor implementations into `internal`. (@tychoish)
- [libs/log] \#6534 Remove the existing custom Tendermint logger backed by go-kit. The logging interface, `Logger`, remains. Tendermint still provides a default logger backed by the performant zerolog logger. (@alexanderbez)
- [libs/time] \#6495 Move types/time to libs/time to improve consistency. (@tychoish)
- [mempool] \#6529 The `Context` field has been removed from the `TxInfo` type. `CheckTx` now requires a `Context` argument. (@alexanderbez)
- [abci/client, proxy] \#5673 `Async` funcs return an error, `Sync` and `Async` funcs accept `context.Context` (@melekes)
- [p2p] Remove unused function `MakePoWTarget`. (@erikgrinaker)
- [libs/bits] \#5720 Validate `BitArray` in `FromProto`, which now returns an error (@melekes)
- [proto/p2p] Rename `DefaultNodeInfo` and `DefaultNodeInfoOther` to `NodeInfo` and `NodeInfoOther` (@erikgrinaker)
- [proto/p2p] Rename `NodeInfo.default_node_id` to `node_id` (@erikgrinaker)
- [libs/os] Kill() and {Must,}{Read,Write}File() functions have been removed. (@alessio)
- [store] \#5848 Remove block store state in favor of using the db iterators directly (@cmwaters)
- [state] \#5864 Use an iterator when pruning state (@cmwaters)
- [types] \#6023 Remove `tm2pb.Header`, `tm2pb.BlockID`, `tm2pb.PartSetHeader` and `tm2pb.NewValidatorUpdate`.
- Each of the above types has a `ToProto` and `FromProto` method or function which replaced this logic.
- [light] \#6054 Move `MaxRetryAttempt` option from client to provider.
- `NewWithOptions` now sets the max retry attempts and timeouts (@cmwaters)
- [all] \#6077 Change spelling from British English to American (@cmwaters)
- Rename "Subscription.Cancelled()" to "Subscription.Canceled()" in libs/pubsub
- Rename "behaviour" pkg to "behavior" and internalized it in blocksync v2
- [rpc/client/http] \#6176 Remove `endpoint` arg from `New`, `NewWithTimeout` and `NewWithClient` (@melekes)
- [rpc/client/http] \#6176 Unexpose `WSEvents` (@melekes)
- [rpc/jsonrpc/client/ws_client] \#6176 `NewWS` no longer accepts options (use `NewWSWithOptions` and `OnReconnect` funcs to configure the client) (@melekes)
- [internal/libs] \#6366 Move `autofile`, `clist`,`fail`,`flowrate`, `protoio`, `sync`, `tempfile`, `test` and `timer` lib packages to an internal folder
- [libs/rand] \#6364 Remove most of libs/rand in favour of standard lib's `math/rand` (@liamsi)
- [mempool] \#6466 The original mempool reactor has been versioned as `v0` and moved to a sub-package under the root `mempool` package.
Some core types have been kept in the `mempool` package such as `TxCache` and it's implementations, the `Mempool` interface itself
and `TxInfo`. (@alexanderbez)
- [crypto/sr25519] \#6526 Do not re-execute the Ed25519-style key derivation step when doing signing and verification. The derivation is now done once and only once. This breaks `sr25519.GenPrivKeyFromSecret` output compatibility. (@Yawning)
- [types] \#6627 Move `NodeKey` to types to make the type public.
- [config] \#6627 Extend `config` to contain methods `LoadNodeKeyID` and `LoadorGenNodeKeyID`
- [blocksync] \#6755 Rename `FastSync` and `Blockchain` package to `BlockSync` (@cmwaters)
- Data Storage
- [store/state/evidence/light] \#5771 Use an order-preserving varint key encoding (@cmwaters)
- [mempool] \#6396 Remove mempool's write ahead log (WAL), (previously unused by the tendermint code). (@tychoish)
- [state] \#6541 Move pruneBlocks from consensus/state to state/execution. (@JayT106)
- Tooling
- [tools] \#6498 Set OS home dir to instead of the hardcoded PATH. (@JayT106)
- [cli/indexer] \#6676 Reindex events command line tooling. (@JayT106)
### FEATURES
- [config] Add `--mode` flag and config variable. See [ADR-52](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-052-tendermint-mode.md) @dongsam
- [rpc] \#6329 Don't cap page size in unsafe mode (@gotjoshua, @cmwaters)
- [pex] \#6305 v2 pex reactor with backwards compatability. Introduces two new pex messages to
accomodate for the new p2p stack. Removes the notion of seeds and crawling. All peer
exchange reactors behave the same. (@cmwaters)
- [crypto] \#6376 Enable sr25519 as a validator key type
- [mempool] \#6466 Introduction of a prioritized mempool. (@alexanderbez)
- `Priority` and `Sender` have been introduced into the `ResponseCheckTx` type, where the `priority` will determine the prioritization of
the transaction when a proposer reaps transactions for a block proposal. The `sender` field acts as an index.
- Operators may toggle between the legacy mempool reactor, `v0`, and the new prioritized reactor, `v1`, by setting the
`mempool.version` configuration, where `v1` is the default configuration.
- Applications that do not specify a priority, i.e. zero, will have transactions reaped by the order in which they are received by the node.
- Transactions are gossiped in FIFO order as they are in `v0`.
- [config/indexer] \#6411 Introduce support for custom event indexing data sources, specifically PostgreSQL. (@JayT106)
- [blocksync/event] \#6619 Emit blocksync status event when switching consensus/blocksync (@JayT106)
- [statesync/event] \#6700 Emit statesync status start/end event (@JayT106)
- [inspect] \#6785 Add a new `inspect` command for introspecting the state and block store of a crashed tendermint node. (@williambanfield)
### IMPROVEMENTS
- [libs/log] Console log formatting changes as a result of \#6534 and \#6589. (@tychoish)
- [statesync] \#6566 Allow state sync fetchers and request timeout to be configurable. (@alexanderbez)
- [types] \#6478 Add `block_id` to `newblock` event (@jeebster)
- [crypto/ed25519] \#5632 Adopt zip215 `ed25519` verification. (@marbar3778)
- [crypto/ed25519] \#6526 Use [curve25519-voi](https://github.com/oasisprotocol/curve25519-voi) for `ed25519` signing and verification. (@Yawning)
- [crypto/sr25519] \#6526 Use [curve25519-voi](https://github.com/oasisprotocol/curve25519-voi) for `sr25519` signing and verification. (@Yawning)
- [privval] \#5603 Add `--key` to `init`, `gen_validator`, `testnet` & `unsafe_reset_priv_validator` for use in generating `secp256k1` keys.
- [privval] \#5725 Add gRPC support to private validator.
- [privval] \#5876 `tendermint show-validator` will query the remote signer if gRPC is being used (@marbar3778)
- [abci/client] \#5673 `Async` requests return an error if queue is full (@melekes)
- [mempool] \#5673 Cancel `CheckTx` requests if RPC client disconnects or times out (@melekes)
- [abci] \#5706 Added `AbciVersion` to `RequestInfo` allowing applications to check ABCI version when connecting to Tendermint. (@marbar3778)
- [blocksync/v1] \#5728 Remove blocksync v1 (@melekes)
- [blocksync/v0] \#5741 Relax termination conditions and increase sync timeout (@melekes)
- [cli] \#5772 `gen_node_key` output now contains node ID (`id` field) (@melekes)
- [blocksync/v2] \#5774 Send status request when new peer joins (@melekes)
- [store] \#5888 store.SaveBlock saves using batches instead of transactions for now to improve ACID properties. This is a quick fix for underlying issues around tm-db and ACID guarantees. (@githubsands)
- [consensus] \#5987 and \#5792 Remove the `time_iota_ms` consensus parameter. Merge `tmproto.ConsensusParams` and `abci.ConsensusParams`. (@marbar3778, @valardragon)
- [types] \#5994 Reduce the use of protobuf types in core logic. (@marbar3778)
- `ConsensusParams`, `BlockParams`, `ValidatorParams`, `EvidenceParams`, `VersionParams`, `sm.Version` and `version.Consensus` have become native types. They still utilize protobuf when being sent over the wire or written to disk.
- [rpc/client/http] \#6163 Do not drop events even if the `out` channel is full (@melekes)
- [node] \#6059 Validate and complete genesis doc before saving to state store (@silasdavis)
- [state] \#6067 Batch save state data (@githubsands & @cmwaters)
- [crypto] \#6120 Implement batch verification interface for ed25519 and sr25519. (@marbar3778)
- [types] \#6120 use batch verification for verifying commits signatures.
- If the key type supports the batch verification API it will try to batch verify. If the verification fails we will single verify each signature.
- [privval/file] \#6185 Return error on `LoadFilePV`, `LoadFilePVEmptyState`. Allows for better programmatic control of Tendermint.
- [privval] \#6240 Add `context.Context` to privval interface.
- [rpc] \#6265 set cache control in http-rpc response header (@JayT106)
- [statesync] \#6378 Retry requests for snapshots and add a minimum discovery time (5s) for new snapshots.
- [node/state] \#6370 graceful shutdown in the consensus reactor (@JayT106)
- [crypto/merkle] \#6443 Improve HashAlternatives performance (@cuonglm)
- [crypto/merkle] \#6513 Optimize HashAlternatives (@marbar3778)
- [p2p/pex] \#6509 Improve addrBook.hash performance (@cuonglm)
- [consensus/metrics] \#6549 Change block_size gauge to a histogram for better observability over time (@marbar3778)
- [statesync] \#6587 Increase chunk priority and re-request chunks that don't arrive (@cmwaters)
- [state/privval] \#6578 No GetPubKey retry beyond the proposal/voting window (@JayT106)
- [rpc] \#6615 Add TotalGasUsed to block_results response (@crypto-facs)
- [cmd/tendermint/commands] \#6623 replace `$HOME/.some/test/dir` with `t.TempDir` (@tanyabouman)
- [statesync] \6807 Implement P2P state provider as an alternative to RPC (@cmwaters)
### BUG FIXES
- [privval] \#5638 Increase read/write timeout to 5s and calculate ping interval based on it (@JoeKash)
- [evidence] \#6375 Fix bug with inconsistent LightClientAttackEvidence hashing (cmwaters)
- [rpc] \#6507 Ensure RPC client can handle URLs without ports (@JayT106)
- [statesync] \#6463 Adds Reverse Sync feature to fetch historical light blocks after state sync in order to verify any evidence (@cmwaters)
- [blocksync] \#6590 Update the metrics during blocksync (@JayT106)
## v0.34.13
*September 6, 2021*
This release backports improvements to state synchronization and ABCI
performance under concurrent load, and the PostgreSQL event indexer.
### IMPROVEMENTS
- [statesync] [\#6881](https://github.com/tendermint/tendermint/issues/6881) improvements to stateprovider logic (@cmwaters)
- [ABCI] [\#6873](https://github.com/tendermint/tendermint/issues/6873) change client to use multi-reader mutexes (@tychoish)
- [indexing] [\#6906](https://github.com/tendermint/tendermint/issues/6906) enable the PostgreSQL indexer sink (@creachadair)
## v0.34.12

View File

@@ -9,157 +9,18 @@ Friendly reminder: We have a [bug bounty program](https://hackerone.com/tendermi
### BREAKING CHANGES
- CLI/RPC/Config
- [pubsub/events] \#6634 The `ResultEvent.Events` field is now of type `[]abci.Event` preserving event order instead of `map[string][]string`. (@alexanderbez)
- [config] \#5598 The `test_fuzz` and `test_fuzz_config` P2P settings have been removed. (@erikgrinaker)
- [config] \#5728 `fast_sync = "v1"` is no longer supported (@melekes)
- [cli] \#5772 `gen_node_key` prints JSON-encoded `NodeKey` rather than ID and does not save it to `node_key.json` (@melekes)
- [cli] \#5777 use hyphen-case instead of snake_case for all cli commands and config parameters (@cmwaters)
- [rpc] \#6019 standardise RPC errors and return the correct status code (@bipulprasad & @cmwaters)
- [rpc] \#6168 Change default sorting to desc for `/tx_search` results (@melekes)
- [cli] \#6282 User must specify the node mode when using `tendermint init` (@cmwaters)
- [state/indexer] \#6382 reconstruct indexer, move txindex into the indexer package (@JayT106)
- [cli] \#6372 Introduce `BootstrapPeers` as part of the new p2p stack. Peers to be connected on startup (@cmwaters)
- [config] \#6462 Move `PrivValidator` configuration out of `BaseConfig` into its own section. (@tychoish)
- [rpc] \#6610 Add MaxPeerBlockHeight into /status rpc call (@JayT106)
- [fastsync/rpc] \#6620 Add TotalSyncedTime & RemainingTime to SyncInfo in /status RPC (@JayT106)
- [rpc/grpc] \#6725 Mark gRPC in the RPC layer as deprecated.
- [blockchain/v2] \#6730 Fast Sync v2 is deprecated, please use v0
- [rpc] Add genesis_chunked method to support paginated and parallel fetching of large genesis documents.
- [rpc/jsonrpc/server] \#6785 `Listen` function updated to take an `int` argument, `maxOpenConnections`, instead of an entire config object. (@williambanfield)
- [rpc] \#6820 Update RPC methods to reflect changes in the p2p layer, disabling support for `UnsafeDialPeers` and `UnsafeDialPeers` when used with the new p2p layer, and changing the response format of the peer list in `NetInfo` for all users.
- [cli] \#6854 Remove deprecated snake case commands. (@tychoish)
- Apps
- [ABCI] \#6408 Change the `key` and `value` fields from `[]byte` to `string` in the `EventAttribute` type. (@alexanderbez)
- [ABCI] \#5447 Remove `SetOption` method from `ABCI.Client` interface
- [ABCI] \#5447 Reset `Oneof` indexes for `Request` and `Response`.
- [ABCI] \#5818 Use protoio for msg length delimitation. Migrates from int64 to uint64 length delimiters.
- [ABCI] \#3546 Add `mempool_error` field to `ResponseCheckTx`. This field will contain an error string if Tendermint encountered an error while adding a transaction to the mempool. (@williambanfield)
- [Version] \#6494 `TMCoreSemVer` has been renamed to `TMVersion`.
- It is not required any longer to set ldflags to set version strings
- [abci/counter] \#6684 Delete counter example app
- P2P Protocol
- Go API
- [pubsub] \#6634 The `Query#Matches` method along with other pubsub methods, now accepts a `[]abci.Event` instead of `map[string][]string`. (@alexanderbez)
- [p2p] \#6618 Move `p2p.NodeInfo` into `types` to support use of the SDK. (@tychoish)
- [p2p] \#6583 Make `p2p.NodeID` and `p2p.NetAddress` exported types to support their use in the RPC layer. (@tychoish)
- [node] \#6540 Reduce surface area of the `node` package by making most of the implementation details private. (@tychoish)
- [p2p] \#6547 Move the entire `p2p` package and all reactor implementations into `internal`. (@tychoish)
- [libs/log] \#6534 Remove the existing custom Tendermint logger backed by go-kit. The logging interface, `Logger`, remains. Tendermint still provides a default logger backed by the performant zerolog logger. (@alexanderbez)
- [libs/time] \#6495 Move types/time to libs/time to improve consistency. (@tychoish)
- [mempool] \#6529 The `Context` field has been removed from the `TxInfo` type. `CheckTx` now requires a `Context` argument. (@alexanderbez)
- [abci/client, proxy] \#5673 `Async` funcs return an error, `Sync` and `Async` funcs accept `context.Context` (@melekes)
- [p2p] Remove unused function `MakePoWTarget`. (@erikgrinaker)
- [libs/bits] \#5720 Validate `BitArray` in `FromProto`, which now returns an error (@melekes)
- [proto/p2p] Rename `DefaultNodeInfo` and `DefaultNodeInfoOther` to `NodeInfo` and `NodeInfoOther` (@erikgrinaker)
- [proto/p2p] Rename `NodeInfo.default_node_id` to `node_id` (@erikgrinaker)
- [libs/os] Kill() and {Must,}{Read,Write}File() functions have been removed. (@alessio)
- [store] \#5848 Remove block store state in favor of using the db iterators directly (@cmwaters)
- [state] \#5864 Use an iterator when pruning state (@cmwaters)
- [types] \#6023 Remove `tm2pb.Header`, `tm2pb.BlockID`, `tm2pb.PartSetHeader` and `tm2pb.NewValidatorUpdate`.
- Each of the above types has a `ToProto` and `FromProto` method or function which replaced this logic.
- [light] \#6054 Move `MaxRetryAttempt` option from client to provider.
- `NewWithOptions` now sets the max retry attempts and timeouts (@cmwaters)
- [all] \#6077 Change spelling from British English to American (@cmwaters)
- Rename "Subscription.Cancelled()" to "Subscription.Canceled()" in libs/pubsub
- Rename "behaviour" pkg to "behavior" and internalized it in blockchain v2
- [rpc/client/http] \#6176 Remove `endpoint` arg from `New`, `NewWithTimeout` and `NewWithClient` (@melekes)
- [rpc/client/http] \#6176 Unexpose `WSEvents` (@melekes)
- [rpc/jsonrpc/client/ws_client] \#6176 `NewWS` no longer accepts options (use `NewWSWithOptions` and `OnReconnect` funcs to configure the client) (@melekes)
- [internal/libs] \#6366 Move `autofile`, `clist`,`fail`,`flowrate`, `protoio`, `sync`, `tempfile`, `test` and `timer` lib packages to an internal folder
- [libs/rand] \#6364 Remove most of libs/rand in favour of standard lib's `math/rand` (@liamsi)
- [mempool] \#6466 The original mempool reactor has been versioned as `v0` and moved to a sub-package under the root `mempool` package.
Some core types have been kept in the `mempool` package such as `TxCache` and its implementations, the `Mempool` interface itself
and `TxInfo`. (@alexanderbez)
- [crypto/sr25519] \#6526 Do not re-execute the Ed25519-style key derivation step when doing signing and verification. The derivation is now done once and only once. This breaks `sr25519.GenPrivKeyFromSecret` output compatibility. (@Yawning)
- [types] \#6627 Move `NodeKey` to types to make the type public.
- [config] \#6627 Extend `config` to contain methods `LoadNodeKeyID` and `LoadOrGenNodeKeyID`
- [blocksync] \#6755 Rename `FastSync` and `Blockchain` package to `BlockSync`
(@cmwaters)
- Blockchain Protocol
- Data Storage
- [store/state/evidence/light] \#5771 Use an order-preserving varint key encoding (@cmwaters)
- [mempool] \#6396 Remove mempool's write ahead log (WAL), (previously unused by the tendermint code). (@tychoish)
- [state] \#6541 Move pruneBlocks from consensus/state to state/execution. (@JayT106)
- Tooling
- [tools] \#6498 Use the OS home dir instead of the hardcoded PATH. (@JayT106)
- [cli/indexer] \#6676 Reindex events command line tooling. (@JayT106)
### FEATURES
- [config] Add `--mode` flag and config variable. See [ADR-52](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-052-tendermint-mode.md) @dongsam
- [rpc] \#6329 Don't cap page size in unsafe mode (@gotjoshua, @cmwaters)
- [pex] \#6305 v2 pex reactor with backwards compatibility. Introduces two new pex messages to
accommodate the new p2p stack. Removes the notion of seeds and crawling. All peer
exchange reactors behave the same. (@cmwaters)
- [crypto] \#6376 Enable sr25519 as a validator key
- [mempool] \#6466 Introduction of a prioritized mempool. (@alexanderbez)
- `Priority` and `Sender` have been introduced into the `ResponseCheckTx` type, where the `priority` will determine the prioritization of
the transaction when a proposer reaps transactions for a block proposal. The `sender` field acts as an index.
- Operators may toggle between the legacy mempool reactor, `v0`, and the new prioritized reactor, `v1`, by setting the
`mempool.version` configuration, where `v1` is the default configuration.
- Applications that do not specify a priority, i.e. zero, will have transactions reaped by the order in which they are received by the node.
- Transactions are gossiped in FIFO order as they are in `v0`.
- [config/indexer] \#6411 Introduce support for custom event indexing data sources, specifically PostgreSQL. (@JayT106)
- [fastsync/event] \#6619 Emit fastsync status event when switching consensus/fastsync (@JayT106)
- [statesync/event] \#6700 Emit statesync status start/end event (@JayT106)
- [inspect] \#6785 Add a new `inspect` command for introspecting the state and block store of a crashed tendermint node. (@williambanfield)
### IMPROVEMENTS
- [libs/log] Console log formatting changes as a result of \#6534 and \#6589. (@tychoish)
- [statesync] \#6566 Allow state sync fetchers and request timeout to be configurable. (@alexanderbez)
- [types] \#6478 Add `block_id` to `newblock` event (@jeebster)
- [crypto/ed25519] \#5632 Adopt zip215 `ed25519` verification. (@marbar3778)
- [crypto/ed25519] \#6526 Use [curve25519-voi](https://github.com/oasisprotocol/curve25519-voi) for `ed25519` signing and verification. (@Yawning)
- [crypto/sr25519] \#6526 Use [curve25519-voi](https://github.com/oasisprotocol/curve25519-voi) for `sr25519` signing and verification. (@Yawning)
- [privval] \#5603 Add `--key` to `init`, `gen_validator`, `testnet` & `unsafe_reset_priv_validator` for use in generating `secp256k1` keys.
- [privval] \#5725 Add gRPC support to private validator.
- [privval] \#5876 `tendermint show-validator` will query the remote signer if gRPC is being used (@marbar3778)
- [abci/client] \#5673 `Async` requests return an error if queue is full (@melekes)
- [mempool] \#5673 Cancel `CheckTx` requests if RPC client disconnects or times out (@melekes)
- [abci] \#5706 Added `AbciVersion` to `RequestInfo` allowing applications to check ABCI version when connecting to Tendermint. (@marbar3778)
- [blockchain/v1] \#5728 Remove in favor of v2 (@melekes)
- [blockchain/v0] \#5741 Relax termination conditions and increase sync timeout (@melekes)
- [cli] \#5772 `gen_node_key` output now contains node ID (`id` field) (@melekes)
- [blockchain/v2] \#5774 Send status request when new peer joins (@melekes)
- [consensus] \#5792 Deprecates the `time_iota_ms` consensus parameter, to reduce the bug surface. The parameter is no longer used. (@valardragon)
- [store] \#5888 store.SaveBlock saves using batches instead of transactions for now to improve ACID properties. This is a quick fix for underlying issues around tm-db and ACID guarantees. (@githubsands)
- [consensus] \#5987 Remove `time_iota_ms` from consensus params. Merge `tmproto.ConsensusParams` and `abci.ConsensusParams`. (@marbar3778)
- [types] \#5994 Reduce the use of protobuf types in core logic. (@marbar3778)
- `ConsensusParams`, `BlockParams`, `ValidatorParams`, `EvidenceParams`, `VersionParams`, `sm.Version` and `version.Consensus` have become native types. They still utilize protobuf when being sent over the wire or written to disk.
- [rpc/client/http] \#6163 Do not drop events even if the `out` channel is full (@melekes)
- [node] \#6059 Validate and complete genesis doc before saving to state store (@silasdavis)
- [state] \#6067 Batch save state data (@githubsands & @cmwaters)
- [crypto] \#6120 Implement batch verification interface for ed25519 and sr25519. (@marbar3778)
- [types] \#6120 use batch verification for verifying commits signatures.
- If the key type supports the batch verification API it will try to batch verify. If the verification fails we will single verify each signature.
- [privval/file] \#6185 Return error on `LoadFilePV`, `LoadFilePVEmptyState`. Allows for better programmatic control of Tendermint.
- [privval] \#6240 Add `context.Context` to privval interface.
- [rpc] \#6265 set cache control in http-rpc response header (@JayT106)
- [statesync] \#6378 Retry requests for snapshots and add a minimum discovery time (5s) for new snapshots.
- [node/state] \#6370 graceful shutdown in the consensus reactor (@JayT106)
- [crypto/merkle] \#6443 Improve HashAlternatives performance (@cuonglm)
- [crypto/merkle] \#6513 Optimize HashAlternatives (@marbar3778)
- [p2p/pex] \#6509 Improve addrBook.hash performance (@cuonglm)
- [consensus/metrics] \#6549 Change block_size gauge to a histogram for better observability over time (@marbar3778)
- [statesync] \#6587 Increase chunk priority and re-request chunks that don't arrive (@cmwaters)
- [state/privval] \#6578 No GetPubKey retry beyond the proposal/voting window (@JayT106)
- [rpc] \#6615 Add TotalGasUsed to block_results response (@crypto-facs)
- [cmd/tendermint/commands] \#6623 replace `$HOME/.some/test/dir` with `t.TempDir` (@tanyabouman)
- [statesync] \#6807 Implement P2P state provider as an alternative to RPC (@cmwaters)
### BUG FIXES
- [privval] \#5638 Increase read/write timeout to 5s and calculate ping interval based on it (@JoeKash)
- [blockchain/v1] [\#5701](https://github.com/tendermint/tendermint/pull/5701) Handle peers without blocks (@melekes)
- [blockchain/v1] \#5711 Fix deadlock (@melekes)
- [evidence] \#6375 Fix bug with inconsistent LightClientAttackEvidence hashing (@cmwaters)
- [rpc] \#6507 Ensure RPC client can handle URLs without ports (@JayT106)
- [statesync] \#6463 Adds Reverse Sync feature to fetch historical light blocks after state sync in order to verify any evidence (@cmwaters)
- [fastsync] \#6590 Update the metrics during fast-sync (@JayT106)
- [gitignore] \#6668 Fix gitignore of abci-cli (@tanyabouman)

View File

@@ -82,32 +82,12 @@ and familiarize yourself with our
Tendermint uses [Semantic Versioning](http://semver.org/) to determine when and how the version changes.
According to SemVer, anything in the public API can change at any time before version 1.0.0.
To provide some stability to Tendermint users in these 0.X.X days, the MINOR version is used
to signal breaking changes across a subset of the total public API. This subset includes all
interfaces exposed to other processes (cli, rpc, p2p, etc.), but does not
include the Go APIs.
To provide some stability to users of 0.X.X versions of Tendermint, the MINOR version is used
to signal breaking changes across Tendermint's API. This API includes all
publicly exposed types, functions, and methods in non-internal Go packages as well as
the types and methods accessible via the Tendermint RPC interface.
That said, breaking changes in the following packages will be documented in the
CHANGELOG even if they don't lead to MINOR version bumps:
- crypto
- config
- libs
- bits
- bytes
- json
- log
- math
- net
- os
- protoio
- rand
- sync
- strings
- service
- node
- rpc/client
- types
Breaking changes to these public APIs will be documented in the CHANGELOG.
### Upgrades

View File

@@ -2,7 +2,7 @@
This guide provides instructions for upgrading to specific versions of Tendermint Core.
## Unreleased
## v0.35
### ABCI Changes
@@ -17,7 +17,16 @@ This guide provides instructions for upgrading to specific versions of Tendermin
### Config Changes
* `fast_sync = "v1"` and `fast_sync = "v2"` are no longer supported. Please use `v0` instead.
* The configuration file field `[fastsync]` has been renamed to `[blocksync]`.
* The top level configuration file field `fast-sync` has moved under the new `[blocksync]`
field as `blocksync.enable`.
* `blocksync.version = "v1"` and `blocksync.version = "v2"` (previously `fastsync`)
are no longer supported. Please use `v0` instead. During the v0.35 release cycle, `v0` was
determined to suit the existing needs and the cost of maintaining the `v1` and `v2` modules
was determined to be greater than necessary.
* All config parameters are now hyphen-case (also known as kebab-case) instead of snake_case. Before restarting the node make sure
you have updated all the variables in your `config.toml` file.
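As a rough illustration of the key renaming, the sketch below rewrites snake_case key names to hyphen-case. The `toHyphenCase` helper is hypothetical and not part of Tendermint. It only transforms key names, so it does not account for fields that were moved or renamed in this release (for example `fast-sync`, which is now `blocksync.enable`).

```go
package main

import (
	"fmt"
	"strings"
)

// toHyphenCase is a hypothetical helper (not part of Tendermint) that
// converts a snake_case configuration key to hyphen-case.
func toHyphenCase(key string) string {
	return strings.ReplaceAll(key, "_", "-")
}

func main() {
	// Example key names taken from the old snake_case configuration style.
	for _, key := range []string{"create_empty_blocks", "proxy_app"} {
		fmt.Printf("%s -> %s\n", key, toHyphenCase(key))
	}
}
```

Review the resulting `config.toml` by hand after any automated rewrite, since values and section names may also have changed.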
@@ -35,7 +44,7 @@ This guide provides instructions for upgrading to specific versions of Tendermin
* The fast sync process as well as the blockchain package and service have all
been renamed to block sync
### Key Format Changes
### Database Key Format Changes
The format of all tendermint on-disk database keys changes in
0.35. Upgrading nodes must either re-sync all data or run a migration
@@ -60,6 +69,8 @@ if needed.
* You must now specify the node mode (validator|full|seed) in `tendermint init [mode]`
* The `--fast-sync` command line option has been renamed to `--blocksync.enable`
* If you had previously used `tendermint gen_node_key` to generate a new node
key, keep in mind that it no longer saves the output to a file. You can use
`tendermint init validator` or pipe the output of `tendermint gen_node_key` to
@@ -74,8 +85,8 @@ if needed.
### API Changes
The p2p layer was reimplemented as part of the 0.35 release cycle, and
all reactors were refactored. As part of that work these
The p2p layer was reimplemented as part of the 0.35 release cycle and
all reactors were refactored to accommodate the change. As part of that work these
implementations moved into the `internal` package and are no longer
considered part of the public Go API of tendermint. These packages
are:
@@ -98,13 +109,11 @@ will need to change to accommodate these changes. Most notably:
longer exported and have been replaced with `node.New` and
`node.NewDefault` which provide more functional interfaces.
### RPC changes
#### gRPC Support
### gRPC Support
gRPC in the RPC layer is deprecated and will be removed in 0.36.
#### Peer Management Interface
### Peer Management Interface
When running with the new P2P Layer, the `UnsafeDialSeeds` and
`UnsafeDialPeers` RPC methods will always return an error. They are
@@ -116,6 +125,58 @@ method changes in this release to accommodate the different way that
the new stack tracks data about peers. This change affects users of
both stacks.
### Using the updated p2p library
The P2P library was reimplemented in this release. The new implementation is
enabled by default in this version of Tendermint. The legacy implementation is still
included in this version of Tendermint as a backstop to work around unforeseen
production issues. The new and legacy versions are interoperable. If necessary,
you can enable the legacy implementation in the server configuration file.
To make use of the legacy P2P implementation add or update the following field of
your server's configuration file under the `[p2p]` section:
```toml
[p2p]
...
use-legacy = true
...
```
If you need to do this, please consider filing an issue in the Tendermint repository
to let us know why. We plan to remove the legacy P2P code in the next (v0.36) release.
#### New p2p queue types
The new p2p implementation enables selection of the queue type to be used for
passing messages between peers.
The following values may be used when selecting which queue type to use:
* `fifo`: (**default**) An unbuffered and lossless queue that passes messages through
in the order in which they were received.
* `priority`: A priority queue of messages.
* `wdrr`: A queue implementing the Weighted Deficit Round Robin algorithm. A
weighted deficit round robin queue is created per peer. Each queue contains a
separate 'flow' for each of the channels of communication that exist between any two
peers. Tendermint maintains a channel per message type between peers. Each WDRR
queue maintains a shared buffer with a fixed capacity through which messages on different
flows are passed.
For more information on WDRR scheduling, see: https://en.wikipedia.org/wiki/Deficit_round_robin
To select a queue type, add or update the following field under the `[p2p]`
section of your server's configuration file.
```toml
[p2p]
...
queue-type = "wdrr"
...
```
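To give a feel for how deficit round robin scheduling works, here is a minimal sketch of one scheduling round in Go. It is the textbook algorithm, not Tendermint's WDRR queue implementation: the `message` and `flow` types, the flow names, and the byte sizes are all assumptions made for illustration. Each flow earns `quantum` units of credit per round and may send messages as long as the head of its queue fits within the accumulated credit.

```go
package main

import "fmt"

// message is a hypothetical unit of work; size is its cost in bytes.
type message struct {
	flow string
	size int
}

// flow holds the queued messages for one channel plus its scheduling state.
type flow struct {
	name    string
	quantum int // weight: credit added at the start of each round
	deficit int // unused credit carried over between rounds
	queue   []message
}

// drrRound performs one deficit round robin pass over all flows and
// returns the messages dequeued during this round, in send order.
func drrRound(flows []*flow) []message {
	var sent []message
	for _, f := range flows {
		if len(f.queue) == 0 {
			f.deficit = 0 // idle flows do not accumulate credit
			continue
		}
		f.deficit += f.quantum
		// Send while the head-of-line message fits in the remaining credit.
		for len(f.queue) > 0 && f.queue[0].size <= f.deficit {
			m := f.queue[0]
			f.queue = f.queue[1:]
			f.deficit -= m.size
			sent = append(sent, m)
		}
		if len(f.queue) == 0 {
			f.deficit = 0
		}
	}
	return sent
}

func main() {
	// Two hypothetical flows; the first has twice the weight of the second.
	flows := []*flow{
		{name: "consensus", quantum: 200, queue: []message{
			{"consensus", 100}, {"consensus", 100}, {"consensus", 100}}},
		{name: "mempool", quantum: 100, queue: []message{
			{"mempool", 100}, {"mempool", 100}}},
	}
	for round := 1; round <= 2; round++ {
		for _, m := range drrRound(flows) {
			fmt.Printf("round %d: %s (%d bytes)\n", round, m.flow, m.size)
		}
	}
}
```

The weights determine how much of each round a flow may consume, so a heavily weighted flow cannot be starved by a busy low-priority one, and vice versa.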
### Support for Custom Reactor and Mempool Implementations
The changes to p2p layer removed existing support for custom

View File

@@ -3,6 +3,8 @@ package commands
import (
"bytes"
"crypto/sha256"
"errors"
"flag"
"fmt"
"io"
"os"
@@ -33,7 +35,22 @@ func AddNodeFlags(cmd *cobra.Command) {
"socket address to listen on for connections from external priv-validator process")
// node flags
cmd.Flags().Bool("fast-sync", config.FastSyncMode, "fast blockchain syncing")
cmd.Flags().Bool("blocksync.enable", config.BlockSync.Enable, "enable fast blockchain syncing")
// TODO (https://github.com/tendermint/tendermint/issues/6908): remove this check after the v0.35 release cycle
// This check was added to give users an upgrade prompt to use the new flag for syncing.
//
// The pflag package does not have a native way to print a deprecation warning
// and return an error. This logic was added to print a deprecation message to the user
// and then crash if the user attempts to use the old --fast-sync flag.
fs := flag.NewFlagSet("", flag.ExitOnError)
fs.Func("fast-sync", "deprecated",
func(string) error {
return errors.New("--fast-sync has been deprecated, please use --blocksync.enable")
})
cmd.Flags().AddGoFlagSet(fs)
cmd.Flags().MarkHidden("fast-sync") //nolint:errcheck
cmd.Flags().BytesHexVar(
&genesisHash,
"genesis-hash",
@@ -158,7 +175,7 @@ func checkGenesisHash(config *cfg.Config) error {
// Compare with the flag.
if !bytes.Equal(genesisHash, actualHash) {
return fmt.Errorf(
"--genesis_hash=%X does not match %s hash: %X",
"--genesis-hash=%X does not match %s hash: %X",
genesisHash, config.GenesisFile(), actualHash)
}

View File

@@ -76,7 +76,7 @@ type Config struct {
P2P *P2PConfig `mapstructure:"p2p"`
Mempool *MempoolConfig `mapstructure:"mempool"`
StateSync *StateSyncConfig `mapstructure:"statesync"`
BlockSync *BlockSyncConfig `mapstructure:"fastsync"`
BlockSync *BlockSyncConfig `mapstructure:"blocksync"`
Consensus *ConsensusConfig `mapstructure:"consensus"`
TxIndex *TxIndexConfig `mapstructure:"tx-index"`
Instrumentation *InstrumentationConfig `mapstructure:"instrumentation"`
@@ -152,7 +152,7 @@ func (cfg *Config) ValidateBasic() error {
return fmt.Errorf("error in [statesync] section: %w", err)
}
if err := cfg.BlockSync.ValidateBasic(); err != nil {
return fmt.Errorf("error in [fastsync] section: %w", err)
return fmt.Errorf("error in [blocksync] section: %w", err)
}
if err := cfg.Consensus.ValidateBasic(); err != nil {
return fmt.Errorf("error in [consensus] section: %w", err)
@@ -194,12 +194,6 @@ type BaseConfig struct { //nolint: maligned
// - No priv_validator_key.json, priv_validator_state.json
Mode string `mapstructure:"mode"`
// If this node is many blocks behind the tip of the chain, FastSync
// allows them to catchup quickly by downloading blocks in parallel
// and verifying their commits
// TODO: This should be moved to the blocksync config
FastSyncMode bool `mapstructure:"fast-sync"`
// Database backend: goleveldb | cleveldb | boltdb | rocksdb
// * goleveldb (github.com/syndtr/goleveldb - most popular implementation)
// - pure go
@@ -242,23 +236,24 @@ type BaseConfig struct { //nolint: maligned
// If true, query the ABCI app on connecting to a new peer
// so the app can decide if we should keep the connection or not
FilterPeers bool `mapstructure:"filter-peers"` // false
Other map[string]interface{} `mapstructure:",remain"`
}
// DefaultBaseConfig returns a default base configuration for a Tendermint node
func DefaultBaseConfig() BaseConfig {
return BaseConfig{
Genesis: defaultGenesisJSONPath,
NodeKey: defaultNodeKeyPath,
Mode: defaultMode,
Moniker: defaultMoniker,
ProxyApp: "tcp://127.0.0.1:26658",
ABCI: "socket",
LogLevel: DefaultLogLevel,
LogFormat: log.LogFormatPlain,
FastSyncMode: true,
FilterPeers: false,
DBBackend: "goleveldb",
DBPath: "data",
Genesis: defaultGenesisJSONPath,
NodeKey: defaultNodeKeyPath,
Mode: defaultMode,
Moniker: defaultMoniker,
ProxyApp: "tcp://127.0.0.1:26658",
ABCI: "socket",
LogLevel: DefaultLogLevel,
LogFormat: log.LogFormatPlain,
FilterPeers: false,
DBBackend: "goleveldb",
DBPath: "data",
}
}
@@ -268,7 +263,6 @@ func TestBaseConfig() BaseConfig {
cfg.chainID = "tendermint_test"
cfg.Mode = ModeValidator
cfg.ProxyApp = "kvstore"
cfg.FastSyncMode = false
cfg.DBBackend = "memdb"
return cfg
}
@@ -345,6 +339,28 @@ func (cfg BaseConfig) ValidateBasic() error {
return fmt.Errorf("unknown mode: %v", cfg.Mode)
}
// TODO (https://github.com/tendermint/tendermint/issues/6908) remove this check after the v0.35 release cycle.
// This check was added to give users an upgrade prompt to use the new
// configuration option in v0.35. In future release cycles they should no longer
// be using this configuration parameter so the check can be removed.
// The cfg.Other field can likely be removed at the same time if it is not referenced
// elsewhere as it was added to service this check.
if fs, ok := cfg.Other["fastsync"]; ok {
if _, ok := fs.(map[string]interface{}); ok {
return fmt.Errorf("a configuration section named 'fastsync' was found in the " +
"configuration file. The 'fastsync' section has been renamed to " +
"'blocksync', please update the 'fastsync' field in your configuration file to 'blocksync'")
}
}
if fs, ok := cfg.Other["fast-sync"]; ok {
if fs != "" {
return fmt.Errorf("a parameter named 'fast-sync' was found in the " +
"configuration file. The parameter to enable or disable quickly syncing with a blockchain " +
"has moved to the [blocksync] section of the configuration file as blocksync.enable. " +
"Please move the 'fast-sync' field in your configuration file to 'blocksync.enable'")
}
}
return nil
}
@@ -1005,13 +1021,18 @@ func (cfg *StateSyncConfig) ValidateBasic() error {
//-----------------------------------------------------------------------------
// BlockSyncConfig (formerly known as FastSync) defines the configuration for the Tendermint block sync service
// If this node is many blocks behind the tip of the chain, BlockSync
// allows them to catchup quickly by downloading blocks in parallel
// and verifying their commits.
type BlockSyncConfig struct {
Enable bool `mapstructure:"enable"`
Version string `mapstructure:"version"`
}
// DefaultBlockSyncConfig returns a default configuration for the block sync service
func DefaultBlockSyncConfig() *BlockSyncConfig {
return &BlockSyncConfig{
Enable: true,
Version: BlockSyncV0,
}
}

View File

@@ -97,11 +97,6 @@ moniker = "{{ .BaseConfig.Moniker }}"
# - No priv_validator_key.json, priv_validator_state.json
mode = "{{ .BaseConfig.Mode }}"
# If this node is many blocks behind the tip of the chain, FastSync
# allows them to catchup quickly by downloading blocks in parallel
# and verifying their commits
fast-sync = {{ .BaseConfig.FastSyncMode }}
# Database backend: goleveldb | cleveldb | boltdb | rocksdb | badgerdb
# * goleveldb (github.com/syndtr/goleveldb - most popular implementation)
# - pure go
@@ -465,10 +460,15 @@ fetchers = "{{ .StateSync.Fetchers }}"
#######################################################
### Block Sync Configuration Connections ###
#######################################################
[fastsync]
[blocksync]
# If this node is many blocks behind the tip of the chain, BlockSync
# allows them to catchup quickly by downloading blocks in parallel
# and verifying their commits
enable = {{ .BlockSync.Enable }}
# Block Sync version to use:
# 1) "v0" (default) - the legacy block sync implementation
# 1) "v0" (default) - the standard Block Sync implementation
# 2) "v2" - DEPRECATED, please use v0
version = "{{ .BlockSync.Version }}"

View File

@@ -36,9 +36,7 @@ func TestEnsureRoot(t *testing.T) {
data, err := ioutil.ReadFile(filepath.Join(tmpDir, defaultConfigFilePath))
require.Nil(err)
if !checkConfig(string(data)) {
t.Fatalf("config file missing some information")
}
checkConfig(t, string(data))
ensureFiles(t, tmpDir, "data")
}
@@ -57,9 +55,7 @@ func TestEnsureTestRoot(t *testing.T) {
data, err := ioutil.ReadFile(filepath.Join(rootDir, defaultConfigFilePath))
require.Nil(err)
if !checkConfig(string(data)) {
t.Fatalf("config file missing some information")
}
checkConfig(t, string(data))
// TODO: make sure the cfg returned and testconfig are the same!
baseConfig := DefaultBaseConfig()
@@ -67,16 +63,15 @@ func TestEnsureTestRoot(t *testing.T) {
ensureFiles(t, rootDir, defaultDataDir, baseConfig.Genesis, pvConfig.Key, pvConfig.State)
}
func checkConfig(configFile string) bool {
var valid bool
func checkConfig(t *testing.T, configFile string) {
t.Helper()
// list of words we expect in the config
var elems = []string{
"moniker",
"seeds",
"proxy-app",
"fast_sync",
"create_empty_blocks",
"blocksync",
"create-empty-blocks",
"peer",
"timeout",
"broadcast",
@@ -89,10 +84,7 @@ func checkConfig(configFile string) bool {
}
for _, e := range elems {
if !strings.Contains(configFile, e) {
valid = false
} else {
valid = true
t.Errorf("config file was expected to contain %s but did not", e)
}
}
return valid
}

View File

@@ -62,7 +62,7 @@ be turned off regardless of other values provided.
#### KV
The `kv` indexer type is an embedded key-value store supported by the main
underling Tendermint database. Using the `kv` indexer type allows you to query
underlying Tendermint database. Using the `kv` indexer type allows you to query
for block and transaction events directly against Tendermint's RPC. However, the
query syntax is limited and so this indexer type might be deprecated or removed
entirely in the future.

View File

@@ -36,10 +36,6 @@ proxy-app = "tcp://127.0.0.1:26658"
# A custom human readable name for this node
moniker = "anonymous"
# If this node is many blocks behind the tip of the chain, BlockSync
# allows them to catchup quickly by downloading blocks in parallel
# and verifying their commits
fast-sync = true
# Mode of Node: full | validator | seed (default: "validator")
# * validator node (default)
@@ -356,11 +352,16 @@ temp-dir = ""
#######################################################
### BlockSync Configuration Connections ###
#######################################################
[fastsync]
[blocksync]
# If this node is many blocks behind the tip of the chain, BlockSync
# allows them to catchup quickly by downloading blocks in parallel
# and verifying their commits
enable = true
# Block Sync version to use:
# 1) "v0" (default) - the legacy block sync implementation
# 2) "v2" - complete redesign of v0, optimized for testability & readability
# 1) "v0" (default) - the standard block sync implementation
# 2) "v2" - DEPRECATED, please use v0
version = "v0"
#######################################################

View File

@@ -38,6 +38,9 @@ sections.
## Table of Contents
- [RFC-000: P2P Roadmap](./rfc-000-p2p-roadmap.rst)
- [RFC-000: Storage Engines](./rfc-001-storage-engine.rst)
- [RFC-001: Storage Engines](./rfc-001-storage-engine.rst)
- [RFC-002: Interprocess Communication](./rfc-002-ipc-ecosystem.md)
- [RFC-003: Performance Taxonomy](./rfc-003-performance-questions.md)
- [RFC-004: E2E Test Framework Enhancements](./rfc-004-e2e-framework.md)
<!-- - [RFC-NNN: Title](./rfc-NNN-title.md) -->

View File

@@ -108,7 +108,7 @@ database layer. Users of the data layer shouldn't ever need to interact with
raw byte slices from the database, and should mostly have the experience of
interacting with Go-types.
Badger is more consistently developed and has a broader featureset than
Badger is more consistently developed and has a broader feature set than
Bolt. At the same time, Badger is likely more memory intensive and may have
more overhead in terms of open file handles given its model. At first glance,
Badger is the obvious choice: it's actively developed and it has a lot of

View File

@@ -0,0 +1,420 @@
# RFC 002: Interprocess Communication (IPC) in Tendermint
## Changelog
- 08-Sep-2021: Initial draft (@creachadair).
## Abstract
Communication in Tendermint among consensus nodes, applications, and operator
tools all use different message formats and transport mechanisms. In some
cases there are multiple options. Having all these options complicates both the
code and the developer experience, and hides bugs. To support a more robust,
trustworthy, and usable system, we should document which communication paths
are essential, which could be removed or reduced in scope, and what we can
improve for the most important use cases.
This document proposes a variety of possible improvements of varying size and
scope. Specific design proposals should get their own documentation.
## Background
The Tendermint state replication engine has a complex IPC footprint.
1. Consensus nodes communicate with each other using a networked peer-to-peer
message-passing protocol.
2. Consensus nodes communicate with the application whose state is being
replicated via the [Application BlockChain Interface (ABCI)][abci].
3. Consensus nodes export a network-accessible [RPC service][rpc-service] to
support operations (bootstrapping, debugging) and synchronization of [light clients][light-client].
This interface is also used by the [`tendermint` CLI][tm-cli].
4. Consensus nodes export a gRPC service exposing a subset of the methods of
the RPC service described by (3). This was intended to simplify the
implementation of tools that already use gRPC to communicate with an
application (via the Cosmos SDK), and wanted to also talk to the consensus
node without implementing yet another RPC protocol.
The gRPC interface to the consensus node has been deprecated and is slated
for removal in the forthcoming Tendermint v0.36 release.
5. Consensus nodes may optionally communicate with a "remote signer" that holds
a validator key and can provide public keys and signatures to the consensus
node. One of the stated goals of this configuration is to allow the signer
to be run on a private network, separate from the consensus node, so that a
compromise of the consensus node from the public network would be less
likely to expose validator keys.
## Discussion: Transport Mechanisms
### Remote Signer Transport
A remote signer communicates with the consensus node in one of two ways:
1. "Raw": Using a TCP or Unix-domain socket which carries varint-prefixed
protocol buffer messages. In this mode, the consensus node is the server,
and the remote signer is the client.
This mode has been deprecated, and is intended to be removed.
2. gRPC: This mode uses the same protobuf messages as "Raw" mode, but uses a
standard encrypted gRPC HTTP/2 stub as the transport. In this mode, the
remote signer is the server and the consensus node is the client.
### ABCI Transport
In ABCI, the _application_ is the server, and the Tendermint consensus engine
is the client. Most applications implement the server using the [Cosmos SDK][cosmos-sdk],
which handles low-level details of the ABCI interaction and provides a
higher-level interface to the rest of the application. The SDK is written in Go.
Beneath the SDK, the application communicates with Tendermint core in one of
two ways:
- In-process direct calls (for applications written in Go and compiled against
the Tendermint code). This is an optimization for the common case where an
application is written in Go, to save on the overhead of marshaling and
unmarshaling requests and responses within the same process:
[`abci/client/local_client.go`][local-client]
- A custom remote procedure protocol built on wire-format protobuf messages
using a socket (the "socket protocol"): [`abci/server/socket_server.go`][socket-server]
The SDK also provides a [gRPC service][sdk-grpc] accessible from outside the
application, allowing callers to broadcast transactions to the network, look up
transactions, and simulate transaction costs.
### RPC Transport
The consensus node RPC service allows callers to query consensus parameters
(genesis data, transactions, commits), node status (network info, health
checks), application state (abci_query, abci_info), mempool state, and other
attributes of the node and its application. The service also provides methods
allowing transactions and evidence to be injected ("broadcast") into the
blockchain.
The RPC service is exposed in several ways:
- HTTP GET: Queries may be sent as URI parameters, with method names in the path.
- HTTP POST: Queries may be sent as JSON-RPC request messages in the body of an
HTTP POST request. The server uses a custom implementation of JSON-RPC that
is not fully compatible with the [JSON-RPC 2.0 spec][json-rpc], but handles
the common cases.
- Websocket: Queries may be sent as JSON-RPC request messages via a websocket.
This transport uses more or less the same JSON-RPC plumbing as the HTTP POST
handler.
The websocket endpoint also includes three methods that are _only_ exported
via websocket, which appear to support event subscription.
- gRPC: A subset of queries may be issued in protocol buffer format to the gRPC
interface described above under (4). As noted, this endpoint is deprecated
and will be removed in v0.36.
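As an illustration of the first two forms, the sketch below builds the same query once as a URI for HTTP GET and once as a JSON-RPC body for HTTP POST. It only constructs the requests and does not send them. The base address `http://localhost:26657`, the choice of the `block` method with a `height` parameter, and the helper names are assumptions made for the example; the exact parameter encoding the server accepts may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// rpcRequest is a minimal JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string            `json:"jsonrpc"`
	ID      int               `json:"id"`
	Method  string            `json:"method"`
	Params  map[string]string `json:"params"`
}

// postBody builds the JSON-RPC body for the HTTP POST (or websocket) form.
func postBody(id int, method string, params map[string]string) (string, error) {
	if params == nil {
		params = map[string]string{} // encode as {} rather than null
	}
	b, err := json.Marshal(rpcRequest{JSONRPC: "2.0", ID: id, Method: method, Params: params})
	return string(b), err
}

// getURL builds the HTTP GET form: method name in the path, parameters in
// the query string.
func getURL(base, method string, params map[string]string) string {
	q := url.Values{}
	for k, v := range params {
		q.Set(k, v)
	}
	u := base + "/" + method
	if len(q) > 0 {
		u += "?" + q.Encode()
	}
	return u
}

func main() {
	params := map[string]string{"height": "5"}
	fmt.Println(getURL("http://localhost:26657", "block", params))
	body, err := postBody(1, "block", params)
	fmt.Println(body, err)
}
```

Both forms carry the same information, so removing the GET form costs operators some convenience but no capability.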
### Opportunities for Simplification
**Claim:** There are too many IPC mechanisms.
The preponderance of ABCI usage is via the Cosmos SDK, which means the
application and the consensus node are compiled together into a single binary,
and the consensus node calls the ABCI methods of the application directly as Go
functions.
We also need a true IPC transport to support ABCI applications _not_ written in
Go. There are also several known applications written in Rust, for example
(including [Anoma](https://github.com/anoma/anoma), Penumbra,
[Oasis](https://github.com/oasisprotocol/oasis-core), Twilight, and
[Nomic](https://github.com/nomic-io/nomic)). Ideally we will have at most one
such transport "built-in": More esoteric cases can be handled by a custom proxy.
Pragmatically, gRPC is probably the right choice here.
The primary consumers of the multi-headed "RPC service" today are the light
client and the `tendermint` command-line client. There is probably some local
use via curl, but I expect that is mostly ad hoc. Ethan reports that nodes are
often configured with the ports to the RPC service blocked, which is good for
security but complicates use by the light client.
### Context: Remote Signer Issues
Since the remote signer needs a secure communication channel to exchange keys
and signatures, and is expected to run truly remotely from the node (i.e., on a
separate physical server), there is not a whole lot we can do here. We should
finish the deprecation and removal of the "raw" socket protocol between the
consensus node and remote signers, but the use of gRPC is appropriate.
The main improvement we can make is to simplify the implementation quite a bit,
once we no longer need to support both "raw" and gRPC transports.
### Context: ABCI Issues
In the original design of ABCI, the presumption was that all access to the
application should be mediated by the consensus node. The idea is that outside
access could change application state and corrupt the consensus process, which
relies on the application to be deterministic. Of course, even without outside
access an application could behave nondeterministically, but allowing other
programs to send it requests was seen as courting trouble.
Conversely, users noted that most of the time, tools written for a particular
application don't want to talk to the consensus module directly. The
application "owns" the state machine the consensus engine is replicating, so
tools that care about application state should talk to the application.
Otherwise, they would have to bake in knowledge about Tendermint (e.g., its
interfaces and data structures) just because of the mediation.
For clients to talk directly to the application, however, there is another
concern: The consensus node is the ABCI _client_, so it is inconvenient for the
application to "push" work into the consensus module via ABCI itself. The
current implementation works around this by calling the consensus node's RPC
service, which exposes an `ABCIQuery` kitchen-sink method that gives the
application a way to poke ABCI messages in the other direction.
Without this RPC method, you could work around this (at least in principle) by
having the consensus module "poll" the application for work that needs to be done,
but that has unsatisfactory implications for performance and robustness, as
well as being harder to understand.
There has apparently been discussion about trying to make a more bidirectional
communication between the consensus node and the application, but this issue
seems to still be unresolved.
Another complication of ABCI is that it requires the application (server) to
maintain [four separate connections][abci-conn]: One for "consensus" operations
(BeginBlock, EndBlock, DeliverTx, Commit), one for "mempool" operations, one
for "query" operations, and one for "snapshot" (state synchronization) operations.
The rationale seems to have been that these groups of operations should be able
to proceed concurrently with each other. In practice, it results in a very complex
state management problem to coordinate state updates between the separate streams.
While application authors in Go are mostly insulated from that complexity by the
Cosmos SDK, the plumbing to maintain those separate streams is complicated, hard
to understand, and we suspect it contains concurrency bugs and/or lock contention
issues affecting performance that are subtle and difficult to pin down.
Even without changing the semantics of any ABCI operations, this code could be
made smaller and easier to debug by separating the management of concurrency
and locking from the IPC transport: If all requests and responses are routed
through one connection, the server can explicitly maintain priority queues for
requests and responses, and make less-conservative decisions about when locks
are (or aren't) required to synchronize state access. With independent queues,
the server must lock conservatively, and no optimistic scheduling is practical.
This would be a tedious implementation change, but should be achievable without
breaking any of the existing interfaces. More importantly, it could potentially
address a lot of difficult concurrency and performance problems we currently
see anecdotally but have difficulty isolating because of how intertwined these
separate message streams are at runtime.
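The single-connection idea can be sketched with a priority queue that orders requests explicitly instead of relying on four independent streams. This is only an illustration of the scheduling structure: the priority values assigned to each group are hypothetical, the request type carries no payload, and nothing here reflects how the current ABCI client is implemented.

```go
package main

import (
	"container/heap"
	"fmt"
)

// request is a stand-in for an ABCI request routed through one connection.
type request struct {
	kind     string // "consensus", "mempool", "snapshot", or "query"
	priority int    // lower value is served first
	seq      int    // arrival order; keeps equal priorities FIFO
}

// requestQueue implements heap.Interface over requests.
type requestQueue []request

func (q requestQueue) Len() int { return len(q) }
func (q requestQueue) Less(i, j int) bool {
	if q[i].priority != q[j].priority {
		return q[i].priority < q[j].priority
	}
	return q[i].seq < q[j].seq
}
func (q requestQueue) Swap(i, j int)       { q[i], q[j] = q[j], q[i] }
func (q *requestQueue) Push(x interface{}) { *q = append(*q, x.(request)) }
func (q *requestQueue) Pop() interface{} {
	old := *q
	n := len(old)
	item := old[n-1]
	*q = old[:n-1]
	return item
}

// priorities is a hypothetical ordering chosen for the example only.
var priorities = map[string]int{"consensus": 0, "mempool": 1, "snapshot": 2, "query": 3}

// drainOrder enqueues requests in arrival order and returns the order in
// which a single dispatcher would serve them.
func drainOrder(kinds ...string) []string {
	q := &requestQueue{}
	for i, k := range kinds {
		heap.Push(q, request{kind: k, priority: priorities[k], seq: i})
	}
	out := make([]string, 0, len(kinds))
	for q.Len() > 0 {
		out = append(out, heap.Pop(q).(request).kind)
	}
	return out
}

func main() {
	fmt.Println(drainOrder("query", "mempool", "consensus", "query", "consensus"))
}
```

With one queue, the point at which a consensus request overtakes a query is an explicit, testable policy rather than an emergent property of four connections and their locks.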
TODO: Impact of ABCI++ for this topic?
### Context: RPC Issues
The RPC system serves several masters, and has a complex surface area. I
believe there are some improvements that can be exposed by separating some of
these concerns.
The Tendermint light client currently uses the RPC service to look up blocks
and transactions, and to forward ABCI queries to the application. The light
client proxy uses the RPC service via a websocket. The Cosmos IBC relayer also
uses the RPC service via websocket to watch for transaction events, and uses
the `ABCIQuery` method to fetch information and proofs for posted transactions.
Some work is already underway toward using P2P message passing rather than RPC
to synchronize light client state with the rest of the network. IBC relaying,
however, requires access to the event system, which is currently not accessible
except via the RPC interface. Event subscription _could_ be exposed via P2P,
but that is a larger project since it adds P2P communication load, and might
thus have an impact on the performance of consensus.
If event subscription can be moved into the P2P network, we could entirely
remove the websocket transport, even for clients that still need access to the
RPC service. Until then, we may still be able to reduce the scope of the
websocket endpoint to _only_ event subscription, by moving uses of the RPC
server as a proxy to ABCI over to the gRPC interface.
Having the RPC server still makes sense for local bootstrapping and operations,
but can be further simplified. Here are some specific proposals:
- Remove the HTTP GET interface entirely.
- Simplify JSON-RPC plumbing to remove unnecessary reflection and wrapping.
- Remove the gRPC interface (this is already planned for v0.36).
- Separate the websocket interface from the rest of the RPC service, and
restrict it to only event subscription.
Eventually we should try to remove the websocket interface entirely, but we
will need to revisit that (probably in a new RFC) once we've done some of the
easier things.
These changes would preserve the ability of operators to issue queries with
curl (but would require using JSON-RPC instead of URI parameters). That would
be a little less user-friendly, but for a use case that should not be that
prevalent.
These changes would also preserve compatibility with existing JSON-RPC based
code paths like the `tendermint` CLI and the light client (even ahead of
further work to remove that dependency).
**Design goal:** An operator should be able to disable non-local access to the
RPC server on any node in the network without impairing the ability of the
network to function for service of state replication, including light clients.
**Design principle:** All communication required to implement and monitor the
consensus network should use P2P, including the various synchronizations.
### Options for ABCI Transport
The majority of current usage is in Go, and the majority of that is mediated by
the Cosmos SDK, which uses the "direct call" interface. There is probably some
opportunity to clean up the implementation of that code, notably by inverting
which interface is at the "top" of the abstraction stack (currently it acts
like an RPC interface, and escape-hatches into the direct call). However, this
general approach works fine and doesn't need to be fundamentally changed.
For applications _not_ written in Go, the two remaining options are the
"socket" protocol (another variation on varint-prefixed protobuf messages over
an unstructured stream) and gRPC. It would be nice if we could get rid of one
of these to reduce (unneeded?) optionality.
Since both the socket protocol and gRPC depend on protocol buffers, the
"socket" protocol is the most obvious choice to remove. While gRPC is more
complex, the set of languages that _have_ protobuf support but _lack_ gRPC
support is small. Moreover, gRPC is already widely used in the rest of the
ecosystem (including the Cosmos SDK).
If some use case did arise later that can't work with gRPC, it would not be too
difficult for that application author to write a little proxy (in Go) that
bridges the convenient SDK APIs into a simpler protocol than gRPC.
**Design principle:** It is better for an uncommon special case to carry the
burdens of its specialness, than to bake an escape hatch into the infrastructure.
**Recommendation:** We should deprecate and remove the socket protocol.
### Options for RPC Transport
[ADR 057][adr-57] proposes using gRPC for the Tendermint RPC implementation.
This is still possible, but if we are able to simplify and decouple the
concerns as described above, I do not think it should be necessary.
While JSON-RPC is not the best possible RPC protocol for all situations, it has
some advantages over gRPC for our domain. Specifically:
- It is easy to call JSON-RPC manually from the command-line, which helps with
a common concern for the RPC service, local debugging and operations.
Relatedly: JSON is relatively easy for humans to read and write, and it can
be easily copied and pasted to share sample queries and debugging results in
chat, issue comments, and so on. Ideally, the RPC service will not be used
for activities where the costs of a text protocol are important compared to
its legibility and manual usability benefits.
- gRPC has an enormous dependency footprint for both clients and servers, and
many of the features it provides to support security and performance
(encryption, compression, streaming, etc.) are mostly irrelevant to local
use. Tendermint already needs to include a gRPC client for the remote signer,
but if we can avoid the need for a _client_ to depend on gRPC, that is a win
for usability.
- If we intend to migrate light clients off RPC to use P2P entirely, there is
no advantage to forcing a temporary migration to gRPC along the way; and once
the light client is not dependent on the RPC service, the efficiency of the
protocol is much less important.
- We can still get the benefits of generated data types using protocol buffers, even
without using gRPC:
- Protobuf defines a standard JSON encoding for all message types so
languages with protobuf support do not need to worry about type mapping
oddities.
- Using JSON means that even languages _without_ good protobuf support can
implement the protocol with a bit more work, and I expect this situation to
be rare.
Even if a language lacks a good standard JSON-RPC mechanism, the protocol is
lightweight and can be implemented by simple send/receive over TCP or
Unix-domain sockets with no need for code generation, encryption, etc. gRPC
uses a complex HTTP/2 based transport that is not easily replicated.
### Future Work
The background and proposals sketched above focus on the existing structure of
Tendermint and improvements we can make in the short term. It is worthwhile to
also consider options for longer-term broader changes to the IPC ecosystem.
The following outlines some ideas at a high level:
- **Consensus service:** Today, the application and the consensus node are
nominally connected only via ABCI. Tendermint was originally designed with
the assumption that all communication with the application should be mediated
by the consensus node. Based on further experience, however, the design goal
is now that the _application_ should be the mediator of application state.
As noted above, however, ABCI is a client/server protocol, with the
application as the server. For outside clients that turns out to have been a
good choice, but it complicates the relationship between the application and
the consensus node: Previously transactions were entered via the node, now
they are entered via the app.
We have worked around this by using the Tendermint RPC service to give the
application a "back channel" to the consensus node, so that it can push
transactions back into the consensus network. But the RPC service exposes a
lot of other functionality, too, including event subscription, block and
transaction queries, and a lot of node status information.
Even if we can't easily "fix" the orientation of the ABCI relationship, we
could improve isolation by splitting out the parts of the RPC service that
the application needs as a back-channel, and sharing those _only_ with the
application. By defining a "consensus service", we could give the application
a way to talk back limited to only the capabilities it needs. This approach
has the benefit that we could do it without breaking existing use, and if we
later did "fix" the ABCI directionality, we could drop the special case
without disrupting the rest of the RPC interface.
- **Event service:** Right now, the IBC relayer relies on the Tendermint RPC
service to provide a stream of block and transaction events, which it uses to
discover which transactions need relaying to other chains. While I think
that event subscription should eventually be handled via P2P, we could gain
some immediate benefit by splitting out event subscription from the rest of
the RPC service.
In this model, an event subscription service would be exposed on the public
network, but on a different endpoint. This would remove the need for the RPC
service to support the websocket protocol, and would allow operators to
isolate potentially sensitive status query results from the public network.
At the moment the relayers also use the RPC service to get block data for
synchronization, but work is already in progress to handle that concern via
the P2P layer. Once that's done, event subscription could be separated.
Separating parts of the existing RPC service is not without cost: It might
require additional connection endpoints, for example, though it is also not too
difficult for multiple otherwise-independent services to share a connection.
In return, though, it would become easier to reduce transport options and for
operators to independently control access to sensitive data. Considering the
viability and implications of these ideas is beyond the scope of this RFC, but
they are documented here since they follow from the background we have already
discussed.
## References
[abci]: https://github.com/tendermint/spec/tree/95cf253b6df623066ff7cd4074a94e7a3f147c7a/spec/abci
[rpc-service]: https://docs.tendermint.com/master/rpc/
[light-client]: https://docs.tendermint.com/master/tendermint-core/light-client.html
[tm-cli]: https://github.com/tendermint/tendermint/tree/master/cmd/tendermint
[cosmos-sdk]: https://github.com/cosmos/cosmos-sdk/
[local-client]: https://github.com/tendermint/tendermint/blob/master/abci/client/local_client.go
[socket-server]: https://github.com/tendermint/tendermint/blob/master/abci/server/socket_server.go
[sdk-grpc]: https://pkg.go.dev/github.com/cosmos/cosmos-sdk/types/tx#ServiceServer
[json-rpc]: https://www.jsonrpc.org/specification
[abci-conn]: https://github.com/tendermint/spec/blob/master/spec/abci/apps.md#state
[adr-57]: https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-057-RPC.md


@@ -0,0 +1,283 @@
# RFC 003: Taxonomy of potential performance issues in Tendermint
## Changelog
- 2021-09-02: Created initial draft (@wbanfield)
- 2021-09-14: Add discussion of the event system (@wbanfield)
## Abstract
This document discusses the various sources of performance issues in Tendermint and
attempts to clarify what work may be required to understand and address them.
## Background
Performance, loosely defined as the ability of a software process to perform its work
quickly and efficiently under load and within reasonable resource limits, is a frequent
topic of discussion in the Tendermint project.
To effectively address any issues with Tendermint performance we need to
categorize the various issues, understand their potential sources, and gauge their
impact on users.
Categorizing the different known performance issues will allow us to discuss and fix them
more systematically. This document proposes a rough taxonomy of performance issues
and highlights areas where more research into potential performance problems is required.
Understanding Tendermint's performance limitations will also be critically important
as we make changes to many of its subsystems. Performance is a central concern for
upcoming decisions regarding the `p2p` protocol, RPC message encoding and structure,
database usage and selection, and consensus protocol updates.
## Discussion
This section attempts to delineate the different sections of Tendermint functionality
that are often cited as having performance issues. It raises questions and suggests
lines of inquiry that may be valuable for better understanding Tendermint's performance issues.
As a note: We should avoid quickly adding many microbenchmarks or package level benchmarks.
These are prone to being worse than useless as they can obscure what _should_ be
focused on: performance of the system from the perspective of a user. We should,
instead, tune performance with an eye towards user needs and the actions users take. These users comprise
both operators of Tendermint chains and the people generating transactions for
Tendermint chains. Both of these sets of users are largely aligned in wanting an end-to-end
system that operates quickly and efficiently.
REQUEST: The list below may be incomplete; if there are additional sections that are often
cited as creating poor performance, please comment so that they may be included.
### P2P
#### Claim: Tendermint cannot scale to large numbers of nodes
Users have reported that Tendermint networks cannot scale to large numbers of nodes.
The number of nodes one user reported as causing issues was in the thousands.
We don't currently have evidence about the upper limit on the number of nodes that Tendermint's
P2P stack can scale to.
We need to more concretely understand the source of issues and determine what layer
is causing a problem. It's possible that the P2P layer, in the absence of any reactors
sending data, is perfectly capable of managing thousands of peer connections. For
a reasonable networking and application setup, thousands of connections should not present any
issue for the application.
We need more data to understand the problem directly. We want to drive the popularity
and adoption of Tendermint and this will mean allowing for chains with more validators.
We should follow up with users experiencing this issue. We may then want to add
a series of metrics to the P2P layer to better understand the inefficiencies it produces.
The following metrics can help us understand the sources of latency in the Tendermint P2P stack:
* Number of messages sent and received per second
* Time of a message spent on the P2P layer send and receive queues
The following metrics exist and should be leveraged in addition to those added:
* Number of peers the node is connected to
* Number of bytes per channel sent and received from each peer
### Sync
#### Claim: Block Syncing is slow
Bootstrapping a new node in a network to the height of the rest of the network is believed to
take longer than users would like. Block sync requires fetching all of the blocks from
peers and writing them to the local disk for storage. A useful line of inquiry
is understanding how quickly a perfectly tuned system _could_ fetch all of the state
over a network so that we understand how much overhead Tendermint actually adds.
The operation is likely to be _incredibly_ dependent on the environment in which
the node is being run. The factors that will influence syncing include:
1. Number of peers that a syncing node may fetch from.
2. Speed of the disk that the syncing node is writing to.
3. Speed of the network connection between the syncing node and the peers it is
syncing from.
We should calculate how quickly this operation _could possibly_ complete for common chains and nodes.
To calculate how quickly this operation could possibly complete, we should assume that
a node is reading at the line rate of the NIC and writing at the full drive speed to its
local storage. Comparing this theoretical upper-limit to the actual sync times
observed by node operators will give us a good point of comparison for understanding
how much overhead Tendermint incurs.
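A rough sketch of that calculation follows. It assumes the sync is limited by whichever of the network or the disk is slower, and it ignores validation and execution cost entirely. All of the numbers (chain size, NIC rate, disk rate, observed sync time) are made up for illustration, and `minSyncTime` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"time"
)

// minSyncTime returns a lower bound on the time needed to fetch and store
// totalBytes, assuming the slower of the NIC and the disk is the only limit.
// Rates are in bytes per second.
func minSyncTime(totalBytes, nicBytesPerSec, diskBytesPerSec float64) time.Duration {
	rate := nicBytesPerSec
	if diskBytesPerSec < rate {
		rate = diskBytesPerSec
	}
	return time.Duration(totalBytes / rate * float64(time.Second))
}

func main() {
	// Hypothetical inputs: 100 GB of block data, a 1 Gbit/s NIC (125 MB/s),
	// and a disk that sustains 500 MB/s of writes.
	bound := minSyncTime(100e9, 125e6, 500e6)

	// Hypothetical observed sync time reported by an operator.
	observed := 2 * time.Hour

	fmt.Printf("lower bound: %v\n", bound)
	fmt.Printf("observed is %.1fx the lower bound\n", observed.Seconds()/bound.Seconds())
}
```

The ratio in the last line is the quantity of interest: it separates time the environment forces on any implementation from time that Tendermint itself adds.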
We should additionally add metrics to the blocksync operation to more clearly pinpoint
slow operations. The following metrics should be added to the block syncing operation:
* Time to fetch and validate each block
* Time to execute a block
* Blocks sync'd per unit time
### Application
Applications performing complex state transitions have the potential to bottleneck
the Tendermint node.
#### Claim: ABCI block delivery could cause slowdown
ABCI delivers blocks in several methods: `BeginBlock`, `DeliverTx`, `EndBlock`, `Commit`.
Tendermint delivers transactions one-by-one via the `DeliverTx` call. Most of the
transaction delivery in Tendermint occurs asynchronously and therefore appears unlikely to
form a bottleneck in ABCI.
After delivering all transactions, Tendermint then calls the `Commit` ABCI method.
Tendermint [locks all access to the mempool][abci-commit-description] while `Commit`
proceeds. This means that an application that is slow to execute all of its
transactions or finalize state during the `Commit` method will prevent any new
transactions from being added to the mempool. Apps that are slow to commit will
prevent consensus from proceeding to the next consensus height since Tendermint
cannot validate block proposals or produce block proposals without the
AppHash obtained from the `Commit` method. We should add a metric for each
step in the ABCI protocol to track the amount of time that a node spends communicating
with the application at each step.
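One possible shape for such a metric is a small wrapper that times each call and accumulates the result per step. The sketch below uses only the standard library and in-memory maps; a real implementation would presumably report to the node's existing metrics system instead. The `stepTimer` type and its methods are hypothetical, and the sleeps stand in for application work.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// stepTimer accumulates the wall-clock time spent in each named ABCI step.
type stepTimer struct {
	mu    sync.Mutex
	total map[string]time.Duration
	calls map[string]int
}

func newStepTimer() *stepTimer {
	return &stepTimer{total: map[string]time.Duration{}, calls: map[string]int{}}
}

// observe runs fn and records how long it took under the given step name.
func (t *stepTimer) observe(step string, fn func()) {
	start := time.Now()
	fn()
	elapsed := time.Since(start)

	t.mu.Lock()
	defer t.mu.Unlock()
	t.total[step] += elapsed
	t.calls[step]++
}

// stats returns the number of calls and total time recorded for a step.
func (t *stepTimer) stats(step string) (int, time.Duration) {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.calls[step], t.total[step]
}

func main() {
	timer := newStepTimer()

	// Simulated block: the closures stand in for calls to the application.
	timer.observe("BeginBlock", func() {})
	for i := 0; i < 3; i++ {
		timer.observe("DeliverTx", func() { time.Sleep(time.Millisecond) })
	}
	timer.observe("EndBlock", func() {})
	timer.observe("Commit", func() { time.Sleep(5 * time.Millisecond) })

	for _, step := range []string{"BeginBlock", "DeliverTx", "EndBlock", "Commit"} {
		n, d := timer.stats(step)
		fmt.Printf("%-10s calls=%d total=%v\n", step, n, d)
	}
}
```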
#### Claim: ABCI serialization overhead causes slowdown
The most common way to run a Tendermint application is using the Cosmos-SDK.
The Cosmos-SDK runs the ABCI application within the same process as Tendermint.
When an application is run in the same process as Tendermint, a serialization penalty
is not paid. This is because the local ABCI client does not serialize method calls
and instead passes the protobuf type through directly. This can be seen
in [local_client.go][abci-local-client-code].
Serialization and deserialization in the gRPC and socket protocol ABCI methods
may cause slowdown. While these may cause issues, they are not part of the primary
use case of Tendermint and do not necessarily need to be addressed at this time.
### RPC
#### Claim: The Query API is slow.
The query API locks a mutex across the ABCI connections. This causes consensus to
slow during queries, as ABCI is no longer able to make progress. This is known
to be causing issues in the cosmos-sdk and is being addressed [in the sdk][sdk-query-fix]
but a more robust solution may be required. Adding metrics to each ABCI client connection
and message as described in the Application section of this document would allow us
to further introspect the issue here.
#### Claim: RPC Serialization may cause slowdown
The Tendermint RPC uses a modified version of JSON-RPC. This RPC powers the `broadcast_tx_*` methods,
which are the critical path for adding transactions to Tendermint at the moment. These methods are
likely invoked quite frequently on popular networks. Being able to perform efficiently
on this common and critical operation is very important. The current JSON-RPC implementation
relies heavily on type introspection via reflection, which is known to be very slow in
Go. We should therefore produce benchmarks of these methods to determine how much overhead
we are adding to what is likely to be a very common operation.
The other JSON-RPC methods are much less critical to the core functionality of Tendermint.
While there may be other points of performance consideration within the RPC, methods that do not
receive high volumes of requests should not be prioritized for performance consideration.
NOTE: Previous discussion of the RPC framework was done in [ADR 57][adr-57] and
there is ongoing work to inspect and alter the JSON-RPC framework in [RFC 002][rfc-002].
Many of these RPC-related performance considerations can either wait until the work of RFC 002 is done or be
considered in concert with the in-flight changes to the JSON-RPC.
### Protocol
#### Claim: Gossiping messages is a slow process
Currently, for any validator to successfully vote in a consensus _step_, it must
receive votes from greater than 2/3 of the validators on the network. In many cases,
it's preferable to receive as many votes as possible from correct validators.
This produces a quadratic increase in messages that are communicated as more validators join the network.
(Each of the N validators must communicate with all other N-1 validators).
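To give a sense of scale, the sketch below evaluates N × (N-1) for a few validator counts. It counts one delivery per ordered pair of validators per step, which is a simplification: actual gossip relays votes through peers and may deliver duplicates, so real message counts can differ. The function name and the sample sizes are chosen for illustration.

```go
package main

import "fmt"

// voteDeliveriesPerStep returns the number of vote deliveries needed for each
// of n validators to receive a vote from every other validator: n * (n - 1).
func voteDeliveriesPerStep(n int) int {
	if n < 2 {
		return 0
	}
	return n * (n - 1)
}

func main() {
	for _, n := range []int{4, 50, 140, 300} {
		fmt.Printf("%4d validators -> %6d vote deliveries per step\n", n, voteDeliveriesPerStep(n))
	}
}
```

Doubling the validator set roughly quadruples the count, which is why the metrics on vote gathering time and duplicate votes matter more as networks grow.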
This large number of messages communicated per step has been identified to impact
performance of the protocol. Given that the number of messages communicated has been
identified as a bottleneck, it would be extremely valuable to gather data on how long
it takes for popular chains with many validators to gather all votes within a step.
Metrics that would improve visibility into this include:
* Amount of time for a node to gather votes in a step.
* Amount of time for a node to gather all block parts.
* Number of votes each node sends to gossip (i.e. not its own votes, but votes it is
transmitting for a peer).
* Total number of votes each node receives (A node may receive duplicate votes
so understanding how frequently this occurs will be valuable in evaluating the performance
of the gossip system).
#### Claim: Hashing Txs causes slowdown in Tendermint
Using a faster hash algorithm for Tx hashes is currently a point of discussion
in Tendermint. Namely, it is being considered as part of the [modular hashing proposal][modular-hashing].
It is currently unknown if hashing transactions in the Mempool forms a significant bottleneck.
Although it does not appear to be documented as slow, there are a few open GitHub
issues that indicate a possible user preference for a faster hashing algorithm,
including [issue 2187][issue-2187] and [issue 2186][issue-2186].
It is likely worth investigating how long Tx hashing takes, to an order of magnitude, in comparison to other
aspects of adding a Tx to the mempool. It is not currently clear if the rate of adding Tx
to the mempool is a source of user pain. We should not endeavor to make large changes to
consensus critical components without first being certain that the change is highly
valuable and impactful.
### Digital Signatures
#### Claim: Verification of digital signatures may cause slowdown in Tendermint
Working with cryptographic signatures can be computationally expensive. The cosmos
hub uses [ed25519 signatures][hub-signature]. The library performing signature
verification in Tendermint on votes is [benchmarked][ed25519-bench] to be able to verify an `ed25519`
signature in 75μs on a decently fast CPU. A validator in the Cosmos Hub performs
3 sets of verifications on the signatures of the 140 validators in the Hub
in a consensus round: during block verification, when verifying the prevotes, and
when verifying the precommits. With no batching, this would be roughly `32ms` per
round (3 × 140 × 75μs = 31.5ms). It is quite unlikely, therefore, that this accounts for any serious amount
of the ~7 seconds of block time per height in the Hub.
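As a check on the arithmetic, the sketch below multiplies out the figures quoted in this section (3 sets of verifications, 140 validators, 75μs per verification, no batching) and compares the result to a 7 second block time. The helper name is hypothetical, and the inputs are the document's own estimates rather than measurements.

```go
package main

import (
	"fmt"
	"time"
)

// verifyTimePerRound estimates the unbatched signature verification time in
// one consensus round: sets * validators * time per signature.
func verifyTimePerRound(sets, validators int, perSignature time.Duration) time.Duration {
	return time.Duration(sets*validators) * perSignature
}

func main() {
	perRound := verifyTimePerRound(3, 140, 75*time.Microsecond)
	blockTime := 7 * time.Second

	share := 100 * perRound.Seconds() / blockTime.Seconds()
	fmt.Printf("%v per round, %.2f%% of a %v block\n", perRound, share, blockTime)
}
```

Under these assumptions the total is 31.5ms, which is under half a percent of the block time, so the conclusion that verification is not the dominant cost holds with a wide margin.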
This may cause slowdown when syncing, since the process needs to constantly verify
signatures. It's possible that improved signature aggregation will lead to improved
light client or other syncing performance. In general, a metric should be added
to track block rate while blocksyncing.
#### Claim: Our use of digital signatures in the consensus protocol contributes to performance issues
Currently, Tendermint's digital signature verification requires that all validators
receive all vote messages. Each validator must receive the complete digital signature
along with the vote message that it corresponds to. This means that all N validators
must receive messages from at least 2/3 of the N validators in each consensus
round. Given the potential for oddly shaped network topologies and the expected
variable network roundtrip times of a few hundred milliseconds in a blockchain,
it is highly likely that this amount of gossiping is leading to a significant amount
of the slowdown in the Cosmos Hub and in Tendermint consensus.
### Tendermint Event System
#### Claim: The event system is a bottleneck in Tendermint
The Tendermint Event system is used to communicate and store information about
internal Tendermint execution. The system uses channels internally to send messages
to different subscribers. Sending an event [blocks on the internal channel][event-send].
The default configuration is to [use an unbuffered channel for event publishes][event-buffer-capacity].
Several consumers of the event system also use an unbuffered channel for reads.
An example of this is the [event indexer][event-indexer-unbuffered], which takes an
unbuffered subscription to the event system. The result is that these unbuffered readers
can cause writes to the event system to block or slow down depending on contention in the
event system. This has implications for the consensus system, which [publishes events][consensus-event-send].
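The channel behavior behind this claim can be shown in isolation. The sketch below is not the Tendermint pubsub code: it uses a non-blocking send purely to reveal when a plain send would have had to wait. With an unbuffered channel and no subscriber currently receiving, the send cannot proceed; with a buffer, it can proceed until the buffer fills. The event names are illustrative.

```go
package main

import "fmt"

// tryPublish attempts to send without blocking and reports whether the send
// went through. A false result marks a point where a plain `ch <- event`
// would have blocked the publisher.
func tryPublish(ch chan<- string, event string) bool {
	select {
	case ch <- event:
		return true
	default:
		return false
	}
}

func main() {
	unbuffered := make(chan string)  // no capacity: needs a receiver ready right now
	buffered := make(chan string, 1) // capacity for one pending event

	fmt.Println("unbuffered, no reader:", tryPublish(unbuffered, "NewBlock"))
	fmt.Println("buffered, empty:      ", tryPublish(buffered, "NewBlock"))
	fmt.Println("buffered, full:       ", tryPublish(buffered, "Tx"))
}
```

In the blocking case, the time the publisher waits is exactly what a "time in event send" metric would capture.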
To better understand the performance of the event system, we should add metrics to track the timing of
event sends. The following metrics would be a good start for tracking this performance:
* Time in event send, labeled by Event Type
* Time in event receive, labeled by subscriber
* Event throughput, measured in events per unit time.
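
The blocking behavior described above can be reproduced with plain Go channels. The sketch below is not the Tendermint pubsub implementation. It is a minimal stand-in, with the hypothetical helper `measureSends`, that publishes to an unbuffered channel read by a deliberately slow subscriber and records how long each send blocks. The recorded durations are the kind of value a "time in event send" metric might capture.

```go
package main

import (
	"fmt"
	"time"
)

// measureSends publishes n events on an unbuffered channel to a single
// subscriber that spends subscriberDelay handling each event. It returns how
// long each send blocked the publisher.
func measureSends(n int, subscriberDelay time.Duration) []time.Duration {
	events := make(chan int) // unbuffered: a send completes only when a reader is ready
	done := make(chan struct{})

	go func() {
		defer close(done)
		for range events {
			time.Sleep(subscriberDelay) // simulate slow handling, e.g. indexing
		}
	}()

	blocked := make([]time.Duration, 0, n)
	for i := 0; i < n; i++ {
		start := time.Now()
		events <- i // blocks until the subscriber is ready to receive
		blocked = append(blocked, time.Since(start))
	}
	close(events)
	<-done
	return blocked
}

func main() {
	for i, d := range measureSends(3, 20*time.Millisecond) {
		fmt.Printf("send %d blocked the publisher for %v\n", i, d)
	}
}
```

The first send usually returns quickly because the subscriber is idle. Later sends wait for roughly the subscriber's handling time, which is how a slow reader can stall a publisher such as the consensus state machine.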
### References
[modular-hashing]: https://github.com/tendermint/tendermint/pull/6773
[issue-2186]: https://github.com/tendermint/tendermint/issues/2186
[issue-2187]: https://github.com/tendermint/tendermint/issues/2187
[rfc-002]: https://github.com/tendermint/tendermint/pull/6913
[adr-57]: https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-057-RPC.md
[issue-1319]: https://github.com/tendermint/tendermint/issues/1319
[abci-commit-description]: https://github.com/tendermint/spec/blob/master/spec/abci/apps.md#commit
[abci-local-client-code]: https://github.com/tendermint/tendermint/blob/511bd3eb7f037855a793a27ff4c53c12f085b570/abci/client/local_client.go#L84
[hub-signature]: https://github.com/cosmos/gaia/blob/0ecb6ed8a244d835807f1ced49217d54a9ca2070/docs/resources/genesis.md#consensus-parameters
[ed25519-bench]: https://github.com/oasisprotocol/curve25519-voi/blob/d2e7fc59fe38c18ca990c84c4186cba2cc45b1f9/PERFORMANCE.md
[event-send]: https://github.com/tendermint/tendermint/blob/5bd3b286a2b715737f6d6c33051b69061d38f8ef/libs/pubsub/pubsub.go#L338
[event-buffer-capacity]: https://github.com/tendermint/tendermint/blob/5bd3b286a2b715737f6d6c33051b69061d38f8ef/types/event_bus.go#L14
[event-indexer-unbuffered]: https://github.com/tendermint/tendermint/blob/5bd3b286a2b715737f6d6c33051b69061d38f8ef/state/indexer/indexer_service.go#L39
[consensus-event-send]: https://github.com/tendermint/tendermint/blob/5bd3b286a2b715737f6d6c33051b69061d38f8ef/internal/consensus/state.go#L1573
[sdk-query-fix]: https://github.com/cosmos/cosmos-sdk/pull/10045


@@ -0,0 +1,213 @@
========================================
RFC 004: E2E Test Framework Enhancements
========================================
Changelog
---------
- 2021-09-14: started initial draft (@tychoish)
Abstract
--------
This document discusses a series of improvements to the e2e test framework
that we can consider during the next few releases to help boost confidence in
Tendermint releases, and improve developer efficiency.
Background
----------
During the 0.35 release cycle, the E2E tests were a source of great
value, helping to identify a number of bugs before release. At the same time,
the tests were not consistently passing during this time, thereby reducing
their value, and forcing the core development team to allocate time and energy
to maintaining and chasing down issues with the e2e tests and the test
harness. The experience of this release cycle calls to mind a series of
improvements to the test framework, and this document attempts to capture
these improvements, along with motivations, and potential for impact.
Projects
--------
Flexible Workload Generation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Presently the e2e suite contains a single workload generation pattern, which
exists simply to ensure that the test networks have some work during their
runs. However, the shape and volume of the work is very consistent and is very
gentle to help ensure test reliability.
We don't need a complex workload generation framework, but being able to have
a few different workload shapes available for test networks, both generated and
hand-crafted, would be useful.
Workload patterns/configurations might include:
- transaction targeting patterns (include light nodes, round robin, target
individual nodes)
- variable transaction size over time.
- transaction broadcast option (synchronously, checked, fire-and-forget,
mixed).
- number of transactions to submit.
- non-transaction workloads: (evidence submission, query, event subscription.)
Configurable Generator
~~~~~~~~~~~~~~~~~~~~~~
The nightly e2e suite is defined by the `testnet generator
<https://github.com/tendermint/tendermint/blob/master/test/e2e/generator/generate.go#L13-L65>`_,
and it's difficult to add dimensions or change the focus of the test suite in
any way without modifying the implementation of the generator. If the
generator were more configurable, potentially via a file rather than in
the Go implementation, we could modify the focus of the test suite on the
fly.
Features that we might want to configure:
- number of test networks to generate of various topologies, to improve
coverage of different configurations.
- test application configurations (to modify the latency of ABCI calls, etc.)
- size of test networks.
- workload shape and behavior.
- initial sync and catch-up configurations.
The workload generator currently provides runtime options for limiting the
generator to specific types of P2P stacks, and for generating multiple groups
of test cases to support parallelism. The goal is to extend this pattern and
avoid hardcoding the matrix of test cases in the generator code. Once the
testnet configuration generation behavior is configurable at runtime,
developers may be able to use the e2e framework to validate changes before
landing changes that break e2e tests a day later.
In addition to the autogenerated suite, it might make sense to maintain a
small collection of hand-crafted cases that exercise configurations of
concern, to run as part of the nightly (or less frequent) loop.
Implementation Plan Structure
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
As a development team, we should determine which features should impact the e2e
testing early in the development cycle, and if we intend to modify the e2e
tests to exercise a feature, we should identify this early and begin the
integration process as early as possible.
To facilitate this, we should adopt a practice whereby we exercise specific
features that are currently under development more rigorously in the e2e
suite, and then as development stabilizes we can reduce the number or weight
of these features in the suite.
As of 0.35 there are essentially two end to end tests: the suite of 64
generated test networks, and the hand-crafted `ci.toml` test case. The
generated test cases help provide systematic coverage, while the `ci` run
provides coverage for a large number of features.
Reduce Cycle Time
~~~~~~~~~~~~~~~~~
One of the barriers to leveraging the e2e framework, and one of the challenges
in debugging failures, is that the cycle time of running a single test iteration is
quite high: 5 minutes to build the docker image, plus the time to run the test
or tests.
There are a number of improvements and enhancements that can reduce the cycle
time in practice:
- reduce the amount of time required to build the docker image used in these
tests. Without the dependency on CGo, the tendermint binaries could be
(cross) compiled outside of the docker container and then injected into
them, which would take better advantage of docker's native caching,
although, without the dependency on CGo there would be no hard requirement
for the e2e tests to use docker.
- support test parallelism. Because of the way the testnets are orchestrated
a single system can really only run one network at a time. For executions
(local or remote) with more resources, there's no reason not to run a few
networks in parallel to reduce the feedback time.
- prune testnet configurations that are unlikely to provide good signal, to
shorten the time to feedback.
- apply some kind of tiered approach to test execution, to improve the
legibility of the test result. For example order tests by the dependency of
their features, or run test networks without perturbations before running
that configuration with perturbations, to be able to isolate the impact of
specific features.
- orchestrate the test harness directly from go test rather than via a special
harness and shell scripts so e2e tests may more natively fit into developers'
existing workflows.
Many of these improvements, particularly reducing the build time, will also
reduce the time to get feedback during automated builds.
Deeper Insights
~~~~~~~~~~~~~~~
When a test network fails, it's incredibly difficult to understand *why* the
network failed, as the current system provides very little insight into the
system outside of the process logs. When a test network stalls or fails
developers should be able to quickly and easily get a sense of the state of
the network and all nodes.
Improvements in pursuit of this goal include functionality that would help
node operators in production environments by improving the quality and utility
of the logging messages and other reported metrics, but also provide some
tools to collect and aggregate this data for developers in the context of test
networks.
- Interleave messages from all nodes in the network to be able to correlate
events during the test run.
- Collect structured metrics of the system operation (CPU/MEM/IO) during the
test run, as well as from each tendermint/application process.
- Build (simple) tools to be able to render and summarize the data collected
during the test run to answer basic questions about test outcome.
Flexible Assertions
~~~~~~~~~~~~~~~~~~~
Currently, all assertions run for every test network, which makes the
assertions pretty bland, and the framework primarily useful as a smoke-test
framework, but it might be useful to be able to write and run different
tests for different configurations. This could allow us to test outside of the
happy-path.
In general our existing assertions occupy a fraction of the total test time,
so the relative cost of adding a few extra test assertions would be of limited
cost, and could help build confidence.
Additional Kinds of Testing
~~~~~~~~~~~~~~~~~~~~~~~~~~~
The existing e2e suite exercises networks of nodes that have a homogeneous
tendermint version and a stable configuration, and that are expected to make
progress. There are many other possible test configurations that may be
interesting to engage with. These could include dimensions, such as:
- Multi-version testing to exercise our compatibility guarantees for networks
that might have different tendermint versions.
- As a flavor of multi-version testing, include upgrade testing, to build
confidence in migration code and procedures.
- Additional test applications, particularly practical-type applications
including some that use gaiad and/or the cosmos-sdk. Test-only applications
that simulate other kinds of applications (e.g. variable application
operation latency.)
- Tests of "non-viable" configurations that ensure that forbidden combinations
lead to halts.
References
----------
- `ADR 66: End-to-End Testing <../architecture/adr-66-e2e-testing.md>`_


@@ -17,9 +17,9 @@ consensus gossip protocol.
## Using Block Sync
To support faster syncing, Tendermint offers a `fast-sync` mode, which
To support faster syncing, Tendermint offers a `blocksync` mode, which
is enabled by default, and can be toggled in the `config.toml` or via
`--fast_sync=false`.
`--blocksync.enable=false`.
In this mode, the Tendermint daemon will sync hundreds of times faster
than if it used the real-time consensus process. Once caught up, the
@@ -29,18 +29,23 @@ has at least one peer and it's height is at least as high as the max
reported peer height. See [the IsCaughtUp
method](https://github.com/tendermint/tendermint/blob/b467515719e686e4678e6da4e102f32a491b85a0/blockchain/pool.go#L128).
Note: There are two versions of Block Sync. We recommend using v0 as v2 is still in beta.
Note: There are multiple versions of Block Sync. Please use v0 as the other versions are no longer supported.
If you would like to use a different version you can do so by changing the version in the `config.toml`:
```toml
#######################################################
### Block Sync Configuration Connections ###
#######################################################
[fastsync]
[blocksync]
# If this node is many blocks behind the tip of the chain, BlockSync
# allows them to catchup quickly by downloading blocks in parallel
# and verifying their commits
enable = true
# Block Sync version to use:
# 1) "v0" (default) - the legacy Block Sync implementation
# 2) "v2" - complete redesign of v0, optimized for testability & readability
# 1) "v0" (default) - the standard Block Sync implementation
# 2) "v2" - DEPRECATED, please use v0
version = "v0"
```
@@ -55,4 +60,4 @@ the network best height, it will switches to the state sync mechanism and then e
another event for exposing the fast-sync `complete` status and the state `height`.
The user can query the events by subscribing `EventQueryBlockSyncStatus`
Please check [types](https://pkg.go.dev/github.com/tendermint/tendermint/types?utm_source=godoc#pkg-constants) for the details.
Please check [types](https://pkg.go.dev/github.com/tendermint/tendermint/types?utm_source=godoc#pkg-constants) for the details.


@@ -185,51 +185,65 @@ the argument name and use `_` as a placeholder.
### Formatting
The following nuances when sending/formatting transactions should be
taken into account:
When sending transactions to the RPC interface, the following formatting rules
must be followed:
With `GET`:
Using `GET` (with parameters in the URL):
To send a UTF8 string byte array, quote the value of the tx parameter:
To send a UTF8 string as transaction data, enclose the value of the `tx`
parameter in double quotes:
```sh
curl 'http://localhost:26657/broadcast_tx_commit?tx="hello"'
```
which sends a 5 byte transaction: "h e l l o" \[68 65 6c 6c 6f\].
which sends a 5-byte transaction: "h e l l o" \[68 65 6c 6c 6f\].
Note the URL must be wrapped with single quotes, else bash will ignore
the double quotes. To avoid the single quotes, escape the double quotes:
Note that the URL in this example is enclosed in single quotes to prevent the
shell from interpreting the double quotes. Alternatively, you may escape the
double quotes with backslashes:
```sh
curl http://localhost:26657/broadcast_tx_commit?tx=\"hello\"
```
Using a special character:
The double-quoted format works for multibyte characters, as long as they
are valid UTF8, for example:
```sh
curl 'http://localhost:26657/broadcast_tx_commit?tx="€5"'
```
sends a 4 byte transaction: "€5" (UTF8) \[e2 82 ac 35\].
sends a 4-byte transaction: "€5" (UTF8) \[e2 82 ac 35\].
To send as raw hex, omit quotes AND prefix the hex string with `0x`:
Arbitrary (non-UTF8) transaction data may also be encoded as a string of
hexadecimal digits (2 digits per byte). To do this, omit the quotation marks
and prefix the hex string with `0x`:
```sh
curl http://localhost:26657/broadcast_tx_commit?tx=0x01020304
curl http://localhost:26657/broadcast_tx_commit?tx=0x68656C6C6F
```
which sends a 4 byte transaction: \[01 02 03 04\].
which sends the 5-byte transaction: \[68 65 6c 6c 6f\].
With `POST` (using `json`), the raw hex must be `base64` encoded:
Using `POST` (with parameters in JSON), the transaction data are sent as a JSON
string in base64 encoding:
```sh
curl --data-binary '{"jsonrpc":"2.0","id":"anything","method":"broadcast_tx_commit","params": {"tx": "AQIDBA=="}}' -H 'content-type:text/plain;' http://localhost:26657
curl http://localhost:26657 -H 'Content-Type: application/json' --data-binary '{
"jsonrpc": "2.0",
"id": "anything",
"method": "broadcast_tx_commit",
"params": {
"tx": "aGVsbG8="
}
}'
```
which sends the same 4 byte transaction: \[01 02 03 04\].
which sends the same 5-byte transaction: \[68 65 6c 6c 6f\].
Note that raw hex cannot be used in `POST` transactions.
Note that the hexadecimal encoding of transaction data is _not_ supported in
JSON (`POST`) requests.
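
The relationship between the two encodings can be checked with the Go standard library. The sketch below is only an illustration: the helper names `hexParam` and `base64Param` are hypothetical and not part of the Tendermint RPC client. It produces the value of the `tx` parameter for a `GET` request in hex form and for a JSON `POST` request in base64 form.

```go
package main

import (
	"encoding/base64"
	"encoding/hex"
	"fmt"
)

// hexParam returns the unquoted, 0x-prefixed hexadecimal form of tx, as used
// for the tx parameter of a GET request.
func hexParam(tx []byte) string {
	return "0x" + hex.EncodeToString(tx)
}

// base64Param returns the standard base64 form of tx, as used for the "tx"
// string in a JSON (POST) request.
func base64Param(tx []byte) string {
	return base64.StdEncoding.EncodeToString(tx)
}

func main() {
	for _, tx := range [][]byte{[]byte("hello"), []byte("€5"), {0x01, 0x02, 0x03, 0x04}} {
		fmt.Printf("bytes=% x  GET tx=%s  POST tx=%q\n", tx, hexParam(tx), base64Param(tx))
	}
}
```

Note that `hex.EncodeToString` emits lowercase digits, while the example above uses uppercase (`0x68656C6C6F`). Both spell the same five bytes.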
## Reset

go.mod

@@ -27,14 +27,14 @@ require (
github.com/prometheus/client_golang v1.11.0
github.com/rcrowley/go-metrics v0.0.0-20200313005456-10cdbea86bc0
github.com/rs/cors v1.8.0
github.com/rs/zerolog v1.24.0
github.com/rs/zerolog v1.25.0
github.com/sasha-s/go-deadlock v0.2.1-0.20190427202633-1595213edefa
github.com/snikch/goodman v0.0.0-20171125024755-10e37e294daa
github.com/spf13/cobra v1.2.1
github.com/spf13/viper v1.8.1
github.com/stretchr/testify v1.7.0
github.com/tendermint/tm-db v0.6.4
github.com/vektra/mockery/v2 v2.9.0
github.com/vektra/mockery/v2 v2.9.3
golang.org/x/crypto v0.0.0-20210513164829-c07d793c2f9a
golang.org/x/net v0.0.0-20210428140749-89ef3d95e781
golang.org/x/sync v0.0.0-20210220032951-036812b2e83c

go.sum

@@ -767,8 +767,8 @@ github.com/rs/cors v1.8.0/go.mod h1:EBwu+T5AvHOcXwvZIkQFjUN6s8Czyqw12GL/Y0tUyRM=
github.com/rs/xid v1.2.1/go.mod h1:+uKXf+4Djp6Md1KODXJxgGQPKngRmWyn10oCKFzNHOQ=
github.com/rs/xid v1.3.0/go.mod h1:trrq9SKmegXys3aeAKXMUTdJsYXVwGY3RLcfgqegfbg=
github.com/rs/zerolog v1.18.0/go.mod h1:9nvC1axdVrAHcu/s9taAVfBuIdTZLVQmKQyvrUjF5+I=
github.com/rs/zerolog v1.24.0 h1:76ivFxmVSRs1u2wUwJVg5VZDYQgeH1JpoS6ndgr9Wy8=
github.com/rs/zerolog v1.24.0/go.mod h1:7KHcEGe0QZPOm2IE4Kpb5rTh6n1h2hIgS5OOnu1rUaI=
github.com/rs/zerolog v1.25.0 h1:Rj7XygbUHKUlDPcVdoLyR91fJBsduXj5fRxyqIQj/II=
github.com/rs/zerolog v1.25.0/go.mod h1:7KHcEGe0QZPOm2IE4Kpb5rTh6n1h2hIgS5OOnu1rUaI=
github.com/russross/blackfriday v1.5.2/go.mod h1:JO/DiYxRf+HjHt06OyowR9PTA263kcR/rfWxYHBV53g=
github.com/russross/blackfriday/v2 v2.0.1/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=
github.com/ryancurrah/gomodguard v1.2.3 h1:ww2fsjqocGCAFamzvv/b8IsRduuHHeK2MHTcTxZTQX8=
@@ -895,8 +895,8 @@ github.com/valyala/bytebufferpool v1.0.0/go.mod h1:6bBcMArwyJ5K/AmCkWv1jt77kVWyC
github.com/valyala/fasthttp v1.16.0/go.mod h1:YOKImeEosDdBPnxc0gy7INqi3m1zK6A+xl6TwOBhHCA=
github.com/valyala/quicktemplate v1.6.3/go.mod h1:fwPzK2fHuYEODzJ9pkw0ipCPNHZ2tD5KW4lOuSdPKzY=
github.com/valyala/tcplisten v0.0.0-20161114210144-ceec8f93295a/go.mod h1:v3UYOV9WzVtRmSR+PDvWpU/qWl4Wa5LApYYX4ZtKbio=
github.com/vektra/mockery/v2 v2.9.0 h1:+3FhCL3EviR779mTzXwUuhPNnqFUA7sDnt9OFkXaFd4=
github.com/vektra/mockery/v2 v2.9.0/go.mod h1:2gU4Cf/f8YyC8oEaSXfCnZBMxMjMl/Ko205rlP0fO90=
github.com/vektra/mockery/v2 v2.9.3 h1:ma6hcGQw4q/lhFUTJ+E9V8/5tsIcht9i2Q4d1qo26SQ=
github.com/vektra/mockery/v2 v2.9.3/go.mod h1:2gU4Cf/f8YyC8oEaSXfCnZBMxMjMl/Ko205rlP0fO90=
github.com/viki-org/dnscache v0.0.0-20130720023526-c70c1f23c5d8/go.mod h1:dniwbG03GafCjFohMDmz6Zc6oCuiqgH6tGNyXTkHzXE=
github.com/xiang90/probing v0.0.0-20190116061207-43a291ad63a2/go.mod h1:UETIi67q53MR2AWcXfiuqkDkRtnGDLqkBTpCHuJHxtU=
github.com/xo/terminfo v0.0.0-20210125001918-ca9a967f8778/go.mod h1:2MuV+tbUrU1zIOPMxZ5EncGwgmMJsa+9ucAQZXxsObs=


@@ -52,7 +52,7 @@ func TestByzantinePrevoteEquivocation(t *testing.T) {
thisConfig := ResetConfig(fmt.Sprintf("%s_%d", testName, i))
defer os.RemoveAll(thisConfig.RootDir)
ensureDir(path.Dir(thisConfig.Consensus.WalFile()), 0700) // dir for wal
ensureDir(t, path.Dir(thisConfig.Consensus.WalFile()), 0700) // dir for wal
app := appFunc()
vals := types.TM2PB.ValidatorUpdates(state.Validators)
app.InitChain(abci.RequestInitChain{Validators: vals})


@@ -69,9 +69,10 @@ func configSetup(t *testing.T) *cfg.Config {
return config
}
func ensureDir(dir string, mode os.FileMode) {
func ensureDir(t *testing.T, dir string, mode os.FileMode) {
t.Helper()
if err := tmos.EnsureDir(dir, mode); err != nil {
panic(err)
t.Fatalf("error opening directory: %s", err)
}
}
@@ -221,18 +222,20 @@ func startTestRound(cs *State, height int64, round int32) {
// Create proposal block from cs1 but sign it with vs.
func decideProposal(
t *testing.T,
cs1 *State,
vs *validatorStub,
height int64,
round int32,
) (proposal *types.Proposal, block *types.Block) {
t.Helper()
cs1.mtx.Lock()
block, blockParts := cs1.createProposalBlock()
validRound := cs1.ValidRound
chainID := cs1.state.ChainID
cs1.mtx.Unlock()
if block == nil {
panic("Failed to createProposalBlock. Did you forget to add commit for previous block?")
t.Fatal("Failed to createProposalBlock. Did you forget to add commit for previous block?")
}
// Make proposal
@@ -240,7 +243,7 @@ func decideProposal(
proposal = types.NewProposal(height, round, polRound, propBlockID)
p := proposal.ToProto()
if err := vs.SignProposal(context.Background(), chainID, p); err != nil {
panic(err)
t.Fatalf("error signing proposal: %s", err)
}
proposal.Signature = p.Signature
@@ -267,36 +270,38 @@ func signAddVotes(
}
func validatePrevote(t *testing.T, cs *State, round int32, privVal *validatorStub, blockHash []byte) {
t.Helper()
prevotes := cs.Votes.Prevotes(round)
pubKey, err := privVal.GetPubKey(context.Background())
require.NoError(t, err)
address := pubKey.Address()
var vote *types.Vote
if vote = prevotes.GetByAddress(address); vote == nil {
panic("Failed to find prevote from validator")
t.Fatalf("Failed to find prevote from validator")
}
if blockHash == nil {
if vote.BlockID.Hash != nil {
panic(fmt.Sprintf("Expected prevote to be for nil, got %X", vote.BlockID.Hash))
t.Fatalf("Expected prevote to be for nil, got %X", vote.BlockID.Hash)
}
} else {
if !bytes.Equal(vote.BlockID.Hash, blockHash) {
panic(fmt.Sprintf("Expected prevote to be for %X, got %X", blockHash, vote.BlockID.Hash))
t.Fatalf("Expected prevote to be for %X, got %X", blockHash, vote.BlockID.Hash)
}
}
}
func validateLastPrecommit(t *testing.T, cs *State, privVal *validatorStub, blockHash []byte) {
t.Helper()
votes := cs.LastCommit
pv, err := privVal.GetPubKey(context.Background())
require.NoError(t, err)
address := pv.Address()
var vote *types.Vote
if vote = votes.GetByAddress(address); vote == nil {
panic("Failed to find precommit from validator")
t.Fatalf("Failed to find precommit from validator")
}
if !bytes.Equal(vote.BlockID.Hash, blockHash) {
panic(fmt.Sprintf("Expected precommit to be for %X, got %X", blockHash, vote.BlockID.Hash))
t.Fatalf("Expected precommit to be for %X, got %X", blockHash, vote.BlockID.Hash)
}
}
@@ -309,41 +314,42 @@ func validatePrecommit(
votedBlockHash,
lockedBlockHash []byte,
) {
t.Helper()
precommits := cs.Votes.Precommits(thisRound)
pv, err := privVal.GetPubKey(context.Background())
require.NoError(t, err)
address := pv.Address()
var vote *types.Vote
if vote = precommits.GetByAddress(address); vote == nil {
panic("Failed to find precommit from validator")
t.Fatalf("Failed to find precommit from validator")
}
if votedBlockHash == nil {
if vote.BlockID.Hash != nil {
panic("Expected precommit to be for nil")
t.Fatalf("Expected precommit to be for nil")
}
} else {
if !bytes.Equal(vote.BlockID.Hash, votedBlockHash) {
panic("Expected precommit to be for proposal block")
t.Fatalf("Expected precommit to be for proposal block")
}
}
if lockedBlockHash == nil {
if cs.LockedRound != lockRound || cs.LockedBlock != nil {
panic(fmt.Sprintf(
t.Fatalf(
"Expected to be locked on nil at round %d. Got locked at round %d with block %v",
lockRound,
cs.LockedRound,
cs.LockedBlock))
cs.LockedBlock)
}
} else {
if cs.LockedRound != lockRound || !bytes.Equal(cs.LockedBlock.Hash(), lockedBlockHash) {
panic(fmt.Sprintf(
t.Fatalf(
"Expected block to be locked on round %d, got %d. Got locked block %X, expected %X",
lockRound,
cs.LockedRound,
cs.LockedBlock.Hash(),
lockedBlockHash))
lockedBlockHash)
}
}
}
@@ -357,6 +363,7 @@ func validatePrevoteAndPrecommit(
votedBlockHash,
lockedBlockHash []byte,
) {
t.Helper()
// verify the prevote
validatePrevote(t, cs, thisRound, privVal, votedBlockHash)
// verify precommit
@@ -444,13 +451,14 @@ func newStateWithConfigAndBlockStore(
return cs
}
func loadPrivValidator(config *cfg.Config) *privval.FilePV {
func loadPrivValidator(t *testing.T, config *cfg.Config) *privval.FilePV {
t.Helper()
privValidatorKeyFile := config.PrivValidator.KeyFile()
ensureDir(filepath.Dir(privValidatorKeyFile), 0700)
ensureDir(t, filepath.Dir(privValidatorKeyFile), 0700)
privValidatorStateFile := config.PrivValidator.StateFile()
privValidator, err := privval.LoadOrGenFilePV(privValidatorKeyFile, privValidatorStateFile)
if err != nil {
panic(err)
t.Fatalf("error generating validator file: %s", err)
}
privValidator.Reset()
return privValidator
@@ -475,220 +483,238 @@ func randState(config *cfg.Config, nValidators int) (*State, []*validatorStub) {
//-------------------------------------------------------------------------------
func ensureNoNewEvent(ch <-chan tmpubsub.Message, timeout time.Duration,
func ensureNoNewEvent(t *testing.T, ch <-chan tmpubsub.Message, timeout time.Duration,
errorMessage string) {
t.Helper()
select {
case <-time.After(timeout):
break
case <-ch:
panic(errorMessage)
t.Fatalf("unexpected event: %s", errorMessage)
}
}
func ensureNoNewEventOnChannel(ch <-chan tmpubsub.Message) {
func ensureNoNewEventOnChannel(t *testing.T, ch <-chan tmpubsub.Message) {
t.Helper()
ensureNoNewEvent(
t,
ch,
ensureTimeout,
"We should be stuck waiting, not receiving new event on the channel")
}
func ensureNoNewRoundStep(stepCh <-chan tmpubsub.Message) {
func ensureNoNewRoundStep(t *testing.T, stepCh <-chan tmpubsub.Message) {
t.Helper()
ensureNoNewEvent(
t,
stepCh,
ensureTimeout,
"We should be stuck waiting, not receiving NewRoundStep event")
}
func ensureNoNewUnlock(unlockCh <-chan tmpubsub.Message) {
ensureNoNewEvent(
unlockCh,
ensureTimeout,
"We should be stuck waiting, not receiving Unlock event")
}
func ensureNoNewTimeout(stepCh <-chan tmpubsub.Message, timeout int64) {
func ensureNoNewTimeout(t *testing.T, stepCh <-chan tmpubsub.Message, timeout int64) {
t.Helper()
timeoutDuration := time.Duration(timeout*10) * time.Nanosecond
ensureNoNewEvent(
t,
stepCh,
timeoutDuration,
"We should be stuck waiting, not receiving NewTimeout event")
}
func ensureNewEvent(ch <-chan tmpubsub.Message, height int64, round int32, timeout time.Duration, errorMessage string) {
func ensureNewEvent(t *testing.T, ch <-chan tmpubsub.Message, height int64, round int32, timeout time.Duration, errorMessage string) { // nolint: lll
t.Helper()
select {
case <-time.After(timeout):
panic(errorMessage)
t.Fatalf("timed out waiting for new event: %s", errorMessage)
case msg := <-ch:
roundStateEvent, ok := msg.Data().(types.EventDataRoundState)
if !ok {
panic(fmt.Sprintf("expected a EventDataRoundState, got %T. Wrong subscription channel?",
msg.Data()))
t.Fatalf("expected a EventDataRoundState, got %T. Wrong subscription channel?", msg.Data())
}
if roundStateEvent.Height != height {
panic(fmt.Sprintf("expected height %v, got %v", height, roundStateEvent.Height))
t.Fatalf("expected height %v, got %v", height, roundStateEvent.Height)
}
if roundStateEvent.Round != round {
panic(fmt.Sprintf("expected round %v, got %v", round, roundStateEvent.Round))
t.Fatalf("expected round %v, got %v", round, roundStateEvent.Round)
}
// TODO: We could check also for a step at this point!
}
}
func ensureNewRound(roundCh <-chan tmpubsub.Message, height int64, round int32) {
func ensureNewRound(t *testing.T, roundCh <-chan tmpubsub.Message, height int64, round int32) {
t.Helper()
select {
case <-time.After(ensureTimeout):
panic("Timeout expired while waiting for NewRound event")
t.Fatal("Timeout expired while waiting for NewRound event")
case msg := <-roundCh:
newRoundEvent, ok := msg.Data().(types.EventDataNewRound)
if !ok {
panic(fmt.Sprintf("expected a EventDataNewRound, got %T. Wrong subscription channel?",
msg.Data()))
t.Fatalf("expected a EventDataNewRound, got %T. Wrong subscription channel?", msg.Data())
}
if newRoundEvent.Height != height {
panic(fmt.Sprintf("expected height %v, got %v", height, newRoundEvent.Height))
t.Fatalf("expected height %v, got %v", height, newRoundEvent.Height)
}
if newRoundEvent.Round != round {
panic(fmt.Sprintf("expected round %v, got %v", round, newRoundEvent.Round))
t.Fatalf("expected round %v, got %v", round, newRoundEvent.Round)
}
}
}
func ensureNewTimeout(timeoutCh <-chan tmpubsub.Message, height int64, round int32, timeout int64) {
func ensureNewTimeout(t *testing.T, timeoutCh <-chan tmpubsub.Message, height int64, round int32, timeout int64) {
t.Helper()
timeoutDuration := time.Duration(timeout*10) * time.Nanosecond
ensureNewEvent(timeoutCh, height, round, timeoutDuration,
ensureNewEvent(t, timeoutCh, height, round, timeoutDuration,
"Timeout expired while waiting for NewTimeout event")
}
func ensureNewProposal(proposalCh <-chan tmpubsub.Message, height int64, round int32) {
func ensureNewProposal(t *testing.T, proposalCh <-chan tmpubsub.Message, height int64, round int32) {
t.Helper()
select {
case <-time.After(ensureTimeout):
panic("Timeout expired while waiting for NewProposal event")
t.Fatalf("Timeout expired while waiting for NewProposal event")
case msg := <-proposalCh:
proposalEvent, ok := msg.Data().(types.EventDataCompleteProposal)
if !ok {
panic(fmt.Sprintf("expected a EventDataCompleteProposal, got %T. Wrong subscription channel?",
msg.Data()))
t.Fatalf("expected a EventDataCompleteProposal, got %T. Wrong subscription channel?",
msg.Data())
}
if proposalEvent.Height != height {
panic(fmt.Sprintf("expected height %v, got %v", height, proposalEvent.Height))
t.Fatalf("expected height %v, got %v", height, proposalEvent.Height)
}
if proposalEvent.Round != round {
panic(fmt.Sprintf("expected round %v, got %v", round, proposalEvent.Round))
t.Fatalf("expected round %v, got %v", round, proposalEvent.Round)
}
}
}
func ensureNewValidBlock(validBlockCh <-chan tmpubsub.Message, height int64, round int32) {
ensureNewEvent(validBlockCh, height, round, ensureTimeout,
func ensureNewValidBlock(t *testing.T, validBlockCh <-chan tmpubsub.Message, height int64, round int32) {
t.Helper()
ensureNewEvent(t, validBlockCh, height, round, ensureTimeout,
"Timeout expired while waiting for NewValidBlock event")
}
func ensureNewBlock(blockCh <-chan tmpubsub.Message, height int64) {
func ensureNewBlock(t *testing.T, blockCh <-chan tmpubsub.Message, height int64) {
t.Helper()
select {
case <-time.After(ensureTimeout):
panic("Timeout expired while waiting for NewBlock event")
t.Fatalf("Timeout expired while waiting for NewBlock event")
case msg := <-blockCh:
blockEvent, ok := msg.Data().(types.EventDataNewBlock)
if !ok {
panic(fmt.Sprintf("expected a EventDataNewBlock, got %T. Wrong subscription channel?",
msg.Data()))
t.Fatalf("expected a EventDataNewBlock, got %T. Wrong subscription channel?",
msg.Data())
}
if blockEvent.Block.Height != height {
panic(fmt.Sprintf("expected height %v, got %v", height, blockEvent.Block.Height))
t.Fatalf("expected height %v, got %v", height, blockEvent.Block.Height)
}
}
}
func ensureNewBlockHeader(blockCh <-chan tmpubsub.Message, height int64, blockHash tmbytes.HexBytes) {
func ensureNewBlockHeader(t *testing.T, blockCh <-chan tmpubsub.Message, height int64, blockHash tmbytes.HexBytes) {
t.Helper()
select {
case <-time.After(ensureTimeout):
panic("Timeout expired while waiting for NewBlockHeader event")
t.Fatalf("Timeout expired while waiting for NewBlockHeader event")
case msg := <-blockCh:
blockHeaderEvent, ok := msg.Data().(types.EventDataNewBlockHeader)
if !ok {
panic(fmt.Sprintf("expected a EventDataNewBlockHeader, got %T. Wrong subscription channel?",
msg.Data()))
t.Fatalf("expected a EventDataNewBlockHeader, got %T. Wrong subscription channel?",
msg.Data())
}
if blockHeaderEvent.Header.Height != height {
panic(fmt.Sprintf("expected height %v, got %v", height, blockHeaderEvent.Header.Height))
t.Fatalf("expected height %v, got %v", height, blockHeaderEvent.Header.Height)
}
if !bytes.Equal(blockHeaderEvent.Header.Hash(), blockHash) {
panic(fmt.Sprintf("expected header %X, got %X", blockHash, blockHeaderEvent.Header.Hash()))
t.Fatalf("expected header %X, got %X", blockHash, blockHeaderEvent.Header.Hash())
}
}
}
func ensureNewUnlock(unlockCh <-chan tmpubsub.Message, height int64, round int32) {
ensureNewEvent(unlockCh, height, round, ensureTimeout,
"Timeout expired while waiting for NewUnlock event")
func ensureLock(t *testing.T, lockCh <-chan tmpubsub.Message, height int64, round int32) {
t.Helper()
ensureNewEvent(t, lockCh, height, round, ensureTimeout,
"Timeout expired while waiting for LockValue event")
}
func ensureProposal(proposalCh <-chan tmpubsub.Message, height int64, round int32, propID types.BlockID) {
func ensureRelock(t *testing.T, relockCh <-chan tmpubsub.Message, height int64, round int32) {
t.Helper()
ensureNewEvent(t, relockCh, height, round, ensureTimeout,
"Timeout expired while waiting for RelockValue event")
}
func ensureProposal(t *testing.T, proposalCh <-chan tmpubsub.Message, height int64, round int32, propID types.BlockID) {
t.Helper()
select {
case <-time.After(ensureTimeout):
panic("Timeout expired while waiting for NewProposal event")
t.Fatalf("Timeout expired while waiting for NewProposal event")
case msg := <-proposalCh:
proposalEvent, ok := msg.Data().(types.EventDataCompleteProposal)
if !ok {
panic(fmt.Sprintf("expected a EventDataCompleteProposal, got %T. Wrong subscription channel?",
msg.Data()))
t.Fatalf("expected a EventDataCompleteProposal, got %T. Wrong subscription channel?",
msg.Data())
}
if proposalEvent.Height != height {
panic(fmt.Sprintf("expected height %v, got %v", height, proposalEvent.Height))
t.Fatalf("expected height %v, got %v", height, proposalEvent.Height)
}
if proposalEvent.Round != round {
panic(fmt.Sprintf("expected round %v, got %v", round, proposalEvent.Round))
t.Fatalf("expected round %v, got %v", round, proposalEvent.Round)
}
if !proposalEvent.BlockID.Equals(propID) {
panic(fmt.Sprintf("Proposed block does not match expected block (%v != %v)", proposalEvent.BlockID, propID))
t.Fatalf("Proposed block does not match expected block (%v != %v)", proposalEvent.BlockID, propID)
}
}
}
func ensurePrecommit(voteCh <-chan tmpubsub.Message, height int64, round int32) {
ensureVote(voteCh, height, round, tmproto.PrecommitType)
func ensurePrecommit(t *testing.T, voteCh <-chan tmpubsub.Message, height int64, round int32) {
t.Helper()
ensureVote(t, voteCh, height, round, tmproto.PrecommitType)
}
func ensurePrevote(voteCh <-chan tmpubsub.Message, height int64, round int32) {
ensureVote(voteCh, height, round, tmproto.PrevoteType)
func ensurePrevote(t *testing.T, voteCh <-chan tmpubsub.Message, height int64, round int32) {
t.Helper()
ensureVote(t, voteCh, height, round, tmproto.PrevoteType)
}
func ensureVote(voteCh <-chan tmpubsub.Message, height int64, round int32,
func ensureVote(t *testing.T, voteCh <-chan tmpubsub.Message, height int64, round int32,
voteType tmproto.SignedMsgType) {
t.Helper()
select {
case <-time.After(ensureTimeout):
panic("Timeout expired while waiting for NewVote event")
t.Fatalf("Timeout expired while waiting for NewVote event")
case msg := <-voteCh:
voteEvent, ok := msg.Data().(types.EventDataVote)
if !ok {
panic(fmt.Sprintf("expected a EventDataVote, got %T. Wrong subscription channel?",
msg.Data()))
t.Fatalf("expected a EventDataVote, got %T. Wrong subscription channel?",
msg.Data())
}
vote := voteEvent.Vote
if vote.Height != height {
panic(fmt.Sprintf("expected height %v, got %v", height, vote.Height))
t.Fatalf("expected height %v, got %v", height, vote.Height)
}
if vote.Round != round {
panic(fmt.Sprintf("expected round %v, got %v", round, vote.Round))
t.Fatalf("expected round %v, got %v", round, vote.Round)
}
if vote.Type != voteType {
panic(fmt.Sprintf("expected type %v, got %v", voteType, vote.Type))
t.Fatalf("expected type %v, got %v", voteType, vote.Type)
}
}
}
func ensurePrecommitTimeout(ch <-chan tmpubsub.Message) {
func ensurePrecommitTimeout(t *testing.T, ch <-chan tmpubsub.Message) {
t.Helper()
select {
case <-time.After(ensureTimeout):
panic("Timeout expired while waiting for the Precommit to Timeout")
t.Fatalf("Timeout expired while waiting for the Precommit to Timeout")
case <-ch:
}
}
func ensureNewEventOnChannel(ch <-chan tmpubsub.Message) {
func ensureNewEventOnChannel(t *testing.T, ch <-chan tmpubsub.Message) {
t.Helper()
select {
case <-time.After(ensureTimeout):
panic("Timeout expired while waiting for new activity on the channel")
t.Fatalf("Timeout expired while waiting for new activity on the channel")
case <-ch:
}
}
@@ -711,6 +737,7 @@ func randConsensusState(
appFunc func() abci.Application,
configOpts ...func(*cfg.Config),
) ([]*State, cleanupFunc) {
t.Helper()
genDoc, privVals := factory.RandGenesisDoc(config, nValidators, false, 30)
css := make([]*State, nValidators)
@@ -731,7 +758,7 @@ func randConsensusState(
opt(thisConfig)
}
ensureDir(filepath.Dir(thisConfig.Consensus.WalFile()), 0700) // dir for wal
ensureDir(t, filepath.Dir(thisConfig.Consensus.WalFile()), 0700) // dir for wal
app := appFunc()
@@ -759,6 +786,7 @@ func randConsensusState(
// nPeers = nValidators + nNotValidator
func randConsensusNetWithPeers(
t *testing.T,
config *cfg.Config,
nValidators,
nPeers int,
@@ -768,6 +796,7 @@ func randConsensusNetWithPeers(
) ([]*State, *types.GenesisDoc, *cfg.Config, cleanupFunc) {
genDoc, privVals := factory.RandGenesisDoc(config, nValidators, false, testMinPower)
css := make([]*State, nPeers)
t.Helper()
logger := consensusLogger()
var peer0Config *cfg.Config
@@ -776,7 +805,7 @@ func randConsensusNetWithPeers(
state, _ := sm.MakeGenesisState(genDoc)
thisConfig := ResetConfig(fmt.Sprintf("%s_%d", testName, i))
configRootDirs = append(configRootDirs, thisConfig.RootDir)
ensureDir(filepath.Dir(thisConfig.Consensus.WalFile()), 0700) // dir for wal
ensureDir(t, filepath.Dir(thisConfig.Consensus.WalFile()), 0700) // dir for wal
if i == 0 {
peer0Config = thisConfig
}
@@ -786,16 +815,16 @@ func randConsensusNetWithPeers(
} else {
tempKeyFile, err := ioutil.TempFile("", "priv_validator_key_")
if err != nil {
panic(err)
t.Fatalf("error creating temp file for validator key: %s", err)
}
tempStateFile, err := ioutil.TempFile("", "priv_validator_state_")
if err != nil {
panic(err)
t.Fatalf("error creating temp file for validator state: %s", err)
}
privVal, err = privval.GenFilePV(tempKeyFile.Name(), tempStateFile.Name(), "")
if err != nil {
panic(err)
t.Fatalf("error generating validator key: %s", err)
}
}


@@ -40,12 +40,12 @@ func TestMempoolNoProgressUntilTxsAvailable(t *testing.T) {
newBlockCh := subscribe(cs.eventBus, types.EventQueryNewBlock)
startTestRound(cs, height, round)
ensureNewEventOnChannel(newBlockCh) // first block gets committed
ensureNoNewEventOnChannel(newBlockCh)
ensureNewEventOnChannel(t, newBlockCh) // first block gets committed
ensureNoNewEventOnChannel(t, newBlockCh)
deliverTxsRange(cs, 0, 1)
ensureNewEventOnChannel(newBlockCh) // commit txs
ensureNewEventOnChannel(newBlockCh) // commit updated app hash
ensureNoNewEventOnChannel(newBlockCh)
ensureNewEventOnChannel(t, newBlockCh) // commit txs
ensureNewEventOnChannel(t, newBlockCh) // commit updated app hash
ensureNoNewEventOnChannel(t, newBlockCh)
}
func TestMempoolProgressAfterCreateEmptyBlocksInterval(t *testing.T) {
@@ -63,9 +63,9 @@ func TestMempoolProgressAfterCreateEmptyBlocksInterval(t *testing.T) {
newBlockCh := subscribe(cs.eventBus, types.EventQueryNewBlock)
startTestRound(cs, cs.Height, cs.Round)
ensureNewEventOnChannel(newBlockCh) // first block gets committed
ensureNoNewEventOnChannel(newBlockCh) // then we don't make a block ...
ensureNewEventOnChannel(newBlockCh) // until the CreateEmptyBlocksInterval has passed
ensureNewEventOnChannel(t, newBlockCh) // first block gets committed
ensureNoNewEventOnChannel(t, newBlockCh) // then we don't make a block ...
ensureNewEventOnChannel(t, newBlockCh) // until the CreateEmptyBlocksInterval has passed
}
func TestMempoolProgressInHigherRound(t *testing.T) {
@@ -93,19 +93,19 @@ func TestMempoolProgressInHigherRound(t *testing.T) {
}
startTestRound(cs, height, round)
ensureNewRound(newRoundCh, height, round) // first round at first height
ensureNewEventOnChannel(newBlockCh) // first block gets committed
ensureNewRound(t, newRoundCh, height, round) // first round at first height
ensureNewEventOnChannel(t, newBlockCh) // first block gets committed
height++ // moving to the next height
round = 0
ensureNewRound(newRoundCh, height, round) // first round at next height
deliverTxsRange(cs, 0, 1) // we deliver txs, but don't set a proposal so we get the next round
ensureNewTimeout(timeoutCh, height, round, cs.config.TimeoutPropose.Nanoseconds())
ensureNewRound(t, newRoundCh, height, round) // first round at next height
deliverTxsRange(cs, 0, 1) // we deliver txs, but don't set a proposal so we get the next round
ensureNewTimeout(t, timeoutCh, height, round, cs.config.TimeoutPropose.Nanoseconds())
round++ // moving to the next round
ensureNewRound(newRoundCh, height, round) // wait for the next round
ensureNewEventOnChannel(newBlockCh) // now we can commit the block
round++ // moving to the next round
ensureNewRound(t, newRoundCh, height, round) // wait for the next round
ensureNewEventOnChannel(t, newBlockCh) // now we can commit the block
}
func deliverTxsRange(cs *State, start, end int) {


@@ -336,7 +336,7 @@ func TestReactorWithEvidence(t *testing.T) {
defer os.RemoveAll(thisConfig.RootDir)
ensureDir(path.Dir(thisConfig.Consensus.WalFile()), 0700) // dir for wal
ensureDir(t, path.Dir(thisConfig.Consensus.WalFile()), 0700) // dir for wal
app := appFunc()
vals := types.TM2PB.ValidatorUpdates(state.Validators)
app.InitChain(abci.RequestInitChain{Validators: vals})
@@ -627,6 +627,7 @@ func TestReactorValidatorSetChanges(t *testing.T) {
nPeers := 7
nVals := 4
states, _, _, cleanup := randConsensusNetWithPeers(
t,
config,
nVals,
nPeers,


@@ -58,7 +58,7 @@ func startNewStateAndWaitForBlock(t *testing.T, consensusReplayConfig *cfg.Confi
logger := log.TestingLogger()
state, err := sm.MakeGenesisStateFromFile(consensusReplayConfig.GenesisFile())
require.NoError(t, err)
privValidator := loadPrivValidator(consensusReplayConfig)
privValidator := loadPrivValidator(t, consensusReplayConfig)
blockStore := store.NewBlockStore(dbm.NewMemDB())
cs := newStateWithConfigAndBlockStore(
consensusReplayConfig,
@@ -154,7 +154,7 @@ LOOP:
blockStore := store.NewBlockStore(blockDB)
state, err := sm.MakeGenesisStateFromFile(consensusReplayConfig.GenesisFile())
require.NoError(t, err)
privValidator := loadPrivValidator(consensusReplayConfig)
privValidator := loadPrivValidator(t, consensusReplayConfig)
cs := newStateWithConfigAndBlockStore(
consensusReplayConfig,
state,
@@ -321,6 +321,7 @@ func setupSimulator(t *testing.T) *simulatorTestSuite {
nVals := 4
css, genDoc, config, cleanup := randConsensusNetWithPeers(
t,
config,
nVals,
nPeers,
@@ -345,15 +346,15 @@ func setupSimulator(t *testing.T) *simulatorTestSuite {
// start the machine
startTestRound(css[0], height, round)
incrementHeight(vss...)
ensureNewRound(newRoundCh, height, 0)
ensureNewProposal(proposalCh, height, round)
ensureNewRound(t, newRoundCh, height, 0)
ensureNewProposal(t, proposalCh, height, round)
rs := css[0].GetRoundState()
signAddVotes(sim.Config, css[0], tmproto.PrecommitType,
rs.ProposalBlock.Hash(), rs.ProposalBlockParts.Header(),
vss[1:nVals]...)
ensureNewRound(newRoundCh, height+1, 0)
ensureNewRound(t, newRoundCh, height+1, 0)
// HEIGHT 2
height++
@@ -380,12 +381,12 @@ func setupSimulator(t *testing.T) *simulatorTestSuite {
if err := css[0].SetProposalAndBlock(proposal, propBlock, propBlockParts, "some peer"); err != nil {
t.Fatal(err)
}
ensureNewProposal(proposalCh, height, round)
ensureNewProposal(t, proposalCh, height, round)
rs = css[0].GetRoundState()
signAddVotes(sim.Config, css[0], tmproto.PrecommitType,
rs.ProposalBlock.Hash(), rs.ProposalBlockParts.Header(),
vss[1:nVals]...)
ensureNewRound(newRoundCh, height+1, 0)
ensureNewRound(t, newRoundCh, height+1, 0)
// HEIGHT 3
height++
@@ -412,12 +413,12 @@ func setupSimulator(t *testing.T) *simulatorTestSuite {
if err := css[0].SetProposalAndBlock(proposal, propBlock, propBlockParts, "some peer"); err != nil {
t.Fatal(err)
}
ensureNewProposal(proposalCh, height, round)
ensureNewProposal(t, proposalCh, height, round)
rs = css[0].GetRoundState()
signAddVotes(sim.Config, css[0], tmproto.PrecommitType,
rs.ProposalBlock.Hash(), rs.ProposalBlockParts.Header(),
vss[1:nVals]...)
ensureNewRound(newRoundCh, height+1, 0)
ensureNewRound(t, newRoundCh, height+1, 0)
// HEIGHT 4
height++
@@ -471,7 +472,7 @@ func setupSimulator(t *testing.T) *simulatorTestSuite {
if err := css[0].SetProposalAndBlock(proposal, propBlock, propBlockParts, "some peer"); err != nil {
t.Fatal(err)
}
ensureNewProposal(proposalCh, height, round)
ensureNewProposal(t, proposalCh, height, round)
removeValidatorTx2 := kvstore.MakeValSetChangeTx(newVal2ABCI, 0)
err = assertMempool(css[0].txNotifier).CheckTx(context.Background(), removeValidatorTx2, nil, mempl.TxInfo{})
@@ -487,7 +488,7 @@ func setupSimulator(t *testing.T) *simulatorTestSuite {
rs.ProposalBlockParts.Header(), newVss[i])
}
ensureNewRound(newRoundCh, height+1, 0)
ensureNewRound(t, newRoundCh, height+1, 0)
// HEIGHT 5
height++
@@ -497,7 +498,7 @@ func setupSimulator(t *testing.T) *simulatorTestSuite {
newVss[newVssIdx].VotingPower = 25
sort.Sort(ValidatorStubsByPower(newVss))
selfIndex = valIndexFn(0)
ensureNewProposal(proposalCh, height, round)
ensureNewProposal(t, proposalCh, height, round)
rs = css[0].GetRoundState()
for i := 0; i < nVals+1; i++ {
if i == selfIndex {
@@ -507,7 +508,7 @@ func setupSimulator(t *testing.T) *simulatorTestSuite {
tmproto.PrecommitType, rs.ProposalBlock.Hash(),
rs.ProposalBlockParts.Header(), newVss[i])
}
ensureNewRound(newRoundCh, height+1, 0)
ensureNewRound(t, newRoundCh, height+1, 0)
// HEIGHT 6
height++
@@ -534,7 +535,7 @@ func setupSimulator(t *testing.T) *simulatorTestSuite {
if err := css[0].SetProposalAndBlock(proposal, propBlock, propBlockParts, "some peer"); err != nil {
t.Fatal(err)
}
ensureNewProposal(proposalCh, height, round)
ensureNewProposal(t, proposalCh, height, round)
rs = css[0].GetRoundState()
for i := 0; i < nVals+3; i++ {
if i == selfIndex {
@@ -544,7 +545,7 @@ func setupSimulator(t *testing.T) *simulatorTestSuite {
tmproto.PrecommitType, rs.ProposalBlock.Hash(),
rs.ProposalBlockParts.Header(), newVss[i])
}
ensureNewRound(newRoundCh, height+1, 0)
ensureNewRound(t, newRoundCh, height+1, 0)
sim.Chain = make([]*types.Block, 0)
sim.Commits = make([]*types.Commit, 0)


@@ -137,7 +137,7 @@ type State struct {
done chan struct{}
// synchronous pubsub between consensus state and reactor.
// state only emits EventNewRoundStep and EventVote
// state only emits EventNewRoundStep, EventValidBlock, and EventVote
evsw tmevents.EventSwitch
// for reporting metrics
@@ -1361,7 +1361,6 @@ func (cs *State) enterPrevoteWait(height int64, round int32) {
// Enter: `timeoutPrecommit` after any +2/3 precommits.
// Enter: +2/3 precommits for block or nil.
// Lock & precommit the ProposalBlock if we have enough prevotes for it (a POL in this round)
// else, unlock an existing lock and precommit nil if +2/3 of prevotes were nil,
// else, precommit nil otherwise.
func (cs *State) enterPrecommit(height int64, round int32) {
logger := cs.Logger.With("height", height, "round", round)
@@ -1408,21 +1407,9 @@ func (cs *State) enterPrecommit(height int64, round int32) {
panic(fmt.Sprintf("this POLRound should be %v but got %v", round, polRound))
}
// +2/3 prevoted nil. Unlock and precommit nil.
if len(blockID.Hash) == 0 {
if cs.LockedBlock == nil {
logger.Debug("precommit step; +2/3 prevoted for nil")
} else {
logger.Debug("precommit step; +2/3 prevoted for nil; unlocking")
cs.LockedRound = -1
cs.LockedBlock = nil
cs.LockedBlockParts = nil
if err := cs.eventBus.PublishEventUnlock(cs.RoundStateEvent()); err != nil {
logger.Error("failed publishing event unlock", "err", err)
}
}
// +2/3 prevoted nil. Precommit nil.
if blockID.IsNil() {
logger.Debug("precommit step; +2/3 prevoted for nil")
cs.signAddVote(tmproto.PrecommitType, nil, types.PartSetHeader{})
return
}
@@ -1442,7 +1429,9 @@ func (cs *State) enterPrecommit(height int64, round int32) {
return
}
// If +2/3 prevoted for proposal block, stage and precommit it
// If greater than 2/3 of the voting power on the network prevoted for
// the proposed block, update our locked block to this block and issue a
// precommit vote for it.
if cs.ProposalBlock.HashesTo(blockID.Hash) {
logger.Debug("precommit step; +2/3 prevoted proposal block; locking", "hash", blockID.Hash)
@@ -1464,23 +1453,14 @@ func (cs *State) enterPrecommit(height int64, round int32) {
}
// There was a polka in this round for a block we don't have.
// Fetch that block, unlock, and precommit nil.
// The +2/3 prevotes for this round is the POL for our unlock.
// Fetch that block, and precommit nil.
logger.Debug("precommit step; +2/3 prevotes for a block we do not have; voting nil", "block_id", blockID)
cs.LockedRound = -1
cs.LockedBlock = nil
cs.LockedBlockParts = nil
if !cs.ProposalBlockParts.HasHeader(blockID.PartSetHeader) {
cs.ProposalBlock = nil
cs.ProposalBlockParts = types.NewPartSetFromHeader(blockID.PartSetHeader)
}
if err := cs.eventBus.PublishEventUnlock(cs.RoundStateEvent()); err != nil {
logger.Error("failed publishing event unlock", "err", err)
}
cs.signAddVote(tmproto.PrecommitType, nil, types.PartSetHeader{})
}
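
The `enterPrecommit` hunks above remove every path that cleared the lock during the precommit step. Below is a rough sketch of the resulting decision, under heavy simplification. `roundState` and `precommitFor` are hypothetical names, blocks are reduced to strings, and the "already locked on this block" branch is assumed from the part of the function that the hunks do not show.

```go
package main

import "fmt"

// roundState is a hypothetical, heavily simplified stand-in for the
// consensus State. Blocks are identified by plain strings, and the empty
// string stands for nil.
type roundState struct {
	lockedBlock   string
	proposalBlock string
}

// precommitFor returns the block to precommit ("" means nil) given the
// block that received +2/3 prevotes in the current round. No branch clears
// an existing lock.
func (rs *roundState) precommitFor(polka string) string {
	switch {
	case polka == "":
		// +2/3 prevoted nil: precommit nil, keep whatever lock we hold.
		return ""
	case polka == rs.lockedBlock:
		// Assumed from the surrounding function, not shown in the hunks:
		// a polka for the block we are locked on is precommitted again.
		return polka
	case polka == rs.proposalBlock:
		// Polka for the proposal block: lock on it and precommit it.
		rs.lockedBlock = polka
		return polka
	default:
		// Polka for a block we do not have: precommit nil, keep the lock.
		return ""
	}
}

func main() {
	rs := &roundState{lockedBlock: "A", proposalBlock: "B"}
	for _, polka := range []string{"", "C", "B"} {
		vote := rs.precommitFor(polka)
		fmt.Printf("polka %q -> precommit %q, locked on %q\n", polka, vote, rs.lockedBlock)
	}
}
```

In this model the lock only ever moves to a block that received a polka in the current round; a nil polka or a polka for an unknown block leaves it untouched.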
@@ -1588,7 +1568,7 @@ func (cs *State) tryFinalizeCommit(height int64) {
}
blockID, ok := cs.Votes.Precommits(cs.CommitRound).TwoThirdsMajority()
if !ok || len(blockID.Hash) == 0 {
if !ok || blockID.IsNil() {
logger.Error("failed attempt to finalize commit; there was no +2/3 majority or +2/3 was for nil")
return
}
@@ -1921,7 +1901,7 @@ func (cs *State) addProposalBlockPart(msg *BlockPartMessage, peerID types.NodeID
// Update Valid* if we can.
prevotes := cs.Votes.Prevotes(cs.Round)
blockID, hasTwoThirds := prevotes.TwoThirdsMajority()
if hasTwoThirds && !blockID.IsZero() && (cs.ValidRound < cs.Round) {
if hasTwoThirds && !blockID.IsNil() && (cs.ValidRound < cs.Round) {
if cs.ProposalBlock.HashesTo(blockID.Hash) {
cs.Logger.Debug(
"updating valid block to new proposal block",
@@ -2070,33 +2050,13 @@ func (cs *State) addVote(vote *types.Vote, peerID types.NodeID) (added bool, err
prevotes := cs.Votes.Prevotes(vote.Round)
cs.Logger.Debug("added vote to prevote", "vote", vote, "prevotes", prevotes.StringShort())
// If +2/3 prevotes for a block or nil for *any* round:
if blockID, ok := prevotes.TwoThirdsMajority(); ok {
// There was a polka!
// If we're locked but this is a recent polka, unlock.
// If it matches our ProposalBlock, update the ValidBlock
// Unlock if `cs.LockedRound < vote.Round <= cs.Round`
// NOTE: If vote.Round > cs.Round, we'll deal with it when we get to vote.Round
if (cs.LockedBlock != nil) &&
(cs.LockedRound < vote.Round) &&
(vote.Round <= cs.Round) &&
!cs.LockedBlock.HashesTo(blockID.Hash) {
cs.Logger.Debug("unlocking because of POL", "locked_round", cs.LockedRound, "pol_round", vote.Round)
cs.LockedRound = -1
cs.LockedBlock = nil
cs.LockedBlockParts = nil
if err := cs.eventBus.PublishEventUnlock(cs.RoundStateEvent()); err != nil {
return added, err
}
}
// Check to see if >2/3 of the voting power on the network voted for any non-nil block.
if blockID, ok := prevotes.TwoThirdsMajority(); ok && !blockID.IsNil() {
// Greater than 2/3 of the voting power on the network voted for some
// non-nil block
// Update Valid* if we can.
// NOTE: our proposal block may be nil or not what received a polka.
if len(blockID.Hash) != 0 && (cs.ValidRound < vote.Round) && (vote.Round == cs.Round) {
if cs.ValidRound < vote.Round && vote.Round == cs.Round {
if cs.ProposalBlock.HashesTo(blockID.Hash) {
cs.Logger.Debug("updating valid block because of POL", "valid_round", cs.ValidRound, "pol_round", vote.Round)
cs.ValidRound = vote.Round
@@ -2132,7 +2092,7 @@ func (cs *State) addVote(vote *types.Vote, peerID types.NodeID) (added bool, err
case cs.Round == vote.Round && cstypes.RoundStepPrevote <= cs.Step: // current round
blockID, ok := prevotes.TwoThirdsMajority()
if ok && (cs.isProposalComplete() || len(blockID.Hash) == 0) {
if ok && (cs.isProposalComplete() || blockID.IsNil()) {
cs.enterPrecommit(height, vote.Round)
} else if prevotes.HasTwoThirdsAny() {
cs.enterPrevoteWait(height, vote.Round)
@@ -2160,7 +2120,7 @@ func (cs *State) addVote(vote *types.Vote, peerID types.NodeID) (added bool, err
cs.enterNewRound(height, vote.Round)
cs.enterPrecommit(height, vote.Round)
if len(blockID.Hash) != 0 {
if !blockID.IsNil() {
cs.enterCommit(height, vote.Round)
if cs.config.SkipTimeoutCommit && precommits.HasAll() {
cs.enterNewRound(cs.Height, 0)

File diff suppressed because it is too large


@@ -16,7 +16,7 @@ type wrappedEnvelope struct {
size uint
}
// assert the WDDR scheduler implements the queue interface at compile-time
// assert the WDRR scheduler implements the queue interface at compile-time
var _ queue = (*wdrrScheduler)(nil)
// wdrrQueue implements a Weighted Deficit Round Robin (WDRR) scheduling


@@ -55,7 +55,7 @@ func MakeHeader(h *types.Header) (*types.Header, error) {
if h.Height == 0 {
h.Height = 1
}
if h.LastBlockID.IsZero() {
if h.LastBlockID.IsNil() {
h.LastBlockID = MakeBlockID()
}
if h.ChainID == "" {


@@ -379,6 +379,7 @@ func (c *Client) Update(ctx context.Context, now time.Time) (*types.LightBlock,
return nil, err
}
// If there is a new light block then verify it
if latestBlock.Height > lastTrustedHeight {
err = c.verifyLightBlock(ctx, latestBlock, now)
if err != nil {
@@ -388,7 +389,8 @@ func (c *Client) Update(ctx context.Context, now time.Time) (*types.LightBlock,
return latestBlock, nil
}
return nil, nil
// else return the latestTrustedBlock
return c.latestTrustedBlock, nil
}
// VerifyLightBlockAtHeight fetches the light block at the given height
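
The `Update` hunk above replaces a `nil, nil` return with the latest trusted block. A minimal sketch of why that matters to callers follows. `lightBlock`, `client`, and `update` are hypothetical, pared-down stand-ins for the real types, and verification is elided.

```go
package main

import "fmt"

// lightBlock is a hypothetical, pared-down stand-in for types.LightBlock.
type lightBlock struct {
	Height int64
}

// client is a hypothetical stand-in for the light client; it only tracks
// the latest trusted block.
type client struct {
	latestTrusted *lightBlock
}

// update mirrors the control flow of the patched Update method: a newer
// block is accepted (verification is elided in this sketch), otherwise the
// latest trusted block is returned instead of a nil block with a nil error.
func (c *client) update(latest *lightBlock) (*lightBlock, error) {
	if latest.Height > c.latestTrusted.Height {
		c.latestTrusted = latest
		return latest, nil
	}
	return c.latestTrusted, nil
}

func main() {
	c := &client{latestTrusted: &lightBlock{Height: 5}}

	// No newer block is available: the caller still gets a usable block.
	b, err := c.update(&lightBlock{Height: 5})
	fmt.Println(b.Height, err)

	// A newer block is available: it becomes the latest trusted block.
	b, err = c.update(&lightBlock{Height: 7})
	fmt.Println(b.Height, err)
}
```

With the earlier `nil, nil` return, a caller that checked only `err` and then read `b.Height` would dereference a nil pointer whenever no newer block existed.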


@@ -644,7 +644,7 @@ func TestClientReplacesPrimaryWithWitnessIfPrimaryIsUnavailable(t *testing.T) {
chainID,
trustOptions,
mockDeadNode,
[]provider.Provider{mockFullNode, mockFullNode},
[]provider.Provider{mockDeadNode, mockFullNode},
dbs.New(dbm.NewMemDB()),
light.Logger(log.TestingLogger()),
)
@@ -663,6 +663,32 @@ func TestClientReplacesPrimaryWithWitnessIfPrimaryIsUnavailable(t *testing.T) {
mockFullNode.AssertExpectations(t)
}
func TestClientReplacesPrimaryWithWitnessIfPrimaryDoesntHaveBlock(t *testing.T) {
mockFullNode := &provider_mocks.Provider{}
mockFullNode.On("LightBlock", mock.Anything, mock.Anything).Return(l1, nil)
mockDeadNode := &provider_mocks.Provider{}
mockDeadNode.On("LightBlock", mock.Anything, mock.Anything).Return(nil, provider.ErrLightBlockNotFound)
c, err := light.NewClient(
ctx,
chainID,
trustOptions,
mockDeadNode,
[]provider.Provider{mockDeadNode, mockFullNode},
dbs.New(dbm.NewMemDB()),
light.Logger(log.TestingLogger()),
)
require.NoError(t, err)
_, err = c.Update(ctx, bTime.Add(2*time.Hour))
require.NoError(t, err)
// we should still have the dead node as a witness because it
// hasn't repeatedly been unresponsive yet
assert.Equal(t, 2, len(c.Witnesses()))
mockDeadNode.AssertExpectations(t)
mockFullNode.AssertExpectations(t)
}
func TestClient_BackwardsVerification(t *testing.T) {
{
headers, vals, _ := genLightBlocksWithKeys(chainID, 9, 3, 0, bTime)


@@ -341,7 +341,7 @@ func (c *Client) Block(ctx context.Context, height *int64) (*ctypes.ResultBlock,
}
// BlockByHash calls rpcclient#BlockByHash and then verifies the result.
func (c *Client) BlockByHash(ctx context.Context, hash []byte) (*ctypes.ResultBlock, error) {
func (c *Client) BlockByHash(ctx context.Context, hash tmbytes.HexBytes) (*ctypes.ResultBlock, error) {
res, err := c.next.BlockByHash(ctx, hash)
if err != nil {
return nil, err
@@ -454,7 +454,7 @@ func (c *Client) Commit(ctx context.Context, height *int64) (*ctypes.ResultCommi
// Tx calls rpcclient#Tx method and then verifies the proof if such was
// requested.
func (c *Client) Tx(ctx context.Context, hash []byte, prove bool) (*ctypes.ResultTx, error) {
func (c *Client) Tx(ctx context.Context, hash tmbytes.HexBytes, prove bool) (*ctypes.ResultTx, error) {
res, err := c.next.Tx(ctx, hash, prove)
if err != nil || !prove {
return res, err


@@ -220,7 +220,7 @@ func makeNode(config *cfg.Config,
// Determine whether we should do block sync. This must happen after the handshake, since the
// app may modify the validator set, specifying ourself as the only validator.
blockSync := config.FastSyncMode && !onlyValidatorIsUs(state, pubKey)
blockSync := config.BlockSync.Enable && !onlyValidatorIsUs(state, pubKey)
logNodeStartupInfo(state, pubKey, logger, consensusLogger, config.Mode)
@@ -702,7 +702,11 @@ func (n *nodeImpl) OnStart() error {
n.Logger.Info("starting state sync")
state, err := n.stateSyncReactor.Sync(context.TODO())
if err != nil {
n.Logger.Error("state sync failed", "err", err)
n.Logger.Error("state sync failed; shutting down this node", "err", err)
// stop the node
if err := n.Stop(); err != nil {
n.Logger.Error("failed to shut down node", "err", err)
}
return
}
@@ -716,7 +720,7 @@ func (n *nodeImpl) OnStart() error {
// TODO: Some form of orchestrator is needed here between the state
// advancing reactors to be able to control which one of the three
// is running
if n.config.FastSyncMode {
if n.config.BlockSync.Enable {
// FIXME Very ugly to have these metrics bleed through here.
n.consensusReactor.SetBlockSyncingMetrics(1)
if err := bcR.SwitchToBlockSync(state); err != nil {


@@ -34,6 +34,7 @@ type ConsensusParams struct {
Evidence *EvidenceParams `protobuf:"bytes,2,opt,name=evidence,proto3" json:"evidence,omitempty"`
Validator *ValidatorParams `protobuf:"bytes,3,opt,name=validator,proto3" json:"validator,omitempty"`
Version *VersionParams `protobuf:"bytes,4,opt,name=version,proto3" json:"version,omitempty"`
Timestamp *TimestampParams `protobuf:"bytes,5,opt,name=timestamp,proto3" json:"timestamp,omitempty"`
}
func (m *ConsensusParams) Reset() { *m = ConsensusParams{} }
@@ -97,6 +98,13 @@ func (m *ConsensusParams) GetVersion() *VersionParams {
return nil
}
func (m *ConsensusParams) GetTimestamp() *TimestampParams {
if m != nil {
return m.Timestamp
}
return nil
}
// BlockParams contains limits on the block size.
type BlockParams struct {
// Max block size, in bytes.
@@ -318,6 +326,66 @@ func (m *VersionParams) GetAppVersion() uint64 {
return 0
}
type TimestampParams struct {
Accuracy time.Duration `protobuf:"bytes,1,opt,name=accuracy,proto3,stdduration" json:"accuracy"`
Precision time.Duration `protobuf:"bytes,2,opt,name=precision,proto3,stdduration" json:"precision"`
MessageDelay time.Duration `protobuf:"bytes,3,opt,name=message_delay,json=messageDelay,proto3,stdduration" json:"message_delay"`
}
func (m *TimestampParams) Reset() { *m = TimestampParams{} }
func (m *TimestampParams) String() string { return proto.CompactTextString(m) }
func (*TimestampParams) ProtoMessage() {}
func (*TimestampParams) Descriptor() ([]byte, []int) {
return fileDescriptor_e12598271a686f57, []int{5}
}
func (m *TimestampParams) XXX_Unmarshal(b []byte) error {
return m.Unmarshal(b)
}
func (m *TimestampParams) XXX_Marshal(b []byte, deterministic bool) ([]byte, error) {
if deterministic {
return xxx_messageInfo_TimestampParams.Marshal(b, m, deterministic)
} else {
b = b[:cap(b)]
n, err := m.MarshalToSizedBuffer(b)
if err != nil {
return nil, err
}
return b[:n], nil
}
}
func (m *TimestampParams) XXX_Merge(src proto.Message) {
xxx_messageInfo_TimestampParams.Merge(m, src)
}
func (m *TimestampParams) XXX_Size() int {
return m.Size()
}
func (m *TimestampParams) XXX_DiscardUnknown() {
xxx_messageInfo_TimestampParams.DiscardUnknown(m)
}
var xxx_messageInfo_TimestampParams proto.InternalMessageInfo
func (m *TimestampParams) GetAccuracy() time.Duration {
if m != nil {
return m.Accuracy
}
return 0
}
func (m *TimestampParams) GetPrecision() time.Duration {
if m != nil {
return m.Precision
}
return 0
}
func (m *TimestampParams) GetMessageDelay() time.Duration {
if m != nil {
return m.MessageDelay
}
return 0
}
// HashedParams is a subset of ConsensusParams.
//
// It is hashed into the Header.ConsensusHash.
@@ -330,7 +398,7 @@ func (m *HashedParams) Reset() { *m = HashedParams{} }
func (m *HashedParams) String() string { return proto.CompactTextString(m) }
func (*HashedParams) ProtoMessage() {}
func (*HashedParams) Descriptor() ([]byte, []int) {
return fileDescriptor_e12598271a686f57, []int{5}
return fileDescriptor_e12598271a686f57, []int{6}
}
func (m *HashedParams) XXX_Unmarshal(b []byte) error {
return m.Unmarshal(b)
@@ -379,45 +447,51 @@ func init() {
proto.RegisterType((*EvidenceParams)(nil), "tendermint.types.EvidenceParams")
proto.RegisterType((*ValidatorParams)(nil), "tendermint.types.ValidatorParams")
proto.RegisterType((*VersionParams)(nil), "tendermint.types.VersionParams")
proto.RegisterType((*TimestampParams)(nil), "tendermint.types.TimestampParams")
proto.RegisterType((*HashedParams)(nil), "tendermint.types.HashedParams")
}
func init() { proto.RegisterFile("tendermint/types/params.proto", fileDescriptor_e12598271a686f57) }
var fileDescriptor_e12598271a686f57 = []byte{
// 498 bytes of a gzipped FileDescriptorProto
0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0xff, 0x6c, 0x93, 0xc1, 0x6a, 0xd4, 0x40,
0x1c, 0xc6, 0x77, 0x9a, 0xda, 0xee, 0xfe, 0xe3, 0x76, 0xcb, 0x20, 0x18, 0x2b, 0xcd, 0xae, 0x39,
0x48, 0x41, 0x48, 0xc4, 0x22, 0x22, 0x08, 0xe2, 0x56, 0xa9, 0x20, 0x15, 0x09, 0xea, 0xa1, 0x97,
0x30, 0xd9, 0x8c, 0x69, 0xe8, 0x4e, 0x66, 0xc8, 0x24, 0xcb, 0xee, 0xcd, 0x47, 0xf0, 0xe8, 0x23,
0xe8, 0x9b, 0xf4, 0xd8, 0xa3, 0x27, 0x95, 0xdd, 0x17, 0x91, 0x4c, 0x32, 0xa6, 0x9b, 0xf6, 0x36,
0x33, 0xdf, 0xef, 0x9b, 0xe1, 0xfb, 0x86, 0x3f, 0xec, 0xe7, 0x34, 0x8d, 0x68, 0xc6, 0x92, 0x34,
0xf7, 0xf2, 0x85, 0xa0, 0xd2, 0x13, 0x24, 0x23, 0x4c, 0xba, 0x22, 0xe3, 0x39, 0xc7, 0xbb, 0x8d,
0xec, 0x2a, 0x79, 0xef, 0x4e, 0xcc, 0x63, 0xae, 0x44, 0xaf, 0x5c, 0x55, 0xdc, 0x9e, 0x1d, 0x73,
0x1e, 0x4f, 0xa9, 0xa7, 0x76, 0x61, 0xf1, 0xc5, 0x8b, 0x8a, 0x8c, 0xe4, 0x09, 0x4f, 0x2b, 0xdd,
0xf9, 0xba, 0x01, 0x83, 0x23, 0x9e, 0x4a, 0x9a, 0xca, 0x42, 0x7e, 0x50, 0x2f, 0xe0, 0x43, 0xb8,
0x15, 0x4e, 0xf9, 0xe4, 0xdc, 0x42, 0x23, 0x74, 0x60, 0x3e, 0xd9, 0x77, 0xdb, 0x6f, 0xb9, 0xe3,
0x52, 0xae, 0x68, 0xbf, 0x62, 0xf1, 0x0b, 0xe8, 0xd2, 0x59, 0x12, 0xd1, 0x74, 0x42, 0xad, 0x0d,
0xe5, 0x1b, 0x5d, 0xf7, 0xbd, 0xa9, 0x89, 0xda, 0xfa, 0xdf, 0x81, 0x5f, 0x42, 0x6f, 0x46, 0xa6,
0x49, 0x44, 0x72, 0x9e, 0x59, 0x86, 0xb2, 0x3f, 0xb8, 0x6e, 0xff, 0xac, 0x91, 0xda, 0xdf, 0x78,
0xf0, 0x73, 0xd8, 0x9e, 0xd1, 0x4c, 0x26, 0x3c, 0xb5, 0x36, 0x95, 0x7d, 0x78, 0x83, 0xbd, 0x02,
0x6a, 0xb3, 0xe6, 0x9d, 0x23, 0x30, 0xaf, 0xe4, 0xc1, 0xf7, 0xa1, 0xc7, 0xc8, 0x3c, 0x08, 0x17,
0x39, 0x95, 0xaa, 0x01, 0xc3, 0xef, 0x32, 0x32, 0x1f, 0x97, 0x7b, 0x7c, 0x17, 0xb6, 0x4b, 0x31,
0x26, 0x52, 0x85, 0x34, 0xfc, 0x2d, 0x46, 0xe6, 0xc7, 0x44, 0x3a, 0x3f, 0x11, 0xec, 0xac, 0xa7,
0xc3, 0x8f, 0x00, 0x97, 0x2c, 0x89, 0x69, 0x90, 0x16, 0x2c, 0x50, 0x35, 0xe9, 0x1b, 0x07, 0x8c,
0xcc, 0x5f, 0xc5, 0xf4, 0x7d, 0xc1, 0xd4, 0xd3, 0x12, 0x9f, 0xc0, 0xae, 0x86, 0xf5, 0x0f, 0xd5,
0x35, 0xde, 0x73, 0xab, 0x2f, 0x74, 0xf5, 0x17, 0xba, 0xaf, 0x6b, 0x60, 0xdc, 0xbd, 0xf8, 0x3d,
0xec, 0x7c, 0xff, 0x33, 0x44, 0xfe, 0x4e, 0x75, 0x9f, 0x56, 0xd6, 0x43, 0x18, 0xeb, 0x21, 0x9c,
0xa7, 0x30, 0x68, 0x35, 0x89, 0x1d, 0xe8, 0x8b, 0x22, 0x0c, 0xce, 0xe9, 0x22, 0x50, 0x5d, 0x59,
0x68, 0x64, 0x1c, 0xf4, 0x7c, 0x53, 0x14, 0xe1, 0x3b, 0xba, 0xf8, 0x58, 0x1e, 0x39, 0x8f, 0xa1,
0xbf, 0xd6, 0x20, 0x1e, 0x82, 0x49, 0x84, 0x08, 0x74, 0xef, 0x65, 0xb2, 0x4d, 0x1f, 0x88, 0x10,
0x35, 0xe6, 0x9c, 0xc2, 0xed, 0xb7, 0x44, 0x9e, 0xd1, 0xa8, 0x36, 0x3c, 0x84, 0x81, 0x6a, 0x21,
0x68, 0x17, 0xdc, 0x57, 0xc7, 0x27, 0xba, 0x65, 0x07, 0xfa, 0x0d, 0xd7, 0x74, 0x6d, 0x6a, 0xea,
0x98, 0xc8, 0xf1, 0xa7, 0x1f, 0x4b, 0x1b, 0x5d, 0x2c, 0x6d, 0x74, 0xb9, 0xb4, 0xd1, 0xdf, 0xa5,
0x8d, 0xbe, 0xad, 0xec, 0xce, 0xe5, 0xca, 0xee, 0xfc, 0x5a, 0xd9, 0x9d, 0xd3, 0x67, 0x71, 0x92,
0x9f, 0x15, 0xa1, 0x3b, 0xe1, 0xcc, 0xbb, 0x3a, 0x48, 0xcd, 0xb2, 0x9a, 0x94, 0xf6, 0x90, 0x85,
0x5b, 0xea, 0xfc, 0xf0, 0x5f, 0x00, 0x00, 0x00, 0xff, 0xff, 0x18, 0x54, 0x4f, 0xe1, 0x7f, 0x03,
0x00, 0x00,
// 577 bytes of a gzipped FileDescriptorProto
0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0xff, 0x94, 0x94, 0x4f, 0x6b, 0xd4, 0x40,
0x18, 0xc6, 0x37, 0xdd, 0xfe, 0xd9, 0x7d, 0xb7, 0xdb, 0x2d, 0x83, 0x60, 0xac, 0x34, 0x5b, 0x73,
0x90, 0x82, 0x90, 0x88, 0x45, 0x44, 0x10, 0x4a, 0xb7, 0x95, 0x16, 0xa4, 0x22, 0xa1, 0x7a, 0xe8,
0x25, 0x4c, 0xb2, 0x63, 0x1a, 0xba, 0x93, 0x19, 0x32, 0x49, 0xd9, 0x7c, 0x0b, 0x8f, 0x7e, 0x04,
0xfd, 0x18, 0xde, 0x7a, 0xec, 0xd1, 0x93, 0x95, 0xed, 0x17, 0x91, 0x4c, 0x66, 0x36, 0xdd, 0xad,
0x42, 0xbd, 0x25, 0xf3, 0x3e, 0xbf, 0x79, 0x79, 0x9f, 0xf7, 0x61, 0x60, 0x33, 0x23, 0xc9, 0x90,
0xa4, 0x34, 0x4e, 0x32, 0x37, 0x2b, 0x38, 0x11, 0x2e, 0xc7, 0x29, 0xa6, 0xc2, 0xe1, 0x29, 0xcb,
0x18, 0x5a, 0xaf, 0xcb, 0x8e, 0x2c, 0x6f, 0x3c, 0x88, 0x58, 0xc4, 0x64, 0xd1, 0x2d, 0xbf, 0x2a,
0xdd, 0x86, 0x15, 0x31, 0x16, 0x8d, 0x88, 0x2b, 0xff, 0x82, 0xfc, 0xb3, 0x3b, 0xcc, 0x53, 0x9c,
0xc5, 0x2c, 0xa9, 0xea, 0xf6, 0x8f, 0x05, 0xe8, 0xed, 0xb3, 0x44, 0x90, 0x44, 0xe4, 0xe2, 0x83,
0xec, 0x80, 0x76, 0x60, 0x29, 0x18, 0xb1, 0xf0, 0xdc, 0x34, 0xb6, 0x8c, 0xed, 0xce, 0x8b, 0x4d,
0x67, 0xbe, 0x97, 0x33, 0x28, 0xcb, 0x95, 0xda, 0xab, 0xb4, 0xe8, 0x0d, 0xb4, 0xc8, 0x45, 0x3c,
0x24, 0x49, 0x48, 0xcc, 0x05, 0xc9, 0x6d, 0xdd, 0xe5, 0xde, 0x2a, 0x85, 0x42, 0xa7, 0x04, 0xda,
0x85, 0xf6, 0x05, 0x1e, 0xc5, 0x43, 0x9c, 0xb1, 0xd4, 0x6c, 0x4a, 0xfc, 0xc9, 0x5d, 0xfc, 0x93,
0x96, 0x28, 0xbe, 0x66, 0xd0, 0x6b, 0x58, 0xb9, 0x20, 0xa9, 0x88, 0x59, 0x62, 0x2e, 0x4a, 0xbc,
0xff, 0x17, 0xbc, 0x12, 0x28, 0x58, 0xeb, 0xcb, 0xde, 0x59, 0x4c, 0x89, 0xc8, 0x30, 0xe5, 0xe6,
0xd2, 0xbf, 0x7a, 0x9f, 0x68, 0x89, 0xee, 0x3d, 0x65, 0xec, 0x7d, 0xe8, 0xdc, 0x32, 0x04, 0x3d,
0x86, 0x36, 0xc5, 0x63, 0x3f, 0x28, 0x32, 0x22, 0xa4, 0x85, 0x4d, 0xaf, 0x45, 0xf1, 0x78, 0x50,
0xfe, 0xa3, 0x87, 0xb0, 0x52, 0x16, 0x23, 0x2c, 0xa4, 0x4b, 0x4d, 0x6f, 0x99, 0xe2, 0xf1, 0x21,
0x16, 0xf6, 0x77, 0x03, 0xd6, 0x66, 0xed, 0x41, 0xcf, 0x00, 0x95, 0x5a, 0x1c, 0x11, 0x3f, 0xc9,
0xa9, 0x2f, 0x7d, 0xd6, 0x37, 0xf6, 0x28, 0x1e, 0xef, 0x45, 0xe4, 0x7d, 0x4e, 0x65, 0x6b, 0x81,
0x8e, 0x61, 0x5d, 0x8b, 0xf5, 0x8a, 0xd5, 0x1e, 0x1e, 0x39, 0x55, 0x06, 0x1c, 0x9d, 0x01, 0xe7,
0x40, 0x09, 0x06, 0xad, 0xcb, 0x5f, 0xfd, 0xc6, 0xd7, 0xeb, 0xbe, 0xe1, 0xad, 0x55, 0xf7, 0xe9,
0xca, 0xec, 0x10, 0xcd, 0xd9, 0x21, 0xec, 0x97, 0xd0, 0x9b, 0x5b, 0x05, 0xb2, 0xa1, 0xcb, 0xf3,
0xc0, 0x3f, 0x27, 0x85, 0x2f, 0xfd, 0x32, 0x8d, 0xad, 0xe6, 0x76, 0xdb, 0xeb, 0xf0, 0x3c, 0x78,
0x47, 0x8a, 0x93, 0xf2, 0xc8, 0x7e, 0x0e, 0xdd, 0x99, 0x15, 0xa0, 0x3e, 0x74, 0x30, 0xe7, 0xbe,
0x5e, 0x5c, 0x39, 0xd9, 0xa2, 0x07, 0x98, 0x73, 0x25, 0xb3, 0xaf, 0x0d, 0xe8, 0xcd, 0x19, 0x8f,
0x76, 0xa1, 0x85, 0xc3, 0x30, 0x4f, 0x71, 0x58, 0xa8, 0x80, 0xde, 0x6b, 0xc0, 0x29, 0x84, 0xf6,
0xa0, 0xcd, 0x53, 0x12, 0xc6, 0xe2, 0x3f, 0x2d, 0xaa, 0x29, 0x74, 0x04, 0x5d, 0x4a, 0x84, 0x90,
0x66, 0x93, 0x11, 0x2e, 0x54, 0x64, 0xef, 0x75, 0xcd, 0xaa, 0x22, 0x0f, 0x4a, 0xd0, 0x3e, 0x85,
0xd5, 0x23, 0x2c, 0xce, 0xc8, 0x50, 0x4d, 0xf7, 0x14, 0x7a, 0x72, 0xcf, 0xfe, 0x7c, 0x84, 0xba,
0xf2, 0xf8, 0x58, 0xe7, 0xc8, 0x86, 0x6e, 0xad, 0xab, 0xd3, 0xd4, 0xd1, 0xaa, 0x43, 0x2c, 0x06,
0x1f, 0xbf, 0x4d, 0x2c, 0xe3, 0x72, 0x62, 0x19, 0x57, 0x13, 0xcb, 0xf8, 0x3d, 0xb1, 0x8c, 0x2f,
0x37, 0x56, 0xe3, 0xea, 0xc6, 0x6a, 0xfc, 0xbc, 0xb1, 0x1a, 0xa7, 0xaf, 0xa2, 0x38, 0x3b, 0xcb,
0x03, 0x27, 0x64, 0xd4, 0xbd, 0xfd, 0xd6, 0xd4, 0x9f, 0xd5, 0x63, 0x32, 0xff, 0x0e, 0x05, 0xcb,
0xf2, 0x7c, 0xe7, 0x4f, 0x00, 0x00, 0x00, 0xff, 0xff, 0x6e, 0x64, 0xc1, 0x5d, 0xa2, 0x04, 0x00,
0x00,
}
func (this *ConsensusParams) Equal(that interface{}) bool {
@@ -451,6 +525,9 @@ func (this *ConsensusParams) Equal(that interface{}) bool {
if !this.Version.Equal(that1.Version) {
return false
}
if !this.Timestamp.Equal(that1.Timestamp) {
return false
}
return true
}
func (this *BlockParams) Equal(that interface{}) bool {
@@ -563,6 +640,36 @@ func (this *VersionParams) Equal(that interface{}) bool {
}
return true
}
func (this *TimestampParams) Equal(that interface{}) bool {
if that == nil {
return this == nil
}
that1, ok := that.(*TimestampParams)
if !ok {
that2, ok := that.(TimestampParams)
if ok {
that1 = &that2
} else {
return false
}
}
if that1 == nil {
return this == nil
} else if this == nil {
return false
}
if this.Accuracy != that1.Accuracy {
return false
}
if this.Precision != that1.Precision {
return false
}
if this.MessageDelay != that1.MessageDelay {
return false
}
return true
}
func (this *HashedParams) Equal(that interface{}) bool {
if that == nil {
return this == nil
@@ -610,6 +717,18 @@ func (m *ConsensusParams) MarshalToSizedBuffer(dAtA []byte) (int, error) {
_ = i
var l int
_ = l
if m.Timestamp != nil {
{
size, err := m.Timestamp.MarshalToSizedBuffer(dAtA[:i])
if err != nil {
return 0, err
}
i -= size
i = encodeVarintParams(dAtA, i, uint64(size))
}
i--
dAtA[i] = 0x2a
}
if m.Version != nil {
{
size, err := m.Version.MarshalToSizedBuffer(dAtA[:i])
@@ -719,12 +838,12 @@ func (m *EvidenceParams) MarshalToSizedBuffer(dAtA []byte) (int, error) {
i--
dAtA[i] = 0x18
}
n5, err5 := github_com_gogo_protobuf_types.StdDurationMarshalTo(m.MaxAgeDuration, dAtA[i-github_com_gogo_protobuf_types.SizeOfStdDuration(m.MaxAgeDuration):])
if err5 != nil {
return 0, err5
n6, err6 := github_com_gogo_protobuf_types.StdDurationMarshalTo(m.MaxAgeDuration, dAtA[i-github_com_gogo_protobuf_types.SizeOfStdDuration(m.MaxAgeDuration):])
if err6 != nil {
return 0, err6
}
i -= n5
i = encodeVarintParams(dAtA, i, uint64(n5))
i -= n6
i = encodeVarintParams(dAtA, i, uint64(n6))
i--
dAtA[i] = 0x12
if m.MaxAgeNumBlocks != 0 {
@@ -795,6 +914,53 @@ func (m *VersionParams) MarshalToSizedBuffer(dAtA []byte) (int, error) {
return len(dAtA) - i, nil
}
func (m *TimestampParams) Marshal() (dAtA []byte, err error) {
size := m.Size()
dAtA = make([]byte, size)
n, err := m.MarshalToSizedBuffer(dAtA[:size])
if err != nil {
return nil, err
}
return dAtA[:n], nil
}
func (m *TimestampParams) MarshalTo(dAtA []byte) (int, error) {
size := m.Size()
return m.MarshalToSizedBuffer(dAtA[:size])
}
func (m *TimestampParams) MarshalToSizedBuffer(dAtA []byte) (int, error) {
i := len(dAtA)
_ = i
var l int
_ = l
n7, err7 := github_com_gogo_protobuf_types.StdDurationMarshalTo(m.MessageDelay, dAtA[i-github_com_gogo_protobuf_types.SizeOfStdDuration(m.MessageDelay):])
if err7 != nil {
return 0, err7
}
i -= n7
i = encodeVarintParams(dAtA, i, uint64(n7))
i--
dAtA[i] = 0x1a
n8, err8 := github_com_gogo_protobuf_types.StdDurationMarshalTo(m.Precision, dAtA[i-github_com_gogo_protobuf_types.SizeOfStdDuration(m.Precision):])
if err8 != nil {
return 0, err8
}
i -= n8
i = encodeVarintParams(dAtA, i, uint64(n8))
i--
dAtA[i] = 0x12
n9, err9 := github_com_gogo_protobuf_types.StdDurationMarshalTo(m.Accuracy, dAtA[i-github_com_gogo_protobuf_types.SizeOfStdDuration(m.Accuracy):])
if err9 != nil {
return 0, err9
}
i -= n9
i = encodeVarintParams(dAtA, i, uint64(n9))
i--
dAtA[i] = 0xa
return len(dAtA) - i, nil
}
func (m *HashedParams) Marshal() (dAtA []byte, err error) {
size := m.Size()
dAtA = make([]byte, size)
@@ -861,6 +1027,10 @@ func (m *ConsensusParams) Size() (n int) {
l = m.Version.Size()
n += 1 + l + sovParams(uint64(l))
}
if m.Timestamp != nil {
l = m.Timestamp.Size()
n += 1 + l + sovParams(uint64(l))
}
return n
}
@@ -923,6 +1093,21 @@ func (m *VersionParams) Size() (n int) {
return n
}
func (m *TimestampParams) Size() (n int) {
if m == nil {
return 0
}
var l int
_ = l
l = github_com_gogo_protobuf_types.SizeOfStdDuration(m.Accuracy)
n += 1 + l + sovParams(uint64(l))
l = github_com_gogo_protobuf_types.SizeOfStdDuration(m.Precision)
n += 1 + l + sovParams(uint64(l))
l = github_com_gogo_protobuf_types.SizeOfStdDuration(m.MessageDelay)
n += 1 + l + sovParams(uint64(l))
return n
}
func (m *HashedParams) Size() (n int) {
if m == nil {
return 0
@@ -1117,6 +1302,42 @@ func (m *ConsensusParams) Unmarshal(dAtA []byte) error {
return err
}
iNdEx = postIndex
case 5:
if wireType != 2 {
return fmt.Errorf("proto: wrong wireType = %d for field Timestamp", wireType)
}
var msglen int
for shift := uint(0); ; shift += 7 {
if shift >= 64 {
return ErrIntOverflowParams
}
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := dAtA[iNdEx]
iNdEx++
msglen |= int(b&0x7F) << shift
if b < 0x80 {
break
}
}
if msglen < 0 {
return ErrInvalidLengthParams
}
postIndex := iNdEx + msglen
if postIndex < 0 {
return ErrInvalidLengthParams
}
if postIndex > l {
return io.ErrUnexpectedEOF
}
if m.Timestamp == nil {
m.Timestamp = &TimestampParams{}
}
if err := m.Timestamp.Unmarshal(dAtA[iNdEx:postIndex]); err != nil {
return err
}
iNdEx = postIndex
default:
iNdEx = preIndex
skippy, err := skipParams(dAtA[iNdEx:])
@@ -1498,6 +1719,155 @@ func (m *VersionParams) Unmarshal(dAtA []byte) error {
}
return nil
}
func (m *TimestampParams) Unmarshal(dAtA []byte) error {
l := len(dAtA)
iNdEx := 0
for iNdEx < l {
preIndex := iNdEx
var wire uint64
for shift := uint(0); ; shift += 7 {
if shift >= 64 {
return ErrIntOverflowParams
}
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := dAtA[iNdEx]
iNdEx++
wire |= uint64(b&0x7F) << shift
if b < 0x80 {
break
}
}
fieldNum := int32(wire >> 3)
wireType := int(wire & 0x7)
if wireType == 4 {
return fmt.Errorf("proto: TimestampParams: wiretype end group for non-group")
}
if fieldNum <= 0 {
return fmt.Errorf("proto: TimestampParams: illegal tag %d (wire type %d)", fieldNum, wire)
}
switch fieldNum {
case 1:
if wireType != 2 {
return fmt.Errorf("proto: wrong wireType = %d for field Accuracy", wireType)
}
var msglen int
for shift := uint(0); ; shift += 7 {
if shift >= 64 {
return ErrIntOverflowParams
}
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := dAtA[iNdEx]
iNdEx++
msglen |= int(b&0x7F) << shift
if b < 0x80 {
break
}
}
if msglen < 0 {
return ErrInvalidLengthParams
}
postIndex := iNdEx + msglen
if postIndex < 0 {
return ErrInvalidLengthParams
}
if postIndex > l {
return io.ErrUnexpectedEOF
}
if err := github_com_gogo_protobuf_types.StdDurationUnmarshal(&m.Accuracy, dAtA[iNdEx:postIndex]); err != nil {
return err
}
iNdEx = postIndex
case 2:
if wireType != 2 {
return fmt.Errorf("proto: wrong wireType = %d for field Precision", wireType)
}
var msglen int
for shift := uint(0); ; shift += 7 {
if shift >= 64 {
return ErrIntOverflowParams
}
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := dAtA[iNdEx]
iNdEx++
msglen |= int(b&0x7F) << shift
if b < 0x80 {
break
}
}
if msglen < 0 {
return ErrInvalidLengthParams
}
postIndex := iNdEx + msglen
if postIndex < 0 {
return ErrInvalidLengthParams
}
if postIndex > l {
return io.ErrUnexpectedEOF
}
if err := github_com_gogo_protobuf_types.StdDurationUnmarshal(&m.Precision, dAtA[iNdEx:postIndex]); err != nil {
return err
}
iNdEx = postIndex
case 3:
if wireType != 2 {
return fmt.Errorf("proto: wrong wireType = %d for field MessageDelay", wireType)
}
var msglen int
for shift := uint(0); ; shift += 7 {
if shift >= 64 {
return ErrIntOverflowParams
}
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := dAtA[iNdEx]
iNdEx++
msglen |= int(b&0x7F) << shift
if b < 0x80 {
break
}
}
if msglen < 0 {
return ErrInvalidLengthParams
}
postIndex := iNdEx + msglen
if postIndex < 0 {
return ErrInvalidLengthParams
}
if postIndex > l {
return io.ErrUnexpectedEOF
}
if err := github_com_gogo_protobuf_types.StdDurationUnmarshal(&m.MessageDelay, dAtA[iNdEx:postIndex]); err != nil {
return err
}
iNdEx = postIndex
default:
iNdEx = preIndex
skippy, err := skipParams(dAtA[iNdEx:])
if err != nil {
return err
}
if (skippy < 0) || (iNdEx+skippy) < 0 {
return ErrInvalidLengthParams
}
if (iNdEx + skippy) > l {
return io.ErrUnexpectedEOF
}
iNdEx += skippy
}
}
if iNdEx > l {
return io.ErrUnexpectedEOF
}
return nil
}
func (m *HashedParams) Unmarshal(dAtA []byte) error {
l := len(dAtA)
iNdEx := 0

@@ -15,6 +15,7 @@ message ConsensusParams {
EvidenceParams evidence = 2;
ValidatorParams validator = 3;
VersionParams version = 4;
TimestampParams timestamp = 5;
}
// BlockParams contains limits on the block size.
@@ -60,6 +61,15 @@ message VersionParams {
uint64 app_version = 1;
}
message TimestampParams {
google.protobuf.Duration accuracy = 1
[(gogoproto.nullable) = false, (gogoproto.stdduration) = true];
google.protobuf.Duration precision = 2
[(gogoproto.nullable) = false, (gogoproto.stdduration) = true];
google.protobuf.Duration message_delay = 3
[(gogoproto.nullable) = false, (gogoproto.stdduration) = true];
}
// HashedParams is a subset of ConsensusParams.
//
// It is hashed into the Header.ConsensusHash.

@@ -419,7 +419,7 @@ func (c *baseRPCClient) Block(ctx context.Context, height *int64) (*ctypes.Resul
return result, nil
}
func (c *baseRPCClient) BlockByHash(ctx context.Context, hash []byte) (*ctypes.ResultBlock, error) {
func (c *baseRPCClient) BlockByHash(ctx context.Context, hash bytes.HexBytes) (*ctypes.ResultBlock, error) {
result := new(ctypes.ResultBlock)
params := map[string]interface{}{
"hash": hash,
@@ -460,7 +460,7 @@ func (c *baseRPCClient) Commit(ctx context.Context, height *int64) (*ctypes.Resu
return result, nil
}
func (c *baseRPCClient) Tx(ctx context.Context, hash []byte, prove bool) (*ctypes.ResultTx, error) {
func (c *baseRPCClient) Tx(ctx context.Context, hash bytes.HexBytes, prove bool) (*ctypes.ResultTx, error) {
result := new(ctypes.ResultTx)
params := map[string]interface{}{
"hash": hash,

@@ -67,11 +67,11 @@ type ABCIClient interface {
// and prove anything about the chain.
type SignClient interface {
Block(ctx context.Context, height *int64) (*ctypes.ResultBlock, error)
BlockByHash(ctx context.Context, hash []byte) (*ctypes.ResultBlock, error)
BlockByHash(ctx context.Context, hash bytes.HexBytes) (*ctypes.ResultBlock, error)
BlockResults(ctx context.Context, height *int64) (*ctypes.ResultBlockResults, error)
Commit(ctx context.Context, height *int64) (*ctypes.ResultCommit, error)
Validators(ctx context.Context, height *int64, page, perPage *int) (*ctypes.ResultValidators, error)
Tx(ctx context.Context, hash []byte, prove bool) (*ctypes.ResultTx, error)
Tx(ctx context.Context, hash bytes.HexBytes, prove bool) (*ctypes.ResultTx, error)
// TxSearch defines a method to search for a paginated set of transactions by
// DeliverTx event search criteria.

@@ -166,7 +166,7 @@ func (c *Local) Block(ctx context.Context, height *int64) (*ctypes.ResultBlock,
return c.env.Block(c.ctx, height)
}
func (c *Local) BlockByHash(ctx context.Context, hash []byte) (*ctypes.ResultBlock, error) {
func (c *Local) BlockByHash(ctx context.Context, hash bytes.HexBytes) (*ctypes.ResultBlock, error) {
return c.env.BlockByHash(c.ctx, hash)
}
@@ -182,7 +182,7 @@ func (c *Local) Validators(ctx context.Context, height *int64, page, perPage *in
return c.env.Validators(c.ctx, height, page, perPage)
}
func (c *Local) Tx(ctx context.Context, hash []byte, prove bool) (*ctypes.ResultTx, error) {
func (c *Local) Tx(ctx context.Context, hash bytes.HexBytes, prove bool) (*ctypes.ResultTx, error) {
return c.env.Tx(c.ctx, hash, prove)
}

@@ -166,7 +166,7 @@ func (c Client) Block(ctx context.Context, height *int64) (*ctypes.ResultBlock,
return c.env.Block(&rpctypes.Context{}, height)
}
func (c Client) BlockByHash(ctx context.Context, hash []byte) (*ctypes.ResultBlock, error) {
func (c Client) BlockByHash(ctx context.Context, hash bytes.HexBytes) (*ctypes.ResultBlock, error) {
return c.env.BlockByHash(&rpctypes.Context{}, hash)
}

@@ -115,11 +115,11 @@ func (_m *Client) Block(ctx context.Context, height *int64) (*coretypes.ResultBl
}
// BlockByHash provides a mock function with given fields: ctx, hash
func (_m *Client) BlockByHash(ctx context.Context, hash []byte) (*coretypes.ResultBlock, error) {
func (_m *Client) BlockByHash(ctx context.Context, hash bytes.HexBytes) (*coretypes.ResultBlock, error) {
ret := _m.Called(ctx, hash)
var r0 *coretypes.ResultBlock
if rf, ok := ret.Get(0).(func(context.Context, []byte) *coretypes.ResultBlock); ok {
if rf, ok := ret.Get(0).(func(context.Context, bytes.HexBytes) *coretypes.ResultBlock); ok {
r0 = rf(ctx, hash)
} else {
if ret.Get(0) != nil {
@@ -128,7 +128,7 @@ func (_m *Client) BlockByHash(ctx context.Context, hash []byte) (*coretypes.Resu
}
var r1 error
if rf, ok := ret.Get(1).(func(context.Context, []byte) error); ok {
if rf, ok := ret.Get(1).(func(context.Context, bytes.HexBytes) error); ok {
r1 = rf(ctx, hash)
} else {
r1 = ret.Error(1)
@@ -706,11 +706,11 @@ func (_m *Client) Subscribe(ctx context.Context, subscriber string, query string
}
// Tx provides a mock function with given fields: ctx, hash, prove
func (_m *Client) Tx(ctx context.Context, hash []byte, prove bool) (*coretypes.ResultTx, error) {
func (_m *Client) Tx(ctx context.Context, hash bytes.HexBytes, prove bool) (*coretypes.ResultTx, error) {
ret := _m.Called(ctx, hash, prove)
var r0 *coretypes.ResultTx
if rf, ok := ret.Get(0).(func(context.Context, []byte, bool) *coretypes.ResultTx); ok {
if rf, ok := ret.Get(0).(func(context.Context, bytes.HexBytes, bool) *coretypes.ResultTx); ok {
r0 = rf(ctx, hash, prove)
} else {
if ret.Get(0) != nil {
@@ -719,7 +719,7 @@ func (_m *Client) Tx(ctx context.Context, hash []byte, prove bool) (*coretypes.R
}
var r1 error
if rf, ok := ret.Get(1).(func(context.Context, []byte, bool) error); ok {
if rf, ok := ret.Get(1).(func(context.Context, bytes.HexBytes, bool) error); ok {
r1 = rf(ctx, hash, prove)
} else {
r1 = ret.Error(1)

@@ -4,6 +4,7 @@ import (
"fmt"
"sort"
"github.com/tendermint/tendermint/libs/bytes"
tmmath "github.com/tendermint/tendermint/libs/math"
tmquery "github.com/tendermint/tendermint/libs/pubsub/query"
ctypes "github.com/tendermint/tendermint/rpc/core/types"
@@ -107,7 +108,11 @@ func (env *Environment) Block(ctx *rpctypes.Context, heightPtr *int64) (*ctypes.
// BlockByHash gets block by hash.
// More: https://docs.tendermint.com/master/rpc/#/Info/block_by_hash
func (env *Environment) BlockByHash(ctx *rpctypes.Context, hash []byte) (*ctypes.ResultBlock, error) {
func (env *Environment) BlockByHash(ctx *rpctypes.Context, hash bytes.HexBytes) (*ctypes.ResultBlock, error) {
// N.B. The hash parameter is HexBytes so that the reflective parameter
// decoding logic in the HTTP service will correctly translate from JSON.
// See https://github.com/tendermint/tendermint/issues/6802 for context.
block := env.BlockStore.LoadBlockByHash(hash)
if block == nil {
return &ctypes.ResultBlock{BlockID: types.BlockID{}, Block: nil}, nil

@@ -5,6 +5,7 @@ import (
"fmt"
"sort"
"github.com/tendermint/tendermint/libs/bytes"
tmmath "github.com/tendermint/tendermint/libs/math"
tmquery "github.com/tendermint/tendermint/libs/pubsub/query"
ctypes "github.com/tendermint/tendermint/rpc/core/types"
@@ -17,9 +18,13 @@ import (
// transaction is in the mempool, invalidated, or was not sent in the first
// place.
// More: https://docs.tendermint.com/master/rpc/#/Info/tx
func (env *Environment) Tx(ctx *rpctypes.Context, hash []byte, prove bool) (*ctypes.ResultTx, error) {
func (env *Environment) Tx(ctx *rpctypes.Context, hash bytes.HexBytes, prove bool) (*ctypes.ResultTx, error) {
// if index is disabled, return error
// N.B. The hash parameter is HexBytes so that the reflective parameter
// decoding logic in the HTTP service will correctly translate from JSON.
// See https://github.com/tendermint/tendermint/issues/6802 for context.
if !indexer.KVSinkEnabled(env.EventSinks) {
return nil, errors.New("transaction querying is disabled due to no kvEventSink")
}

@@ -601,6 +601,32 @@ paths:
application/json:
schema:
$ref: "#/components/schemas/ErrorResponse"
/unsafe_flush_mempool:
get:
summary: Flush mempool of all unconfirmed transactions
operationId: unsafe_flush_mempool
tags:
- Unsafe
description: |
Flush flushes out the mempool. It acquires a read-lock, fetches all the
transactions currently in the transaction store and removes each transaction
from the store and all indexes and finally resets the cache.
Note, flushing the mempool may leave the mempool in an inconsistent state.
responses:
"200":
description: empty answer
content:
application/json:
schema:
$ref: "#/components/schemas/EmptyResponse"
"500":
description: empty error
content:
application/json:
schema:
$ref: "#/components/schemas/ErrorResponse"
/blockchain:
get:
summary: "Get block headers (max: 20) for minHeight <= height <= maxHeight."

@@ -1,4 +1,4 @@
all: docker generator runner
all: docker generator runner tests
docker:
docker build --tag tendermint/e2e-node -f docker/Dockerfile ../..
@@ -15,4 +15,7 @@ generator:
runner:
go build -o build/runner ./runner
.PHONY: all app docker generator runner
tests:
go test -o build/tests ./tests
.PHONY: all app docker generator runner tests

@@ -51,7 +51,7 @@ var (
nodeStateSyncs = uniformChoice{e2e.StateSyncDisabled, e2e.StateSyncP2P, e2e.StateSyncRPC}
nodePersistIntervals = uniformChoice{0, 1, 5}
nodeSnapshotIntervals = uniformChoice{0, 3}
nodeRetainBlocks = uniformChoice{0, int(e2e.EvidenceAgeHeight), int(e2e.EvidenceAgeHeight) + 5}
nodeRetainBlocks = uniformChoice{0, 2 * int(e2e.EvidenceAgeHeight), 4 * int(e2e.EvidenceAgeHeight)}
nodePerturbations = probSetChoice{
"disconnect": 0.1,
"pause": 0.1,
@@ -87,11 +87,19 @@ func Generate(r *rand.Rand, opts Options) ([]e2e.Manifest, error) {
}
manifests = append(manifests, manifest)
}
if opts.Sorted {
// When the sorted flag is set (generally, whenever groups
// aren't set), order the manifests from least to most complex.
e2e.SortManifests(manifests)
}
return manifests, nil
}
type Options struct {
P2P P2PMode
P2P P2PMode
Sorted bool
}
type P2PMode string
@@ -119,18 +127,11 @@ func generateTestnet(r *rand.Rand, opt map[string]interface{}) (e2e.Manifest, er
TxSize: int64(txSize.Choose(r).(int)),
}
var p2pNodeFactor int
switch opt["p2p"].(P2PMode) {
case NewP2PMode:
manifest.UseLegacyP2P = true
case LegacyP2PMode:
manifest.UseLegacyP2P = false
case HybridP2PMode:
manifest.UseLegacyP2P = true
p2pNodeFactor = 2
p2pMode := opt["p2p"].(P2PMode)
switch p2pMode {
case NewP2PMode, LegacyP2PMode, HybridP2PMode:
default:
return manifest, fmt.Errorf("unknown p2p mode %s", opt["p2p"])
return manifest, fmt.Errorf("unknown p2p mode %s", p2pMode)
}
var numSeeds, numValidators, numFulls, numLightClients int
@@ -153,10 +154,11 @@ func generateTestnet(r *rand.Rand, opt map[string]interface{}) (e2e.Manifest, er
for i := 1; i <= numSeeds; i++ {
node := generateNode(r, e2e.ModeSeed, 0, manifest.InitialHeight, false)
if p2pNodeFactor == 0 {
node.UseLegacyP2P = manifest.UseLegacyP2P
} else if p2pNodeFactor%i == 0 {
node.UseLegacyP2P = !manifest.UseLegacyP2P
switch p2pMode {
case LegacyP2PMode:
node.UseLegacyP2P = true
case HybridP2PMode:
node.UseLegacyP2P = r.Intn(2) == 1
}
manifest.Nodes[fmt.Sprintf("seed%02d", i)] = node
@@ -177,10 +179,11 @@ func generateTestnet(r *rand.Rand, opt map[string]interface{}) (e2e.Manifest, er
node := generateNode(
r, e2e.ModeValidator, startAt, manifest.InitialHeight, i <= 2)
if p2pNodeFactor == 0 {
node.UseLegacyP2P = manifest.UseLegacyP2P
} else if p2pNodeFactor%i == 0 {
node.UseLegacyP2P = !manifest.UseLegacyP2P
switch p2pMode {
case LegacyP2PMode:
node.UseLegacyP2P = true
case HybridP2PMode:
node.UseLegacyP2P = r.Intn(2) == 1
}
manifest.Nodes[name] = node
@@ -213,11 +216,13 @@ func generateTestnet(r *rand.Rand, opt map[string]interface{}) (e2e.Manifest, er
}
node := generateNode(r, e2e.ModeFull, startAt, manifest.InitialHeight, false)
if p2pNodeFactor == 0 {
node.UseLegacyP2P = manifest.UseLegacyP2P
} else if p2pNodeFactor%i == 0 {
node.UseLegacyP2P = !manifest.UseLegacyP2P
switch p2pMode {
case LegacyP2PMode:
node.UseLegacyP2P = true
case HybridP2PMode:
node.UseLegacyP2P = r.Intn(2) == 1
}
manifest.Nodes[fmt.Sprintf("full%02d", i)] = node
}

@@ -57,6 +57,10 @@ func NewCLI() *CLI {
return fmt.Errorf("p2p mode must be either new, legacy, hybrid or mixed got %s", p2pMode)
}
if groups == 0 {
opts.Sorted = true
}
return cli.generate(dir, groups, opts)
},
}

@@ -43,6 +43,7 @@ persist_interval = 0
perturb = ["restart"]
privval_protocol = "tcp"
seeds = ["seed01"]
block_sync = "v0"
[node.validator03]
database = "badgerdb"
@@ -51,7 +52,8 @@ abci_protocol = "grpc"
persist_interval = 3
perturb = ["kill"]
privval_protocol = "grpc"
retain_blocks = 7
block_sync = "v0"
retain_blocks = 10
[node.validator04]
abci_protocol = "builtin"
@@ -59,12 +61,13 @@ snapshot_interval = 5
database = "rocksdb"
persistent_peers = ["validator01"]
perturb = ["pause"]
block_sync = "v0"
[node.validator05]
database = "cleveldb"
block_sync = "v0"
database = "cleveldb"
block_sync = "v0"
state_sync = "p2p"
seeds = ["seed01"]
seeds = ["seed01"]
start_at = 1005 # Becomes part of the validator set at 1010
abci_protocol = "grpc"
perturb = ["pause", "disconnect", "restart"]
@@ -73,11 +76,10 @@ privval_protocol = "tcp"
[node.full01]
mode = "full"
start_at = 1010
# FIXME: should be v2, disabled due to flake
block_sync = "v0"
persistent_peers = ["validator01", "validator02", "validator03", "validator04"]
perturb = ["restart"]
retain_blocks = 7
retain_blocks = 10
state_sync = "rpc"
[node.light01]

@@ -3,6 +3,7 @@ package e2e
import (
"fmt"
"os"
"sort"
"github.com/BurntSushi/toml"
)
@@ -59,9 +60,6 @@ type Manifest struct {
// by individual nodes.
LogLevel string `toml:"log_level"`
// UseLegacyP2P uses the legacy p2p layer for all nodes in a test.
UseLegacyP2P bool `toml:"use_legacy_p2p"`
// QueueType describes the type of queue that the system uses internally
QueueType string `toml:"queue_type"`
@@ -170,3 +168,43 @@ func LoadManifest(file string) (Manifest, error) {
}
return manifest, nil
}
// SortManifests orders (in-place) a list of manifests such that the
// manifests will be ordered (vaguely) from least complex to most
// complex.
func SortManifests(manifests []Manifest) {
sort.SliceStable(manifests, func(i, j int) bool {
left, right := manifests[i], manifests[j]
if len(left.Nodes) < len(right.Nodes) {
return true
}
if left.InitialHeight < right.InitialHeight {
return true
}
if left.TxSize < right.TxSize {
return true
}
if left.Evidence < right.Evidence {
return true
}
var (
leftPerturb int
rightPerturb int
)
for _, n := range left.Nodes {
leftPerturb += len(n.Perturb)
}
for _, n := range right.Nodes {
rightPerturb += len(n.Perturb)
}
return leftPerturb < rightPerturb
})
}
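
Note on the comparison function: `sort.SliceStable` expects `less` to describe a consistent ordering. With a chain of independent `if a < b { return true }` checks, `less(i, j)` and `less(j, i)` can both be true for the same pair (for example, fewer nodes but a larger initial height), so the result can depend on the order in which pairs are compared. If a stricter ordering is wanted, one option is a lexicographic comparison in which a later key is consulted only when all earlier keys tie. The sketch below uses a simplified, hypothetical `manifest` struct rather than `e2e.Manifest`.

```go
package main

import (
	"fmt"
	"sort"
)

// manifest is a simplified, hypothetical stand-in for a testnet manifest.
type manifest struct {
	Nodes         int
	InitialHeight int64
	TxSize        int64
	Perturbations int
}

// less compares lexicographically: a later key matters only when all
// earlier keys are equal.
func less(a, b manifest) bool {
	if a.Nodes != b.Nodes {
		return a.Nodes < b.Nodes
	}
	if a.InitialHeight != b.InitialHeight {
		return a.InitialHeight < b.InitialHeight
	}
	if a.TxSize != b.TxSize {
		return a.TxSize < b.TxSize
	}
	return a.Perturbations < b.Perturbations
}

// sortManifests orders the slice in place from least to most complex.
func sortManifests(ms []manifest) {
	sort.SliceStable(ms, func(i, j int) bool { return less(ms[i], ms[j]) })
}

func main() {
	ms := []manifest{
		{Nodes: 4, InitialHeight: 1, TxSize: 1024, Perturbations: 0},
		{Nodes: 2, InitialHeight: 1000, TxSize: 10240, Perturbations: 3},
		{Nodes: 2, InitialHeight: 1, TxSize: 10240, Perturbations: 1},
	}
	sortManifests(ms)
	for _, m := range ms {
		fmt.Printf("%+v\n", m)
	}
}
```

Whether node count should dominate the other keys is a design choice; the "(vaguely)" in the doc comment suggests the exact priority is not critical here.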

@@ -182,7 +182,7 @@ func LoadTestnet(file string) (*Testnet, error) {
Perturbations: []Perturbation{},
LogLevel: manifest.LogLevel,
QueueType: manifest.QueueType,
UseLegacyP2P: manifest.UseLegacyP2P && nodeManifest.UseLegacyP2P,
UseLegacyP2P: nodeManifest.UseLegacyP2P,
}
if node.StartAt == testnet.InitialHeight {

@@ -19,7 +19,7 @@ FAILED=()
for MANIFEST in "$@"; do
START=$SECONDS
echo "==> Running testnet $MANIFEST..."
echo "==> Running testnet: $MANIFEST"
if ! ./build/runner -f "$MANIFEST"; then
echo "==> Testnet $MANIFEST failed, dumping manifest..."

@@ -21,8 +21,8 @@ import (
//
// Metrics are based on `benchmarkLength`, the number of consecutive blocks
// sampled from the testnet.
func Benchmark(testnet *e2e.Testnet, benchmarkLength int64) error {
block, _, err := waitForHeight(testnet, 0)
func Benchmark(ctx context.Context, testnet *e2e.Testnet, benchmarkLength int64) error {
block, err := getLatestBlock(ctx, testnet)
if err != nil {
return err
}
@@ -32,13 +32,15 @@ func Benchmark(testnet *e2e.Testnet, benchmarkLength int64) error {
// wait for the length of the benchmark period in blocks to pass. We allow 5 seconds for each block
// which should be sufficient.
waitingTime := time.Duration(benchmarkLength*5) * time.Second
endHeight, err := waitForAllNodes(testnet, block.Height+benchmarkLength, waitingTime)
ctx, cancel := context.WithTimeout(ctx, waitingTime)
defer cancel()
block, _, err = waitForHeight(ctx, testnet, block.Height+benchmarkLength)
if err != nil {
return err
}
dur := time.Since(startAt)
logger.Info("Ending benchmark period", "height", endHeight)
logger.Info("Ending benchmark period", "height", block.Height)
// fetch a sample of blocks
blocks, err := fetchBlockChainSample(testnet, benchmarkLength)

@@ -28,7 +28,7 @@ const lightClientEvidenceRatio = 4
// evidence and broadcasts it to a random node through the rpc endpoint `/broadcast_evidence`.
// Evidence is random and can be a mixture of LightClientAttackEvidence and
// DuplicateVoteEvidence.
func InjectEvidence(testnet *e2e.Testnet, amount int) error {
func InjectEvidence(ctx context.Context, testnet *e2e.Testnet, amount int) error {
// select a random node
var targetNode *e2e.Node
@@ -79,9 +79,12 @@ func InjectEvidence(testnet *e2e.Testnet, amount int) error {
return err
}
wctx, cancel := context.WithTimeout(ctx, time.Minute)
defer cancel()
// wait for the node to reach the height above the forged height so that
// it is able to validate the evidence
_, err = waitForNode(targetNode, waitHeight, 30*time.Second)
_, err = waitForNode(wctx, targetNode, waitHeight)
if err != nil {
return err
}
@@ -107,9 +110,12 @@ func InjectEvidence(testnet *e2e.Testnet, amount int) error {
}
}
wctx, cancel = context.WithTimeout(ctx, 30*time.Second)
defer cancel()
// wait for the node to reach the height above the forged height so that
// it is able to validate the evidence
_, err = waitForNode(targetNode, blockRes.Block.Height+2, 10*time.Second)
_, err = waitForNode(wctx, targetNode, blockRes.Block.Height+2)
if err != nil {
return err
}

@@ -3,10 +3,9 @@ package main
import (
"container/ring"
"context"
"crypto/rand"
"errors"
"fmt"
"math"
"math/rand"
"time"
rpchttp "github.com/tendermint/tendermint/rpc/client/http"
@@ -15,9 +14,8 @@ import (
)
// Load generates transactions against the network until the given context is
// canceled. A multiplier of greater than one can be supplied if load needs to
// be generated beyond a minimum amount.
func Load(ctx context.Context, testnet *e2e.Testnet, multiplier int) error {
// canceled.
func Load(ctx context.Context, testnet *e2e.Testnet) error {
// Since transactions are executed across all nodes in the network, we need
// to reduce transaction load for larger networks to avoid using too much
// CPU. This gives high-throughput small networks and low-throughput large ones.
@@ -27,11 +25,9 @@ func Load(ctx context.Context, testnet *e2e.Testnet, multiplier int) error {
if concurrency == 0 {
concurrency = 1
}
initialTimeout := 1 * time.Minute
stallTimeout := 30 * time.Second
chTx := make(chan types.Tx)
chSuccess := make(chan types.Tx)
chSuccess := make(chan int) // success counts per iteration
ctx, cancel := context.WithCancel(ctx)
defer cancel()
@@ -39,61 +35,115 @@ func Load(ctx context.Context, testnet *e2e.Testnet, multiplier int) error {
logger.Info(fmt.Sprintf("Starting transaction load (%v workers)...", concurrency))
started := time.Now()
go loadGenerate(ctx, chTx, multiplier, testnet.TxSize)
go loadGenerate(ctx, chTx, testnet.TxSize)
for w := 0; w < concurrency; w++ {
go loadProcess(ctx, testnet, chTx, chSuccess)
}
// Monitor successful transactions, and abort on stalls.
// Monitor transactions to ensure load propagates to the network
//
// This loop doesn't check or time out for stalls, since a stall here just
// aborts the load generator sooner and could obscure backpressure
// from the test harness, and there are other checks for
// stalls in the framework. Ideally we should monitor latency as a guide
// for when to give up, but we don't have a good way to track that yet.
success := 0
timeout := initialTimeout
for {
select {
case <-chSuccess:
success++
timeout = stallTimeout
case <-time.After(timeout):
return fmt.Errorf("unable to submit transactions for %v", timeout)
case numSeen := <-chSuccess:
success += numSeen
case <-ctx.Done():
if success == 0 {
// if we couldn't submit any transactions,
// that's probably a problem and the test
// should error; however, for very short tests
// we shouldn't abort.
//
// The 2s cutoff is a rough guess based on
// the expected value of
// loadGenerateWaitTime. If the implementation
// of that function changes, then this might
// also need to change without more
// refactoring.
if success == 0 && time.Since(started) > 2*time.Second {
return errors.New("failed to submit any transactions")
}
logger.Info(fmt.Sprintf("Ending transaction load after %v txs (%.1f tx/s)...",
success, float64(success)/time.Since(started).Seconds()))
// TODO perhaps allow test networks to
// declare required transaction rates, which
// might allow us to avoid the special case
// around 0 txs above.
rate := float64(success) / time.Since(started).Seconds()
logger.Info("ending transaction load",
"dur_secs", time.Since(started).Seconds(),
"txns", success,
"rate", rate,
"slow", rate < 1)
return nil
}
}
}
// loadGenerate generates jobs until the context is canceled
func loadGenerate(ctx context.Context, chTx chan<- types.Tx, multiplier int, size int64) {
for i := 0; i < math.MaxInt64; i++ {
// loadGenerate generates jobs until the context is canceled.
//
// The chTx has multiple consumers, thus the rate limiting of the load
// generation is primarily the result of backpressure from the
// broadcast transaction, though there is still some timer-based
// limiting.
func loadGenerate(ctx context.Context, chTx chan<- types.Tx, size int64) {
timer := time.NewTimer(0)
defer timer.Stop()
defer close(chTx)
for {
select {
case <-ctx.Done():
return
case <-timer.C:
}
// We keep generating the same 100 keys over and over, with different values.
// This gives a reasonable load without putting too much data in the app.
id := i % 100
id := rand.Int63() % 100 // nolint: gosec
bz := make([]byte, size)
_, err := rand.Read(bz)
_, err := rand.Read(bz) // nolint: gosec
if err != nil {
panic(fmt.Sprintf("Failed to read random bytes: %v", err))
}
tx := types.Tx(fmt.Sprintf("load-%X=%x", id, bz))
select {
case chTx <- tx:
sqrtSize := int(math.Sqrt(float64(size)))
time.Sleep(10 * time.Millisecond * time.Duration(sqrtSize/multiplier))
case <-ctx.Done():
close(chTx)
return
case chTx <- tx:
// sleep for a bit before sending the
// next transaction.
timer.Reset(loadGenerateWaitTime(size))
}
}
}
func loadGenerateWaitTime(size int64) time.Duration {
const (
min = int64(100 * time.Millisecond)
max = int64(time.Second)
)
var (
baseJitter = rand.Int63n(max-min+1) + min // nolint: gosec
sizeFactor = size * int64(time.Millisecond)
sizeJitter = rand.Int63n(sizeFactor-min+1) + min // nolint: gosec
)
return time.Duration(baseJitter + sizeJitter)
}
// loadProcess processes transactions
func loadProcess(ctx context.Context, testnet *e2e.Testnet, chTx <-chan types.Tx, chSuccess chan<- types.Tx) {
func loadProcess(ctx context.Context, testnet *e2e.Testnet, chTx <-chan types.Tx, chSuccess chan<- int) {
// Each worker gets its own client to each usable node, which
// allows for some concurrency while still bounding it.
clients := make([]*rpchttp.HTTP, 0, len(testnet.Nodes))
@@ -127,8 +177,7 @@ func loadProcess(ctx context.Context, testnet *e2e.Testnet, chTx <-chan types.Tx
clientRing = clientRing.Next()
}
var err error
successes := 0
for {
select {
case <-ctx.Done():
@@ -137,19 +186,24 @@ func loadProcess(ctx context.Context, testnet *e2e.Testnet, chTx <-chan types.Tx
clientRing = clientRing.Next()
client := clientRing.Value.(*rpchttp.HTTP)
if _, err := client.Health(ctx); err != nil {
if status, err := client.Status(ctx); err != nil {
continue
} else if status.SyncInfo.CatchingUp {
continue
}
if _, err = client.BroadcastTxSync(ctx, tx); err != nil {
if _, err := client.BroadcastTxSync(ctx, tx); err != nil {
continue
}
successes++
select {
case chSuccess <- tx:
case chSuccess <- successes:
successes = 0 // reset counter for the next iteration
continue
case <-ctx.Done():
return
default:
}
}
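
The jittered delay computed by loadGenerateWaitTime can be sketched in isolation. The version below is a minimal sketch and not the code from this change: the name clampedWaitTime and the clamp on sizeFactor are assumptions added for illustration. The clamp is there because rand.Int63n panics when its argument is not positive, which the unclamped expression would hit for a size below 100 bytes.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// clampedWaitTime is a hypothetical variant of loadGenerateWaitTime. The
// clamp on sizeFactor is an assumption added for illustration only.
func clampedWaitTime(size int64) time.Duration {
	const (
		minWait = int64(100 * time.Millisecond)
		maxWait = int64(time.Second)
	)

	sizeFactor := size * int64(time.Millisecond)
	if sizeFactor < minWait {
		// rand.Int63n panics when its argument is <= 0, so keep the
		// range below non-empty for transactions smaller than 100 bytes.
		sizeFactor = minWait
	}

	// Both terms are drawn uniformly from a closed interval that starts
	// at minWait.
	baseJitter := rand.Int63n(maxWait-minWait+1) + minWait
	sizeJitter := rand.Int63n(sizeFactor-minWait+1) + minWait
	return time.Duration(baseJitter + sizeJitter)
}

func main() {
	for _, size := range []int64{1, 100, 1024} {
		fmt.Printf("size=%d wait=%v\n", size, clampedWaitTime(size))
	}
}
```

With these constants the delay for a 1024-byte transaction falls between 200ms and roughly 2s, so larger transactions are sent less often on average.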


@@ -57,44 +57,47 @@ func NewCLI() *CLI {
}
chLoadResult := make(chan error)
ctx, loadCancel := context.WithCancel(context.Background())
ctx, cancel := context.WithCancel(cmd.Context())
defer cancel()
lctx, loadCancel := context.WithCancel(ctx)
defer loadCancel()
go func() {
err := Load(ctx, cli.testnet, 1)
chLoadResult <- err
chLoadResult <- Load(lctx, cli.testnet)
}()
if err := Start(cli.testnet); err != nil {
if err := Start(ctx, cli.testnet); err != nil {
return err
}
if err := Wait(cli.testnet, 5); err != nil { // allow some txs to go through
if err := Wait(ctx, cli.testnet, 5); err != nil { // allow some txs to go through
return err
}
if cli.testnet.HasPerturbations() {
if err := Perturb(cli.testnet); err != nil {
if err := Perturb(ctx, cli.testnet); err != nil {
return err
}
if err := Wait(cli.testnet, 5); err != nil { // allow some txs to go through
if err := Wait(ctx, cli.testnet, 5); err != nil { // allow some txs to go through
return err
}
}
if cli.testnet.Evidence > 0 {
if err := InjectEvidence(cli.testnet, cli.testnet.Evidence); err != nil {
if err := InjectEvidence(ctx, cli.testnet, cli.testnet.Evidence); err != nil {
return err
}
if err := Wait(cli.testnet, 5); err != nil { // ensure chain progress
if err := Wait(ctx, cli.testnet, 5); err != nil { // ensure chain progress
return err
}
}
loadCancel()
if err := <-chLoadResult; err != nil {
return fmt.Errorf("transaction load failed: %w", err)
}
if err := Wait(cli.testnet, 5); err != nil { // wait for network to settle before tests
if err := Wait(ctx, cli.testnet, 5); err != nil { // wait for network to settle before tests
return err
}
if err := Test(cli.testnet); err != nil {
@@ -139,7 +142,7 @@ func NewCLI() *CLI {
if err != nil {
return err
}
return Start(cli.testnet)
return Start(cmd.Context(), cli.testnet)
},
})
@@ -147,7 +150,7 @@ func NewCLI() *CLI {
Use: "perturb",
Short: "Perturbs the Docker testnet, e.g. by restarting or disconnecting nodes",
RunE: func(cmd *cobra.Command, args []string) error {
return Perturb(cli.testnet)
return Perturb(cmd.Context(), cli.testnet)
},
})
@@ -155,7 +158,7 @@ func NewCLI() *CLI {
Use: "wait",
Short: "Waits for a few blocks to be produced and all nodes to catch up",
RunE: func(cmd *cobra.Command, args []string) error {
return Wait(cli.testnet, 5)
return Wait(cmd.Context(), cli.testnet, 5)
},
})
@@ -187,20 +190,10 @@ func NewCLI() *CLI {
})
cli.root.AddCommand(&cobra.Command{
Use: "load [multiplier]",
Args: cobra.MaximumNArgs(1),
Use: "load",
Short: "Generates transaction load until the command is canceled",
RunE: func(cmd *cobra.Command, args []string) (err error) {
m := 1
if len(args) == 1 {
m, err = strconv.Atoi(args[0])
if err != nil {
return err
}
}
return Load(context.Background(), cli.testnet, m)
return Load(context.Background(), cli.testnet)
},
})
@@ -218,7 +211,7 @@ func NewCLI() *CLI {
}
}
return InjectEvidence(cli.testnet, amount)
return InjectEvidence(cmd.Context(), cli.testnet, amount)
},
})
@@ -281,23 +274,26 @@ Does not run any perturbations.
}
chLoadResult := make(chan error)
ctx, loadCancel := context.WithCancel(context.Background())
ctx, cancel := context.WithCancel(cmd.Context())
defer cancel()
lctx, loadCancel := context.WithCancel(ctx)
defer loadCancel()
go func() {
err := Load(ctx, cli.testnet, 1)
err := Load(lctx, cli.testnet)
chLoadResult <- err
}()
if err := Start(cli.testnet); err != nil {
if err := Start(ctx, cli.testnet); err != nil {
return err
}
if err := Wait(cli.testnet, 5); err != nil { // allow some txs to go through
if err := Wait(ctx, cli.testnet, 5); err != nil { // allow some txs to go through
return err
}
// we benchmark performance over the next 100 blocks
if err := Benchmark(cli.testnet, 100); err != nil {
if err := Benchmark(ctx, cli.testnet, 100); err != nil {
return err
}


@@ -1,6 +1,7 @@
package main
import (
"context"
"fmt"
"time"
@@ -9,14 +10,24 @@ import (
)
// Perturbs a running testnet.
func Perturb(testnet *e2e.Testnet) error {
func Perturb(ctx context.Context, testnet *e2e.Testnet) error {
timer := time.NewTimer(0) // first tick fires immediately; reset below
defer timer.Stop()
for _, node := range testnet.Nodes {
for _, perturbation := range node.Perturbations {
_, err := PerturbNode(node, perturbation)
if err != nil {
return err
select {
case <-ctx.Done():
return ctx.Err()
case <-timer.C:
_, err := PerturbNode(ctx, node, perturbation)
if err != nil {
return err
}
// give the network some time to recover between perturbations
timer.Reset(20 * time.Second)
}
time.Sleep(20 * time.Second) // give network some time to recover between each
}
}
return nil
@@ -24,7 +35,7 @@ func Perturb(testnet *e2e.Testnet) error {
// PerturbNode perturbs a node with a given perturbation, returning its status
// after recovering.
func PerturbNode(node *e2e.Node, perturbation e2e.Perturbation) (*rpctypes.ResultStatus, error) {
func PerturbNode(ctx context.Context, node *e2e.Node, perturbation e2e.Perturbation) (*rpctypes.ResultStatus, error) {
testnet := node.Testnet
switch perturbation {
case e2e.PerturbationDisconnect:
@@ -77,7 +88,9 @@ func PerturbNode(node *e2e.Node, perturbation e2e.Perturbation) (*rpctypes.Resul
return nil, nil
}
status, err := waitForNode(node, 0, 3*time.Minute)
ctx, cancel := context.WithTimeout(ctx, 5*time.Minute)
defer cancel()
status, err := waitForNode(ctx, node, 0)
if err != nil {
return nil, err
}
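
The change from time.Sleep to a timer inside a select, used in Perturb and in the wait helpers, follows a common cancellable-polling pattern. The sketch below is a generic illustration under stated assumptions: pollUntil, countChecks and cancelledBeforeSuccess are hypothetical names and are not part of the e2e runner.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// pollUntil (hypothetical helper) calls check immediately and then once per
// interval until it returns true or the context is done.
func pollUntil(ctx context.Context, interval time.Duration, check func() bool) error {
	timer := time.NewTimer(0) // fires right away for the first check
	defer timer.Stop()

	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-timer.C:
			if check() {
				return nil
			}
			// The channel was just drained, so Reset is safe here.
			timer.Reset(interval)
		}
	}
}

// countChecks polls until the third call succeeds and reports the call count.
func countChecks() (int, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	calls := 0
	err := pollUntil(ctx, time.Millisecond, func() bool {
		calls++
		return calls == 3
	})
	return calls, err
}

// cancelledBeforeSuccess reports whether a canceled context ends the loop
// with context.Canceled when the check never succeeds.
func cancelledBeforeSuccess() bool {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	err := pollUntil(ctx, time.Millisecond, func() bool { return false })
	return errors.Is(err, context.Canceled)
}

func main() {
	calls, err := countChecks()
	fmt.Println(calls, err)
	fmt.Println(cancelledBeforeSuccess())
}
```

Unlike a bare time.Sleep, the select lets a canceled context interrupt the wait between attempts instead of blocking for the full interval.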


@@ -13,23 +13,24 @@ import (
)
// waitForHeight waits for the network to reach a certain height (or above),
// returning the highest height seen. Errors if the network is not making
// returning the block at the height seen. Errors if the network is not making
// progress at all.
func waitForHeight(testnet *e2e.Testnet, height int64) (*types.Block, *types.BlockID, error) {
// If height == 0, the initial height of the test network is used as the target.
func waitForHeight(ctx context.Context, testnet *e2e.Testnet, height int64) (*types.Block, *types.BlockID, error) {
var (
err error
maxResult *rpctypes.ResultBlock
clients = map[string]*rpchttp.HTTP{}
lastHeight int64
lastIncrease = time.Now()
nodesAtHeight = map[string]struct{}{}
numRunningNodes int
)
for _, node := range testnet.Nodes {
if node.Mode == e2e.ModeSeed {
continue
}
if height == 0 {
height = testnet.InitialHeight
}
if node.Mode == e2e.ModeLight {
for _, node := range testnet.Nodes {
if node.Stateless() {
continue
}
@@ -38,86 +39,97 @@ func waitForHeight(testnet *e2e.Testnet, height int64) (*types.Block, *types.Blo
}
}
timer := time.NewTimer(0)
defer timer.Stop()
for {
for _, node := range testnet.Nodes {
// skip nodes that have reached the target height
if _, ok := nodesAtHeight[node.Name]; ok {
continue
}
select {
case <-ctx.Done():
return nil, nil, ctx.Err()
case <-timer.C:
for _, node := range testnet.Nodes {
// skip nodes that have reached the target height
if _, ok := nodesAtHeight[node.Name]; ok {
continue
}
if node.Mode == e2e.ModeSeed {
continue
}
// skip nodes that don't have state or haven't started yet
if node.Stateless() {
continue
}
if !node.HasStarted {
continue
}
if node.Mode == e2e.ModeLight {
continue
}
// cache the clients
client, ok := clients[node.Name]
if !ok {
client, err = node.Client()
if err != nil {
continue
}
clients[node.Name] = client
}
if !node.HasStarted {
continue
}
// cache the clients
client, ok := clients[node.Name]
if !ok {
client, err = node.Client()
wctx, cancel := context.WithTimeout(ctx, 10*time.Second)
defer cancel()
result, err := client.Status(wctx)
if err != nil {
continue
}
clients[node.Name] = client
}
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
defer cancel()
result, err := client.Block(ctx, nil)
if err != nil {
continue
}
if result.Block != nil && (maxResult == nil || result.Block.Height > maxResult.Block.Height) {
maxResult = result
lastIncrease = time.Now()
}
if maxResult != nil && maxResult.Block.Height >= height {
// the node has achieved the target height!
// add this node to the set of target
// height nodes
nodesAtHeight[node.Name] = struct{}{}
// if not all of the nodes that we
// have clients for have reached the
// target height, keep trying.
if numRunningNodes > len(nodesAtHeight) {
continue
if result.SyncInfo.LatestBlockHeight > lastHeight {
lastHeight = result.SyncInfo.LatestBlockHeight
lastIncrease = time.Now()
}
// return once all nodes have reached
// the target height.
return maxResult.Block, &maxResult.BlockID, nil
}
}
if result.SyncInfo.LatestBlockHeight >= height {
// the node has achieved the target height!
if len(clients) == 0 {
return nil, nil, errors.New("unable to connect to any network nodes")
}
if time.Since(lastIncrease) >= time.Minute {
if maxResult == nil {
return nil, nil, errors.New("chain stalled at unknown height")
// add this node to the set of target
// height nodes
nodesAtHeight[node.Name] = struct{}{}
// if not all of the nodes that we
// have clients for have reached the
// target height, keep trying.
if numRunningNodes > len(nodesAtHeight) {
continue
}
// All nodes are at or above the target height. Now fetch the block for that target height
// and return it. We loop again through all clients because some may have pruning set but
// at least two of them should be archive nodes.
for _, c := range clients {
result, err := c.Block(ctx, &height)
if err != nil || result == nil || result.Block == nil {
continue
}
return result.Block, &result.BlockID, err
}
}
}
return nil, nil, fmt.Errorf("chain stalled at height %v [%d of %d nodes]",
maxResult.Block.Height,
len(nodesAtHeight),
numRunningNodes)
if len(clients) == 0 {
return nil, nil, errors.New("unable to connect to any network nodes")
}
if time.Since(lastIncrease) >= time.Minute {
if lastHeight == 0 {
return nil, nil, errors.New("chain stalled at unknown height (most likely upon starting)")
}
return nil, nil, fmt.Errorf("chain stalled at height %v [%d of %d nodes %+v]",
lastHeight,
len(nodesAtHeight),
numRunningNodes,
nodesAtHeight)
}
timer.Reset(1 * time.Second)
}
time.Sleep(1 * time.Second)
}
}
// waitForNode waits for a node to become available and catch up to the given block height.
func waitForNode(node *e2e.Node, height int64, timeout time.Duration) (*rpctypes.ResultStatus, error) {
func waitForNode(ctx context.Context, node *e2e.Node, height int64) (*rpctypes.ResultStatus, error) {
if node.Mode == e2e.ModeSeed {
return nil, nil
}
@@ -126,42 +138,91 @@ func waitForNode(node *e2e.Node, height int64, timeout time.Duration) (*rpctypes
return nil, err
}
ctx, cancel := context.WithTimeout(context.Background(), timeout)
defer cancel()
timer := time.NewTimer(0)
defer timer.Stop()
var (
lastFailed bool
counter int
)
for {
status, err := client.Status(ctx)
switch {
case errors.Is(err, context.DeadlineExceeded):
return nil, fmt.Errorf("timed out waiting for %v to reach height %v", node.Name, height)
case errors.Is(err, context.Canceled):
return nil, err
case err == nil && status.SyncInfo.LatestBlockHeight >= height:
return status, nil
counter++
if lastFailed {
lastFailed = false
// if there was a problem with the request in
// the previous iteration, recreate the client
// to ensure reconnection
client, err = node.Client()
if err != nil {
return nil, err
}
}
time.Sleep(300 * time.Millisecond)
select {
case <-ctx.Done():
return nil, ctx.Err()
case <-timer.C:
status, err := client.Status(ctx)
switch {
case errors.Is(err, context.DeadlineExceeded):
return nil, fmt.Errorf("timed out waiting for %v to reach height %v", node.Name, height)
case errors.Is(err, context.Canceled):
return nil, err
case err == nil && status.SyncInfo.LatestBlockHeight >= height:
return status, nil
case counter%100 == 0:
switch {
case err != nil:
lastFailed = true
logger.Error("node not yet ready",
"iter", counter,
"node", node.Name,
"err", err,
"target", height,
)
case status != nil:
logger.Error("node not yet ready",
"iter", counter,
"node", node.Name,
"height", status.SyncInfo.LatestBlockHeight,
"target", height,
)
}
}
timer.Reset(250 * time.Millisecond)
}
}
}
// waitForAllNodes waits for all nodes to become available and catch up to the given block height.
func waitForAllNodes(testnet *e2e.Testnet, height int64, timeout time.Duration) (int64, error) {
var lastHeight int64
// getLatestBlock returns the last block that all active nodes in the network have
// agreed upon, i.e. the earliest of each node's latest block.
func getLatestBlock(ctx context.Context, testnet *e2e.Testnet) (*types.Block, error) {
var earliestBlock *types.Block
for _, node := range testnet.Nodes {
if node.Mode == e2e.ModeSeed {
// skip nodes that don't have state or haven't started yet
if node.Stateless() {
continue
}
if !node.HasStarted {
continue
}
status, err := waitForNode(node, height, timeout)
client, err := node.Client()
if err != nil {
return 0, err
return nil, err
}
if status.SyncInfo.LatestBlockHeight > lastHeight {
lastHeight = status.SyncInfo.LatestBlockHeight
wctx, cancel := context.WithTimeout(ctx, 10*time.Second)
defer cancel()
result, err := client.Block(wctx, nil)
if err != nil {
return nil, err
}
if result.Block != nil && (earliestBlock == nil || earliestBlock.Height > result.Block.Height) {
earliestBlock = result.Block
}
}
return lastHeight, nil
return earliestBlock, nil
}
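
The selection rule in getLatestBlock (take the lowest of the nodes' latest heights, ignoring nodes that have not started) can be shown without any RPC calls. This is a minimal sketch over plain values; nodeInfo and lowestLatestHeight are hypothetical names, and the real function works on blocks fetched from each node's client.

```go
package main

import "fmt"

// nodeInfo is a hypothetical stand-in for the per-node data the runner reads.
type nodeInfo struct {
	Name    string
	Height  int64
	Started bool
}

// lowestLatestHeight returns the smallest latest height among started nodes.
// The boolean is false when no started node was found.
func lowestLatestHeight(nodes []nodeInfo) (int64, bool) {
	var (
		lowest int64
		found  bool
	)
	for _, n := range nodes {
		if !n.Started {
			continue // nodes that have not started cannot report a height
		}
		if !found || n.Height < lowest {
			lowest = n.Height
			found = true
		}
	}
	return lowest, found
}

func main() {
	nodes := []nodeInfo{
		{Name: "validator01", Height: 10, Started: true},
		{Name: "validator02", Height: 7, Started: true},
		{Name: "full01", Height: 3, Started: false},
	}
	fmt.Println(lowestLatestHeight(nodes))
}
```

Starting from the lowest height means a subsequent wait for a few more blocks covers the slowest started node as well.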


@@ -297,7 +297,7 @@ func MakeConfig(node *e2e.Node) (*config.Config, error) {
}
if node.BlockSync == "" {
cfg.FastSyncMode = false
cfg.BlockSync.Enable = false
} else {
cfg.BlockSync.Version = node.BlockSync
}


@@ -1,6 +1,7 @@
package main
import (
"context"
"fmt"
"sort"
"time"
@@ -8,7 +9,7 @@ import (
e2e "github.com/tendermint/tendermint/test/e2e/pkg"
)
func Start(testnet *e2e.Testnet) error {
func Start(ctx context.Context, testnet *e2e.Testnet) error {
if len(testnet.Nodes) == 0 {
return fmt.Errorf("no nodes in testnet")
}
@@ -45,7 +46,14 @@ func Start(testnet *e2e.Testnet) error {
if err := execCompose(testnet.Dir, "up", "-d", node.Name); err != nil {
return err
}
if _, err := waitForNode(node, 0, time.Minute); err != nil {
if err := func() error {
ctx, cancel := context.WithTimeout(ctx, time.Minute)
defer cancel()
_, err := waitForNode(ctx, node, 0)
return err
}(); err != nil {
return err
}
node.HasStarted = true
@@ -60,7 +68,7 @@ func Start(testnet *e2e.Testnet) error {
"nodes", len(testnet.Nodes)-len(nodeQueue),
"pending", len(nodeQueue))
block, blockID, err := waitForHeight(testnet, networkHeight)
block, blockID, err := waitForHeight(ctx, testnet, networkHeight)
if err != nil {
return err
}
@@ -74,9 +82,16 @@ func Start(testnet *e2e.Testnet) error {
// that this node will start at before we
// start the node.
logger.Info("Waiting for network to advance to height",
"node", node.Name,
"last_height", networkHeight,
"waiting_for", node.StartAt,
"size", len(testnet.Nodes)-len(nodeQueue),
"pending", len(nodeQueue))
networkHeight = node.StartAt
block, blockID, err = waitForHeight(testnet, networkHeight)
block, blockID, err = waitForHeight(ctx, testnet, networkHeight)
if err != nil {
return err
}
@@ -93,10 +108,15 @@ func Start(testnet *e2e.Testnet) error {
if err := execCompose(testnet.Dir, "up", "-d", node.Name); err != nil {
return err
}
status, err := waitForNode(node, node.StartAt, 8*time.Minute)
wctx, wcancel := context.WithTimeout(ctx, 8*time.Minute)
status, err := waitForNode(wctx, node, node.StartAt)
if err != nil {
wcancel()
return err
}
wcancel()
node.HasStarted = true
logger.Info(fmt.Sprintf("Node %v up on http://127.0.0.1:%v at height %v",
node.Name, node.ProxyPort, status.SyncInfo.LatestBlockHeight))


@@ -15,5 +15,5 @@ func Test(testnet *e2e.Testnet) error {
return err
}
return execVerbose("go", "test", "-count", "1", "./tests/...")
return execVerbose("./build/tests", "-test.count", "1")
}


@@ -1,31 +1,27 @@
package main
import (
"context"
"fmt"
"time"
e2e "github.com/tendermint/tendermint/test/e2e/pkg"
)
// Wait waits for a number of blocks to be produced, and for all nodes to catch
// up with it.
func Wait(testnet *e2e.Testnet, blocks int64) error {
block, _, err := waitForHeight(testnet, 0)
func Wait(ctx context.Context, testnet *e2e.Testnet, blocks int64) error {
block, err := getLatestBlock(ctx, testnet)
if err != nil {
return err
}
return WaitUntil(testnet, block.Height+blocks)
return WaitUntil(ctx, testnet, block.Height+blocks)
}
// WaitUntil waits until a given height has been reached.
func WaitUntil(testnet *e2e.Testnet, height int64) error {
func WaitUntil(ctx context.Context, testnet *e2e.Testnet, height int64) error {
logger.Info(fmt.Sprintf("Waiting for all nodes to reach height %v...", height))
_, err := waitForAllNodes(testnet, height, waitingTime(len(testnet.Nodes)))
_, _, err := waitForHeight(ctx, testnet, height)
return err
}
// waitingTime estimates how long it should take for a node to reach the height.
// More nodes in a network implies we may expect a slower network and may have to wait longer.
func waitingTime(nodes int) time.Duration {
return time.Minute + (time.Duration(nodes) * (30 * time.Second))
}


@@ -2,6 +2,7 @@ package e2e_test
import (
"bytes"
"context"
"fmt"
"math/rand"
"testing"
@@ -10,6 +11,7 @@ import (
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
"github.com/tendermint/tendermint/rpc/client/http"
e2e "github.com/tendermint/tendermint/test/e2e/pkg"
"github.com/tendermint/tendermint/types"
)
@@ -44,7 +46,7 @@ func TestApp_Hash(t *testing.T) {
block, err := client.Block(ctx, nil)
require.NoError(t, err)
require.EqualValues(t, info.Response.LastBlockAppHash, block.Block.AppHash,
require.EqualValues(t, info.Response.LastBlockAppHash, block.Block.AppHash.Bytes(),
"app hash does not match last block's app hash")
status, err := client.Status(ctx)
@@ -56,42 +58,101 @@ func TestApp_Hash(t *testing.T) {
// Tests that we can set a value and retrieve it.
func TestApp_Tx(t *testing.T) {
testNode(t, func(t *testing.T, node e2e.Node) {
client, err := node.Client()
require.NoError(t, err)
type broadcastFunc func(context.Context, types.Tx) error
// Generate a random value, to prevent duplicate tx errors when
// manually running the test multiple times for a testnet.
r := rand.New(rand.NewSource(time.Now().UnixNano()))
bz := make([]byte, 32)
_, err = r.Read(bz)
require.NoError(t, err)
testCases := []struct {
Name string
WaitTime time.Duration
BroadcastTx func(client *http.HTTP) broadcastFunc
ShouldSkip bool
}{
{
Name: "Sync",
WaitTime: time.Minute,
BroadcastTx: func(client *http.HTTP) broadcastFunc {
return func(ctx context.Context, tx types.Tx) error {
_, err := client.BroadcastTxSync(ctx, tx)
return err
}
},
},
{
Name: "Commit",
WaitTime: 15 * time.Second,
// TODO: turn this check back on if it can
// return reliably. Currently these calls have
// a hard timeout of 10s (server side
// configured). The Sync check is probably
// safe.
ShouldSkip: true,
BroadcastTx: func(client *http.HTTP) broadcastFunc {
return func(ctx context.Context, tx types.Tx) error {
_, err := client.BroadcastTxCommit(ctx, tx)
return err
}
},
},
{
Name: "Async",
WaitTime: 90 * time.Second,
// TODO: turn this check back on if there's a
// way to avoid failures in the case that the
// transaction doesn't make it into the
// mempool. (retries?)
ShouldSkip: true,
BroadcastTx: func(client *http.HTTP) broadcastFunc {
return func(ctx context.Context, tx types.Tx) error {
_, err := client.BroadcastTxAsync(ctx, tx)
return err
}
},
},
}
key := fmt.Sprintf("testapp-tx-%v", node.Name)
value := fmt.Sprintf("%x", bz)
tx := types.Tx(fmt.Sprintf("%v=%v", key, value))
_, err = client.BroadcastTxSync(ctx, tx)
require.NoError(t, err)
hash := tx.Hash()
waitTime := 20 * time.Second
require.Eventuallyf(t, func() bool {
txResp, err := client.Tx(ctx, hash, false)
return err == nil && bytes.Equal(txResp.Tx, tx)
}, waitTime, time.Second,
"submitted tx %X wasn't committed after %v", hash, waitTime,
)
// NOTE: we don't test abci query of the light client
if node.Mode == e2e.ModeLight {
return
for idx, test := range testCases {
if test.ShouldSkip {
continue
}
t.Run(test.Name, func(t *testing.T) {
// testNode calls t.Parallel as well, so we should
// have a copy of the test case
test := testCases[idx]
testNode(t, func(t *testing.T, node e2e.Node) {
client, err := node.Client()
require.NoError(t, err)
// Generate a random value, to prevent duplicate tx errors when
// manually running the test multiple times for a testnet.
bz := make([]byte, 32)
_, err = rand.Read(bz)
require.NoError(t, err)
key := fmt.Sprintf("testapp-tx-%v", node.Name)
value := fmt.Sprintf("%x", bz)
tx := types.Tx(fmt.Sprintf("%v=%v", key, value))
require.NoError(t, test.BroadcastTx(client)(ctx, tx))
hash := tx.Hash()
require.Eventuallyf(t, func() bool {
txResp, err := client.Tx(ctx, hash, false)
return err == nil && bytes.Equal(txResp.Tx, tx)
},
test.WaitTime, // timeout
time.Second, // interval
"submitted tx %X wasn't committed after %v",
hash, test.WaitTime,
)
abciResp, err := client.ABCIQuery(ctx, "", []byte(key))
require.NoError(t, err)
assert.Equal(t, key, string(abciResp.Response.Key))
assert.Equal(t, value, string(abciResp.Response.Value))
})
})
}
abciResp, err := client.ABCIQuery(ctx, "", []byte(key))
require.NoError(t, err)
assert.Equal(t, key, string(abciResp.Response.Key))
assert.Equal(t, value, string(abciResp.Response.Value))
})
}
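
The per-iteration copy `test := testCases[idx]` matters because the subtest body may run after the loop has moved on. The sketch below illustrates the same idea with plain closures instead of the testing package; testCase and scheduledNames are hypothetical names used only for this illustration. In Go versions before 1.22 the copy is needed for correctness, while in later versions loop variables are already per-iteration and the copy is harmless.

```go
package main

import "fmt"

// testCase is a hypothetical, trimmed-down version of the table entries.
type testCase struct {
	Name       string
	ShouldSkip bool
}

// scheduledNames builds one closure per non-skipped case, runs them only
// after the loop has finished, and returns the names they observed.
func scheduledNames(cases []testCase) []string {
	var deferred []func() string
	for idx := range cases {
		if cases[idx].ShouldSkip {
			continue
		}
		tc := cases[idx] // per-iteration copy captured by the closure
		deferred = append(deferred, func() string { return tc.Name })
	}

	names := make([]string, 0, len(deferred))
	for _, fn := range deferred {
		names = append(names, fn())
	}
	return names
}

func main() {
	cases := []testCase{
		{Name: "Sync"},
		{Name: "Commit", ShouldSkip: true},
		{Name: "Async", ShouldSkip: true},
	}
	fmt.Println(scheduledNames(cases))
}
```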


@@ -3,7 +3,6 @@ package e2e_test
import (
"context"
"os"
"path/filepath"
"sync"
"testing"
@@ -72,9 +71,6 @@ func loadTestnet(t *testing.T) e2e.Testnet {
if manifest == "" {
t.Skip("E2E_MANIFEST not set, not an end-to-end test run")
}
if !filepath.IsAbs(manifest) {
manifest = filepath.Join("..", manifest)
}
testnetCacheMtx.Lock()
defer testnetCacheMtx.Unlock()


@@ -883,7 +883,7 @@ func (commit *Commit) ValidateBasic() error {
}
if commit.Height >= 1 {
if commit.BlockID.IsZero() {
if commit.BlockID.IsNil() {
return errors.New("commit cannot be for nil block")
}
@@ -1204,8 +1204,8 @@ func (blockID BlockID) ValidateBasic() error {
return nil
}
// IsZero returns true if this is the BlockID of a nil block.
func (blockID BlockID) IsZero() bool {
// IsNil returns true if this is the BlockID of a nil block.
func (blockID BlockID) IsNil() bool {
return len(blockID.Hash) == 0 &&
blockID.PartSetHeader.IsZero()
}


@@ -21,7 +21,7 @@ func CanonicalizeBlockID(bid tmproto.BlockID) *tmproto.CanonicalBlockID {
panic(err)
}
var cbid *tmproto.CanonicalBlockID
if rbid == nil || rbid.IsZero() {
if rbid == nil || rbid.IsNil() {
cbid = nil
} else {
cbid = &tmproto.CanonicalBlockID{


@@ -221,10 +221,6 @@ func (b *EventBus) PublishEventPolka(data EventDataRoundState) error {
return b.Publish(EventPolkaValue, data)
}
func (b *EventBus) PublishEventUnlock(data EventDataRoundState) error {
return b.Publish(EventUnlockValue, data)
}
func (b *EventBus) PublishEventRelock(data EventDataRoundState) error {
return b.Publish(EventRelockValue, data)
}
@@ -301,10 +297,6 @@ func (NopEventBus) PublishEventPolka(data EventDataRoundState) error {
return nil
}
func (NopEventBus) PublishEventUnlock(data EventDataRoundState) error {
return nil
}
func (NopEventBus) PublishEventRelock(data EventDataRoundState) error {
return nil
}


@@ -362,8 +362,6 @@ func TestEventBusPublish(t *testing.T) {
require.NoError(t, err)
err = eventBus.PublishEventPolka(EventDataRoundState{})
require.NoError(t, err)
err = eventBus.PublishEventUnlock(EventDataRoundState{})
require.NoError(t, err)
err = eventBus.PublishEventRelock(EventDataRoundState{})
require.NoError(t, err)
err = eventBus.PublishEventLock(EventDataRoundState{})
@@ -475,7 +473,6 @@ var events = []string{
EventTimeoutProposeValue,
EventCompleteProposalValue,
EventPolkaValue,
EventUnlockValue,
EventLockValue,
EventRelockValue,
EventTimeoutWaitValue,
@@ -497,7 +494,6 @@ var queries = []tmpubsub.Query{
EventQueryTimeoutPropose,
EventQueryCompleteProposal,
EventQueryPolka,
EventQueryUnlock,
EventQueryLock,
EventQueryRelock,
EventQueryTimeoutWait,


@@ -38,7 +38,6 @@ const (
EventStateSyncStatusValue = "StateSyncStatus"
EventTimeoutProposeValue = "TimeoutPropose"
EventTimeoutWaitValue = "TimeoutWait"
EventUnlockValue = "Unlock"
EventValidBlockValue = "ValidBlock"
EventVoteValue = "Vote"
)
@@ -223,7 +222,6 @@ var (
EventQueryTimeoutPropose = QueryForEvent(EventTimeoutProposeValue)
EventQueryTimeoutWait = QueryForEvent(EventTimeoutWaitValue)
EventQueryTx = QueryForEvent(EventTxValue)
EventQueryUnlock = QueryForEvent(EventUnlockValue)
EventQueryValidatorSetUpdates = QueryForEvent(EventValidatorSetUpdatesValue)
EventQueryValidBlock = QueryForEvent(EventValidBlockValue)
EventQueryVote = QueryForEvent(EventVoteValue)


@@ -41,6 +41,7 @@ type ConsensusParams struct {
Evidence EvidenceParams `json:"evidence"`
Validator ValidatorParams `json:"validator"`
Version VersionParams `json:"version"`
Timestamp TimestampParams `json:"timestamp"`
}
// HashedParams is a subset of ConsensusParams.
@@ -65,6 +66,14 @@ type EvidenceParams struct {
MaxBytes int64 `json:"max_bytes"`
}
// TimestampParams define the acceptable amount of clock skew among different
// validators on a network.
type TimestampParams struct {
Accuracy time.Duration `json:"accuracy"`
Precision time.Duration `json:"precision"`
MessageDelay time.Duration `json:"message_delay"`
}
// ValidatorParams restrict the public key types validators can use.
// NOTE: uses ABCI pubkey naming, not Amino names.
type ValidatorParams struct {
@@ -235,6 +244,11 @@ func (params ConsensusParams) UpdateConsensusParams(params2 *tmproto.ConsensusPa
if params2.Version != nil {
res.Version.AppVersion = params2.Version.AppVersion
}
if params2.Timestamp != nil {
res.Timestamp.Accuracy = params2.Timestamp.Accuracy
res.Timestamp.Precision = params2.Timestamp.Precision
res.Timestamp.MessageDelay = params2.Timestamp.MessageDelay
}
return res
}
@@ -255,6 +269,11 @@ func (params *ConsensusParams) ToProto() tmproto.ConsensusParams {
Version: &tmproto.VersionParams{
AppVersion: params.Version.AppVersion,
},
Timestamp: &tmproto.TimestampParams{
Accuracy: params.Timestamp.Accuracy,
Precision: params.Timestamp.Precision,
MessageDelay: params.Timestamp.MessageDelay,
},
}
}
@@ -275,5 +294,10 @@ func ConsensusParamsFromProto(pbParams tmproto.ConsensusParams) ConsensusParams
Version: VersionParams{
AppVersion: pbParams.Version.AppVersion,
},
Timestamp: TimestampParams{
Accuracy: pbParams.Timestamp.Accuracy,
Precision: pbParams.Timestamp.Precision,
MessageDelay: pbParams.Timestamp.MessageDelay,
},
}
}
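
UpdateConsensusParams only overwrites the timestamp parameters when the update carries a non-nil Timestamp section. A minimal sketch of that guard follows, using local stand-in types: timestampParams and applyTimestampUpdate are hypothetical and do not reproduce the tmproto definitions. Since ToProto builds the Timestamp field from a pointer, a conversion that dereferences it without such a guard would presumably panic on a message that omits the section.

```go
package main

import (
	"fmt"
	"time"
)

// timestampParams is a hypothetical local stand-in for TimestampParams.
type timestampParams struct {
	Accuracy     time.Duration
	Precision    time.Duration
	MessageDelay time.Duration
}

// applyTimestampUpdate returns the current parameters unchanged when the
// update is nil, and the updated values otherwise.
func applyTimestampUpdate(current timestampParams, update *timestampParams) timestampParams {
	if update == nil {
		return current // nothing to apply
	}
	return *update
}

func main() {
	current := timestampParams{
		Accuracy:     500 * time.Millisecond,
		Precision:    10 * time.Millisecond,
		MessageDelay: 2 * time.Second,
	}
	fmt.Println(applyTimestampUpdate(current, nil))
	fmt.Println(applyTimestampUpdate(current, &timestampParams{Accuracy: time.Second}))
}
```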


@@ -68,7 +68,7 @@ func (vote *Vote) CommitSig() CommitSig {
switch {
case vote.BlockID.IsComplete():
blockIDFlag = BlockIDFlagCommit
case vote.BlockID.IsZero():
case vote.BlockID.IsNil():
blockIDFlag = BlockIDFlagNil
default:
panic(fmt.Sprintf("Invalid vote %v - expected BlockID to be either empty or complete", vote))
@@ -177,7 +177,7 @@ func (vote *Vote) ValidateBasic() error {
// BlockID.ValidateBasic would not err if we for instance have an empty hash but a
// non-empty PartsSetHeader:
if !vote.BlockID.IsZero() && !vote.BlockID.IsComplete() {
if !vote.BlockID.IsNil() && !vote.BlockID.IsComplete() {
return fmt.Errorf("blockID must be either empty or complete, got: %v", vote.BlockID)
}


@@ -27,7 +27,7 @@ func TestVoteSet_AddVote_Good(t *testing.T) {
assert.Nil(t, voteSet.GetByAddress(val0Addr))
assert.False(t, voteSet.BitArray().GetIndex(0))
blockID, ok := voteSet.TwoThirdsMajority()
assert.False(t, ok || !blockID.IsZero(), "there should be no 2/3 majority")
assert.False(t, ok || !blockID.IsNil(), "there should be no 2/3 majority")
vote := &Vote{
ValidatorAddress: val0Addr,
@@ -44,7 +44,7 @@ func TestVoteSet_AddVote_Good(t *testing.T) {
assert.NotNil(t, voteSet.GetByAddress(val0Addr))
assert.True(t, voteSet.BitArray().GetIndex(0))
blockID, ok = voteSet.TwoThirdsMajority()
assert.False(t, ok || !blockID.IsZero(), "there should be no 2/3 majority")
assert.False(t, ok || !blockID.IsNil(), "there should be no 2/3 majority")
}
func TestVoteSet_AddVote_Bad(t *testing.T) {
@@ -145,7 +145,7 @@ func TestVoteSet_2_3Majority(t *testing.T) {
require.NoError(t, err)
}
blockID, ok := voteSet.TwoThirdsMajority()
assert.False(t, ok || !blockID.IsZero(), "there should be no 2/3 majority")
assert.False(t, ok || !blockID.IsNil(), "there should be no 2/3 majority")
// 7th validator voted for some blockhash
{
@@ -156,7 +156,7 @@ func TestVoteSet_2_3Majority(t *testing.T) {
_, err = signAddVote(privValidators[6], withBlockHash(vote, tmrand.Bytes(32)), voteSet)
require.NoError(t, err)
blockID, ok = voteSet.TwoThirdsMajority()
assert.False(t, ok || !blockID.IsZero(), "there should be no 2/3 majority")
assert.False(t, ok || !blockID.IsNil(), "there should be no 2/3 majority")
}
// 8th validator voted for nil.
@@ -168,7 +168,7 @@ func TestVoteSet_2_3Majority(t *testing.T) {
_, err = signAddVote(privValidators[7], vote, voteSet)
require.NoError(t, err)
blockID, ok = voteSet.TwoThirdsMajority()
assert.True(t, ok || blockID.IsZero(), "there should be 2/3 majority for nil")
assert.True(t, ok || blockID.IsNil(), "there should be 2/3 majority for nil")
}
}
@@ -200,7 +200,7 @@ func TestVoteSet_2_3MajorityRedux(t *testing.T) {
require.NoError(t, err)
}
blockID, ok := voteSet.TwoThirdsMajority()
assert.False(t, ok || !blockID.IsZero(),
assert.False(t, ok || !blockID.IsNil(),
"there should be no 2/3 majority")
// 67th validator voted for nil
@@ -212,7 +212,7 @@ func TestVoteSet_2_3MajorityRedux(t *testing.T) {
_, err = signAddVote(privValidators[66], withBlockHash(vote, nil), voteSet)
require.NoError(t, err)
blockID, ok = voteSet.TwoThirdsMajority()
assert.False(t, ok || !blockID.IsZero(),
assert.False(t, ok || !blockID.IsNil(),
"there should be no 2/3 majority: last vote added was nil")
}
@@ -226,7 +226,7 @@ func TestVoteSet_2_3MajorityRedux(t *testing.T) {
_, err = signAddVote(privValidators[67], withBlockPartSetHeader(vote, blockPartsHeader), voteSet)
require.NoError(t, err)
blockID, ok = voteSet.TwoThirdsMajority()
assert.False(t, ok || !blockID.IsZero(),
assert.False(t, ok || !blockID.IsNil(),
"there should be no 2/3 majority: last vote added had different PartSetHeader Hash")
}
@@ -240,7 +240,7 @@ func TestVoteSet_2_3MajorityRedux(t *testing.T) {
_, err = signAddVote(privValidators[68], withBlockPartSetHeader(vote, blockPartsHeader), voteSet)
require.NoError(t, err)
blockID, ok = voteSet.TwoThirdsMajority()
assert.False(t, ok || !blockID.IsZero(),
assert.False(t, ok || !blockID.IsNil(),
"there should be no 2/3 majority: last vote added had different PartSetHeader Total")
}
@@ -253,7 +253,7 @@ func TestVoteSet_2_3MajorityRedux(t *testing.T) {
_, err = signAddVote(privValidators[69], withBlockHash(vote, tmrand.Bytes(32)), voteSet)
require.NoError(t, err)
blockID, ok = voteSet.TwoThirdsMajority()
assert.False(t, ok || !blockID.IsZero(),
assert.False(t, ok || !blockID.IsNil(),
"there should be no 2/3 majority: last vote added had different BlockHash")
}