Mirror of https://github.com/tendermint/tendermint.git, synced 2026-01-14 00:32:52 +00:00

Compare commits: rfc-e2e-te… → wb/proposa… (15 commits)
| SHA1 |
|---|
| 9e19803ded |
| 671c22b5d1 |
| 0d2a5de203 |
| fe2ed68718 |
| a3889ee2cb |
| 87f4beb374 |
| b0423e2445 |
| b0684bd300 |
| 382947ce93 |
| 9a7ce08e3e |
| 55f6d20977 |
| b9c35c1263 |
| f08f72e334 |
| e932b469ed |
| 5db2a39643 |
@@ -40,5 +40,7 @@ sections.

 - [RFC-000: P2P Roadmap](./rfc-000-p2p-roadmap.rst)
 - [RFC-001: Storage Engines](./rfc-001-storage-engine.rst)
 - [RFC-002: Interprocess Communication](./rfc-002-ipc-ecosystem.md)
+- [RFC-003: Performance Taxonomy](./rfc-003-performance-questions.md)
+- [RFC-004: E2E Test Framework Enhancements](./rfc-004-e2e-framework.rst)

 <!-- - [RFC-NNN: Title](./rfc-NNN-title.md) -->
docs/rfc/rfc-003-performance-questions.md — new file, 283 lines
@@ -0,0 +1,283 @@
# RFC 003: Taxonomy of potential performance issues in Tendermint

## Changelog

- 2021-09-02: Created initial draft (@wbanfield)
- 2021-09-14: Add discussion of the event system (@wbanfield)

## Abstract

This document discusses the various sources of performance issues in Tendermint and
attempts to clarify what work may be required to understand and address them.

## Background

Performance, loosely defined as the ability of a software process to perform its work
quickly and efficiently under load and within reasonable resource limits, is a frequent
topic of discussion in the Tendermint project.
To effectively address any issues with Tendermint performance, we need to
categorize the various issues, understand their potential sources, and gauge their
impact on users.

Categorizing the different known performance issues will allow us to discuss and fix them
more systematically. This document proposes a rough taxonomy of performance issues
and highlights areas where more research into potential performance problems is required.

Understanding Tendermint's performance limitations will also be critically important
as we make changes to many of its subsystems. Performance is a central concern for
upcoming decisions regarding the `p2p` protocol, RPC message encoding and structure,
database usage and selection, and consensus protocol updates.

## Discussion

This section attempts to delineate the different sections of Tendermint functionality
that are often cited as having performance issues. It raises questions and suggests
lines of inquiry that may be valuable for better understanding Tendermint's performance issues.

As a note: we should avoid quickly adding many microbenchmarks or package-level benchmarks.
These are prone to being worse than useless, as they can obscure what _should_ be
focused on: performance of the system from the perspective of a user. We should,
instead, tune performance with an eye towards user needs and the actions users take. These users comprise
both operators of Tendermint chains and the people generating transactions for
Tendermint chains. Both of these sets of users are largely aligned in wanting an end-to-end
system that operates quickly and efficiently.

REQUEST: The list below may be incomplete; if there are additional sections that are often
cited as creating poor performance, please comment so that they may be included.
### P2P

#### Claim: Tendermint cannot scale to large numbers of nodes

A complaint has been reported that Tendermint networks cannot scale to large numbers of nodes.
The number of nodes a user reported as causing issues was in the thousands.
We don't currently have evidence of the upper limit on the number of nodes that Tendermint's
P2P stack can scale to.

We need to more concretely understand the source of the issues and determine which layer
is causing the problem. It's possible that the P2P layer, in the absence of any reactors
sending data, is perfectly capable of managing thousands of peer connections. For
a reasonable networking and application setup, thousands of connections should not present any
issue for the application.

We need more data to understand the problem directly. We want to drive the popularity
and adoption of Tendermint, and this will mean allowing for chains with more validators.
We should follow up with users experiencing this issue. We may then want to add
a series of metrics to the P2P layer to better understand the inefficiencies it produces.

The following metrics can help us understand the sources of latency in the Tendermint P2P stack:

* Number of messages sent and received per second
* Time a message spends on the P2P layer send and receive queues

The following metrics exist and should be leveraged in addition to those added:

* Number of peers a node is connected to
* Number of bytes per channel sent to and received from each peer
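As a rough illustration of what adding those first two metrics could look like, the sketch below defines them with the Prometheus Go client. The metric names, labels, and `observeSend` hook are illustrative assumptions, not names Tendermint actually uses.

```go
package p2pmetrics

import (
	"time"

	"github.com/prometheus/client_golang/prometheus"
)

// Hypothetical metric definitions; names and label sets are placeholders.
var (
	messagesTotal = prometheus.NewCounterVec(prometheus.CounterOpts{
		Name: "p2p_messages_total",
		Help: "Number of P2P messages sent and received.",
	}, []string{"direction", "channel"})

	queueTime = prometheus.NewHistogramVec(prometheus.HistogramOpts{
		Name:    "p2p_queue_time_seconds",
		Help:    "Time a message spent on the P2P send or receive queue.",
		Buckets: prometheus.DefBuckets,
	}, []string{"direction"})
)

func init() {
	prometheus.MustRegister(messagesTotal, queueTime)
}

// observeSend would be called when a message leaves the send queue,
// recording both throughput and time spent queued.
func observeSend(channel string, enqueuedAt time.Time) {
	messagesTotal.WithLabelValues("send", channel).Inc()
	queueTime.WithLabelValues("send").Observe(time.Since(enqueuedAt).Seconds())
}
```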
### Sync

#### Claim: Block syncing is slow

Bootstrapping a new node in a network to the height of the rest of the network is believed to
take longer than users would like. Block sync requires fetching all of the blocks from
peers and writing them to local disk for storage. A useful line of inquiry
is understanding how quickly a perfectly tuned system _could_ fetch all of the state
over a network, so that we understand how much overhead Tendermint actually adds.

The operation is likely to be _incredibly_ dependent on the environment in which
the node is being run. The factors that will influence syncing include:

1. Number of peers that a syncing node may fetch from.
2. Speed of the disk that a validator is writing to.
3. Speed of the network connection between the different peers that the node is
syncing from.

We should calculate how quickly this operation _could possibly_ complete for common chains and nodes.
To calculate how quickly this operation could possibly complete, we should assume that
a node is reading at the line rate of the NIC and writing at the full drive speed to its
local storage. Comparing this theoretical upper limit to the actual sync times
observed by node operators will give us a good point of comparison for understanding
how much overhead Tendermint incurs.
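To make the envelope calculation concrete, the sketch below computes a theoretical floor on sync time from an assumed chain size, NIC line rate, and disk throughput. All three numbers are illustrative assumptions, not measurements.

```go
package main

import "fmt"

func main() {
	// Illustrative assumptions, not measurements.
	chainBytes := 250e9      // 250 GB of blocks to fetch
	nicBytesPerSec := 1.25e9 // 10 Gbit/s NIC at line rate
	diskBytesPerSec := 0.5e9 // 500 MB/s sequential write

	// The slower of the network and the disk bounds the pipeline.
	bottleneck := nicBytesPerSec
	if diskBytesPerSec < bottleneck {
		bottleneck = diskBytesPerSec
	}
	fmt.Printf("theoretical floor: %.0f seconds\n", chainBytes/bottleneck)
	// Any observed sync time above this floor is protocol and execution overhead.
}
```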
We should additionally add metrics to the blocksync operation to more clearly pinpoint
slow operations. The following metrics should be added to the block syncing operation:

* Time to fetch and validate each block
* Time to execute a block
* Blocks synced per unit time
### Application

Applications performing complex state transitions have the potential to bottleneck
the Tendermint node.

#### Claim: ABCI block delivery could cause slowdown

ABCI delivers blocks via several methods: `BeginBlock`, `DeliverTx`, `EndBlock`, `Commit`.

Tendermint delivers transactions one-by-one via the `DeliverTx` call. Most of the
transaction delivery in Tendermint occurs asynchronously and therefore appears unlikely to
form a bottleneck in ABCI.

After delivering all transactions, Tendermint then calls the `Commit` ABCI method.
Tendermint [locks all access to the mempool][abci-commit-description] while `Commit`
proceeds. This means that an application that is slow to execute all of its
transactions or finalize state during the `Commit` method will prevent any new
transactions from being added to the mempool. Apps that are slow to commit will also
prevent consensus from proceeding to the next consensus height, since Tendermint
cannot validate or produce block proposals without the
AppHash obtained from the `Commit` method. We should add a metric for each
step in the ABCI protocol to track the amount of time that a node spends communicating
with the application at each step.
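One lightweight way to collect such a metric is to wrap the ABCI client and time each method call. The sketch below shows the idea for `Commit` only, using a simplified stand-in interface rather than Tendermint's actual ABCI types.

```go
package abcitiming

import "time"

// Application is a simplified stand-in for the real ABCI interface;
// a full wrapper would cover every method the same way.
type Application interface {
	Commit() []byte // returns the AppHash
}

// timedApp records how long the wrapped application spends in each call.
type timedApp struct {
	inner   Application
	observe func(method string, d time.Duration) // e.g. feeds a histogram
}

func (a *timedApp) Commit() []byte {
	start := time.Now()
	hash := a.inner.Commit()
	a.observe("commit", time.Since(start))
	return hash
}
```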
#### Claim: ABCI serialization overhead causes slowdown

The most common way to run a Tendermint application is using the Cosmos SDK.
The Cosmos SDK runs the ABCI application within the same process as Tendermint.
When an application is run in the same process as Tendermint, a serialization penalty
is not paid. This is because the local ABCI client does not serialize method calls
and instead passes the protobuf type through directly. This can be seen
in [local_client.go][abci-local-client-code].

Serialization and deserialization in the gRPC and socket protocol ABCI methods
may cause slowdown. While these may cause issues, they are not part of the primary
use case of Tendermint and do not necessarily need to be addressed at this time.
### RPC

#### Claim: The Query API is slow

The query API locks a mutex across the ABCI connections. This causes consensus to
slow down during queries, as ABCI is no longer able to make progress. This is known
to be causing issues in the Cosmos SDK and is being addressed [in the SDK][sdk-query-fix],
but a more robust solution may be required. Adding metrics to each ABCI client connection
and message, as described in the Application section of this document, would allow us
to further introspect the issue here.
#### Claim: RPC serialization may cause slowdown

The Tendermint RPC uses a modified version of JSON-RPC. This RPC powers the `broadcast_tx_*` methods,
which are, at the moment, the critical path for adding transactions to Tendermint. These methods are
likely invoked quite frequently on popular networks. Being able to perform efficiently
on this common and critical operation is very important. The current JSON-RPC implementation
relies heavily on type introspection via reflection, which is known to be slow in
Go. We should therefore produce benchmarks of these methods to determine how much overhead
we are adding to what is likely to be a very common operation.
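A starting point for such a benchmark is to measure the reflection-based encoding in isolation. The sketch below benchmarks `encoding/json` marshaling of a hypothetical `broadcast_tx` request shape; the `rpcRequest` type is illustrative, not Tendermint's actual RPC envelope.

```go
package rpcbench

import (
	"encoding/json"
	"testing"
)

// rpcRequest is a hypothetical stand-in for Tendermint's JSON-RPC envelope.
type rpcRequest struct {
	JSONRPC string            `json:"jsonrpc"`
	ID      int               `json:"id"`
	Method  string            `json:"method"`
	Params  map[string][]byte `json:"params"`
}

func BenchmarkBroadcastTxEncode(b *testing.B) {
	req := rpcRequest{
		JSONRPC: "2.0",
		ID:      1,
		Method:  "broadcast_tx_sync",
		Params:  map[string][]byte{"tx": make([]byte, 256)},
	}
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		// json.Marshal walks the struct via reflection on every call.
		if _, err := json.Marshal(&req); err != nil {
			b.Fatal(err)
		}
	}
}
```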
The other JSON-RPC methods are much less critical to the core functionality of Tendermint.
While there may be other points of performance consideration within the RPC, methods that do not
receive high volumes of requests should not be prioritized for performance consideration.

NOTE: Previous discussion of the RPC framework was done in [ADR 57][adr-57], and
there is ongoing work to inspect and alter the JSON-RPC framework in [RFC 002][rfc-002].
Many of these RPC-related performance considerations can either wait until the RFC 002 work is done or be
considered in concert with the in-flight changes to the JSON-RPC.
### Protocol

#### Claim: Gossiping messages is a slow process

Currently, for any validator to successfully vote in a consensus _step_, it must
receive votes from greater than 2/3 of the validators on the network. In many cases,
it's preferable to receive as many votes as possible from correct validators.

This produces a quadratic increase in the number of messages communicated as more validators join the network,
since each of the N validators must communicate with all other N-1 validators. For a
network the size of the Cosmos Hub's validator set (140 validators), that is roughly
140 × 139 ≈ 19,500 vote transmissions per step, before any gossip duplication.

This large number of messages communicated per step has been identified as impacting
the performance of the protocol. Given that the number of messages communicated has been
identified as a bottleneck, it would be extremely valuable to gather data on how long
it takes for popular chains with many validators to gather all votes within a step.

Metrics that would improve visibility into this include:

* Amount of time for a node to gather votes in a step.
* Amount of time for a node to gather all block parts.
* Number of votes each node sends for gossip (i.e., not its own votes, but votes it is
transmitting for a peer).
* Total number of votes each node receives (a node may receive duplicate votes,
so understanding how frequently this occurs will be valuable in evaluating the performance
of the gossip system).
#### Claim: Hashing Txs causes slowdown in Tendermint

Using a faster hash algorithm for Tx hashes is currently a point of discussion
in Tendermint. Namely, it is being considered as part of the [modular hashing proposal][modular-hashing].
It is currently unknown whether hashing transactions in the mempool forms a significant bottleneck.
Although it does not appear to be documented as slow, there are a few open GitHub
issues that indicate a possible user preference for a faster hashing algorithm,
including [issue 2187][issue-2187] and [issue 2186][issue-2186].

It is likely worth investigating what order of magnitude Tx hashing takes in comparison to other
aspects of adding a Tx to the mempool. It is not currently clear that the rate of adding Txs
to the mempool is a source of user pain. We should not endeavor to make large changes to
consensus-critical components without first being certain that the change is highly
valuable and impactful.
### Digital Signatures

#### Claim: Verification of digital signatures may cause slowdown in Tendermint

Working with cryptographic signatures can be computationally expensive. The Cosmos
Hub uses [ed25519 signatures][hub-signature]. The library performing signature
verification in Tendermint on votes is [benchmarked][ed25519-bench] to verify an `ed25519`
signature in about 75μs on a decently fast CPU. A validator in the Cosmos Hub performs
3 sets of verifications on the signatures of the 140 validators in the Hub
in a consensus round: during block verification, when verifying the prevotes, and
when verifying the precommits. With no batching, this is 3 × 140 × 75μs, or roughly
`31.5ms` per round. It is quite unlikely, therefore, that this accounts for any serious amount
of the ~7 seconds of block time per height in the Hub.
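For a quick sanity check of the per-signature cost on a given machine, the standard library can be benchmarked directly. This sketch uses `crypto/ed25519` rather than the curve25519-voi library Tendermint actually uses, so the absolute numbers will differ.

```go
package sigbench

import (
	"crypto/ed25519"
	"crypto/rand"
	"testing"
)

func BenchmarkEd25519Verify(b *testing.B) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		b.Fatal(err)
	}
	msg := make([]byte, 128) // roughly the size of a canonical vote
	sig := ed25519.Sign(priv, msg)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		if !ed25519.Verify(pub, msg, sig) {
			b.Fatal("verification failed")
		}
	}
}
```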
This may cause slowdown when syncing, since the process needs to constantly verify
signatures. It's possible that improved signature aggregation will lead to improved
light client or other syncing performance. In general, a metric should be added
to track block rate while blocksyncing.
#### Claim: Our use of digital signatures in the consensus protocol contributes to performance issues

Currently, Tendermint's digital signature verification requires that all validators
receive all vote messages. Each validator must receive the complete digital signature
along with the vote message that it corresponds to. This means that all N validators
must receive messages from at least 2/3 of the N validators in each consensus
round. Given the potential for oddly shaped network topologies and the expected
variable network round-trip times of a few hundred milliseconds in a blockchain,
it is highly likely that this amount of gossiping is leading to a significant amount
of the slowdown in the Cosmos Hub and in Tendermint consensus.
### Tendermint Event System

#### Claim: The event system is a bottleneck in Tendermint

The Tendermint event system is used to communicate and store information about
internal Tendermint execution. The system uses channels internally to send messages
to different subscribers. Sending an event [blocks on the internal channel][event-send].
The default configuration is to [use an unbuffered channel for event publishes][event-buffer-capacity].
Several consumers of the event system also use an unbuffered channel for reads.
An example of this is the [event indexer][event-indexer-unbuffered], which takes an
unbuffered subscription to the event system. The result is that these unbuffered readers
can cause writes to the event system to block or slow down, depending on contention in the
event system. This has implications for the consensus system, which [publishes events][consensus-event-send].
To better understand the performance of the event system, we should add metrics to track the timing of
event sends. The following metrics would be a good start for tracking this performance:

* Time in event send, labeled by event type
* Time in event receive, labeled by subscriber
* Event throughput, measured in events per unit time
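The blocking behavior described above is inherent to unbuffered Go channels, as the self-contained sketch below demonstrates. It models a slow subscriber with plain channels, not Tendermint's actual pubsub types.

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	events := make(chan string) // unbuffered, like the default event bus capacity

	// A slow subscriber, standing in for a consumer such as the indexer.
	go func() {
		for e := range events {
			time.Sleep(50 * time.Millisecond) // simulate slow indexing
			_ = e
		}
	}()

	// The publisher blocks on every send until the subscriber is ready,
	// so a slow reader directly throttles the writer.
	start := time.Now()
	for i := 0; i < 5; i++ {
		events <- fmt.Sprintf("event-%d", i)
	}
	fmt.Printf("published 5 events in %v\n", time.Since(start))
	close(events)
}
```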
### References

[modular-hashing]: https://github.com/tendermint/tendermint/pull/6773
[issue-2186]: https://github.com/tendermint/tendermint/issues/2186
[issue-2187]: https://github.com/tendermint/tendermint/issues/2187
[rfc-002]: https://github.com/tendermint/tendermint/pull/6913
[adr-57]: https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-057-RPC.md
[issue-1319]: https://github.com/tendermint/tendermint/issues/1319
[abci-commit-description]: https://github.com/tendermint/spec/blob/master/spec/abci/apps.md#commit
[abci-local-client-code]: https://github.com/tendermint/tendermint/blob/511bd3eb7f037855a793a27ff4c53c12f085b570/abci/client/local_client.go#L84
[hub-signature]: https://github.com/cosmos/gaia/blob/0ecb6ed8a244d835807f1ced49217d54a9ca2070/docs/resources/genesis.md#consensus-parameters
[ed25519-bench]: https://github.com/oasisprotocol/curve25519-voi/blob/d2e7fc59fe38c18ca990c84c4186cba2cc45b1f9/PERFORMANCE.md
[event-send]: https://github.com/tendermint/tendermint/blob/5bd3b286a2b715737f6d6c33051b69061d38f8ef/libs/pubsub/pubsub.go#L338
[event-buffer-capacity]: https://github.com/tendermint/tendermint/blob/5bd3b286a2b715737f6d6c33051b69061d38f8ef/types/event_bus.go#L14
[event-indexer-unbuffered]: https://github.com/tendermint/tendermint/blob/5bd3b286a2b715737f6d6c33051b69061d38f8ef/state/indexer/indexer_service.go#L39
[consensus-event-send]: https://github.com/tendermint/tendermint/blob/5bd3b286a2b715737f6d6c33051b69061d38f8ef/internal/consensus/state.go#L1573
[sdk-query-fix]: https://github.com/cosmos/cosmos-sdk/pull/10045
docs/rfc/rfc-004-e2e-framework.rst — new file, 213 lines
@@ -0,0 +1,213 @@
========================================
RFC 004: E2E Test Framework Enhancements
========================================

Changelog
---------

- 2021-09-14: started initial draft (@tychoish)

Abstract
--------

This document discusses a series of improvements to the e2e test framework
that we can consider during the next few releases to help boost confidence in
Tendermint releases, and improve developer efficiency.

Background
----------

During the 0.35 release cycle, the e2e tests were a source of great
value, helping to identify a number of bugs before release. At the same time,
the tests were not consistently passing during this period, thereby reducing
their value and forcing the core development team to allocate time and energy
to maintaining and chasing down issues with the e2e tests and the test
harness. The experience of this release cycle calls to mind a series of
improvements to the test framework, and this document attempts to capture
these improvements, along with their motivations and potential for impact.

Projects
--------

Flexible Workload Generation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Presently the e2e suite contains a single workload generation pattern, which
exists simply to ensure that the test networks have some work during their
runs. However, the shape and volume of the work is very consistent and very
gentle, to help ensure test reliability.

We don't need a complex workload generation framework, but being able to have
a few different workload shapes available for test networks, both generated and
hand-crafted, would be useful.
Workload patterns/configurations might include (a sketch of one possible
configuration shape follows this list):

- transaction targeting patterns (include light nodes, round robin, target
  individual nodes)

- variable transaction size over time.

- transaction broadcast option (synchronously, checked, fire-and-forget,
  mixed).

- number of transactions to submit.

- non-transaction workloads (evidence submission, query, event subscription.)
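As a point of reference only, the following sketch shows one possible shape
for such a configuration in Go; every name in it is an illustrative
assumption rather than part of the existing framework:

.. code-block:: go

   // Package workload sketches a hypothetical configuration for e2e
   // workload generation; none of these names exist in the framework today.
   package workload

   // BroadcastMode selects how generated transactions are submitted.
   type BroadcastMode string

   const (
           BroadcastSync   BroadcastMode = "sync"   // wait for CheckTx
           BroadcastCommit BroadcastMode = "commit" // wait for inclusion in a block
           BroadcastAsync  BroadcastMode = "async"  // fire-and-forget
   )

   // Config describes a single workload shape for a test network.
   type Config struct {
           Targeting    string        // "round-robin", "light-nodes", "single-node"
           TxSizeBytes  []int         // sizes to cycle through over the run
           Broadcast    BroadcastMode // submission mode, possibly mixed per tx
           TotalTxs     int           // number of transactions to submit
           NonTxActions []string      // e.g. "evidence", "query", "subscribe"
   }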
Configurable Generator
~~~~~~~~~~~~~~~~~~~~~~

The nightly e2e suite is defined by the `testnet generator
<https://github.com/tendermint/tendermint/blob/master/test/e2e/generator/generate.go#L13-L65>`_,
and it's difficult to add dimensions or change the focus of the test suite in
any way without modifying the implementation of the generator. If the
generator were more configurable, potentially via a file rather than in
the Go implementation, we could modify the focus of the test suite on the
fly.

Features that we might want to configure:

- number of test networks to generate of various topologies, to improve
  coverage of different configurations.

- test application configurations (to modify the latency of ABCI calls, etc.)

- size of test networks.

- workload shape and behavior.

- initial sync and catch-up configurations.

The workload generator currently provides runtime options for limiting the
generator to specific types of P2P stacks, and for generating multiple groups
of test cases to support parallelism. The goal is to extend this pattern and
avoid hardcoding the matrix of test cases in the generator code. Once the
testnet configuration generation behavior is configurable at runtime,
developers will be able to use the e2e framework to validate changes before
landing them, rather than breaking the e2e tests and finding out a day later.
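To make the idea concrete, a file-driven generator might deserialize a small
manifest like the sketch below; the schema is entirely hypothetical:

.. code-block:: go

   // Package generator sketches a hypothetical manifest that a file-driven
   // testnet generator could consume; no such schema exists today.
   package generator

   // Manifest enumerates the dimensions a generated test suite could vary.
   type Manifest struct {
           Topologies    map[string]int `toml:"topologies"`      // topology name -> networks to generate
           NetworkSizes  []int          `toml:"network_sizes"`   // node counts to cover
           ABCILatencyMS []int          `toml:"abci_latency_ms"` // injected application latencies
           Workloads     []string       `toml:"workloads"`       // named workload shapes to apply
           P2PStacks     []string       `toml:"p2p_stacks"`      // restrict to particular P2P stacks
           Groups        int            `toml:"groups"`          // split output for CI parallelism
   }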
In addition to the autogenerated suite, it might make sense to maintain a
small collection of hand-crafted cases that exercise configurations of
concern, to run as part of the nightly (or less frequent) loop.
Implementation Plan Structure
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As a development team, we should determine which features should impact the e2e
tests early in the development cycle, and if we intend to modify the e2e
tests to exercise a feature, we should identify this early and begin the
integration process as early as possible.

To facilitate this, we should adopt a practice whereby we exercise specific
features that are currently under development more rigorously in the e2e
suite, and then, as development stabilizes, reduce the number or weight
of these features in the suite.

As of 0.35 there are essentially two end-to-end tests: the suite of 64
generated test networks, and the hand-crafted ``ci.toml`` test case. The
generated test cases help provide systematic coverage, while the ``ci`` run
provides coverage for a large number of features.
Reduce Cycle Time
~~~~~~~~~~~~~~~~~

One of the barriers to leveraging the e2e framework, and one of the challenges
in debugging failures, is that the cycle time of running a single test iteration is
quite high: 5 minutes to build the docker image, plus the time to run the test
or tests.

There are a number of improvements and enhancements that can reduce the cycle
time in practice:

- reduce the amount of time required to build the docker image used in these
  tests. Without the dependency on CGo, the tendermint binaries could be
  (cross) compiled outside of the docker container and then injected into
  it, which would take better advantage of docker's native caching;
  indeed, without the dependency on CGo there would be no hard requirement
  for the e2e tests to use docker at all.

- support test parallelism. Because of the way the testnets are orchestrated,
  a single system can really only run one network at a time. For executions
  (local or remote) with more resources, there's no reason not to run a few
  networks in parallel to reduce the feedback time.

- prune testnet configurations that are unlikely to provide good signal, to
  shorten the time to feedback.

- apply some kind of tiered approach to test execution, to improve the
  legibility of the test result. For example, order tests by the dependency of
  their features, or run test networks without perturbations before running
  that configuration with perturbations, to be able to isolate the impact of
  specific features.

- orchestrate the test harness directly from go test rather than via a special
  harness and shell scripts, so e2e tests fit more naturally into developers'
  existing workflows (a sketch of this follows the list).
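As a rough illustration of the last point, an e2e case could be driven from a
standard Go test; the ``testnet`` package and its entire API in this sketch
are hypothetical:

.. code-block:: go

   //go:build e2e

   package e2e_test

   import (
           "testing"

           "example.com/tendermint/test/e2e/testnet" // hypothetical package
   )

   // TestSmallNetwork sketches orchestrating a testnet from go test, so
   // that `go test -tags e2e` is the only entry point developers need.
   func TestSmallNetwork(t *testing.T) {
           net := testnet.Start(t, testnet.Config{Nodes: 4})
           t.Cleanup(net.Stop)

           if err := net.WaitForHeight(10); err != nil {
                   t.Fatalf("network failed to reach height 10: %v", err)
           }
   }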
Many of these improvements, particularly reducing the build time, will also
reduce the time to get feedback during automated builds.
Deeper Insights
~~~~~~~~~~~~~~~

When a test network fails, it's incredibly difficult to understand *why* the
network failed, as the current system provides very little insight into the
system outside of the process logs. When a test network stalls or fails,
developers should be able to quickly and easily get a sense of the state of
the network and all nodes.

Improvements in pursuit of this goal include functionality that would help
node operators in production environments by improving the quality and utility
of the logging messages and other reported metrics, but also provide some
tools to collect and aggregate this data for developers in the context of test
networks:

- Interleave messages from all nodes in the network to be able to correlate
  events during the test run.

- Collect structured metrics of the system operation (CPU/MEM/IO) during the
  test run, as well as from each tendermint/application process.

- Build (simple) tools to be able to render and summarize the data collected
  during the test run to answer basic questions about test outcome.
Flexible Assertions
~~~~~~~~~~~~~~~~~~~

Currently, all assertions run for every test network. This makes the
assertions pretty bland and the framework primarily useful as a smoke-test
framework; it might be useful to be able to write and run different
tests for different configurations, which would allow us to test outside of
the happy path.

In general, our existing assertions occupy a fraction of the total test time,
so the relative cost of adding a few extra test assertions would be limited,
and could help build confidence.
Additional Kinds of Testing
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The existing e2e suite exercises networks of nodes that run a homogeneous
tendermint version with a stable configuration and that are expected to make
progress. There are many other possible test configurations that may be
interesting to engage with. These could include dimensions such as:

- Multi-version testing to exercise our compatibility guarantees for networks
  that might have different tendermint versions.

- As a flavor of multi-version testing, include upgrade testing, to build
  confidence in migration code and procedures.

- Additional test applications, particularly practical-type applications,
  including some that use gaiad and/or the cosmos-sdk, and test-only
  applications that simulate other kinds of applications (e.g. variable
  application operation latency).

- Tests of "non-viable" configurations that ensure that forbidden combinations
  lead to halts.
References
----------

- `ADR 66: End-to-End Testing <../architecture/adr-66-e2e-testing.md>`_
go.mod — 2 lines changed
@@ -34,7 +34,7 @@ require (
 	github.com/spf13/viper v1.8.1
 	github.com/stretchr/testify v1.7.0
 	github.com/tendermint/tm-db v0.6.4
-	github.com/vektra/mockery/v2 v2.9.0
+	github.com/vektra/mockery/v2 v2.9.3
 	golang.org/x/crypto v0.0.0-20210513164829-c07d793c2f9a
 	golang.org/x/net v0.0.0-20210428140749-89ef3d95e781
 	golang.org/x/sync v0.0.0-20210220032951-036812b2e83c
go.sum — 4 lines changed
@@ -895,8 +895,8 @@ github.com/valyala/bytebufferpool v1.0.0/go.mod h1:6bBcMArwyJ5K/AmCkWv1jt77kVWyC
 github.com/valyala/fasthttp v1.16.0/go.mod h1:YOKImeEosDdBPnxc0gy7INqi3m1zK6A+xl6TwOBhHCA=
 github.com/valyala/quicktemplate v1.6.3/go.mod h1:fwPzK2fHuYEODzJ9pkw0ipCPNHZ2tD5KW4lOuSdPKzY=
 github.com/valyala/tcplisten v0.0.0-20161114210144-ceec8f93295a/go.mod h1:v3UYOV9WzVtRmSR+PDvWpU/qWl4Wa5LApYYX4ZtKbio=
-github.com/vektra/mockery/v2 v2.9.0 h1:+3FhCL3EviR779mTzXwUuhPNnqFUA7sDnt9OFkXaFd4=
-github.com/vektra/mockery/v2 v2.9.0/go.mod h1:2gU4Cf/f8YyC8oEaSXfCnZBMxMjMl/Ko205rlP0fO90=
+github.com/vektra/mockery/v2 v2.9.3 h1:ma6hcGQw4q/lhFUTJ+E9V8/5tsIcht9i2Q4d1qo26SQ=
+github.com/vektra/mockery/v2 v2.9.3/go.mod h1:2gU4Cf/f8YyC8oEaSXfCnZBMxMjMl/Ko205rlP0fO90=
 github.com/viki-org/dnscache v0.0.0-20130720023526-c70c1f23c5d8/go.mod h1:dniwbG03GafCjFohMDmz6Zc6oCuiqgH6tGNyXTkHzXE=
 github.com/xiang90/probing v0.0.0-20190116061207-43a291ad63a2/go.mod h1:UETIi67q53MR2AWcXfiuqkDkRtnGDLqkBTpCHuJHxtU=
 github.com/xo/terminfo v0.0.0-20210125001918-ca9a967f8778/go.mod h1:2MuV+tbUrU1zIOPMxZ5EncGwgmMJsa+9ucAQZXxsObs=
@@ -52,7 +52,7 @@ func TestByzantinePrevoteEquivocation(t *testing.T) {
 		thisConfig := ResetConfig(fmt.Sprintf("%s_%d", testName, i))
 		defer os.RemoveAll(thisConfig.RootDir)

-		ensureDir(path.Dir(thisConfig.Consensus.WalFile()), 0700) // dir for wal
+		ensureDir(t, path.Dir(thisConfig.Consensus.WalFile()), 0700) // dir for wal
 		app := appFunc()
 		vals := types.TM2PB.ValidatorUpdates(state.Validators)
 		app.InitChain(abci.RequestInitChain{Validators: vals})
@@ -69,9 +69,10 @@ func configSetup(t *testing.T) *cfg.Config {
 	return config
 }

-func ensureDir(dir string, mode os.FileMode) {
+func ensureDir(t *testing.T, dir string, mode os.FileMode) {
+	t.Helper()
 	if err := tmos.EnsureDir(dir, mode); err != nil {
-		panic(err)
+		t.Fatalf("error opening directory: %s", err)
 	}
 }
@@ -221,18 +222,20 @@ func startTestRound(cs *State, height int64, round int32) {

 // Create proposal block from cs1 but sign it with vs.
 func decideProposal(
+	t *testing.T,
 	cs1 *State,
 	vs *validatorStub,
 	height int64,
 	round int32,
 ) (proposal *types.Proposal, block *types.Block) {
+	t.Helper()
 	cs1.mtx.Lock()
 	block, blockParts := cs1.createProposalBlock()
 	validRound := cs1.ValidRound
 	chainID := cs1.state.ChainID
 	cs1.mtx.Unlock()
 	if block == nil {
-		panic("Failed to createProposalBlock. Did you forget to add commit for previous block?")
+		t.Fatal("Failed to createProposalBlock. Did you forget to add commit for previous block?")
 	}

 	// Make proposal
@@ -240,7 +243,7 @@ func decideProposal(
 	proposal = types.NewProposal(height, round, polRound, propBlockID)
 	p := proposal.ToProto()
 	if err := vs.SignProposal(context.Background(), chainID, p); err != nil {
-		panic(err)
+		t.Fatalf("error signing proposal: %s", err)
 	}

 	proposal.Signature = p.Signature
@@ -267,36 +270,38 @@ func signAddVotes(
 }

 func validatePrevote(t *testing.T, cs *State, round int32, privVal *validatorStub, blockHash []byte) {
+	t.Helper()
 	prevotes := cs.Votes.Prevotes(round)
 	pubKey, err := privVal.GetPubKey(context.Background())
 	require.NoError(t, err)
 	address := pubKey.Address()
 	var vote *types.Vote
 	if vote = prevotes.GetByAddress(address); vote == nil {
-		panic("Failed to find prevote from validator")
+		t.Fatalf("Failed to find prevote from validator")
 	}
 	if blockHash == nil {
 		if vote.BlockID.Hash != nil {
-			panic(fmt.Sprintf("Expected prevote to be for nil, got %X", vote.BlockID.Hash))
+			t.Fatalf("Expected prevote to be for nil, got %X", vote.BlockID.Hash)
 		}
 	} else {
 		if !bytes.Equal(vote.BlockID.Hash, blockHash) {
-			panic(fmt.Sprintf("Expected prevote to be for %X, got %X", blockHash, vote.BlockID.Hash))
+			t.Fatalf("Expected prevote to be for %X, got %X", blockHash, vote.BlockID.Hash)
 		}
 	}
 }

 func validateLastPrecommit(t *testing.T, cs *State, privVal *validatorStub, blockHash []byte) {
+	t.Helper()
 	votes := cs.LastCommit
 	pv, err := privVal.GetPubKey(context.Background())
 	require.NoError(t, err)
 	address := pv.Address()
 	var vote *types.Vote
 	if vote = votes.GetByAddress(address); vote == nil {
-		panic("Failed to find precommit from validator")
+		t.Fatalf("Failed to find precommit from validator")
 	}
 	if !bytes.Equal(vote.BlockID.Hash, blockHash) {
-		panic(fmt.Sprintf("Expected precommit to be for %X, got %X", blockHash, vote.BlockID.Hash))
+		t.Fatalf("Expected precommit to be for %X, got %X", blockHash, vote.BlockID.Hash)
 	}
 }
@@ -309,41 +314,42 @@ func validatePrecommit(
 	votedBlockHash,
 	lockedBlockHash []byte,
 ) {
+	t.Helper()
 	precommits := cs.Votes.Precommits(thisRound)
 	pv, err := privVal.GetPubKey(context.Background())
 	require.NoError(t, err)
 	address := pv.Address()
 	var vote *types.Vote
 	if vote = precommits.GetByAddress(address); vote == nil {
-		panic("Failed to find precommit from validator")
+		t.Fatalf("Failed to find precommit from validator")
 	}

 	if votedBlockHash == nil {
 		if vote.BlockID.Hash != nil {
-			panic("Expected precommit to be for nil")
+			t.Fatalf("Expected precommit to be for nil")
 		}
 	} else {
 		if !bytes.Equal(vote.BlockID.Hash, votedBlockHash) {
-			panic("Expected precommit to be for proposal block")
+			t.Fatalf("Expected precommit to be for proposal block")
 		}
 	}

 	if lockedBlockHash == nil {
 		if cs.LockedRound != lockRound || cs.LockedBlock != nil {
-			panic(fmt.Sprintf(
+			t.Fatalf(
 				"Expected to be locked on nil at round %d. Got locked at round %d with block %v",
 				lockRound,
 				cs.LockedRound,
-				cs.LockedBlock))
+				cs.LockedBlock)
 		}
 	} else {
 		if cs.LockedRound != lockRound || !bytes.Equal(cs.LockedBlock.Hash(), lockedBlockHash) {
-			panic(fmt.Sprintf(
+			t.Fatalf(
 				"Expected block to be locked on round %d, got %d. Got locked block %X, expected %X",
 				lockRound,
 				cs.LockedRound,
 				cs.LockedBlock.Hash(),
-				lockedBlockHash))
+				lockedBlockHash)
 		}
 	}
 }
@@ -357,6 +363,7 @@ func validatePrevoteAndPrecommit(
 	votedBlockHash,
 	lockedBlockHash []byte,
 ) {
+	t.Helper()
 	// verify the prevote
 	validatePrevote(t, cs, thisRound, privVal, votedBlockHash)
 	// verify precommit
@@ -444,13 +451,14 @@ func newStateWithConfigAndBlockStore(
 	return cs
 }

-func loadPrivValidator(config *cfg.Config) *privval.FilePV {
+func loadPrivValidator(t *testing.T, config *cfg.Config) *privval.FilePV {
+	t.Helper()
 	privValidatorKeyFile := config.PrivValidator.KeyFile()
-	ensureDir(filepath.Dir(privValidatorKeyFile), 0700)
+	ensureDir(t, filepath.Dir(privValidatorKeyFile), 0700)
 	privValidatorStateFile := config.PrivValidator.StateFile()
 	privValidator, err := privval.LoadOrGenFilePV(privValidatorKeyFile, privValidatorStateFile)
 	if err != nil {
-		panic(err)
+		t.Fatalf("error generating validator file: %s", err)
 	}
 	privValidator.Reset()
 	return privValidator
@@ -475,220 +483,238 @@ func randState(config *cfg.Config, nValidators int) (*State, []*validatorStub) {

 //-------------------------------------------------------------------------------

-func ensureNoNewEvent(ch <-chan tmpubsub.Message, timeout time.Duration,
+func ensureNoNewEvent(t *testing.T, ch <-chan tmpubsub.Message, timeout time.Duration,
 	errorMessage string) {
+	t.Helper()
 	select {
 	case <-time.After(timeout):
 		break
 	case <-ch:
-		panic(errorMessage)
+		t.Fatalf("unexpected event: %s", errorMessage)
 	}
 }

-func ensureNoNewEventOnChannel(ch <-chan tmpubsub.Message) {
+func ensureNoNewEventOnChannel(t *testing.T, ch <-chan tmpubsub.Message) {
+	t.Helper()
 	ensureNoNewEvent(
+		t,
 		ch,
 		ensureTimeout,
 		"We should be stuck waiting, not receiving new event on the channel")
 }

-func ensureNoNewRoundStep(stepCh <-chan tmpubsub.Message) {
+func ensureNoNewRoundStep(t *testing.T, stepCh <-chan tmpubsub.Message) {
+	t.Helper()
 	ensureNoNewEvent(
+		t,
 		stepCh,
 		ensureTimeout,
 		"We should be stuck waiting, not receiving NewRoundStep event")
 }

-func ensureNoNewUnlock(unlockCh <-chan tmpubsub.Message) {
-	ensureNoNewEvent(
-		unlockCh,
-		ensureTimeout,
-		"We should be stuck waiting, not receiving Unlock event")
-}
-
-func ensureNoNewTimeout(stepCh <-chan tmpubsub.Message, timeout int64) {
+func ensureNoNewTimeout(t *testing.T, stepCh <-chan tmpubsub.Message, timeout int64) {
+	t.Helper()
 	timeoutDuration := time.Duration(timeout*10) * time.Nanosecond
 	ensureNoNewEvent(
+		t,
 		stepCh,
 		timeoutDuration,
 		"We should be stuck waiting, not receiving NewTimeout event")
 }

-func ensureNewEvent(ch <-chan tmpubsub.Message, height int64, round int32, timeout time.Duration, errorMessage string) {
+func ensureNewEvent(t *testing.T, ch <-chan tmpubsub.Message, height int64, round int32, timeout time.Duration, errorMessage string) { // nolint: lll
+	t.Helper()
 	select {
 	case <-time.After(timeout):
-		panic(errorMessage)
+		t.Fatalf("timed out waiting for new event: %s", errorMessage)
 	case msg := <-ch:
 		roundStateEvent, ok := msg.Data().(types.EventDataRoundState)
 		if !ok {
-			panic(fmt.Sprintf("expected a EventDataRoundState, got %T. Wrong subscription channel?",
-				msg.Data()))
+			t.Fatalf("expected a EventDataRoundState, got %T. Wrong subscription channel?", msg.Data())
 		}
 		if roundStateEvent.Height != height {
-			panic(fmt.Sprintf("expected height %v, got %v", height, roundStateEvent.Height))
+			t.Fatalf("expected height %v, got %v", height, roundStateEvent.Height)
 		}
 		if roundStateEvent.Round != round {
-			panic(fmt.Sprintf("expected round %v, got %v", round, roundStateEvent.Round))
+			t.Fatalf("expected round %v, got %v", round, roundStateEvent.Round)
 		}
 		// TODO: We could check also for a step at this point!
 	}
 }

-func ensureNewRound(roundCh <-chan tmpubsub.Message, height int64, round int32) {
+func ensureNewRound(t *testing.T, roundCh <-chan tmpubsub.Message, height int64, round int32) {
+	t.Helper()
 	select {
 	case <-time.After(ensureTimeout):
-		panic("Timeout expired while waiting for NewRound event")
+		t.Fatal("Timeout expired while waiting for NewRound event")
 	case msg := <-roundCh:
 		newRoundEvent, ok := msg.Data().(types.EventDataNewRound)
 		if !ok {
-			panic(fmt.Sprintf("expected a EventDataNewRound, got %T. Wrong subscription channel?",
-				msg.Data()))
+			t.Fatalf("expected a EventDataNewRound, got %T. Wrong subscription channel?", msg.Data())
 		}
 		if newRoundEvent.Height != height {
-			panic(fmt.Sprintf("expected height %v, got %v", height, newRoundEvent.Height))
+			t.Fatalf("expected height %v, got %v", height, newRoundEvent.Height)
 		}
 		if newRoundEvent.Round != round {
-			panic(fmt.Sprintf("expected round %v, got %v", round, newRoundEvent.Round))
+			t.Fatalf("expected round %v, got %v", round, newRoundEvent.Round)
 		}
 	}
 }

-func ensureNewTimeout(timeoutCh <-chan tmpubsub.Message, height int64, round int32, timeout int64) {
+func ensureNewTimeout(t *testing.T, timeoutCh <-chan tmpubsub.Message, height int64, round int32, timeout int64) {
+	t.Helper()
 	timeoutDuration := time.Duration(timeout*10) * time.Nanosecond
-	ensureNewEvent(timeoutCh, height, round, timeoutDuration,
+	ensureNewEvent(t, timeoutCh, height, round, timeoutDuration,
 		"Timeout expired while waiting for NewTimeout event")
 }

-func ensureNewProposal(proposalCh <-chan tmpubsub.Message, height int64, round int32) {
+func ensureNewProposal(t *testing.T, proposalCh <-chan tmpubsub.Message, height int64, round int32) {
+	t.Helper()
 	select {
 	case <-time.After(ensureTimeout):
-		panic("Timeout expired while waiting for NewProposal event")
+		t.Fatalf("Timeout expired while waiting for NewProposal event")
 	case msg := <-proposalCh:
 		proposalEvent, ok := msg.Data().(types.EventDataCompleteProposal)
 		if !ok {
-			panic(fmt.Sprintf("expected a EventDataCompleteProposal, got %T. Wrong subscription channel?",
-				msg.Data()))
+			t.Fatalf("expected a EventDataCompleteProposal, got %T. Wrong subscription channel?",
+				msg.Data())
 		}
 		if proposalEvent.Height != height {
-			panic(fmt.Sprintf("expected height %v, got %v", height, proposalEvent.Height))
+			t.Fatalf("expected height %v, got %v", height, proposalEvent.Height)
 		}
 		if proposalEvent.Round != round {
-			panic(fmt.Sprintf("expected round %v, got %v", round, proposalEvent.Round))
+			t.Fatalf("expected round %v, got %v", round, proposalEvent.Round)
 		}
 	}
 }

-func ensureNewValidBlock(validBlockCh <-chan tmpubsub.Message, height int64, round int32) {
-	ensureNewEvent(validBlockCh, height, round, ensureTimeout,
+func ensureNewValidBlock(t *testing.T, validBlockCh <-chan tmpubsub.Message, height int64, round int32) {
+	t.Helper()
+	ensureNewEvent(t, validBlockCh, height, round, ensureTimeout,
 		"Timeout expired while waiting for NewValidBlock event")
 }

-func ensureNewBlock(blockCh <-chan tmpubsub.Message, height int64) {
+func ensureNewBlock(t *testing.T, blockCh <-chan tmpubsub.Message, height int64) {
+	t.Helper()
 	select {
 	case <-time.After(ensureTimeout):
-		panic("Timeout expired while waiting for NewBlock event")
+		t.Fatalf("Timeout expired while waiting for NewBlock event")
 	case msg := <-blockCh:
 		blockEvent, ok := msg.Data().(types.EventDataNewBlock)
 		if !ok {
-			panic(fmt.Sprintf("expected a EventDataNewBlock, got %T. Wrong subscription channel?",
-				msg.Data()))
+			t.Fatalf("expected a EventDataNewBlock, got %T. Wrong subscription channel?",
+				msg.Data())
 		}
 		if blockEvent.Block.Height != height {
-			panic(fmt.Sprintf("expected height %v, got %v", height, blockEvent.Block.Height))
+			t.Fatalf("expected height %v, got %v", height, blockEvent.Block.Height)
 		}
 	}
 }

-func ensureNewBlockHeader(blockCh <-chan tmpubsub.Message, height int64, blockHash tmbytes.HexBytes) {
+func ensureNewBlockHeader(t *testing.T, blockCh <-chan tmpubsub.Message, height int64, blockHash tmbytes.HexBytes) {
+	t.Helper()
 	select {
 	case <-time.After(ensureTimeout):
-		panic("Timeout expired while waiting for NewBlockHeader event")
+		t.Fatalf("Timeout expired while waiting for NewBlockHeader event")
 	case msg := <-blockCh:
 		blockHeaderEvent, ok := msg.Data().(types.EventDataNewBlockHeader)
 		if !ok {
-			panic(fmt.Sprintf("expected a EventDataNewBlockHeader, got %T. Wrong subscription channel?",
-				msg.Data()))
+			t.Fatalf("expected a EventDataNewBlockHeader, got %T. Wrong subscription channel?",
+				msg.Data())
 		}
 		if blockHeaderEvent.Header.Height != height {
-			panic(fmt.Sprintf("expected height %v, got %v", height, blockHeaderEvent.Header.Height))
+			t.Fatalf("expected height %v, got %v", height, blockHeaderEvent.Header.Height)
 		}
 		if !bytes.Equal(blockHeaderEvent.Header.Hash(), blockHash) {
-			panic(fmt.Sprintf("expected header %X, got %X", blockHash, blockHeaderEvent.Header.Hash()))
+			t.Fatalf("expected header %X, got %X", blockHash, blockHeaderEvent.Header.Hash())
 		}
 	}
 }

-func ensureNewUnlock(unlockCh <-chan tmpubsub.Message, height int64, round int32) {
-	ensureNewEvent(unlockCh, height, round, ensureTimeout,
-		"Timeout expired while waiting for NewUnlock event")
+func ensureLock(t *testing.T, lockCh <-chan tmpubsub.Message, height int64, round int32) {
+	t.Helper()
+	ensureNewEvent(t, lockCh, height, round, ensureTimeout,
+		"Timeout expired while waiting for LockValue event")
 }

-func ensureProposal(proposalCh <-chan tmpubsub.Message, height int64, round int32, propID types.BlockID) {
+func ensureRelock(t *testing.T, relockCh <-chan tmpubsub.Message, height int64, round int32) {
+	t.Helper()
+	ensureNewEvent(t, relockCh, height, round, ensureTimeout,
+		"Timeout expired while waiting for RelockValue event")
+}
+
+func ensureProposal(t *testing.T, proposalCh <-chan tmpubsub.Message, height int64, round int32, propID types.BlockID) {
+	t.Helper()
 	select {
 	case <-time.After(ensureTimeout):
-		panic("Timeout expired while waiting for NewProposal event")
+		t.Fatalf("Timeout expired while waiting for NewProposal event")
 	case msg := <-proposalCh:
 		proposalEvent, ok := msg.Data().(types.EventDataCompleteProposal)
 		if !ok {
-			panic(fmt.Sprintf("expected a EventDataCompleteProposal, got %T. Wrong subscription channel?",
-				msg.Data()))
+			t.Fatalf("expected a EventDataCompleteProposal, got %T. Wrong subscription channel?",
+				msg.Data())
 		}
 		if proposalEvent.Height != height {
-			panic(fmt.Sprintf("expected height %v, got %v", height, proposalEvent.Height))
+			t.Fatalf("expected height %v, got %v", height, proposalEvent.Height)
 		}
 		if proposalEvent.Round != round {
-			panic(fmt.Sprintf("expected round %v, got %v", round, proposalEvent.Round))
+			t.Fatalf("expected round %v, got %v", round, proposalEvent.Round)
 		}
 		if !proposalEvent.BlockID.Equals(propID) {
-			panic(fmt.Sprintf("Proposed block does not match expected block (%v != %v)", proposalEvent.BlockID, propID))
+			t.Fatalf("Proposed block does not match expected block (%v != %v)", proposalEvent.BlockID, propID)
 		}
 	}
 }

-func ensurePrecommit(voteCh <-chan tmpubsub.Message, height int64, round int32) {
-	ensureVote(voteCh, height, round, tmproto.PrecommitType)
+func ensurePrecommit(t *testing.T, voteCh <-chan tmpubsub.Message, height int64, round int32) {
+	t.Helper()
+	ensureVote(t, voteCh, height, round, tmproto.PrecommitType)
 }

-func ensurePrevote(voteCh <-chan tmpubsub.Message, height int64, round int32) {
-	ensureVote(voteCh, height, round, tmproto.PrevoteType)
+func ensurePrevote(t *testing.T, voteCh <-chan tmpubsub.Message, height int64, round int32) {
+	t.Helper()
+	ensureVote(t, voteCh, height, round, tmproto.PrevoteType)
 }

-func ensureVote(voteCh <-chan tmpubsub.Message, height int64, round int32,
+func ensureVote(t *testing.T, voteCh <-chan tmpubsub.Message, height int64, round int32,
 	voteType tmproto.SignedMsgType) {
+	t.Helper()
 	select {
 	case <-time.After(ensureTimeout):
-		panic("Timeout expired while waiting for NewVote event")
+		t.Fatalf("Timeout expired while waiting for NewVote event")
 	case msg := <-voteCh:
 		voteEvent, ok := msg.Data().(types.EventDataVote)
 		if !ok {
-			panic(fmt.Sprintf("expected a EventDataVote, got %T. Wrong subscription channel?",
-				msg.Data()))
+			t.Fatalf("expected a EventDataVote, got %T. Wrong subscription channel?",
+				msg.Data())
 		}
 		vote := voteEvent.Vote
 		if vote.Height != height {
-			panic(fmt.Sprintf("expected height %v, got %v", height, vote.Height))
+			t.Fatalf("expected height %v, got %v", height, vote.Height)
 		}
 		if vote.Round != round {
-			panic(fmt.Sprintf("expected round %v, got %v", round, vote.Round))
+			t.Fatalf("expected round %v, got %v", round, vote.Round)
 		}
 		if vote.Type != voteType {
-			panic(fmt.Sprintf("expected type %v, got %v", voteType, vote.Type))
+			t.Fatalf("expected type %v, got %v", voteType, vote.Type)
 		}
 	}
 }

-func ensurePrecommitTimeout(ch <-chan tmpubsub.Message) {
+func ensurePrecommitTimeout(t *testing.T, ch <-chan tmpubsub.Message) {
+	t.Helper()
 	select {
 	case <-time.After(ensureTimeout):
-		panic("Timeout expired while waiting for the Precommit to Timeout")
+		t.Fatalf("Timeout expired while waiting for the Precommit to Timeout")
 	case <-ch:
 	}
 }

-func ensureNewEventOnChannel(ch <-chan tmpubsub.Message) {
+func ensureNewEventOnChannel(t *testing.T, ch <-chan tmpubsub.Message) {
+	t.Helper()
 	select {
 	case <-time.After(ensureTimeout):
-		panic("Timeout expired while waiting for new activity on the channel")
+		t.Fatalf("Timeout expired while waiting for new activity on the channel")
 	case <-ch:
 	}
 }
@@ -711,6 +737,7 @@ func randConsensusState(
 	appFunc func() abci.Application,
 	configOpts ...func(*cfg.Config),
 ) ([]*State, cleanupFunc) {
+	t.Helper()

 	genDoc, privVals := factory.RandGenesisDoc(config, nValidators, false, 30)
 	css := make([]*State, nValidators)
@@ -731,7 +758,7 @@ func randConsensusState(
 			opt(thisConfig)
 		}

-		ensureDir(filepath.Dir(thisConfig.Consensus.WalFile()), 0700) // dir for wal
+		ensureDir(t, filepath.Dir(thisConfig.Consensus.WalFile()), 0700) // dir for wal

 		app := appFunc()
@@ -759,6 +786,7 @@ func randConsensusState(

 // nPeers = nValidators + nNotValidator
 func randConsensusNetWithPeers(
+	t *testing.T,
 	config *cfg.Config,
 	nValidators,
 	nPeers int,
@@ -768,6 +796,7 @@ func randConsensusNetWithPeers(
 ) ([]*State, *types.GenesisDoc, *cfg.Config, cleanupFunc) {
 	genDoc, privVals := factory.RandGenesisDoc(config, nValidators, false, testMinPower)
 	css := make([]*State, nPeers)
+	t.Helper()
 	logger := consensusLogger()

 	var peer0Config *cfg.Config
@@ -776,7 +805,7 @@ func randConsensusNetWithPeers(
 		state, _ := sm.MakeGenesisState(genDoc)
 		thisConfig := ResetConfig(fmt.Sprintf("%s_%d", testName, i))
 		configRootDirs = append(configRootDirs, thisConfig.RootDir)
-		ensureDir(filepath.Dir(thisConfig.Consensus.WalFile()), 0700) // dir for wal
+		ensureDir(t, filepath.Dir(thisConfig.Consensus.WalFile()), 0700) // dir for wal
 		if i == 0 {
 			peer0Config = thisConfig
 		}
@@ -786,16 +815,16 @@ func randConsensusNetWithPeers(
 		} else {
 			tempKeyFile, err := ioutil.TempFile("", "priv_validator_key_")
 			if err != nil {
-				panic(err)
+				t.Fatalf("error creating temp file for validator key: %s", err)
 			}
 			tempStateFile, err := ioutil.TempFile("", "priv_validator_state_")
 			if err != nil {
-				panic(err)
+				t.Fatalf("error loading validator state: %s", err)
 			}

 			privVal, err = privval.GenFilePV(tempKeyFile.Name(), tempStateFile.Name(), "")
 			if err != nil {
-				panic(err)
+				t.Fatalf("error generating validator key: %s", err)
 			}
 		}
@@ -40,12 +40,12 @@ func TestMempoolNoProgressUntilTxsAvailable(t *testing.T) {
 	newBlockCh := subscribe(cs.eventBus, types.EventQueryNewBlock)
 	startTestRound(cs, height, round)

-	ensureNewEventOnChannel(newBlockCh) // first block gets committed
-	ensureNoNewEventOnChannel(newBlockCh)
+	ensureNewEventOnChannel(t, newBlockCh) // first block gets committed
+	ensureNoNewEventOnChannel(t, newBlockCh)
 	deliverTxsRange(cs, 0, 1)
-	ensureNewEventOnChannel(newBlockCh) // commit txs
-	ensureNewEventOnChannel(newBlockCh) // commit updated app hash
-	ensureNoNewEventOnChannel(newBlockCh)
+	ensureNewEventOnChannel(t, newBlockCh) // commit txs
+	ensureNewEventOnChannel(t, newBlockCh) // commit updated app hash
+	ensureNoNewEventOnChannel(t, newBlockCh)
 }

 func TestMempoolProgressAfterCreateEmptyBlocksInterval(t *testing.T) {
@@ -63,9 +63,9 @@ func TestMempoolProgressAfterCreateEmptyBlocksInterval(t *testing.T) {
 	newBlockCh := subscribe(cs.eventBus, types.EventQueryNewBlock)
 	startTestRound(cs, cs.Height, cs.Round)

-	ensureNewEventOnChannel(newBlockCh) // first block gets committed
-	ensureNoNewEventOnChannel(newBlockCh) // then we dont make a block ...
-	ensureNewEventOnChannel(newBlockCh) // until the CreateEmptyBlocksInterval has passed
+	ensureNewEventOnChannel(t, newBlockCh) // first block gets committed
+	ensureNoNewEventOnChannel(t, newBlockCh) // then we dont make a block ...
+	ensureNewEventOnChannel(t, newBlockCh) // until the CreateEmptyBlocksInterval has passed
 }

 func TestMempoolProgressInHigherRound(t *testing.T) {
@@ -93,19 +93,19 @@ func TestMempoolProgressInHigherRound(t *testing.T) {
 	}
 	startTestRound(cs, height, round)

-	ensureNewRound(newRoundCh, height, round) // first round at first height
-	ensureNewEventOnChannel(newBlockCh) // first block gets committed
+	ensureNewRound(t, newRoundCh, height, round) // first round at first height
+	ensureNewEventOnChannel(t, newBlockCh)       // first block gets committed

 	height++ // moving to the next height
 	round = 0

-	ensureNewRound(newRoundCh, height, round) // first round at next height
-	deliverTxsRange(cs, 0, 1) // we deliver txs, but dont set a proposal so we get the next round
-	ensureNewTimeout(timeoutCh, height, round, cs.config.TimeoutPropose.Nanoseconds())
+	ensureNewRound(t, newRoundCh, height, round) // first round at next height
+	deliverTxsRange(cs, 0, 1) // we deliver txs, but dont set a proposal so we get the next round
+	ensureNewTimeout(t, timeoutCh, height, round, cs.config.TimeoutPropose.Nanoseconds())

-	round++ // moving to the next round
-	ensureNewRound(newRoundCh, height, round) // wait for the next round
-	ensureNewEventOnChannel(newBlockCh) // now we can commit the block
+	round++ // moving to the next round
+	ensureNewRound(t, newRoundCh, height, round) // wait for the next round
+	ensureNewEventOnChannel(t, newBlockCh)       // now we can commit the block
 }

 func deliverTxsRange(cs *State, start, end int) {
@@ -336,7 +336,7 @@ func TestReactorWithEvidence(t *testing.T) {

defer os.RemoveAll(thisConfig.RootDir)

- ensureDir(path.Dir(thisConfig.Consensus.WalFile()), 0700) // dir for wal
+ ensureDir(t, path.Dir(thisConfig.Consensus.WalFile()), 0700) // dir for wal
app := appFunc()
vals := types.TM2PB.ValidatorUpdates(state.Validators)
app.InitChain(abci.RequestInitChain{Validators: vals})
@@ -627,6 +627,7 @@ func TestReactorValidatorSetChanges(t *testing.T) {
nPeers := 7
nVals := 4
states, _, _, cleanup := randConsensusNetWithPeers(
+ t,
config,
nVals,
nPeers,
@@ -58,7 +58,7 @@ func startNewStateAndWaitForBlock(t *testing.T, consensusReplayConfig *cfg.Confi
logger := log.TestingLogger()
state, err := sm.MakeGenesisStateFromFile(consensusReplayConfig.GenesisFile())
require.NoError(t, err)
- privValidator := loadPrivValidator(consensusReplayConfig)
+ privValidator := loadPrivValidator(t, consensusReplayConfig)
blockStore := store.NewBlockStore(dbm.NewMemDB())
cs := newStateWithConfigAndBlockStore(
consensusReplayConfig,
@@ -154,7 +154,7 @@ LOOP:
blockStore := store.NewBlockStore(blockDB)
state, err := sm.MakeGenesisStateFromFile(consensusReplayConfig.GenesisFile())
require.NoError(t, err)
- privValidator := loadPrivValidator(consensusReplayConfig)
+ privValidator := loadPrivValidator(t, consensusReplayConfig)
cs := newStateWithConfigAndBlockStore(
consensusReplayConfig,
state,
@@ -321,6 +321,7 @@ func setupSimulator(t *testing.T) *simulatorTestSuite {
nVals := 4

css, genDoc, config, cleanup := randConsensusNetWithPeers(
+ t,
config,
nVals,
nPeers,
@@ -345,15 +346,15 @@ func setupSimulator(t *testing.T) *simulatorTestSuite {
// start the machine
startTestRound(css[0], height, round)
incrementHeight(vss...)
- ensureNewRound(newRoundCh, height, 0)
- ensureNewProposal(proposalCh, height, round)
+ ensureNewRound(t, newRoundCh, height, 0)
+ ensureNewProposal(t, proposalCh, height, round)
rs := css[0].GetRoundState()

signAddVotes(sim.Config, css[0], tmproto.PrecommitType,
rs.ProposalBlock.Hash(), rs.ProposalBlockParts.Header(),
vss[1:nVals]...)

- ensureNewRound(newRoundCh, height+1, 0)
+ ensureNewRound(t, newRoundCh, height+1, 0)

// HEIGHT 2
height++
@@ -380,12 +381,12 @@ func setupSimulator(t *testing.T) *simulatorTestSuite {
if err := css[0].SetProposalAndBlock(proposal, propBlock, propBlockParts, "some peer"); err != nil {
t.Fatal(err)
}
- ensureNewProposal(proposalCh, height, round)
+ ensureNewProposal(t, proposalCh, height, round)
rs = css[0].GetRoundState()
signAddVotes(sim.Config, css[0], tmproto.PrecommitType,
rs.ProposalBlock.Hash(), rs.ProposalBlockParts.Header(),
vss[1:nVals]...)
- ensureNewRound(newRoundCh, height+1, 0)
+ ensureNewRound(t, newRoundCh, height+1, 0)

// HEIGHT 3
height++
@@ -412,12 +413,12 @@ func setupSimulator(t *testing.T) *simulatorTestSuite {
if err := css[0].SetProposalAndBlock(proposal, propBlock, propBlockParts, "some peer"); err != nil {
t.Fatal(err)
}
- ensureNewProposal(proposalCh, height, round)
+ ensureNewProposal(t, proposalCh, height, round)
rs = css[0].GetRoundState()
signAddVotes(sim.Config, css[0], tmproto.PrecommitType,
rs.ProposalBlock.Hash(), rs.ProposalBlockParts.Header(),
vss[1:nVals]...)
- ensureNewRound(newRoundCh, height+1, 0)
+ ensureNewRound(t, newRoundCh, height+1, 0)

// HEIGHT 4
height++
@@ -471,7 +472,7 @@ func setupSimulator(t *testing.T) *simulatorTestSuite {
if err := css[0].SetProposalAndBlock(proposal, propBlock, propBlockParts, "some peer"); err != nil {
t.Fatal(err)
}
- ensureNewProposal(proposalCh, height, round)
+ ensureNewProposal(t, proposalCh, height, round)

removeValidatorTx2 := kvstore.MakeValSetChangeTx(newVal2ABCI, 0)
err = assertMempool(css[0].txNotifier).CheckTx(context.Background(), removeValidatorTx2, nil, mempl.TxInfo{})
@@ -487,7 +488,7 @@ func setupSimulator(t *testing.T) *simulatorTestSuite {
rs.ProposalBlockParts.Header(), newVss[i])
}

- ensureNewRound(newRoundCh, height+1, 0)
+ ensureNewRound(t, newRoundCh, height+1, 0)

// HEIGHT 5
height++
@@ -497,7 +498,7 @@ func setupSimulator(t *testing.T) *simulatorTestSuite {
newVss[newVssIdx].VotingPower = 25
sort.Sort(ValidatorStubsByPower(newVss))
selfIndex = valIndexFn(0)
- ensureNewProposal(proposalCh, height, round)
+ ensureNewProposal(t, proposalCh, height, round)
rs = css[0].GetRoundState()
for i := 0; i < nVals+1; i++ {
if i == selfIndex {
@@ -507,7 +508,7 @@ func setupSimulator(t *testing.T) *simulatorTestSuite {
tmproto.PrecommitType, rs.ProposalBlock.Hash(),
rs.ProposalBlockParts.Header(), newVss[i])
}
- ensureNewRound(newRoundCh, height+1, 0)
+ ensureNewRound(t, newRoundCh, height+1, 0)

// HEIGHT 6
height++
@@ -534,7 +535,7 @@ func setupSimulator(t *testing.T) *simulatorTestSuite {
if err := css[0].SetProposalAndBlock(proposal, propBlock, propBlockParts, "some peer"); err != nil {
t.Fatal(err)
}
- ensureNewProposal(proposalCh, height, round)
+ ensureNewProposal(t, proposalCh, height, round)
rs = css[0].GetRoundState()
for i := 0; i < nVals+3; i++ {
if i == selfIndex {
@@ -544,7 +545,7 @@ func setupSimulator(t *testing.T) *simulatorTestSuite {
tmproto.PrecommitType, rs.ProposalBlock.Hash(),
rs.ProposalBlockParts.Header(), newVss[i])
}
- ensureNewRound(newRoundCh, height+1, 0)
+ ensureNewRound(t, newRoundCh, height+1, 0)

sim.Chain = make([]*types.Block, 0)
sim.Commits = make([]*types.Commit, 0)

@@ -137,7 +137,7 @@ type State struct {
done chan struct{}

// synchronous pubsub between consensus state and reactor.
- // state only emits EventNewRoundStep and EventVote
+ // state only emits EventNewRoundStep, EventValidBlock, and EventVote
evsw tmevents.EventSwitch

// for reporting metrics
@@ -1265,8 +1265,11 @@ func (cs *State) createProposalBlock() (block *types.Block, blockParts *types.Pa

// Enter: `timeoutPropose` after entering Propose.
// Enter: proposal block and POL is ready.
- // Prevote for LockedBlock if we're locked, or ProposalBlock if valid.
- // Otherwise vote nil.
+ // If we received a valid proposal within this round and we are not locked on a block,
+ // we will prevote for the block.
+ // Otherwise, if we receive a valid proposal that matches the block we are
+ // locked on or matches a block that received a POL in a round later than our
+ // locked round, prevote for the proposal, otherwise vote nil.
func (cs *State) enterPrevote(height int64, round int32) {
logger := cs.Logger.With("height", height, "round", round)
@@ -1296,14 +1299,7 @@ func (cs *State) enterPrevote(height int64, round int32) {
func (cs *State) defaultDoPrevote(height int64, round int32) {
logger := cs.Logger.With("height", height, "round", round)

- // If a block is locked, prevote that.
- if cs.LockedBlock != nil {
- logger.Debug("prevote step; already locked on a block; prevoting locked block")
- cs.signAddVote(tmproto.PrevoteType, cs.LockedBlock.Hash(), cs.LockedBlockParts.Header())
- return
- }
-
- // If ProposalBlock is nil, prevote nil.
+ // We did not receive a proposal within this round. (and thus executing this from a timeout)
if cs.ProposalBlock == nil {
logger.Debug("prevote step: ProposalBlock is nil")
cs.signAddVote(tmproto.PrevoteType, nil, types.PartSetHeader{})
@@ -1319,11 +1315,67 @@ func (cs *State) defaultDoPrevote(height int64, round int32) {
return
}

- // Prevote cs.ProposalBlock
- // NOTE: the proposal signature is validated when it is received,
- // and the proposal block parts are validated as they are received (against the merkle hash in the proposal)
- logger.Debug("prevote step: ProposalBlock is valid")
- cs.signAddVote(tmproto.PrevoteType, cs.ProposalBlock.Hash(), cs.ProposalBlockParts.Header())
+ /*
+ 22: upon <PROPOSAL, h_p, round_p, v, −1> from proposer(h_p, round_p) while step_p = propose do
+ 23: if valid(v) && (lockedRound_p = −1 || lockedValue_p = v) then
+ 24: broadcast <PREVOTE, h_p, round_p, id(v)>
+
+ Here, cs.Proposal.POLRound corresponds to the -1 in the above algorithm rule.
+ This means that the proposer is producing a new proposal that has not previously
+ seen a 2/3 majority by the network.
+
+ If we have already locked on a value that is different from the proposed value,
+ we prevote nil. Otherwise, if we're not locked on a block
+ or the proposal matches our locked block, we prevote the proposal.
+ */
+ if cs.Proposal.POLRound == -1 {
+ if cs.LockedRound == -1 {
+ logger.Debug("prevote step: ProposalBlock is valid and there is no locked block; prevoting the proposal")
+ cs.signAddVote(tmproto.PrevoteType, cs.ProposalBlock.Hash(), cs.ProposalBlockParts.Header())
+ return
+ }
+ if cs.ProposalBlock.HashesTo(cs.LockedBlock.Hash()) {
+ logger.Debug("prevote step: ProposalBlock is valid and matches our locked block; prevoting the proposal")
+ cs.signAddVote(tmproto.PrevoteType, cs.ProposalBlock.Hash(), cs.ProposalBlockParts.Header())
+ return
+ }
+ }
+
+ /*
+ 28: upon <PROPOSAL, h_p, round_p, v, v_r> from proposer(h_p, round_p) AND 2f + 1 <PREVOTE, h_p, v_r, id(v)> while
+ step_p = propose && (v_r ≥ 0 && v_r < round_p) do
+ 29: if valid(v) && (lockedRound_p ≤ v_r || lockedValue_p = v) then
+ 30: broadcast <PREVOTE, h_p, round_p, id(v)>
+
+ This rule is a bit confusing but breaks down as follows:
+
+ If we see a proposal in the current round for value 'v' that lists its valid round as 'v_r'
+ AND this validator saw a 2/3 majority of the voting power prevote 'v' in round 'v_r', then we will
+ issue a prevote for 'v' in this round if 'v' is valid and either matches our locked value OR
+ 'v_r' is a round greater than or equal to our current locked round.
+
+ 'v_r' can be a round greater than our current locked round if a 2/3 majority of
+ the network prevoted a value in round 'v_r' but we did not lock on it, possibly because we
+ missed the proposal in round 'v_r'.
+ */
+ blockID, ok := cs.Votes.Prevotes(cs.Proposal.POLRound).TwoThirdsMajority()
+ if ok && cs.ProposalBlock.HashesTo(blockID.Hash) && cs.Proposal.POLRound >= 0 && cs.Proposal.POLRound < cs.Round {
+ if cs.LockedRound <= cs.Proposal.POLRound {
+ logger.Debug("prevote step: ProposalBlock is valid and received a 2/3" +
+ "majority in a round later than the locked round; prevoting the proposal")
+ cs.signAddVote(tmproto.PrevoteType, cs.ProposalBlock.Hash(), cs.ProposalBlockParts.Header())
+ return
+ }
+ if cs.ProposalBlock.HashesTo(cs.LockedBlock.Hash()) {
+ logger.Debug("prevote step: ProposalBlock is valid and matches our locked block; prevoting the proposal")
+ cs.signAddVote(tmproto.PrevoteType, cs.ProposalBlock.Hash(), cs.ProposalBlockParts.Header())
+ return
+ }
+ }
+
+ logger.Debug("prevote step: ProposalBlock is valid but was not our locked block or" +
+ "did not receive a more recent majority; prevoting nil")
+ cs.signAddVote(tmproto.PrevoteType, nil, types.PartSetHeader{})
}

// Enter: any +2/3 prevotes at next round.
@@ -1361,7 +1413,6 @@ func (cs *State) enterPrevoteWait(height int64, round int32) {
// Enter: `timeoutPrecommit` after any +2/3 precommits.
// Enter: +2/3 precommits for block or nil.
// Lock & precommit the ProposalBlock if we have enough prevotes for it (a POL in this round)
- // else, unlock an existing lock and precommit nil if +2/3 of prevotes were nil,
// else, precommit nil otherwise.
func (cs *State) enterPrecommit(height int64, round int32) {
logger := cs.Logger.With("height", height, "round", round)
@@ -1408,21 +1459,9 @@ func (cs *State) enterPrecommit(height int64, round int32) {
panic(fmt.Sprintf("this POLRound should be %v but got %v", round, polRound))
}

- // +2/3 prevoted nil. Unlock and precommit nil.
- if len(blockID.Hash) == 0 {
- if cs.LockedBlock == nil {
- logger.Debug("precommit step; +2/3 prevoted for nil")
- } else {
- logger.Debug("precommit step; +2/3 prevoted for nil; unlocking")
- cs.LockedRound = -1
- cs.LockedBlock = nil
- cs.LockedBlockParts = nil
-
- if err := cs.eventBus.PublishEventUnlock(cs.RoundStateEvent()); err != nil {
- logger.Error("failed publishing event unlock", "err", err)
- }
- }
+ // +2/3 prevoted nil. Precommit nil.
+ if blockID.IsNil() {
+ logger.Debug("precommit step; +2/3 prevoted for nil")
cs.signAddVote(tmproto.PrecommitType, nil, types.PartSetHeader{})
return
}
@@ -1442,7 +1481,9 @@ func (cs *State) enterPrecommit(height int64, round int32) {
return
}

- // If +2/3 prevoted for proposal block, stage and precommit it
+ // If greater than 2/3 of the voting power on the network prevoted for
+ // the proposed block, update our locked block to this block and issue a
+ // precommit vote for it.
if cs.ProposalBlock.HashesTo(blockID.Hash) {
logger.Debug("precommit step; +2/3 prevoted proposal block; locking", "hash", blockID.Hash)

@@ -1464,23 +1505,14 @@ func (cs *State) enterPrecommit(height int64, round int32) {
}

// There was a polka in this round for a block we don't have.
- // Fetch that block, unlock, and precommit nil.
- // The +2/3 prevotes for this round is the POL for our unlock.
+ // Fetch that block, and precommit nil.
logger.Debug("precommit step; +2/3 prevotes for a block we do not have; voting nil", "block_id", blockID)

- cs.LockedRound = -1
- cs.LockedBlock = nil
- cs.LockedBlockParts = nil
-
if !cs.ProposalBlockParts.HasHeader(blockID.PartSetHeader) {
cs.ProposalBlock = nil
cs.ProposalBlockParts = types.NewPartSetFromHeader(blockID.PartSetHeader)
}

- if err := cs.eventBus.PublishEventUnlock(cs.RoundStateEvent()); err != nil {
- logger.Error("failed publishing event unlock", "err", err)
- }
-
cs.signAddVote(tmproto.PrecommitType, nil, types.PartSetHeader{})
}

@@ -1588,7 +1620,7 @@ func (cs *State) tryFinalizeCommit(height int64) {
}

blockID, ok := cs.Votes.Precommits(cs.CommitRound).TwoThirdsMajority()
- if !ok || len(blockID.Hash) == 0 {
+ if !ok || blockID.IsNil() {
logger.Error("failed attempt to finalize commit; there was no +2/3 majority or +2/3 was for nil")
return
}
@@ -1921,7 +1953,7 @@ func (cs *State) addProposalBlockPart(msg *BlockPartMessage, peerID types.NodeID
// Update Valid* if we can.
prevotes := cs.Votes.Prevotes(cs.Round)
blockID, hasTwoThirds := prevotes.TwoThirdsMajority()
- if hasTwoThirds && !blockID.IsZero() && (cs.ValidRound < cs.Round) {
+ if hasTwoThirds && !blockID.IsNil() && (cs.ValidRound < cs.Round) {
if cs.ProposalBlock.HashesTo(blockID.Hash) {
cs.Logger.Debug(
"updating valid block to new proposal block",
|
||||
prevotes := cs.Votes.Prevotes(vote.Round)
|
||||
cs.Logger.Debug("added vote to prevote", "vote", vote, "prevotes", prevotes.StringShort())
|
||||
|
||||
// If +2/3 prevotes for a block or nil for *any* round:
|
||||
if blockID, ok := prevotes.TwoThirdsMajority(); ok {
|
||||
// There was a polka!
|
||||
// If we're locked but this is a recent polka, unlock.
|
||||
// If it matches our ProposalBlock, update the ValidBlock
|
||||
|
||||
// Unlock if `cs.LockedRound < vote.Round <= cs.Round`
|
||||
// NOTE: If vote.Round > cs.Round, we'll deal with it when we get to vote.Round
|
||||
if (cs.LockedBlock != nil) &&
|
||||
(cs.LockedRound < vote.Round) &&
|
||||
(vote.Round <= cs.Round) &&
|
||||
!cs.LockedBlock.HashesTo(blockID.Hash) {
|
||||
|
||||
cs.Logger.Debug("unlocking because of POL", "locked_round", cs.LockedRound, "pol_round", vote.Round)
|
||||
|
||||
cs.LockedRound = -1
|
||||
cs.LockedBlock = nil
|
||||
cs.LockedBlockParts = nil
|
||||
|
||||
if err := cs.eventBus.PublishEventUnlock(cs.RoundStateEvent()); err != nil {
|
||||
return added, err
|
||||
}
|
||||
}
|
||||
// Check to see if >2/3 of the voting power on the network voted for any non-nil block.
|
||||
if blockID, ok := prevotes.TwoThirdsMajority(); ok && !blockID.IsNil() {
|
||||
// Greater than 2/3 of the voting power on the network voted for some
|
||||
// non-nil block
|
||||
|
||||
// Update Valid* if we can.
|
||||
// NOTE: our proposal block may be nil or not what received a polka..
|
||||
if len(blockID.Hash) != 0 && (cs.ValidRound < vote.Round) && (vote.Round == cs.Round) {
|
||||
if cs.ValidRound < vote.Round && vote.Round == cs.Round {
|
||||
if cs.ProposalBlock.HashesTo(blockID.Hash) {
|
||||
cs.Logger.Debug("updating valid block because of POL", "valid_round", cs.ValidRound, "pol_round", vote.Round)
|
||||
cs.ValidRound = vote.Round
|
||||
@@ -2132,7 +2144,7 @@ func (cs *State) addVote(vote *types.Vote, peerID types.NodeID) (added bool, err

case cs.Round == vote.Round && cstypes.RoundStepPrevote <= cs.Step: // current round
blockID, ok := prevotes.TwoThirdsMajority()
- if ok && (cs.isProposalComplete() || len(blockID.Hash) == 0) {
+ if ok && (cs.isProposalComplete() || blockID.IsNil()) {
cs.enterPrecommit(height, vote.Round)
} else if prevotes.HasTwoThirdsAny() {
cs.enterPrevoteWait(height, vote.Round)
@@ -2160,7 +2172,7 @@ func (cs *State) addVote(vote *types.Vote, peerID types.NodeID) (added bool, err
cs.enterNewRound(height, vote.Round)
cs.enterPrecommit(height, vote.Round)

- if len(blockID.Hash) != 0 {
+ if !blockID.IsNil() {
cs.enterCommit(height, vote.Round)
if cs.config.SkipTimeoutCommit && precommits.HasAll() {
cs.enterNewRound(cs.Height, 0)
@@ -2400,3 +2412,13 @@ func repairWalFile(src, dst string) error {

return nil
}

+ // proposalWaitingTime returns the duration to wait until Header.Time + 2*ACCURACY + MSGDELAY has elapsed.
+ func proposalWaitingTime(lt tmtime.Source, h types.Header, tp types.TimestampParams) time.Duration {
+ t := lt.Now()
+ wt := h.Time.Add(2 * tp.Accuracy).Add(tp.MsgDelay)
+ if t.After(wt) {
+ return 0
+ }
+ return wt.Sub(t)
+ }
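To make the arithmetic above concrete, here is a minimal standalone sketch using the draft default TimestampParams introduced later in this change (Accuracy 500ms, MsgDelay 3s). The fixedClock type is hypothetical and stands in for tmtime.Source; it is not part of the change:

package main

import (
    "fmt"
    "time"
)

// fixedClock is a hypothetical stand-in for tmtime.Source.
type fixedClock struct{ t time.Time }

func (c fixedClock) Now() time.Time { return c.t }

func main() {
    headerTime := time.Date(2021, 9, 1, 12, 0, 0, 0, time.UTC)
    accuracy := 500 * time.Millisecond
    msgDelay := 3 * time.Second

    // Same arithmetic as proposalWaitingTime: wait until
    // Header.Time + 2*ACCURACY + MSGDELAY has elapsed.
    clock := fixedClock{t: headerTime.Add(1 * time.Second)}
    wt := headerTime.Add(2 * accuracy).Add(msgDelay) // 12:00:04

    now := clock.Now() // 12:00:01
    wait := time.Duration(0)
    if now.Before(wt) {
        wait = wt.Sub(now)
    }
    fmt.Println(wait) // prints 3s
}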
(File diff suppressed because it is too large)
@@ -55,7 +55,7 @@ func MakeHeader(h *types.Header) (*types.Header, error) {
if h.Height == 0 {
h.Height = 1
}
- if h.LastBlockID.IsZero() {
+ if h.LastBlockID.IsNil() {
h.LastBlockID = MakeBlockID()
}
if h.ChainID == "" {

libs/time/mocks/source.go (new file, +28 lines)
@@ -0,0 +1,28 @@
+ // Code generated by mockery. DO NOT EDIT.
+
+ package mocks
+
+ import (
+ time "time"
+
+ mock "github.com/stretchr/testify/mock"
+ )
+
+ // Source is an autogenerated mock type for the Source type
+ type Source struct {
+ mock.Mock
+ }
+
+ // Now provides a mock function with given fields:
+ func (_m *Source) Now() time.Time {
+ ret := _m.Called()
+
+ var r0 time.Time
+ if rf, ok := ret.Get(0).(func() time.Time); ok {
+ r0 = rf()
+ } else {
+ r0 = ret.Get(0).(time.Time)
+ }
+
+ return r0
+ }
@@ -15,3 +15,17 @@ func Now() time.Time {
func Canonical(t time.Time) time.Time {
return t.Round(0).UTC()
}

+ //go:generate ../../scripts/mockery_generate.sh Source
+
+ // Source is an interface that defines a way to fetch the current time.
+ type Source interface {
+ Now() time.Time
+ }
+
+ // DefaultSource implements the Source interface using the system clock provided by the standard library.
+ type DefaultSource struct{}
+
+ func (DefaultSource) Now() time.Time {
+ return Now()
+ }
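A hedged sketch of how the new Source interface and the generated mock above fit together: production code accepts a tmtime.Source (typically DefaultSource), while tests pin the clock through the mockery mock, exactly as the proposal tests later in this change do. The isExpired helper here is illustrative and not part of the change:

package mocks_test

import (
    "testing"
    "time"

    tmtime "github.com/tendermint/tendermint/libs/time"
    tmtimemocks "github.com/tendermint/tendermint/libs/time/mocks"
)

// isExpired is an illustrative consumer of tmtime.Source.
func isExpired(clock tmtime.Source, deadline time.Time) bool {
    return clock.Now().After(deadline)
}

func TestIsExpiredWithPinnedClock(t *testing.T) {
    fixed := time.Date(2021, 9, 1, 12, 0, 0, 0, time.UTC)

    // The generated mock returns whatever "Now" was programmed to return.
    mockSource := new(tmtimemocks.Source)
    mockSource.On("Now").Return(fixed)

    if isExpired(mockSource, fixed.Add(time.Second)) {
        t.Fatal("clock is pinned one second before the deadline")
    }
}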
@@ -379,6 +379,7 @@ func (c *Client) Update(ctx context.Context, now time.Time) (*types.LightBlock,
return nil, err
}

+ // If there is a new light block then verify it
if latestBlock.Height > lastTrustedHeight {
err = c.verifyLightBlock(ctx, latestBlock, now)
if err != nil {
@@ -388,7 +389,8 @@ func (c *Client) Update(ctx context.Context, now time.Time) (*types.LightBlock,
return latestBlock, nil
}

- return nil, nil
+ // else return the latestTrustedBlock
+ return c.latestTrustedBlock, nil
}

// VerifyLightBlockAtHeight fetches the light block at the given height

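The behavioral change above is worth calling out: Update previously returned (nil, nil) when there was nothing newer to verify, and now hands back the client's latest trusted block. A hedged usage sketch under that assumption (latestTrusted is an illustrative helper, not part of the change):

package lightutil

import (
    "context"
    "errors"
    "time"

    "github.com/tendermint/tendermint/light"
)

// latestTrusted assumes a constructed *light.Client, as in the tests below.
func latestTrusted(ctx context.Context, c *light.Client) (int64, error) {
    // After this change, Update returns the latest trusted block instead of
    // (nil, nil) when there is nothing newer to verify.
    block, err := c.Update(ctx, time.Now())
    if err != nil {
        return 0, err
    }
    if block == nil {
        // Defensive: covers a client with no trusted state yet.
        return 0, errors.New("no trusted block")
    }
    return block.Height, nil
}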
@@ -644,7 +644,7 @@ func TestClientReplacesPrimaryWithWitnessIfPrimaryIsUnavailable(t *testing.T) {
chainID,
trustOptions,
mockDeadNode,
- []provider.Provider{mockFullNode, mockFullNode},
+ []provider.Provider{mockDeadNode, mockFullNode},
dbs.New(dbm.NewMemDB()),
light.Logger(log.TestingLogger()),
)
@@ -663,6 +663,32 @@ func TestClientReplacesPrimaryWithWitnessIfPrimaryIsUnavailable(t *testing.T) {
mockFullNode.AssertExpectations(t)
}

+ func TestClientReplacesPrimaryWithWitnessIfPrimaryDoesntHaveBlock(t *testing.T) {
+ mockFullNode := &provider_mocks.Provider{}
+ mockFullNode.On("LightBlock", mock.Anything, mock.Anything).Return(l1, nil)
+
+ mockDeadNode := &provider_mocks.Provider{}
+ mockDeadNode.On("LightBlock", mock.Anything, mock.Anything).Return(nil, provider.ErrLightBlockNotFound)
+ c, err := light.NewClient(
+ ctx,
+ chainID,
+ trustOptions,
+ mockDeadNode,
+ []provider.Provider{mockDeadNode, mockFullNode},
+ dbs.New(dbm.NewMemDB()),
+ light.Logger(log.TestingLogger()),
+ )
+ require.NoError(t, err)
+ _, err = c.Update(ctx, bTime.Add(2*time.Hour))
+ require.NoError(t, err)
+
+ // we should still have the dead node as a witness because it
+ // hasn't repeatedly been unresponsive yet
+ assert.Equal(t, 2, len(c.Witnesses()))
+ mockDeadNode.AssertExpectations(t)
+ mockFullNode.AssertExpectations(t)
+ }

func TestClient_BackwardsVerification(t *testing.T) {
{
headers, vals, _ := genLightBlocksWithKeys(chainID, 9, 3, 0, bTime)

@@ -702,7 +702,11 @@ func (n *nodeImpl) OnStart() error {
n.Logger.Info("starting state sync")
state, err := n.stateSyncReactor.Sync(context.TODO())
if err != nil {
- n.Logger.Error("state sync failed", "err", err)
+ n.Logger.Error("state sync failed; shutting down this node", "err", err)
+ // stop the node
+ if err := n.Stop(); err != nil {
+ n.Logger.Error("failed to shut down node", "err", err)
+ }
return
}

@@ -601,6 +601,32 @@ paths:
application/json:
schema:
$ref: "#/components/schemas/ErrorResponse"
+ /unsafe_flush_mempool:
+ get:
+ summary: Flush mempool of all unconfirmed transactions
+ operationId: unsafe_flush_mempool
+ tags:
+ - Unsafe
+ description: |
+ Flush flushes out the mempool. It acquires a read-lock, fetches all the
+ transactions currently in the transaction store and removes each transaction
+ from the store and all indexes and finally resets the cache.
+
+ Note, flushing the mempool may leave the mempool in an inconsistent state.
+ responses:
+ "200":
+ description: empty answer
+ content:
+ application/json:
+ schema:
+ $ref: "#/components/schemas/EmptyResponse"
+ "500":
+ description: empty error
+ content:
+ application/json:
+ schema:
+ $ref: "#/components/schemas/ErrorResponse"

/blockchain:
get:
summary: "Get block headers (max: 20) for minHeight <= height <= maxHeight."

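As a usage sketch for the endpoint documented above: a plain HTTP GET against a node's RPC address triggers the flush. The address here is an assumption (the conventional default RPC listen address), and unsafe endpoints are typically only mounted when the node enables its unsafe RPC option:

package main

import (
    "fmt"
    "io"
    "net/http"
)

func main() {
    // Assumes a local node with RPC listening on the default address.
    resp, err := http.Get("http://127.0.0.1:26657/unsafe_flush_mempool")
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    body, err := io.ReadAll(resp.Body)
    if err != nil {
        panic(err)
    }
    fmt.Println(resp.Status, string(body))
}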
@@ -22,7 +22,7 @@ import (
// Metrics are based on the `benchmarkLength`, the amount of consecutive blocks
// sampled from in the testnet
func Benchmark(ctx context.Context, testnet *e2e.Testnet, benchmarkLength int64) error {
- block, _, err := waitForHeight(ctx, testnet, 0)
+ block, err := getLatestBlock(ctx, testnet)
if err != nil {
return err
}

@@ -54,9 +54,25 @@ func Load(ctx context.Context, testnet *e2e.Testnet) error {
case numSeen := <-chSuccess:
success += numSeen
case <-ctx.Done():
- if success == 0 {
+ // if we couldn't submit any transactions,
+ // that's probably a problem and the test
+ // should error; however, for very short tests
+ // we shouldn't abort.
+ //
+ // The 2s cutoff is a rough guess based on
+ // the expected value of
+ // loadGenerateWaitTime. If the implementation
+ // of that function changes, then this might
+ // also need to change without more
+ // refactoring.
+ if success == 0 && time.Since(started) > 2*time.Second {
return errors.New("failed to submit any transactions")
}

+ // TODO perhaps allow test networks to
+ // declare required transaction rates, which
+ // might allow us to avoid the special case
+ // around 0 txs above.
rate := float64(success) / time.Since(started).Seconds()

logger.Info("ending transaction load",

@@ -13,17 +13,22 @@ import (
)

// waitForHeight waits for the network to reach a certain height (or above),
- // returning the highest height seen. Errors if the network is not making
+ // returning the block at the height seen. Errors if the network is not making
// progress at all.
+ // If height == 0, the initial height of the test network is used as the target.
func waitForHeight(ctx context.Context, testnet *e2e.Testnet, height int64) (*types.Block, *types.BlockID, error) {
var (
err error
- maxResult *rpctypes.ResultBlock
clients = map[string]*rpchttp.HTTP{}
+ lastHeight int64
+ lastIncrease = time.Now()
nodesAtHeight = map[string]struct{}{}
numRunningNodes int
)
+ if height == 0 {
+ height = testnet.InitialHeight
+ }

for _, node := range testnet.Nodes {
if node.Stateless() {
continue
@@ -47,10 +52,10 @@ func waitForHeight(ctx context.Context, testnet *e2e.Testnet, height int64) (*ty
continue
}

// skip nodes that don't have state or haven't started yet
if node.Stateless() {
continue
}

if !node.HasStarted {
continue
}
@@ -67,16 +72,16 @@ func waitForHeight(ctx context.Context, testnet *e2e.Testnet, height int64) (*ty

wctx, cancel := context.WithTimeout(ctx, 10*time.Second)
defer cancel()
- result, err := client.Block(wctx, nil)
+ result, err := client.Status(wctx)
if err != nil {
continue
}
- if result.Block != nil && (maxResult == nil || result.Block.Height > maxResult.Block.Height) {
- maxResult = result
+ if result.SyncInfo.LatestBlockHeight > lastHeight {
+ lastHeight = result.SyncInfo.LatestBlockHeight
+ lastIncrease = time.Now()
}

- if maxResult != nil && maxResult.Block.Height >= height {
+ if result.SyncInfo.LatestBlockHeight >= height {
// the node has achieved the target height!

// add this node to the set of target
|
||||
continue
|
||||
}
|
||||
|
||||
// return once all nodes have reached
|
||||
// the target height.
|
||||
return maxResult.Block, &maxResult.BlockID, nil
|
||||
// All nodes are at or above the target height. Now fetch the block for that target height
|
||||
// and return it. We loop again through all clients because some may have pruning set but
|
||||
// at least two of them should be archive nodes.
|
||||
for _, c := range clients {
|
||||
result, err := c.Block(ctx, &height)
|
||||
if err != nil || result == nil || result.Block == nil {
|
||||
continue
|
||||
}
|
||||
return result.Block, &result.BlockID, err
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@@ -100,12 +112,12 @@ func waitForHeight(ctx context.Context, testnet *e2e.Testnet, height int64) (*ty
return nil, nil, errors.New("unable to connect to any network nodes")
}
if time.Since(lastIncrease) >= time.Minute {
- if maxResult == nil {
- return nil, nil, errors.New("chain stalled at unknown height")
+ if lastHeight == 0 {
+ return nil, nil, errors.New("chain stalled at unknown height (most likely upon starting)")
}

return nil, nil, fmt.Errorf("chain stalled at height %v [%d of %d nodes %+v]",
- maxResult.Block.Height,
+ lastHeight,
len(nodesAtHeight),
numRunningNodes,
nodesAtHeight)
@@ -182,3 +194,35 @@ func waitForNode(ctx context.Context, node *e2e.Node, height int64) (*rpctypes.R
}
}
}

+ // getLatestBlock returns the last block that all active nodes in the network have
+ // agreed upon, i.e. the earliest of each node's latest blocks
+ func getLatestBlock(ctx context.Context, testnet *e2e.Testnet) (*types.Block, error) {
+ var earliestBlock *types.Block
+ for _, node := range testnet.Nodes {
+ // skip nodes that don't have state or haven't started yet
+ if node.Stateless() {
+ continue
+ }
+ if !node.HasStarted {
+ continue
+ }
+
+ client, err := node.Client()
+ if err != nil {
+ return nil, err
+ }
+
+ wctx, cancel := context.WithTimeout(ctx, 10*time.Second)
+ defer cancel()
+ result, err := client.Block(wctx, nil)
+ if err != nil {
+ return nil, err
+ }
+
+ if result.Block != nil && (earliestBlock == nil || earliestBlock.Height > result.Block.Height) {
+ earliestBlock = result.Block
+ }
+ }
+ return earliestBlock, nil
+ }
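The "earliest of each node's latest blocks" reduction above is simply a minimum over reported heights. A minimal standalone sketch of the same logic with toy data (hypothetical node names, no RPC):

package main

import "fmt"

func main() {
    // Hypothetical latest heights reported by each running node.
    latest := map[string]int64{"validator01": 1042, "validator02": 1040, "full01": 1041}

    // The network-wide agreed height is the minimum of the per-node latest heights.
    var earliest int64 = -1
    for _, h := range latest {
        if earliest == -1 || h < earliest {
            earliest = h
        }
    }
    fmt.Println(earliest) // prints 1040
}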
@@ -10,7 +10,7 @@ import (
// Wait waits for a number of blocks to be produced, and for all nodes to catch
// up with it.
func Wait(ctx context.Context, testnet *e2e.Testnet, blocks int64) error {
- block, _, err := waitForHeight(ctx, testnet, 0)
+ block, err := getLatestBlock(ctx, testnet)
if err != nil {
return err
}

@@ -68,7 +68,7 @@ func TestApp_Tx(t *testing.T) {
}{
{
Name: "Sync",
- WaitTime: 30 * time.Second,
+ WaitTime: time.Minute,
BroadcastTx: func(client *http.HTTP) broadcastFunc {
return func(ctx context.Context, tx types.Tx) error {
_, err := client.BroadcastTxSync(ctx, tx)
@@ -78,7 +78,13 @@ func TestApp_Tx(t *testing.T) {
},
{
Name: "Commit",
- WaitTime: time.Minute,
+ WaitTime: 15 * time.Second,
+ // TODO: turn this check back on if it can
+ // return reliably. Currently these calls have
+ // a hard timeout of 10s (server side
+ // configured). The Sync check is probably
+ // safe.
+ ShouldSkip: true,
BroadcastTx: func(client *http.HTTP) broadcastFunc {
return func(ctx context.Context, tx types.Tx) error {
_, err := client.BroadcastTxCommit(ctx, tx)
@@ -87,8 +93,12 @@ func TestApp_Tx(t *testing.T) {
},
},
{
- Name: "Async",
- WaitTime: time.Minute,
+ Name: "Async",
+ WaitTime: 90 * time.Second,
+ // TODO: turn this check back on if there's a
+ // way to avoid failures in the case that the
+ // transaction doesn't make it into the
+ // mempool. (retries?)
+ ShouldSkip: true,
BroadcastTx: func(client *http.HTTP) broadcastFunc {
return func(ctx context.Context, tx types.Tx) error {

@@ -883,7 +883,7 @@ func (commit *Commit) ValidateBasic() error {
}

if commit.Height >= 1 {
- if commit.BlockID.IsZero() {
+ if commit.BlockID.IsNil() {
return errors.New("commit cannot be for nil block")
}

@@ -1204,8 +1204,8 @@ func (blockID BlockID) ValidateBasic() error {
return nil
}

- // IsZero returns true if this is the BlockID of a nil block.
- func (blockID BlockID) IsZero() bool {
+ // IsNil returns true if this is the BlockID of a nil block.
+ func (blockID BlockID) IsNil() bool {
return len(blockID.Hash) == 0 &&
blockID.PartSetHeader.IsZero()
}

@@ -21,7 +21,7 @@ func CanonicalizeBlockID(bid tmproto.BlockID) *tmproto.CanonicalBlockID {
panic(err)
}
var cbid *tmproto.CanonicalBlockID
- if rbid == nil || rbid.IsZero() {
+ if rbid == nil || rbid.IsNil() {
cbid = nil
} else {
cbid = &tmproto.CanonicalBlockID{

@@ -221,10 +221,6 @@ func (b *EventBus) PublishEventPolka(data EventDataRoundState) error {
return b.Publish(EventPolkaValue, data)
}

- func (b *EventBus) PublishEventUnlock(data EventDataRoundState) error {
- return b.Publish(EventUnlockValue, data)
- }
-
func (b *EventBus) PublishEventRelock(data EventDataRoundState) error {
return b.Publish(EventRelockValue, data)
}
@@ -301,10 +297,6 @@ func (NopEventBus) PublishEventPolka(data EventDataRoundState) error {
return nil
}

- func (NopEventBus) PublishEventUnlock(data EventDataRoundState) error {
- return nil
- }
-
func (NopEventBus) PublishEventRelock(data EventDataRoundState) error {
return nil
}

@@ -362,8 +362,6 @@ func TestEventBusPublish(t *testing.T) {
require.NoError(t, err)
err = eventBus.PublishEventPolka(EventDataRoundState{})
require.NoError(t, err)
- err = eventBus.PublishEventUnlock(EventDataRoundState{})
- require.NoError(t, err)
err = eventBus.PublishEventRelock(EventDataRoundState{})
require.NoError(t, err)
err = eventBus.PublishEventLock(EventDataRoundState{})
@@ -475,7 +473,6 @@ var events = []string{
EventTimeoutProposeValue,
EventCompleteProposalValue,
EventPolkaValue,
- EventUnlockValue,
EventLockValue,
EventRelockValue,
EventTimeoutWaitValue,
@@ -497,7 +494,6 @@ var queries = []tmpubsub.Query{
EventQueryTimeoutPropose,
EventQueryCompleteProposal,
EventQueryPolka,
- EventQueryUnlock,
EventQueryLock,
EventQueryRelock,
EventQueryTimeoutWait,

@@ -38,7 +38,6 @@ const (
|
||||
EventStateSyncStatusValue = "StateSyncStatus"
|
||||
EventTimeoutProposeValue = "TimeoutPropose"
|
||||
EventTimeoutWaitValue = "TimeoutWait"
|
||||
EventUnlockValue = "Unlock"
|
||||
EventValidBlockValue = "ValidBlock"
|
||||
EventVoteValue = "Vote"
|
||||
)
|
||||
@@ -223,7 +222,6 @@ var (
|
||||
EventQueryTimeoutPropose = QueryForEvent(EventTimeoutProposeValue)
|
||||
EventQueryTimeoutWait = QueryForEvent(EventTimeoutWaitValue)
|
||||
EventQueryTx = QueryForEvent(EventTxValue)
|
||||
EventQueryUnlock = QueryForEvent(EventUnlockValue)
|
||||
EventQueryValidatorSetUpdates = QueryForEvent(EventValidatorSetUpdatesValue)
|
||||
EventQueryValidBlock = QueryForEvent(EventValidBlockValue)
|
||||
EventQueryVote = QueryForEvent(EventVoteValue)
|
||||
|
||||
@@ -41,6 +41,7 @@ type ConsensusParams struct {
Evidence EvidenceParams `json:"evidence"`
Validator ValidatorParams `json:"validator"`
Version VersionParams `json:"version"`
+ Timestamp TimestampParams `json:"timestamp"`
}

// HashedParams is a subset of ConsensusParams.
@@ -75,6 +76,14 @@ type VersionParams struct {
AppVersion uint64 `json:"app_version"`
}

+ // TimestampParams influence the validity of block timestamps.
+ // TODO (@wbanfield): add link to proposer-based timestamp spec when completed.
+ type TimestampParams struct {
+ Precision time.Duration `json:"precision"`
+ Accuracy time.Duration `json:"accuracy"`
+ MsgDelay time.Duration `json:"msg_delay"`
+ }

// DefaultConsensusParams returns a default ConsensusParams.
func DefaultConsensusParams() *ConsensusParams {
return &ConsensusParams{
@@ -116,6 +125,16 @@ func DefaultVersionParams() VersionParams {
}
}

+ func DefaultTimestampParams() TimestampParams {
+ // TODO(@wbanfield): Determine experimental values for these defaults
+ // https://github.com/tendermint/tendermint/issues/7202
+ return TimestampParams{
+ Precision: 2 * time.Second,
+ Accuracy: 500 * time.Millisecond,
+ MsgDelay: 3 * time.Second,
+ }
+ }

func (val *ValidatorParams) IsValidPubkeyType(pubkeyType string) bool {
for i := 0; i < len(val.PubKeyTypes); i++ {
if val.PubKeyTypes[i] == pubkeyType {

@@ -79,6 +79,25 @@ func (p *Proposal) ValidateBasic() error {
return nil
}

+ // IsTimely validates that the block timestamp is 'timely' according to the proposer-based timestamp algorithm.
+ // To evaluate if a block is timely, its timestamp is compared to the local time of the validator along with the
+ // configured Precision and MsgDelay parameters.
+ // Specifically, a proposed block timestamp is considered timely if it satisfies the following inequalities:
+ //
+ // proposedBlockTime > validatorLocalTime - Precision && proposedBlockTime < validatorLocalTime + Precision + MsgDelay.
+ //
+ // For more information on the meaning of 'timely', see the proposer-based timestamp specification:
+ // https://github.com/tendermint/spec/tree/master/spec/consensus/proposer-based-timestamp
+ func (p *Proposal) IsTimely(clock tmtime.Source, tp TimestampParams) bool {
+ lt := clock.Now()
+ lhs := lt.Add(-tp.Precision)
+ rhs := lt.Add(tp.Precision).Add(tp.MsgDelay)
+ if lhs.Before(p.Timestamp) && p.Timestamp.Before(rhs) {
+ return true
+ }
+ return false
+ }

// String returns a string representation of the Proposal.
//
// 1. height
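To make the timeliness window concrete: with the draft defaults introduced earlier in this change (Precision 2s, MsgDelay 3s), a proposed timestamp is timely iff it falls strictly inside (localTime - 2s, localTime + 5s). A minimal standalone sketch of that arithmetic, independent of the Proposal type:

package main

import (
    "fmt"
    "time"
)

func main() {
    precision := 2 * time.Second
    msgDelay := 3 * time.Second

    local := time.Date(2021, 9, 1, 12, 0, 0, 0, time.UTC)
    lhs := local.Add(-precision)           // 11:59:58, lower bound
    rhs := local.Add(precision + msgDelay) // 12:00:05, upper bound

    for _, ts := range []time.Time{
        local.Add(-3 * time.Second), // too far in the past: not timely
        local.Add(1 * time.Second),  // inside the window: timely
        local.Add(6 * time.Second),  // too far in the future: not timely
    } {
        // Same comparison as IsTimely above.
        timely := lhs.Before(ts) && ts.Before(rhs)
        fmt.Println(ts.Format(time.RFC3339), timely)
    }
}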
@@ -13,6 +13,7 @@ import (
"github.com/tendermint/tendermint/crypto/tmhash"
"github.com/tendermint/tendermint/internal/libs/protoio"
tmrand "github.com/tendermint/tendermint/libs/rand"
+ tmtimemocks "github.com/tendermint/tendermint/libs/time/mocks"
tmproto "github.com/tendermint/tendermint/proto/tendermint/types"
)

@@ -191,3 +192,65 @@ func TestProposalProtoBuf(t *testing.T) {
}
}
}

+ func TestIsTimely(t *testing.T) {
+ genesisTime, err := time.Parse(time.RFC3339, "2019-03-13T23:00:00Z")
+ require.NoError(t, err)
+ testCases := []struct {
+ name string
+ proposalTime time.Time
+ localTime time.Time
+ precision time.Duration
+ msgDelay time.Duration
+ expectTimely bool
+ }{
+ {
+ // Checking that the following inequality evaluates to true:
+ // 1 - 2 < 0 < 1 + 2 + 1
+ name: "basic timely",
+ proposalTime: genesisTime,
+ localTime: genesisTime.Add(1 * time.Nanosecond),
+ precision: time.Nanosecond * 2,
+ msgDelay: time.Nanosecond,
+ expectTimely: true,
+ },
+ {
+ // Checking that the following inequality evaluates to false:
+ // 3 - 2 < 0 < 3 + 2 + 1
+ name: "local time too large",
+ proposalTime: genesisTime,
+ localTime: genesisTime.Add(3 * time.Nanosecond),
+ precision: time.Nanosecond * 2,
+ msgDelay: time.Nanosecond,
+ expectTimely: false,
+ },
+ {
+ // Checking that the following inequality evaluates to false:
+ // 0 - 2 < 4 < 0 + 2 + 1
+ name: "proposal time too large",
+ proposalTime: genesisTime.Add(4 * time.Nanosecond),
+ localTime: genesisTime,
+ precision: time.Nanosecond * 2,
+ msgDelay: time.Nanosecond,
+ expectTimely: false,
+ },
+ }
+ for _, testCase := range testCases {
+ t.Run(testCase.name, func(t *testing.T) {
+ p := Proposal{
+ Timestamp: testCase.proposalTime,
+ }
+
+ tp := TimestampParams{
+ Precision: testCase.precision,
+ MsgDelay: testCase.msgDelay,
+ }
+
+ mockSource := new(tmtimemocks.Source)
+ mockSource.On("Now").Return(testCase.localTime)
+
+ ti := p.IsTimely(mockSource, tp)
+ assert.Equal(t, testCase.expectTimely, ti)
+ })
+ }
+ }

@@ -68,7 +68,7 @@ func (vote *Vote) CommitSig() CommitSig {
switch {
case vote.BlockID.IsComplete():
blockIDFlag = BlockIDFlagCommit
- case vote.BlockID.IsZero():
+ case vote.BlockID.IsNil():
blockIDFlag = BlockIDFlagNil
default:
panic(fmt.Sprintf("Invalid vote %v - expected BlockID to be either empty or complete", vote))
@@ -177,7 +177,7 @@ func (vote *Vote) ValidateBasic() error {

// BlockID.ValidateBasic would not err if we for instance have an empty hash but a
// non-empty PartsSetHeader:
- if !vote.BlockID.IsZero() && !vote.BlockID.IsComplete() {
+ if !vote.BlockID.IsNil() && !vote.BlockID.IsComplete() {
return fmt.Errorf("blockID must be either empty or complete, got: %v", vote.BlockID)
}

@@ -27,7 +27,7 @@ func TestVoteSet_AddVote_Good(t *testing.T) {
assert.Nil(t, voteSet.GetByAddress(val0Addr))
assert.False(t, voteSet.BitArray().GetIndex(0))
blockID, ok := voteSet.TwoThirdsMajority()
- assert.False(t, ok || !blockID.IsZero(), "there should be no 2/3 majority")
+ assert.False(t, ok || !blockID.IsNil(), "there should be no 2/3 majority")

vote := &Vote{
ValidatorAddress: val0Addr,
@@ -44,7 +44,7 @@ func TestVoteSet_AddVote_Good(t *testing.T) {
assert.NotNil(t, voteSet.GetByAddress(val0Addr))
assert.True(t, voteSet.BitArray().GetIndex(0))
blockID, ok = voteSet.TwoThirdsMajority()
- assert.False(t, ok || !blockID.IsZero(), "there should be no 2/3 majority")
+ assert.False(t, ok || !blockID.IsNil(), "there should be no 2/3 majority")
}

func TestVoteSet_AddVote_Bad(t *testing.T) {
@@ -145,7 +145,7 @@ func TestVoteSet_2_3Majority(t *testing.T) {
require.NoError(t, err)
}
blockID, ok := voteSet.TwoThirdsMajority()
- assert.False(t, ok || !blockID.IsZero(), "there should be no 2/3 majority")
+ assert.False(t, ok || !blockID.IsNil(), "there should be no 2/3 majority")

// 7th validator voted for some blockhash
{
@@ -156,7 +156,7 @@ func TestVoteSet_2_3Majority(t *testing.T) {
_, err = signAddVote(privValidators[6], withBlockHash(vote, tmrand.Bytes(32)), voteSet)
require.NoError(t, err)
blockID, ok = voteSet.TwoThirdsMajority()
- assert.False(t, ok || !blockID.IsZero(), "there should be no 2/3 majority")
+ assert.False(t, ok || !blockID.IsNil(), "there should be no 2/3 majority")
}

// 8th validator voted for nil.
@@ -168,7 +168,7 @@ func TestVoteSet_2_3Majority(t *testing.T) {
_, err = signAddVote(privValidators[7], vote, voteSet)
require.NoError(t, err)
blockID, ok = voteSet.TwoThirdsMajority()
- assert.True(t, ok || blockID.IsZero(), "there should be 2/3 majority for nil")
+ assert.True(t, ok || blockID.IsNil(), "there should be 2/3 majority for nil")
}
}

@@ -200,7 +200,7 @@ func TestVoteSet_2_3MajorityRedux(t *testing.T) {
require.NoError(t, err)
}
blockID, ok := voteSet.TwoThirdsMajority()
- assert.False(t, ok || !blockID.IsZero(),
+ assert.False(t, ok || !blockID.IsNil(),
"there should be no 2/3 majority")

// 67th validator voted for nil
@@ -212,7 +212,7 @@ func TestVoteSet_2_3MajorityRedux(t *testing.T) {
_, err = signAddVote(privValidators[66], withBlockHash(vote, nil), voteSet)
require.NoError(t, err)
blockID, ok = voteSet.TwoThirdsMajority()
- assert.False(t, ok || !blockID.IsZero(),
+ assert.False(t, ok || !blockID.IsNil(),
"there should be no 2/3 majority: last vote added was nil")
}

@@ -226,7 +226,7 @@ func TestVoteSet_2_3MajorityRedux(t *testing.T) {
_, err = signAddVote(privValidators[67], withBlockPartSetHeader(vote, blockPartsHeader), voteSet)
require.NoError(t, err)
blockID, ok = voteSet.TwoThirdsMajority()
- assert.False(t, ok || !blockID.IsZero(),
+ assert.False(t, ok || !blockID.IsNil(),
"there should be no 2/3 majority: last vote added had different PartSetHeader Hash")
}

@@ -240,7 +240,7 @@ func TestVoteSet_2_3MajorityRedux(t *testing.T) {
_, err = signAddVote(privValidators[68], withBlockPartSetHeader(vote, blockPartsHeader), voteSet)
require.NoError(t, err)
blockID, ok = voteSet.TwoThirdsMajority()
- assert.False(t, ok || !blockID.IsZero(),
+ assert.False(t, ok || !blockID.IsNil(),
"there should be no 2/3 majority: last vote added had different PartSetHeader Total")
}

@@ -253,7 +253,7 @@ func TestVoteSet_2_3MajorityRedux(t *testing.T) {
_, err = signAddVote(privValidators[69], withBlockHash(vote, tmrand.Bytes(32)), voteSet)
require.NoError(t, err)
blockID, ok = voteSet.TwoThirdsMajority()
- assert.False(t, ok || !blockID.IsZero(),
+ assert.False(t, ok || !blockID.IsNil(),
"there should be no 2/3 majority: last vote added had different BlockHash")
}
