Compare commits

...

9 Commits

Author SHA1 Message Date
William Banfield
655a81e50e update docker cleanup 2021-09-16 12:29:16 -04:00
William Banfield
044ad28ec6 setup changes 2021-09-16 12:28:38 -04:00
William Banfield
68a9771527 e2e: cleanup in benchmark as well 2021-09-15 18:11:40 -04:00
William Banfield
547baf9e64 cleanup on all errors if preserve not specified 2021-09-15 18:04:51 -04:00
Sam Kleinman
55f6d20977 e2e: skip broadcastTxCommit check (#6949)
I think the `Sync` check covers our primary use case, and perhaps we
can turn this back on in the future after some kind of event-system
rewrite, or RPC rewrite that will avoid the serverside timeout.
2021-09-15 21:24:35 +00:00
Sam Kleinman
b9c35c1263 docs: fix openapi yaml lint (#6948)
saw this in the super lint.
2021-09-15 19:29:25 +00:00
Sam Kleinman
f08f72e334 rfc: e2e improvements (#6941) 2021-09-15 15:26:39 -04:00
Callum Waters
e932b469ed e2e: tweak semantics of waitForHeight (#6943) 2021-09-15 20:49:24 +02:00
Callum Waters
5db2a39643 docs: add documentation of unsafe_flush_mempool to openapi (#6947) 2021-09-15 17:28:01 +02:00
12 changed files with 359 additions and 39 deletions

View File

@@ -40,5 +40,6 @@ sections.
- [RFC-000: P2P Roadmap](./rfc-000-p2p-roadmap.rst)
- [RFC-001: Storage Engines](./rfc-001-storage-engine.rst)
- [RFC-002: Interprocess Communication](./rfc-002-ipc-ecosystem.md)
- [RFC-004: E2E Test Framework Enhancements](./rfc-004-e2e-framework.md)
<!-- - [RFC-NNN: Title](./rfc-NNN-title.md) -->

View File

@@ -0,0 +1,213 @@
========================================
RFC 004: E2E Test Framework Enhancements
========================================
Changelog
---------
- 2021-09-14: started initial draft (@tychoish)
Abstract
--------
This document discusses a series of improvements to the e2e test framework
that we can consider over the next few release cycles, to help boost
confidence in Tendermint releases and improve developer efficiency.
Background
----------
During the 0.35 release cycle, the E2E tests were a source of great
value, helping to identify a number of bugs before release. At the same time,
the tests were not consistently passing during this period, which reduced
their value and forced the core development team to allocate time and energy
to maintaining and chasing down issues with the e2e tests and the test
harness. The experience of this release cycle suggests a series of
improvements to the test framework, and this document attempts to capture
those improvements, along with their motivations and potential impact.
Projects
--------
Flexible Workload Generation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Presently the e2e suite contains a single workload generation pattern, which
exists simply to ensure that the test networks have some work during their
runs. However, the shape and volume of that work are deliberately consistent
and gentle, to help ensure test reliability.
We don't need a complex workload generation framework, but having a few
different workload shapes available for test networks, both generated and
hand-crafted, would be useful.
Workload patterns/configurations might include (a rough sketch follows this list):
- transaction targeting patterns (include light nodes, round robin, target
individual nodes).
- variable transaction size over time.
- transaction broadcast options (synchronous, checked, fire-and-forget,
mixed).
- number of transactions to submit.
- non-transaction workloads (evidence submission, queries, event subscriptions).
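As a very rough illustration, these options could be expressed as a small
declarative structure consumed by the load generator. The type and field
names below are hypothetical, not part of the current framework:

.. code-block:: go

   // WorkloadSpec is a hypothetical declarative workload description;
   // none of these names exist in the e2e framework today.
   type WorkloadSpec struct {
       // Targeting selects how transactions are routed, e.g.
       // "round-robin", "single-node", or "include-light-nodes".
       Targeting string `toml:"targeting"`
       // TxSizeBytes gives a [min, max] range so that transaction
       // size can vary over the course of the run.
       TxSizeBytes [2]int `toml:"tx_size_bytes"`
       // BroadcastMode is one of "sync", "commit", "async", or "mixed".
       BroadcastMode string `toml:"broadcast_mode"`
       // TxCount is the total number of transactions to submit.
       TxCount int `toml:"tx_count"`
       // Extras enables non-transaction workloads such as "evidence",
       // "query", or "event-subscription".
       Extras []string `toml:"extras"`
   }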
Configurable Generator
~~~~~~~~~~~~~~~~~~~~~~
The nightly e2e suite is defined by the `testnet generator
<https://github.com/tendermint/tendermint/blob/master/test/e2e/generator/generate.go#L13-L65>`_,
and it's difficult to add dimensions or change the focus of the test suite in
any way without modifying the implementation of the generator. If the
generator were more configurable, potentially via a file rather than in
the Go implementation, we could modify the focus of the test suite on the
fly.
Features that we might want to configure (a sketch follows this list):
- number of test networks to generate of various topologies, to improve
coverage of different configurations.
- test application configurations (to modify the latency of ABCI calls, etc.)
- size of test networks.
- workload shape and behavior.
- initial sync and catch-up configurations.
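If the generator read these options from a file, its runtime configuration
might look something like the following sketch. None of these names exist
today; the corresponding dimensions are currently hardcoded in `generate.go`:

.. code-block:: go

   // GeneratorConfig is a hypothetical file-driven configuration for
   // the testnet generator.
   type GeneratorConfig struct {
       // Networks maps a topology name ("single", "quad", "large") to
       // the number of testnets to generate with that shape.
       Networks map[string]int `toml:"networks"`
       // AppLatencyMS injects artificial latency into test application
       // ABCI calls, to simulate slow applications.
       AppLatencyMS int `toml:"app_latency_ms"`
       // Workload names a workload shape to apply to each network.
       Workload string `toml:"workload"`
       // InitialStates selects sync and catch-up scenarios, e.g.
       // "genesis", "fast-sync", or "state-sync".
       InitialStates []string `toml:"initial_states"`
   }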
The workload generator currently provides runtime options for limiting the
generator to specific types of P2P stacks, and for generating multiple groups
of test cases to support parallelism. The goal is to extend this pattern and
avoid hardcoding the matrix of test cases in the generator code. Once the
testnet generation behavior is configurable at runtime, developers will be
able to use the e2e framework to validate changes before landing them, rather
than discovering a day later that they have broken the e2e tests.
In addition to the autogenerated suite, it might make sense to maintain a
small collection of hand-crafted cases that exercise configurations of
concern, to run as part of the nightly (or less frequent) loop.
Implementation Plan Structure
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
As a development team, we should determine which features should impact the
e2e tests early in the development cycle, and if we intend to modify the e2e
tests to exercise a feature, we should identify this and begin the
integration process as early as possible.
To facilitate this, we should adopt a practice whereby we exercise specific
features that are currently under development more rigorously in the e2e
suite, and then as development stabilizes we can reduce the number or weight
of these features in the suite.
As of 0.35 there are essentially two end-to-end tests: the suite of 64
generated test networks, and the hand-crafted `ci.toml` test case. The
generated test cases help provide systematic coverage, while the `ci` run
provides coverage for a large number of features.
Reduce Cycle Time
~~~~~~~~~~~~~~~~~
One of the barriers to leveraging the e2e framework, and one of the
challenges in debugging failures, is that the cycle time of a single test
iteration is quite high: 5 minutes to build the docker image, plus the time
to run the test or tests.
There are a number of improvements and enhancements that can reduce the cycle
time in practice:
- reduce the amount of time required to build the docker image used in these
tests. Without the dependency on CGo, the tendermint binaries could be
(cross-)compiled outside of the docker container and then injected into it,
which would take better advantage of docker's native caching; indeed,
without the dependency on CGo there would be no hard requirement for the
e2e tests to use docker at all.
- support test parallelism. Because of the way the testnets are orchestrated,
a single system can really only run one network at a time. For executions
(local or remote) with more resources, there's no reason not to run a few
networks in parallel to reduce the feedback time.
- prune testnet configurations that are unlikely to provide good signal, to
shorten the time to feedback.
- apply some kind of tiered approach to test execution, to improve the
legibility of the test results. For example, order tests by the
dependencies of their features, or run test networks without perturbations
before running the same configuration with perturbations, in order to
isolate the impact of specific features.
- orchestrate the test harness directly from `go test` rather than via a
special harness and shell scripts, so that e2e tests fit more naturally
into developers' existing workflows (sketched below).
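To illustrate the last point, here is a minimal sketch of a `go test` entry
point. It assumes the runner's Setup/Start/Cleanup helpers were exported
from a library package and that `e2e.LoadTestnet` is available; today these
live behind the runner binary, so this is an aspiration, not current API:

.. code-block:: go

   // TestCINetwork is a hypothetical test that drives a testnet from
   // plain `go test` instead of the runner binary and shell scripts.
   func TestCINetwork(t *testing.T) {
       if testing.Short() {
           t.Skip("e2e networks are slow; skipping in -short mode")
       }
       testnet, err := e2e.LoadTestnet("networks/ci.toml")
       if err != nil {
           t.Fatal(err)
       }
       if err := runner.Setup(testnet); err != nil {
           t.Fatal(err)
       }
       t.Cleanup(func() { _ = runner.Cleanup(testnet) })
       if err := runner.Start(testnet); err != nil {
           t.Fatal(err)
       }
       // ... wait for the target height and run assertions here.
   }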
Many of these improvements, particularly reducing the build time, will also
reduce the time to get feedback during automated builds.
Deeper Insights
~~~~~~~~~~~~~~~
When a test network fails, it's incredibly difficult to understand *why* the
network failed, as the current system provides very little insight into the
system beyond the process logs. When a test network stalls or fails,
developers should be able to quickly and easily get a sense of the state of
the network and all of its nodes.
Improvements in pursuit of this goal include functionality that would help
node operators in production environments, by improving the quality and
utility of log messages and other reported metrics, as well as tools to
collect and aggregate this data for developers in the context of test
networks.
- Interleave messages from all nodes in the network to be able to correlate
events during the test run (sketched after this list).
- Collect structured metrics of the system operation (CPU/MEM/IO) during the
test run, as well as from each tendermint/application process.
- Build (simple) tools to be able to render and summarize the data collected
during the test run to answer basic questions about test outcome.
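For the first of these, a minimal sketch of timestamp-based interleaving,
assuming each log line begins with an RFC3339 timestamp (imports: sort,
strings, time); the function name and log format are assumptions:

.. code-block:: go

   // interleave merges per-node log lines into a single stream ordered
   // by timestamp, prefixing each line with its node name.
   func interleave(nodeLogs map[string][]string) []string {
       type entry struct {
           ts   time.Time
           line string
       }
       var all []entry
       for node, lines := range nodeLogs {
           for _, l := range lines {
               fields := strings.Fields(l)
               if len(fields) == 0 {
                   continue
               }
               ts, err := time.Parse(time.RFC3339, fields[0])
               if err != nil {
                   continue // skip lines without a leading timestamp
               }
               all = append(all, entry{ts: ts, line: node + " | " + l})
           }
       }
       sort.Slice(all, func(i, j int) bool { return all[i].ts.Before(all[j].ts) })
       out := make([]string, 0, len(all))
       for _, e := range all {
           out = append(out, e.line)
       }
       return out
   }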
Flexible Assertions
~~~~~~~~~~~~~~~~~~~
Currently, all assertions run for every test network, which makes the
assertions fairly bland and the framework primarily useful as a smoke-test
framework. It would be useful to be able to write and run different tests
for different configurations, which would allow us to test outside of the
happy path.
In general our existing assertions occupy a small fraction of the total test
time, so the relative cost of adding a few extra assertions would be limited,
and the additions could help build confidence; one possible shape is sketched
below.
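One possible shape for configuration-aware assertions, with hypothetical
names: each check declares which testnets it applies to, rather than running
unconditionally for every network:

.. code-block:: go

   // Assertion is a hypothetical configuration-aware check.
   type Assertion struct {
       Name string
       // Applies reports whether the check makes sense for this
       // testnet, e.g. only networks with misbehaving nodes.
       Applies func(*e2e.Testnet) bool
       // Check runs the actual assertion against the network.
       Check func(*testing.T, *e2e.Testnet)
   }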
Additional Kinds of Testing
~~~~~~~~~~~~~~~~~~~~~~~~~~~
The existing e2e suite exercises networks of nodes with a homogeneous
tendermint version and stable configuration that are expected to make
progress. There are many other possible test configurations that may be
interesting to exercise. These could include dimensions such as:
- Multi-version testing to exercise our compatibility guarantees for networks
that might have different tendermint versions.
- As a flavor of multi-version testing, include upgrade testing, to build
confidence in migration code and procedures.
- Additional test applications, particularly practical applications,
including some that use gaiad and/or the cosmos-sdk, as well as test-only
applications that simulate other kinds of applications (e.g. variable
application operation latency).
- Tests of "non-viable" configurations that ensure that forbidden combinations
lead to halts.
References
----------
- `ADR 66: End-to-End Testing <../architecture/adr-66-e2e-testing.md>`_

View File

@@ -379,6 +379,7 @@ func (c *Client) Update(ctx context.Context, now time.Time) (*types.LightBlock,
return nil, err
}
// If there is a new light block then verify it
if latestBlock.Height > lastTrustedHeight {
err = c.verifyLightBlock(ctx, latestBlock, now)
if err != nil {
@@ -388,7 +389,8 @@ func (c *Client) Update(ctx context.Context, now time.Time) (*types.LightBlock,
return latestBlock, nil
}
return nil, nil
// else return the latestTrustedBlock
return c.latestTrustedBlock, nil
}
// VerifyLightBlockAtHeight fetches the light block at the given height

View File

@@ -644,7 +644,7 @@ func TestClientReplacesPrimaryWithWitnessIfPrimaryIsUnavailable(t *testing.T) {
chainID,
trustOptions,
mockDeadNode,
[]provider.Provider{mockFullNode, mockFullNode},
[]provider.Provider{mockDeadNode, mockFullNode},
dbs.New(dbm.NewMemDB()),
light.Logger(log.TestingLogger()),
)
@@ -663,6 +663,32 @@ func TestClientReplacesPrimaryWithWitnessIfPrimaryIsUnavailable(t *testing.T) {
mockFullNode.AssertExpectations(t)
}
func TestClientReplacesPrimaryWithWitnessIfPrimaryDoesntHaveBlock(t *testing.T) {
mockFullNode := &provider_mocks.Provider{}
mockFullNode.On("LightBlock", mock.Anything, mock.Anything).Return(l1, nil)
mockDeadNode := &provider_mocks.Provider{}
mockDeadNode.On("LightBlock", mock.Anything, mock.Anything).Return(nil, provider.ErrLightBlockNotFound)
c, err := light.NewClient(
ctx,
chainID,
trustOptions,
mockDeadNode,
[]provider.Provider{mockDeadNode, mockFullNode},
dbs.New(dbm.NewMemDB()),
light.Logger(log.TestingLogger()),
)
require.NoError(t, err)
_, err = c.Update(ctx, bTime.Add(2*time.Hour))
require.NoError(t, err)
// we should still have the dead node as a witness because it
// hasn't repeatedly been unresponsive yet
assert.Equal(t, 2, len(c.Witnesses()))
mockDeadNode.AssertExpectations(t)
mockFullNode.AssertExpectations(t)
}
func TestClient_BackwardsVerification(t *testing.T) {
{
headers, vals, _ := genLightBlocksWithKeys(chainID, 9, 3, 0, bTime)

View File

@@ -601,6 +601,32 @@ paths:
application/json:
schema:
$ref: "#/components/schemas/ErrorResponse"
/unsafe_flush_mempool:
get:
summary: Flush mempool of all unconfirmed transactions
operationId: unsafe_flush_mempool
tags:
- Unsafe
description: |
Flush flushes out the mempool. It acquires a read-lock, fetches all the
transactions currently in the transaction store, removes each transaction
from the store and all indexes, and finally resets the cache.
Note, flushing the mempool may leave the mempool in an inconsistent state.
responses:
"200":
description: empty answer
content:
application/json:
schema:
$ref: "#/components/schemas/EmptyResponse"
"500":
description: empty error
content:
application/json:
schema:
$ref: "#/components/schemas/ErrorResponse"
/blockchain:
get:
summary: "Get block headers (max: 20) for minHeight <= height <= maxHeight."

View File

@@ -22,7 +22,7 @@ import (
// Metrics are based on the `benchmarkLength`, the number of consecutive blocks
// sampled from in the testnet
func Benchmark(ctx context.Context, testnet *e2e.Testnet, benchmarkLength int64) error {
block, _, err := waitForHeight(ctx, testnet, 0)
block, err := getLatestBlock(ctx, testnet)
if err != nil {
return err
}

View File

@@ -11,7 +11,7 @@ import (
// Cleanup removes the Docker Compose containers and testnet directory.
func Cleanup(testnet *e2e.Testnet) error {
err := cleanupDocker()
err := cleanupDocker(testnet.Name)
if err != nil {
return err
}
@@ -20,7 +20,7 @@ func Cleanup(testnet *e2e.Testnet) error {
// cleanupDocker removes all E2E resources (with label e2e=True), regardless
// of testnet.
func cleanupDocker() error {
func cleanupDocker(name string) error {
logger.Info("Removing Docker containers and networks")
// GNU xargs requires the -r flag to not run when input is empty, macOS
@@ -28,13 +28,13 @@ func cleanupDocker() error {
xargsR := `$(if [[ $OSTYPE == "linux-gnu"* ]]; then echo -n "-r"; fi)`
err := exec("bash", "-c", fmt.Sprintf(
"docker container ls -qa --filter label=e2e | xargs %v docker container rm -f", xargsR))
"docker container ls -qa --filter label=\"name=%v\" | xargs %v docker container rm -f", name, xargsR))
if err != nil {
return err
}
return exec("bash", "-c", fmt.Sprintf(
"docker network ls -q --filter label=e2e | xargs %v docker network rm", xargsR))
"docker network ls -q --filter label=\"name=%v\"| xargs %v docker network rm", name, xargsR))
}
// cleanupDir cleans up a testnet directory

View File

@@ -52,6 +52,9 @@ func NewCLI() *CLI {
if err := Cleanup(cli.testnet); err != nil {
return err
}
if !cli.preserve {
defer Cleanup(cli.testnet)
}
if err := Setup(cli.testnet); err != nil {
return err
}
@@ -103,11 +106,6 @@ func NewCLI() *CLI {
if err := Test(cli.testnet); err != nil {
return err
}
if !cli.preserve {
if err := Cleanup(cli.testnet); err != nil {
return err
}
}
return nil
},
}
@@ -269,6 +267,8 @@ Does not run any perturbations.
if err := Cleanup(cli.testnet); err != nil {
return err
}
defer Cleanup(cli.testnet)
if err := Setup(cli.testnet); err != nil {
return err
}
@@ -302,10 +302,6 @@ Does not run any perturbations.
return err
}
if err := Cleanup(cli.testnet); err != nil {
return err
}
return nil
},
})

View File

@@ -13,17 +13,22 @@ import (
)
// waitForHeight waits for the network to reach a certain height (or above),
// returning the highest height seen. Errors if the network is not making
// returning the block at the height seen. Errors if the network is not making
// progress at all.
// If height == 0, the initial height of the test network is used as the target.
func waitForHeight(ctx context.Context, testnet *e2e.Testnet, height int64) (*types.Block, *types.BlockID, error) {
var (
err error
maxResult *rpctypes.ResultBlock
clients = map[string]*rpchttp.HTTP{}
lastHeight int64
lastIncrease = time.Now()
nodesAtHeight = map[string]struct{}{}
numRunningNodes int
)
if height == 0 {
height = testnet.InitialHeight
}
for _, node := range testnet.Nodes {
if node.Stateless() {
continue
@@ -47,10 +52,10 @@ func waitForHeight(ctx context.Context, testnet *e2e.Testnet, height int64) (*ty
continue
}
// skip nodes that don't have state or haven't started yet
if node.Stateless() {
continue
}
if !node.HasStarted {
continue
}
@@ -67,16 +72,16 @@ func waitForHeight(ctx context.Context, testnet *e2e.Testnet, height int64) (*ty
wctx, cancel := context.WithTimeout(ctx, 10*time.Second)
defer cancel()
result, err := client.Block(wctx, nil)
result, err := client.Status(wctx)
if err != nil {
continue
}
if result.Block != nil && (maxResult == nil || result.Block.Height > maxResult.Block.Height) {
maxResult = result
if result.SyncInfo.LatestBlockHeight > lastHeight {
lastHeight = result.SyncInfo.LatestBlockHeight
lastIncrease = time.Now()
}
if maxResult != nil && maxResult.Block.Height >= height {
if result.SyncInfo.LatestBlockHeight >= height {
// the node has achieved the target height!
// add this node to the set of target
@@ -90,9 +95,16 @@ func waitForHeight(ctx context.Context, testnet *e2e.Testnet, height int64) (*ty
continue
}
// return once all nodes have reached
// the target height.
return maxResult.Block, &maxResult.BlockID, nil
// All nodes are at or above the target height. Now fetch the block for that target height
// and return it. We loop again through all clients because some may have pruning set but
// at least two of them should be archive nodes.
for _, c := range clients {
result, err := c.Block(ctx, &height)
if err != nil || result == nil || result.Block == nil {
continue
}
return result.Block, &result.BlockID, err
}
}
}
@@ -100,12 +112,12 @@ func waitForHeight(ctx context.Context, testnet *e2e.Testnet, height int64) (*ty
return nil, nil, errors.New("unable to connect to any network nodes")
}
if time.Since(lastIncrease) >= time.Minute {
if maxResult == nil {
return nil, nil, errors.New("chain stalled at unknown height")
if lastHeight == 0 {
return nil, nil, errors.New("chain stalled at unknown height (most likely upon starting)")
}
return nil, nil, fmt.Errorf("chain stalled at height %v [%d of %d nodes %+v]",
maxResult.Block.Height,
lastHeight,
len(nodesAtHeight),
numRunningNodes,
nodesAtHeight)
@@ -182,3 +194,35 @@ func waitForNode(ctx context.Context, node *e2e.Node, height int64) (*rpctypes.R
}
}
}
// getLatestBlock returns the last block that all active nodes in the network have
// agreed upon, i.e. the earliest of each node's latest blocks
func getLatestBlock(ctx context.Context, testnet *e2e.Testnet) (*types.Block, error) {
var earliestBlock *types.Block
for _, node := range testnet.Nodes {
// skip nodes that don't have state or haven't started yet
if node.Stateless() {
continue
}
if !node.HasStarted {
continue
}
client, err := node.Client()
if err != nil {
return nil, err
}
wctx, cancel := context.WithTimeout(ctx, 10*time.Second)
defer cancel()
result, err := client.Block(wctx, nil)
if err != nil {
return nil, err
}
if result.Block != nil && (earliestBlock == nil || earliestBlock.Height > result.Block.Height) {
earliestBlock = result.Block
}
}
return earliestBlock, nil
}

View File

@@ -139,6 +139,7 @@ networks:
{{ .Name }}:
labels:
e2e: true
name: {{ .Name }}
driver: bridge
{{- if .IPv6 }}
enable_ipv6: true
@@ -153,7 +154,8 @@ services:
{{ .Name }}:
labels:
e2e: true
container_name: {{ .Name }}
name: {{ $.Name }}
container_name: {{$.Name}}_{{ .Name }}
image: tendermint/e2e-node
{{- if eq .ABCIProtocol "builtin" }}
entrypoint: /usr/bin/entrypoint-builtin
@@ -162,10 +164,10 @@ services:
{{- end }}
init: true
ports:
- 26656
- {{ if .ProxyPort }}{{ addUint32 .ProxyPort 1000 }}:{{ end }}26660
- {{ if .ProxyPort }}{{ .ProxyPort }}:{{ end }}26657
- 6060
- 0:26656
- 0:26660
- 0:26657
- 0:6060
volumes:
- ./{{ .Name }}:/tendermint
networks:

View File

@@ -10,7 +10,7 @@ import (
// Wait waits for a number of blocks to be produced, and for all nodes to catch
// up with it.
func Wait(ctx context.Context, testnet *e2e.Testnet, blocks int64) error {
block, _, err := waitForHeight(ctx, testnet, 0)
block, err := getLatestBlock(ctx, testnet)
if err != nil {
return err
}

View File

@@ -68,7 +68,7 @@ func TestApp_Tx(t *testing.T) {
}{
{
Name: "Sync",
WaitTime: 30 * time.Second,
WaitTime: time.Minute,
BroadcastTx: func(client *http.HTTP) broadcastFunc {
return func(ctx context.Context, tx types.Tx) error {
_, err := client.BroadcastTxSync(ctx, tx)
@@ -78,7 +78,13 @@ func TestApp_Tx(t *testing.T) {
},
{
Name: "Commit",
WaitTime: time.Minute,
WaitTime: 15 * time.Second,
// TODO: turn this check back on if it can
// return reliably. Currently these calls have
// a hard timeout of 10s (server side
// configured). The Sync check is probably
// safe.
ShouldSkip: true,
BroadcastTx: func(client *http.HTTP) broadcastFunc {
return func(ctx context.Context, tx types.Tx) error {
_, err := client.BroadcastTxCommit(ctx, tx)
@@ -87,8 +93,12 @@ func TestApp_Tx(t *testing.T) {
},
},
{
Name: "Async",
WaitTime: time.Minute,
Name: "Async",
WaitTime: 90 * time.Second,
// TODO: turn this check back on if there's a
// way to avoid failures in the case that the
// transaction doesn't make it into the
// mempool. (retries?)
ShouldSkip: true,
BroadcastTx: func(client *http.HTTP) broadcastFunc {
return func(ctx context.Context, tx types.Tx) error {