Commit Graph

297 Commits

Author SHA1 Message Date
Sam Kleinman
b1dfbb8bc3 e2e: generator ensure p2p modes (#7021) 2021-09-28 17:04:37 -04:00
Sam Kleinman
c18470a5f1 e2e: use network size in load generator (#7019) 2021-09-28 16:47:35 -04:00
Sam Kleinman
e35a42fc68 e2e: use smaller transactions (#7016)
75% of the failures in the last run all ran with the 10kb
transactions. I'd like to dial it back and see if things improve more.
2021-09-28 14:39:26 +00:00
Sam Kleinman
6be36613c9 e2e: reduce number of stateless nodes in test networks (#7010) 2021-09-27 17:00:05 -04:00
Sam Kleinman
8023a2aeef e2e: add generator tests (#7008) 2021-09-27 15:38:03 -04:00
Sam Kleinman
6eaa3b24d6 ci: use cheaper codecov data collection (#7009) 2021-09-27 15:22:25 -04:00
Sam Kleinman
b150ea6b3e e2e: avoid seed nodes when statesyncing (#7006) 2021-09-27 14:08:08 -04:00
Sam Kleinman
b879f71e8e e2e: reduce log noise (#7004) 2021-09-27 13:27:08 -04:00
Callum Waters
60a6c6fb1a e2e: allow running of single node using the e2e app (#6982) 2021-09-27 15:43:07 +02:00
Sam Kleinman
fb9eaf576a e2e: improve chances of statesyncing success (#7001)
This reduces this situation where a node will get stuck block syncing,
which seemed to happen a lot in last nights run.
2021-09-26 16:10:36 +00:00
Sam Kleinman
37ca98a544 e2e: reduce number of statesyncs in test networks (#6999) 2021-09-25 19:14:38 -04:00
Sam Kleinman
c101fa17ab e2e: add limit and sort to generator (#6998)
I observed a couple of problems with the generator in some recent tests: 

- there were a couple of hybrid test cases which did not have any
  legacy nodes (randomness and all.) I change the probability to
  produce more reliable results.

- added options to the generation to be able to add a max (to
  compliment the earlier min) number of nodes for local testing. 

- added an option to support reversing the sort order so "more
  complex" networks were first, as well as tweaked some of the point
  values. 

- this refactored the generators cli parsing to be a bit more clear.
2021-09-25 15:53:04 +00:00
Sam Kleinman
5e45676875 e2e: do not inject evidence through light proxy (#6992)
In the last run, there were two problems at the RPC layer returned
from light nodes' RPC end points. I think exercising the light client
proxy RPC system is something that can/should be done via unit
testing, and that likely these errors are (in production) transient
and (in CI) very likely to fail for test environment issues.
2021-09-24 18:27:00 +00:00
Sam Kleinman
08982c81fc e2e: skip validation of status apphash (#6991)
I believe this assertion is likely redundant given that we're checking the block apphash.
2021-09-24 17:49:06 +00:00
Sam Kleinman
ab8cfb9f57 e2e: tighten timing for load generation (#6990) 2021-09-24 12:28:51 -04:00
Sam Kleinman
c909f8a236 e2e: avoid non-determinism in app hash check (#6985) 2021-09-24 11:52:47 -04:00
Sam Kleinman
5ccd668c78 e2e: load should be proportional to network (#6983) 2021-09-23 16:58:10 -04:00
Sam Kleinman
e94c418ad9 e2e: always preserve failed networks (#6981) 2021-09-23 14:52:14 -04:00
Sam Kleinman
3d410e4a6b e2e: only check validator sets after statesync (#6980) 2021-09-23 14:31:59 -04:00
Sam Kleinman
8a171b8426 e2e: improve manifest sorting algorithim (#6979) 2021-09-23 12:42:20 -04:00
Sam Kleinman
d04b6c2a5e e2e: run multiple should use preserve (#6972) 2021-09-22 13:13:31 -04:00
Sam Kleinman
1c4950dbd2 state: move package to internal (#6964) 2021-09-22 13:04:25 -04:00
Sam Kleinman
9dfdc62eb7 proxy: move proxy package to internal (#6953) 2021-09-20 15:18:48 -04:00
William Banfield
bf9232e99f e2e: cleanup on all errors if preserve not specified (#6950)
If the e2e tests error, they leave all of the e2e state around including containers and networks etc. 
We should clean this up when the tests shuts down, even if it exits in error.
2021-09-17 08:35:49 +00:00
Sam Kleinman
b0423e2445 e2e: allow load generator to succed for short tests (#6952)
This should address last night's failure. We've taken the perspective
of "the load generator shouldn't cause tests to fail" in recent
days/weeks, and I think this is just a next step along that line. The
e2e tests shouldn't test performance. 

I included some comments indicating the ways that this isn't ideal (it
is perhaps not), and I think that if test networks could make
assertions about the required rate, that might be a cool future
improvement (and good, perhaps, for system benchmarking.)
2021-09-16 15:45:51 +00:00
Sam Kleinman
55f6d20977 e2e: skip broadcastTxCommit check (#6949)
I think the `Sync` check covers our primary use case, and perhaps we
can turn this back on in the future after some kind of event-system
rewrite, or RPC rewrite that will avoid the serverside timeout.
2021-09-15 21:24:35 +00:00
Callum Waters
e932b469ed e2e: tweak semantics of waitForHeight (#6943) 2021-09-15 20:49:24 +02:00
Sam Kleinman
6909158933 e2e: reduce load pressure (#6939) 2021-09-14 10:44:30 -04:00
Sam Kleinman
c257cda212 e2e: slow load processes with longer evidence timeouts (#6936)
These are mostly the timeouts that I think we're still hitting in CI. 

At this point, the tests (on master) pass on my local machine (which is quite beefy) so I think this is just the first in (perhaps?) a sequence of changes that attempt to change timeouts and load patterns so that the tests pass in CI more reliably.
2021-09-13 20:57:25 +00:00
Sam Kleinman
abbe8209b5 e2e: reduce load volume (#6932) 2021-09-13 13:45:01 -04:00
Sam Kleinman
3bf0c7a712 e2e: improve p2p mode selection (#6929)
The previous implemention of hybrid set testing, which was entirely my
own creation, was a bit peculiar, and I think this probably clears thins up.

The previous implementation had far fewer legacy nodes in hybrid
networks, *and* also for some reason that I can't quite explain,
caused a test case to fail.
2021-09-12 06:30:58 +00:00
Sam Kleinman
1998cf7e77 e2e: compile tests (#6926) 2021-09-10 13:34:26 -04:00
Sam Kleinman
c3bcf9b180 e2e: test multiple broadcast tx methods (#6925) 2021-09-10 12:03:41 -04:00
Callum Waters
f1b9613301 e2e: increase retain height to at least twice evidence age (#6924) 2021-09-10 16:18:01 +02:00
Sam Kleinman
af71f1cbcb e2e: load generation and logging changes (#6912) 2021-09-10 09:26:17 -04:00
Sam Kleinman
0c3601bcac e2e: introduce canonical ordering of manifests (#6918) 2021-09-09 14:31:17 -04:00
William Banfield
dc0e04d243 rename configuration parameters to use the new blocksync nomenclature (#6896)
The 0.35 release cycle renamed the 'fastsync' functionality to 'blocksync'. This change brings the configuration parameters in line with that change. Namely, it updates the configuration file `[fastsync]` field to be `[blocksync]` and changes the command line flag and config file parameters `--fast-sync` and `fast-sync` to `--enable-block-sync` and `enable-block-sync` respectively.

Error messages were added to help users encountering these changes be able to quickly make the needed update to their files/scripts.

When using the old command line argument for fast-sync, the following is printed

```
./build/tendermint start --proxy-app=kvstore --consensus.create-empty-blocks=false --fast-sync=false
ERROR: invalid argument "false" for "--fast-sync" flag: --fast-sync has been deprecated, please use --enable-block-sync
```

When using one of the old config file parameters, the following is printed:
```
./build/tendermint start --proxy-app=kvstore --consensus.create-empty-blocks=false
ERROR: error in config file: a configuration parameter named 'fast-sync' was found in the configuration file. The 'fast-sync' parameter has been renamed to 'enable-block-sync', please update the 'fast-sync' field in your configuration file to 'enable-block-sync'
```
2021-09-08 13:58:12 +00:00
Callum Waters
8fe651ba30 e2e: clean up generation of evidence (#6904) 2021-09-07 12:20:43 +02:00
Sam Kleinman
1f8bb74bba e2e: skip assertions for stateless nodes (#6894) 2021-09-03 13:25:36 -04:00
Sam Kleinman
77615b900f e2e: wait for all nodes rather than just one (#6892) 2021-09-03 13:03:16 -04:00
Sam Kleinman
21b5e5931a e2e: skip light clients when waiting for height (#6891) 2021-09-03 10:19:15 -04:00
Callum Waters
bda948e814 statesync: implement p2p state provider (#6807) 2021-09-02 13:19:18 +02:00
Sam Kleinman
511bd3eb7f e2e: weight protocol dimensions (#6884)
This changes the focus of the e2e suite, to (roughly) focus on
configurations that are more well used. Most production users of
tendermint run ABCI application in process and the GRPC/socket methods
cover the vast majority of the remaining use cases. 

Perhaps we should consider drop support unix domain sockets in a
future release, but I think in the mean time it's useful to have the
tests *mostly* focus on the primary use cases.
2021-09-01 19:52:40 +00:00
Sam Kleinman
9a0081f076 e2e: change restart mechanism (#6883) 2021-09-01 12:49:45 -04:00
Sam Kleinman
7169d26ddf e2e: more reliable method for selecting node to inject evidence (#6880)
In retrospect my previous implementation of this node, could get
unlucky and never find the correct node. This method is more reliable.
2021-08-31 21:56:06 +00:00
Sam Kleinman
0df421b37f e2e: add weighted random configuration selector (#6869)
When revwing #6807 I assumed that `probSetChoice` worked this way. 

I think that the coverage of various configuration options should
generally track what we expect the actual useage to be to focus the
most test coverage on the configurations that are the most prevelent.
2021-08-27 16:32:40 +00:00
Sam Kleinman
6e921f6644 p2p: change default to use new stack (#6862)
This is just a configuration change to default to using the new stack
unless explicitly disabled (e.g. `UseLegacy`) this renames the
configuration value and makes the configuration logic more clear.

The legacy option is good to retain as a fallback if the new stack has
issues operationally, but we should make sure that most of the time
we're using the new stack.
2021-08-25 17:33:38 +00:00
Sam Kleinman
9c8379ef30 e2e: more consistent node selection during tests (#6857)
In the transaction load generator, the e2e test harness previously distributed load randomly to hosts, which was a source of test non-determinism. This change distributes the load generation to the different nodes in the set in a round robin fashion, to produce more reliable results, but does not otherwise change the behavior of the test harness.
2021-08-25 12:24:01 +00:00
Sam Kleinman
a374f74f7c e2e: cleanup node start function (#6842)
I realized after my last commit that my change made a following line of code a bit redundant.

(alternatively my last change was redunadnt to the existing code.)

I took this oppertunity to make some minor cleanups and logging changes to the node changes which I hope will make tests a bit more clear.
2021-08-20 17:26:04 +00:00
Sam Kleinman
a4cc8317da e2e: avoid starting nodes from the future (#6835) 2021-08-18 14:33:28 -04:00