Commit Graph

112 Commits

Author SHA1 Message Date
William Banfield
8b326810f5 Merge remote-tracking branch 'origin/wb/statesync-init-deadlock' into wb/statesync-init-deadlock 2021-09-30 15:06:43 -04:00
William Banfield
3185eff8c6 add comment to cleanup case 2021-09-30 15:05:42 -04:00
mergify[bot]
51193fdb75 Merge branch 'master' into wb/statesync-init-deadlock 2021-09-30 18:29:20 +00:00
William Banfield
8a0cd32ecf lint++ 2021-09-30 13:57:59 -04:00
William Banfield
6ccc2aef9e lint++ 2021-09-30 13:53:08 -04:00
M. J. Fromberger
bdd815ebc9 Align atomic struct field for compatibility in 32-bit ABIs. (#7037)
The layout of struct fields means that interior fields may not be properly
aligned for 64-bit access.

Fixes #7000.
2021-09-30 10:53:05 -07:00
William Banfield
530814991f Update internal/statesync/syncer.go
Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
2021-09-30 13:46:16 -04:00
William Banfield
93719f4494 minor 2021-09-30 13:43:20 -04:00
William Banfield
28eacd75c2 revert close ch order change 2021-09-30 13:42:54 -04:00
William Banfield
0d72e89ac4 recover on send on closed channel in statesync 2021-09-30 13:40:21 -04:00
William Banfield
6a0d9c832a blocksync: fix shutdown deadlock issue (#7030)
When shutting down blocksync, it is observed that the process can hang completely. A dump of running goroutines reveals that this is due to goroutines not listening on the correct shutdown signal. Namely, the `poolRoutine` goroutine does not wait on `pool.Quit`. The `poolRoutine` does not receive any other shutdown signal during `OnStop` becuase it must stop before the `r.closeCh` is closed. Currently the `poolRoutine` listens in the `closeCh` which will not close until the `poolRoutine` stops and calls `poolWG.Done()`.

This change also puts the `requestRoutine()` in the `OnStart` method to make it more visible since it does not rely on anything that is spawned in the `poolRoutine`.

```
goroutine 183 [semacquire]:
sync.runtime_Semacquire(0xc0000d3bd8)
        runtime/sema.go:56 +0x45
sync.(*WaitGroup).Wait(0xc0000d3bd0)
        sync/waitgroup.go:130 +0x65
github.com/tendermint/tendermint/internal/blocksync/v0.(*Reactor).OnStop(0xc0000d3a00)
        github.com/tendermint/tendermint/internal/blocksync/v0/reactor.go:193 +0x47
github.com/tendermint/tendermint/libs/service.(*BaseService).Stop(0xc0000d3a00, 0x0, 0x0)
        github.com/tendermint/tendermint/libs/service/service.go:171 +0x323
github.com/tendermint/tendermint/node.(*nodeImpl).OnStop(0xc00052c000)
        github.com/tendermint/tendermint/node/node.go:758 +0xc62
github.com/tendermint/tendermint/libs/service.(*BaseService).Stop(0xc00052c000, 0x0, 0x0)
        github.com/tendermint/tendermint/libs/service/service.go:171 +0x323
github.com/tendermint/tendermint/cmd/tendermint/commands.NewRunNodeCmd.func1.1()
        github.com/tendermint/tendermint/cmd/tendermint/commands/run_node.go:143 +0x62
github.com/tendermint/tendermint/libs/os.TrapSignal.func1(0xc000df6d20, 0x7f04a68da900, 0xc0004a8930, 0xc0005a72d8)
        github.com/tendermint/tendermint/libs/os/os.go:26 +0x102
created by github.com/tendermint/tendermint/libs/os.TrapSignal
        github.com/tendermint/tendermint/libs/os/os.go:22 +0xe6


goroutine 161 [select]:
github.com/tendermint/tendermint/internal/blocksync/v0.(*Reactor).poolRoutine(0xc0000d3a00, 0x0)
        github.com/tendermint/tendermint/internal/blocksync/v0/reactor.go:464 +0x2b3
created by github.com/tendermint/tendermint/internal/blocksync/v0.(*Reactor).OnStart
        github.com/tendermint/tendermint/internal/blocksync/v0/reactor.go:174 +0xf1

goroutine 162 [select]:
github.com/tendermint/tendermint/internal/blocksync/v0.(*Reactor).processBlockSyncCh(0xc0000d3a00)
        github.com/tendermint/tendermint/internal/blocksync/v0/reactor.go:310 +0x151
created by github.com/tendermint/tendermint/internal/blocksync/v0.(*Reactor).OnStart
        github.com/tendermint/tendermint/internal/blocksync/v0/reactor.go:177 +0x54

goroutine 163 [select]:
github.com/tendermint/tendermint/internal/blocksync/v0.(*Reactor).processPeerUpdates(0xc0000d3a00)
        github.com/tendermint/tendermint/internal/blocksync/v0/reactor.go:363 +0x12b
created by github.com/tendermint/tendermint/internal/blocksync/v0.(*Reactor).OnStart
        github.com/tendermint/tendermint/internal/blocksync/v0/reactor.go:178 +0x76
```
2021-09-30 16:19:18 +00:00
William Banfield
db44aeb62b fix test 2021-09-29 17:55:21 -04:00
William Banfield
a070c42777 nest the close channel cases 2021-09-29 17:54:30 -04:00
William Banfield
f8cebbc5af remove syncer done 2021-09-29 16:56:06 -04:00
William Banfield
022148613e fix 2021-09-29 16:54:22 -04:00
William Banfield
becc6c3072 ensure syncer is not closed before trying to send a snapshot discovery message 2021-09-29 16:47:39 -04:00
William Banfield
41384c6899 fix defer 2021-09-29 14:24:18 -04:00
William Banfield
7679cc5015 statesync: remove deadlock on init fail 2021-09-29 14:16:50 -04:00
Sam Kleinman
23fe6fd2f9 statesync: ensure test network properly configured (#7026)
This test reliably gets hung up on network configuration, (which may
be a real issue,) but it's network setup is handcranked and we should
ensure that the test focuses on it's core assertions and doesn't fail for 
test architecture reasons.
2021-09-29 16:38:27 +00:00
Sam Kleinman
8758078786 consensus: avoid unbuffered channel in state test (#7025) 2021-09-29 11:19:34 -04:00
lklimek
1bd1593f20 fix: race condition in p2p_switch and pex_reactor (#7015)
Closes https://github.com/tendermint/tendermint/issues/7014
2021-09-28 09:32:14 -04:00
Sam Kleinman
9a16d930c6 statesync: add logging while waiting for peers (#7007) 2021-09-27 16:46:40 -04:00
Callum Waters
60a6c6fb1a e2e: allow running of single node using the e2e app (#6982) 2021-09-27 15:43:07 +02:00
Sam Kleinman
71c6682b57 statesync: clean up reactor/syncer lifecylce (#6995)
I've been noticing that there are a number of situations where the
statesync reactor blocks waiting for peers (or similar,) I've moved
things around to improve outcomes in local tests.
2021-09-24 21:40:12 +00:00
Sam Kleinman
b203c91799 rpc: implement BroadcastTxCommit without event subscriptions (#6984) 2021-09-24 13:01:35 -04:00
Sam Kleinman
bb8ffcb95b store: move pacakge to internal (#6978) 2021-09-23 11:46:42 -04:00
M. J. Fromberger
cf7537ea5f cleanup: Reduce and normalize import path aliasing. (#6975)
The code in the Tendermint repository makes heavy use of import aliasing.
This is made necessary by our extensive reuse of common base package names, and
by repetition of similar names across different subdirectories.

Unfortunately we have not been very consistent about which packages we alias in
various circumstances, and the aliases we use vary. In the spirit of the advice
in the style guide and https://github.com/golang/go/wiki/CodeReviewComments#imports,
his change makes an effort to clean up and normalize import aliasing.

This change makes no API or behavioral changes. It is a pure cleanup intended
o help make the code more readable to developers (including myself) trying to
understand what is being imported where.

Only unexported names have been modified, and the changes were generated and
applied mechanically with gofmt -r and comby, respecting the lexical and
syntactic rules of Go.  Even so, I did not fix every inconsistency. Where the
changes would be too disruptive, I left it alone.

The principles I followed in this cleanup are:

- Remove aliases that restate the package name.
- Remove aliases where the base package name is unambiguous.
- Move overly-terse abbreviations from the import to the usage site.
- Fix lexical issues (remove underscores, remove capitalization).
- Fix import groupings to more closely match the style guide.
- Group blank (side-effecting) imports and ensure they are commented.
- Add aliases to multiple imports with the same base package name.
2021-09-23 07:52:07 -07:00
M. J. Fromberger
41ac5b90c5 Fix script paths in go:generate directives. (#6973)
We moved some files further down in the directory structure in #6964, which
caused the relative paths to the mockery wrapper to stop working.

There does not seem to be an obvious way to get the module root as a default
environment variable, so for now I just added the extra up-slashes.
2021-09-23 01:02:24 +00:00
Sam Kleinman
1c4950dbd2 state: move package to internal (#6964) 2021-09-22 13:04:25 -04:00
Sam Kleinman
07d10184a1 inspect: remove duplicated construction path (#6966) 2021-09-22 12:32:49 -04:00
JayT106
84ffaaaf37 statesync/rpc: metrics for the statesync and the rpc SyncInfo (#6795) 2021-09-21 09:22:16 +02:00
Sam Kleinman
9dfdc62eb7 proxy: move proxy package to internal (#6953) 2021-09-20 15:18:48 -04:00
William Banfield
382947ce93 rfc: add performance taxonomy rfc (#6921)
This document attempts to capture and discuss some of the areas of Tendermint that seem to be cited as causing performance issue. I'm hoping to continue to gather feedback and input on this document to better understand what issues Tendermint performance may cause for our users. 

The overall goal of this document is to allow the maintainers and community to get a better sense of these issues and to be more capably able to discuss them and weight trade-offs about any proposed performance-focused changes. This document does not aim to propose any performance improvements. It does suggest useful places for benchmarks and places where additional metrics would be useful for diagnosing and further understanding Tendermint performance.

Please comment with areas where my reasoning seems off or with additional areas that Tendermint performance may be causing user pain.
2021-09-16 06:13:27 +00:00
William Banfield
63aeb50665 upgrading: add information into the UPGRADING.md for users of the codebase wishing to upgrade (#6898)
* add information on upgrading to the new p2p library

* clarify p2p backwards compatibility

* reorder p2p queue list

* add demo for p2p selection

* fix spacing in upgrading
2021-09-08 09:41:12 -04:00
Callum Waters
8fe651ba30 e2e: clean up generation of evidence (#6904) 2021-09-07 12:20:43 +02:00
Callum Waters
bda948e814 statesync: implement p2p state provider (#6807) 2021-09-02 13:19:18 +02:00
Sam Kleinman
c4df8a3840 types: move mempool error for consistency (#6875)
This is a little change just to make things more consistent ahead of
the 0.35 release.
2021-08-30 17:42:58 +00:00
Aleksandr Bezobchuk
58a6cfff9a internal/consensus: update error log (#6863)
Issues reported in Osmosis, where the message is extremely long. Also, there is absolutely no reason to log the message IMO. If we must, we can make the message log DEBUG.
2021-08-25 22:43:21 +00:00
Hlib Kanunnikov
d0e33b4292 blocksync: complete transition from Blockchain to BlockSync (#6847) 2021-08-23 16:45:08 +02:00
William Banfield
e801328128 clist: add simple property tests (#6791)
Adds a simple property test to the `clist` package. This test uses the [rapid](https://github.com/flyingmutant/rapid) library and works by modeling the internal clist as a simple array.

Follow up from this mornings workshop with the Regen team.
2021-08-03 19:30:05 +00:00
William Banfield
dc7c212c41 mempool/v1: test reactor does not panic on broadcast (#6772)
This changes adds a failing test for issue #6660. It achieves this by adding a transaction, starting the `broadcastTxRoutine` in a goroutine and then adding another transaction to the mempool. The `broadcastTxRoutine` can receive the second inserted transaction before `insertTx` returns. In that case, `broadcastTxRoutine` will derefence a nil pointer when referencing the `gossipEl` and panic.
2021-08-02 13:02:43 +00:00
William Banfield
4e96c6b234 tools: add mockery to tools.go and remove mockery version strings (#6787)
This change aims to keep versions of mockery consistent across developer laptops.

This change adds mockery to the `tools.go` file so that its version can be managed consistently in the `go.mod` file.

Additionally, this change temporarily disables adding mockery's version number to generated files. There is an outstanding issue against the mockery project related to the version string behavior when running from `go get`. I have created a pull request to fix this issue in the mockery project.
see: https://github.com/vektra/mockery/issues/397
2021-07-30 20:47:15 +00:00
Callum Waters
02f8e4c0bd blockstore: fix problem with seen commit (#6782) 2021-07-30 17:37:04 +02:00
M. J. Fromberger
6dd8984fef Fix and clarify breaks from select cases. (#6781)
Update those break statements inside case clauses that are intended to reach an
enclosing for loop, so that they correctly exit the loop.

The candidate files for this change were located using:

    % staticcheck -checks SA4011 ./... | cut -d: -f-2

This change is intended to preserve the intended semantics of the code, but
since the code as-written did not have its intended effect, some behaviour may
change. Specifically: Some loops may have run longer than they were supposed
to, prior to this change.

In one case I was not able to clearly determine the intended outcome. That case
has been commented but otherwise left as-written.

Fixes #6780.
2021-07-29 22:28:32 -04:00
JayT106
9a2a7d4307 state/privval: vote timestamp fix (#6748) 2021-07-29 12:52:53 +02:00
Callum Waters
6ff4c3139c blockchain: rename to blocksync service (#6755) 2021-07-28 17:25:42 +02:00
JayT106
e87b0391cb cli/indexer: Reindex events (#6676) 2021-07-28 00:04:54 +02:00
Aleksandr Bezobchuk
4f73748bc8 mempool v1: tweak broadcastTxRoutine (#6771) 2021-07-27 15:34:06 -04:00
William Banfield
a751eee7f2 p2p: add test for pqueue dequeue full error (#6760)
This adds a test for closing the `pqueue` while the `pqueue` contains data that has not yet been dequeued.

This issue was found while debugging #6705 

This test will fail until @cmwaters fix for this condition is merged.
2021-07-26 19:22:32 +00:00
Callum Waters
a341a626e0 p2p: avoid blocking on the dequeCh (#6765) 2021-07-26 09:09:07 -04:00