Commit Graph

707 Commits

Author SHA1 Message Date
Cyrus Goh
5182ffee25 docs: master → docs-staging (#5990)
* Makefile: always pull image in proto-gen-docker. (#5953)

The `proto-gen-docker` target didn't pull an updated Docker image, and would use a local image if present which could be outdated and produce wrong results.

* test: fix TestPEXReactorRunning data race (#5955)

Fixes #5941.

Not entirely sure that this will fix the problem (couldn't reproduce), but in any case this is an artifact of a hack in the P2P transport refactor to make it work with the legacy P2P stack, and will be removed when the refactor is done anyway.

* test/fuzz: move fuzz tests into this repo (#5918)

Co-authored-by: Emmanuel T Odeke <emmanuel@orijtech.com>

Closes #5907

- add init-corpus to blockchain reactor
- remove validator-set FromBytes test
now that we have proto, we don't need to test it! bye amino
- simplify mempool test
do we want to test remote ABCI app?
- do not recreate mux on every crash in jsonrpc test
- update p2p pex reactor test
- remove p2p/listener test
the API has changed + I did not understand what it's tested anyway
- update secretconnection test
- add readme and makefile
- list inputs in readme
- add nightly workflow
- remove blockchain fuzz test
EncodeMsg / DecodeMsg no longer exist

* docker: dont login when in PR (#5961)

* docker: release Linux/ARM64 image (#5925)

Co-authored-by: Marko <marbar3778@yahoo.com>

* p2p: make PeerManager.DialNext() and EvictNext() block (#5947)

See #5936 and #5938 for background.

The plan was initially to have `DialNext()` and `EvictNext()` return a channel. However, implementing this became unnecessarily complicated and error-prone. As an example, the channel would be both consumed and populated (via method calls) by the same driving method (e.g. `Router.dialPeers()`) which could easily cause deadlocks where a method call blocked while sending on the channel that the caller itself was responsible for consuming (but couldn't since it was busy making the method call). It would also require a set of goroutines in the peer manager that would interact with the goroutines in the router in non-obvious ways, and fully populating the channel on startup could cause deadlocks with other startup tasks. Several issues like these made the solution hard to reason about.

I therefore simply made `DialNext()` and `EvictNext()` block until the next peer was available, using internal triggers to wake these methods up in a non-blocking fashion when any relevant state changes occurred. This proved much simpler to reason about, since there are no goroutines in the peer manager (except for trivial retry timers), nor any blocking channel sends, and it instead relies entirely on the existing goroutine structure of the router for concurrency. This also happens to be the same pattern used by the `Transport.Accept()` API, following Go stdlib conventions, so all router goroutines end up using a consistent pattern as well.

* libs/log: format []byte as hexidecimal string (uppercased) (#5960)

Closes: #5806 

Co-authored-by: Lanie Hei <heixx011@umn.edu>

* docs: log level docs (#5945)

## Description

add section on configuring log levels

Closes: #XXX

* .github: fix fuzz-nightly job (#5965)

outputs is a property of the job, not an individual step.

* e2e: add control over the log level of nodes (#5958)

* mempool: fix reactor tests (#5967)

## Description

Update the faux router to either drop channel errors or handle them based on an argument. This prevents deadlocks in tests where we try to send an error on the mempool channel but there is no reader.

Closes: #5956

* p2p: improve peerStore prototype (#5954)

This improves the `peerStore` prototype by e.g.:

* Using a database with Protobuf for persistence, but also keeping full peer set in memory for performance.
* Simplifying the API, by taking/returning struct copies for safety, and removing errors for in-memory operations.
* Caching the ranked peer set, as a temporary solution until a better data structure is implemented.
* Adding `PeerManagerOptions.MaxPeers` and pruning the peer store (based on rank) when it's full.
* Rewriting `PeerAddress` to be independent of `url.URL`, normalizing it and tightening semantics.

* p2p: simplify PeerManager upgrade logic (#5962)

Follow-up from #5947, branched off of #5954.

This simplifies the upgrade logic by adding explicit eviction requests, which can also be useful for other use-cases (e.g. if we need to ban a peer that's misbehaving). Changes:

* Add `evict` map which queues up peers to explicitly evict.
* `upgrading` now only tracks peers that we're upgrading via dialing (`DialNext` → `Dialed`/`DialFailed`).
* `Dialed` will unmark `upgrading`, and queue `evict` if still beyond capacity.
* `Accepted` will pick a random lower-scored peer to upgrade to, if appropriate, and doesn't care about `upgrading` (the dial will fail later, since it's already connected).
* `EvictNext` will return a peer scheduled in `evict` if any, otherwise if beyond capacity just evict the lowest-scored peer.

This limits all of the `upgrading` logic to `DialNext`, `Dialed`, and `DialFailed`, making it much simplier, and it should generally do the right thing in all cases I can think of.

* p2p: add PeerManager.Advertise() (#5957)

Adds a naïve `PeerManager.Advertise()` method that the new PEX reactor can use to fetch addresses to advertise, as well as some other `FIXME`s on address advertisement.

* blockchain v0: fix waitgroup data race (#5970)

## Description

Fixes the data race in usage of `WaitGroup`. Specifically, the case where we invoke `Wait` _before_ the first delta `Add` call when the current waitgroup counter is zero. See https://golang.org/pkg/sync/#WaitGroup.Add.

Still not sure how this manifests itself in a test since the reactor has to be stopped virtually immediately after being started (I think?).

Regardless, this is the appropriate fix.

closes: #5968

* tests: fix `make test` (#5966)

## Description
 
- bump deadlock dep to master
  - fixes `make test` since we now use `deadlock.Once`

Closes: #XXX

* terminate go-fuzz gracefully (w/ SIGINT) (#5973)

and preserve exit code.

```
2021/01/26 03:34:49 workers: 2, corpus: 4 (8m28s ago), crashers: 0, restarts: 1/9976, execs: 11013732 (21596/sec), cover: 121, uptime: 8m30s
make: *** [fuzz-mempool] Terminated
Makefile:5: recipe for target 'fuzz-mempool' failed
Error: Process completed with exit code 124.
```

https://github.com/tendermint/tendermint/runs/1766661614

`continue-on-error` should make GH ignore any error codes.

* p2p: add prototype PEX reactor for new stack (#5971)

This adds a prototype PEX reactor for the new P2P stack.

* proto/p2p: rename PEX messages and fields (#5974)

Fixes #5899 by renaming a bunch of P2P Protobuf entities (while maintaining wire compatibility):

* `Message` to `PexMessage` (as it's only used for PEX messages).
* `PexAddrs` to `PexResponse`.
* `PexResponse.Addrs` to `PexResponse.Addresses`.
* `NetAddress` to `PexAddress` (as it's only used by PEX).

* p2p: resolve PEX addresses in PEX reactor (#5980)

This changes the new prototype PEX reactor to resolve peer address URLs into IP/port PEX addresses itself. Branched off of #5974.

I've spent some time thinking about address handling in the P2P stack. We currently use `PeerAddress` URLs everywhere, except for two places: when dialing a peer, and when exchanging addresses via PEX. We had two options:

1. Resolve addresses to endpoints inside `PeerManager`. This would introduce a lot of added complexity: we would have to track connection statistics per endpoint, have goroutines that asynchronously resolve and refresh these endpoints, deal with resolve scheduling before dialing (which is trickier than it sounds since it involves multiple goroutines in the peer manager and router and messes with peer rating order), handle IP address visibility issues, and so on.

2. Resolve addresses to endpoints (IP/port) only where they're used: when dialing, and in PEX. Everywhere else we use URLs.

I went with 2, because this significantly simplifies the handling of hostname resolution, and because I really think the PEX reactor should migrate to exchanging URLs instead of IP/port numbers anyway -- this allows operators to use DNS names for validators (and can easily migrate them to new IPs and/or load balance requests), and also allows different protocols (e.g. QUIC and `MemoryTransport`). Happy to discuss this.

* test/p2p: close transports to avoid goroutine leak failures (#5982)

* mempool: fix TestReactorNoBroadcastToSender (#5984)

## Description

Looks like I missed a test in the original PR when fixing the tests.

Closes: #5956

* mempool: fix mempool tests timeout (#5988)

* p2p: use stopCtx when dialing peers in Router (#5983)

This ensures we don't leak dial goroutines when shutting down the router.

* docs: fix typo in state sync example (#5989)

Co-authored-by: Erik Grinaker <erik@interchain.berlin>
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
Co-authored-by: Marko <marbar3778@yahoo.com>
Co-authored-by: odidev <odidev@puresoftware.com>
Co-authored-by: Lanie Hei <heixx011@umn.edu>
Co-authored-by: Callum Waters <cmwaters19@gmail.com>
Co-authored-by: Aleksandr Bezobchuk <alexanderbez@users.noreply.github.com>
Co-authored-by: Sergey <52304443+c29r3@users.noreply.github.com>
2021-01-26 11:46:21 -08:00
Callum
af723eca8a use correct source of evidence time
Conflicting votes are now sent to the evidence pool to form duplicate vote evidence only once
the height of the evidence is finished and the time of the block finalised.
2021-01-19 16:00:02 +01:00
Callum Waters
bada08c50c state sync: last consensus params height is not set (#5889) 2021-01-12 14:41:16 +01:00
Callum Waters
5b698ed13b tx indexer: use different field separator for keys (#5865) 2021-01-12 10:55:47 +01:00
Callum Waters
03a6fb2777 state: prune states using an iterator (#5864) 2021-01-08 17:05:27 +01:00
Callum Waters
9b9222f461 store: order-preserving varint key encoding (#5771) 2021-01-05 16:53:26 +01:00
Anton Kaliaev
e13b4386ff abci: modify Client interface and socket client (#5673)
`abci.Client`:

- Sync and Async methods now accept a context for cancellation
    * grpc client uses context to cancel both Sync and Async requests
    * local client ignores context parameter
    * socket client uses context to cancel Sync requests and to drop Async requests before sending them if context was cancelled prior to that

- Async methods return an error
    * socket client returns an error immediately if queue is full for Async requests
    * local client always returns nil error
    * grpc client returns an error if context was cancelled before we got response or the receiving queue had a space for response (do not confuse with the sending queue from the socket client)

- specify clients semantics in [doc.go](https://raw.githubusercontent.com/tendermint/tendermint/27112fffa62276bc016d56741f686f0f77931748/abci/client/doc.go)

`mempool.TxInfo`

- add optional `Context` to `TxInfo`, which can be used to cancel `CheckTx` request

Closes #5190
2020-11-30 16:46:16 +04:00
Callum Waters
3922dde05d evidence: structs can independently form abci evidence (#5610) 2020-11-04 17:14:48 +01:00
Callum Waters
d1ef5028a0 block: fix max commit sig size (#5567) 2020-10-26 10:18:59 +01:00
Callum Waters
55e8ccab21 block: use commit sig size instead of vote size (#5490) 2020-10-13 17:21:37 +02:00
Marko
e1644d00c5 mempool: length prefix txs when getting them from mempool (#5483)
## Description

In protobuf `[]byte` is varint encoded. When adding txs to the block we were not taking this into account. 


Closes: #XXX
2020-10-13 10:33:21 +00:00
Marko
346aa14db5 fix lint failures with 1.31 (#5489) 2020-10-13 10:22:53 +02:00
dependabot[bot]
710ed63f55 build(deps): Bump technote-space/get-diff-action from v3 to v4 (#5485)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Marko Baricevic <marbar3778@yahoo.com>
2020-10-12 14:13:45 +02:00
Marko
82e4693cc5 abci: remove setOption (#5447)
Remove Response/Request SetOption from ABCI.

Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
2020-10-08 19:12:12 +02:00
Callum Waters
302aec6dcc evidence: use bytes instead of quantity to limit size (#5449)
## Description

Closes: #5432
2020-10-07 09:29:52 +00:00
Anton Kaliaev
1635d1339c state: more test cases for block validation (#5415)
Closes #2589
2020-09-28 11:57:35 +00:00
Anton Kaliaev
4b99502d5b config: set time_iota_ms to timeout_commit in test genesis (#5386)
also, document consensus parameters.
https://forum.cosmos.network/t/consensus-timeouts-explained/1421

Closes #4489
2020-09-23 12:24:45 +04:00
Erik Grinaker
58b4deca86 blockstore: fix race conditions when loading data (#5382)
Fixes #5377 and comments in #4588 (review).
2020-09-23 10:13:43 +04:00
Callum Waters
ed002cea7e evidence: introduction of LightClientAttackEvidence and refactor of evidence lifecycle (#5361)
evidence: modify evidence types (#5342)

light: detect light client attacks (#5344)

evidence: refactor evidence pool (#5345)

abci: application evidence prepared by evidence pool (#5354)
2020-09-22 10:22:54 +02:00
Marko
56911ee352 state: define interface for state store (#5348)
## Description

Make an interface for the state store. 

Closes: #5213
2020-09-15 07:45:48 +00:00
Marko
0ed8dba991 lint: enable errcheck (#5336)
## Description

Enable errcheck linter throughout the codebase

Closes: #5059
2020-09-07 15:03:18 +00:00
Marko
b8d08b9ef4 lint: add errchecks (#5316)
## Description

Work towards enabling errcheck

ref #5059
2020-09-04 11:58:03 +00:00
Marko
fbdf8b098e mocks: update with 2.2.1 (#5294)
## Description

When downloading mockery I ran into an issue where we were using the old version. This PR updates to a more recent version.

changelog?

Closes: #XXX
2020-08-26 15:28:46 +00:00
Callum Waters
b7f6e47a42 evidence: modularise evidence by moving verification function into evidence package (#5234) 2020-08-20 18:11:21 +02:00
Erik Grinaker
feaa1ed17e state: don't save genesis state in database when loaded (#5231)
Fixes #5138. I don't have a strong opinion on this, but find it sort of odd that `Load` functions actually save as well.
2020-08-12 08:24:44 +00:00
Erik Grinaker
cc247c091b genesis: add support for arbitrary initial height (#5191)
Adds a genesis parameter `initial_height` which specifies the initial block height, as well as ABCI `RequestInitChain.InitialHeight` to pass it to the ABCI application, and `State.InitialHeight` to keep track of the initial height throughout the code. Fixes #2543, based on [RFC-002](https://github.com/tendermint/spec/pull/119). Spec changes in https://github.com/tendermint/spec/pull/135.
2020-08-11 17:03:28 +00:00
Callum Waters
312c4f8fe1 evidence: change evidence time to block time (#5219)
adds blockstore interface to evidence and adds fix to byzantine test
2020-08-11 14:39:07 +02:00
Callum Waters
68468fb024 evidence: fix usage of time field in abci evidence (#5201)
* fix usage of time in abci evidence

* update changelong and upgrading

* add test cases
2020-08-04 12:58:48 +02:00
Callum Waters
3c21c3546c evidence: remove phantom validator evidence (#5181) 2020-07-31 12:23:58 +02:00
Anton Kaliaev
4d43bfe3bd state: revert event hashing (#5159)
See ADR 058

Closes #5113

Spec PR: https://github.com/tendermint/spec/pull/122
2020-07-30 09:15:08 +00:00
Marko
dc71f265aa types: check if nil or empty valset (#5167)
Solves #5138 in the way that if a validatorSet is nil or empty it will not try to transform it to protobug

Co-authored-by: Callum Michael Waters <cmwaters19@gmail.com>
2020-07-29 20:16:42 +02:00
Callum Waters
b5a5f9274d evidence: minor correction to potential amnesia ev validate basic (#5151)
ValidateBasic() for PotentialAmnesiaEvidence checks that the rounds of the two votes are different and does not check Vote Type

ValidateBasic() now also ensures that the first block is not a nil block (else the validator hasn't actually locked onto a block)
2020-07-27 16:12:42 +02:00
Marko
7c8c356f71 ci: version linter fix (#5128)
## Description
linter version fix and run make format to have all ci run

Closes: #XXX
2020-07-16 09:01:02 +00:00
Marko
6ccccb0933 lint: errcheck (#5091)
## Description

add more error checks to tests


gonna do a third PR that tackles the non test cases
2020-07-14 11:04:41 +00:00
Callum Waters
37545bab88 evidence: new evidence event subscription (#5108) 2020-07-13 11:06:44 +02:00
Callum Waters
a97d05be4d evidence: check lunatic vote matches header (#5093) 2020-07-08 17:58:03 +02:00
Marko
7e2cc1db5e linter: (1/2) enable errcheck (#5064)
## Description

partially cleanup in preparation for errcheck

i ignored a bunch of defer errors in tests but with the update to go 1.14 we can use `t.Cleanup(func() { if err := <>; err != nil {..}}` to cover those errors, I will do this in pr number two of enabling errcheck.

ref #5059
2020-07-01 15:13:11 +00:00
Erik Grinaker
04b8cf7879 deps: bump tm-db to 0.6.0 (#5058) 2020-06-29 16:07:37 +02:00
Callum Waters
3ecc0ffe7e evidence: replace mock evidence with mocked duplicate vote evidence (#5036) 2020-06-24 07:24:30 +02:00
Marko
5412426ae8 indexer: remove index filtering (#5006)
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
2020-06-23 17:42:10 +02:00
Callum Waters
65d7ce9c9c evidence: improve amnesia evidence handling (#5003)
fix bug so that PotentialAmnesiaEvidence is being gossiped

handle inbound amnesia evidence correctly

add method to check if potential amnesia evidence is on trial

fix a bug with the height when we upgrade to amnesia evidence

change evidence to using just pointers.

More logging in the evidence module

Co-authored-by: Marko <marbar3778@yahoo.com>
2020-06-23 17:09:14 +02:00
Anton Kaliaev
ceac02b891 types: add AppVersion to ConsensusParams (#5031)
Co-authored-by: JamesRay <66258875@qq.com>

making it possible to change app version via EndBlock
2020-06-23 12:14:16 +04:00
Marko
dedf0d2350 proto: folder structure adhere to buf (#5025) 2020-06-22 10:00:51 +02:00
Marko
51da4fe356 types: rename partsheader to partsetheader (#5029)
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2020-06-22 09:38:03 +02:00
Marko
b8b50733f0 encoding: remove codecs (#4996)
## Description

This pr removes amino from tendermint. 

Closes: #4278
2020-06-15 11:17:12 +00:00
Marko
74cae49c3b proto: leftover amino (#4986) 2020-06-15 11:14:36 +02:00
Anton Kaliaev
a8d8600308 [block#LastResultsHash] add Events + GasWanted/Used (#4845)
Closes #1007
2020-06-12 12:49:14 +04:00
Marko
bdac0818ac p2p: proto leftover (#4995)
## Description

removing codec.go from p2p pkg and some leftover amino encoding

Closes: #XXX
2020-06-11 14:37:29 +00:00
Marko
f6243d8b9e privval: migrate to protobuf (#4985) 2020-06-11 11:54:02 +02:00
Marko
31a361d119 proto: move keys to oneof (#4983) 2020-06-11 11:10:37 +02:00