Commit Graph

219 Commits

Author SHA1 Message Date
Cyrus Goh
5182ffee25 docs: master → docs-staging (#5990)
* Makefile: always pull image in proto-gen-docker. (#5953)

The `proto-gen-docker` target didn't pull an updated Docker image, and would use a local image if present which could be outdated and produce wrong results.

* test: fix TestPEXReactorRunning data race (#5955)

Fixes #5941.

Not entirely sure that this will fix the problem (couldn't reproduce), but in any case this is an artifact of a hack in the P2P transport refactor to make it work with the legacy P2P stack, and will be removed when the refactor is done anyway.

* test/fuzz: move fuzz tests into this repo (#5918)

Co-authored-by: Emmanuel T Odeke <emmanuel@orijtech.com>

Closes #5907

- add init-corpus to blockchain reactor
- remove validator-set FromBytes test
now that we have proto, we don't need to test it! bye amino
- simplify mempool test
do we want to test remote ABCI app?
- do not recreate mux on every crash in jsonrpc test
- update p2p pex reactor test
- remove p2p/listener test
the API has changed + I did not understand what it's tested anyway
- update secretconnection test
- add readme and makefile
- list inputs in readme
- add nightly workflow
- remove blockchain fuzz test
EncodeMsg / DecodeMsg no longer exist

* docker: dont login when in PR (#5961)

* docker: release Linux/ARM64 image (#5925)

Co-authored-by: Marko <marbar3778@yahoo.com>

* p2p: make PeerManager.DialNext() and EvictNext() block (#5947)

See #5936 and #5938 for background.

The plan was initially to have `DialNext()` and `EvictNext()` return a channel. However, implementing this became unnecessarily complicated and error-prone. As an example, the channel would be both consumed and populated (via method calls) by the same driving method (e.g. `Router.dialPeers()`) which could easily cause deadlocks where a method call blocked while sending on the channel that the caller itself was responsible for consuming (but couldn't since it was busy making the method call). It would also require a set of goroutines in the peer manager that would interact with the goroutines in the router in non-obvious ways, and fully populating the channel on startup could cause deadlocks with other startup tasks. Several issues like these made the solution hard to reason about.

I therefore simply made `DialNext()` and `EvictNext()` block until the next peer was available, using internal triggers to wake these methods up in a non-blocking fashion when any relevant state changes occurred. This proved much simpler to reason about, since there are no goroutines in the peer manager (except for trivial retry timers), nor any blocking channel sends, and it instead relies entirely on the existing goroutine structure of the router for concurrency. This also happens to be the same pattern used by the `Transport.Accept()` API, following Go stdlib conventions, so all router goroutines end up using a consistent pattern as well.

* libs/log: format []byte as hexidecimal string (uppercased) (#5960)

Closes: #5806 

Co-authored-by: Lanie Hei <heixx011@umn.edu>

* docs: log level docs (#5945)

## Description

add section on configuring log levels

Closes: #XXX

* .github: fix fuzz-nightly job (#5965)

outputs is a property of the job, not an individual step.

* e2e: add control over the log level of nodes (#5958)

* mempool: fix reactor tests (#5967)

## Description

Update the faux router to either drop channel errors or handle them based on an argument. This prevents deadlocks in tests where we try to send an error on the mempool channel but there is no reader.

Closes: #5956

* p2p: improve peerStore prototype (#5954)

This improves the `peerStore` prototype by e.g.:

* Using a database with Protobuf for persistence, but also keeping full peer set in memory for performance.
* Simplifying the API, by taking/returning struct copies for safety, and removing errors for in-memory operations.
* Caching the ranked peer set, as a temporary solution until a better data structure is implemented.
* Adding `PeerManagerOptions.MaxPeers` and pruning the peer store (based on rank) when it's full.
* Rewriting `PeerAddress` to be independent of `url.URL`, normalizing it and tightening semantics.

* p2p: simplify PeerManager upgrade logic (#5962)

Follow-up from #5947, branched off of #5954.

This simplifies the upgrade logic by adding explicit eviction requests, which can also be useful for other use-cases (e.g. if we need to ban a peer that's misbehaving). Changes:

* Add `evict` map which queues up peers to explicitly evict.
* `upgrading` now only tracks peers that we're upgrading via dialing (`DialNext` → `Dialed`/`DialFailed`).
* `Dialed` will unmark `upgrading`, and queue `evict` if still beyond capacity.
* `Accepted` will pick a random lower-scored peer to upgrade to, if appropriate, and doesn't care about `upgrading` (the dial will fail later, since it's already connected).
* `EvictNext` will return a peer scheduled in `evict` if any, otherwise if beyond capacity just evict the lowest-scored peer.

This limits all of the `upgrading` logic to `DialNext`, `Dialed`, and `DialFailed`, making it much simplier, and it should generally do the right thing in all cases I can think of.

* p2p: add PeerManager.Advertise() (#5957)

Adds a naïve `PeerManager.Advertise()` method that the new PEX reactor can use to fetch addresses to advertise, as well as some other `FIXME`s on address advertisement.

* blockchain v0: fix waitgroup data race (#5970)

## Description

Fixes the data race in usage of `WaitGroup`. Specifically, the case where we invoke `Wait` _before_ the first delta `Add` call when the current waitgroup counter is zero. See https://golang.org/pkg/sync/#WaitGroup.Add.

Still not sure how this manifests itself in a test since the reactor has to be stopped virtually immediately after being started (I think?).

Regardless, this is the appropriate fix.

closes: #5968

* tests: fix `make test` (#5966)

## Description
 
- bump deadlock dep to master
  - fixes `make test` since we now use `deadlock.Once`

Closes: #XXX

* terminate go-fuzz gracefully (w/ SIGINT) (#5973)

and preserve exit code.

```
2021/01/26 03:34:49 workers: 2, corpus: 4 (8m28s ago), crashers: 0, restarts: 1/9976, execs: 11013732 (21596/sec), cover: 121, uptime: 8m30s
make: *** [fuzz-mempool] Terminated
Makefile:5: recipe for target 'fuzz-mempool' failed
Error: Process completed with exit code 124.
```

https://github.com/tendermint/tendermint/runs/1766661614

`continue-on-error` should make GH ignore any error codes.

* p2p: add prototype PEX reactor for new stack (#5971)

This adds a prototype PEX reactor for the new P2P stack.

* proto/p2p: rename PEX messages and fields (#5974)

Fixes #5899 by renaming a bunch of P2P Protobuf entities (while maintaining wire compatibility):

* `Message` to `PexMessage` (as it's only used for PEX messages).
* `PexAddrs` to `PexResponse`.
* `PexResponse.Addrs` to `PexResponse.Addresses`.
* `NetAddress` to `PexAddress` (as it's only used by PEX).

* p2p: resolve PEX addresses in PEX reactor (#5980)

This changes the new prototype PEX reactor to resolve peer address URLs into IP/port PEX addresses itself. Branched off of #5974.

I've spent some time thinking about address handling in the P2P stack. We currently use `PeerAddress` URLs everywhere, except for two places: when dialing a peer, and when exchanging addresses via PEX. We had two options:

1. Resolve addresses to endpoints inside `PeerManager`. This would introduce a lot of added complexity: we would have to track connection statistics per endpoint, have goroutines that asynchronously resolve and refresh these endpoints, deal with resolve scheduling before dialing (which is trickier than it sounds since it involves multiple goroutines in the peer manager and router and messes with peer rating order), handle IP address visibility issues, and so on.

2. Resolve addresses to endpoints (IP/port) only where they're used: when dialing, and in PEX. Everywhere else we use URLs.

I went with 2, because this significantly simplifies the handling of hostname resolution, and because I really think the PEX reactor should migrate to exchanging URLs instead of IP/port numbers anyway -- this allows operators to use DNS names for validators (and can easily migrate them to new IPs and/or load balance requests), and also allows different protocols (e.g. QUIC and `MemoryTransport`). Happy to discuss this.

* test/p2p: close transports to avoid goroutine leak failures (#5982)

* mempool: fix TestReactorNoBroadcastToSender (#5984)

## Description

Looks like I missed a test in the original PR when fixing the tests.

Closes: #5956

* mempool: fix mempool tests timeout (#5988)

* p2p: use stopCtx when dialing peers in Router (#5983)

This ensures we don't leak dial goroutines when shutting down the router.

* docs: fix typo in state sync example (#5989)

Co-authored-by: Erik Grinaker <erik@interchain.berlin>
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
Co-authored-by: Marko <marbar3778@yahoo.com>
Co-authored-by: odidev <odidev@puresoftware.com>
Co-authored-by: Lanie Hei <heixx011@umn.edu>
Co-authored-by: Callum Waters <cmwaters19@gmail.com>
Co-authored-by: Aleksandr Bezobchuk <alexanderbez@users.noreply.github.com>
Co-authored-by: Sergey <52304443+c29r3@users.noreply.github.com>
2021-01-26 11:46:21 -08:00
Aleksandr Bezobchuk
68bd2116f0 mempool: p2p refactor (#5919) 2021-01-22 09:34:12 -05:00
Marko
09cf0bcb01 privval: add grpc (#5725)
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
2021-01-06 10:49:30 -08:00
Erik Grinaker
a0d4d85375 os: simplify EnsureDir() (#5871)
#5852 fixed an issue with error propagation in `os.EnsureDir()`. However, this function is basically identical to `os.MkdirAll()`, and can be replaced entirely with a call to it. We keep the function for backwards compatibility.
2021-01-06 15:27:35 +00:00
Erik Grinaker
1ccd23ca1d p2p: fix MConnection inbound traffic statistics and rate limiting (#5868)
Fixes #5866. Inbound traffic monitoring (and by extension inbound rate limiting) was inadvertently removed in 660e72a.
2021-01-06 15:38:23 +01:00
Erik Grinaker
9c47b572f7 libs/os: EnsureDir now returns IO errors and checks file type (#5852)
Fixes #5839.
2021-01-04 14:30:38 +00:00
Erik Grinaker
bcfc889f25 p2p: implement new Transport interface (#5791)
This implements a new `Transport` interface and related types for the P2P refactor in #5670. Previously, `conn.MConnection` was very tightly coupled to the `Peer` implementation -- in order to allow alternative non-multiplexed transports (e.g. QUIC), MConnection has now been moved below the `Transport` interface, as `MConnTransport`, and decoupled from the peer. Since the `p2p` package is not covered by our Go API stability, this is not considered a breaking change, and not listed in the changelog.

The initial approach was to implement the new interface in its final form (which also involved possible protocol changes, see https://github.com/tendermint/spec/pull/227). However, it turned out that this would require a large amount of changes to existing P2P code because of the previous tight coupling between `Peer` and `MConnection` and the reliance on subtleties in the MConnection behavior. Instead, I have broadened the `Transport` interface to expose much of the existing MConnection interface, preserved much of the existing MConnection logic and behavior in the transport implementation, and tried to make as few changes to the rest of the P2P stack as possible. We will instead reduce this interface gradually as we refactor other parts of the P2P stack.

The low-level transport code and protocol (e.g. MConnection, SecretConnection and so on) has not been significantly changed, and refactoring this is not a priority until we come up with a plan for QUIC adoption, as we may end up discarding the MConnection code entirely.

There are no tests of the new `MConnTransport`, as this code is likely to evolve as we proceed with the P2P refactor, but tests should be added before a final release. The E2E tests are sufficient for basic validation in the meanwhile.
2020-12-15 15:08:16 +00:00
Anton Kaliaev
b1bbd37519 libs/bits: validate BitArray in FromProto (#5720)
Closes #5705
2020-12-01 12:44:56 +00:00
Marko
781f4badc3 ci: build for 32 bit, libs: fix overflow (#5700) 2020-11-26 16:12:25 +01:00
Callum Waters
909da42789 light: make fraction parts uint64, ensuring that it is always positive (#5655) 2020-11-17 14:23:16 +01:00
Alessio Treglia
8bd3d5105f libs/os: remove unused aliases, add test cases (#5654)
Remove unused ReadFile (unused) and
WriteFile (almost unused, alias of ioutil.WriteFile).

Add testcases for Must{Read,Write}File.
2020-11-13 10:59:45 +00:00
Alessio Treglia
eb0d353767 libs/os: add test case for TrapSignal (#5646) 2020-11-12 10:19:05 +04:00
Anton Kaliaev
8e6194626e light: model-based tests (#5461)
This is the first iteration of model-based testing in Go Tendermint. The test runner is using the static JSON fixtures located under the ./json directory. In the future, the Rust tensgen binary will be used to generate those (given the static intermediate scenarios and the test seed, which will be published along with each testgen release).

Closes: #5322
2020-11-02 12:07:18 +04:00
Marko
346aa14db5 fix lint failures with 1.31 (#5489) 2020-10-13 10:22:53 +02:00
Marko
0ed8dba991 lint: enable errcheck (#5336)
## Description

Enable errcheck linter throughout the codebase

Closes: #5059
2020-09-07 15:03:18 +00:00
Anton Kaliaev
4827ed121f libs/bits: inline defer and change order of mutexes (#5187)
* libs/bits: inline defer and change order of mutexes

Closes #3217

* abci/client: unexpose StopForError func

a) it's not called outside
b) the reason for exposing it in the first place is unclear
c) Stop already exist if someone from outside wants to stop the client
2020-08-26 13:18:30 +04:00
Marko
a6032f4183 fix go 1.15 errors (#5277)
## Description

fix golang 1.15 errors. 

Closes: #XXX
2020-08-24 12:26:05 +00:00
Marko
42e4e8b58e lint: add markdown linter (#5254) 2020-08-17 16:40:50 +02:00
Sad Pencil
62d09ccc10 libs/rand: fix "out-of-memory" error on unexpected argument (#5215) 2020-08-10 10:42:14 +02:00
Marko
2d167aefcf ci: freeze golangci action version (#5196)
## Description

This PR updates golang-ci to latest and stops looking at master for the action. 

Closes: #XXX
2020-08-03 07:57:06 +00:00
Callum Waters
cf84dcd44c light cli: add feature flags and save providers (#5148) 2020-07-28 12:11:55 +02:00
n-hutton
375f0c819f add fixes for flaky tests (#5146)
While working on tendermint my colleague @jinmannwong fixed a few of the unit tests that we found to be flaky in our CI. We thought that you might find this useful, see below for comments.
2020-07-27 10:36:56 +04:00
Marko
2ac5a559b4 libs: wrap mutexes for build flag with godeadlock (#5126)
## Description

This PR wraps the stdlib sync.(RW)Mutex & godeadlock.(RW)Mutex. This enables using go-deadlock via a build flag instead of using sed to replace sync with godeadlock in all files

Closes: #3242
2020-07-20 07:55:09 +00:00
Marko
6ccccb0933 lint: errcheck (#5091)
## Description

add more error checks to tests


gonna do a third PR that tackles the non test cases
2020-07-14 11:04:41 +00:00
Erik Grinaker
bf3c87c864 test: deflake TestAddAndRemoveListenerConcurrency and TestSyncer_SyncAny (#5101)
Fixes #5094.
2020-07-08 13:33:50 +00:00
Marko
7e2cc1db5e linter: (1/2) enable errcheck (#5064)
## Description

partially cleanup in preparation for errcheck

i ignored a bunch of defer errors in tests but with the update to go 1.14 we can use `t.Cleanup(func() { if err := <>; err != nil {..}}` to cover those errors, I will do this in pr number two of enabling errcheck.

ref #5059
2020-07-01 15:13:11 +00:00
Marko
dedf0d2350 proto: folder structure adhere to buf (#5025) 2020-06-22 10:00:51 +02:00
Marko
f6243d8b9e privval: migrate to protobuf (#4985) 2020-06-11 11:54:02 +02:00
Marko
d54de61bf6 consensus: proto migration (#4984)
## Description

migrate consensus to protobuf

Closes: #XXX
2020-06-10 12:08:47 +00:00
Erik Grinaker
660e72a12f p2p/conn: migrate to Protobuf (#4990)
Migrates the p2p connections to Protobuf. Supersedes #4800.

gogoproto's `NewDelimitedReader()` uses an internal buffer, which makes it unsuitable for reading individual messages from a shared reader (since any remaining data in the buffer will be discarded). We therefore add a new `protoio` package with an unbuffered `NewDelimitedReader()`. Additionally, the `NewDelimitedWriter()` returns the number of bytes written, and we've added `MarshalDelimited()` and `UnmarshalDelimited()`, to ease migration of existing code.
2020-06-09 16:09:51 +00:00
Erik Grinaker
ccc990498d json: add Amino-compatible encoder/decoder (#4955)
Amino-compatible JSON encoder/decoder, including bug compatibility. Interface types must be registered via `json.RegisterType()`. Unlike Amino, this allows floats to be encoded/decoded.

Partial fix for #4828, needs code migration.
2020-06-08 11:42:35 +00:00
Marko
a88537bb88 ints: stricter numbers (#4939) 2020-06-04 16:34:56 +02:00
Anton Kaliaev
994912211c p2p/conn: add a test for MakeSecretConnection (#4829)
Refs #4154
2020-06-03 16:28:23 +04:00
Alessio Treglia
c8483531d8 consensus: attempt to repair the WAL file on data corruption (#4682)
Closes: #4578

Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
2020-06-03 15:14:12 +04:00
Marko
89a3cb2b0a libs: remove kv (#4874)
## Description

In https://github.com/tendermint/tendermint/pull/4466 a new type was created (EventAttribute) that replaced KV.Pair. This made the type Pair deadcode, this pr removes that type.

I don't think a changelog entry is needed since it is not being used anymore. But will add a section to the upgrading.md to let users know that it was replaced

Closes: #XXX
2020-05-26 06:37:24 +00:00
Marko
e03b61abd2 proto: add proto files for ibc unblock (#4853)
## Description

these proto files are meant to help unblock ibc in their quest of migrating the ibc module to proto.

Closes: #XXX
2020-05-25 15:52:34 +00:00
Marko
243dfbd585 proto: remove test files
## Description

remove test files for proto stubs

Closes: #XXX
2020-05-13 14:30:33 +00:00
Marko
01e032dc99 libs: remove bech32
## Description

bech32 is only used in the sdk. I moved Bech32 to the sdk and we can remove it here. Don't think this needs to go into a minor release

Closes: #XXX
2020-05-12 18:43:35 +00:00
Anton Kaliaev
b7b721c484 change use of errors.Wrap to fmt.Errorf with %w verb
Closes #4603

Commands used (VIM):

```
:args `rg -l errors.Wrap`
:argdo normal @q | update
```

where q is a macros rewriting the `errors.Wrap` to `fmt.Errorf`.
2020-05-12 03:35:47 +00:00
Anton Kaliaev
52784f67d0 mempool: allow ReapX and CheckTx functions to run in parallel
allow ReapX and CheckTx functions to run in parallel, making it not possible to block certain proposers from creating a new block.

Closes: #2972
2020-05-08 04:17:01 +00:00
Marko
678010c45e fix linters & switch to official linter (#4808) 2020-05-07 16:17:43 +02:00
Marko
b7c2d7a977 lint: enable nolintlinter, disable on tests
## Description
- enable nolintlint
- disable linting on tests

Closes: #XXX
2020-05-04 07:49:53 +00:00
Diep Pham
843d63f935 indexer: allow indexing an event at runtime (#4466)
The PR added a new field `index` to event attribute, that will cause indexer service to index the event if set to true.
2020-04-22 12:07:03 +02:00
Erik Grinaker
66b0ec0af1 clarify service logging
The service logging can be a bit unclear. For example, with state sync it would log:

```
I[2020-04-20|08:40:47.366] Starting StateSync     module=statesync impl=Reactor
I[2020-04-20|08:40:47.834] Starting state sync    module=statesync
```

Where the first message is the reactor service startup, and the second message is the start of the actual state sync process. This clarifies the first message by changing it to `Starting StateSync service`.

______

For contributor use:

- [ ] ~Wrote tests~
- [ ] ~Updated CHANGELOG_PENDING.md~
- [ ] ~Linked to Github issue with discussion and accepted design OR link to spec that describes this work.~
- [ ] ~Updated relevant documentation (`docs/`) and code comments~
- [x] Re-reviewed `Files changed` in the Github PR explorer
2020-04-20 09:23:46 +00:00
Alessio Treglia
fcbce21534 cli: add command to generate shell completion scripts (#4665)
How to use it:

```
$ . <(tendermint completion)
```

Note that the completion command does not show up in the help screen,
though it comes with its own --help option.

This is a port of the feature provided by cosmos-sdk.
2020-04-13 16:08:23 +04:00
Marko
044f1bf288 format: add format cmd & goimport repo (#4586)
* format: add format cmd & goimport repo

- replaced format command
- added goimports to format command
- ran goimports

Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>

* fix outliers & undo proto file changes
2020-03-23 09:19:26 +01:00
Tess Rinearson
31bea92d12 libs/kv: remove unused type KI64Pair (#4542) 2020-03-10 21:34:44 +01:00
mergify[bot]
4c8e3c8145 fix: proto-breakage (#4506)
* fix: fix proto-breakage

- this is amed to fix proto breakage for consumers

Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>

* fix for importing third_party everywhere

* undo change

* test breakage change

* test ssh

* test https

* change ssh to https

* fix phony
2020-03-05 08:41:36 +00:00
Shivani Joshi
78144306dd JSON tests related changes (#4461)
* test functions take time.Now and other minor changes

* updated remaining test files

* Update validation_test.go

* fix typo

* go fmt

* import time

Co-authored-by: Marko <marbar3778@yahoo.com>
2020-02-28 09:57:00 +01:00
Erik Grinaker
8f48c49543 Fix some golangci-lint warnings (#4448) 2020-02-20 13:43:40 +01:00