tendermint

mirror of https://github.com/tendermint/tendermint.git synced 2026-06-10 00:03:04 +00:00

Author	SHA1	Message	Date
Cyrus Goh	5182ffee25	docs: master → docs-staging (#5990 ) * Makefile: always pull image in proto-gen-docker. (#5953) The `proto-gen-docker` target didn't pull an updated Docker image, and would use a local image if present which could be outdated and produce wrong results. * test: fix TestPEXReactorRunning data race (#5955) Fixes #5941. Not entirely sure that this will fix the problem (couldn't reproduce), but in any case this is an artifact of a hack in the P2P transport refactor to make it work with the legacy P2P stack, and will be removed when the refactor is done anyway. * test/fuzz: move fuzz tests into this repo (#5918) Co-authored-by: Emmanuel T Odeke <emmanuel@orijtech.com> Closes #5907 - add init-corpus to blockchain reactor - remove validator-set FromBytes test now that we have proto, we don't need to test it! bye amino - simplify mempool test do we want to test remote ABCI app? - do not recreate mux on every crash in jsonrpc test - update p2p pex reactor test - remove p2p/listener test the API has changed + I did not understand what it's tested anyway - update secretconnection test - add readme and makefile - list inputs in readme - add nightly workflow - remove blockchain fuzz test EncodeMsg / DecodeMsg no longer exist * docker: dont login when in PR (#5961) * docker: release Linux/ARM64 image (#5925) Co-authored-by: Marko <marbar3778@yahoo.com> * p2p: make PeerManager.DialNext() and EvictNext() block (#5947) See #5936 and #5938 for background. The plan was initially to have `DialNext()` and `EvictNext()` return a channel. However, implementing this became unnecessarily complicated and error-prone. As an example, the channel would be both consumed and populated (via method calls) by the same driving method (e.g. `Router.dialPeers()`) which could easily cause deadlocks where a method call blocked while sending on the channel that the caller itself was responsible for consuming (but couldn't since it was busy making the method call). 
It would also require a set of goroutines in the peer manager that would interact with the goroutines in the router in non-obvious ways, and fully populating the channel on startup could cause deadlocks with other startup tasks. Several issues like these made the solution hard to reason about. I therefore simply made `DialNext()` and `EvictNext()` block until the next peer was available, using internal triggers to wake these methods up in a non-blocking fashion when any relevant state changes occurred. This proved much simpler to reason about, since there are no goroutines in the peer manager (except for trivial retry timers), nor any blocking channel sends, and it instead relies entirely on the existing goroutine structure of the router for concurrency. This also happens to be the same pattern used by the `Transport.Accept()` API, following Go stdlib conventions, so all router goroutines end up using a consistent pattern as well. * libs/log: format []byte as hexadecimal string (uppercased) (#5960) Closes: #5806 Co-authored-by: Lanie Hei <heixx011@umn.edu> * docs: log level docs (#5945) ## Description add section on configuring log levels Closes: #XXX * .github: fix fuzz-nightly job (#5965) outputs is a property of the job, not an individual step. * e2e: add control over the log level of nodes (#5958) * mempool: fix reactor tests (#5967) ## Description Update the faux router to either drop channel errors or handle them based on an argument. This prevents deadlocks in tests where we try to send an error on the mempool channel but there is no reader. Closes: #5956 * p2p: improve peerStore prototype (#5954) This improves the `peerStore` prototype by e.g.: * Using a database with Protobuf for persistence, but also keeping full peer set in memory for performance. * Simplifying the API, by taking/returning struct copies for safety, and removing errors for in-memory operations. 
* Caching the ranked peer set, as a temporary solution until a better data structure is implemented. * Adding `PeerManagerOptions.MaxPeers` and pruning the peer store (based on rank) when it's full. * Rewriting `PeerAddress` to be independent of `url.URL`, normalizing it and tightening semantics. * p2p: simplify PeerManager upgrade logic (#5962) Follow-up from #5947, branched off of #5954. This simplifies the upgrade logic by adding explicit eviction requests, which can also be useful for other use-cases (e.g. if we need to ban a peer that's misbehaving). Changes: * Add `evict` map which queues up peers to explicitly evict. * `upgrading` now only tracks peers that we're upgrading via dialing (`DialNext` → `Dialed`/`DialFailed`). * `Dialed` will unmark `upgrading`, and queue `evict` if still beyond capacity. * `Accepted` will pick a random lower-scored peer to upgrade to, if appropriate, and doesn't care about `upgrading` (the dial will fail later, since it's already connected). * `EvictNext` will return a peer scheduled in `evict` if any, otherwise if beyond capacity just evict the lowest-scored peer. This limits all of the `upgrading` logic to `DialNext`, `Dialed`, and `DialFailed`, making it much simpler, and it should generally do the right thing in all cases I can think of. * p2p: add PeerManager.Advertise() (#5957) Adds a naïve `PeerManager.Advertise()` method that the new PEX reactor can use to fetch addresses to advertise, as well as some other `FIXME`s on address advertisement. * blockchain v0: fix waitgroup data race (#5970) ## Description Fixes the data race in usage of `WaitGroup`. Specifically, the case where we invoke `Wait` _before_ the first delta `Add` call when the current waitgroup counter is zero. See https://golang.org/pkg/sync/#WaitGroup.Add. Still not sure how this manifests itself in a test since the reactor has to be stopped virtually immediately after being started (I think?). Regardless, this is the appropriate fix. 
closes: #5968 * tests: fix `make test` (#5966) ## Description - bump deadlock dep to master - fixes `make test` since we now use `deadlock.Once` Closes: #XXX * terminate go-fuzz gracefully (w/ SIGINT) (#5973) and preserve exit code. ``` 2021/01/26 03:34:49 workers: 2, corpus: 4 (8m28s ago), crashers: 0, restarts: 1/9976, execs: 11013732 (21596/sec), cover: 121, uptime: 8m30s make: *** [fuzz-mempool] Terminated Makefile:5: recipe for target 'fuzz-mempool' failed Error: Process completed with exit code 124. ``` https://github.com/tendermint/tendermint/runs/1766661614 `continue-on-error` should make GH ignore any error codes. * p2p: add prototype PEX reactor for new stack (#5971) This adds a prototype PEX reactor for the new P2P stack. * proto/p2p: rename PEX messages and fields (#5974) Fixes #5899 by renaming a bunch of P2P Protobuf entities (while maintaining wire compatibility): * `Message` to `PexMessage` (as it's only used for PEX messages). * `PexAddrs` to `PexResponse`. * `PexResponse.Addrs` to `PexResponse.Addresses`. * `NetAddress` to `PexAddress` (as it's only used by PEX). * p2p: resolve PEX addresses in PEX reactor (#5980) This changes the new prototype PEX reactor to resolve peer address URLs into IP/port PEX addresses itself. Branched off of #5974. I've spent some time thinking about address handling in the P2P stack. We currently use `PeerAddress` URLs everywhere, except for two places: when dialing a peer, and when exchanging addresses via PEX. We had two options: 1. Resolve addresses to endpoints inside `PeerManager`. This would introduce a lot of added complexity: we would have to track connection statistics per endpoint, have goroutines that asynchronously resolve and refresh these endpoints, deal with resolve scheduling before dialing (which is trickier than it sounds since it involves multiple goroutines in the peer manager and router and messes with peer rating order), handle IP address visibility issues, and so on. 2. 
Resolve addresses to endpoints (IP/port) only where they're used: when dialing, and in PEX. Everywhere else we use URLs. I went with 2, because this significantly simplifies the handling of hostname resolution, and because I really think the PEX reactor should migrate to exchanging URLs instead of IP/port numbers anyway -- this allows operators to use DNS names for validators (and can easily migrate them to new IPs and/or load balance requests), and also allows different protocols (e.g. QUIC and `MemoryTransport`). Happy to discuss this. * test/p2p: close transports to avoid goroutine leak failures (#5982) * mempool: fix TestReactorNoBroadcastToSender (#5984) ## Description Looks like I missed a test in the original PR when fixing the tests. Closes: #5956 * mempool: fix mempool tests timeout (#5988) * p2p: use stopCtx when dialing peers in Router (#5983) This ensures we don't leak dial goroutines when shutting down the router. * docs: fix typo in state sync example (#5989) Co-authored-by: Erik Grinaker <erik@interchain.berlin> Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com> Co-authored-by: Marko <marbar3778@yahoo.com> Co-authored-by: odidev <odidev@puresoftware.com> Co-authored-by: Lanie Hei <heixx011@umn.edu> Co-authored-by: Callum Waters <cmwaters19@gmail.com> Co-authored-by: Aleksandr Bezobchuk <alexanderbez@users.noreply.github.com> Co-authored-by: Sergey <52304443+c29r3@users.noreply.github.com>	2021-01-26 11:46:21 -08:00
Aleksandr Bezobchuk	68bd2116f0	mempool: p2p refactor (#5919)	2021-01-22 09:34:12 -05:00
Aleksandr Bezobchuk	62d7a5d028	blockchain v0: p2p refactor (#5858)	2021-01-18 16:35:11 -05:00
Callum Waters	bada08c50c	state sync: last consensus params height is not set (#5889)	2021-01-12 14:41:16 +01:00
Callum Waters	385ea1db7d	store: use db iterators for pruning and range-based queries (#5848)	2021-01-08 13:12:54 +01:00
Aleksandr Bezobchuk	e986602649	evidence: p2p refactor (#5747)	2021-01-06 11:53:18 -05:00
Aleksandr Bezobchuk	8bf77d9b1a	statesync: do not recover panic on peer updates (#5869)	2021-01-06 10:07:10 -05:00
Aleksandr Bezobchuk	c75dee5a02	state sync: Fix TestSyncer_SyncAny (#5835)	2021-01-04 10:31:20 -05:00
Erik Grinaker	1b6df6783d	p2p: replace PeerID with NodeID	2021-01-04 11:25:20 +01:00
Anton Kaliaev	aef1ac7ba5	modify Reactor priorities (#5826) blockchain/vX reactor priority was decreased because during the normal operation (i.e. when the node is not fast syncing) blockchain priority can't be the same as consensus reactor priority. Otherwise, it's theoretically possible to slow down consensus by constantly requesting blocks from the node. NOTE: ideally blockchain/vX reactor priority would be dynamic. e.g. when the node is fast syncing, the priority is 10 (max), but when it's done fast syncing - the priority gets decreased to 5 (only to serve blocks for other nodes). But it's not possible now, therefore I decided to focus on the normal operation (priority = 5). evidence and consensus critical messages are more important than the mempool ones, hence priorities are bumped by 1 (from 5 to 6). statesync reactor priority was changed from 1 to 5 to be the same as blockchain/vX priority. Refs https://github.com/tendermint/tendermint/issues/5816	2020-12-23 12:31:00 +00:00
Aleksandr Bezobchuk	0565eb5943	state sync: cleanup (#5776)	2020-12-09 10:29:28 -05:00
Aleksandr Bezobchuk	a879eb444d	p2p: state sync reactor refactor (#5671)	2020-12-09 09:31:06 -05:00
Tess Rinearson	79890d8393	reactors: omit incoming message bytes from reactor logs (#5743) After a reactor has failed to parse an incoming message, it shouldn't output the "bad" data into the logs, as that data is unfiltered and could have anything in it. (We also don't think this information is helpful to have in the logs anyways.)	2020-12-03 22:12:08 +00:00
Anton Kaliaev	e13b4386ff	abci: modify Client interface and socket client (#5673) `abci.Client`: - Sync and Async methods now accept a context for cancellation * grpc client uses context to cancel both Sync and Async requests * local client ignores context parameter * socket client uses context to cancel Sync requests and to drop Async requests before sending them if context was cancelled prior to that - Async methods return an error * socket client returns an error immediately if queue is full for Async requests * local client always returns nil error * grpc client returns an error if context was cancelled before we got a response or before the receiving queue had space for the response (do not confuse with the sending queue from the socket client) - specify client semantics in [doc.go](https://raw.githubusercontent.com/tendermint/tendermint/27112fffa62276bc016d56741f686f0f77931748/abci/client/doc.go) `mempool.TxInfo` - add optional `Context` to `TxInfo`, which can be used to cancel `CheckTx` request Closes #5190	2020-11-30 16:46:16 +04:00
Anton Kaliaev	f2f6a78809	docs: warn developers about calling blocking funcs in Receive (#5679) Refs #2888	2020-11-17 15:37:35 +00:00
Erik Grinaker	e7184c499d	statesync: check all necessary heights when adding snapshot to pool (#5516) Fixes #5511.	2020-10-16 11:47:12 +00:00
Marko	82e4693cc5	abci: remove setOption (#5447) Remove Response/Request SetOption from ABCI. Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>	2020-10-08 19:12:12 +02:00
Erik Grinaker	f83ecdad1d	config: add state sync discovery_time setting (#5399) Reduces the state sync discovery time from 20 to 15 seconds, and makes it configurable.	2020-09-24 16:01:45 +02:00
Anton Kaliaev	85a4be87a7	rpc/client: take context as first param (#5347) Closes #5145 also applies to light/client	2020-09-23 09:21:57 +04:00
Callum Waters	ed002cea7e	evidence: introduction of LightClientAttackEvidence and refactor of evidence lifecycle (#5361) evidence: modify evidence types (#5342) light: detect light client attacks (#5344) evidence: refactor evidence pool (#5345) abci: application evidence prepared by evidence pool (#5354)	2020-09-22 10:22:54 +02:00
Marko	0ed8dba991	lint: enable errcheck (#5336) ## Description Enable errcheck linter throughout the codebase Closes: #5059	2020-09-07 15:03:18 +00:00
Erik Grinaker	39d2ac4dbc	statesync: fix the validator set heights (again) (#5330) This reverts the "fix" in #5311, after the real fix in #5328.	2020-09-03 15:05:04 +00:00
Erik Grinaker	2f4c1f60c7	statesync: broadcast snapshot request to all peers on startup (#5320) On startup, the peer-to-peer stack may have peers connected before the state sync process begins, causing these to not trigger `AddPeer` events and thus not be used for snapshot discovery. Broadcasting a snapshot request to these explicitly makes sure we discover snapshots from existing peers as well.	2020-09-02 08:16:08 +00:00
Callum Waters	2b58a62721	light: implement light block (#5298)	2020-09-01 17:45:55 +02:00
Erik Grinaker	686361ff3e	statesync: fix valset off-by-one causing consensus failures (#5311)	2020-08-31 13:31:00 +02:00
Marko	fbdf8b098e	mocks: update with 2.2.1 (#5294) ## Description When downloading mockery I ran into an issue where we were using the old version. This PR updates to a more recent version. changelog? Closes: #XXX	2020-08-26 15:28:46 +00:00
Erik Grinaker	cc247c091b	genesis: add support for arbitrary initial height (#5191) Adds a genesis parameter `initial_height` which specifies the initial block height, as well as ABCI `RequestInitChain.InitialHeight` to pass it to the ABCI application, and `State.InitialHeight` to keep track of the initial height throughout the code. Fixes #2543, based on [RFC-002](https://github.com/tendermint/spec/pull/119). Spec changes in https://github.com/tendermint/spec/pull/135.	2020-08-11 17:03:28 +00:00
Marko	40bd416d59	test: protobuf vectors for reactors (#5221) ## Description Add test vectors for all reactors - [x] state-sync - [x] privval - [x] mempool - [x] p2p - [x] evidence - [ ] light? this PR is primarily oriented at testvectors for things going over the wire. should we expand the testvectors into types as well? Closes: #XXX	2020-08-11 14:00:11 +00:00
Marko	2d167aefcf	ci: freeze golangci action version (#5196) ## Description This PR updates golang-ci to latest and stops looking at master for the action. Closes: #XXX	2020-08-03 07:57:06 +00:00
Marko	2ac5a559b4	libs: wrap mutexes for build flag with godeadlock (#5126) ## Description This PR wraps the stdlib sync.(RW)Mutex & godeadlock.(RW)Mutex. This enables using go-deadlock via a build flag instead of using sed to replace sync with godeadlock in all files Closes: #3242	2020-07-20 07:55:09 +00:00
Marko	6ccccb0933	lint: errcheck (#5091) ## Description add more error checks to tests gonna do a third PR that tackles the non test cases	2020-07-14 11:04:41 +00:00
Erik Grinaker	59a17b28a7	proto: improve enums (#5099) Fixes some minor issues with Protobuf enums, not likely to break anything. Branched off of #5096, rebase to `master` before merging.	2020-07-08 13:49:50 +00:00
Erik Grinaker	bf3c87c864	test: deflake TestAddAndRemoveListenerConcurrency and TestSyncer_SyncAny (#5101) Fixes #5094.	2020-07-08 13:33:50 +00:00
Marko	dedf0d2350	proto: folder structure adhere to buf (#5025)	2020-06-22 10:00:51 +02:00
Marko	7a8224f8a3	state: proto migration (#4972) ## Description the second part of state proto migration Closes: #XXX	2020-06-08 10:16:35 +00:00
Marko	b9af87c4ea	state: proto migration (#4951)	2020-06-05 10:47:16 +02:00
Marko	c2578e2262	light: rename lite2 to light & remove lite (#4946) This PR removes lite & renames lite2 to light throughout the repo Signed-off-by: Marko Baricevic <marbar3778@yahoo.com> Closes: #4944	2020-06-03 10:13:42 +00:00
Marko	4e6a844d6f	statesync: use Protobuf instead of Amino for p2p traffic (#4943) ## Description Closes: #XXX	2020-06-03 08:43:50 +00:00
Erik Grinaker	81c2798df0	abci: fix protobuf lint issues Fix some linter issues to conform with the Protobuf style guide. The state sync enum changes are ok to break since it's not released yet. Personally I find the uppercase kind of ugly, but that's what the guide says. Couldn't find a way to generate camel case in Go, short of specifying custom names for each and every enum variant. Another option would be to simply disable the enum case lint.	2020-05-07 14:40:22 +00:00
Marko	678010c45e	fix linters & switch to official linter (#4808)	2020-05-07 16:17:43 +02:00
Marko	b7c2d7a977	lint: enable nolintlint, disable on tests ## Description - enable nolintlint - disable linting on tests Closes: #XXX	2020-05-04 07:49:53 +00:00
Erik Grinaker	511ab6717c	add state sync reactor (#4705) Fixes #828. Adds state sync, as outlined in [ADR-053](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-053-state-sync-prototype.md). See related PRs in Cosmos SDK (https://github.com/cosmos/cosmos-sdk/pull/5803) and Gaia (https://github.com/cosmos/gaia/pull/327). This is split out of the previous PR #4645, and branched off of the ABCI interface in #4704. * Adds a new P2P reactor which exchanges snapshots with peers, and bootstraps an empty local node from remote snapshots when requested. * Adds a new configuration section `[statesync]` that enables state sync and configures the light client. Also enables `statesync:info` logging by default. * Integrates state sync into node startup. Does not support the v2 blockchain reactor, since it needs some reorganization to defer startup.	2020-04-29 10:47:00 +02:00

42 Commits