Commit Graph

38 Commits

Author SHA1 Message Date
Erik Grinaker
2eba38051a blockchain/v2: fix missing mutex unlock (#5862)
Fixes #5843.
2021-01-06 17:27:51 +01:00
Anton Kaliaev
b1328db07f modify Reactor priorities (#5826) (#5830)
blockchain/vX reactor priority was decreased because during the normal operation
(i.e. when the node is not fast syncing) blockchain priority can't be
the same as consensus reactor priority. Otherwise, it's theoretically possible to
slow down consensus by constantly requesting blocks from the node.

NOTE: ideally blockchain/vX reactor priority would be dynamic. e.g. when
the node is fast syncing, the priority is 10 (max), but when it's done
fast syncing - the priority gets decreased to 5 (only to serve blocks
for other nodes). But it's not possible now, therefore I decided to
focus on the normal operation (priority = 5).

evidence and consensus critical messages are more important than
the mempool ones, hence priorities are bumped by 1 (from 5 to 6).

statesync reactor priority was changed from 1 to 5 to be the same as
blockchain/vX priority.

Refs https://github.com/tendermint/tendermint/issues/5816
2020-12-23 18:05:14 +01:00
Tess Rinearson
0d9606e1b4 reactors: omit incoming message bytes from reactor logs (#5743)
After a reactor has failed to parse an incoming message, it shouldn't output the "bad" data into the logs, as that data is unfiltered and could have anything in it. (We also don't think this information is helpful to have in the logs anyways.)
2020-12-04 12:18:14 +01:00
Anton Kaliaev
54a0940e40 blockchain/v2: remove peers from the processor (#5607)
after they were pruned by the scheduler

Closes #5513
2020-11-05 16:55:11 +04:00
Anton Kaliaev
79d535dd67 blockchain/v2: fix "panic: duplicate block enqueued by processor" (#5499)
When a peer is stopped due to some network issue, the Reactor calls scheduler#handleRemovePeer, which removes the peer from the scheduler. BUT the peer stays in the processor, which sometimes could lead to "duplicate block enqueued by processor" panic WHEN the same block is requested by the scheduler again from a different peer. The solution is to return scPeerError, which will be propagated to the processor. The processor will clean up the blocks associated with the peer in purgePeer.

Closes #5513, #5517
2020-10-21 13:26:20 +04:00
Marko
0ed8dba991 lint: enable errcheck (#5336)
## Description

Enable errcheck linter throughout the codebase

Closes: #5059
2020-09-07 15:03:18 +00:00
Erik Grinaker
edf5cff80f blockchain: fix fast sync halt with initial height > 1 (#5249)
Blockchain reactors were not updated to handle arbitrary initial height after #5191.
2020-08-14 13:04:51 +00:00
Marko
2ac5a559b4 libs: wrap mutexes for build flag with godeadlock (#5126)
## Description

This PR wraps the stdlib sync.(RW)Mutex & godeadlock.(RW)Mutex. This enables using go-deadlock via a build flag instead of using sed to replace sync with godeadlock in all files

Closes: #3242
2020-07-20 07:55:09 +00:00
Anton Kaliaev
730e16566e proto: change type + a cleanup (#5107)
- drop Height & Base from StatusRequest
It does not make sense nor it's used anywhere currently. Also, there
seem to be no trace of these fields in the ADR-40 (blockchain reactor
v2).

- change PacketMsg#EOF type from int32 to bool
2020-07-13 10:24:17 +00:00
Marko
dedf0d2350 proto: folder structure adhere to buf (#5025) 2020-06-22 10:00:51 +02:00
Marko
a89f2581fc blockchain: proto migration (#4969)
## Description

migration of blockchain reactors to protobuf

Closes: #XXX
2020-06-08 12:45:03 +00:00
Erik Grinaker
b76b270a23 blockchain/v2: correctly set block store base in status responses (#4971)
See: https://github.com/tendermint/tendermint/pull/4969#pullrequestreview-425298225
2020-06-05 15:18:12 +00:00
Erik Grinaker
eb443f4b77 blockchain/v2: integrate with state sync
Integrates the blockchain v2 reactor with state sync, fixes #4765. This mostly involves deferring fast syncing until after state sync completes. I tried a few different approaches, this was the least effort:

* `Reactor.events` is `nil` if no fast sync is in progress, in which case events are not dispatched - most importantly `AddPeer`.

* Accept status messages from unknown peers in the scheduler and register them as ready. On fast sync startup, broadcast status requests to all existing peers.

* When switching from state sync, first send a `bcResetState` message to the processor and scheduler to update their states - most importantly the initial block height.

* When fast sync completes, shut down event loop, scheduler and processor, and set `events` channel to `nil`.
2020-05-07 10:34:01 +00:00
Marko
b7c2d7a977 lint: enable nolintlinter, disable on tests
## Description
- enable nolintlint
- disable linting on tests

Closes: #XXX
2020-05-04 07:49:53 +00:00
Erik Grinaker
8108ac9d17 blockchain/v2: respect fast_sync option (#4772)
Not thoroughly tested, but seems to work. Will do further testing as this is integrated with state sync.

Fixes #4688.
2020-04-29 13:54:37 +02:00
Erik Grinaker
dcc19272f9 blockchain/v2: fix excessive CPU usage due to spinning on closed channels (#4761)
The event loop uses a `select` on multiple channels. However, reading from a closed channel in Go always yields the channel's zero value. The processor and scheduler close their channels when done, and since these channels are always ready to receive, the event loop keeps spinning on them.

This changes `routine.terminate()` to not close the channel, and also removes `stopDemux` and instead uses `events` channel closure to signal event loop termination.

Fixes #4687.
2020-04-29 11:03:04 +02:00
Erik Grinaker
511ab6717c add state sync reactor (#4705)
Fixes #828. Adds state sync, as outlined in [ADR-053](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-053-state-sync-prototype.md). See related PRs in Cosmos SDK (https://github.com/cosmos/cosmos-sdk/pull/5803) and Gaia (https://github.com/cosmos/gaia/pull/327).

This is split out of the previous PR #4645, and branched off of the ABCI interface in #4704. 

* Adds a new P2P reactor which exchanges snapshots with peers, and bootstraps an empty local node from remote snapshots when requested.

* Adds a new configuration section `[statesync]` that enables state sync and configures the light client. Also enables `statesync:info` logging by default.

* Integrates state sync into node startup. Does not support the v2 blockchain reactor, since it needs some reorganization to defer startup.
2020-04-29 10:47:00 +02:00
Anton Kaliaev
8f463cf35c p2p: set RecvMessageCapacity to maxMsgSize in all reactors
to prevent malicious nodes from sending us large messages (~21MB, which
is the default `RecvMessageCapacity`)

This allows us to remove unnecessary `maxMsgSize` check in `decodeMsg`. Since each channel has a msg capacity set to `maxMsgSize`, there's no need to check it again in `decodeMsg`.

Closes #1503
2020-04-28 10:05:09 +00:00
Erik Grinaker
fb35b474ad blockchain/v2: allow setting nil switch, for CustomReactors()
<!-- < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < < ☺
v                               ✰  Thanks for creating a PR! ✰    
v    Before smashing the submit button please review the checkboxes.
v    If a checkbox is n/a - please still include it but + a little note why
☺ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >  -->

Fixes an issue reported in https://github.com/tendermint/tendermint/issues/4595#issuecomment-612667441.

Not sure if this is sufficient to fully remove the reactor, but it fixes the immediate problem.
______

For contributor use:

- [x] Wrote tests
- [x] ~Updated CHANGELOG_PENDING.md~
- [x] Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
- [x] ~Updated relevant documentation (`docs/`) and code comments~
- [x] Re-reviewed `Files changed` in the Github PR explorer
2020-04-14 08:48:40 +00:00
Erik Grinaker
4298bbcc4e add support for block pruning via ABCI Commit response (#4588)
* Added BlockStore.DeleteBlock()

* Added initial block pruner prototype

* wip

* Added BlockStore.PruneBlocks()

* Added consensus setting for block pruning

* Added BlockStore base

* Error on replay if base does not have blocks

* Handle missing blocks when sending VoteSetMaj23Message

* Error message tweak

* Properly update blockstore state

* Error message fix again

* blockchain: ignore peer missing blocks

* Added FIXME

* Added test for block replay with truncated history

* Handle peer base in blockchain reactor

* Improved replay error handling

* Added tests for Store.PruneBlocks()

* Fix non-RPC handling of truncated block history

* Panic on missing block meta in needProofBlock()

* Updated changelog

* Handle truncated block history in RPC layer

* Added info about earliest block in /status RPC

* Reorder height and base in blockchain reactor messages

* Updated changelog

* Fix tests

* Appease linter

* Minor review fixes

* Non-empty BlockStores should always have base > 0

* Update code to assume base > 0 invariant

* Added blockstore tests for pruning to 0

* Make sure we don't prune below the current base

* Added BlockStore.Size()

* config: added retain_blocks recommendations

* Update v1 blockchain reactor to handle blockstore base

* Added state database pruning

* Propagate errors on missing validator sets

* Comment tweaks

* Improved error message

Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>

* use ABCI field ResponseCommit.retain_height instead of retain-blocks config option

* remove State.RetainHeight, return value instead

* fix minor issues

* rename pruneHeights() to pruneBlocks()

* noop to fix GitHub borkage

Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
2020-04-03 08:38:32 +00:00
Marko
044f1bf288 format: add format cmd & goimport repo (#4586)
* format: add format cmd & goimport repo

- replaced format command
- added goimports to format command
- ran goimports

Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>

* fix outliers & undo proto file changes
2020-03-23 09:19:26 +01:00
Sean Braithwaite
ee993ba8ff blockchain: add v2 reactor (#4361)
The work includes the reactor which ties together all the seperate routines involved in the design of the blockchain v2 refactor. This PR replaces #4067 which got far too large and messy after a failed attempt to rebase.

## Commits:

* Blockchainv 2 reactor:

	+ I cleaner copy of the work done in #4067 which fell too far behind and was a nightmare to rebase.
	+ The work includes the reactor which ties together all the seperate routines involved in the design of the blockchain v2 refactor.

* fixes after merge

* reorder iIO interface methodset

* change iO -> IO

* panic before send nil block

* rename switchToConsensus -> trySwitchToConsensus

* rename tdState -> tmState

* Update blockchain/v2/reactor.go

Co-Authored-By: Bot from GolangCI <42910462+golangcibot@users.noreply.github.com>

* remove peer when it sends a block unsolicited

* check for not ready in markReceived

* fix error

* fix the pcFinished event

* typo fix

* add documentation for processor fields

* simplify time.Since

* try and make the linter happy

* some doc updates

* fix channel diagram

* Update adr-043-blockchain-riri-org.md

* panic on nil switch

* liting fixes

* account for nil block in bBlockResponseMessage

* panic on duplicate block enqueued by processor

* linting

* goimport reactor_test.go

Co-authored-by: Bot from GolangCI <42910462+golangcibot@users.noreply.github.com>
Co-authored-by: Anca Zamfir <ancazamfir@users.noreply.github.com>
Co-authored-by: Marko <marbar3778@yahoo.com>
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
2020-02-19 16:00:14 +01:00
Sean Braithwaite
67cdd008cd Blockchain v2 processor (#4012)
* Add processor prototype

* Change processor API

    + expose a simple `handle` function which mutates internal state

* processor tests

* fix gofmt and ohter golangci issues

* scopelint var on range scope

* add check for short block received

* fix formatting

* small test reorg

* ignore unused for now

* ci fix changes

* go.mod revert
2019-10-14 19:06:11 +02:00
Marko
bf989eb272 fix linting (#4000) 2019-09-19 09:31:28 -04:00
Sean Braithwaite
d3d034e572 tidying 2019-09-17 15:18:15 -04:00
Sean Braithwaite
99b7a33f90 align buffer sizes 2019-09-14 13:01:19 -04:00
Sean Braithwaite
9bd2c0389f rename trySend to end 2019-09-13 18:54:25 -04:00
Sean Braithwaite
fbede85e20 changes based on feedback 2019-09-13 18:36:02 -04:00
Sean Braithwaite
e7ee314c99 Subsume the demuxer into the reactor
+ Simplify the design by demuxing events directly in the reactor
2019-09-13 11:33:38 -04:00
Sean Braithwaite
5474528db1 Switch to a priority queue:
* Routines will now use a priority queue instead of channels to
    iterate over events
2019-09-12 12:06:26 -04:00
Sean Braithwaite
78d4c3b88a fixes based on feedback 2019-08-13 17:57:17 +02:00
Sean Braithwaite
2c8cbfc26a linter fixes 2019-08-08 17:42:46 +02:00
Sean Braithwaite
acbfe67fb8 set logger 2019-08-08 16:56:39 +02:00
Sean Braithwaite
5b880fbcff cleanup events 2019-08-08 15:53:02 +02:00
Sean Braithwaite
c081b60ef6 Solidify API:
+ use `trySend` the replicate peer sending
    + expose `next()` as a chan of events as output
    + expose `final()` as a chan of error, for the final error
    + add `ready()` as chan struct when routine is ready
2019-08-08 15:12:11 +02:00
Sean Braithwaite
e4913f533a Fix race condition in shutdown:
+ ensure that we stop accepting messages once `stop` has been called
      to avoid the case in which we attempt to write to a channel which
      has already been closed
2019-08-06 18:00:14 +02:00
Sean Braithwaite
0cf9f86292 Modification based on feedback
+ `routine.send` returns false when routine is not running
    + this will prevent panics sending to channels which have been
    closed
    + Make output channels routine specific removing the risk of someone
    writting to a channel which was closed by another touine.
    + consistency changes between the routines and the demuxer
2019-08-06 13:27:15 +02:00
Sean Braithwaite
d1671d6175 blockchain v2: routines
+ Include an implementaiton of the routines specified in ADR-43
    along with a demuxer and some dummy reactor code
2019-08-03 09:19:32 +02:00