This reverts commit f939f962b1.
A lot of inbound links are still broken, so we will need to find a different
approach to suppressing unreleased docs.
(cherry picked from commit 59eaa4dba0)
This test relied on connecting to the external site `foo-bar.net`, and (predictably) the site went down and broke all of our CI runs. This changes it to use local HTTP servers instead.
Closes#5074
Old code does not work when --consensus.create_empty_blocks=false
(because it only calls tmos.Kill when ApplyBlock fails). New code is
listening ABCI clients for Quit and kills TM process if there were any
errors.
After a reactor has failed to parse an incoming message, it shouldn't output the "bad" data into the logs, as that data is unfiltered and could have anything in it. (We also don't think this information is helpful to have in the logs anyways.)
I introduced a new variable - syncEnded, which is now used to prevent
sending new events to channels (which would block otherwise) if reactor
is finished syncing
Closes#4591
Closes#5444
Now we record the fact that a peer does not have a requested block and later use this information to make a new request for the same block from another peer.
Add simple `NoBlockResponse` handling to blockchain reactor v1. I tested before and after with erik's e2e testing and was not able to reproduce the inability to sync after the changes were applied
Closes: #5394
in consensus/state.go, when calulating metrics, retrieve address (ergo, pubkey) once prior to iterating over validatorset to ensure we do not make excessive calls to signer.
Partially closes: #4865
Closes#4926
The dump consensus state had this:
"last_commit": {
"votes": [
"Vote{0:04CBBF43CA3E 385085/00/2(Precommit) 1B73DA9FC4C8 42C97B86D89D @ 2020-05-27T06:46:51.042392895Z}",
"Vote{1:055799E028FA 385085/00/2(Precommit) 652B08AD61EA 0D507D7FA3AB @ 2020-06-28T04:57:29.20793209Z}",
"Vote{2:056024CFA910 385085/00/2(Precommit) 652B08AD61EA C8E95532A4C3 @ 2020-06-28T04:57:29.452696998Z}",
"Vote{3:0741C95814DA 385085/00/2(Precommit) 652B08AD61EA 36D567615F7C @ 2020-06-28T04:57:29.279788593Z}",
Note there's a precommit in there from the first val from May (2020-05-27) while the rest are from today (2020-06-28). It suggests there's a validator from an old instance of the network at this height (they're using the same chain-id!). Obviously a single bad validator shouldn't be an issue. But the Commit refactor work introduced a bug.
When we propose a block, we get the block.LastCommit by calling MakeCommit on the set of precommits we saw for the last height. This set may include precommits for a different block, and hence the block.LastCommit we propose may include precommits that aren't actually for the last block (but of course +2/3 will be). Before v0.33, we just skipped over these precommits during verification. But in v0.33, we expect all signatures for a blockID to be for the same block ID! Thus we end up proposing a block that we can't verify.
Since the light client work introduced in v0.33 it appears full nodes
are no longer fully verifying commit signatures during block execution -
they stop after +2/3. See in VerifyCommit:
0c7fd316eb/types/validator_set.go (L700-L703)
This means proposers can propose blocks that contain valid +2/3
signatures and then the rest of the signatures can be whatever they
want. They can claim that all the other validators signed just by
including a CommitSig with arbitrary signature data. While this doesn't
seem to impact safety of Tendermint per se, it means that Commits may
contain a lot of invalid data. This is already true of blocks, since
they can include invalid txs filled with garbage, but in that case the
application knows they they are invalid and can punish the proposer. But
since applications dont verify commit signatures directly (they trust
tendermint to do that), they won't be able to detect it.
This can impact incentivization logic in the application that depends on
the LastCommitInfo sent in BeginBlock, which includes which validators
signed. For instance, Gaia incentivizes proposers with a bonus for
including more than +2/3 of the signatures. But a proposer can now claim
that bonus just by including arbitrary data for the final -1/3 of
validators without actually waiting for their signatures. There may be
other tricks that can be played because of this.
In general, the full node should be a fully verifying machine. While
it's true that the light client can avoid verifying all signatures by
stopping after +2/3, the full node can not. Thus the light client and
full node should use distinct VerifyCommit functions if one is going to
stop after +2/3 or otherwise perform less validation (for instance light
clients can also skip verifying votes for nil while full nodes can not).
See a commit with a bad signature that verifies here: 56367fd. From what
I can tell, Tendermint will go on to think this commit is valid and
forward this data to the app, so the app will think the second validator
actually signed when it clearly did not.
Fixes#4802. The Go HTTP server has a global panic handler for requests, so it was not as severe as first thought.
This fix can still panic, since we try to send a `500` response - if that happens, the Go HTTP server will terminate the connection. Otherwise, the client will get a 200 response, which we should avoid. I'm sort of torn on whether it's even necessary to include this fix, instead of just letting the HTTP server deal with it.
* blockchain/v2: fix excessive CPU usage due to spinning on closed channels (#4761)
The event loop uses a `select` on multiple channels. However, reading from a closed channel in Go always yields the channel's zero value. The processor and scheduler close their channels when done, and since these channels are always ready to receive, the event loop keeps spinning on them.
This changes `routine.terminate()` to not close the channel, and also removes `stopDemux` and instead uses `events` channel closure to signal event loop termination.
Fixes#4687.
* blockchain/v2: respect fast_sync option (#4772)
Not thoroughly tested, but seems to work. Will do further testing as this is integrated with state sync.
Fixes#4688.