Commit Graph

945 Commits

Author SHA1 Message Date
Anton Kaliaev
5fa540bdc9 mempool: add a safety check, write tests for mempoolIDs (#3487)
* mempool: add a safety check, write tests for mempoolIDs

and document 65536 limit in the mempool reactor spec

follow-up to https://github.com/tendermint/tendermint/pull/2778

* rename the test

* fixes after Ismail's review
2019-03-27 16:45:34 +01:00
Anton Kaliaev
1d4afb179b replace PB2TM.ConsensusParams with a call to params#Update (#3448)
Fixes #3444
2019-03-21 11:05:39 +01:00
Anton Kaliaev
4162ebe8b5 types: refactor PB2TM.ConsensusParams to take BlockTimeIota as an arg (#3442)
See https://github.com/tendermint/tendermint/pull/3403/files#r266208947

In #3403 we unexposed BlockTimeIota from the ABCI, but it's still part
of the ConsensusParams struct, so we have to remember to add it back
after calling PB2TM.ConsensusParams. Instead, PB2TM.ConsensusParams
should take it as an argument

Fixes #3432
2019-03-19 11:38:32 +04:00
Anton Kaliaev
3035572034 cs: comment out log.Error to avoid TestReactorValidatorSetChanges timing out (#3401) 2019-03-11 22:52:09 +04:00
Anton Kaliaev
15f621141d remove TimeIotaMs from ABCI consensus params (#3403)
Also

- init substructures to avoid panic in pb2tm.ConsensusParams
Before: if csp.Block is nil and we later try to access/write to it,
we'll panic.
After: if csp.Block is nil and we later try to access/write to it,
there'll be no panic.
2019-03-11 22:21:17 +04:00
Anton Kaliaev
b6a510a3e7 make ineffassign linter pass (#3386)
Refs #3262

This fixes two small bugs:

1) lite/dbprovider: return `ok` instead of true in parse* functions. It's weird that we're ignoring `ok` value before.
2) consensus/state: previously because of the shadowing we almost never output "Error with msg". Now we declare both `added` and `err` in the beginning of the function, so there's no shadowing.
2019-03-08 09:46:09 +04:00
Anton Kaliaev
52771e1287 make BlockTimeIota a consensus parameter, not a locally configurable … (#3048)
* make BlockTimeIota a consensus parameter, not a locally configurable option

Refs #2920

* make TimeIota int64 ms

Refs #2920

* update Gopkg.toml

* fixes after Ethan's review

* fix TestRemoteSignerProposalSigningFailed

* update changelog
2019-03-04 13:24:44 +04:00
Anton Kaliaev
f39138aa2e remove RoundState from EventDataRoundState (#3354)
Before we're using it to get a round state in tests. Now it can be done
by calling csX.GetRoundState. We will need to rewrite
TestStateSlashingPrevotes and TestStateSlashingPrecommits, which are
commented right now, to not rely on EventDataRoundState#RoundState
field.

Refs #1527
2019-03-04 12:18:32 +04:00
Anton Kaliaev
ec9bff5234 rename WAL#Flush to WAL#FlushAndSync (#3345)
* rename WAL#Flush to WAL#FlushAndSync

- rename auto#Flush to auto#FlushAndSync
- cleanup WAL interface to not leak implementation details!
  * remove Group()
  * add WALReader interface and return it in SearchForEndHeight()
- add interface assertions

Refs #3337

* replace WALReader with io.ReadCloser
2019-02-25 09:11:07 +04:00
Anton Kaliaev
2137ecc130 pubsub 2.0 (#3227)
* green pubsub tests :OK:

* get rid of clientToQueryMap

* Subscribe and SubscribeUnbuffered

* start adapting other pkgs to new pubsub

* nope

* rename MsgAndTags to Message

* remove TagMap

it does not bring any additional benefits

* bring back EventSubscriber

* fix test

* fix data race in TestStartNextHeightCorrectly

```
Write at 0x00c0001c7418 by goroutine 796:
  github.com/tendermint/tendermint/consensus.TestStartNextHeightCorrectly()
      /go/src/github.com/tendermint/tendermint/consensus/state_test.go:1296 +0xad
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:827 +0x162

Previous read at 0x00c0001c7418 by goroutine 858:
  github.com/tendermint/tendermint/consensus.(*ConsensusState).addVote()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:1631 +0x1366
  github.com/tendermint/tendermint/consensus.(*ConsensusState).tryAddVote()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:1476 +0x8f
  github.com/tendermint/tendermint/consensus.(*ConsensusState).handleMsg()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:667 +0xa1e
  github.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:628 +0x794

Goroutine 796 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:878 +0x659
  testing.runTests.func1()
      /usr/local/go/src/testing/testing.go:1119 +0xa8
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:827 +0x162
  testing.runTests()
      /usr/local/go/src/testing/testing.go:1117 +0x4ee
  testing.(*M).Run()
      /usr/local/go/src/testing/testing.go:1034 +0x2ee
  main.main()
      _testmain.go:214 +0x332

Goroutine 858 (running) created at:
  github.com/tendermint/tendermint/consensus.(*ConsensusState).startRoutines()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:334 +0x221
  github.com/tendermint/tendermint/consensus.startTestRound()
      /go/src/github.com/tendermint/tendermint/consensus/common_test.go:122 +0x63
  github.com/tendermint/tendermint/consensus.TestStateFullRound1()
      /go/src/github.com/tendermint/tendermint/consensus/state_test.go:255 +0x397
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:827 +0x162
```

* fixes after my own review

* fix formatting

* wait 100ms before kicking a subscriber out

+ a test for indexer_service

* fixes after my second review

* no timeout

* add changelog entries

* fix merge conflicts

* fix typos after Thane's review

Co-Authored-By: melekes <anton.kalyaev@gmail.com>

* reformat code

* rewrite indexer service in the attempt to fix failing test

https://github.com/tendermint/tendermint/pull/3227/#issuecomment-462316527

* Revert "rewrite indexer service in the attempt to fix failing test"

This reverts commit 0d9107a098.

* another attempt to fix indexer

* fixes after Ethan's review

* use unbuffered channel when indexing transactions

Refs https://github.com/tendermint/tendermint/pull/3227#discussion_r258786716

* add a comment for EventBus#SubscribeUnbuffered

* format code
2019-02-22 23:11:27 -05:00
Anton Kaliaev
67fd428354 fix TestWALPeriodicSync (#3342)
The test was sometimes failing due to processFlushTicks being called too
early. The solution is to call wal#Start later in the test.
2019-02-22 11:49:08 +04:00
Anton Kaliaev
f22ada442a refactor decideProposal in common_test (#3343) 2019-02-22 11:48:40 +04:00
Anton Kaliaev
ed1de13548 cs: update wal comments (#3334)
* cs: update wal comments

Follow-up to https://github.com/tendermint/tendermint/pull/3300

* Update consensus/wal.go

Co-Authored-By: melekes <anton.kalyaev@gmail.com>
2019-02-21 18:28:02 +04:00
Thane Thomson
dff3deb2a9 cs: sync WAL more frequently (#3300)
As per #3043, this adds a ticker to sync the WAL every 2s while the WAL is running.

* Flush WAL every 2s

This adds a ticker that flushes the WAL every 2s while the WAL is
running. This is related to #3043.

* Fix spelling

* Increase timeout to 2mins for slower build environments

* Make WAL sync interval configurable

* Add TODO to replace testChan with more comprehensive testBus

* Remove extraneous debug statement

* Remove testChan in favour of using system time

As per
https://github.com/tendermint/tendermint/pull/3300#discussion_r255886586,
this removes the `testChan` WAL member and replaces the approach with a
system time-oriented one. In this new approach, we keep track of the
system time at which each flush and periodic flush successfully
occurred.

The naming of the various functions is also updated here to be more
consistent with "flushing" as opposed to "sync'ing".

* Update naming convention and ensure lock for timestamp update

* Add Flush method as part of WAL interface

Adds a `Flush` method as part of the WAL interface to enforce the idea
that we can manually trigger a WAL flush from outside of the WAL. This
is employed in the consensus state management to flush the WAL prior to
signing votes/proposals, as per https://github.com/tendermint/tendermint/issues/3043#issuecomment-453853630

* Update CHANGELOG_PENDING

* Remove mutex approach and replace with DI

The dependency injection approach to dealing with testing concerns could
allow similar effects to some kind of "testing bus"-based approach. This
commit introduces an example of this, where instead of relying on
(potentially fragile) timing of things between the code and the test, we
inject code into the function under test that can signal the test
through a channel.

This allows us to avoid the `time.Sleep()`-based approach previously
employed.

* Update comment on WAL flushing during vote signing

Co-Authored-By: thanethomson <connect@thanethomson.com>

* Simplify flush interval definition

Co-Authored-By: thanethomson <connect@thanethomson.com>

* Expand commentary on WAL disk flushing

Co-Authored-By: thanethomson <connect@thanethomson.com>

* Add broken test to illustrate WAL sync test problem

Removes test-related state (dependency injection code) from the WAL data
structure and adds test code to illustrate the problem with using
`WALGenerateNBlocks` and `wal.SearchForEndHeight` to test periodic
sync'ing.

* Fix test error messages

* Use WAL group buffer size to check for flush

A function is added to `libs/autofile/group.go#Group` in order to return
the size of the buffered data (i.e. data that has not yet been flushed
to disk). The test now checks that, prior to a `time.Sleep`, the group
buffer has data in it. After the `time.Sleep` (during which time the
periodic flush should have been called), the buffer should be empty.

* Remove config root dir removal from #3291

* Add godoc for NewWAL mentioning periodic sync
2019-02-20 09:45:18 +04:00
Anton Kaliaev
8283ca7ddb 3291 follow-up (#3323)
* changelog: use issue number instead of PR number

* follow up to #3291

- rpc/test/helpers.go add StopTendermint(node) func
- remove ensureDir(filepath.Dir(walFile), 0700)
- mempool/mempool_test.go add type cleanupFunc func()

* cmd/show_validator: wrap err to make it more clear
2019-02-18 13:23:40 +04:00
Alessio Treglia
59cc6d36c9 improve ResetTestRootWithChainID() concurrency safety (#3291)
* improve ResetTestRootWithChainID() concurrency safety

Rely on ioutil.TempDir() to create test root directories and ensure
multiple same-chain id test cases can run in parallel.

* Update config/toml.go

Co-Authored-By: alessio <quadrispro@ubuntu.com>

* clean up test directories after completion

Closes: #1034

* Remove redundant EnsureDir call

* s/PanicSafety()/panic()/s

* Put create dir functionality back in ResetTestRootWithChainID

* Place test directories in OS's tempdir

In modern UNIX and UNIX-like systems /tmp is very often
mounted as tmpfs. This might speed test execution a bit.

* Set 0700 to a const

* rootsDirs -> configRootDirs

* Don't double remove directories

* Avoid global variables

* Fix consensus tests

* Reduce defer stack

* Address review comments

* Try to fix tests

* Update CHANGELOG_PENDING.md

Co-Authored-By: alessio <quadrispro@ubuntu.com>

* Update consensus/common_test.go

Co-Authored-By: alessio <quadrispro@ubuntu.com>

* Update consensus/common_test.go

Co-Authored-By: alessio <quadrispro@ubuntu.com>
2019-02-18 11:45:27 +04:00
Zarko Milosevic
af8793c01a cs: reset triggered timeout precommit (#3310)
* Reset TriggeredTimeoutPrecommit as part of updateToState

* Add failing test and fix

* fix DATA RACE in TestResetTimeoutPrecommitUponNewHeight

```
WARNING: DATA RACE
Read at 0x00c001691d28 by goroutine 691:
  github.com/tendermint/tendermint/consensus.decideProposal()
      /go/src/github.com/tendermint/tendermint/consensus/common_test.go:133 +0x121
  github.com/tendermint/tendermint/consensus.TestResetTimeoutPrecommitUponNewHeight()
      /go/src/github.com/tendermint/tendermint/consensus/state_test.go:1389 +0x958
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:827 +0x162

Previous write at 0x00c001691d28 by goroutine 931:
  github.com/tendermint/tendermint/consensus.(*ConsensusState).updateToState()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:562 +0x5b2
  github.com/tendermint/tendermint/consensus.(*ConsensusState).finalizeCommit()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:1340 +0x141e
  github.com/tendermint/tendermint/consensus.(*ConsensusState).tryFinalizeCommit()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:1255 +0x66e
  github.com/tendermint/tendermint/consensus.(*ConsensusState).enterCommit.func1()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:1201 +0x135
  github.com/tendermint/tendermint/consensus.(*ConsensusState).enterCommit()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:1232 +0x94b
  github.com/tendermint/tendermint/consensus.(*ConsensusState).addVote()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:1657 +0x132e
  github.com/tendermint/tendermint/consensus.(*ConsensusState).tryAddVote()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:1503 +0x8f
  github.com/tendermint/tendermint/consensus.(*ConsensusState).handleMsg()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:694 +0xa1e
  github.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:642 +0x948
  github.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:642 +0x948
  github.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:642 +0x948
  github.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:655 +0x7dd
  github.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:642 +0x948
  github.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:655 +0x7dd

Goroutine 691 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:878 +0x659
  testing.runTests.func1()
      /usr/local/go/src/testing/testing.go:1119 +0xa8
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:827 +0x162
  testing.runTests()
  testing.(*M).Run()
      /usr/local/go/src/testing/testing.go:1034 +0x2ee
  main.main()
      _testmain.go:216 +0x332
```

* fix another DATA RACE by locking consensus

```
WARNING: DATA RACE
Read at 0x00c009b835a8 by goroutine 871:
  github.com/tendermint/tendermint/consensus.(*ConsensusState).createProposalBlock()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:955 +0x7c
  github.com/tendermint/tendermint/consensus.decideProposal()
      /go/src/github.com/tendermint/tendermint/consensus/common_test.go:127 +0x53
  github.com/tendermint/tendermint/consensus.TestResetTimeoutPrecommitUponNewHeight()
      /go/src/github.com/tendermint/tendermint/consensus/state_test.go:1389 +0x958
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:827 +0x162

Previous write at 0x00c009b835a8 by goroutine 931:
  github.com/tendermint/tendermint/consensus.(*ConsensusState).updateHeight()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:446 +0xb7
  github.com/tendermint/tendermint/consensus.(*ConsensusState).updateToState()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:542 +0x22f
  github.com/tendermint/tendermint/consensus.(*ConsensusState).finalizeCommit()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:1340 +0x141e
  github.com/tendermint/tendermint/consensus.(*ConsensusState).tryFinalizeCommit()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:1255 +0x66e
  github.com/tendermint/tendermint/consensus.(*ConsensusState).enterCommit.func1()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:1201 +0x135
  github.com/tendermint/tendermint/consensus.(*ConsensusState).enterCommit()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:1232 +0x94b
  github.com/tendermint/tendermint/consensus.(*ConsensusState).addVote()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:1657 +0x132e
  github.com/tendermint/tendermint/consensus.(*ConsensusState).tryAddVote()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:1503 +0x8f
  github.com/tendermint/tendermint/consensus.(*ConsensusState).handleMsg()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:694 +0xa1e
  github.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:642 +0x948
  github.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:642 +0x948
  github.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:642 +0x948
  github.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:655 +0x7dd
  github.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:642 +0x948
  github.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine()
      /go/src/github.com/tendermint/tendermint/consensus/state.go:655 +0x7dd
```

* Fix failing test

* Delete profile.out

* fix data races
2019-02-18 11:29:41 +04:00
Anton Kaliaev
0b0a8b3128 cs/wal: refuse to encode msg that is bigger than maxMsgSizeBytes (#3303)
Earlier this week somebody posted this in GoS Riot chat:

```
E[2019-02-12|10:38:37.596] Corrupted entry. Skipping... module=consensus wal=/home/gaia/.gaiad/data/cs.wal/wal err="DataCorruptionError[length 878916964 exceeded maximum possible value of 1048576 bytes]"
E[2019-02-12|10:38:37.596] Corrupted entry. Skipping... module=consensus wal=/home/gaia/.gaiad/data/cs.wal/wal err="DataCorruptionError[length 825701731 exceeded maximum possible value of 1048576 bytes]"
E[2019-02-12|10:38:37.596] Corrupted entry. Skipping... module=consensus wal=/home/gaia/.gaiad/data/cs.wal/wal err="DataCorruptionError[length 1631073634 exceeded maximum possible value of 1048576 bytes]"
E[2019-02-12|10:38:37.596] Corrupted entry. Skipping... module=consensus wal=/home/gaia/.gaiad/data/cs.wal/wal err="DataCorruptionError[length 912418148 exceeded maximum possible value of 1048576 bytes]"
E[2019-02-12|10:38:37.600] Corrupted entry. Skipping... module=consensus wal=/home/gaia/.gaiad/data/cs.wal/wal err="DataCorruptionError[failed to read data: EOF]"
E[2019-02-12|10:38:37.600] Error on catchup replay. Proceeding to start ConsensusState anyway module=consensus err="Cannot replay height 7242. WAL does not contain #ENDHEIGHT for 7241"
E[2019-02-12|10:38:37.861] Error dialing peer module=p2p err="dial tcp 35.183.126.181:26656: i/o timeout
```

Note the length error messages. What has happened is the length field got corrupted probably. I've looked at the code and noticed that we don't check the msg size during encoding. This PR fixes that. It also improves a few error messages in WALDecoder.
2019-02-18 11:23:06 +04:00
Ethan Buchman
dc6567c677 consensus: flush wal on stop (#3297)
Should fix #3295
Also partial fix of #3043
2019-02-12 09:32:40 +04:00
Anton Kaliaev
7fd51e6ade make govet linter pass (#3292)
* make govet linter pass

Refs #3262

* close PipeReader and check for err
2019-02-11 16:31:34 +04:00
Ethan Buchman
4f2ef36701 types.NewCommit (#3275)
* types.NewCommit

* use types.NewCommit everywhere

* fix log in unsafe_reset

* memoize height and round in constructor

* notes about deprecating toVote

* bring back memoizeHeightRound
2019-02-08 18:40:41 -05:00
Ismail Khoffi
87bdc42bf8 Reject blocks with committed evidence (#37)
* evidence: NewEvidencePool takes evidenceDB

* evidence: failing TestStoreCommitDuplicate

tendermint/security#35

* GetEvidence -> GetEvidenceInfo

* fix TestStoreCommitDuplicate

* comment in VerifyEvidence

* add check if evidence was already seen
 - modify EventPool interface (EventStore is not known in ApplyBlock):
   - add IsCommitted method to iface
 - add test

* update changelog

* fix TestStoreMark:
 - priority in evidence info gets reset to zero after evidence gets committed

* review comments: simplify EvidencePool.IsCommitted
 - delete obsolete EvidenceStore.IsCommitted

* add simple test for IsCommitted

* update changelog: this is actually breaking (PR number still missing)

* fix TestStoreMark:
 - priority in evidence info gets reset to zero after evidence gets
 committed

* review suggestion: simplify return
2019-02-08 18:30:45 -05:00
Anton Kaliaev
fcebdf6720 Merge pull request #3261 from tendermint/anton/circle
Revert "quick fix for CircleCI (#2279)"
2019-02-07 10:28:22 +04:00
Ethan Buchman
45b70ae031 fix non deterministic test failures and race in privval socket (#3258)
* node: decrease retry conn timeout in test

Should fix #3256

The retry timeout was set to the default, which is the same as the
accept timeout, so it's no wonder this would fail. Here we decrease the
retry timeout so we can try many times before the accept timeout.

* p2p: increase handshake timeout in test

This fails sometimes, presumably because the handshake timeout is so low
(only 50ms). So increase it to 1s. Should fix #3187

* privval: fix race with ping. closes #3237

Pings happen in a go-routine and can happen concurrently with other
messages. Since we use a request/response protocol, we expect to send a
request and get back the corresponding response. But with pings
happening concurrently, this assumption could be violated. We were using
a mutex, but only a RWMutex, where the RLock was being held for sending
messages - this was to allow the underlying connection to be replaced if
it fails. Turns out we actually need to use a full lock (not just a read
lock) to prevent multiple requests from happening concurrently.

* node: fix test name. DelayedStop -> DelayedStart

* autofile: Wait() method

In the TestWALTruncate in consensus/wal_test.go we remove the WAL
directory at the end of the test. However the wal.Stop() does not
properly wait for the autofile group to finish shutting down. Hence it
was possible that the group's go-routine is still running when the
cleanup happens, which causes a panic since the directory disappeared.
Here we add a Wait() method to properly wait until the go-routine exits
so we can safely clean up. This fixes #2852.
2019-02-06 10:24:43 -05:00
Anton Kaliaev
1a35895ac8 remove gotype linter
WARN [runner/nolint] Found unknown linters in //nolint directives: gotype
2019-02-06 18:26:56 +04:00
Anton Kaliaev
6941d1bb35 use nolint label instead of commenting 2019-02-06 18:21:32 +04:00
Anton Kaliaev
ffd3bf8448 remove or comment out unused code 2019-02-06 15:16:38 +04:00
Ethan Buchman
1809efa350 Introduce CommitSig alias for Vote in Commit (#3245)
* types: memoize height/round in commit instead of first vote

* types: commit.ValidateBasic in VerifyCommit

* types: new CommitSig alias for Vote

In preparation for reducing the redundancy in Commits, we introduce the
CommitSig as an alias for Vote. This is non-breaking on the protocol,
and minor breaking on the Go API, as Commit now contains a list of
CommitSig instead of Vote.

* remove dependence on ToVote

* update some comments

* fix tests

* fix tests

* fixes from review
2019-02-04 13:01:59 -05:00
Ethan Buchman
39eba4e154 WAL: better errors and new fail point (#3246)
* privval: more info in errors

* wal: change Debug logs to Info

* wal: log and return error on corrupted wal instead of panicing

* fail: Exit right away instead of sending interupt

* consensus: FAIL before handling our own vote

allows to replicate #3089:
- run using `FAIL_TEST_INDEX=0`
- delete some bytes from the end of the WAL
- start normally

Results in logs like:

```
I[2019-02-03|18:12:58.225] Searching for height module=consensus wal=/Users/ethanbuchman/.tendermint/data/cs.wal/wal height=1 min=0 max=0
E[2019-02-03|18:12:58.225] Error on catchup replay. Proceeding to start ConsensusState anyway module=consensus err="failed to read data: EOF"
I[2019-02-03|18:12:58.225] Started node module=main nodeInfo="{ProtocolVersion:{P2P:6 Block:9 App:1} ID_:35e87e93f2e31f305b65a5517fd2102331b56002 ListenAddr:tcp://0.0.0.0:26656 Network:test-chain-J8JvJH Version:0.29.1 Channels:4020212223303800 Moniker:Ethans-MacBook-Pro.local Other:{TxIndex:on RPCAddress:tcp://0.0.0.0:26657}}"
E[2019-02-03|18:12:58.226] Couldn't connect to any seeds module=p2p
I[2019-02-03|18:12:59.229] Timed out module=consensus dur=998.568ms height=1 round=0 step=RoundStepNewHeight
I[2019-02-03|18:12:59.230] enterNewRound(1/0). Current: 1/0/RoundStepNewHeight module=consensus height=1 round=0
I[2019-02-03|18:12:59.230] enterPropose(1/0). Current: 1/0/RoundStepNewRound module=consensus height=1 round=0
I[2019-02-03|18:12:59.230] enterPropose: Our turn to propose module=consensus height=1 round=0 proposer=AD278B7767B05D7FBEB76207024C650988FA77D5 privValidator="PrivValidator{AD278B7767B05D7FBEB76207024C650988FA77D5 LH:1, LR:0, LS:2}"
E[2019-02-03|18:12:59.230] enterPropose: Error signing proposal module=consensus height=1 round=0 err="Error signing proposal: Step regression at height 1 round 0. Got 1, last step 2"
I[2019-02-03|18:13:02.233] Timed out module=consensus dur=3s height=1 round=0 step=RoundStepPropose
I[2019-02-03|18:13:02.233] enterPrevote(1/0). Current: 1/0/RoundStepPropose module=consensus
I[2019-02-03|18:13:02.233] enterPrevote: ProposalBlock is nil module=consensus height=1 round=0
E[2019-02-03|18:13:02.234] Error signing vote module=consensus height=1 round=0 vote="Vote{0:AD278B7767B0 1/00/1(Prevote) 000000000000 000000000000 @ 2019-02-04T02:13:02.233897Z}" err="Error signing vote: Conflicting data"
```

Notice the EOF, the step regression, and the conflicting data.

* wal: change errors to be DataCorruptionError

* exit on corrupt WAL

* fix log

* fix new line
2019-02-04 13:00:06 -05:00
Anton Kaliaev
d470945503 update gometalinter to 3.0.0 (#3233)
in the attempt to fix https://circleci.com/gh/tendermint/tendermint/43165

also

    code is simplified by running gofmt -s .
    remove unused vars
    enable linters we're currently passing
    remove deprecated linters
2019-01-30 12:24:26 +04:00
Thane Thomson
a335caaedb alias amino imports (#3219)
As per conversation here: https://github.com/tendermint/tendermint/pull/3218#discussion_r251364041

This is the result of running the following code on the repo:

```bash
find . -name '*.go' | grep -v 'vendor/' | xargs -n 1 goimports -w
```
2019-01-28 16:13:17 +04:00
cong
71e5939441 start eventBus & indexerService before replay and use them while replaying blocks (#3194)
so if we did not index the last block (because of panic or smth else), we index it during replay

Closes #3186
2019-01-28 11:36:35 +04:00
Zarko Milosevic
c20fbed2f7 [WIP] fix halting issue (#3197)
fix halting issue
2019-01-24 09:33:47 -05:00
Ethan Buchman
d17969e378 Merge pull request #3154 from tendermint/master
Merge master back to develop
2019-01-18 12:39:21 -05:00
Ethan Buchman
8fd8f800d0 Bucky/fix evidence halt (#34)
* consensus: createProposalBlock function

* blockExecutor.CreateProposalBlock

- factored out of consensus pkg into a method on blockExec
- new private interfaces for mempool ("txNotifier") and evpool with one function each
- consensus tests still require more mempool methods

* failing test for CreateProposalBlock

* Fix bug in include evidece into block

* evidence: change maxBytes to maxSize

* MaxEvidencePerBlock

- changed to return both the max number and the max bytes
- preparation for #2590

* changelog

* fix linter

* Fix from review

Co-Authored-By: ebuchman <ethan@coinculture.info>
2019-01-17 21:46:40 -05:00
Anton Kaliaev
dcb8f88525 add "chain_id" label for all metrics (#3123)
* add "chain_id" label for all metrics

Refs #3082

* fix labels extraction
2019-01-15 12:16:33 -05:00
srmo
616c3a4bae cs: prettify logging of ignored votes (#3086)
Refs #3038
2019-01-06 12:00:12 +03:00
Ismail Khoffi
6a80412a01 Remove privval.GetAddress(), memoize pubkey (#2948)
privval: remove GetAddress(), memoize pubkey
2018-12-22 00:36:45 -05:00
yutianwu
41e2eeee9c R4R: Split immutable and mutable parts of priv_validator.json (#2870)
* split immutable and mutable parts of priv_validator.json

* fix bugs

* minor changes

* retrig test

* delete scripts/wire2amino.go

* fix test

* fixes from review

* privval: remove mtx

* rearrange priv_validator.go

* upgrade path

* write tests for the upgrade

* fix for unsafe_reset_all

* add test

* add reset test
2018-12-21 16:58:27 -05:00
JamesRay
f82a8ff73a During replay, when appHeight==0, only saveState if stateHeight is also 0 (#3006)
* optimize addProposalBlockPart

* optimize addProposalBlockPart

* if ProposalBlockParts and LockedBlockParts both exist,let LockedBlockParts overwrite ProposalBlockParts.

* fix tryAddBlock

* broadcast lockedBlockParts in higher priority

* when appHeight==0, it's better fetch genDoc than state.validators.

* not save state if replay from height 1

* only save state if replay from height 1 when stateHeight is also 1

* only save state if replay from height 1 when stateHeight is also 1

* only save state if replay from height 0 when stateHeight is also 0

* handshake info's response version only update when stateHeight==0

* save the handshake responseInfo appVersion
2018-12-15 14:33:30 -05:00
Leo Wang
2f64717bb5 return an error if validator set is empty in genesis file and after InitChain (#2971)
Fixes #2951
2018-12-07 12:30:58 +04:00
Ismail Khoffi
b30c34e713 rename Accum -> ProposerPriority: (#2932)
- rename fields, methods, comments, tests
2018-11-28 15:35:09 -05:00
Dev Ojha
4039276085 remove unnecessary "crypto" import alias (#2940) 2018-11-28 23:53:04 +04:00
srmo
e291fbbebe 2871 remove proposalHeartbeat infrastructure (#2874)
* 2871 remove proposalHeartbeat infrastructure

* 2871 add preliminary changelog entry
2018-11-28 08:52:34 -05:00
nagarajmanjunath
bef39f3346 Updated Marshal and unmarshal methods to make compatible with protobuf (#2918)
* upadtes in grpc Marshal and unmarshal

* update comments
2018-11-27 08:07:20 -05:00
Ethan Buchman
47a0669d12 Fix fast sync stack with wrong block #2457 (#2731)
* fix fastsync may stuck by a wrong block

* fixes from updates

* fixes from review

* Align spec with the changes

* fmt
2018-11-26 15:31:11 -05:00
JamesRay
fe3b97fd66 It's better read from genDoc than from state.validators when appHeight==0 in replay (#2893)
* optimize addProposalBlockPart

* optimize addProposalBlockPart

* if ProposalBlockParts and LockedBlockParts both exist,let LockedBlockParts overwrite ProposalBlockParts.

* fix tryAddBlock

* broadcast lockedBlockParts in higher priority

* when appHeight==0, it's better fetch genDoc than state.validators.
2018-11-26 08:03:08 -05:00
Anton Kaliaev
b487feba42 node: refactor privValidator ext client code & tests (#2895)
* update ConsensusState#OnStop comment

* consensus: set logger for WAL in tests

* refactor privValidator client code and tests

follow-up on https://github.com/tendermint/tendermint/pull/2866
2018-11-21 21:24:13 +04:00
srmo
1466a2cc9f #2815 do not broadcast heartbeat proposal when we are non-validator (#2819)
* #2815 do not broadcast heartbeat proposal when we are non-validator

* #2815 adding preliminary changelog entry

* #2815 cosmetics and added test

* #2815 missed a little detail

- it's enough to call getAddress() once here

* #2815 remove debug logging from tests

* #2815 OK. I seem to be doing something fundamentally wrong here

* #2815 next iteration of proposalHeartbeat tests

- try and use "ensure" pattern in common_test

* 2815 incorporate review comments
2018-11-17 15:23:39 -05:00
kevlubkcm
a676c71678 [R4R] Add proposer to NewRound event and proposal info to CompleteProposal event (#2767)
* add proposer info to EventCompleteProposal

* create separate EventData structure for CompleteProposal

* cant us rs.Proposal to get BlockID because it is not guaranteed to be set yet

* copying RoundState isnt helping us here

* add Step back to make compatible with original RoundState event. update changelog

* add NewRound event

* fix test

* remove unneeded RoundState

* put height round step into a struct

* pull out ValidatorInfo struct. add ensureProposal assert

* remove height-round-state sub-struct refactor

* minor fixes from review
2018-11-15 18:40:42 -05:00