diff --git a/CHANGELOG.md b/CHANGELOG.md index 0216a533b..b4a022dcc 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -230,6 +230,26 @@ Special thanks to external contributors on this release: @JayT106, - [cmd/tendermint/commands] [\#6623](https://github.com/tendermint/tendermint/pull/6623) replace `$HOME/.some/test/dir` with `t.TempDir` (@tanyabouman) - [statesync] \6807 Implement P2P state provider as an alternative to RPC (@cmwaters) +## v0.34.18 + +### BREAKING CHANGES + +- CLI/RPC/Config + - [cli] [\#8258](https://github.com/tendermint/tendermint/pull/8258) Fix a bug in the cli that caused `unsafe-reset-all` to panic + +## v0.34.17 + +### BREAKING CHANGES + +- CLI/RPC/Config + + - [cli] [\#8081](https://github.com/tendermint/tendermint/issues/8081) make the reset command safe to use (@marbar3778). + +### BUG FIXES + +- [consensus] [\#8079](https://github.com/tendermint/tendermint/issues/8079) start the timeout ticker before relay (backport #7844) (@creachadair). +- [consensus] [\#7992](https://github.com/tendermint/tendermint/issues/7992) [\#7994](https://github.com/tendermint/tendermint/issues/7994) change lock handling in handleMsg and reactor to alleviate issues gossiping during long ABCI calls (@williambanfield). + ## v0.34.16 Special thanks to external contributors on this release: @yihuang @@ -1953,7 +1973,7 @@ more details. - [rpc] [\#3269](https://github.com/tendermint/tendermint/issues/2826) Limit number of unique clientIDs with open subscriptions. Configurable via `rpc.max_subscription_clients` - [rpc] [\#3269](https://github.com/tendermint/tendermint/issues/2826) Limit number of unique queries a given client can subscribe to at once. Configurable via `rpc.max_subscriptions_per_client`. - [rpc] [\#3435](https://github.com/tendermint/tendermint/issues/3435) Default ReadTimeout and WriteTimeout changed to 10s. WriteTimeout can increased by setting `rpc.timeout_broadcast_tx_commit` in the config. 
- - [rpc/client] [\#3269](https://github.com/tendermint/tendermint/issues/3269) Update `EventsClient` interface to reflect new pubsub/eventBus API [ADR-33](https://github.com/tendermint/tendermint/blob/develop/docs/architecture/adr-033-pubsub.md). This includes `Subscribe`, `Unsubscribe`, and `UnsubscribeAll` methods. + - [rpc/client] [\#3269](https://github.com/tendermint/tendermint/issues/3269) Update `EventsClient` interface to reflect new pubsub/eventBus API [ADR-33](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-033-pubsub.md). This includes `Subscribe`, `Unsubscribe`, and `UnsubscribeAll` methods. * Apps - [abci] [\#3403](https://github.com/tendermint/tendermint/issues/3403) Remove `time_iota_ms` from BlockParams. This is a @@ -2006,7 +2026,7 @@ more details. - [blockchain] [\#3358](https://github.com/tendermint/tendermint/pull/3358) Fix timer leak in `BlockPool` (@guagualvcha) - [cmd] [\#3408](https://github.com/tendermint/tendermint/issues/3408) Fix `testnet` command's panic when creating non-validator configs (using `--n` flag) (@srmo) - [libs/db/remotedb/grpcdb] [\#3402](https://github.com/tendermint/tendermint/issues/3402) Close Iterator/ReverseIterator after use -- [libs/pubsub] [\#951](https://github.com/tendermint/tendermint/issues/951), [\#1880](https://github.com/tendermint/tendermint/issues/1880) Use non-blocking send when dispatching messages [ADR-33](https://github.com/tendermint/tendermint/blob/develop/docs/architecture/adr-033-pubsub.md) +- [libs/pubsub] [\#951](https://github.com/tendermint/tendermint/issues/951), [\#1880](https://github.com/tendermint/tendermint/issues/1880) Use non-blocking send when dispatching messages [ADR-33](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-033-pubsub.md) - [lite] [\#3364](https://github.com/tendermint/tendermint/issues/3364) Fix `/validators` and `/abci_query` proxy endpoints (@guagualvcha) - [p2p/conn] 
[\#3347](https://github.com/tendermint/tendermint/issues/3347) Reject all-zero shared secrets in the Diffie-Hellman step of secret-connection @@ -2710,7 +2730,7 @@ Special thanks to external contributors on this release: This release is mostly about the ConsensusParams - removing fields and enforcing MaxGas. It also addresses some issues found via security audit, removes various unused functions from `libs/common`, and implements -[ADR-012](https://github.com/tendermint/tendermint/blob/develop/docs/architecture/adr-012-peer-transport.md). +[ADR-012](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-012-peer-transport.md). BREAKING CHANGES: @@ -2775,7 +2795,7 @@ are affected by a change. A few more breaking changes are in the works - each will come with a clear Architecture Decision Record (ADR) explaining the change. You can review ADRs -[here](https://github.com/tendermint/tendermint/tree/develop/docs/architecture) +[here](https://github.com/tendermint/tendermint/tree/master/docs/architecture) or in the [open Pull Requests](https://github.com/tendermint/tendermint/pulls). You can also check in on the [issues marked as breaking](https://github.com/tendermint/tendermint/issues?q=is%3Aopen+is%3Aissue+label%3Abreaking). 
@@ -2791,7 +2811,7 @@ BREAKING CHANGES: - [abci] Added address of the original proposer of the block to Header - [abci] Change ABCI Header to match Tendermint exactly - [abci] [\#2159](https://github.com/tendermint/tendermint/issues/2159) Update use of `Validator` (see - [ADR-018](https://github.com/tendermint/tendermint/blob/develop/docs/architecture/adr-018-ABCI-Validators.md)): + [ADR-018](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-018-ABCI-Validators.md)): - Remove PubKey from `Validator` (so it's just Address and Power) - Introduce `ValidatorUpdate` (with just PubKey and Power) - InitChain and EndBlock use ValidatorUpdate @@ -2813,7 +2833,7 @@ BREAKING CHANGES: - [state] [\#1815](https://github.com/tendermint/tendermint/issues/1815) Validator set changes are now delayed by one block (!) - Add NextValidatorSet to State, changes on-disk representation of state - [state] [\#2184](https://github.com/tendermint/tendermint/issues/2184) Enforce ConsensusParams.BlockSize.MaxBytes (See - [ADR-020](https://github.com/tendermint/tendermint/blob/develop/docs/architecture/adr-020-block-size.md)). + [ADR-020](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-020-block-size.md)). 
- Remove ConsensusParams.BlockSize.MaxTxs - Introduce maximum sizes for all components of a block, including ChainID - [types] Updates to the block Header: @@ -2824,7 +2844,7 @@ BREAKING CHANGES: - [consensus] [\#2203](https://github.com/tendermint/tendermint/issues/2203) Implement BFT time - Timestamp in block must be monotonic and equal the median of timestamps in block's LastCommit - [crypto] [\#2239](https://github.com/tendermint/tendermint/issues/2239) Secp256k1 signature changes (See - [ADR-014](https://github.com/tendermint/tendermint/blob/develop/docs/architecture/adr-014-secp-malleability.md)): + [ADR-014](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-014-secp-malleability.md)): - format changed from DER to `r || s`, both little endian encoded as 32 bytes. - malleability removed by requiring `s` to be in canonical form. @@ -3054,7 +3074,7 @@ BREAKING CHANGES: FEATURES - [cmd] Added metrics (served under `/metrics` using a Prometheus client; disabled by default). See the new `instrumentation` section in the config and - [metrics](https://tendermint.readthedocs.io/projects/tools/en/develop/metrics.html) + [metrics](https://github.com/tendermint/tendermint/blob/master/docs/nodes/metrics.md) guide. - [p2p] Add IPv6 support to peering. - [p2p] Add `external_address` to config to allow specifying the address for @@ -3168,7 +3188,7 @@ BREAKING: FEATURES -- [rpc] the RPC documentation is now published to https://tendermint.github.io/slate +- [rpc] the RPC documentation is now published to `https://tendermint.github.io/slate` - [p2p] AllowDuplicateIP config option to refuse connections from same IP. 
- true by default for now, false by default in next breaking release - [docs] Add docs for query, tx indexing, events, pubsub diff --git a/DOCKER/README.md b/DOCKER/README.md index 63c682811..4aa868e7a 100644 --- a/DOCKER/README.md +++ b/DOCKER/README.md @@ -8,7 +8,7 @@ Official releases can be found [here](https://github.com/tendermint/tendermint/r The Dockerfile for tendermint is not expected to change in the near future. The master file used for all builds can be found [here](https://raw.githubusercontent.com/tendermint/tendermint/master/DOCKER/Dockerfile). -Respective versioned files can be found (replace the Xs with the version number). +Respective versioned files can be found at `https://raw.githubusercontent.com/tendermint/tendermint/vX.XX.XX/DOCKER/Dockerfile` (replace the Xs with the version number). ## Quick reference diff --git a/UPGRADING.md b/UPGRADING.md index 6ae381b7b..931272cce 100644 --- a/UPGRADING.md +++ b/UPGRADING.md @@ -100,7 +100,7 @@ these parameters may do so by setting the `ConsensusParams.Timeout` field of the As a safety measure in case of unusual timing issues during the upgrade to v0.36, an operator may override the consensus timeout values for a single node. Note, however, that these overrides will be removed in Tendermint v0.37. See -[configuration](https://github.com/tendermint/tendermint/blob/wb/issue-8182/docs/nodes/configuration.md) +[configuration](https://github.com/tendermint/tendermint/blob/master/docs/nodes/configuration.md) for more information about these overrides. For more discussion of this, see [ADR 074](https://tinyurl.com/adr074), which diff --git a/crypto/README.md b/crypto/README.md index 20346d715..d60628d97 100644 --- a/crypto/README.md +++ b/crypto/README.md @@ -12,7 +12,7 @@ For any specific algorithm, use its specific module e.g. ## Binary encoding -For Binary encoding, please refer to the [Tendermint encoding specification](https://docs.tendermint.com/master/spec/blockchain/encoding.html). 
+For Binary encoding, please refer to the [Tendermint encoding specification](https://docs.tendermint.com/master/spec/core/encoding.html). ## JSON Encoding diff --git a/internal/consensus/replay.go b/internal/consensus/replay.go index 177b9fbad..3b2dea930 100644 --- a/internal/consensus/replay.go +++ b/internal/consensus/replay.go @@ -220,7 +220,6 @@ func NewHandshaker( eventBus *eventbus.EventBus, genDoc *types.GenesisDoc, ) *Handshaker { - return &Handshaker{ stateStore: stateStore, initialState: state, @@ -228,7 +227,6 @@ func NewHandshaker( eventBus: eventBus, genDoc: genDoc, logger: logger, - nBlocks: 0, } } @@ -359,7 +357,9 @@ func (h *Handshaker) ReplayBlocks( // First handle edge cases and constraints on the storeBlockHeight and storeBlockBase. switch { case storeBlockHeight == 0: - assertAppHashEqualsOneFromState(appHash, state) + if err := checkAppHashEqualsOneFromState(appHash, state); err != nil { + return nil, err + } return appHash, nil case appBlockHeight == 0 && state.InitialHeight < storeBlockBase: @@ -376,11 +376,11 @@ func (h *Handshaker) ReplayBlocks( case storeBlockHeight < stateBlockHeight: // the state should never be ahead of the store (this is under tendermint's control) - panic(fmt.Sprintf("StateBlockHeight (%d) > StoreBlockHeight (%d)", stateBlockHeight, storeBlockHeight)) + return nil, fmt.Errorf("StateBlockHeight (%d) > StoreBlockHeight (%d)", stateBlockHeight, storeBlockHeight) case storeBlockHeight > stateBlockHeight+1: // store should be at most one ahead of the state (this is under tendermint's control) - panic(fmt.Sprintf("StoreBlockHeight (%d) > StateBlockHeight + 1 (%d)", storeBlockHeight, stateBlockHeight+1)) + return nil, fmt.Errorf("StoreBlockHeight (%d) > StateBlockHeight + 1 (%d)", storeBlockHeight, stateBlockHeight+1) } var err error @@ -395,7 +395,9 @@ func (h *Handshaker) ReplayBlocks( } else if appBlockHeight == storeBlockHeight { // We're good! 
- assertAppHashEqualsOneFromState(appHash, state) + if err := checkAppHashEqualsOneFromState(appHash, state); err != nil { + return nil, err + } return appHash, nil } @@ -415,7 +417,11 @@ func (h *Handshaker) ReplayBlocks( // but we'd have to allow the WAL to replay a block that wrote it's #ENDHEIGHT h.logger.Info("Replay last block using real app") state, err = h.replayBlock(ctx, state, storeBlockHeight, appClient) - return state.AppHash, err + if err != nil { + return nil, err + + } + return state.AppHash, nil case appBlockHeight == storeBlockHeight: // We ran Commit, but didn't save the state, so replayBlock with mock app. @@ -437,13 +443,13 @@ func (h *Handshaker) ReplayBlocks( return nil, err } - return state.AppHash, err + return state.AppHash, nil } } - panic(fmt.Sprintf("uncovered case! appHeight: %d, storeHeight: %d, stateHeight: %d", - appBlockHeight, storeBlockHeight, stateBlockHeight)) + return nil, fmt.Errorf("uncovered case! appHeight: %d, storeHeight: %d, stateHeight: %d", + appBlockHeight, storeBlockHeight, stateBlockHeight) } func (h *Handshaker) replayBlocks( @@ -452,7 +458,8 @@ func (h *Handshaker) replayBlocks( appClient abciclient.Client, appBlockHeight, storeBlockHeight int64, - mutateState bool) ([]byte, error) { + mutateState bool, +) ([]byte, error) { // App is further behind than it should be, so we need to replay blocks. // We replay all blocks from appBlockHeight+1. // @@ -478,7 +485,9 @@ func (h *Handshaker) replayBlocks( block := h.store.LoadBlock(i) // Extra check to ensure the app was not changed in a way it shouldn't have. 
if len(appHash) > 0 { - assertAppHashEqualsOneFromBlock(appHash, block) + if err := checkAppHashEqualsOneFromBlock(appHash, block); err != nil { + return nil, err + } } if i == finalBlock && !mutateState { @@ -510,7 +519,9 @@ func (h *Handshaker) replayBlocks( appHash = state.AppHash } - assertAppHashEqualsOneFromState(appHash, state) + if err := checkAppHashEqualsOneFromState(appHash, state); err != nil { + return nil, err + } return appHash, nil } @@ -539,24 +550,25 @@ func (h *Handshaker) replayBlock( return state, nil } -func assertAppHashEqualsOneFromBlock(appHash []byte, block *types.Block) { +func checkAppHashEqualsOneFromBlock(appHash []byte, block *types.Block) error { if !bytes.Equal(appHash, block.AppHash) { - panic(fmt.Sprintf(`block.AppHash does not match AppHash after replay. Got %X, expected %X. + return fmt.Errorf(`block.AppHash does not match AppHash after replay. Got '%X', expected '%X'. -Block: %v -`, - appHash, block.AppHash, block)) +Block: %v`, + appHash, block.AppHash, block) } + return nil } -func assertAppHashEqualsOneFromState(appHash []byte, state sm.State) { +func checkAppHashEqualsOneFromState(appHash []byte, state sm.State) error { if !bytes.Equal(appHash, state.AppHash) { - panic(fmt.Sprintf(`state.AppHash does not match AppHash after replay. Got -%X, expected %X. + return fmt.Errorf(`state.AppHash does not match AppHash after replay. Got '%X', expected '%X'. State: %v Did you reset Tendermint without resetting your application's data?`, - appHash, state.AppHash, state)) + appHash, state.AppHash, state) } + + return nil } diff --git a/internal/consensus/replay_test.go b/internal/consensus/replay_test.go index 468d912ac..6d9e82a05 100644 --- a/internal/consensus/replay_test.go +++ b/internal/consensus/replay_test.go @@ -944,7 +944,7 @@ func buildTMStateFromChain( return state } -func TestHandshakePanicsIfAppReturnsWrongAppHash(t *testing.T) { +func TestHandshakeErrorsIfAppReturnsWrongAppHash(t *testing.T) { // 1. 
Initialize tendermint and commit 3 blocks with the following app hashes: // - 0x01 // - 0x02 @@ -988,12 +988,8 @@ func TestHandshakePanicsIfAppReturnsWrongAppHash(t *testing.T) { require.NoError(t, err) t.Cleanup(func() { cancel(); proxyApp.Wait() }) - assert.Panics(t, func() { - h := NewHandshaker(logger, stateStore, state, store, eventBus, genDoc) - if err = h.Handshake(ctx, proxyApp); err != nil { - t.Log(err) - } - }) + h := NewHandshaker(logger, stateStore, state, store, eventBus, genDoc) + assert.Error(t, h.Handshake(ctx, proxyApp)) } // 3. Tendermint must panic if app returns wrong hash for the last block @@ -1008,12 +1004,8 @@ func TestHandshakePanicsIfAppReturnsWrongAppHash(t *testing.T) { require.NoError(t, err) t.Cleanup(func() { cancel(); proxyApp.Wait() }) - assert.Panics(t, func() { - h := NewHandshaker(logger, stateStore, state, store, eventBus, genDoc) - if err = h.Handshake(ctx, proxyApp); err != nil { - t.Log(err) - } - }) + h := NewHandshaker(logger, stateStore, state, store, eventBus, genDoc) + require.Error(t, h.Handshake(ctx, proxyApp)) } } diff --git a/internal/consensus/state.go b/internal/consensus/state.go index f5d2f32e4..5588a5b70 100644 --- a/internal/consensus/state.go +++ b/internal/consensus/state.go @@ -124,6 +124,7 @@ type State struct { stateStore sm.Store initialStatePopulated bool + skipBootstrapping bool // create and execute blocks blockExec *sm.BlockExecutor @@ -185,6 +186,12 @@ type State struct { // StateOption sets an optional parameter on the State. type StateOption func(*State) +// SkipStateStoreBootstrap is a state option that forces the constructor to +// skip state bootstrapping during construction. +func SkipStateStoreBootstrap(sm *State) { + sm.skipBootstrapping = true +} + // NewState returns a new State. 
func NewState( ctx context.Context, @@ -223,16 +230,21 @@ func NewState( cs.doPrevote = cs.defaultDoPrevote cs.setProposal = cs.defaultSetProposal - if err := cs.updateStateFromStore(ctx); err != nil { - return nil, err - } - // NOTE: we do not call scheduleRound0 yet, we do that upon Start() cs.BaseService = *service.NewBaseService(logger, "State", cs) for _, option := range options { option(cs) } + // this is not ideal, but it lets the consensus tests start + // node-fragments gracefully while letting the nodes + // themselves avoid this. + if !cs.skipBootstrapping { + if err := cs.updateStateFromStore(ctx); err != nil { + return nil, err + } + } + return cs, nil } diff --git a/internal/p2p/README.md b/internal/p2p/README.md index 9ba7303fa..16ad1d5f6 100644 --- a/internal/p2p/README.md +++ b/internal/p2p/README.md @@ -7,5 +7,5 @@ Docs: - [Connection](https://docs.tendermint.com/master/spec/p2p/connection.html) for details on how connections and multiplexing work - [Peer](https://docs.tendermint.com/master/spec/p2p/node.html) for details on peer ID, handshakes, and peer exchange - [Node](https://docs.tendermint.com/master/spec/p2p/node.html) for details about different types of nodes and how they should work -- [Pex](https://docs.tendermint.com/master/spec/reactors/pex/pex.html) for details on peer discovery and exchange +- [Pex](https://docs.tendermint.com/master/spec/p2p/messages/pex.html) for details on peer discovery and exchange - [Config](https://docs.tendermint.com/master/spec/p2p/config.html) for details on some config option diff --git a/internal/statesync/reactor_test.go b/internal/statesync/reactor_test.go index f59a6e4ee..cef0735f2 100644 --- a/internal/statesync/reactor_test.go +++ b/internal/statesync/reactor_test.go @@ -196,11 +196,11 @@ func setup( } func TestReactor_Sync(t *testing.T) { - ctx, cancel := context.WithCancel(context.Background()) + ctx, cancel := context.WithTimeout(context.Background(), 2*time.Minute) defer cancel() const 
snapshotHeight = 7 - rts := setup(ctx, t, nil, nil, 2) + rts := setup(ctx, t, nil, nil, 100) chain := buildLightBlockChain(ctx, t, 1, 10, time.Now()) // app accepts any snapshot rts.conn.On("OfferSnapshot", ctx, mock.AnythingOfType("types.RequestOfferSnapshot")). @@ -224,8 +224,7 @@ func TestReactor_Sync(t *testing.T) { closeCh := make(chan struct{}) defer close(closeCh) - go handleLightBlockRequests(ctx, t, chain, rts.blockOutCh, - rts.blockInCh, closeCh, 0) + go handleLightBlockRequests(ctx, t, chain, rts.blockOutCh, rts.blockInCh, closeCh, 0) go graduallyAddPeers(ctx, t, rts.peerUpdateCh, closeCh, 1*time.Second) go handleSnapshotRequests(ctx, t, rts.snapshotOutCh, rts.snapshotInCh, closeCh, []snapshot{ { diff --git a/networks/local/README.md b/networks/local/README.md index dcb31ae71..10fc19932 100644 --- a/networks/local/README.md +++ b/networks/local/README.md @@ -1,3 +1,3 @@ # Local Cluster with Docker Compose -See the [docs](https://docs.tendermint.com/master/networks/docker-compose.html). +See the [docs](https://docs.tendermint.com/master/tools/docker-compose.html). 
diff --git a/node/node.go b/node/node.go index 9b608d6f0..3ee75cfcf 100644 --- a/node/node.go +++ b/node/node.go @@ -61,17 +61,18 @@ type nodeImpl struct { // services eventSinks []indexer.EventSink + initialState sm.State stateStore sm.Store blockStore *store.BlockStore // store the blockchain to disk evPool *evidence.Pool stateSync bool // whether the node should state sync on startup stateSyncReactor *statesync.Reactor // for hosting and restoring state sync snapshots - - services []service.Service - rpcListeners []net.Listener // rpc servers - shutdownOps closer - rpcEnv *rpccore.Environment - prometheusSrv *http.Server + indexerService *indexer.Service + services []service.Service + rpcListeners []net.Listener // rpc servers + shutdownOps closer + rpcEnv *rpccore.Environment + prometheusSrv *http.Server } // newDefaultNode returns a Tendermint node with default settings for the @@ -157,20 +158,8 @@ func makeNode( nodeMetrics := defaultMetricsProvider(cfg.Instrumentation)(genDoc.ChainID) - // Create the proxyApp and establish connections to the ABCI app (consensus, mempool, query). proxyApp := proxy.New(client, logger.With("module", "proxy"), nodeMetrics.proxy) - if err := proxyApp.Start(ctx); err != nil { - return nil, fmt.Errorf("error starting proxy app connections: %w", err) - } - - // EventBus and IndexerService must be started before the handshake because - // we might need to index the txs of the replayed block as this might not have happened - // when the node stopped last time (i.e. 
the node stopped or crashed after it saved the block - // but before it indexed the txs) eventBus := eventbus.NewDefault(logger.With("module", "events")) - if err := eventBus.Start(ctx); err != nil { - return nil, combineCloseError(err, makeCloser(closers)) - } var eventLog *eventlog.Log if w := cfg.RPC.EventLogWindowSize; w > 0 { @@ -185,13 +174,11 @@ func makeNode( } } - indexerService, eventSinks, err := createAndStartIndexerService( - ctx, cfg, dbProvider, eventBus, - logger, genDoc.ChainID, nodeMetrics.indexer) + indexerService, eventSinks, err := createIndexerService( + cfg, dbProvider, eventBus, logger, genDoc.ChainID, nodeMetrics.indexer) if err != nil { return nil, combineCloseError(err, makeCloser(closers)) } - closers = append(closers, func() error { indexerService.Stop(); return nil }) privValidator, err := createPrivval(ctx, logger, cfg, genDoc, filePrivval) if err != nil { @@ -213,34 +200,6 @@ func makeNode( } } - // Create the handshaker, which calls RequestInfo, sets the AppVersion on the state, - // and replays any blocks as necessary to sync tendermint with the app. - if err := consensus.NewHandshaker( - logger.With("module", "handshaker"), - stateStore, state, blockStore, eventBus, genDoc, - ).Handshake(ctx, proxyApp); err != nil { - return nil, combineCloseError(err, makeCloser(closers)) - } - - // Reload the state. It will have the Version.Consensus.App set by the - // Handshake, and may have other modifications as well (ie. depending on - // what happened during block replay). - state, err = stateStore.Load() - if err != nil { - return nil, combineCloseError( - fmt.Errorf("cannot load state: %w", err), - makeCloser(closers)) - } - - logNodeStartupInfo(state, pubKey, logger, cfg.Mode) - - // TODO: Fetch and provide real options and do proper p2p bootstrapping. - // TODO: Use a persistent peer database. 
- nodeInfo, err := makeNodeInfo(cfg, nodeKey, eventSinks, genDoc, state.Version.Consensus) - if err != nil { - return nil, combineCloseError(err, makeCloser(closers)) - } - peerManager, peerCloser, err := createPeerManager(cfg, dbProvider, nodeKey.ID) closers = append(closers, peerCloser) if err != nil { @@ -257,15 +216,15 @@ func makeNode( privValidator: privValidator, peerManager: peerManager, - nodeInfo: nodeInfo, nodeKey: nodeKey, - eventSinks: eventSinks, + eventSinks: eventSinks, + indexerService: indexerService, + services: []service.Service{eventBus}, - services: []service.Service{eventBus}, - - stateStore: stateStore, - blockStore: blockStore, + initialState: state, + stateStore: stateStore, + blockStore: blockStore, shutdownOps: makeCloser(closers), @@ -408,6 +367,48 @@ func makeNode( // OnStart starts the Node. It implements service.Service. func (n *nodeImpl) OnStart(ctx context.Context) error { + if err := n.rpcEnv.ProxyApp.Start(ctx); err != nil { + return fmt.Errorf("error starting proxy app connections: %w", err) + } + + // EventBus and IndexerService must be started before the handshake because + // we might need to index the txs of the replayed block as this might not have happened + // when the node stopped last time (i.e. the node stopped or crashed after it saved the block + // but before it indexed the txs) + if err := n.rpcEnv.EventBus.Start(ctx); err != nil { + return err + } + + if err := n.indexerService.Start(ctx); err != nil { + return err + } + + // Create the handshaker, which calls RequestInfo, sets the AppVersion on the state, + // and replays any blocks as necessary to sync tendermint with the app. + if err := consensus.NewHandshaker(n.logger.With("module", "handshaker"), + n.stateStore, n.initialState, n.blockStore, n.rpcEnv.EventBus, n.genesisDoc, + ).Handshake(ctx, n.rpcEnv.ProxyApp); err != nil { + return err + } + + // Reload the state. 
It will have the Version.Consensus.App set by the + // Handshake, and may have other modifications as well (i.e., depending on + // what happened during block replay). + state, err := n.stateStore.Load() + if err != nil { + return fmt.Errorf("cannot load state: %w", err) + } + + logNodeStartupInfo(state, n.rpcEnv.PubKey, n.logger, n.config.Mode) + + // TODO: Fetch and provide real options and do proper p2p bootstrapping. + // TODO: Use a persistent peer database. + n.nodeInfo, err = makeNodeInfo(n.config, n.nodeKey, n.eventSinks, n.genesisDoc, state.Version.Consensus) + if err != nil { + return err + } + // Start Internal Services + if n.config.RPC.PprofListenAddress != "" { rpcCtx, rpcCancel := context.WithCancel(ctx) srv := &http.Server{Addr: n.config.RPC.PprofListenAddress, Handler: nil} @@ -445,7 +446,7 @@ func (n *nodeImpl) OnStart(ctx context.Context) error { } } - state, err := n.stateStore.Load() + state, err = n.stateStore.Load() if err != nil { return err } diff --git a/node/node_test.go b/node/node_test.go index 165013883..7306d18a5 100644 --- a/node/node_test.go +++ b/node/node_test.go @@ -62,12 +62,13 @@ func TestNodeStartStop(t *testing.T) { require.NoError(t, n.Start(ctx)) // wait for the node to produce a block - tctx, cancel := context.WithTimeout(ctx, time.Second) + tctx, cancel := context.WithTimeout(ctx, 10*time.Second) defer cancel() blocksSub, err := n.EventBus().SubscribeWithArgs(tctx, pubsub.SubscribeArgs{ ClientID: "node_test", Query: types.EventQueryNewBlock, + Limit: 1000, }) require.NoError(t, err) _, err = blocksSub.Next(tctx) @@ -138,6 +139,8 @@ func TestNodeSetAppVersion(t *testing.T) { // create node n := getTestNode(ctx, t, cfg, logger) + require.NoError(t, n.Start(ctx)) + // default config uses the kvstore app appVersion := kvstore.ProtocolVersion @@ -630,7 +633,7 @@ func TestNodeSetEventSink(t *testing.T) { genDoc, err := types.GenesisDocFromFile(cfg.GenesisFile()) require.NoError(t, err) - indexService, eventSinks, err := 
createAndStartIndexerService(ctx, cfg, + indexService, eventSinks, err := createIndexerService(cfg, config.DefaultDBProvider, eventBus, logger, genDoc.ChainID, indexer.NopMetrics()) require.NoError(t, err) diff --git a/node/setup.go b/node/setup.go index e87fac79c..1057fb6fa 100644 --- a/node/setup.go +++ b/node/setup.go @@ -95,8 +95,7 @@ func initDBs( return blockStore, stateDB, makeCloser(closers), nil } -func createAndStartIndexerService( - ctx context.Context, +func createIndexerService( cfg *config.Config, dbProvider config.DBProvider, eventBus *eventbus.EventBus, @@ -116,10 +115,6 @@ func createAndStartIndexerService( Metrics: metrics, }) - if err := indexerService.Start(ctx); err != nil { - return nil, nil, err - } - return indexerService, eventSinks, nil } @@ -264,6 +259,7 @@ func createConsensusReactor( evidencePool, eventBus, consensus.StateMetrics(csMetrics), + consensus.SkipStateStoreBootstrap, ) if err != nil { return nil, nil, err diff --git a/spec/p2p/messages/pex.md b/spec/p2p/messages/pex.md index ea5986f0d..e02393d52 100644 --- a/spec/p2p/messages/pex.md +++ b/spec/p2p/messages/pex.md @@ -26,7 +26,7 @@ PexResponse is an list of net addresses provided to a peer to dial. | Name | Type | Description | Field Number | |-------|------------------------------------|------------------------------------------|--------------| -| addresses | repeated [PexAddress](#PexAddress) | List of peer addresses available to dial | 1 | +| addresses | repeated [PexAddress](#pexaddress) | List of peer addresses available to dial | 1 | ### PexAddress @@ -41,7 +41,7 @@ into a `NodeAddress`. See [ParseNodeAddress](https://github.com/tendermint/tende Message is a [`oneof` protobuf type](https://developers.google.com/protocol-buffers/docs/proto#oneof). The one of consists of two messages. 
-| Name | Type | Description | Field Number | -|--------------|---------------------------|------------------------------------------------------|--------------| -| pex_request | [PexRequest](#PexRequest) | Empty request asking for a list of addresses to dial | 3 | -| pex_response | [PexResponse](#PexResponse) | List of addresses to dial | 4 | +| Name | Type | Description | Field Number | +|--------------|-----------------------------|------------------------------------------------------|--------------| +| pex_request | [PexRequest](#pexrequest) | Empty request asking for a list of addresses to dial | 3 | +| pex_response | [PexResponse](#pexresponse) | List of addresses to dial | 4 | diff --git a/test/e2e/README.md b/test/e2e/README.md index 00bce5ad8..70510b6fa 100644 --- a/test/e2e/README.md +++ b/test/e2e/README.md @@ -11,7 +11,7 @@ This creates and runs a testnet named `ci` under `networks/ci/`. ## Conceptual Overview -End-to-end testnets are used to test Tendermint functionality as a user would use it, by spinning up a set of nodes with various configurations and making sure the nodes and network behave correctly. The background for the E2E test suite is outlined in [RFC-001](https://github.com/tendermint/tendermint/blob/master/docs/rfc/rfc-001-end-to-end-testing.md). +End-to-end testnets are used to test Tendermint functionality as a user would use it, by spinning up a set of nodes with various configurations and making sure the nodes and network behave correctly. The background for the E2E test suite is outlined in [RFC-001](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-066-e2e-testing.md). The end-to-end tests can be thought of in this manner: @@ -180,4 +180,4 @@ tendermint start ./build/node ./node.socket.toml ``` -Check `node/config.go` to see how the settings of the test application can be tweaked. \ No newline at end of file +Check `node/config.go` to see how the settings of the test application can be tweaked.