mirror of
https://github.com/tendermint/tendermint.git
synced 2026-01-05 04:55:18 +00:00
inspect: add inspect mode for debugging crashed tendermint node (#6785)
EDIT: Updated, see [comment below](https://github.com/tendermint/tendermint/pull/6785#issuecomment-897793175)

This change adds a sketch of the `Debug` mode.

This change adds a `Debug` struct to the node package. This `Debug` struct is intended to be created and started by a command in the `cmd` directory. The `Debug` struct runs the RPC server on the data directories: both the state store and the block store.

This change required a good deal of refactoring. Namely, a new `rpc.go` file was added to the `node` package. This file encapsulates functions for starting RPC servers used by nodes. A potential additional change is to further factor this code into shared code _in_ the `rpc` package.

Minor API tweaks were also made that seemed appropriate, such as the mechanism for fetching routes from the `rpc/core` package.

Additional work is required to register the `Debug` service as a command in the `cmd` directory, but I am looking for feedback on whether this direction seems appropriate before diving much further.

closes: #5908
@@ -62,3 +62,30 @@ given destination directory. Each archive will contain:

Note: goroutine.out and heap.out will only be written if a profile address is
provided and is operational. This command is blocking and will log any error.

## Tendermint Inspect

Tendermint includes an `inspect` command for querying Tendermint's state store and block
store over Tendermint RPC.

When the Tendermint consensus engine detects inconsistent state, it will crash the
entire Tendermint process.
While in this inconsistent state, a node running Tendermint's consensus engine will not start up.
The `inspect` command runs only a subset of Tendermint's RPC endpoints for querying the block store
and state store.
`inspect` allows operators to query a read-only view of the state.
`inspect` does not run the consensus engine at all and can therefore be used to debug
processes that have crashed due to inconsistent state.

To start the `inspect` process, run
```bash
tendermint inspect
```

### RPC endpoints
The list of available RPC endpoints can be found by making a request to the RPC port.
For an `inspect` process running on `127.0.0.1:26657`, navigate your browser to
`http://127.0.0.1:26657/` to retrieve the list of enabled RPC endpoints.

Additional information on the Tendermint RPC endpoints can be found in the [rpc documentation](https://docs.tendermint.com/master/rpc).

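As a rough illustration of querying an `inspect` process from a script rather than a browser, the Python sketch below builds URI-over-HTTP request URLs against the address used above. The helper names `endpoint_url` and `query` are hypothetical. The `block` route and its `height` parameter are taken from the general Tendermint RPC, and whether a given `inspect` process enables them is an assumption, so check the list served at the root URL first.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://127.0.0.1:26657"  # address used in the text above


def endpoint_url(route, base_url=BASE_URL, **params):
    """Build a request URL for an RPC route (hypothetical helper)."""
    url = f"{base_url.rstrip('/')}/{route.lstrip('/')}"
    if params:
        # Query parameters are passed as ordinary URL query arguments.
        url += "?" + urllib.parse.urlencode(params)
    return url


def query(route, base_url=BASE_URL, timeout=5.0, **params):
    """GET an RPC route and decode the JSON body.

    Requires a running inspect process at base_url.
    """
    url = endpoint_url(route, base_url, **params)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Assumption: `block` is among the enabled routes and accepts `height`.
    print(endpoint_url("block", height=1))
```

Only `query` touches the network; `endpoint_url` can be used on its own to check which URL would be requested.
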
@@ -64,13 +64,42 @@ It won’t kill the node, but it will gather all of the above data and package i

At this point, depending on how severe the degradation is, you may want to restart the process.

## Tendermint Inspect

What if the Tendermint node will not start up due to inconsistent consensus state?

When a node running the Tendermint consensus engine detects an inconsistent state,
it will crash the entire Tendermint process.
The Tendermint consensus engine cannot be run in this inconsistent state, so the node
will fail to start up.
The Tendermint RPC server can provide valuable information for debugging in this situation.
The Tendermint `inspect` command will run a subset of the Tendermint RPC server
that is useful for debugging inconsistent state.

### Running inspect

Start up the `inspect` tool on the machine where Tendermint crashed using:
```bash
tendermint inspect --home=</path/to/app.d>
```

`inspect` will use the data directory specified in your Tendermint configuration file.
`inspect` will also run the RPC server at the address specified in your Tendermint configuration file.

### Using inspect

With the `inspect` server running, you can access RPC endpoints that are critically important
for debugging.
Calling the `/status`, `/consensus_state` and `/dump_consensus_state` RPC endpoints
will return useful information about the Tendermint consensus state.

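The endpoints named above can also be called programmatically. The following is a minimal Python sketch that sends JSON-RPC 2.0 requests over HTTP POST, which the Tendermint RPC server accepts. The helper names are hypothetical, and the URL in the usage note assumes the server listens on `127.0.0.1:26657`; the address in your configuration file may differ.

```python
import json
import urllib.request


def build_request(method, params=None, request_id=1):
    """Build a JSON-RPC 2.0 request body for the given RPC method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": method,
        "params": params or {},
    }


def unwrap(response):
    """Return the `result` member of a JSON-RPC response, raising on an error."""
    if response.get("error") is not None:
        raise RuntimeError(f"RPC error: {response['error']}")
    return response["result"]


def call(url, method, params=None, timeout=5.0):
    """POST a JSON-RPC request to a running inspect process and return its result."""
    body = json.dumps(build_request(method, params)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return unwrap(json.load(resp))
```

With an `inspect` process running, `call("http://127.0.0.1:26657", "status")` would return the decoded result of the `/status` endpoint; the same pattern applies to `consensus_state` and `dump_consensus_state`.
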
## Outro

-We’re hoping that the `tendermint debug` subcommand will become de facto the first response to any accidents.
+We’re hoping that these Tendermint tools will become de facto the first response for any accidents.

-Let us know what your experience has been so far! Have you had a chance to try `tendermint debug` yet?
+Let us know what your experience has been so far! Have you had a chance to try `tendermint debug` or `tendermint inspect` yet?

-Join our chat, where we discuss the current issues and future improvements.
+Join our [discord chat](https://discord.gg/vcExX9T), where we discuss the current issues and future improvements.

—