docs: write about debug kill and dump (#4516)

* docs: write about debug kill and dump

Closes #4325

* wrap file tree in code blocks
This commit is contained in:
Anton Kaliaev
2020-03-03 12:37:29 +04:00
committed by GitHub
parent 3b2e1f22e4
commit a60d032b07
3 changed files with 94 additions and 38 deletions

View File

@@ -111,54 +111,44 @@ to achieve the same things.
## Debugging Tendermint
If you ever have to debug Tendermint, the first thing you should
probably do is to check out the logs. See [How to read
logs](./how-to-read-logs.md), where we explain what certain log
statements mean.
If you ever have to debug Tendermint, the first thing you should probably do is
check out the logs. See [How to read logs](./how-to-read-logs.md), where we
explain what certain log statements mean.
If, after skimming through the logs, things are not clear still, the
next thing to try is query the /status RPC endpoint. It provides the
necessary info: whenever the node is syncing or not, what height it is
on, etc.
If, after skimming through the logs, things are not clear still, the next thing
to try is querying the `/status` RPC endpoint. It provides the necessary info:
whenever the node is syncing or not, what height it is on, etc.
```
```sh
curl http(s)://{ip}:{rpcPort}/status
```
`dump_consensus_state` will give you a detailed overview of the
consensus state (proposer, lastest validators, peers states). From it,
you should be able to figure out why, for example, the network had
halted.
`/dump_consensus_state` will give you a detailed overview of the consensus
state (proposer, latest validators, peers states). From it, you should be able
to figure out why, for example, the network had halted.
```
```sh
curl http(s)://{ip}:{rpcPort}/dump_consensus_state
```
There is a reduced version of this endpoint - `consensus_state`, which
returns just the votes seen at the current height.
There is a reduced version of this endpoint - `/consensus_state`, which returns
just the votes seen at the current height.
- [Github Issues](https://github.com/tendermint/tendermint/issues)
- [StackOverflow
questions](https://stackoverflow.com/questions/tagged/tendermint)
If, after consulting with the logs and above endpoints, you still have no idea
what's happening, consider using `tendermint debug kill` sub-command. This
command will scrap all the available info and kill the process. See
[Debugging](../tools/debugging.md) for the exact format.
### Debug Utility
Tendermint also ships with a `debug` sub-command that allows you to kill a live
Tendermint process while collecting useful information in a compressed archive
such as the configuration used, consensus state, network state, the node' status,
the WAL, and even the stacktrace of the process before exit. These files can be
useful to examine when debugging a faulty Tendermint process.
In addition, the `debug` sub-command also allows you to dump debugging data into
compressed archives at a regular interval. These archives contain the goroutine
and heap profiles in addition to the consensus state, network info, node status,
and even the WAL.
You can inspect the resulting archive yourself or create an issue on
[Github](https://github.com/tendermint/tendermint). Before opening an issue
however, be sure to check if there's [no existing
issue](https://github.com/tendermint/tendermint/issues) already.
## Monitoring Tendermint
Each Tendermint instance has a standard `/health` RPC endpoint, which
responds with 200 (OK) if everything is fine and 500 (or no response) -
if something is wrong.
Each Tendermint instance has a standard `/health` RPC endpoint, which responds
with 200 (OK) if everything is fine and 500 (or no response) - if something is
wrong.
Other useful endpoints include mentioned earlier `/status`, `/net_info` and
`/validators`.
@@ -166,6 +156,10 @@ Other useful endpoints include mentioned earlier `/status`, `/net_info` and
Tendermint also can report and serve Prometheus metrics. See
[Metrics](./metrics.md).
`tendermint debug dump` sub-command can be used to periodically dump useful
information into an archive. See [Debugging](../tools/debugging.md) for more
information.
## What happens when my app dies?
You are supposed to run Tendermint under a [process