Since starting off as a wee validator, I've been mystified by the volume of p2p logspam, which often makes it impossible to monitor other tasks. Thus, routine p2p events, have been cast into the land of debug.
---
#### PR checklist
- [x] Tests written/updated, or no tests needed
- [x] `CHANGELOG_PENDING.md` updated, or no changelog entry needed
- [x] Updated relevant documentation (`docs/`) and code comments, or no
documentation updates needed
Co-authored-by: Jacob Gadikian <jacobgadikian@gmail.com>
Our documentation build process is rather finicky. I noticed yet another build failure that doesn't actually cause the docs site build to fail - see https://github.com/tendermint/tendermint/actions/runs/3578638807/jobs/6019002654
Basically the `docs/post.sh` script deletes the `.vuepress/public/rpc` directory after generating the docs for a specific version of Tendermint, and then the `docs/pre.sh` script attempts to copy the next version's OpenAPI script into `.vuepress/public/rpc` but that directory doesn't exist.
For some reason this doesn't cause the npm build process to fail - it just results in the OpenAPI doc not being available.
This is hard to test locally because our docs build process currently relies on checking out specific branches to do the full docs site build: 4290ad2982/Makefile (L290-L300)
---
#### PR checklist
- [x] Tests written/updated, or no tests needed
- [x] `CHANGELOG_PENDING.md` updated, or no changelog entry needed
- [x] Updated relevant documentation (`docs/`) and code comments, or no
documentation updates needed
(cherry picked from commit 18d38dd409)
Co-authored-by: Thane Thomson <connect@thanethomson.com>
* consensus: correctly save reference to previous height precommits (#9760)
This change reduces the number of Precommit messages sent to peers by 50%.
During the `ApplyNewRoundStepMessage`, we update the known state of the peer sending us the message.
We set the value of `ps.PRS.Precommits` to nil in this method if the peer is entering a new height or round.
34ca3fb474/consensus/reactor.go (L1368)
We then assign `ps.PRS.LastCommit = ps.PRS.Precommits` if the peer is entering a new height only - this does not happen during just a round change. This therefore results in `ps.PRS.LastCommit` having the value `nil`.
When the `LastCommit` bit field is seen as `nil` in the reactor, an empty bit field is initialized.
34ca3fb474/consensus/reactor.go (L1273)
The code responsible for gossiping votes from the previous height uses this `LastCommit` value and, seeing an empty `LastCommit` will resend every `Precommit` from the previous height since it lost the information it previously had detailing which precommits from the previous height the peer had.
This can be seen in the code responsible for gossiping precommits from the previous height:
34ca3fb474/consensus/reactor.go (L773)
Where this code grabs the, previously `nil`, `LastCommit` bit field:
34ca3fb474/consensus/reactor.go (L1204-L1212)
Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
Co-authored-by: William Banfield <wbanfield@gmail.com>
* rpc: Add caching support (#9650)
* Set cache control in the HTTP-RPC response header
* Add a simply cache policy to the RPC routes
* add a condition to check the RPC request has default height settings
* fix cherry pick error
* update pending log
* use options struct intead of single parameter
* refacor FuncOptions to functional options
* add functional options in WebSocket RPC function
* revert doc
* replace deprecated function call
* revise functional options
* remove unuse comment
* fix revised error
* adjust cache-control settings
* Update rpc/jsonrpc/server/http_json_handler.go
Co-authored-by: Thane Thomson <connect@thanethomson.com>
* linter: Fix false positive
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* rpc: Separate cacheable and non-cacheable HTTP response writers
Allows us to roll this change out in a non-API-breaking way, since this
is an additive change.
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* rpc: Ensure consistent caching strategy
Ensure a consistent caching strategy across both JSONRPC- and URI-based
requests.
This requires a bit of a refactor of the previous caching logic, which
is complicated a little by the complex reflection-based approach taken
in the Tendermint RPC.
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* rpc: Add more tests for caching
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Update CHANGELOG_PENDING
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* light: Sync routes config with RPC core
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* rpc: Update OpenAPI docs
Signed-off-by: Thane Thomson <connect@thanethomson.com>
Signed-off-by: Thane Thomson <connect@thanethomson.com>
Co-authored-by: jayt106 <jaytseng106@gmail.com>
Co-authored-by: jay tseng <jay.tseng@crypto.com>
Co-authored-by: JayT106 <JayT106@users.noreply.github.com>
(cherry picked from commit 816c6bac00)
# Conflicts:
# CHANGELOG_PENDING.md
# light/proxy/routes.go
# rpc/core/routes.go
# rpc/openapi/openapi.yaml
# test/fuzz/tests/rpc_jsonrpc_server_test.go
* Fix conflict in CHANGELOG_PENDING
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Resolve remaining conflicts
Signed-off-by: Thane Thomson <connect@thanethomson.com>
Signed-off-by: Thane Thomson <connect@thanethomson.com>
Co-authored-by: Thane Thomson <connect@thanethomson.com>
* p2p: add a per-message type send and receive metric (#9622)
* p2p: ressurrect the p2p envelope and use to calculate message metric
Add new SendEnvelope, TrySendEnvelope, BroadcastEnvelope, and ReceiveEnvelope methods in the p2p package to work with the new envelope type.
Care was taken to ensure this was performed in a non-breaking manner.
Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
Co-authored-by: William Banfield <wbanfield@gmail.com>
* QA Process report for v0.37.x (and baseline for v0.34.x) (#9499)
* 1st version. 200 nodes. Missing rotating node
* Small fixes
* Addressed @jmalicevic's comment
* Explain in method how to set the tmint version to test. Improve result section
* 1st version of how to run the 'rotating node' testnet
* Apply suggestions from @williambanfield
Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
* Addressed @williambanfield's comments
* Added reference to Unix load metric
* Added total TXs
* Fixed some 'png's that got swapped. Excluded '.*-node-exporter' processes from memory plots
* Report for rotating node
* Adressed remaining comments from @williambanfield
* Cosmetic
* Addressed some of @thanethomson's comments
* Re-executed the 200 node tests and updated the corresponding sections of the report
* Ignore Python virtualenv directories
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Add latency vs throughput script
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Add README for latency vs throughput script
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Fix local links to folders
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* v034: only have one level-1 heading
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Adjust headings
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* v0.37.x: add links to issues/PRs
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* v0.37.x: add note about bug being present in v0.34
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* method: adjust heading depths
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Show data points on latency vs throughput plot
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Add latency vs throughput plots
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Correct mentioning of v0.34.21 and add heading
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Refactor latency vs throughput script
Update the latency vs throughput script to rather generate plots from
the "raw" CSV output from the loadtime reporting tool as opposed to the
separated CSV files from the experimental method.
Also update the relevant documentation, and regenerate the images from
the raw CSV data (resulting in pretty much the same plots as the
previous ones).
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Remove unused default duration const
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Adjust experiment start time to be more accurate and re-plot latency vs throughput
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Addressed @williambanfield's comments
* Apply suggestions from code review
Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
* scripts: Update latency vs throughput readme for clarity
Signed-off-by: Thane Thomson <connect@thanethomson.com>
Signed-off-by: Thane Thomson <connect@thanethomson.com>
Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
Co-authored-by: Thane Thomson <connect@thanethomson.com>
(cherry picked from commit b06e1cea54)
* Remove v037 dir
* Removed reference to v0.37 testnets
Co-authored-by: Sergio Mena <sergio@informal.systems>
* Added print
* Fix unmarshall
* Fix unmarshalling
* Simplified steps to unmarshall
* minor
* Use 'encoding/hex'
* Forget about C, this is Go!
* gosec warning
* Set maximum payload size
* nosec annotation
(cherry picked from commit b42c439776)
Co-authored-by: Sergio Mena <sergio@informal.systems>
* security/p2p: prevent peers who errored being added to the peer_set (#9500)
* Mark failed removal of peer to address security bug
Co-authored-by: Callum Waters <cmwaters19@gmail.com>
(cherry picked from commit c0bdb2423a)
* Changelong entry and added missing functions for implementations of Peer
Co-authored-by: Jasmina Malicevic <jasmina.dustinac@gmail.com>
* Extend the load report tool to include transactions' hashes (#9509)
* Add transaction hash to raw data
* Add hash in formatted output
* Cosmetic
(cherry picked from commit cdd3479f20)
# Conflicts:
# test/loadtime/cmd/report/main.go
* Resolve conflict
* Appease linter
Co-authored-by: Sergio Mena <sergio@informal.systems>
* loadtime: add block time to the data point (#9484)
This pull request adds the block time as the unix time since the epoch to the `report` tool's csv output.
```csv
...
a7a8b903-1136-4da1-97aa-d25da7b4094f,1614226790,1663707084905417366,4,200,1024
a7a8b903-1136-4da1-97aa-d25da7b4094f,1614196724,1663707084905417366,4,200,1024
a7a8b903-1136-4da1-97aa-d25da7b4094f,1613097336,1663707084905417366,4,200,1024
a7a8b903-1136-4da1-97aa-d25da7b4094f,1609365168,1663707084905417366,4,200,1024
a7a8b903-1136-4da1-97aa-d25da7b4094f,1617199169,1663707084905417366,4,200,1024
a7a8b903-1136-4da1-97aa-d25da7b4094f,1615197134,1663707084905417366,4,200,1024
a7a8b903-1136-4da1-97aa-d25da7b4094f,1610399447,1663707084905417366,4,200,1024
...
```
#### PR checklist
- [ ] Tests written/updated, or no tests needed
- [ ] `CHANGELOG_PENDING.md` updated, or no changelog entry needed
- [ ] Updated relevant documentation (`docs/`) and code comments, or no
documentation updates needed
(cherry picked from commit 5fe1a72416)
* lint fix
Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
Co-authored-by: William Banfield <wbanfield@gmail.com>
the `NewClient` method is called by the load test framework for each connection. This means that if multiple connections are instantiated, each connection will erroneously have its own UUID. This PR changes the UUID generation to happen at the _beginning_ of the script instead of on client creation so that each experimental run shares a UUID.
Caught while preparing the script for production readiness.
#### PR checklist
- [ ] Tests written/updated, or no tests needed
- [ ] `CHANGELOG_PENDING.md` updated, or no changelog entry needed
- [ ] Updated relevant documentation (`docs/`) and code comments, or no
documentation updates needed
(cherry picked from commit 59a711eabe)
Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
* add separated runs by UUID (#9367)
This _should_ be the last piece needed for this tool.
This allows the tool to generate reports on multiple experimental runs that may have been performed against the same chain.
The `load` tool has been updated to generate a `UUID` on startup to uniquely identify each experimental run. The `report` tool separates all of the results it reads by `UUID` and performs separate calculations for each discovered experiment.
Sample output is as follows
```
Experiment ID: 6bd7d1e8-d82c-4dbe-a1b3-40ab99e4fa30
Connections: 1
Rate: 1000
Size: 1024
Total Valid Tx: 9000
Total Negative Latencies: 0
Minimum Latency: 86.632837ms
Maximum Latency: 1.151089602s
Average Latency: 813.759361ms
Standard Deviation: 225.189977ms
Experiment ID: 453960af-6295-4282-aed6-367fc17c0de0
Connections: 1
Rate: 1000
Size: 1024
Total Valid Tx: 9000
Total Negative Latencies: 0
Minimum Latency: 79.312992ms
Maximum Latency: 1.162446243s
Average Latency: 422.755139ms
Standard Deviation: 241.832475ms
Total Invalid Tx: 0
```
closes: #9352
#### PR checklist
- [ ] Tests written/updated, or no tests needed
- [ ] `CHANGELOG_PENDING.md` updated, or no changelog entry needed
- [ ] Updated relevant documentation (`docs/`) and code comments, or no
documentation updates needed
(cherry picked from commit 1067ba1571)
# Conflicts:
# go.mod
* fix merge conflict
* fix lint
Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
Co-authored-by: William Banfield <wbanfield@gmail.com>
* ci: Remove "(WARNING: BETA SOFTWARE)" tagline from all upcoming releases (#9371)
This is by no means a signal that we offer any additional guarantees with our software. This warning seems somewhat pointless given that:
1. Our open source license clearly states that we offer no warranties with this software.
2. We are clearly still pre-1.0.
It also doesn't make sense to append "(WARNING: BETA SOFTWARE)" to pre-releases such as alpha releases, which are to be considered _more_ unstable than beta releases.
---
#### PR checklist
- [x] Tests written/updated, or no tests needed
- [x] `CHANGELOG_PENDING.md` updated, or no changelog entry needed
- [x] Updated relevant documentation (`docs/`) and code comments, or no
documentation updates needed
(cherry picked from commit d7645628f1)
# Conflicts:
# .goreleaser.yml
* Resolve conflicts
Signed-off-by: Thane Thomson <connect@thanethomson.com>
* Sync root docs with main
Signed-off-by: Thane Thomson <connect@thanethomson.com>
Signed-off-by: Thane Thomson <connect@thanethomson.com>
Co-authored-by: Thane Thomson <connect@thanethomson.com>
* test: add the loadtime report tool (#9351)
This pull request adds the report tool and modifies the loadtime libraries to better support its use.
(cherry picked from commit 8655080a0f)
* add nolint
Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
Co-authored-by: William Banfield <wbanfield@gmail.com>