Commit Graph

124 Commits

Author SHA1 Message Date
Sam Kleinman
436a38f876 p2p: track peers by address (#8841) 2022-06-23 10:03:10 -04:00
Ian Jungyong Um
2e11760fbe p2p: fix typo (#8836) 2022-06-22 09:30:11 +02:00
William Banfield
8860e027a8 p2p: more dial routines (#8827)
The dial routines perform network i/o, which is a blocking call into the kernel. These routines are completely unable to do anything else while the dial occurs, so for most of their lifecycle they are sitting idle waiting for the tcp stack to hand them data. We should increase this value by _a lot_ to enable more concurrent dials. This is unlikely to cause CPU starvation because these routines sit idle most of the time. The current value causes dials to occur _way_ too slowly. 

Below is a graph demonstrating the before and after of this change in a testnetwork with many dead peers. You can observe that the rate that we connect to new, valid peers, is _much_ higher than previously. Change was deployed around the 31 minute mark on the graph.

![image](https://user-images.githubusercontent.com/4561443/174919007-50e4453a-edd8-41d0-97ee-dea8853d57f7.png)
2022-06-22 00:51:09 +00:00
Sam Kleinman
cfd13825e2 p2p: add eviction metrics and cleanup dialing error handling (#8819) 2022-06-21 20:44:14 +00:00
Sam Kleinman
28d3239958 p2p: wake dialing thread after sleep (#8803) 2022-06-20 15:47:56 +00:00
Ian Jungyong Um
e3e162ff10 p2p: fix typo (#8793)
Fix the typo in router
2022-06-19 15:26:25 +00:00
Sam Kleinman
4d820ff4f5 p2p: peer score should not wrap around (#8790) 2022-06-17 22:27:38 +00:00
Sam Kleinman
0ac03468d8 p2p: track peers stored on startup (#8787) 2022-06-17 18:10:10 +00:00
Sam Kleinman
9e5b13725d p2p: peer store and dialing changes (#8737) 2022-06-17 08:02:10 -04:00
Sam Kleinman
51b3f111dc p2p: fix mconn transport accept test (#8762)
Fix minor test incongruency missed earlier.
2022-06-14 23:48:48 +00:00
Sam Kleinman
979a6a1b13 p2p: accept should not abort on first error (#8759) 2022-06-14 19:12:53 -04:00
Sam Kleinman
bf1cb89bb7 Revert "p2p: self-add node should not error (tendermint#8753)" (#8757) 2022-06-14 20:55:10 +00:00
Sam Kleinman
7971f4a2fc p2p: self-add node should not error (#8753) 2022-06-14 16:45:05 +00:00
Sam Kleinman
666d93338a p2p: shed peers from store from other networks (#8678) 2022-06-02 11:14:25 -04:00
M. J. Fromberger
a988cefe5d Update generated mocks after #8607. (#8612) 2022-05-25 15:48:56 +00:00
Sam Kleinman
d59a53be01 p2p: reduce ability of SendError to disconnect peers (#8597) 2022-05-24 11:19:32 -04:00
Sam Kleinman
2897b75853 p2p: remove unused get height methods (#8569) 2022-05-17 10:56:26 -04:00
William Banfield
92811b9153 metrics: transition all metrics to using metricsgen generated constructors. (#8488)
## What does this change do?

This pull request completes the change to the `metricsgen` metrics. It adds `go generate` directives to all of the files containing the `Metrics` structs.

Using the outputs of `metricsdiff` between these generated metrics and `master`, we can see that there is not a diff between the two sets of metrics when run locally.
```
[william@sidewinder] tendermint[wb/metrics-gen-transition]:. ◆ ./scripts/metricsgen/metricsdiff/metricsdiff metrics_master metrics_generated
[william@sidewinder] tendermint[wb/metrics-gen-transition]:. ◆
```

This change also adds parsing for a `metrics:` key in a field comment. If a comment line begins with `//metrics:` the rest of the line is interpreted to be the metric help text. Additionally, a bug where lists of labels were not properly quoted in the `metricsgen` rendered output was fixed.
2022-05-12 18:39:12 +00:00
Sam Kleinman
37287ead94 p2p: remove message type from channel implementation (#8452) 2022-05-02 10:52:57 -04:00
Sam Kleinman
cf2a00b398 p2p: avoid using p2p.Channel internals (#8444) 2022-04-29 17:21:36 -04:00
Sam Kleinman
2a58ea3ab2 p2p: use nodeinfo less often (#8427) 2022-04-27 21:13:38 +00:00
Sam Kleinman
8670678291 p2p: remove support for multiple transports and endpoints (#8420) 2022-04-27 14:29:19 -04:00
M. J. Fromberger
001449d536 Remove obsolete build tagged patch for net.Pipe. (#8399)
The p2p/conn library was using a build patch to work around an old issue with
the net.Conn type that has not existed since Go 1.10. Remove the workaround and
update usage to use the standard net.Pipe directly.
2022-04-23 09:57:42 -07:00
Sam Kleinman
b5e6cf50d1 abci: Application should return errors errors and nilable response objects (#8396) 2022-04-22 20:40:42 -04:00
Sam Kleinman
8345dc4f7c abci: application type should take contexts (#8388) 2022-04-22 10:58:01 -04:00
Sam Kleinman
efd4f4a40b cleanup: unused parameters (#8372) 2022-04-18 16:45:21 -04:00
Sam Kleinman
889341152a p2p: fix setting in con-tracker (#8370) 2022-04-18 15:49:11 -04:00
M. J. Fromberger
14f41ac5e3 Fix more broken Markdown links. (#8271) 2022-04-07 00:15:20 -07:00
Sam Kleinman
d153388446 p2p: inject nodeinfo into router (#8261) 2022-04-06 14:02:07 -04:00
Sam Kleinman
9d1e8eaad4 node: remove channel and peer update initialization from construction (#8238) 2022-04-05 13:26:53 +00:00
Sam Kleinman
97f7021712 statesync: merge channel processing (#8240) 2022-04-04 12:31:15 -04:00
Sam Kleinman
0bded371c5 testing: logger cleanup (#8153)
This contains two major changes:

- Remove the legacy test logging method, and just explicitly call the
  noop logger. This is just to make the test logging behavior more
  coherent and clear. 
  
- Move the logging in the light package from the testing.T logger to
  the noop logger. It's really the case that we very rarely need/want
  to consider test logs unless we're doing reproductions and running a
  narrow set of tests.
  
In most cases, I (for one) prefer to run in verbose mode so I can
watch progress of tests, but I basically never need to consider
logs. If I do want to see logs, then I can edit in the testing.T
logger locally (which is what you have to do today, anyway.)
2022-03-18 17:39:38 +00:00
JayT106
4400b0f6d3 p2p: adjust max non-persistent peer score (#8137)
Guarantee persistent peers have the highest connecting priority. 
The peerStore.Ranked returns an arbitrary order of peers with the same scores.
2022-03-17 14:30:45 -07:00
M. J. Fromberger
658a7661c5 p2p: remove unnecessary panic handling in PEX reactor (#8110)
The message handling in this reactor is all under control of the reactor
itself, and does not call out to callbacks or other externally-supplied code.
It doesn't need to check for panics.

- Remove an irrelevant channel ID check.
- Remove an unnecessary panic recovery wrapper.
2022-03-11 08:23:03 -08:00
M. J. Fromberger
89b4321af2 p2p: update polling interval calculation for PEX requests (#8106)
The PEX reactor has a simple feedback control mechanism to decide how often to
poll peers for peer address updates. The idea is to poll more frequently when
knowledge of the network is less, and decrease frequency as knowledge grows.

This change solves two problems:

1. It is possible in some cases we may poll a peer "too often" and get dropped
   by that peer for spamming.

2. The first successful peer update with any content resets the polling timer
   to a very long time (10m), meaning if we are unlucky in getting an
   incomplete reply while the network is small, we may not try again for a very
   long time. This may contribute to difficulties bootstrapping sync.

The main change here is to only update the interval when new information is
added to the system, and not (as before) whenever a request is sent out to a
peer. The rate computation is essentially the same as before, although the code
has been a bit simplified, and I consolidated some of the error handling so
that we don't have to check in multiple places for the same conditions.

Related changes:

- Improve error diagnostics for too-soon and overflow conditions.
- Clean up state handling in the poll interval computation.
- Pin the minimum interval avert a chance of PEX spamming a peer.
2022-03-11 07:49:33 -08:00
JayT106
d9c9675e2a p2p+flowrate: rate control refactor (#7828)
Adding `CurrentTransferRate ` in the flowrate package because only the status of the transfer rate has been used.
2022-03-10 13:48:23 +00:00
M. J. Fromberger
2df5c85a8d Fix govet errors for %w use in test errors. (#8083)
The %w syntax is a fmt.Errorf thing, not supported by the testing package.
2022-03-07 18:17:37 +00:00
M. J. Fromberger
a22942504c p2p: re-enable tests previously disabled (#8049) 2022-03-01 12:25:11 -08:00
Sam Kleinman
58dc172611 p2p: plumb rudamentary service discovery to rectors and update statesync (#8030)
This is a little coarse, but the idea is that we'll send information
about the channels a peer has upon the peer-up event that we send to
reactors that we can then use to reject peers (if neeeded) from reactors.

This solves the problem where statesync would hang in test networks
(and presumably real) where we would attempt to statesync from seed
nodes, thereby hanging silently forever.
2022-02-28 20:02:54 +00:00
Sam Kleinman
a153f82433 p2p: ignore transport close error during cleanup (#8011)
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2022-02-25 14:48:41 -05:00
Sam Kleinman
89dbebd1c5 p2p: retry failed connections slightly more aggressively (#8010)
* p2p: retry failed connections slightly more aggressively

* fix dial interval test
2022-02-25 18:05:29 +00:00
Sam Kleinman
c8ae5db50e p2p: relax pong timeout (#8007) 2022-02-25 10:58:29 -05:00
Sam Kleinman
c85e3e4ba8 p2p: mconn track last message for pongs (#7995)
* p2p: mconn track last message for pongs

* fix spell

* cr feedback

* test fix part one

* cleanup tests

* fix comment

Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
2022-02-25 15:15:13 +00:00
Sam Kleinman
9968f53c15 p2p: make mconn transport test less flaky (#7973)
The previous implementation of the *test* was flaky, and this irons
out some of those problems. The primary assertion that was failing
(less than 1% of the time) was an error on close that I think we
shouldn't care about.
2022-02-23 19:06:03 +00:00
Sam Kleinman
be83ec6664 p2p: pass start time to flowrate and cleanup constructors (#7838)
After poking around #7828, I saw the oppertunity for this cleanup,
which I think is both reasonable on its own, and quite low impact, and
removes the math around process start time.
2022-02-18 13:30:19 +00:00
Sam Kleinman
28d34d635c service: change stop interface (#7816) 2022-02-17 11:23:32 -05:00
Sergio Mena
d3548eb706 Completed the existing FinalizeBlock PR and rebased to master (#7798)
* Rebased and git-squashed the commits in PR #6546

migrate abci to finalizeBlock

work on abci, proxy and mempool

abciresponse, blok events, indexer, some tests

fix some tests

fix errors

fix errors in abci

fix tests amd errors

* Fixes after rebasing PR#6546

* Restored height to RequestFinalizeBlock & other

* Fixed more UTs

* Fixed kvstore

* More UT fixes

* last TC fixed

* make format

* Update internal/consensus/mempool_test.go

Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>

* Addressed @williambanfield's comments

* Fixed UTs

* Addressed last comments from @williambanfield

* make format

Co-authored-by: marbar3778 <marbar3778@yahoo.com>
Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
2022-02-14 23:41:28 +01:00
Sam Kleinman
824960c565 libs/service: regularize Stop semantics and concurrency primitives (#7809) 2022-02-14 08:28:29 -05:00
Sam Kleinman
dbb7d6ecdd sync+p2p: remove closer (#7805) 2022-02-11 15:12:25 -05:00
Gui
ebbc3f02f5 p2p: always advertise self, to enable mutual address discovery (#7594)
Fixes #7593
2022-01-19 21:39:59 +00:00