Commit Graph

274 Commits

Author SHA1 Message Date
M. J. Fromberger
82738eb016 Move the libs/pubsub package to internal scope (#7451)
No API changes, merely changes the import path.
2021-12-15 07:09:32 -08:00
M. J. Fromberger
ab7da86b06 Internalize libs/sync. (#7450)
Inline the one usage of this library, and remove the lib.
2021-12-14 15:14:30 -08:00
M. J. Fromberger
da697089d0 Move libs/async to internal/libs/async. (#7449) 2021-12-14 14:05:42 -08:00
Sam Kleinman
2ff962a63a log: dissallow nil loggers (#7445) 2021-12-14 12:45:13 -05:00
Sam Kleinman
d0e03f01fc sync: remove special mutexes (#7438) 2021-12-13 13:35:32 -05:00
Sam Kleinman
65c0aaee5e p2p: use recieve for channel iteration (#7425) 2021-12-13 11:04:44 -05:00
dependabot[bot]
f8bf2cb912 build(deps): Bump github.com/adlio/schema from 1.1.15 to 1.2.2 (#7423)
* build(deps): Bump github.com/adlio/schema from 1.1.15 to 1.2.2

Bumps [github.com/adlio/schema](https://github.com/adlio/schema) from 1.1.15 to 1.2.2.
- [Release notes](https://github.com/adlio/schema/releases)
- [Commits](https://github.com/adlio/schema/compare/v1.1.15...v1.2.2)

---
updated-dependencies:
- dependency-name: github.com/adlio/schema
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Work around API changes in the migrator package.

A recent update inadvertently broke the API by changing the receiver types of
the methods without updating the constructor.

See: https://github.com/adlio/schema/issues/13

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
2021-12-10 10:30:04 -08:00
M. J. Fromberger
a925f4fa84 Fix a panic in the indexer service test. (#7424)
The test service was starting up without a logger and crashing
while trying to log.
2021-12-10 10:03:42 -08:00
Emmanuel T Odeke
3e92899bd9 internal/libs/protoio: optimize MarshalDelimited by plain byteslice allocations+sync.Pool (#7325)
Noticed in profiles that invoking *VoteSignBytes always created a
bytes.Buffer, then discarded it inside protoio.MarshalDelimited.
I dug further and examined the call paths and noticed that we
unconditionally create the bytes.Buffer, even though we might
have proto messages (in the common case) that implement
MarshalTo([]byte), and invoked varintWriter. Instead by inlining
this case, we skip a bunch of allocations and CPU cycles,
which then reflects properly on all calling functions. Here
are the benchmark results:

```shell
$ benchstat before.txt after.txt
name                                        old time/op    new time/op      delta
types.VoteSignBytes-8                       705ns ± 3%     573ns ± 6%       -18.74% (p=0.000 n=18+20)
types.CommitVoteSignBytes-8                 8.15µs ± 9%    6.81µs ± 4%      -16.51% (p=0.000 n=20+19)
protoio.MarshalDelimitedWithMarshalTo-8     788ns ± 8%     772ns ± 3%       -2.01%  (p=0.050 n=20+20)
protoio.MarshalDelimitedNoMarshalTo-8       989ns ± 4%     845ns ± 2%       -14.51% (p=0.000 n=20+18)

name                                        old alloc/op   new alloc/op    delta
types.VoteSignBytes-8                       792B ± 0%      600B ± 0%       -24.24%  (p=0.000 n=20+20)
types.CommitVoteSignBytes-8                 9.52kB ± 0%    7.60kB ± 0%     -20.17%  (p=0.000 n=20+20)
protoio.MarshalDelimitedNoMarshalTo-8       808B ± 0%      440B ± 0%       -45.54%  (p=0.000 n=20+20)

name                                        old allocs/op  new allocs/op   delta
types.VoteSignBytes-8                       13.0 ± 0%      10.0 ± 0%       -23.08%  (p=0.000 n=20+20)
types.CommitVoteSignBytes-8                 140 ± 0%       110 ± 0%        -21.43%  (p=0.000 n=20+20)
protoio.MarshalDelimitedNoMarshalTo-8       10.0 ± 0%      7.0 ± 0%        -30.00%  (p=0.000 n=20+20)
```

Thanks to Tharsis who tasked me to help them increase TPS and who
are keen on improving Tendermint and efficiency.
2021-12-10 09:36:43 -08:00
Sam Kleinman
bd6dc3ca88 p2p: refactor channel Send/out (#7414) 2021-12-09 14:03:41 -05:00
Sam Kleinman
867d406c6c state: pass connected context (#7410) 2021-12-08 11:09:08 -05:00
Sam Kleinman
9c21d4140b mempool: avoid arbitrary background contexts (#7409)
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-12-08 14:29:13 +00:00
Sam Kleinman
cb88bd3941 p2p: migrate to use new interface for channel errors (#7403)
* p2p: migrate to use new interface for channel errors

* Update internal/p2p/p2ptest/require.go

Co-authored-by: M. J. Fromberger <michael.j.fromberger@gmail.com>

* rename

* feedback

Co-authored-by: M. J. Fromberger <michael.j.fromberger@gmail.com>
2021-12-08 14:05:01 +00:00
Sam Kleinman
892f5d9524 service: cleanup mempool and peer update shutdown (#7401) 2021-12-08 08:44:32 -05:00
Sam Kleinman
0ff3d4b89d service: cleanup close channel in reactors (#7399) 2021-12-07 11:40:59 -05:00
Sam Kleinman
6b35cc1a47 p2p: remove unneeded close channels from p2p layer (#7392) 2021-12-07 10:40:07 -05:00
Sam Kleinman
a62ac27047 service: remove exported logger from base implemenation (#7381) 2021-12-06 10:16:42 -05:00
Sam Kleinman
4e355d80c4 p2p: implement interface for p2p.Channel without channels (#7378) 2021-12-03 15:19:04 -05:00
Sam Kleinman
5f9dd2e7f5 p2p/upnp: remove unused functionality (#7379) 2021-12-03 13:44:36 -05:00
Sam Kleinman
8a991e288c service: plumb contexts to all (most) threads (#7363)
This continues the push of plumbing contexts through tendermint. I
attempted to find all goroutines in the production code (non-test) and
made sure that these threads would exit when their contexts were
canceled, and I believe this PR does that.
2021-12-02 21:38:38 +00:00
Callum Waters
bca2080c01 cmd: add integration test and fix bug in rollback command (#7315) 2021-12-02 12:17:16 +01:00
Federico Kunze Küllmer
5f57d84dd3 rpc: implement header and header_by_hash queries (#7270) 2021-12-02 11:16:31 +01:00
Sam Kleinman
a823d167bc service: cleanup base implementation and some caller implementations (#7301) 2021-12-01 09:28:06 -05:00
Sam Kleinman
3749c37847 p2p: remove unused trust package (#7359) 2021-11-30 17:55:08 -05:00
M. J. Fromberger
76dea94a01 Remove now-unused nolint:lll directives. (#7356) 2021-11-30 21:32:21 +00:00
M. J. Fromberger
ab1788b922 Fix incorrect tests using the PSQL sink. (#7349)
Some of our tests were creating a psql event sink and expecting
it to report (or not report) certain kinds of errors. These tests
were ill-founded in a couple of ways:

1. Tests that required the Postgres driver were not loading it.
   This led to spurious successes on tests that wanted "some error"
   from the sink constructor, but didn't exercise the right path.

2. Tests that wanted a Postgres sink to succeed without a database.
   These tests "passed" because they weren't actually establishing a
   connection to the database, but if they had would have failed for
   the lack of one.

To fix this:
- Load the postgres driver in tests that need it.
- Verify connectivity before reporting successful creation of a PSQL event sink.
- Remove tests that wanted a psql sink without a database, since that case
  is already tested elsewhere.
2021-11-30 12:57:44 -08:00
Sam Kleinman
070445bc10 pex: improve goroutine lifecycle (#7343)
I saw a race detected in a test here that I think would be better
handled by just wiring up these threads.
2021-11-30 19:22:28 +00:00
Sam Kleinman
4af2dbd03b eventbus: plumb contexts (#7337)
* eventbus: plumb contexts

* fix lint
2021-11-30 14:24:11 +00:00
M. J. Fromberger
1dca1a8f97 Performance improvements for the event query API (#7319)
Rework the implementation of event query parsing and execution to
improve performance and reduce memory usage.

Previous memory and CPU profiles of the pubsub service showed query
processing as a significant hotspot. While we don't have evidence that
this is visibly hurting users, fixing it is fairly easy and self-contained.

Updates #6439.

Typical benchmark results comparing the original implementation (PEG) with the reworked implementation (Custom):
```
TEST                        TIME/OP  BYTES/OP  ALLOCS/OP  SPEEDUP   MEM SAVING
BenchmarkParsePEG-12       51716 ns  526832    27
BenchmarkParseCustom-12     2167 ns    4616    17         23.8x     99.1%
BenchmarkMatchPEG-12        3086 ns    1097    22
BenchmarkMatchCustom-12    294.2 ns      64     3         10.5x     94.1%
```

Components:
* Add a basic parsing benchmark.
* Move the original query implementation to a subdirectory.
* Add lexical scanner for Query expressions.
* Add a parser for Query expressions.
* Implement query compiler.
* Add test cases based on OpenAPI examples.
* Add MustCompile to replace the original MustParse, and update usage.
2021-11-29 13:08:48 -08:00
Sam Kleinman
7e58f02eb8 service: remove quit method (#7293) 2021-11-19 11:49:51 -05:00
Sam Kleinman
6ab62fe7b6 service: remove stop method and use contexts (#7292) 2021-11-18 17:56:21 -05:00
Sam Kleinman
d7606777cf libs/service: pass logger explicitly (#7288)
This is a very small change, but removes a method from the
`service.Service` interface (a win!) and forces callers to explicitly
pass loggers in to objects during construction rather than (later)
injecting them. There's not a real need for this kind of lazy
construction of loggers, and I think a decent potential for confusion
for mutable loggers.

The main concern I have is that this changes the constructor API for
ABCI clients. I think this is fine, and I suspect that as we plumb
contexts through, and make changes to the RPC services there'll be a
number of similar sorts of changes to various (quasi) public
interfaces, which I think we should welcome.
2021-11-16 16:20:56 +00:00
Sam Kleinman
2a455be46c libs/os: remove arbitrary os.Exit (#7284)
I think calling os.Exit at arbitrary points is _bad_ and is good to
delete. I think panics in the case of data courruption have a chance
of providing useful information.
2021-11-15 19:25:29 +00:00
Sam Kleinman
a15ae5b53a node+consensus: handshaker initialization (#7283)
This mostly just pushes more of initialization out of the node package.
2021-11-15 18:28:52 +00:00
Sam Kleinman
27560cf7a4 p2p: reduce peer score for dial failures (#7265)
When dialing fails to succeed we should reduce the score of the peer,
which puts the peer at (potentially) greater chances of being removed
from the peer manager, and reduces the chance of the peer being
gossiped by the PEX reactor.
2021-11-10 15:52:18 +00:00
William Banfield
4acd117b5e evidence: remove source of non-determinism from test (#7266)
The evidence test produces a set of mock evidence in the evidence pool of the 'Primary' node. The test then fills the evidence pools of secondaries with half of this mock evidence. Finally, the test waits until the secondary has an evidence pool as full as the primary. 

The assertions that are removed here were checking that the primary and secondaries' evidence channels were empty. However, nothing in the test actually ensures that the channels are empty. The test only waits for the secondaries to have received the complete set of evidence, and the secondaries already received half of the evidence at the beginning. It's more than possible that the secondaries can receive the complete set of evidence and not finish reading the duplicate evidence off the channels.
2021-11-09 20:48:01 +00:00
M. J. Fromberger
9dc3d7f9a2 Set a cap on the length of subscription queries. (#7263)
As a safety measure, don't allow a query string to be unreasonably
long. The query filter is not especially efficient, so a query that
needs more than basic detail should filter coarsely in the subscriber
and refine on the client side.

This affects Subscribe and TxSearch queries.
2021-11-09 08:35:51 -08:00
Callum Waters
b3b90f820c consensus: add some more checks to vote counting (#7253) 2021-11-09 10:51:19 +01:00
M. J. Fromberger
d5865af1f4 Add basic metrics to the indexer package. (#7250)
This follows the same model as we did in the p2p package.

Rework the indexer service constructor to take a struct of arguments,
that makes it easier to construct the optional settings.
Deprecate but do not remove the existing constructor.

Clean up node initialization a little bit.
2021-11-05 12:50:53 -07:00
M. J. Fromberger
54d7030510 pubsub: Move indexing out of the primary subscription path (#7231)
This is part of the work described by #7156.

Remove "unbuffered subscriptions" from the pubsub service.
Replace them with a dedicated blocking "observer" mechanism.
Use the observer mechanism for indexing.

Add a SubscribeWithArgs method and deprecate the old Subscribe
method. Remove SubscribeUnbuffered entirely (breaking).

Rework the Subscription interface to eliminate exposed channels.
Subscriptions now use a context to manage lifecycle notifications.

Internalize the eventbus package.
2021-11-05 10:25:25 -07:00
Sam Kleinman
63192ac300 consensus: remove stale WAL benchmark (#7194) 2021-11-03 07:01:46 -04:00
M. J. Fromberger
d32913c889 pubsub: Use a dynamic queue for buffered subscriptions (#7177)
Updates #7156, and a follow-up to #7070.

Event subscriptions in Tendermint currently use a fixed-length Go
channel as a queue. When the channel fills up, the publisher
immediately terminates the subscription. This prevents slow
subscribers from creating memory pressure on the node by not
servicing their queue fast enough.

Replace the buffered channel used to deliver events to buffered
subscribers with an explicit queue. The queue provides a soft
quota and burst credit mechanism: Clients that usually keep up
can survive occasional bursts, without allowing truly slow
clients to hog resources indefinitely.
2021-11-01 10:38:27 -07:00
Sam Kleinman
5cc980698a mempool: consoldate implementations (#7171)
* mempool: consoldate implementations

* update chagelog

* fix test

* Apply suggestions from code review

Co-authored-by: M. J. Fromberger <michael.j.fromberger@gmail.com>

* cleanup locking comments

* context twiddle

* migrate away from deprecated ioutil APIs (#7175)

Co-authored-by: Callum Waters <cmwaters19@gmail.com>
Co-authored-by: M. J. Fromberger <fromberger@interchain.io>

Co-authored-by: M. J. Fromberger <michael.j.fromberger@gmail.com>
Co-authored-by: Callum Waters <cmwaters19@gmail.com>
Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
2021-10-29 04:19:06 -04:00
Sharad Chand
8441b3715a migrate away from deprecated ioutil APIs (#7175)
Co-authored-by: Callum Waters <cmwaters19@gmail.com>
Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
2021-10-28 10:34:07 -07:00
Sam Kleinman
e2a103a315 mempool: port reactor tests from legacy implementation (#7162) 2021-10-27 09:45:17 -04:00
Sam Kleinman
93eb940dcd config: WriteConfigFile should return error (#7169) 2021-10-27 08:46:18 -04:00
Sam Kleinman
4bd8c5ab6f p2p: transport should be captive resposibility of router (#7160)
The main (and minor) win of this PR is that the transport is fully the
responsibility of the router and the node doesn't need to be responsible for its lifecylce.
2021-10-26 16:34:44 +00:00
Sam Kleinman
b15b2c1b78 flowrate: cleanup unused files (#7158)
I saw one of these tests fail and it looks like it was using code that
wasn't being called anywhere, so I deleted it, and avoided the package
name aliasing.
2021-10-26 16:11:34 +00:00
William Banfield
b4bc6bb4e8 p2p: add message type into the send/recv bytes metrics (#7155)
This pull request adds a new "mesage_type" label to the send/recv bytes metrics calculated in the p2p code.

Below is a snippet of the updated metrics that includes the updated label:
```
tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_HasVote",peer_id="2551a13ed720101b271a5df4816d1e4b3d3bd133"} 652
tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_HasVote",peer_id="4b1068420ef739db63377250553562b9a978708a"} 631
tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_HasVote",peer_id="927c50a5e508c747830ce3ba64a3f70fdda58ef2"} 631
tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_NewRoundStep",peer_id="2551a13ed720101b271a5df4816d1e4b3d3bd133"} 393
tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_NewRoundStep",peer_id="4b1068420ef739db63377250553562b9a978708a"} 357
tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_NewRoundStep",peer_id="927c50a5e508c747830ce3ba64a3f70fdda58ef2"} 386
```
2021-10-26 12:45:33 +00:00
Sam Kleinman
23be048294 p2p: use correct transport configuration (#7152) 2021-10-25 07:19:20 -04:00