Commit Graph

9501 Commits

Author SHA1 Message Date
William Banfield
70624e8d27 more lock instrumentation 2022-08-03 18:20:12 -04:00
William Banfield
6b16cf6d68 Revert "update stats queue to be smaller"
This reverts commit d176124aa0.
2022-08-03 18:15:46 -04:00
William Banfield
d392a07b99 no peer status 2022-08-03 18:02:11 -04:00
William Banfield
ffd982af4f Revert "close"
This reverts commit 71607e6fcd.
2022-08-03 17:38:16 -04:00
William Banfield
d176124aa0 update stats queue to be smaller 2022-08-03 17:27:04 -04:00
William Banfield
71607e6fcd close 2022-08-03 17:19:09 -04:00
William Banfield
705316442a More metrics 2022-08-03 16:58:12 -04:00
William Banfield
83dea898fb add metrics 2022-08-03 16:45:44 -04:00
William Banfield
c764cebbe7 add unlock 2022-08-03 16:20:06 -04:00
William Banfield
f859f5ef6e add intermediate lock log 2022-08-03 13:51:39 -04:00
William Banfield
92a8e74fdf add more logs 2022-08-03 11:26:08 -04:00
William Banfield
5d2593c6ee add lock logs 2022-07-29 15:49:08 -04:00
dependabot[bot]
b4eaccd242 build(deps): Bump github.com/creachadair/tomledit from 0.0.22 to 0.0.23 (#9085)
Bumps [github.com/creachadair/tomledit](https://github.com/creachadair/tomledit) from 0.0.22 to 0.0.23.
- [Release notes](https://github.com/creachadair/tomledit/releases)
- [Commits](https://github.com/creachadair/tomledit/compare/v0.0.22...v0.0.23)

---
updated-dependencies:
- dependency-name: github.com/creachadair/tomledit
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-07-25 07:03:29 -07:00
dependabot[bot]
ed4d0de559 build(deps): Bump github.com/golangci/golangci-lint (#9069) 2022-07-25 10:34:58 +02:00
dependabot[bot]
a4ce134c93 build(deps): Bump github.com/bufbuild/buf from 1.3.1 to 1.6.0 (#9064) 2022-07-22 15:30:09 +02:00
mergify[bot]
0d2bf39c23 indexer: work around indexing problem for duplicate transactions (forward port: #8625) (#8950) 2022-07-21 19:33:08 +02:00
dependabot[bot]
d4495b8626 build(deps): Bump google.golang.org/grpc from 1.47.0 to 1.48.0 (#9060) 2022-07-21 18:53:12 +02:00
dependabot[bot]
ba671c1acf build(deps): Bump github.com/BurntSushi/toml from 1.1.0 to 1.2.0 (#9063)
Bumps [github.com/BurntSushi/toml](https://github.com/BurntSushi/toml) from 1.1.0 to 1.2.0.
- [Release notes](https://github.com/BurntSushi/toml/releases)
- [Commits](https://github.com/BurntSushi/toml/compare/v1.1.0...v1.2.0)

---
updated-dependencies:
- dependency-name: github.com/BurntSushi/toml
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-07-21 09:19:33 -04:00
dependabot[bot]
65feb7097b build(deps): Bump github.com/golangci/golangci-lint (#9045) 2022-07-20 17:22:47 -07:00
M. J. Fromberger
9d1dd560e6 Prepare changelog for Release v0.35.9. (#9057) v0.35.9 2022-07-20 15:28:54 -07:00
mergify[bot]
f6bbd8302c migration: scope key migration to stores (#9005) (#9027) 2022-07-20 14:24:53 +02:00
Callum Waters
3e96a376b0 spec: merge v0.35 spec into tendermint (#9018) 2022-07-20 12:37:46 +02:00
M. J. Fromberger
183e249709 Prepare changelog for candidate v0.35.9-rc0 (#9040) v0.35.9-rc0 2022-07-19 14:02:14 -07:00
M. J. Fromberger
22ed610083 mempool: rework lock discipline to mitigate callback deadlocks (#9030)
The priority mempool has a stricter synchronization requirement than the legacy
mempool. Under sufficiently-heavy load, exclusive access can lead to deadlocks
when processing a large batch of transaction rechecks through an out-of-process
application using the socket client.

By design, a socket client stalls when its send buffer fills, during which time
it holds a lock shared with the receive thread.  While blocked in this state, a
response read by the receive thread waits for the shared lock so the callback
can be invoked.

If we're lucky, the server will then read the next request and make enough room
in the buffer for the sender to proceed. If not however (e.g., if the next
request is bigger than the one just consumed), the receive thread is blocked:
It is waiting on the lock and cannot read a response.  Once the server's output
buffer fills, the system deadlocks.

This can happen with any sufficiently-busy workload, but is more likely during
a large recheck in the v1 mempool, where the callbacks need exclusive access to
mempool state.  As a workaround, process rechecks for the priority mempool in
their own goroutines outside the mempool mutex.  Responses still head-of-line
block, but will no longer get pushback due to contention on the mempool itself.
2022-07-19 13:28:46 -07:00
dependabot[bot]
32761ec729 build(deps): Bump github.com/golangci/golangci-lint (#9037)
Bumps [github.com/golangci/golangci-lint](https://github.com/golangci/golangci-lint) from 1.46.0 to 1.47.0.
- [Release notes](https://github.com/golangci/golangci-lint/releases)
- [Changelog](https://github.com/golangci/golangci-lint/blob/master/CHANGELOG.md)
- [Commits](https://github.com/golangci/golangci-lint/compare/v1.46.0...v1.47.0)

---
updated-dependencies:
- dependency-name: github.com/golangci/golangci-lint
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-07-19 08:17:34 -07:00
dependabot[bot]
5edc9e3a15 build(deps): Bump pgregory.net/rapid from 0.4.7 to 0.4.8 (#9015)
Bumps [pgregory.net/rapid](https://github.com/flyingmutant/rapid) from 0.4.7 to 0.4.8.
- [Release notes](https://github.com/flyingmutant/rapid/releases)
- [Commits](https://github.com/flyingmutant/rapid/compare/v0.4.7...v0.4.8)

---
updated-dependencies:
- dependency-name: pgregory.net/rapid
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sam Kleinman <garen@tychoish.com>
Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
2022-07-15 09:33:52 -07:00
mergify[bot]
6b18dfcea1 Extract a library from the confix command-line tool. (backport #9012) (#9025)
(cherry picked from commit 18b5a500da)

Pull out the library functionality from scripts/confix and move it to
internal/libs/confix. Replace scripts/confix with a simple stub that has the
same command-line API, but uses the library instead.

Related:

- Move and update unit tests.
- Move scripts/confix/condiff to scripts/condiff.
- Update test data for v34, v35, and v36.
- Update reference diffs.
- Update testdata README.

Co-authored-by: M. J. Fromberger <fromberger@interchain.io>
2022-07-15 08:46:28 -07:00
M. J. Fromberger
9f2522148b config: fix the comments on p2p.queue-type (#9021)
These got disarranged during a previous cleanup.
2022-07-15 07:11:16 -07:00
dependabot[bot]
819e7f4bdd build(deps): Bump google.golang.org/grpc from 1.47.0 to 1.48.0 (#8992)
Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.47.0 to 1.48.0.
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.47.0...v1.48.0)

---
updated-dependencies:
- dependency-name: google.golang.org/grpc
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-07-14 16:42:27 -07:00
M. J. Fromberger
9177206750 Prepare changelog for Release v0.35.8 (#8988) v0.35.8 2022-07-14 14:49:37 -07:00
mergify[bot]
0c6efd8c51 config: update config to reflect simple-priority queue (#9007) (#9008)
Update the queue documentation to reflect the types of queues and current default queue.

(cherry picked from commit c1c501ecd4)

Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
2022-07-14 17:13:41 -04:00
Sam Kleinman
f8d15fc682 blocksync: drop support for enabled=false (#8912) 2022-07-14 13:19:12 -04:00
William Banfield
7971514b55 p2p: configure max accepted for non-legacy as well (#8999)
* p2p: configure max connected for non-legacy as well

* remove explicit 0
2022-07-14 10:11:34 -04:00
M. J. Fromberger
b94470a6a4 mempool: ensure evicted transactions are removed from the cache (#9000)
In the original implementation transactions evicted for priority were also
removed from the cache. In addition, remove expired transactions from
the cache.

Related:

- Add Has method to cache implementations.
- Update tests to exercise this condition.
2022-07-14 06:51:54 -07:00
mergify[bot]
a1c8f8df0b doc: fix typos in quick-start.md. (#8990) (#8997) 2022-07-14 11:44:08 +02:00
M. J. Fromberger
3790968156 mempool: release lock during app connection flush (#8984)
This case is symmetric to what we did for CheckTx calls, where we release the
mempool mutex to ensure callbacks can fire during call setup.  We also need
this behaviour for application flush, for the same reason: The caller holds the
lock by contract from the Mempool interface.
2022-07-12 10:28:51 -07:00
M. J. Fromberger
9e64c95e56 mempool: reduce lock contention during CheckTx (cleanup) (#8983)
The way this was originally structured, we reacquired the lock after issuing
the initial ABCI CheckTx call, only to immediately release it. Restructure the
code so that this redundant acquire is no longer necessary.
2022-07-12 08:00:29 -07:00
M. J. Fromberger
cb93d3b587 mempool: don't log message type mismatch in the default callback (#8969) 2022-07-11 18:06:49 -07:00
M. J. Fromberger
f98de20f7e p2p: ensure closed channels stop receiving service (#8979)
Once these channels are closed, we should not continue to service them, as they
will never again deliver nonzero values.
2022-07-11 16:34:05 -07:00
M. J. Fromberger
451e697331 Update generated mocks after upgrade of Mockery v2. (#8973) 2022-07-11 09:18:36 -04:00
mergify[bot]
e3292a48e3 p2p: simpler priority queue (backport #8929) (#8956) 2022-07-08 13:29:42 -04:00
M. J. Fromberger
6a354a1e8d Update pending changelog. (#8965) 2022-07-08 09:54:50 -07:00
mergify[bot]
1daf7b939d p2p: make peer gossiping coinflip safer (#8949) (#8963)
Closes #8948

(cherry picked from commit 61ce384d75)

Co-authored-by: Sam Kleinman <garen@tychoish.com>
2022-07-08 12:32:12 -04:00
mergify[bot]
156c305b08 p2p: delete cruft (#8958) (#8959)
I think the decision in #8806 is that we shouldn't do this yet, so I think it's best to just drop this.

(cherry picked from commit 636320f901)

Co-authored-by: Sam Kleinman <garen@tychoish.com>
2022-07-08 09:59:57 -04:00
M. J. Fromberger
bc49f66c35 Add more unit tests for the priority mempool. (#8961)
- Add a test for time-based (TTL) expiration.
- Add tests for eviction based on size and priority.
2022-07-07 14:56:34 -07:00
M. J. Fromberger
9b02094827 Fix unbounded heap growth in the priority mempool. (#8944)
The primary effect of this change is to simplify the implementation of the
priority mempool to eliminate an unbounded heap growth observed by Vega team
when it was enabled in their testnet. It updates and fixes #8775.

The main body of this change is to remove the auxiliary indexing structures,
and use only the concurrent list structure (the same as the legacy mempool) to
maintain both gossip order and priority.

This means that operations that require priority information, such as block
updates and insert-time evictions, require a linear scan over the mempool.
This tradeoff greatly simplifies the code and eliminates the long-term heap
load, at the cost of some extra CPU and short-lived working memory during
CheckTx and Update calls.

Rough benchmark results:

 - This PR:
   BenchmarkTxMempool_CheckTx-10             486373              2271 ns/op
 - Original priority mempool implementation:
   BenchmarkTxMempool_CheckTx-10             500302              2113 ns/op
 - Legacy (v0) mempool:
   BenchmarkCheckTx-10                       364591              3571 ns/op

These benchmarks are not a good proxy for production load, but at least suggest
that the overhead of the implementation changes are not cause for concern.

In addition:

- Rework synchronization so that access to shared data structures is safe.
  Previously shared locks were used to exclude block updates during calls that
  update mempool state. Now access is properly exclusive where necessary.

- Fix a bug in the recheck flow, where priority updates from the application
  were not correctly reflected in the index structures.

- Eliminate the need for separate recheck cursors during block update. This
  avoids the need to explicitly invalidate elements of the concurrent list,
  which averts the dependency cycle that led to objects being pinned.

- Clean up, clarify, and fix inaccuracies in documentation comments throughout
  the package.

Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
2022-07-07 07:15:08 -07:00
William Banfield
da83edc588 p2p: return from conn send on stopped mconn (#8904)
Co-authored-by: Sam Kleinman <garen@tychoish.com>
2022-07-06 10:41:55 -04:00
mergify[bot]
047d7c927b p2p: fix flakey test due to disconnect cooldown (#8917) (#8918)
This test was made flakey by #8839. The cooldown period means that the node in the test will not try to reconnect as quickly as the test expects. This change makes the cooldown shorter in the test so that the node quickly reconnects.

(cherry picked from commit 5274f80de4)

Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
Co-authored-by: Sam Kleinman <garen@tychoish.com>
2022-07-05 19:11:38 -04:00
mergify[bot]
49788adde5 p2p: use correct context error (#8916) (#8920)
handshakeCtx is the internal context carrying the timeout. Its error should be used for the error return.

(cherry picked from commit 921530c352)

Co-authored-by: William Banfield <4561443+williambanfield@users.noreply.github.com>
Co-authored-by: Sam Kleinman <garen@tychoish.com>
Co-authored-by: Callum Waters <cmwaters19@gmail.com>
2022-07-05 13:36:26 -04:00
dependabot[bot]
e414d0a878 build(deps): Bump github.com/libp2p/go-buffer-pool from 0.0.2 to 0.1.0 (#8931) 2022-07-05 12:19:03 +02:00