Compare commits
30 Commits
master
...
wb/handsha
| Author | SHA1 | Date | |
|---|---|---|---|
| | 0f90d699c3 | | |
| | 6c8e350110 | | |
| | 14b14989f7 | | |
| | 007e98a4e6 | | |
| | c60e170362 | | |
| | 464f54d145 | | |
| | 54d9874828 | | |
| | 1b872d768b | | |
| | 1975cdd750 | | |
| | e3cd47d89d | | |
| | fb794b1ce5 | | |
| | 74f3e15dc9 | | |
| | 7728ae6e04 | | |
| | c29c667c99 | | |
| | 7e7a2535c6 | | |
| | 94e7978259 | | |
| | cbba7f3d74 | | |
| | efda2ff816 | | |
| | 10a39d6b91 | | |
| | 1e2d37d6a4 | | |
| | c368abceab | | |
| | 7335278479 | | |
| | ad78120d99 | | |
| | ee22b1a5e2 | | |
| | f2f99f1550 | | |
| | cbadc179f2 | | |
| | fe42df46b0 | | |
| | 4ff6e367f8 | | |
| | b030ed40f0 | | |
| | 783ab230d8 | | |
2
.github/workflows/proto-lint.yml
vendored
@@ -15,7 +15,7 @@ jobs:
|
||||
timeout-minutes: 5
|
||||
steps:
|
||||
- uses: actions/checkout@v3
|
||||
- uses: bufbuild/buf-setup-action@v1.4.0
|
||||
- uses: bufbuild/buf-setup-action@v1.5.0
|
||||
- uses: bufbuild/buf-lint-action@v1
|
||||
with:
|
||||
input: 'proto'
|
||||
|
||||
@@ -37,6 +37,10 @@ Special thanks to external contributors on this release:
|
||||
- [p2p] \#7035 Remove legacy P2P routing implementation and associated configuration options. (@tychoish)
|
||||
- [p2p] \#7265 Peer manager reduces peer score for each failed dial attempt for peers that have not successfully dialed. (@tychoish)
|
||||
- [p2p] [\#7594](https://github.com/tendermint/tendermint/pull/7594) always advertise self, to enable mutual address discovery. (@altergui)
|
||||
- [p2p] \#8737 Introduce "inactive" peer label to avoid re-dialing incompatible peers. (@tychoish)
|
||||
- [p2p] \#8737 Increase frequency of dialing attempts to reduce latency for peer acquisition. (@tychoish)
|
||||
- [p2p] \#8737 Improvements to peer scoring and sorting to gossip a greater variety of peers during PEX. (@tychoish)
|
||||
- [p2p] \#8737 Track incoming and outgoing peers separately to ensure more peer slots open for incoming connections. (@tychoish)
|
||||
|
||||
- Go API
|
||||
|
||||
|
||||
@@ -627,6 +627,10 @@ type P2PConfig struct { //nolint: maligned
|
||||
// outbound).
|
||||
MaxConnections uint16 `mapstructure:"max-connections"`
|
||||
|
||||
// MaxOutgoingConnections defines the maximum number of connections reserved
// for outgoing connections. It must not exceed MaxConnections.
|
||||
MaxOutgoingConnections uint16 `mapstructure:"max-outgoing-connections"`
|
||||
|
||||
// MaxIncomingConnectionAttempts rate limits the number of incoming connection
|
||||
// attempts per IP address.
|
||||
MaxIncomingConnectionAttempts uint `mapstructure:"max-incoming-connection-attempts"`
|
||||
@@ -674,6 +678,7 @@ func DefaultP2PConfig() *P2PConfig {
|
||||
ExternalAddress: "",
|
||||
UPNP: false,
|
||||
MaxConnections: 64,
|
||||
MaxOutgoingConnections: 32,
|
||||
MaxIncomingConnectionAttempts: 100,
|
||||
FlushThrottleTimeout: 100 * time.Millisecond,
|
||||
// The MTU (Maximum Transmission Unit) for Ethernet is 1500 bytes.
|
||||
@@ -708,6 +713,9 @@ func (cfg *P2PConfig) ValidateBasic() error {
|
||||
if cfg.RecvRate < 0 {
|
||||
return errors.New("recv-rate can't be negative")
|
||||
}
|
||||
if cfg.MaxOutgoingConnections > cfg.MaxConnections {
|
||||
return errors.New("max-outgoing-connections cannot be larger than max-connections")
|
||||
}
|
||||
return nil
|
||||
}
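
The added check can be exercised in isolation. The following is a minimal sketch, assuming a hypothetical `connLimits` struct that stands in for the two `P2PConfig` fields shown above; it is not the actual config package. Note that equal values pass, since only `MaxOutgoingConnections > MaxConnections` is rejected.

```go
package main

import (
	"errors"
	"fmt"
)

// connLimits is a hypothetical stand-in for the two P2PConfig fields.
type connLimits struct {
	MaxConnections         uint16
	MaxOutgoingConnections uint16
}

// validate mirrors the added check: outgoing slots may not exceed the total.
func (c connLimits) validate() error {
	if c.MaxOutgoingConnections > c.MaxConnections {
		return errors.New("max-outgoing-connections cannot be larger than max-connections")
	}
	return nil
}

func main() {
	// The defaults from DefaultP2PConfig (64 total, 32 outgoing) pass.
	fmt.Println(connLimits{MaxConnections: 64, MaxOutgoingConnections: 32}.validate())
	// Asking for more outgoing slots than total connections is rejected.
	fmt.Println(connLimits{MaxConnections: 16, MaxOutgoingConnections: 32}.validate())
}
```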
|
||||
|
||||
|
||||
@@ -309,6 +309,10 @@ upnp = {{ .P2P.UPNP }}
|
||||
# Maximum number of connections (inbound and outbound).
|
||||
max-connections = {{ .P2P.MaxConnections }}
|
||||
|
||||
# Maximum number of connections reserved for outgoing
# connections. Must not exceed max-connections.
|
||||
max-outgoing-connections = {{ .P2P.MaxOutgoingConnections }}
|
||||
|
||||
# Rate limits the number of incoming connection attempts per IP address.
|
||||
max-incoming-connection-attempts = {{ .P2P.MaxIncomingConnectionAttempts }}
|
||||
|
||||
|
||||
@@ -62,7 +62,7 @@ as `abci-cli` above. The kvstore just stores transactions in a merkle
|
||||
tree.
|
||||
|
||||
Its code can be found
|
||||
[here](https://github.com/tendermint/tendermint/blob/master/abci/cmd/abci-cli/abci-cli.go)
|
||||
[here](https://github.com/tendermint/tendermint/blob/v0.36.x/abci/cmd/abci-cli/abci-cli.go)
|
||||
and looks like:
|
||||
|
||||
```go
|
||||
|
||||
@@ -15,7 +15,7 @@ the block itself is never stored.
|
||||
Each event contains a type and a list of attributes, which are key-value pairs
|
||||
denoting something about what happened during the method's execution. For more
|
||||
details on `Events`, see the
|
||||
[ABCI](https://github.com/tendermint/tendermint/blob/master/spec/abci/abci.md#events)
|
||||
[ABCI](https://github.com/tendermint/tendermint/blob/v0.36.x/spec/abci/abci.md#events)
|
||||
documentation.
|
||||
|
||||
An `Event` has a composite key associated with it. A `compositeKey` is
|
||||
|
||||
@@ -1,120 +0,0 @@
|
||||
---
|
||||
order: 1
|
||||
parent:
|
||||
order: false
|
||||
---
|
||||
|
||||
# Architecture Decision Records (ADR)
|
||||
|
||||
This is a location to record all high-level architecture decisions in the tendermint project.
|
||||
|
||||
You can read more about the ADR concept in this [blog post](https://product.reverb.com/documenting-architecture-decisions-the-reverb-way-a3563bb24bd0#.78xhdix6t).
|
||||
|
||||
An ADR should provide:
|
||||
|
||||
- Context on the relevant goals and the current state
|
||||
- Proposed changes to achieve the goals
|
||||
- Summary of pros and cons
|
||||
- References
|
||||
- Changelog
|
||||
|
||||
Note the distinction between an ADR and a spec. The ADR provides the context, intuition, reasoning, and
|
||||
justification for a change in architecture, or for the architecture of something
|
||||
new. The spec is a much more compressed and streamlined summary of everything as
|
||||
it stands today.
|
||||
|
||||
If recorded decisions turned out to be lacking, convene a discussion, record the new decisions here, and then modify the code to match.
|
||||
|
||||
Note the context/background should be written in the present tense.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
### Implemented
|
||||
|
||||
- [ADR-001: Logging](./adr-001-logging.md)
|
||||
- [ADR-002: Event-Subscription](./adr-002-event-subscription.md)
|
||||
- [ADR-003: ABCI-APP-RPC](./adr-003-abci-app-rpc.md)
|
||||
- [ADR-004: Historical-Validators](./adr-004-historical-validators.md)
|
||||
- [ADR-005: Consensus-Params](./adr-005-consensus-params.md)
|
||||
- [ADR-008: Priv-Validator](./adr-008-priv-validator.md)
|
||||
- [ADR-009: ABCI-Design](./adr-009-ABCI-design.md)
|
||||
- [ADR-010: Crypto-Changes](./adr-010-crypto-changes.md)
|
||||
- [ADR-011: Monitoring](./adr-011-monitoring.md)
|
||||
- [ADR-014: Secp-Malleability](./adr-014-secp-malleability.md)
|
||||
- [ADR-015: Crypto-Encoding](./adr-015-crypto-encoding.md)
|
||||
- [ADR-016: Protocol-Versions](./adr-016-protocol-versions.md)
|
||||
- [ADR-017: Chain-Versions](./adr-017-chain-versions.md)
|
||||
- [ADR-018: ABCI-Validators](./adr-018-ABCI-Validators.md)
|
||||
- [ADR-019: Multisigs](./adr-019-multisigs.md)
|
||||
- [ADR-020: Block-Size](./adr-020-block-size.md)
|
||||
- [ADR-021: ABCI-Events](./adr-021-abci-events.md)
|
||||
- [ADR-025: Commit](./adr-025-commit.md)
|
||||
- [ADR-026: General-Merkle-Proof](./adr-026-general-merkle-proof.md)
|
||||
- [ADR-033: Pubsub](./adr-033-pubsub.md)
|
||||
- [ADR-034: Priv-Validator-File-Structure](./adr-034-priv-validator-file-structure.md)
|
||||
- [ADR-043: Blockchain-RiRi-Org](./adr-043-blockchain-riri-org.md)
|
||||
- [ADR-044: Lite-Client-With-Weak-Subjectivity](./adr-044-lite-client-with-weak-subjectivity.md)
|
||||
- [ADR-046: Light-Client-Implementation](./adr-046-light-client-implementation.md)
|
||||
- [ADR-047: Handling-Evidence-From-Light-Client](./adr-047-handling-evidence-from-light-client.md)
|
||||
- [ADR-051: Double-Signing-Risk-Reduction](./adr-051-double-signing-risk-reduction.md)
|
||||
- [ADR-052: Tendermint-Mode](./adr-052-tendermint-mode.md)
|
||||
- [ADR-053: State-Sync-Prototype](./adr-053-state-sync-prototype.md)
|
||||
- [ADR-054: Crypto-Encoding-2](./adr-054-crypto-encoding-2.md)
|
||||
- [ADR-055: Protobuf-Design](./adr-055-protobuf-design.md)
|
||||
- [ADR-056: Light-Client-Amnesia-Attacks](./adr-056-light-client-amnesia-attacks.md)
|
||||
- [ADR-059: Evidence-Composition-and-Lifecycle](./adr-059-evidence-composition-and-lifecycle.md)
|
||||
- [ADR-062: P2P-Architecture](./adr-062-p2p-architecture.md)
|
||||
- [ADR-063: Privval-gRPC](./adr-063-privval-grpc.md)
|
||||
- [ADR-066: E2E-Testing](./adr-066-e2e-testing.md)
|
||||
- [ADR-072: Restore Requests for Comments](./adr-072-request-for-comments.md)
|
||||
- [ADR-077: Block Retention](./adr-077-block-retention.md)
|
||||
- [ADR-078: Non-zero Genesis](./adr-078-nonzero-genesis.md)
|
||||
- [ADR-079: ED25519 Verification](./adr-079-ed25519-verification.md)
|
||||
- [ADR-080: Reverse Sync](./adr-080-reverse-sync.md)
|
||||
|
||||
### Accepted
|
||||
|
||||
- [ADR-006: Trust-Metric](./adr-006-trust-metric.md)
|
||||
- [ADR-024: Sign-Bytes](./adr-024-sign-bytes.md)
|
||||
- [ADR-035: Documentation](./adr-035-documentation.md)
|
||||
- [ADR-039: Peer-Behaviour](./adr-039-peer-behaviour.md)
|
||||
- [ADR-060: Go-API-Stability](./adr-060-go-api-stability.md)
|
||||
- [ADR-061: P2P-Refactor-Scope](./adr-061-p2p-refactor-scope.md)
|
||||
- [ADR-065: Custom Event Indexing](./adr-065-custom-event-indexing.md)
|
||||
- [ADR-068: Reverse-Sync](./adr-068-reverse-sync.md)
|
||||
- [ADR-067: Mempool Refactor](./adr-067-mempool-refactor.md)
|
||||
- [ADR-075: RPC Event Subscription Interface](./adr-075-rpc-subscription.md)
|
||||
- [ADR-076: Combine Spec and Tendermint Repositories](./adr-076-combine-spec-repo.md)
|
||||
- [ADR-081: Protocol Buffers Management](./adr-081-protobuf-mgmt.md)
|
||||
|
||||
### Deprecated
|
||||
|
||||
None
|
||||
|
||||
### Rejected
|
||||
|
||||
- [ADR-023: ABCI-Propose-tx](./adr-023-ABCI-propose-tx.md)
|
||||
- [ADR-029: Check-Tx-Consensus](./adr-029-check-tx-consensus.md)
|
||||
- [ADR-058: Event-Hashing](./adr-058-event-hashing.md)
|
||||
|
||||
### Proposed
|
||||
|
||||
- [ADR-007: Trust-Metric-Usage](./adr-007-trust-metric-usage.md)
|
||||
- [ADR-012: Peer-Transport](./adr-012-peer-transport.md)
|
||||
- [ADR-013: Symmetric-Crypto](./adr-013-symmetric-crypto.md)
|
||||
- [ADR-022: ABCI-Errors](./adr-022-abci-errors.md)
|
||||
- [ADR-030: Consensus-Refactor](./adr-030-consensus-refactor.md)
|
||||
- [ADR-036: Empty Blocks via ABCI](./adr-036-empty-blocks-abci.md)
|
||||
- [ADR-037: Deliver-Block](./adr-037-deliver-block.md)
|
||||
- [ADR-038: Non-Zero-Start-Height](./adr-038-non-zero-start-height.md)
|
||||
- [ADR-040: Blockchain Reactor Refactor](./adr-040-blockchain-reactor-refactor.md)
|
||||
- [ADR-041: Proposer-Selection-via-ABCI](./adr-041-proposer-selection-via-abci.md)
|
||||
- [ADR-042: State Sync Design](./adr-042-state-sync.md)
|
||||
- [ADR-045: ABCI-Evidence](./adr-045-abci-evidence.md)
|
||||
- [ADR-050: Improved Trusted Peering](./adr-050-improved-trusted-peering.md)
|
||||
- [ADR-057: RPC](./adr-057-RPC.md)
|
||||
- [ADR-064: Batch Verification](./adr-064-batch-verification.md)
|
||||
- [ADR-069: Node Initialization](./adr-069-flexible-node-initialization.md)
|
||||
- [ADR-071: Proposer-Based Timestamps](./adr-071-proposer-based-timestamps.md)
|
||||
- [ADR-073: Adopt LibP2P](./adr-073-libp2p.md)
|
||||
- [ADR-074: Migrate Timeout Parameters to Consensus Parameters](./adr-074-timeout-params.md)
|
||||
@@ -1,216 +0,0 @@
|
||||
# ADR 1: Logging
|
||||
|
||||
## Context
|
||||
|
||||
Current logging system in Tendermint is very static and not flexible enough.
|
||||
|
||||
Issues: [358](https://github.com/tendermint/tendermint/issues/358), [375](https://github.com/tendermint/tendermint/issues/375).
|
||||
|
||||
What we want from the new system:
|
||||
|
||||
- per package dynamic log levels
|
||||
- dynamic logger setting (logger tied to the processing struct)
|
||||
- conventions
|
||||
- be more visually appealing

"Dynamic" here means the ability to change something at runtime.
|
||||
|
||||
## Decision
|
||||
|
||||
### 1) An interface
|
||||
|
||||
First, we will need an interface for all of our libraries (`tmlibs`, Tendermint, etc.). My personal preference is the go-kit `Logger` interface (see Appendix A), but that is too big a change. Plus, we will still need levels.
|
||||
|
||||
```go
|
||||
// log.go
|
||||
type Logger interface {
|
||||
Debug(msg string, keyvals ...interface{}) error
|
||||
Info(msg string, keyvals ...interface{}) error
|
||||
Error(msg string, keyvals ...interface{}) error
|
||||
|
||||
With(keyvals ...interface{}) Logger
|
||||
}
|
||||
```
|
||||
|
||||
On a side note: difference between `Info` and `Notice` is subtle. We probably
|
||||
could do without `Notice`. Don't think we need `Panic` or `Fatal` as a part of
|
||||
the interface. These funcs could be implemented as helpers. In fact, we already
|
||||
have some in `tmlibs/common`.
|
||||
|
||||
- `Debug` - extended output for devs
|
||||
- `Info` - all that is useful for a user
|
||||
- `Error` - errors
|
||||
|
||||
`Notice` should become `Info`, `Warn` either `Error` or `Debug` depending on the message, `Crit` -> `Error`.
|
||||
|
||||
This interface should go into `tmlibs/log`. All libraries which are part of the core (tendermint/tendermint) should obey it.
|
||||
|
||||
### 2) Logger with our current formatting
|
||||
|
||||
On top of this interface, we will need to implement a stdout logger, which will be used when Tendermint is configured to output logs to STDOUT.
|
||||
|
||||
Many people say that they like the current output, so let's stick with it.
|
||||
|
||||
```
|
||||
NOTE[2017-04-25|14:45:08] ABCI Replay Blocks module=consensus appHeight=0 storeHeight=0 stateHeight=0
|
||||
```
|
||||
|
||||
Couple of minor changes:
|
||||
|
||||
```
|
||||
I[2017-04-25|14:45:08.322] ABCI Replay Blocks module=consensus appHeight=0 storeHeight=0 stateHeight=0
|
||||
```
|
||||
|
||||
Notice the level is encoded using only one char plus milliseconds.
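
As a rough illustration of the proposed line format, the sketch below builds such a line by hand. `formatLine` and `sample` are hypothetical names introduced here, and fields are separated by a single space rather than the padded columns shown above; the actual logger may differ.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// sample is an arbitrary timestamp used only for the demonstration.
var sample = time.Date(2017, 4, 25, 14, 45, 8, 322000000, time.UTC)

// formatLine is a hypothetical helper: one-letter level, millisecond
// timestamp, message, then key=value pairs in the order given.
// level must be non-empty.
func formatLine(level string, t time.Time, msg string, keyvals ...interface{}) string {
	var b strings.Builder
	fmt.Fprintf(&b, "%s[%s] %s",
		strings.ToUpper(level[:1]),
		t.Format("2006-01-02|15:04:05.000"),
		msg)
	// keyvals alternate key, value; a trailing key without a value is ignored.
	for i := 0; i+1 < len(keyvals); i += 2 {
		fmt.Fprintf(&b, " %v=%v", keyvals[i], keyvals[i+1])
	}
	return b.String()
}

func main() {
	fmt.Println(formatLine("info", sample, "ABCI Replay Blocks",
		"module", "consensus", "appHeight", 0))
}
```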
|
||||
|
||||
Note: there are many other formats out there like [logfmt](https://brandur.org/logfmt).
|
||||
|
||||
This logger could be implemented using any logger - [logrus](https://github.com/sirupsen/logrus), [go-kit/log](https://github.com/go-kit/kit/tree/master/log), [zap](https://github.com/uber-go/zap), log15 - as long as it

a) supports coloring output<br>
b) is moderately fast (buffering) <br>
c) conforms to the new interface, or an adapter could be written for it <br>
d) is somewhat configurable<br>
|
||||
|
||||
go-kit is my favorite so far. Check out how easy it is to color errors in red https://github.com/go-kit/kit/blob/master/log/term/example_test.go#L12. Although, coloring could only be applied to the whole string :(
|
||||
|
||||
```
|
||||
go-kit +: flexible, modular
|
||||
go-kit -: logfmt format https://brandur.org/logfmt
|
||||
|
||||
logrus +: popular, feature rich (hooks), API and output is more like what we want
|
||||
logrus -: not so flexible
|
||||
```
|
||||
|
||||
```go
// tm_logger.go

// tmLogger adapts a go-kit logger (kitlog) to the Logger interface.
type tmLogger struct {
	sourceLogger kitlog.Logger
}

// NewTmLogger returns a logger that encodes keyvals to the Writer in
// tm format.
func NewTmLogger(w io.Writer) Logger {
	return &tmLogger{kitlog.NewLogfmtLogger(w)}
}

// SetLevel filters out messages below the given level.
func (l *tmLogger) SetLevel(lvl string) {
	switch lvl {
	case "debug":
		l.sourceLogger = level.NewFilter(l.sourceLogger, level.AllowDebug())
	}
}

// Info logs msg followed by keyvals. Debug and Error are analogous.
func (l *tmLogger) Info(msg string, keyvals ...interface{}) error {
	return l.sourceLogger.Log(append([]interface{}{"msg", msg}, keyvals...)...)
}

// With returns a logger that adds keyvals to every message.
func (l *tmLogger) With(keyvals ...interface{}) Logger {
	return &tmLogger{kitlog.With(l.sourceLogger, keyvals...)}
}

// log.go

func With(logger Logger, keyvals ...interface{}) Logger {
	return logger.With(keyvals...)
}
```
|
||||
|
||||
Usage:
|
||||
|
||||
```go
|
||||
logger := log.NewTmLogger(os.Stdout)
|
||||
logger.SetLevel(config.GetString("log_level"))
|
||||
node.SetLogger(log.With(logger, "node", Name))
|
||||
```
|
||||
|
||||
**Other log formatters**
|
||||
|
||||
In the future, we may want other formatters like JSONFormatter.
|
||||
|
||||
```
|
||||
{ "level": "notice", "time": "2017-04-25 14:45:08.562471297 -0400 EDT", "module": "consensus", "msg": "ABCI Replay Blocks", "appHeight": 0, "storeHeight": 0, "stateHeight": 0 }
|
||||
```
|
||||
|
||||
### 3) Dynamic logger setting
|
||||
|
||||
https://dave.cheney.net/2017/01/23/the-package-level-logger-anti-pattern
|
||||
|
||||
This is the hardest part and where most of the work will be done. The logger should be tied to the processing struct, or to the context if it adds some fields to the logger.
|
||||
|
||||
```go
|
||||
type BaseService struct {
|
||||
log log15.Logger
|
||||
name string
|
||||
started uint32 // atomic
|
||||
stopped uint32 // atomic
|
||||
...
|
||||
}
|
||||
```
|
||||
|
||||
BaseService already contains `log` field, so most of the structs embedding it should be fine. We should rename it to `logger`.
|
||||
|
||||
The only thing missing is the ability to set logger:
|
||||
|
||||
```
|
||||
func (bs *BaseService) SetLogger(l log.Logger) {
|
||||
bs.logger = l
|
||||
}
|
||||
```
|
||||
|
||||
### 4) Conventions
|
||||
|
||||
Important keyvals should go first. Example:
|
||||
|
||||
```
|
||||
correct
|
||||
I[2017-04-25|14:45:08.322] ABCI Replay Blocks module=consensus instance=1 appHeight=0 storeHeight=0 stateHeight=0
|
||||
```
|
||||
|
||||
not
|
||||
|
||||
```
|
||||
wrong
|
||||
I[2017-04-25|14:45:08.322] ABCI Replay Blocks module=consensus appHeight=0 storeHeight=0 stateHeight=0 instance=1
|
||||
```
|
||||
|
||||
For that, in most cases you'll need to add the `instance` field to a logger when creating it, not when you log a particular message:
|
||||
|
||||
```go
|
||||
colorFn := func(keyvals ...interface{}) term.FgBgColor {
	// keyvals alternate key, value, so step through the keys only.
	for i := 0; i < len(keyvals)-1; i += 2 {
		if keyvals[i] == "instance" && keyvals[i+1] == 1 {
			return term.FgBgColor{Fg: term.Blue}
		} else if keyvals[i] == "instance" && keyvals[i+1] == 2 {
			return term.FgBgColor{Fg: term.Red}
		}
	}
	return term.FgBgColor{}
}
|
||||
logger := term.NewLogger(os.Stdout, log.NewTmLogger, colorFn)
|
||||
|
||||
c1 := NewConsensusReactor(...)
|
||||
c1.SetLogger(log.With(logger, "instance", 1))
|
||||
|
||||
c2 := NewConsensusReactor(...)
|
||||
c2.SetLogger(log.With(logger, "instance", 2))
|
||||
```
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
Dynamic logger, which could be turned off for some modules at runtime. Public interface for other projects using Tendermint libraries.
|
||||
|
||||
### Negative
|
||||
|
||||
We may lose the ability to color keys in key-value pairs. go-kit allows you to easily change the foreground / background colors of the whole string, but not of its parts.
|
||||
|
||||
### Neutral
|
||||
|
||||
## Appendix A.
|
||||
|
||||
I really like the minimalistic approach go-kit took with its logger https://github.com/go-kit/kit/tree/master/log:
|
||||
|
||||
```
|
||||
type Logger interface {
|
||||
Log(keyvals ...interface{}) error
|
||||
}
|
||||
```
|
||||
|
||||
See [The Hunt for a Logger Interface](https://web.archive.org/web/20210902161539/https://go-talks.appspot.com/github.com/ChrisHines/talks/structured-logging/structured-logging.slide#1). The advantage is greater composability (check out how go-kit defines colored logging or log-leveled logging on top of this interface https://github.com/go-kit/kit/tree/master/log).
|
||||
@@ -1,88 +0,0 @@
|
||||
# ADR 2: Event Subscription
|
||||
|
||||
## Context
|
||||
|
||||
In the light client (or any other client), the user may want to **subscribe to
|
||||
a subset of transactions** (rather than all of them) using `/subscribe?event=X`. For
|
||||
example, I want to subscribe for all transactions associated with a particular
|
||||
account. Same for fetching. The user may want to **fetch transactions based on
|
||||
some filter** (rather than fetching all the blocks). For example, I want to get
|
||||
all transactions for a particular account in the last two weeks (`tx's block time >= '2017-06-05'`).
|
||||
|
||||
Now you can't even subscribe to "all txs" in Tendermint.
|
||||
|
||||
The goal is a simple and easy to use API for doing that.
|
||||
|
||||

|
||||
|
||||
## Decision
|
||||
|
||||
The ABCI app returns tags with a `DeliverTx` response inside the `data` field (_for
|
||||
now, later we may create a separate field_). Tags is a list of key-value pairs,
|
||||
protobuf encoded.
|
||||
|
||||
Example data:
|
||||
|
||||
```json
|
||||
{
|
||||
"abci.account.name": "Igor",
|
||||
"abci.account.address": "0xdeadbeef",
|
||||
"tx.gas": 7
|
||||
}
|
||||
```
|
||||
|
||||
### Subscribing to transaction events
|
||||
|
||||
If the user wants to receive only a subset of transactions, ABCI-app must
|
||||
return a list of tags with a `DeliverTx` response. These tags will be parsed and
|
||||
matched with the current queries (subscribers). If the query matches the tags,
|
||||
subscriber will get the transaction event.
|
||||
|
||||
```
|
||||
/subscribe?query="tm.event = Tx AND tx.hash = AB0023433CF0334223212243BDD AND abci.account.invoice.number = 22"
|
||||
```
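
The matching step can be sketched as follows. This is a simplified illustration only: it assumes a query is a plain AND of `key = value` conditions and that tags are string pairs; `condition` and `matches` are hypothetical names, and operators such as `CONTAINS` are left out.

```go
package main

import "fmt"

// condition is a hypothetical "key = value" clause; a query is an AND of them.
type condition struct {
	key, value string
}

// matches reports whether every condition of the query is satisfied by the tags.
func matches(query []condition, tags map[string]string) bool {
	for _, c := range query {
		if v, ok := tags[c.key]; !ok || v != c.value {
			return false
		}
	}
	return true
}

func main() {
	// Tags as they might be attached to a transaction event.
	tags := map[string]string{
		"tm.event":                    "Tx",
		"abci.account.invoice.number": "22",
	}
	query := []condition{
		{"tm.event", "Tx"},
		{"abci.account.invoice.number", "22"},
	}
	// A subscriber with this query would receive the event.
	fmt.Println(matches(query, tags))
}
```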
|
||||
|
||||
A new package must be developed to replace the current `events` package. It
|
||||
will allow clients to subscribe to different types of events in the future:
|
||||
|
||||
```
|
||||
/subscribe?query="abci.account.invoice.number = 22"
|
||||
/subscribe?query="abci.account.invoice.owner CONTAINS Igor"
|
||||
```
|
||||
|
||||
### Fetching transactions
|
||||
|
||||
This is a bit tricky because a) we want to support a number of indexers, all of
which have a different API, and b) we don't know whether tags will be sufficient
for most apps (I guess we'll see).
|
||||
|
||||
```
|
||||
/txs/search?query="tx.hash = AB0023433CF0334223212243BDD AND abci.account.owner CONTAINS Igor"
|
||||
/txs/search?query="abci.account.owner = Igor"
|
||||
```
|
||||
|
||||
For historic queries we will need an indexing storage (Postgres, SQLite, ...).
|
||||
|
||||
### Issues
|
||||
|
||||
- https://github.com/tendermint/tendermint/issues/376
|
||||
- https://github.com/tendermint/tendermint/issues/287
|
||||
- https://github.com/tendermint/tendermint/issues/525 (related)
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- same format for event notifications and search APIs
|
||||
- powerful enough query
|
||||
|
||||
### Negative
|
||||
|
||||
- performance of the `match` function (where we have too many queries / subscribers)
|
||||
- there is an issue where there are too many txs in the DB
|
||||
|
||||
### Neutral
|
||||
@@ -1,34 +0,0 @@
|
||||
# ADR 3: Must an ABCI-app have an RPC server?
|
||||
|
||||
## Context
|
||||
|
||||
ABCI-server could expose its own RPC-server and act as a proxy to Tendermint.
|
||||
|
||||
The idea was for the Tendermint RPC to just be a transparent proxy to the app.
|
||||
Clients need to talk to Tendermint for proofs, unless we burden all app devs
|
||||
with exposing Tendermint proof stuff. Also seems less complex to lock down one
|
||||
server than two, but granted it makes querying a bit more kludgy since it needs
|
||||
to be passed as a `Query`. Also, **having a very standard rpc interface means
|
||||
the light-client can work with all apps and handle proofs**. The only
|
||||
app-specific logic is decoding the binary data to a more readable form (eg.
|
||||
json). This is a huge advantage for code-reuse and standardization.
|
||||
|
||||
## Decision
|
||||
|
||||
We don't expose an RPC server on any of our ABCI-apps.
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Unified interface for all apps
|
||||
|
||||
### Negative
|
||||
|
||||
- `Query` interface
|
||||
|
||||
### Neutral
|
||||
@@ -1,38 +0,0 @@
|
||||
# ADR 004: Historical Validators
|
||||
|
||||
## Context
|
||||
|
||||
Right now, we can query the present validator set, but there is no history.
|
||||
If you were offline for a long time, there is no way to reconstruct past validators. This is needed for the light client and, as we agreed, requires an enhancement of the API.
|
||||
|
||||
## Decision
|
||||
|
||||
For every block, store a new structure that contains either the latest validator set,
|
||||
or the height of the last block for which the validator set changed. Note this is not
|
||||
the height of the block which returned the validator set change itself, but the next block,
|
||||
ie. the first block it comes into effect for.
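
One way to picture this structure is the sketch below. The `valInfo` type, the map-backed store, and `loadValidators` are hypothetical names chosen for illustration; the real `state` package may lay this out differently.

```go
package main

import "fmt"

// valInfo is a hypothetical per-height record: it holds the full set only at
// heights where a new set takes effect, otherwise just the height to look at.
type valInfo struct {
	Validators        []string // nil unless a new set takes effect at this height
	LastHeightChanged int64
}

// loadValidators resolves the set in effect at height h with at most two lookups.
func loadValidators(store map[int64]valInfo, h int64) ([]string, bool) {
	info, ok := store[h]
	if !ok {
		return nil, false
	}
	if info.Validators != nil {
		return info.Validators, true
	}
	// Follow the pointer to the height where the set last changed.
	changed, ok := store[info.LastHeightChanged]
	if !ok || changed.Validators == nil {
		return nil, false
	}
	return changed.Validators, true
}

func main() {
	store := map[int64]valInfo{
		1: {Validators: []string{"A", "B"}, LastHeightChanged: 1},
		2: {LastHeightChanged: 1},
		3: {Validators: []string{"A", "C"}, LastHeightChanged: 3},
	}
	vals, ok := loadValidators(store, 2)
	fmt.Println(vals, ok)
}
```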
|
||||
|
||||
Storing the validators will be handled by the `state` package.
|
||||
|
||||
At some point in the future, we may consider more efficient storage in the case where the validators
|
||||
are updated frequently - for instance by only saving the diffs, rather than the whole set.
|
||||
|
||||
An alternative approach suggested keeping the validator set, or diffs of it, in a merkle IAVL tree.
|
||||
While it might afford cheaper proofs that a validator set has not changed, it would be more complex,
|
||||
and likely less efficient.
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Can query old validator sets, with proof.
|
||||
|
||||
### Negative
|
||||
|
||||
- Writes an extra structure to disk with every block.
|
||||
|
||||
### Neutral
|
||||
@@ -1,85 +0,0 @@
|
||||
# ADR 005: Consensus Params
|
||||
|
||||
## Context
|
||||
|
||||
Consensus critical parameters controlling blockchain capacity have until now been hard coded, loaded from a local config, or neglected.
|
||||
Since they may need to be different in different networks, and potentially to evolve over time within
|
||||
networks, we seek to initialize them in a genesis file, and expose them through the ABCI.
|
||||
|
||||
While we have some specific parameters now, like maximum block and transaction size, we expect to have more in the future,
|
||||
such as a period over which evidence is valid, or the frequency of checkpoints.
|
||||
|
||||
## Decision
|
||||
|
||||
### ConsensusParams
|
||||
|
||||
No consensus critical parameters should ever be found in the `config.toml`.
|
||||
|
||||
A new `ConsensusParams` is optionally included in the `genesis.json` file,
|
||||
and loaded into the `State`. Any items not included are set to their default value.
|
||||
A value of 0 is undefined (see ABCI, below). A value of -1 is used to indicate the parameter does not apply.
|
||||
The parameters are used to determine the validity of a block (and tx) via the union of all relevant parameters.
|
||||
|
||||
```
|
||||
type ConsensusParams struct {
|
||||
BlockSize
|
||||
TxSize
|
||||
BlockGossip
|
||||
}
|
||||
|
||||
type BlockSize struct {
|
||||
MaxBytes int
|
||||
MaxTxs int
|
||||
MaxGas int
|
||||
}
|
||||
|
||||
type TxSize struct {
|
||||
MaxBytes int
|
||||
MaxGas int
|
||||
}
|
||||
|
||||
type BlockGossip struct {
|
||||
BlockPartSizeBytes int
|
||||
}
|
||||
```
|
||||
|
||||
The `ConsensusParams` can evolve over time by adding new structs that cover different aspects of the consensus rules.
|
||||
|
||||
The `BlockPartSizeBytes` and the `BlockSize.MaxBytes` are enforced to be greater than 0.
|
||||
The former because we need a part size, the latter so that we always have at least some sanity check over the size of blocks.
|
||||
|
||||
### ABCI
|
||||
|
||||
#### InitChain
|
||||
|
||||
InitChain currently takes the initial validator set. It should be extended to also take parts of the ConsensusParams.
|
||||
There is some case to be made for it to take the entire Genesis, except there may be things in the genesis,
|
||||
like the BlockPartSize, that the app shouldn't really know about.
|
||||
|
||||
#### EndBlock
|
||||
|
||||
The EndBlock response includes a `ConsensusParams`, which includes BlockSize and TxSize, but not BlockGossip.
|
||||
Other param structs can be added to `ConsensusParams` in the future.
|
||||
The `0` value is used to denote no change.
|
||||
Any other value will update that parameter in the `State.ConsensusParams`, to be applied for the next block.
|
||||
Tendermint should have hard-coded upper limits as sanity checks.
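
The "0 means no change" rule could be applied roughly as sketched here. `applyUpdate` and `orCurrent` are hypothetical helpers, only `BlockSize` is shown, and the hard-coded upper limits mentioned above are left out.

```go
package main

import "fmt"

// blockSize mirrors the BlockSize struct from this ADR.
type blockSize struct {
	MaxBytes int
	MaxTxs   int
	MaxGas   int
}

// orCurrent keeps the current value when the update is 0 ("no change").
func orCurrent(cur, upd int) int {
	if upd == 0 {
		return cur
	}
	return upd
}

// applyUpdate is a hypothetical helper that merges an EndBlock update into
// the current parameters, to take effect for the next block.
func applyUpdate(cur, upd blockSize) blockSize {
	return blockSize{
		MaxBytes: orCurrent(cur.MaxBytes, upd.MaxBytes),
		MaxTxs:   orCurrent(cur.MaxTxs, upd.MaxTxs),
		MaxGas:   orCurrent(cur.MaxGas, upd.MaxGas),
	}
}

func main() {
	cur := blockSize{MaxBytes: 1000000, MaxTxs: 500, MaxGas: -1}
	// Only MaxTxs is set in the update; the other fields stay as they are.
	next := applyUpdate(cur, blockSize{MaxTxs: 1000})
	fmt.Printf("%+v\n", next)
}
```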
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Alternative capacity limits and consensus parameters can be specified without re-compiling the software.
|
||||
- They can also change over time under the control of the application
|
||||
|
||||
### Negative
|
||||
|
||||
- More exposed parameters is more complexity
|
||||
- Different rules at different heights in the blockchain complicates fast sync
|
||||
|
||||
### Neutral
|
||||
|
||||
- The TxSize, which checks validity, may be in conflict with the config's `max_block_size_tx`, which determines proposal sizes
|
||||
@@ -1,229 +0,0 @@
|
||||
# ADR 006: Trust Metric Design
|
||||
|
||||
## Context
|
||||
|
||||
The proposed trust metric will allow Tendermint to maintain local trust rankings for peers it has directly interacted with, which can then be used to implement soft security controls. The calculations were obtained from the [TrustGuard](https://dl.acm.org/citation.cfm?id=1060808) project.
|
||||
|
||||
### Background
|
||||
|
||||
The Tendermint Core project developers would like to improve Tendermint security and reliability by keeping track of the level of trustworthiness peers have demonstrated within the peer-to-peer network. This way, undesirable outcomes from peers will not immediately result in them being dropped from the network (potentially causing drastic changes to take place). Instead, peers' behavior can be monitored with appropriate metrics, and a peer can be removed from the network once Tendermint Core is certain the peer is a threat. For example, when the PEXReactor requests peers' network addresses from an already known peer, and the returned network addresses are unreachable, this untrustworthy behavior should be tracked. Returning a few bad network addresses probably shouldn’t cause a peer to be dropped, while excessive amounts of this behavior does qualify the peer for being dropped.
|
||||
|
||||
Trust metrics can be circumvented by malicious nodes through the use of strategic oscillation techniques, which adapt the malicious node’s behavior pattern in order to maximize its goals. For instance, if the malicious node learns that the time interval of the Tendermint trust metric is _X_ hours, then it could wait _X_ hours in-between malicious activities. We could try to combat this issue by increasing the interval length, yet this will make the system less adaptive to recent events.
|
||||
|
||||
Instead, having shorter intervals, but keeping a history of interval values, will give our metric the flexibility needed in order to keep the network stable, while also making it resilient against a strategic malicious node in the Tendermint peer-to-peer network. Also, the metric can access trust data over a rather long period of time while not greatly increasing its history size by aggregating older history values over a larger number of intervals, and at the same time, maintain great precision for the recent intervals. This approach is referred to as fading memories, and closely resembles the way human beings remember their experiences. The trade-off to using history data is that the interval values should be preserved in-between executions of the node.
|
||||
|
||||
### References
|
||||
|
||||
M. Srivatsa, L. Xiong, and L. Liu, “TrustGuard: Countering Vulnerabilities in Reputation Management for Decentralized Overlay Networks,” in _Proceedings of the 14th International Conference on World Wide Web_, pp. 422-431, May 2005.
|
||||
|
||||
## Decision
|
||||
|
||||
The proposed trust metric will allow a developer to inform the trust metric store of all good and bad events relevant to a peer's behavior, and at any time, the metric can be queried for a peer's current trust ranking.
|
||||
|
||||
The three subsections below will cover the process being considered for calculating the trust ranking, the concept of the trust metric store, and the interface for the trust metric.
|
||||
|
||||
### Proposed Process
|
||||
|
||||
The proposed trust metric will count good and bad events relevant to the object, and calculate the percentage of events that are good over an interval with a predefined duration. This procedure will continue for the life of the trust metric. When the trust metric is queried for the current **trust value**, a resilient equation will be utilized to perform the calculation.
|
||||
|
||||
The equation being proposed resembles a Proportional-Integral-Derivative (PID) controller used in control systems. The proportional component allows us to be sensitive to the value of the most recent interval, while the integral component allows us to incorporate trust values stored in the history data, and the derivative component allows us to give weight to sudden changes in the behavior of a peer. We compute the trust value of a peer in interval _i_ based on its current trust ranking, its trust ranking history prior to interval _i_ (over the past _maxH_ number of intervals), and its trust ranking fluctuation. We will break up the equation into the three components.
|
||||
|
||||
```math
|
||||
(1) Proportional Value = a * R[i]
|
||||
```
|
||||
|
||||
where _R_[*i*] denotes the raw trust value at time interval _i_ (where _i_ == 0 is the current time) and _a_ is the weight applied to the contribution of the current reports. The next component of our equation uses a weighted sum over the last _maxH_ intervals to calculate the history value for time _i_:
|
||||
|
||||
`H[i] =` 
|
||||
|
||||
The weights can be chosen either optimistically or pessimistically. An optimistic weight creates larger weights for newer history data values, while the pessimistic weight creates larger weights for time intervals with lower scores. The default weights used during the calculation of the history value are optimistic and calculated as _Wk_ = 0.8^_k_, for time interval _k_. With the history value available, we can now finish calculating the integral value:
|
||||
|
||||
```math
|
||||
(2) Integral Value = b * H[i]
|
||||
```
|
||||
|
||||
Where _H_[*i*] denotes the history value at time interval _i_ and _b_ is the weight applied to the contribution of past performance for the object being measured. The derivative component will be calculated as follows:
|
||||
|
||||
```math
D[i] = R[i] - H[i]

(3) Derivative Value = c(D[i]) * D[i]
```
|
||||
|
||||
Where the value of _c_ is selected based on the _D_[*i*] value relative to zero. The default selection process makes _c_ equal to 0 unless _D_[*i*] is a negative value, in which case _c_ is equal to 1. The result is that the maximum penalty is applied when current behavior is lower than previously experienced behavior. If the current behavior is better than the previously experienced behavior, then the Derivative Value has no impact on the trust value. With the three components brought together, our trust value equation is calculated as follows:
|
||||
|
||||
```math
|
||||
TrustValue[i] = a * R[i] + b * H[i] + c(D[i]) * D[i]
|
||||
```
|
||||
|
||||
As a performance optimization that will keep the amount of raw interval data being saved to a reasonable size of _m_, while allowing us to represent 2^_m_ - 1 history intervals, we can employ the fading memories technique that will trade space and time complexity for the precision of the history data values by summarizing larger quantities of less recent values. While our equation above attempts to access up to _maxH_ (which can be 2^_m_ - 1), we will map those requests down to _m_ values using equation 4 below:
|
||||
|
||||
```math
|
||||
(4) j = index, where index > 0
|
||||
```
|
||||
|
||||
Where _j_ is one of _(0, 1, 2, … , m – 1)_ indices used to access history interval data. Now we can access the raw intervals using the following calculations:
|
||||
|
||||
```math
|
||||
R[0] = raw data for current time interval
|
||||
```
|
||||
|
||||
`R[j] =` 
|
||||
|
||||
### Trust Metric Store
|
||||
|
||||
Similar to the P2P subsystem AddrBook, the trust metric store will maintain information relevant to Tendermint peers. Additionally, the trust metric store will ensure that trust metrics will only be active for peers that a node is currently and directly engaged with.
|
||||
|
||||
Reactors will provide a peer key to the trust metric store in order to retrieve the associated trust metric. The trust metric can then record new positive and negative events experienced by the reactor, as well as provide the current trust score calculated by the metric.
|
||||
|
||||
When the node is shutting down, the trust metric store will save history data for trust metrics associated with all known peers. This saved information allows experiences with a peer to be preserved across node executions, which can span a tracking window of days or weeks. The trust history data is loaded automatically during OnStart.
|
||||
|
||||
### Interface Detailed Design
|
||||
|
||||
Each trust metric allows for the recording of positive/negative events, querying the current trust value/score, and the stopping/pausing of tracking over time intervals. This can be seen below:
|
||||
|
||||
```go
// TrustMetric - keeps track of peer reliability
type TrustMetric struct {
	// Private elements.
}

// Pause tells the metric to pause recording data over time intervals.
// All method calls that indicate events will unpause the metric
func (tm *TrustMetric) Pause() {}

// Stop tells the metric to stop recording data over time intervals
func (tm *TrustMetric) Stop() {}

// BadEvents indicates that an undesirable event(s) took place
func (tm *TrustMetric) BadEvents(num int) {}

// GoodEvents indicates that a desirable event(s) took place
func (tm *TrustMetric) GoodEvents(num int) {}

// TrustValue gets the dependable trust value; always between 0 and 1
func (tm *TrustMetric) TrustValue() float64 {}

// TrustScore gets a score based on the trust value always between 0 and 100
func (tm *TrustMetric) TrustScore() int {}

// NewMetric returns a trust metric with the default configuration
func NewMetric() *TrustMetric {}

//------------------------------------------------------------------------------------------------
// For example

tm := NewMetric()

tm.BadEvents(1)
score := tm.TrustScore()

tm.Stop()
```
|
||||
|
||||
Some of the trust metric parameters can be configured. The weight values should probably be left alone in most cases, yet the time durations for the tracking window and individual time interval should be considered.
|
||||
|
||||
```go
// TrustMetricConfig - Configures the weight functions and time intervals for the metric
type TrustMetricConfig struct {
	// Determines the percentage given to current behavior
	ProportionalWeight float64

	// Determines the percentage given to prior behavior
	IntegralWeight float64

	// The window of time that the trust metric will track events across.
	// This can be set to cover many days without issue
	TrackingWindow time.Duration

	// Each interval should be short for adaptability.
	// Less than 30 seconds is too sensitive,
	// and greater than 5 minutes will make the metric numb
	IntervalLength time.Duration
}

// DefaultConfig returns a config with values that have been tested and produce desirable results
func DefaultConfig() TrustMetricConfig {}

// NewMetricWithConfig returns a trust metric with a custom configuration
func NewMetricWithConfig(tmc TrustMetricConfig) *TrustMetric {}

//------------------------------------------------------------------------------------------------
// For example

config := TrustMetricConfig{
	TrackingWindow: time.Minute * 60 * 24, // one day
	IntervalLength: time.Minute * 2,
}

tm := NewMetricWithConfig(config)

tm.BadEvents(10)
tm.Pause()
tm.GoodEvents(1) // becomes active again
```
|
||||
|
||||
A trust metric store should be created with a DB that has persistent storage so it can save history data across node executions. All trust metrics instantiated by the store will be created with the provided TrustMetricConfig configuration.
|
||||
|
||||
When you attempt to fetch the trust metric for a peer, and an entry does not exist in the trust metric store, a new metric is automatically created and the entry made within the store.
|
||||
|
||||
In addition to the fetching method, GetPeerTrustMetric, the trust metric store provides a method to call when a peer has disconnected from the node. This is so the metric can be paused (history data will not be saved) for periods of time when the node is not having direct experiences with the peer.
|
||||
|
||||
```go
// TrustMetricStore - Manages all trust metrics for peers
type TrustMetricStore struct {
	cmn.BaseService

	// Private elements
}

// OnStart implements Service
func (tms *TrustMetricStore) OnStart(context.Context) error { return nil }

// OnStop implements Service
func (tms *TrustMetricStore) OnStop() {}

// NewTrustMetricStore returns a store that saves data to the DB
// and uses the config when creating new trust metrics
func NewTrustMetricStore(db dbm.DB, tmc TrustMetricConfig) *TrustMetricStore {}

// Size returns the number of entries in the trust metric store
func (tms *TrustMetricStore) Size() int {}

// GetPeerTrustMetric returns a trust metric by peer key
func (tms *TrustMetricStore) GetPeerTrustMetric(key string) *TrustMetric {}

// PeerDisconnected pauses the trust metric associated with the peer identified by the key
func (tms *TrustMetricStore) PeerDisconnected(key string) {}

//------------------------------------------------------------------------------------------------
// For example

db := dbm.NewDB("trusthistory", "goleveldb", dirPathStr)
tms := NewTrustMetricStore(db, DefaultConfig())

tm := tms.GetPeerTrustMetric(key)
tm.BadEvents(1)

tms.PeerDisconnected(key)
```
|
||||
|
||||
## Status
|
||||
|
||||
Approved.
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- The trust metric will allow Tendermint to make non-binary security and reliability decisions
|
||||
- Will help Tendermint implement deterrents that provide soft security controls, yet avoid disruption on the network
|
||||
- Will provide useful profiling information when analyzing performance over time related to peer interaction
|
||||
|
||||
### Negative
|
||||
|
||||
- Requires saving the trust metric history data across node executions
|
||||
|
||||
### Neutral
|
||||
|
||||
- Keep in mind that good events need to be recorded just as bad events do using this implementation
|
||||
@@ -1,106 +0,0 @@
|
||||
# ADR 007: Trust Metric Usage Guide
|
||||
|
||||
## Context
|
||||
|
||||
Tendermint is required to monitor peer quality in order to inform its peer dialing and peer exchange strategies.
|
||||
|
||||
When a node first connects to the network, it is important that it can quickly find good peers.
|
||||
Thus, while a node has fewer connections, it should prioritize connecting to higher quality peers.
|
||||
As the node becomes well connected to the rest of the network, it can dial lesser known or lesser
|
||||
quality peers and help assess their quality. Similarly, when queried for peers, a node should make
|
||||
sure it doesn't return low quality peers.
|
||||
|
||||
Peer quality can be tracked using a trust metric that flags certain behaviours as good or bad. When enough
|
||||
bad behaviour accumulates, we can mark the peer as bad and disconnect.
|
||||
For example, when the PEXReactor makes a request for peers' network addresses from an already known peer, and the returned network addresses are unreachable, this undesirable behavior should be tracked. Returning a few bad network addresses probably shouldn’t cause a peer to be dropped, while an excessive amount of this behavior does qualify the peer for removal. The originally proposed approach and design document for the trust metric can be found in the [ADR 006](adr-006-trust-metric.md) document.
|
||||
|
||||
The trust metric implementation allows a developer to obtain a peer's trust metric from a trust metric store and track good and bad events relevant to the peer's behavior. At any time, the peer's metric can be queried for a current trust value. The current trust value is calculated with a formula that utilizes current behavior, previous behavior, and the change between the two. Current behavior is calculated as the percentage of good behavior within a time interval. The time interval is short, probably set between 30 seconds and 5 minutes. On the other hand, the historic data can estimate a peer's behavior over days' worth of tracking. At the end of a time interval, the current behavior becomes part of the historic data, and a new time interval begins with the good and bad counters reset to zero.
|
||||
|
||||
These are some important things to keep in mind regarding how the trust metrics handle time intervals and scoring:
|
||||
|
||||
- Each new time interval begins with a perfect score
|
||||
- Bad events quickly bring the score down and good events cause the score to slowly rise
|
||||
- When the time interval is over, the percentage of good events becomes historic data.
|
||||
|
||||
Some useful information about the inner workings of the trust metric:
|
||||
|
||||
- When a trust metric is first instantiated, a timer (ticker) periodically fires in order to handle transitions between trust metric time intervals
|
||||
- If a peer is disconnected from a node, the timer should be paused, since the node is no longer connected to that peer
|
||||
- The ability to pause the metric is supported with the store **PeerDisconnected** method and the metric **Pause** method
|
||||
- After a pause, if a good or bad event method is called on a metric, it automatically becomes unpaused and begins a new time interval.
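
The interval and pause behavior described in the two lists above can be sketched as follows. This is a simplified, hypothetical stand-in (the `intervalMetric` type and its fields are not part of the actual API), with the ticker replaced by an explicit `nextInterval` call so the example stays deterministic.

```go
package main

import "fmt"

// intervalMetric is a hypothetical, simplified stand-in for a trust metric.
type intervalMetric struct {
	good, bad float64
	paused    bool
	history   []float64 // most recent completed interval first
}

// unpause resumes tracking and begins a new time interval.
func (m *intervalMetric) unpause() {
	if m.paused {
		m.paused = false
		m.good, m.bad = 0, 0
	}
}

func (m *intervalMetric) GoodEvents(n int) { m.unpause(); m.good += float64(n) }
func (m *intervalMetric) BadEvents(n int)  { m.unpause(); m.bad += float64(n) }
func (m *intervalMetric) Pause()           { m.paused = true }

// current is the fraction of good events in the running interval.
// An interval with no events yet counts as a perfect score.
func (m *intervalMetric) current() float64 {
	total := m.good + m.bad
	if total == 0 {
		return 1.0
	}
	return m.good / total
}

// nextInterval is what the ticker would call: the current value becomes
// historic data and the counters reset. Nothing is recorded while paused.
func (m *intervalMetric) nextInterval() {
	if m.paused {
		return
	}
	m.history = append([]float64{m.current()}, m.history...)
	m.good, m.bad = 0, 0
}

func main() {
	m := &intervalMetric{}
	m.BadEvents(1)
	m.GoodEvents(3)
	fmt.Println(m.current()) // 0.75

	m.nextInterval()
	fmt.Println(m.history, m.current()) // [0.75] 1

	m.Pause()
	m.nextInterval() // ignored while paused
	m.GoodEvents(1)  // unpauses and starts a new interval
	fmt.Println(len(m.history), m.paused) // 1 false
}
```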
|
||||
|
||||
## Decision
|
||||
|
||||
The trust metric capability is now available, yet it still leaves the question of how it should be applied throughout Tendermint in order to properly track the quality of peers.
|
||||
|
||||
### Proposed Process
|
||||
|
||||
Peers are managed using an address book and a trust metric:
|
||||
|
||||
- The address book keeps a record of peers and provides selection methods
|
||||
- The trust metric tracks the quality of the peers
|
||||
|
||||
#### Presence in Address Book
|
||||
|
||||
Outbound peers are added to the address book before they are dialed,
|
||||
and inbound peers are added once the peer connection is set up.
|
||||
Peers are also added to the address book when they are received in response to
|
||||
a pexRequestMessage.
|
||||
|
||||
While a node has fewer than `needAddressThreshold` addresses, it will periodically request more,
|
||||
via pexRequestMessage, from randomly selected peers and from newly dialed outbound peers.
|
||||
|
||||
When a new address is added to an address book that has more than `0.5*needAddressThreshold` addresses,
|
||||
then with some low probability, a randomly chosen low quality peer is removed.
|
||||
|
||||
#### Outbound Peers
|
||||
|
||||
Peers attempt to maintain a minimum number of outbound connections by
|
||||
repeatedly querying the address book for peers to connect to.
|
||||
While a node has few to no outbound connections, the address book is biased to return
|
||||
higher quality peers. As the node increases the number of outbound connections,
|
||||
the address book is biased to return less-vetted or lower-quality peers.
|
||||
|
||||
#### Inbound Peers
|
||||
|
||||
Peers also maintain a maximum number of total connections, MaxNumPeers.
|
||||
If a peer has MaxNumPeers, new incoming connections will be accepted with low probability.
|
||||
When such a new connection is accepted, the peer disconnects from a probabilistically chosen low ranking peer
|
||||
so it does not exceed MaxNumPeers.
|
||||
|
||||
#### Peer Exchange
|
||||
|
||||
When a peer receives a pexRequestMessage, it returns a random sample of high quality peers from the address book. Peers with no score or a low score should not be included in a response to pexRequestMessage.
|
||||
|
||||
#### Peer Quality
|
||||
|
||||
Peer quality is tracked in the connection and across the reactors by storing the TrustMetric in the peer's
|
||||
thread safe Data store.
|
||||
|
||||
Peer behaviour is then defined as one of the following:
|
||||
|
||||
- Fatal - something outright malicious that causes us to disconnect the peer and ban it from the address book for some amount of time
|
||||
- Bad - Any kind of timeout, messages that don't unmarshal, fail other validity checks, or messages we didn't ask for or aren't expecting (usually worth one bad event)
|
||||
- Neutral - Unknown channels/message types/version upgrades (no good or bad events recorded)
|
||||
- Correct - Normal correct behavior (worth one good event)
|
||||
- Good - some random majority of peers per reactor sending us useful messages (worth more than one good event).
|
||||
|
||||
Note that Fatal behaviour causes us to remove the peer, and neutral behaviour does not affect the score.
|
||||
|
||||
## Status
|
||||
|
||||
Proposed.
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Bringing the address book and trust metric store together will cause the network to be built in a way that encourages greater security and reliability.
|
||||
|
||||
### Negative
|
||||
|
||||
- TBD
|
||||
|
||||
### Neutral
|
||||
|
||||
- Keep in mind that good events need to be recorded just as bad events do using this implementation.
|
||||
@@ -1,35 +0,0 @@
|
||||
# ADR 008: SocketPV
|
||||
|
||||
Tendermint nodes should support only two in-process PrivValidator
|
||||
implementations:
|
||||
|
||||
- FilePV uses an unencrypted private key in a "priv_validator.json" file - no
|
||||
configuration required (just `tendermint init validator`).
|
||||
- TCPVal and IPCVal use TCP and Unix sockets respectively to send signing requests
|
||||
to another process - the user is responsible for starting that process themselves.
|
||||
|
||||
Both TCPVal and IPCVal addresses can be provided via flags at the command line
|
||||
or in the configuration file; TCPVal addresses must be of the form
|
||||
`tcp://<ip_address>:<port>` and IPCVal addresses `unix:///path/to/file.sock` -
|
||||
doing so will cause Tendermint to ignore any private validator files.
|
||||
|
||||
TCPVal will listen on the given address for incoming connections from an external
|
||||
private validator process. It will halt any operation until at least one external
|
||||
process has successfully connected.
|
||||
|
||||
The external priv_validator process will dial the address to connect to
|
||||
Tendermint, and then Tendermint will send requests on the ensuing connection to
|
||||
sign votes and proposals. Thus the external process initiates the connection,
|
||||
but the Tendermint process makes all requests. In a later stage we're going to
|
||||
support multiple validators for fault tolerance. To prevent double signing they
|
||||
need to be synced, which is deferred to an external solution (see #1185).
|
||||
|
||||
Conversely, IPCVal will make an outbound connection to an existing socket opened
|
||||
by the external validator process.
|
||||
|
||||
In addition, Tendermint will provide implementations that can be run in that
|
||||
external process. These include:
|
||||
|
||||
- FilePV will encrypt the private key, and the user must enter a password to
decrypt the key when the process is started.
|
||||
- LedgerPV uses a Ledger Nano S to handle all signing.
|
||||
@@ -1,271 +0,0 @@
|
||||
# ADR 009: ABCI UX Improvements
|
||||
|
||||
## Changelog
|
||||
|
||||
23-06-2018: Some minor fixes from review
|
||||
07-06-2018: Some updates based on discussion with Jae
|
||||
07-06-2018: Initial draft to match what was released in ABCI v0.11
|
||||
|
||||
## Context
|
||||
|
||||
The ABCI was first introduced in late 2015. Its purpose is to be:
|
||||
|
||||
- a generic interface between state machines and their replication engines
|
||||
- agnostic to the language the state machine is written in
|
||||
- agnostic to the replication engine that drives it
|
||||
|
||||
This means ABCI should provide an interface for both pluggable applications and
|
||||
pluggable consensus engines.
|
||||
|
||||
To achieve this, it uses Protocol Buffers (proto3) for message types. The dominant
|
||||
implementation is in Go.
|
||||
|
||||
After some recent discussions with the community on github, the following were
|
||||
identified as pain points:
|
||||
|
||||
- Amino encoded types
|
||||
- Managing validator sets
|
||||
- Imports in the protobuf file
|
||||
|
||||
See the [references](#references) for more.
|
||||
|
||||
### Imports
|
||||
|
||||
The native proto library in Go generates inflexible and verbose code.
|
||||
Many in the Go community have adopted a fork called
|
||||
[gogoproto](https://github.com/gogo/protobuf) that provides a
|
||||
variety of features aimed to improve the developer experience.
|
||||
While `gogoproto` is nice, it creates an additional dependency, and compiling
|
||||
the protobuf types for other languages has been reported to fail when `gogoproto` is used.
|
||||
|
||||
### Amino
|
||||
|
||||
Amino is an encoding protocol designed to improve over insufficiencies of protobuf.
|
||||
Its goal is to be proto4.
|
||||
|
||||
Many people are frustrated by incompatibility with protobuf,
|
||||
and with the requirement for Amino to be used at all within ABCI.
|
||||
|
||||
We intend to make Amino successful enough that we can eventually use it for ABCI
|
||||
message types directly. By then it should be called proto4. In the meantime,
|
||||
we want it to be easy to use.
|
||||
|
||||
### PubKey
|
||||
|
||||
PubKeys are encoded using Amino (and before that, go-wire).
|
||||
Ideally, PubKeys are an interface type where we don't know all the
|
||||
implementation types, so it's unfitting to use `oneof` or `enum`.
|
||||
|
||||
### Addresses
|
||||
|
||||
The address for ED25519 pubkey is the RIPEMD160 of the Amino
|
||||
encoded pubkey. This introduces an Amino dependency in the address generation,
|
||||
a functionality that is widely required and should be as easy to compute as
|
||||
possible.
|
||||
|
||||
### Validators
|
||||
|
||||
To change the validator set, applications can return a list of validator updates
|
||||
with ResponseEndBlock. In these updates, the public key _must_ be included,
|
||||
because Tendermint requires the public key to verify validator signatures. This
|
||||
means ABCI developers have to work with PubKeys. That said, it would also be
|
||||
convenient to work with address information, and for it to be simple to do so.
|
||||
|
||||
### AbsentValidators
|
||||
|
||||
Tendermint also provides a list of validators in BeginBlock who did not sign the
|
||||
last block. This allows applications to reflect availability behaviour in the
|
||||
application, for instance by punishing validators for not having votes included
|
||||
in commits.
|
||||
|
||||
### InitChain
|
||||
|
||||
Tendermint passes in a list of validators here, and nothing else. It would
|
||||
benefit the application to be able to control the initial validator set. For
|
||||
instance the genesis file could include application-based information about the
|
||||
initial validator set that the application could process to determine the
|
||||
initial validator set. Additionally, InitChain would benefit from getting all
|
||||
the genesis information.
|
||||
|
||||
### Header
|
||||
|
||||
ABCI provides the Header in RequestBeginBlock so the application can have
|
||||
important information about the latest state of the blockchain.
|
||||
|
||||
## Decision
|
||||
|
||||
### Imports
|
||||
|
||||
Move away from gogoproto. In the short term, we will just maintain a second
|
||||
protobuf file without the gogoproto annotations. In the medium term, we will
|
||||
make copies of all the structs in Golang and shuttle back and forth. In the long
|
||||
term, we will use Amino.
|
||||
|
||||
### Amino
|
||||
|
||||
To simplify ABCI application development in the short term,
|
||||
Amino will be completely removed from the ABCI:
|
||||
|
||||
- It will not be required for PubKey encoding
|
||||
- It will not be required for computing PubKey addresses
|
||||
|
||||
That said, we are working to make Amino a huge success, and to become proto4.
|
||||
To facilitate adoption and cross-language compatibility in the near-term, Amino
|
||||
v1 will:
|
||||
|
||||
- be fully compatible with the subset of proto3 that excludes `oneof`
|
||||
- use the Amino prefix system to provide interface types, as opposed to `oneof`
|
||||
style union types.
|
||||
|
||||
That said, an Amino v2 will be worked on to improve the performance of the
|
||||
format and its usability in cryptographic applications.
|
||||
|
||||
### PubKey
|
||||
|
||||
Encoding schemes infect software. As a generic middleware, ABCI aims to have
|
||||
some cross scheme compatibility. For this it has no choice but to include opaque
|
||||
bytes from time to time. While we will not enforce Amino encoding for these
|
||||
bytes yet, we need to provide a type system. The simplest way to do this is to
|
||||
use a type string.
|
||||
|
||||
PubKey will now look like:
|
||||
|
||||
```
|
||||
message PubKey {
|
||||
string type
|
||||
bytes data
|
||||
}
|
||||
```
|
||||
|
||||
where `type` can be:
|
||||
|
||||
- "ed25519", with `data = <raw 32-byte pubkey>`
|
||||
- "secp256k1", with `data = <33-byte OpenSSL compressed pubkey>`
|
||||
|
||||
As we want to retain flexibility here, and since ideally, PubKey would be an
|
||||
interface type, we do not use `enum` or `oneof`.
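
On the application side, a type-string design like this typically means dispatching on `type` and checking `data` accordingly. The following is a rough Go sketch of that idea; the `PubKey` struct and `Validate` method are hypothetical mirrors of the message above (not generated protobuf code), and it assumes the Ed25519 type string is spelled `ed25519`.

```go
package main

import "fmt"

// PubKey mirrors the message above: a type string plus opaque bytes.
type PubKey struct {
	Type string
	Data []byte
}

// Validate checks the data length expected for each known type.
// Unknown types are rejected here, but an application could equally
// choose to pass them through, which is the flexibility the type
// string is meant to retain.
func (pk PubKey) Validate() error {
	switch pk.Type {
	case "ed25519":
		if len(pk.Data) != 32 {
			return fmt.Errorf("ed25519 pubkey must be 32 bytes, got %d", len(pk.Data))
		}
	case "secp256k1":
		if len(pk.Data) != 33 {
			return fmt.Errorf("secp256k1 pubkey must be 33 bytes, got %d", len(pk.Data))
		}
	default:
		return fmt.Errorf("unknown pubkey type %q", pk.Type)
	}
	return nil
}

func main() {
	fmt.Println(PubKey{Type: "ed25519", Data: make([]byte, 32)}.Validate())
	fmt.Println(PubKey{Type: "secp256k1", Data: make([]byte, 32)}.Validate())
	fmt.Println(PubKey{Type: "bls", Data: nil}.Validate())
}
```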
|
||||
|
||||
### Addresses
|
||||
|
||||
To simplify and improve computing addresses, we change it to the first 20-bytes of the SHA256
|
||||
of the raw 32-byte public key.
|
||||
|
||||
We continue to use the Bitcoin address scheme for secp256k1 keys.
|
||||
|
||||
### Validators
|
||||
|
||||
Add a `bytes address` field:
|
||||
|
||||
```
|
||||
message Validator {
|
||||
bytes address
|
||||
PubKey pub_key
|
||||
int64 power
|
||||
}
|
||||
```
|
||||
|
||||
### RequestBeginBlock and AbsentValidators
|
||||
|
||||
To simplify this, RequestBeginBlock will include the complete validator set,
|
||||
including the address, and voting power of each validator, along
|
||||
with a boolean for whether or not they voted:
|
||||
|
||||
```
|
||||
message RequestBeginBlock {
|
||||
bytes hash
|
||||
Header header
|
||||
LastCommitInfo last_commit_info
|
||||
repeated Evidence byzantine_validators
|
||||
}
|
||||
|
||||
message LastCommitInfo {
|
||||
int32 CommitRound
|
||||
repeated SigningValidator validators
|
||||
}
|
||||
|
||||
message SigningValidator {
|
||||
Validator validator
|
||||
bool signed_last_block
|
||||
}
|
||||
```
|
||||
|
||||
Note that in Validators in RequestBeginBlock, we DO NOT include public keys. Public keys are
|
||||
larger than addresses and in the future, with quantum computers, will be much
|
||||
larger. The overhead of passing them, especially during fast-sync, is
|
||||
significant.
|
||||
|
||||
Additionally, addresses are changing to be simpler to compute, further removing
|
||||
the need to include pubkeys here.
|
||||
|
||||
In short, ABCI developers must be aware of both addresses and public keys.
|
||||
|
||||
### ResponseEndBlock
|
||||
|
||||
Since ResponseEndBlock includes Validator, it must now include their address.
|
||||
|
||||
### InitChain
|
||||
|
||||
Change RequestInitChain to give the app all the information from the genesis file:
|
||||
|
||||
```
|
||||
message RequestInitChain {
|
||||
int64 time
|
||||
string chain_id
|
||||
ConsensusParams consensus_params
|
||||
repeated Validator validators
|
||||
bytes app_state_bytes
|
||||
}
|
||||
```
|
||||
|
||||
Change ResponseInitChain to allow the app to specify the initial validator set
|
||||
and consensus parameters.
|
||||
|
||||
```
|
||||
message ResponseInitChain {
|
||||
ConsensusParams consensus_params
|
||||
repeated Validator validators
|
||||
}
|
||||
```
|
||||
|
||||
### Header
|
||||
|
||||
Now that Tendermint Amino will be compatible with proto3, the Header in ABCI
|
||||
should exactly match the Tendermint header - they will then be encoded
|
||||
identically in ABCI and in Tendermint Core.
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Easier for developers to build on the ABCI
|
||||
- ABCI and Tendermint headers are identically serialized
|
||||
|
||||
### Negative
|
||||
|
||||
- Maintenance overhead of alternative type encoding scheme
|
||||
- Performance overhead of passing all validator info every block (at least it's
|
||||
only addresses, and not also pubkeys)
|
||||
- Maintenance overhead of duplicate types
|
||||
|
||||
### Neutral
|
||||
|
||||
- ABCI developers must know about validator addresses
|
||||
|
||||
## References
|
||||
|
||||
- [ABCI v0.10.3 Specification (before this
|
||||
proposal)](https://github.com/tendermint/abci/blob/v0.10.3/specification.rst)
|
||||
- [ABCI v0.11.0 Specification (implementing first draft of this
|
||||
proposal)](https://github.com/tendermint/abci/blob/v0.11.0/specification.md)
|
||||
- [Ed25519 addresses](https://github.com/tendermint/go-crypto/issues/103)
|
||||
- [InitChain contains the
|
||||
Genesis](https://github.com/tendermint/abci/issues/216)
|
||||
- [PubKeys](https://github.com/tendermint/tendermint/issues/1524)
|
||||
- [Notes on
|
||||
Header](https://github.com/tendermint/tendermint/issues/1605)
|
||||
- [Gogoproto issues](https://github.com/tendermint/abci/issues/256)
|
||||
- [Absent Validators](https://github.com/tendermint/abci/issues/231)
|
||||
@@ -1,77 +0,0 @@
|
||||
# ADR 010: Crypto Changes
|
||||
|
||||
## Context
|
||||
|
||||
Tendermint is a cryptographic protocol that uses and composes a variety of cryptographic primitives.
|
||||
|
||||
After nearly 4 years of development, Tendermint has recently undergone multiple security reviews to search for vulnerabilities and to assess the use and composition of cryptographic primitives.
|
||||
|
||||
### Hash Functions
|
||||
|
||||
Tendermint uses RIPEMD160 universally as a hash function, most notably in its Merkle tree implementation.
|
||||
|
||||
RIPEMD160 was chosen because it provides the shortest fingerprint that is long enough to be considered secure (i.e., a birthday bound of 80 bits).
|
||||
It was also developed in the open academic community, unlike NSA-designed algorithms like SHA256.
|
||||
|
||||
That said, the cryptographic community appears to unanimously agree on the security of SHA256. It has become a universal standard, especially now that SHA1 is broken, being required in TLS connections and having optimized support in hardware.
|
||||
|
||||
### Merkle Trees
|
||||
|
||||
Tendermint uses a simple Merkle tree to compute digests of large structures like transaction batches
|
||||
and even blockchain headers. The Merkle tree length prefixes byte arrays before concatenating and hashing them.
|
||||
It uses RIPEMD160.
|
||||
|
||||
### Addresses
|
||||
|
||||
ED25519 addresses are computed using the RIPEMD160 of the Amino encoding of the public key.
|
||||
RIPEMD160 is generally considered an outdated hash function, and is much slower
|
||||
than more modern functions like SHA256 or Blake2.
|
||||
|
||||
### Authenticated Encryption
|
||||
|
||||
Tendermint P2P connections use authenticated encryption to provide privacy and authentication in the communications.
|
||||
This is done using the simple Station-to-Station protocol with the NaCL Ed25519 library.
|
||||
|
||||
While there have been no vulnerabilities found in the implementation, there are some concerns:
|
||||
|
||||
- NaCL uses Salsa20, a not widely used and relatively outdated stream cipher that has been obsoleted by ChaCha20
|
||||
- Connections use RIPEMD160 to compute a value that is used for the encryption nonce with subtle requirements on how it's used
|
||||
|
||||
## Decision
|
||||
|
||||
### Hash Functions
|
||||
|
||||
Use the first 20-bytes of the SHA256 hash instead of RIPEMD160 for everything
|
||||
|
||||
### Merkle Trees
|
||||
|
||||
TODO
|
||||
|
||||
### Addresses
|
||||
|
||||
Compute ED25519 addresses as the first 20-bytes of the SHA256 of the raw 32-byte public key
|
||||
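The address rule above can be sketched as follows. This is a minimal illustration, assuming the input is the raw 32-byte Ed25519 public key; the names `AddressSize` and `addressFromPubKey` are hypothetical and are not taken from the Tendermint codebase.

```go
package main

import "crypto/sha256"

// AddressSize is the truncated digest length in bytes (hypothetical name).
const AddressSize = 20

// addressFromPubKey returns the first 20 bytes of SHA256(pubKey).
// pubKey is assumed to be the raw 32-byte Ed25519 public key,
// not its Amino encoding.
func addressFromPubKey(pubKey []byte) []byte {
	sum := sha256.Sum256(pubKey) // full 32-byte digest
	addr := make([]byte, AddressSize)
	copy(addr, sum[:AddressSize]) // keep only the leading 20 bytes
	return addr
}
```

Truncating SHA256 keeps the 20-byte address length that RIPEMD160 produced, so address-sized fields would not need to change.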
|
||||
### Authenticated Encryption
|
||||
|
||||
Make the following changes:
|
||||
|
||||
- Use xChaCha20 instead of xSalsa20 - https://github.com/tendermint/tendermint/issues/1124
|
||||
- Use an HKDF instead of RIPEMD160 to compute nonces - https://github.com/tendermint/tendermint/issues/1165
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- More modern and standard cryptographic functions with wider adoption and hardware acceleration
|
||||
|
||||
### Negative
|
||||
|
||||
- Exact authenticated encryption construction isn't already provided in a well-used library
|
||||
|
||||
### Neutral
|
||||
|
||||
## References
|
||||
@@ -1,116 +0,0 @@
|
||||
# ADR 011: Monitoring
|
||||
|
||||
## Changelog
|
||||
|
||||
08-06-2018: Initial draft
|
||||
11-06-2018: Reorg after @xla comments
|
||||
13-06-2018: Clarification about usage of labels
|
||||
|
||||
## Context
|
||||
|
||||
In order to bring more visibility into Tendermint, we would like it to report
|
||||
metrics and, maybe later, traces of transactions and RPC queries. See
|
||||
https://github.com/tendermint/tendermint/issues/986.
|
||||
|
||||
A few solutions were considered:
|
||||
|
||||
1. [Prometheus](https://prometheus.io)
|
||||
a) Prometheus API
|
||||
b) [go-kit metrics package](https://github.com/go-kit/kit/tree/master/metrics) as an interface plus Prometheus
|
||||
c) [telegraf](https://github.com/influxdata/telegraf)
|
||||
d) new service, which will listen to events emitted by pubsub and report metrics
|
||||
2. [OpenCensus](https://opencensus.io/introduction/)
|
||||
|
||||
### 1. Prometheus
|
||||
|
||||
Prometheus seems to be the most popular product out there for monitoring. It has
|
||||
a Go client library, powerful queries, alerts.
|
||||
|
||||
**a) Prometheus API**
|
||||
|
||||
We can commit to using Prometheus in Tendermint, but I think Tendermint users
|
||||
should be free to choose whatever monitoring tool they feel will better suit
|
||||
their needs (if they don't have existing one already). So we should try to
|
||||
abstract interface enough so people can switch between Prometheus and other
|
||||
similar tools.
|
||||
|
||||
**b) go-kit metrics package as an interface**
|
||||
|
||||
metrics package provides a set of uniform interfaces for service
|
||||
instrumentation and offers adapters to popular metrics packages:
|
||||
|
||||
https://godoc.org/github.com/go-kit/kit/metrics#pkg-subdirectories
|
||||
|
||||
Comparing to Prometheus API, we're losing customisability and control, but gaining
|
||||
freedom in choosing any instrument from the above list given we will extract
|
||||
metrics creation into a separate function (see "providers" in node/node.go).
|
||||
|
||||
**c) telegraf**
|
||||
|
||||
Unlike already discussed options, telegraf does not require modifying Tendermint
|
||||
source code. You create something called an input plugin, which polls
|
||||
Tendermint RPC every second and calculates the metrics itself.
|
||||
|
||||
While it may sound good, some metrics we want to report are not exposed via
|
||||
RPC or pubsub, therefore can't be accessed externally.
|
||||
|
||||
**d) service, listening to pubsub**
|
||||
|
||||
Same issue as the above.
|
||||
|
||||
### 2. opencensus
|
||||
|
||||
opencensus provides both metrics and tracing, which may be important in the
|
||||
future. Its API looks different from go-kit and Prometheus, but looks like it
|
||||
covers everything we need.
|
||||
|
||||
Unfortunately, OpenCensus go client does not define any
|
||||
interfaces, so if we want to abstract away metrics we
|
||||
will need to write interfaces ourselves.
|
||||
|
||||
### List of metrics
|
||||
|
||||
| | Name | Type | Description |
|
||||
| --- | ------------------------------------ | ------ | ----------------------------------------------------------------------------- |
|
||||
| A | consensus_height | Gauge | |
|
||||
| A | consensus_validators | Gauge | Number of validators who signed |
|
||||
| A | consensus_validators_power | Gauge | Total voting power of all validators |
|
||||
| A | consensus_missing_validators | Gauge | Number of validators who did not sign |
|
||||
| A | consensus_missing_validators_power | Gauge | Total voting power of the missing validators |
|
||||
| A | consensus_byzantine_validators | Gauge | Number of validators who tried to double sign |
|
||||
| A | consensus_byzantine_validators_power | Gauge | Total voting power of the byzantine validators |
|
||||
| A | consensus_block_interval | Timing | Time between this and last block (Block.Header.Time) |
|
||||
| | consensus_block_time | Timing | Time to create a block (from creating a proposal to commit) |
|
||||
| | consensus_time_between_blocks | Timing | Time between committing last block and (receiving proposal creating proposal) |
|
||||
| A | consensus_rounds | Gauge | Number of rounds |
|
||||
| | consensus_prevotes | Gauge | |
|
||||
| | consensus_precommits | Gauge | |
|
||||
| | consensus_prevotes_total_power | Gauge | |
|
||||
| | consensus_precommits_total_power | Gauge | |
|
||||
| A | consensus_num_txs | Gauge | |
|
||||
| A | mempool_size | Gauge | |
|
||||
| A | consensus_total_txs | Gauge | |
|
||||
| A | consensus_block_size | Gauge | In bytes |
|
||||
| A | p2p_peers | Gauge | Number of peers node's connected to |
|
||||
|
||||
`A` - will be implemented in the first place.
|
||||
|
||||
**Proposed solution**
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
Better visibility, support of variety of monitoring backends
|
||||
|
||||
### Negative
|
||||
|
||||
One more library to audit, messing metrics reporting code with business domain.
|
||||
|
||||
### Neutral
|
||||
|
||||
-
|
||||
@@ -1,113 +0,0 @@
|
||||
# ADR 012: PeerTransport
|
||||
|
||||
## Context
|
||||
|
||||
One of the more apparent problems with the current architecture in the p2p
|
||||
package is that there is no clear separation of concerns between different
|
||||
components. Most notably the `Switch` is currently doing physical connection
|
||||
handling. An artifact is the dependency of the Switch on
|
||||
[`config.P2PConfig`](https://github.com/tendermint/tendermint/blob/05a76fb517f50da27b4bfcdc7b4cf185fc61eff6/config/config.go#L272-L339).
|
||||
|
||||
Addresses:
|
||||
|
||||
- [#2046](https://github.com/tendermint/tendermint/issues/2046)
|
||||
- [#2047](https://github.com/tendermint/tendermint/issues/2047)
|
||||
|
||||
First iteration in [#2067](https://github.com/tendermint/tendermint/issues/2067)
|
||||
|
||||
## Decision
|
||||
|
||||
Transport concerns will be handled by a new component (`PeerTransport`) which
|
||||
will provide Peers at its boundary to the caller. In turn `Switch` will use
|
||||
this new component to accept new `Peer`s and dial them based on `NetAddress`.
|
||||
|
||||
### PeerTransport
|
||||
|
||||
Responsible for emitting and connecting to Peers. The implementation of `Peer`
|
||||
is left to the transport, which implies that the chosen transport dictates the
|
||||
characteristics of the implementation handed back to the `Switch`. Each
|
||||
transport implementation is responsible for filtering establishing peers specific
|
||||
to its domain; for the default multiplexed implementation the following will
|
||||
apply:
|
||||
|
||||
- connections from our own node
|
||||
- handshake fails
|
||||
- upgrade to secret connection fails
|
||||
- prevent duplicate ip
|
||||
- prevent duplicate id
|
||||
- nodeinfo incompatibility
|
||||
|
||||
```go
|
||||
// PeerTransport proxies incoming and outgoing peer connections.
|
||||
type PeerTransport interface {
|
||||
// Accept returns a newly connected Peer.
|
||||
Accept() (Peer, error)
|
||||
|
||||
// Dial connects to a Peer.
|
||||
Dial(NetAddress) (Peer, error)
|
||||
}
|
||||
|
||||
// EXAMPLE OF DEFAULT IMPLEMENTATION
|
||||
|
||||
// multiplexTransport accepts tcp connections and upgrades to multiplexed
|
||||
// peers.
|
||||
type multiplexTransport struct {
|
||||
listener net.Listener
|
||||
|
||||
acceptc chan accept
|
||||
closec <-chan struct{}
|
||||
listenc <-chan struct{}
|
||||
|
||||
dialTimeout time.Duration
|
||||
handshakeTimeout time.Duration
|
||||
nodeAddr NetAddress
|
||||
nodeInfo NodeInfo
|
||||
nodeKey NodeKey
|
||||
|
||||
// TODO(xla): Remove when MConnection is refactored into mPeer.
|
||||
mConfig conn.MConnConfig
|
||||
}
|
||||
|
||||
var _ PeerTransport = (*multiplexTransport)(nil)
|
||||
|
||||
// NewMTransport returns network connected multiplexed peers.
|
||||
func NewMTransport(
|
||||
nodeAddr NetAddress,
|
||||
nodeInfo NodeInfo,
|
||||
nodeKey NodeKey,
|
||||
) *multiplexTransport
|
||||
```
|
||||
|
||||
### Switch
|
||||
|
||||
From now the Switch will depend on a fully setup `PeerTransport` to
|
||||
retrieve/reach out to its peers. As the more low-level concerns are pushed to
|
||||
the transport, we can omit passing the `config.P2PConfig` to the Switch.
|
||||
|
||||
```go
|
||||
func NewSwitch(transport PeerTransport, opts ...SwitchOption) *Switch
|
||||
```
|
||||
|
||||
## Status
|
||||
|
||||
In Review.
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- free Switch from transport concerns - simpler implementation
|
||||
- pluggable transport implementation - simpler test setup
|
||||
- remove Switch dependency on P2PConfig - easier to test
|
||||
|
||||
### Negative
|
||||
|
||||
- more setup for tests which depend on Switches
|
||||
|
||||
### Neutral
|
||||
|
||||
- multiplexed will be the default implementation
|
||||
|
||||
[0] These guards could be potentially extended to be pluggable much like
|
||||
middlewares to express different concerns required by differently configured
|
||||
environments.
|
||||
@@ -1,99 +0,0 @@
|
||||
# ADR 013: Need for symmetric cryptography
|
||||
|
||||
## Context
|
||||
|
||||
We require symmetric ciphers to handle how we encrypt keys in the sdk,
|
||||
and to potentially encrypt `priv_validator.json` in tendermint.
|
||||
|
||||
Currently we use AEAD's to support symmetric encryption,
|
||||
which is great since we want data integrity in addition to privacy and authenticity.
|
||||
We don't currently have a scenario where we want to encrypt without data integrity,
|
||||
so it is fine to optimize our code to just use AEAD's.
|
||||
Currently there is no way to switch out AEAD's easily; this ADR outlines a way
|
||||
to easily swap these out.
|
||||
|
||||
### How do we encrypt with AEAD's
|
||||
|
||||
AEAD's typically require a nonce in addition to the key.
|
||||
For the purposes we require symmetric cryptography for,
|
||||
we need encryption to be stateless.
|
||||
Because of this we use random nonces.
|
||||
(Thus the AEAD must support random nonces)
|
||||
|
||||
We currently construct a random nonce, and encrypt the data with it.
|
||||
The returned value is `nonce || encrypted data`.
|
||||
The limitation of this is that it does not provide a way to identify
|
||||
which algorithm was used in encryption.
|
||||
Consequently decryption with multiple algorithms is sub-optimal.
|
||||
(You have to try them all)
|
||||
|
||||
## Decision
|
||||
|
||||
We should create the following three functions in a new `crypto/encoding/symmetric` package:
|
||||
|
||||
```golang
|
||||
func Encrypt(aead cipher.AEAD, plaintext []byte) (ciphertext []byte, err error)
|
||||
func Decrypt(key []byte, ciphertext []byte) (plaintext []byte, err error)
|
||||
func Register(aead cipher.AEAD, algo_name string, NewAead func(key []byte) (cipher.AEAD, error)) error
|
||||
```
|
||||
|
||||
This allows you to specify the algorithm in encryption, but not have to specify
|
||||
it in decryption.
|
||||
This is intended for ease of use in downstream applications, in addition to people
|
||||
looking at the file directly.
|
||||
One downside is that for the encrypt function you must have already initialized an AEAD,
|
||||
but I don't really see this as an issue.
|
||||
|
||||
If there is no error in encryption, Encrypt will return `algo_name || nonce || aead_ciphertext`.
|
||||
`algo_name` should be length prefixed, using standard varuint encoding.
|
||||
This will be binary data, but that's not a problem considering the nonce and ciphertext are also binary.
|
||||
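The `algo_name || nonce || aead_ciphertext` framing could look roughly like the sketch below. It covers only the length-prefixed name, using the unsigned varint encoding from Go's `encoding/binary`; the helper names `prependAlgoName` and `splitAlgoName` are hypothetical, and the payload is assumed to be the already-concatenated `nonce || aead_ciphertext`.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// prependAlgoName returns uvarint(len(algoName)) || algoName || payload.
// payload is assumed to already be nonce || aead_ciphertext.
func prependAlgoName(algoName string, payload []byte) []byte {
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, uint64(len(algoName)))
	out := append(buf[:n], algoName...)
	return append(out, payload...)
}

// splitAlgoName reverses prependAlgoName: it reads the length prefix,
// extracts the algorithm name, and returns the remaining bytes.
func splitAlgoName(data []byte) (string, []byte, error) {
	nameLen, n := binary.Uvarint(data)
	if n <= 0 {
		return "", nil, errors.New("malformed length prefix")
	}
	rest := data[n:]
	if nameLen > uint64(len(rest)) {
		return "", nil, errors.New("truncated algorithm name")
	}
	return string(rest[:nameLen]), rest[nameLen:], nil
}
```

On decryption, the name recovered by `splitAlgoName` would be used to look up the registered constructor, and the AEAD's `NonceSize()` then tells the caller where the nonce ends and the ciphertext begins.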
|
||||
This solution requires a mapping from aead type to name.
|
||||
We can achieve this via reflection.
|
||||
|
||||
```golang
|
||||
func getType(myvar interface{}) string {
|
||||
if t := reflect.TypeOf(myvar); t.Kind() == reflect.Ptr {
|
||||
return "*" + t.Elem().Name()
|
||||
} else {
|
||||
return t.Name()
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Then we maintain a map from the name returned from `getType(aead)` to `algo_name`.
|
||||
|
||||
In decryption, we read the `algo_name`, and then instantiate a new AEAD with the key.
|
||||
Then we call the AEAD's decrypt method on the provided nonce/ciphertext.
|
||||
|
||||
`Register` allows a downstream user to add their own desired AEAD to the symmetric package.
|
||||
It will error if the AEAD name is already registered.
|
||||
This prevents a malicious import from modifying / nullifying an AEAD at runtime.
|
||||
|
||||
## Implementation strategy
|
||||
|
||||
The golang implementation of what is proposed is rather straightforward.
|
||||
The concern is that we will break existing private keys if we just switch to this.
|
||||
If this is concerning, we can make a simple script which doesn't require decoding privkeys,
|
||||
for converting from the old format to the new one.
|
||||
|
||||
## Status
|
||||
|
||||
Proposed.
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Allows us to support new AEAD's, in a way that makes decryption easier
|
||||
- Allows downstream users to add their own AEAD
|
||||
|
||||
### Negative
|
||||
|
||||
- We will have to break all private keys stored on disk.
|
||||
They can be recovered using seed words, and upgrade scripts are simple.
|
||||
|
||||
### Neutral
|
||||
|
||||
- Caller has to instantiate the AEAD with the private key.
|
||||
However it forces them to be aware of what signing algorithm they are using, which is a positive.
|
||||
@@ -1,63 +0,0 @@
|
||||
# ADR 014: Secp256k1 Signature Malleability
|
||||
|
||||
## Context
|
||||
|
||||
Secp256k1 has two layers of malleability.
|
||||
The signer has a random nonce, and thus can produce many different valid signatures.
|
||||
This ADR is not concerned with that.
|
||||
The second layer of malleability basically allows one who is given a signature
|
||||
to produce exactly one more valid signature for the same message from the same public key.
|
||||
(They don't even have to know the message!)
|
||||
The math behind this will be explained in the subsequent section.
|
||||
|
||||
Note that in many downstream applications, signatures will appear in a transaction, and therefore in the tx hash.
|
||||
This means that if someone broadcasts a transaction with secp256k1 signature, the signature can be altered into the other form by anyone in the p2p network.
|
||||
Thus the tx hash will change, and this altered tx hash may be committed instead.
|
||||
This breaks the assumption that you can broadcast a valid transaction and just wait for its hash to be included on chain.
|
||||
One example is if you are broadcasting a tx in cosmos,
|
||||
and you wait for it to appear on chain before incrementing your sequence number.
|
||||
You may never increment your sequence number if a different tx hash got committed.
|
||||
Removing this second layer of signature malleability concerns could ease downstream development.
|
||||
|
||||
### ECDSA context
|
||||
|
||||
Secp256k1 is ECDSA over a particular curve.
|
||||
The signature is of the form `(r, s)`, where `s` is a field element.
|
||||
(The particular field is the `Z_n`, where the elliptic curve has order `n`)
|
||||
However, `(r, -s)` is also a valid signature.
|
||||
Note that anyone can negate a group element, and therefore can get this second signature.
|
||||
|
||||
## Decision
|
||||
|
||||
We can just distinguish a canonical form for the ECDSA signatures.
|
||||
Then we require that all ECDSA signatures be in the form which we defined as canonical.
|
||||
We reject signatures in non-canonical form.
|
||||
|
||||
A canonical form is rather easy to define and check.
|
||||
It would just be the smaller of the two values for `s`, defined lexicographically.
|
||||
This is a simple check: instead of checking `s < n`, check `s <= (n - 1)/2`.
|
||||
An example of another cryptosystem using this
|
||||
is the parity definition here https://github.com/zkcrypto/pairing/pull/30#issuecomment-372910663.
|
||||
|
||||
This is the same solution Ethereum has chosen for solving secp malleability.
|
||||
|
||||
## Proposed Implementation
|
||||
|
||||
Fork https://github.com/btcsuite/btcd, and just update the [parse sig method](https://github.com/btcsuite/btcd/blob/11fcd83963ab0ecd1b84b429b1efc1d2cdc6d5c5/btcec/signature.go#L195) and serialize functions to enforce our canonical form.
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Lets us maintain the ability to expect a tx hash to appear in the blockchain.
|
||||
|
||||
### Negative
|
||||
|
||||
- More work in all future implementations (Though this is a very simple check)
|
||||
- Requires us to maintain another fork
|
||||
|
||||
### Neutral
|
||||
@@ -1,84 +0,0 @@
|
||||
# ADR 015: Crypto encoding
|
||||
|
||||
## Context
|
||||
|
||||
We must standardize our method for encoding public keys and signatures on chain.
|
||||
Currently we amino encode the public keys and signatures.
|
||||
The reason we are using amino here is primarily due to ease of support in
|
||||
parsing for other languages.
|
||||
We don't need its upgradability properties in cryptosystems, as a change in
|
||||
the crypto that requires adapting the encoding, likely warrants being deemed
|
||||
a new cryptosystem.
|
||||
(I.e. using new public parameters)
|
||||
|
||||
## Decision
|
||||
|
||||
### Public keys
|
||||
|
||||
For public keys, we will continue to use amino encoding on the canonical
|
||||
representation of the pubkey.
|
||||
(Canonical as defined by the cryptosystem itself)
|
||||
This has two significant drawbacks.
|
||||
Amino encoding is less space-efficient, due to requiring support for upgradability.
|
||||
Amino encoding support requires forking protobuf and adding this new interface support
|
||||
option in the language of choice.
|
||||
|
||||
The reason for continuing to use amino however is that people can create code
|
||||
more easily in languages that already have an up to date amino library.
|
||||
It is possible that this will change in the future, if it is deemed that
|
||||
requiring amino for interacting with Tendermint cryptography is unnecessary.
|
||||
|
||||
The arguments for space efficiency here are refuted on the basis that there are
|
||||
far more egregious wastages of space in the SDK.
|
||||
The space requirement of the public keys doesn't cause many problems beyond
|
||||
increasing the space attached to each validator / account.
|
||||
|
||||
The alternative to using amino here would be for us to create an enum type.
|
||||
Switching to just an enum type is worthy of investigation post-launch.
|
||||
For reference, part of amino encoding interfaces is basically a 4 byte enum
|
||||
type definition.
|
||||
Enum types would just change that 4 bytes to be a variant, and it would remove
|
||||
the protobuf overhead, but it would be hard to integrate into the existing API.
|
||||
|
||||
### Signatures
|
||||
|
||||
Signatures should be switched to be `[]byte`.
|
||||
Spatial efficiency in the signatures is quite important,
|
||||
as it directly affects the gas cost of every transaction,
|
||||
and the throughput of the chain.
|
||||
Signatures don't need to encode what type they are for (unlike public keys)
|
||||
since public keys must already be known.
|
||||
Therefore we can validate the signature without needing to encode its type.
|
||||
|
||||
When placed in state, signatures will still be amino encoded, but it will be the
|
||||
primitive type `[]byte` getting encoded.
|
||||
|
||||
#### Ed25519
|
||||
|
||||
Use the canonical representation for signatures.
|
||||
|
||||
#### Secp256k1
|
||||
|
||||
There isn't a clear canonical representation here.
|
||||
Signatures have two elements `r,s`.
|
||||
These bytes are encoded as `r || s`, where `r` and `s` are both exactly
|
||||
32 bytes long, encoded big-endian.
|
||||
This is basically Ethereum's encoding, but without the leading recovery bit.
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- More space efficient signatures
|
||||
|
||||
### Negative
|
||||
|
||||
- We have an amino dependency for cryptography.
|
||||
|
||||
### Neutral
|
||||
|
||||
- No change to public keys
|
||||
@@ -1,308 +0,0 @@
|
||||
# ADR 016: Protocol Versions
|
||||
|
||||
## TODO
|
||||
|
||||
- How to / should we version the authenticated encryption handshake itself (ie.
|
||||
upfront protocol negotiation for the P2PVersion)
|
||||
- How to / should we version ABCI itself? Should it just be absorbed by the
|
||||
BlockVersion?
|
||||
|
||||
## Changelog
|
||||
|
||||
- 18-09-2018: Updates after working a bit on implementation
|
||||
- ABCI Handshake needs to happen independently of starting the app
|
||||
conns so we can see the result
|
||||
- Add question about ABCI protocol version
|
||||
- 16-08-2018: Updates after discussion with SDK team
|
||||
- Remove signalling for next version from Header/ABCI
|
||||
- 03-08-2018: Updates from discussion with Jae:
|
||||
- ProtocolVersion contains Block/AppVersion, not Current/Next
|
||||
- signal upgrades to Tendermint using EndBlock fields
|
||||
- don't restrict peer compatibility by version to simplify syncing old nodes
|
||||
- 28-07-2018: Updates from review
|
||||
- split into two ADRs - one for protocol, one for chains
|
||||
- include signalling for upgrades in header
|
||||
- 16-07-2018: Initial draft - was originally joint ADR for protocol and chain
|
||||
versions
|
||||
|
||||
## Context
|
||||
|
||||
Here we focus on software-agnostic protocol versions.
|
||||
|
||||
The Software Version is covered by SemVer and described elsewhere.
|
||||
It is not relevant to the protocol description, suffice to say that if any protocol version
|
||||
changes, the software version changes, but not necessarily vice versa.
|
||||
|
||||
Software version should be included in NodeInfo for convenience/diagnostics.
|
||||
|
||||
We are also interested in versioning across different blockchains in a
|
||||
meaningful way, for instance to differentiate branches of a contentious
|
||||
hard-fork. We leave that for a later ADR.
|
||||
|
||||
## Requirements
|
||||
|
||||
We need to version components of the blockchain that may be independently upgraded.
|
||||
We need to do it in a way that is scalable and maintainable - we can't just litter
|
||||
the code with conditionals.
|
||||
|
||||
We can consider the complete version of the protocol to contain the following sub-versions:
|
||||
BlockVersion, P2PVersion, AppVersion. These versions reflect the major sub-components
|
||||
of the software that are likely to evolve together, at different rates, and in different ways,
|
||||
as described below.
|
||||
|
||||
The BlockVersion defines the core of the blockchain data structures and
|
||||
should change infrequently.
|
||||
|
||||
The P2PVersion defines how peers connect and communicate with each other - it's
|
||||
not part of the blockchain data structures, but defines the protocols used to build the
|
||||
blockchain. It may change gradually.
|
||||
|
||||
The AppVersion determines how we compute app specific information, like the
|
||||
AppHash and the Results.
|
||||
|
||||
All of these versions may change over the life of a blockchain, and we need to
|
||||
be able to help new nodes sync up across version changes. This means we must be willing
|
||||
to connect to peers with older versions.
|
||||
|
||||
### BlockVersion
|
||||
|
||||
- All tendermint hashed data-structures (headers, votes, txs, responses, etc.).
|
||||
- Note the semantic meaning of a transaction may change according to the AppVersion, but the way txs are merklized into the header is part of the BlockVersion
|
||||
- It should be the least frequent/likely to change.
|
||||
- Tendermint should be stabilizing - it's just Atomic Broadcast.
|
||||
- We can start considering for Tendermint v2.0 in a year
|
||||
- It's easy to determine the version of a block from its serialized form
|
||||
|
||||
### P2PVersion
|
||||
|
||||
- All p2p and reactor messaging (messages, detectable behaviour)
|
||||
- Will change gradually as reactors evolve to improve performance and support new features - eg proposed new message types BatchTx in the mempool and HasBlockPart in the consensus
|
||||
- It's easy to determine the version of a peer from its first serialized message/s
|
||||
- New versions must be compatible with at least one old version to allow gradual upgrades
|
||||
|
||||
### AppVersion
|
||||
|
||||
- The ABCI state machine (txs, begin/endblock behaviour, commit hashing)
|
||||
- Behaviour and message types will change abruptly in the course of the life of a chain
|
||||
- Need to minimize complexity of the code for supporting different AppVersions at different heights
|
||||
- Ideally, each version of the software supports only a _single_ AppVersion at one time
|
||||
- this means we checkout different versions of the software at different heights instead of littering the code
|
||||
with conditionals
|
||||
- minimize the number of data migrations required across AppVersion (ie. most AppVersion should be able to read the same state from disk as previous AppVersion).
|
||||
|
||||
## Ideal
|
||||
|
||||
Each component of the software is independently versioned in a modular way and it's easy to mix and match and upgrade.
|
||||
|
||||
## Proposal
|
||||
|
||||
Each of BlockVersion, AppVersion, P2PVersion, is a monotonically increasing uint64.
|
||||
|
||||
To use these versions, we need to update the block Header, the p2p NodeInfo, and the ABCI.
|
||||
|
||||
### Header
|
||||
|
||||
Block Header should include a `Version` struct as its first field like:
|
||||
|
||||
```
|
||||
type Version struct {
|
||||
Block uint64
|
||||
App uint64
|
||||
}
|
||||
```
|
||||
|
||||
Here, `Version.Block` defines the rules for the current block, while
|
||||
`Version.App` defines the app version that processed the last block and computed
|
||||
the `AppHash` in the current block. Together they provide a complete description
|
||||
of the consensus-critical protocol.
|
||||
|
||||
Since we have settled on a proto3 header, the ability to read the BlockVersion out of the serialized header is unambiguous.
|
||||
|
||||
Using a Version struct gives us more flexibility to add fields without breaking
|
||||
the header.
|
||||
|
||||
The ProtocolVersion struct includes both the Block and App versions - it should
|
||||
serve as a complete description of the consensus-critical protocol.
|
||||
|
||||
### NodeInfo
|
||||
|
||||
NodeInfo should include a Version struct as its first field like:
|
||||
|
||||
```
|
||||
type Version struct {
|
||||
P2P uint64
|
||||
Block uint64
|
||||
App uint64
|
||||
|
||||
Other []string
|
||||
}
|
||||
```
|
||||
|
||||
Note this effectively makes `Version.P2P` the first field in the NodeInfo, so it
|
||||
should be easy to read this out of the serialized header if need be to facilitate an upgrade.
|
||||
|
||||
The `Version.Other` here should include additional information like the name of the software client and
|
||||
its SemVer version - this is for convenience only. Eg.
|
||||
`tendermint-core/v0.22.8`. It's a `[]string` so it can include information about
|
||||
the version of Tendermint, of the app, of Tendermint libraries, etc.
|
||||
|
||||
### ABCI
|
||||
|
||||
Since the ABCI is responsible for keeping Tendermint and the App in sync, we
|
||||
need to communicate version information through it.
|
||||
|
||||
On startup, we use Info to perform a basic handshake. It should include all the
|
||||
version information.
|
||||
|
||||
We also need to be able to update versions in the life of a blockchain. The
|
||||
natural place to do this is EndBlock.
|
||||
|
||||
Note that currently the result of the Handshake isn't exposed anywhere, as the
|
||||
handshaking happens inside the `proxy.AppConns` abstraction. We will need to
|
||||
remove the handshaking from the `proxy` package so we can call it independently
|
||||
and get the result, which should contain the application version.
|
||||
|
||||
#### Info
|
||||
|
||||
RequestInfo should add support for protocol versions like:
|
||||
|
||||
```
|
||||
message RequestInfo {
|
||||
string version
|
||||
uint64 block_version
|
||||
uint64 p2p_version
|
||||
}
|
||||
```
|
||||
|
||||
Similarly, ResponseInfo should return the versions:
|
||||
|
||||
```
|
||||
message ResponseInfo {
|
||||
string data
|
||||
|
||||
string version
|
||||
uint64 app_version
|
||||
|
||||
int64 last_block_height
|
||||
bytes last_block_app_hash
|
||||
}
|
||||
```
|
||||
|
||||
The existing `version` fields should be called `software_version` but we leave
|
||||
them for now to reduce the number of breaking changes.
|
||||
|
||||
#### EndBlock
|
||||
|
||||
Updating the version could be done either with new fields or by using the
|
||||
existing `tags`. Since we're trying to communicate information that will be
|
||||
included in Tendermint block Headers, it should be native to the ABCI, and not
|
||||
something embedded through some scheme in the tags. Thus, version updates should
|
||||
be communicated through EndBlock.
|
||||
|
||||
EndBlock already contains `ConsensusParams`. We can add version information to
|
||||
the ConsensusParams as well:
|
||||
|
||||
```
|
||||
message ConsensusParams {
|
||||
|
||||
BlockSize block_size
|
||||
EvidenceParams evidence_params
|
||||
VersionParams version
|
||||
}
|
||||
|
||||
message VersionParams {
|
||||
uint64 block_version
|
||||
uint64 app_version
|
||||
}
|
||||
```
|
||||
|
||||
For now, the `block_version` will be ignored, as we do not allow block version
|
||||
to be updated live. If the `app_version` is set, it signals that the app's
|
||||
protocol version has changed, and the new `app_version` will be included in the
|
||||
`Block.Header.Version.App` for the next block.
|
||||
|
||||
### BlockVersion
|
||||
|
||||
BlockVersion is included in both the Header and the NodeInfo.
|
||||
|
||||
Changing BlockVersion should happen quite infrequently and ideally only for
|
||||
critical upgrades. For now, it is not encoded in ABCI, though it's always
|
||||
possible to use tags to signal an external process to co-ordinate an upgrade.
|
||||
|
||||
Note Ethereum has not had to make an upgrade like this (everything has been at state machine level, AFAIK).
|
||||
|
||||
### P2PVersion
|
||||
|
||||
P2PVersion is not included in the block Header, just the NodeInfo.
|
||||
|
||||
P2PVersion is the first field in the NodeInfo. NodeInfo is also proto3 so this is easy to read out.
|
||||
|
||||
Note we need the peer/reactor protocols to take the versions of peers into account when sending messages:
|
||||
|
||||
- don't send messages they don't understand
|
||||
- don't send messages they don't expect
|
||||
|
||||
Doing this will be specific to the upgrades being made.
|
||||
|
||||
Note we also include the list of reactor channels in the NodeInfo and already don't send messages for channels the peer doesn't understand.
|
||||
If upgrades always use new channels, this simplifies the development cost of backwards compatibility.
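
As a rough sketch of the channel check described above, a sender can consult the channel list advertised in the peer's NodeInfo before sending. The pared-down `NodeInfo` type, the `shouldSend` helper, and the channel IDs are hypothetical and chosen only for illustration.

```go
package main

import "fmt"

// NodeInfo is a pared-down, hypothetical stand-in for the real NodeInfo.
type NodeInfo struct {
	P2PVersion uint64
	Channels   []byte // reactor channel IDs the peer advertises
}

// hasChannel reports whether the peer advertised the given channel.
func (ni NodeInfo) hasChannel(ch byte) bool {
	for _, c := range ni.Channels {
		if c == ch {
			return true
		}
	}
	return false
}

// shouldSend reports whether a message on channel ch may be sent to the peer.
// A peer that does not list the channel never receives messages for it.
func shouldSend(peer NodeInfo, ch byte) bool {
	return peer.hasChannel(ch)
}

func main() {
	oldPeer := NodeInfo{P2PVersion: 1, Channels: []byte{0x20, 0x30}}
	const newChannel = 0x40 // hypothetical channel introduced by an upgrade
	fmt.Println(shouldSend(oldPeer, 0x20), shouldSend(oldPeer, newChannel))
}
```

If an upgrade only ever adds channels, older peers are filtered out by this check without any version-specific logic.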
|
||||
|
||||
Note NodeInfo is only exchanged after the authenticated encryption handshake to ensure that it's private.
|
||||
Doing any version exchange before encrypting could be considered information leakage, though I'm not sure
|
||||
how much that matters compared to being able to upgrade the protocol.
|
||||
|
||||
XXX: if needed, can we change the meaning of the first byte of the first message to encode a handshake version?
|
||||
this is the first byte of a 32-byte ed25519 pubkey.
|
||||
|
||||
### AppVersion
|
||||
|
||||
AppVersion is also included in the block Header and the NodeInfo.
|
||||
|
||||
AppVersion essentially defines how the AppHash and LastResults are computed.
|
||||
|
||||
### Peer Compatibility
|
||||
|
||||
Restricting peer compatibility based on version is complicated by the need to
|
||||
help old peers, possibly on older versions, sync the blockchain.
|
||||
|
||||
We might be tempted to say that we only connect to peers with the same
|
||||
AppVersion and BlockVersion (since these define the consensus critical
|
||||
computations), and a select list of P2PVersions (ie. those compatible with
|
||||
ours), but then we'd need to make accommodations for connecting to peers with the
|
||||
right Block/AppVersion for the height they're on.
|
||||
|
||||
For now, we will connect to peers with any version and restrict compatibility
|
||||
solely based on the ChainID. We leave more restrictive rules on peer
|
||||
compatibility to a future proposal.
|
||||
|
||||
### Future Changes
|
||||
|
||||
It may be valuable to support an `/unsafe_stop?height=_` endpoint to tell Tendermint to shut down at a given height.
|
||||
This could be used by an external manager process that oversees upgrades by
|
||||
checking out and installing new software versions and restarting the process. It
|
||||
would subscribe to the relevant upgrade event (needs to be implemented) and call `/unsafe_stop` at
|
||||
the correct height (of course only after getting approval from its user!)
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Make tendermint and application versions native to the ABCI to more clearly
|
||||
communicate about them
|
||||
- Distinguish clearly between protocol versions and software version to
|
||||
facilitate implementations in other languages
|
||||
- Versions are included in key data structures in an easy-to-discern way
|
||||
- Allows proposers to signal for upgrades and apps to decide when to actually change the
|
||||
version (and start signalling for a new version)
|
||||
|
||||
### Neutral
|
||||
|
||||
- Unclear how to version the initial P2P handshake itself
|
||||
- Versions aren't being used (yet) to restrict peer compatibility
|
||||
- Signalling for a new version happens through the proposer and must be
|
||||
tallied/tracked in the app.
|
||||
|
||||
### Negative
|
||||
|
||||
- Adds more fields to the ABCI
|
||||
- Implies that a single codebase must be able to handle multiple versions
|
||||
@@ -1,99 +0,0 @@
|
||||
# ADR 017: Chain Versions
|
||||
|
||||
## TODO
|
||||
|
||||
- clarify how to handle slashing when ChainID changes
|
||||
|
||||
## Changelog
|
||||
|
||||
- 28-07-2018: Updates from review
|
||||
- split into two ADRs - one for protocol, one for chains
|
||||
- 16-07-2018: Initial draft - was originally joint ADR for protocol and chain
|
||||
versions
|
||||
|
||||
## Context
|
||||
|
||||
Software and Protocol versions are covered in a separate ADR.
|
||||
|
||||
Here we focus on chain versions.
|
||||
|
||||
## Requirements
|
||||
|
||||
We need to version blockchains across protocols, networks, forks, etc.
|
||||
We need chain identifiers and descriptions so we can talk about a multitude of chains,
|
||||
and especially the differences between them, in a meaningful way.
|
||||
|
||||
### Networks
|
||||
|
||||
We need to support many independent networks running the same version of the software,
|
||||
even possibly starting from the same initial state.
|
||||
They must have distinct identifiers so that peers know which one they are joining and so
|
||||
validators and users can prevent replay attacks.
|
||||
|
||||
Call this the `NetworkName` (note we currently call this `ChainID` in the software. In this
|
||||
ADR, ChainID has a different meaning).
|
||||
It represents both the application being run and the community or intention
|
||||
of running it.
|
||||
|
||||
Peers only connect to other peers with the same NetworkName.
|
||||
|
||||
### Forks
|
||||
|
||||
We need to support existing networks upgrading and forking, wherein they may do any of:
|
||||
|
||||
- revert back to some height, continue with the same versions but new blocks
|
||||
- arbitrarily mutate state at some height, continue with the same versions (eg. DAO Fork)
|
||||
- change the AppVersion at some height
|
||||
|
||||
Note because of Tendermint's voting power threshold rules, a chain can only be extended under the "original" rules and under the new rules
|
||||
if 1/3 or more is double signing, which is expressly prohibited, and is supposed to result in their punishment on both chains. Since they can censor
|
||||
the punishment, the chain is expected to be hardforked to remove the validators. Thus, if both branches are to continue after a fork,
|
||||
they will each require a new identifier, and the old chain identifier will be retired (ie. only useful for syncing history, not for new blocks).
|
||||
|
||||
TODO: explain how to handle slashing when chain id changed!
|
||||
|
||||
We need a consistent way to describe forks.
|
||||
|
||||
## Proposal
|
||||
|
||||
### ChainDescription
|
||||
|
||||
ChainDescription is a complete immutable description of a blockchain. It takes the following form:
|
||||
|
||||
```
|
||||
ChainDescription = <NetworkName>/<BlockVersion>/<AppVersion>/<StateHash>/<ValHash>/<ConsensusParamsHash>
|
||||
```
|
||||
|
||||
Here, StateHash is the merkle root of the initial state, ValHash is the merkle root of the initial Tendermint validator set,
|
||||
and ConsensusParamsHash is the merkle root of the initial Tendermint consensus parameters.
|
||||
|
||||
The `genesis.json` file must contain enough information to compute this value. It need not contain the StateHash or ValHash itself,
|
||||
but it must contain the state from which they can be computed with the given protocol versions.
|
||||
|
||||
NOTE: consider splitting NetworkName into NetworkName and AppName - this allows
|
||||
folks to independently use the same application for different networks (ie we
|
||||
could imagine multiple communities of validators wanting to put up a Hub using
|
||||
the same app but having a distinct network name. Arguably not needed if
|
||||
differences will come via different initial state / validators).
|
||||
|
||||
#### ChainID
|
||||
|
||||
Define `ChainID = TMHASH(ChainDescription)`. It's the unique ID of a blockchain.
|
||||
|
||||
It should be Bech32 encoded when handled by users, eg. with `cosmoschain` prefix.
|
||||
|
||||
#### Forks and Upgrades
|
||||
|
||||
When a chain forks or upgrades but continues the same history, it takes a new ChainDescription as follows:
|
||||
|
||||
```
|
||||
ChainDescription = <ChainID>/x/<Height>/<ForkDescription>
|
||||
```
|
||||
|
||||
Where
|
||||
|
||||
- ChainID is the ChainID from the previous ChainDescription (ie. its hash)
|
||||
- `x` denotes that a change occurred
|
||||
- `Height` is the height at which the change occurred
|
||||
- ForkDescription has the same form as ChainDescription but for the fork
|
||||
- this allows forks to specify new versions for tendermint or the app, as well as arbitrary changes to the state or validator set
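
The ChainDescription, ChainID, and fork forms defined above might be assembled as in the following sketch. Several details are assumptions made for illustration only: TMHASH is modelled as plain SHA-256, hashes are rendered as hex, versions as decimal integers, and the resulting ID is printed as hex rather than Bech32.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// ChainDescription holds the components of the description string.
type ChainDescription struct {
	NetworkName         string
	BlockVersion        uint64
	AppVersion          uint64
	StateHash           []byte
	ValHash             []byte
	ConsensusParamsHash []byte
}

// String renders <NetworkName>/<BlockVersion>/<AppVersion>/<StateHash>/<ValHash>/<ConsensusParamsHash>.
// Hex encoding of the hashes is an assumption of this sketch.
func (d ChainDescription) String() string {
	return strings.Join([]string{
		d.NetworkName,
		fmt.Sprint(d.BlockVersion),
		fmt.Sprint(d.AppVersion),
		hex.EncodeToString(d.StateHash),
		hex.EncodeToString(d.ValHash),
		hex.EncodeToString(d.ConsensusParamsHash),
	}, "/")
}

// chainID hashes a description string. SHA-256 stands in for TMHASH here.
func chainID(description string) string {
	sum := sha256.Sum256([]byte(description))
	return hex.EncodeToString(sum[:])
}

// forkDescription renders <ChainID>/x/<Height>/<ForkDescription>.
func forkDescription(prevChainID string, height int64, fork ChainDescription) string {
	return fmt.Sprintf("%s/x/%d/%s", prevChainID, height, fork.String())
}

func main() {
	genesis := ChainDescription{
		NetworkName: "testnet", BlockVersion: 1, AppVersion: 1,
		StateHash: []byte{0xaa}, ValHash: []byte{0xbb}, ConsensusParamsHash: []byte{0xcc},
	}
	id := chainID(genesis.String())

	upgraded := genesis
	upgraded.AppVersion = 2
	forked := forkDescription(id, 100, upgraded)

	fmt.Println(genesis.String())
	fmt.Println(chainID(forked) != id) // the fork gets a new ChainID
}
```

Because each fork description embeds the previous ChainID, the chain of identifiers commits to the full upgrade history.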
|
||||
@@ -1,100 +0,0 @@
|
||||
# ADR 018: ABCI Validator Improvements
|
||||
|
||||
## Changelog
|
||||
|
||||
16-08-2018: Follow up from review: - Revert changes to commit round - Remind about justification for removing pubkey - Update pros/cons
|
||||
05-08-2018: Initial draft
|
||||
|
||||
## Context
|
||||
|
||||
ADR 009 introduced major improvements to the ABCI around validators and the use
|
||||
of Amino. Here we follow up with some additional changes to improve the naming
|
||||
and expected use of Validator messages.
|
||||
|
||||
## Decision
|
||||
|
||||
### Validator
|
||||
|
||||
Currently a Validator contains `address` and `pub_key`, and one or the other is
|
||||
optional/not-sent depending on the use case. Instead, we should have a
|
||||
`Validator` (with just the address, used for RequestBeginBlock)
|
||||
and a `ValidatorUpdate` (with the pubkey, used for ResponseEndBlock):
|
||||
|
||||
```
|
||||
message Validator {
|
||||
bytes address
|
||||
int64 power
|
||||
}
|
||||
|
||||
message ValidatorUpdate {
|
||||
PubKey pub_key
|
||||
int64 power
|
||||
}
|
||||
```
|
||||
|
||||
As noted in [ADR-009](adr-009-ABCI-design.md),
|
||||
the `Validator` does not contain a pubkey because quantum public keys are
|
||||
quite large and it would be wasteful to send them all over ABCI with every block.
|
||||
Thus, applications that want to take advantage of the information in BeginBlock
|
||||
are _required_ to store pubkeys in state (or use much less efficient lazy means
|
||||
of verifying BeginBlock data).
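
To make the storage requirement concrete, the sketch below shows one way an application might keep an address-to-pubkey index fed by `ValidatorUpdate`s and consulted when BeginBlock delivers address-only `Validator`s. The Go types are simplified stand-ins, and two details are assumptions of the example: the address is derived as the first 20 bytes of the SHA-256 of the pubkey, and an update with power 0 removes the validator.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Validator carries only the address (as used in RequestBeginBlock).
type Validator struct {
	Address []byte
	Power   int64
}

// ValidatorUpdate carries the pubkey (as used in ResponseEndBlock).
// PubKey is simplified to raw bytes for this sketch.
type ValidatorUpdate struct {
	PubKey []byte
	Power  int64
}

// addressOf derives an address from a pubkey.
// Assumption: first 20 bytes of SHA-256 over the pubkey bytes.
func addressOf(pubKey []byte) string {
	sum := sha256.Sum256(pubKey)
	return string(sum[:20])
}

// pubKeyStore is the app-side index from address to pubkey.
type pubKeyStore map[string][]byte

// applyUpdates records the pubkeys the app returned in EndBlock.
// Assumption: power 0 means the validator is removed.
func (s pubKeyStore) applyUpdates(updates []ValidatorUpdate) {
	for _, u := range updates {
		addr := addressOf(u.PubKey)
		if u.Power == 0 {
			delete(s, addr)
			continue
		}
		s[addr] = u.PubKey
	}
}

// lookup resolves a BeginBlock validator to its stored pubkey.
func (s pubKeyStore) lookup(v Validator) ([]byte, bool) {
	pk, ok := s[string(v.Address)]
	return pk, ok
}

func main() {
	store := pubKeyStore{}
	store.applyUpdates([]ValidatorUpdate{{PubKey: []byte("pk1"), Power: 10}})

	v := Validator{Address: []byte(addressOf([]byte("pk1"))), Power: 10}
	pk, ok := store.lookup(v)
	fmt.Println(string(pk), ok)
}
```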
|
||||
|
||||
### RequestBeginBlock
|
||||
|
||||
LastCommitInfo currently has an array of `SigningValidator` that contains
|
||||
information for each validator in the entire validator set.
|
||||
Instead, this should be called `VoteInfo`, since it is information about the
|
||||
validator votes.
|
||||
|
||||
Note that all votes in a commit must be from the same round.
|
||||
|
||||
```
|
||||
message LastCommitInfo {
|
||||
int64 round
|
||||
repeated VoteInfo commit_votes
|
||||
}
|
||||
|
||||
message VoteInfo {
|
||||
Validator validator
|
||||
bool signed_last_block
|
||||
}
|
||||
```
|
||||
|
||||
### ResponseEndBlock
|
||||
|
||||
Use ValidatorUpdates instead of Validators. Then it's clear we don't need an
|
||||
address, and we do need a pubkey.
|
||||
|
||||
We could require the address here as well as a sanity check, but it doesn't seem
|
||||
necessary.
|
||||
|
||||
### InitChain
|
||||
|
||||
Use ValidatorUpdates for both Request and Response. InitChain
|
||||
is about setting/updating the initial validator set, unlike BeginBlock
|
||||
which is just informational.
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Clarifies the distinction between the different uses of validator information
|
||||
|
||||
### Negative
|
||||
|
||||
- Apps must still store the public keys in state to utilize the RequestBeginBlock info
|
||||
|
||||
### Neutral
|
||||
|
||||
- ResponseEndBlock does not require an address
|
||||
|
||||
## References
|
||||
|
||||
- [Latest ABCI Spec](https://github.com/tendermint/tendermint/blob/v0.22.8/docs/app-dev/abci-spec.md)
|
||||
- [ADR-009](https://github.com/tendermint/tendermint/blob/v0.22.8/docs/architecture/adr-009-ABCI-design.md)
|
||||
- [Issue #1712 - Don't send PubKey in
|
||||
RequestBeginBlock](https://github.com/tendermint/tendermint/issues/1712)
|
||||
@@ -1,162 +0,0 @@
|
||||
# ADR 019: Encoding standard for Multisignatures
|
||||
|
||||
## Changelog
|
||||
|
||||
06-08-2018: Minor updates
|
||||
|
||||
27-07-2018: Update draft to use amino encoding
|
||||
|
||||
11-07-2018: Initial Draft
|
||||
|
||||
5-26-2021: Multisigs were moved into the Cosmos-sdk
|
||||
|
||||
## Context
|
||||
|
||||
Multisignatures, or technically _Accountable Subgroup Multisignatures_ (ASM),
|
||||
are signature schemes which enable any subgroup of a set of signers to sign any message,
|
||||
and reveal to the verifier exactly who the signers were.
|
||||
This allows for complex conditionals of when to validate a signature.
|
||||
|
||||
Suppose the set of signers is of size _n_.
|
||||
If we validate a signature if any subgroup of size _k_ signs a message,
|
||||
this becomes what is commonly referred to as a _k of n multisig_ in Bitcoin.
|
||||
|
||||
This ADR specifies the encoding standard for general accountable subgroup multisignatures,
|
||||
k of n accountable subgroup multisignatures, and its weighted variant.
|
||||
|
||||
In the future, we can also allow for more complex conditionals on the accountable subgroup.
|
||||
|
||||
## Proposed Solution
|
||||
|
||||
### New structs
|
||||
|
||||
Every ASM will then have its own struct, implementing the crypto.Pubkey interface.
|
||||
|
||||
This ADR assumes that [replacing crypto.Signature with []bytes](https://github.com/tendermint/tendermint/issues/1957) has been accepted.
|
||||
|
||||
#### K of N threshold signature
|
||||
|
||||
The pubkey is the following struct:
|
||||
|
||||
```golang
|
||||
type ThresholdMultiSignaturePubKey struct { // K of N threshold multisig
|
||||
K uint `json:"threshold"`
|
||||
Pubkeys []crypto.Pubkey `json:"pubkeys"`
|
||||
}
|
||||
```
|
||||
|
||||
We will derive N from the length of pubkeys. (For spatial efficiency in encoding)
|
||||
|
||||
`Verify` will expect an `[]byte` encoded version of the Multisignature.
|
||||
(Multisignature is described in the next section)
|
||||
The multisignature will be rejected if the bitmap has fewer than k indices set,
or if the signature at any of the set indices is not a valid signature on the
message from the public key at that index.
|
||||
(If more than k signatures are included, all must be valid)
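
The verification rule can be sketched with the standard library's ed25519 package. This is an illustration under stated assumptions rather than the implemented code: a `[]bool` stands in for the bit array, the pubkeys are concrete ed25519 keys instead of the `crypto.Pubkey` interface, and `newSigners` and `signWith` are hypothetical helpers for the demo.

```go
package main

import (
	"crypto/ed25519"
	"fmt"
)

// Multisignature pairs a signer bitmap with the signatures, which are stored
// in increasing signer-index order. Signed stands in for the bit array.
type Multisignature struct {
	Signed []bool
	Sigs   [][]byte
}

// ThresholdPubKey is a K-of-N key; N is derived from len(Pubkeys).
type ThresholdPubKey struct {
	K       int
	Pubkeys []ed25519.PublicKey
}

// Verify rejects the multisignature if fewer than K indices are set, or if any
// included signature is invalid for the public key at its index.
func (pk ThresholdPubKey) Verify(msg []byte, ms Multisignature) bool {
	if len(ms.Signed) != len(pk.Pubkeys) {
		return false
	}
	sigIdx := 0
	for i, signed := range ms.Signed {
		if !signed {
			continue
		}
		if sigIdx >= len(ms.Sigs) || !ed25519.Verify(pk.Pubkeys[i], msg, ms.Sigs[sigIdx]) {
			return false
		}
		sigIdx++
	}
	// Every supplied signature must have been consumed, and at least K checked.
	return sigIdx >= pk.K && sigIdx == len(ms.Sigs)
}

// newSigners generates n fresh key pairs (demo helper).
func newSigners(n int) ([]ed25519.PublicKey, []ed25519.PrivateKey) {
	pubs := make([]ed25519.PublicKey, n)
	privs := make([]ed25519.PrivateKey, n)
	for i := range pubs {
		pub, priv, err := ed25519.GenerateKey(nil) // nil selects crypto/rand
		if err != nil {
			panic(err)
		}
		pubs[i], privs[i] = pub, priv
	}
	return pubs, privs
}

// signWith signs msg with a single private key (demo helper).
func signWith(priv ed25519.PrivateKey, msg []byte) []byte {
	return ed25519.Sign(priv, msg)
}

func main() {
	msg := []byte("transfer 10 atoms")
	pubs, privs := newSigners(3)
	pk := ThresholdPubKey{K: 2, Pubkeys: pubs}
	ms := Multisignature{
		Signed: []bool{true, false, true},
		Sigs:   [][]byte{signWith(privs[0], msg), signWith(privs[2], msg)},
	}
	fmt.Println("valid 2-of-3:", pk.Verify(msg, ms))
}
```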
|
||||
|
||||
`Bytes` will be the amino encoded version of the pubkey.
|
||||
|
||||
Address will be `Hash(amino_encoded_pubkey)`
|
||||
|
||||
The reason this doesn't use `log_8(n)` bytes per signer is because that heavily optimizes for the case where a very small number of signers are required.
|
||||
e.g. for `n` of size `24`, that would only be more space efficient for `k < 3`.
|
||||
This seems less likely, so it should not be the case we optimize for.
|
||||
|
||||
#### Weighted threshold signature
|
||||
|
||||
The pubkey is the following struct:
|
||||
|
||||
```golang
|
||||
type WeightedThresholdMultiSignaturePubKey struct {
|
||||
Weights []uint `json:"weights"`
|
||||
Threshold uint `json:"threshold"`
|
||||
Pubkeys []crypto.Pubkey `json:"pubkeys"`
|
||||
}
|
||||
```
|
||||
|
||||
Weights and Pubkeys must be of the same length.
|
||||
Everything else proceeds identically to the K of N multisig,
|
||||
except the multisig fails if the sum of the weights is less than the threshold.
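
The only additional check for the weighted variant is the weight sum over the signers who actually signed. A minimal sketch follows, using a `[]bool` in place of the bit array and a hypothetical `meetsThreshold` helper. Signature validation is assumed to have happened already, and overflow of the sum is not handled.

```go
package main

import "fmt"

// meetsThreshold reports whether the weights of the signers marked in signed
// add up to at least threshold. Weights and signed must have the same length.
// Note: overflow of the running sum is not handled in this sketch.
func meetsThreshold(weights []uint, signed []bool, threshold uint) bool {
	if len(weights) != len(signed) {
		return false
	}
	var sum uint
	for i, s := range signed {
		if s {
			sum += weights[i]
		}
	}
	return sum >= threshold
}

func main() {
	weights := []uint{5, 3, 2}
	fmt.Println(meetsThreshold(weights, []bool{true, false, true}, 7)) // 5+2
	fmt.Println(meetsThreshold(weights, []bool{false, true, true}, 7)) // 3+2
}
```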
|
||||
|
||||
#### Multisignature
|
||||
|
||||
The intermediate phase of the signatures (as it accrues more signatures) will be the following struct:
|
||||
|
||||
```golang
|
||||
type Multisignature struct {
|
||||
BitArray CryptoBitArray // Documented later
|
||||
Sigs [][]byte
}
|
||||
```
|
||||
|
||||
It is important to recall that each private key will output a signature on the provided message itself.
|
||||
So no signing algorithm ever outputs the multisignature.
|
||||
The UI will take a signature, cast into a multisignature, and then keep adding
|
||||
new signatures into it, and when done marshal into `[]byte`.
|
||||
This will require the following helper methods:
|
||||
|
||||
```golang
|
||||
func SigToMultisig(sig []byte, n int)
|
||||
func GetIndex(pk crypto.Pubkey, pubkeys []crypto.Pubkey)
|
||||
func AddSignature(sig Signature, index int, multiSig *Multisignature)
|
||||
```
|
||||
|
||||
The multisignature will be converted to an `[]byte` using amino.MarshalBinaryBare. \*
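
One possible shape for these helpers is sketched below. It deviates from the signatures listed above in ways that are assumptions of the example: pubkeys are raw byte slices rather than `crypto.Pubkey`, a `[]bool` replaces the bit array, `GetIndex` returns -1 when the key is absent, and a hypothetical `newMultisig` constructor is used instead of `SigToMultisig`. The main point is that `AddSignature` must keep `Sigs` ordered by signer index.

```go
package main

import (
	"bytes"
	"fmt"
)

// Multisignature accrues signatures; Sigs is kept in signer-index order.
type Multisignature struct {
	Signed []bool
	Sigs   [][]byte
}

// newMultisig creates an empty multisignature for n possible signers.
func newMultisig(n int) *Multisignature {
	return &Multisignature{Signed: make([]bool, n)}
}

// GetIndex returns the position of pk in pubkeys, or -1 if it is not present.
func GetIndex(pk []byte, pubkeys [][]byte) int {
	for i, k := range pubkeys {
		if bytes.Equal(pk, k) {
			return i
		}
	}
	return -1
}

// AddSignature inserts sig for the signer at index, keeping Sigs sorted by
// signer index. Out-of-range indices and duplicate signers are ignored.
func AddSignature(sig []byte, index int, ms *Multisignature) {
	if index < 0 || index >= len(ms.Signed) || ms.Signed[index] {
		return
	}
	// The insert position equals the number of signers with a smaller index.
	pos := 0
	for i := 0; i < index; i++ {
		if ms.Signed[i] {
			pos++
		}
	}
	ms.Sigs = append(ms.Sigs, nil)
	copy(ms.Sigs[pos+1:], ms.Sigs[pos:])
	ms.Sigs[pos] = sig
	ms.Signed[index] = true
}

func main() {
	pubkeys := [][]byte{[]byte("k0"), []byte("k1"), []byte("k2")}
	ms := newMultisig(len(pubkeys))

	// Signatures may arrive in any order.
	AddSignature([]byte("sig-from-k2"), GetIndex([]byte("k2"), pubkeys), ms)
	AddSignature([]byte("sig-from-k0"), GetIndex([]byte("k0"), pubkeys), ms)

	for _, s := range ms.Sigs {
		fmt.Println(string(s))
	}
}
```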
|
||||
|
||||
#### Bit Array
|
||||
|
||||
We would be using a new implementation of a bitarray. The struct it would be encoded/decoded from is
|
||||
|
||||
```golang
|
||||
type CryptoBitArray struct {
|
||||
ExtraBitsStored byte `json:"extra_bits"` // The number of extra bits in elems.
|
||||
Elems []byte `json:"elems"`
|
||||
}
|
||||
```
|
||||
|
||||
The reason for not using the BitArray currently implemented in `libs/common/bit_array.go`
|
||||
is that it is less space efficient, due to a space / time trade-off.
|
||||
Evidence for this is outlined in [this issue](https://github.com/tendermint/tendermint/issues/2077).
|
||||
|
||||
In the multisig, we will not be performing arithmetic operations,
|
||||
so there is no performance increase with the current implementation,
|
||||
and just loss of spatial efficiency.
|
||||
Implementing this new bit array with `[]byte` _should_ be simple, as no
|
||||
arithmetic operations between bit arrays are required, and it should save a couple of bytes.
|
||||
(Explained in that same issue)
|
||||
|
||||
When this bit array is encoded, the number of elements is encoded due to amino.
|
||||
However we may be encoding a full byte for what we actually only need 1-7 bits for.
|
||||
We store that difference in ExtraBitsStored.
|
||||
This allows for us to have an unbounded number of signers, and is more space efficient than what is currently used in `libs/common`.
|
||||
Again, the implementation of this space-saving feature is straightforward.
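
A byte-backed bit array along these lines could look like the sketch below. The interpretation of `ExtraBitsStored` is an assumption: here it holds the number of bits used in the last byte, with 0 meaning the last byte is fully used. The constructor and the `Size`, `Get`, and `Set` methods are hypothetical names, not a confirmed API.

```go
package main

import "fmt"

// CryptoBitArray stores bits packed into bytes, most significant bit first.
type CryptoBitArray struct {
	ExtraBitsStored byte   // bits used in the last byte; 0 means all 8 (assumption)
	Elems           []byte
}

// NewCryptoBitArray allocates an array holding the given number of bits.
func NewCryptoBitArray(bits int) *CryptoBitArray {
	if bits <= 0 {
		return nil
	}
	return &CryptoBitArray{
		ExtraBitsStored: byte(bits % 8),
		Elems:           make([]byte, (bits+7)/8),
	}
}

// Size returns the number of addressable bits.
func (b *CryptoBitArray) Size() int {
	if b.ExtraBitsStored == 0 {
		return len(b.Elems) * 8
	}
	return (len(b.Elems)-1)*8 + int(b.ExtraBitsStored)
}

// Get returns the bit at index i, or false if i is out of range.
func (b *CryptoBitArray) Get(i int) bool {
	if i < 0 || i >= b.Size() {
		return false
	}
	mask := byte(1) << uint(7-i%8)
	return b.Elems[i/8]&mask != 0
}

// Set writes the bit at index i and reports whether i was in range.
func (b *CryptoBitArray) Set(i int, v bool) bool {
	if i < 0 || i >= b.Size() {
		return false
	}
	mask := byte(1) << uint(7-i%8)
	if v {
		b.Elems[i/8] |= mask
	} else {
		b.Elems[i/8] &^= mask
	}
	return true
}

func main() {
	ba := NewCryptoBitArray(10) // 10 signers fit in 2 bytes
	ba.Set(0, true)
	ba.Set(9, true)
	fmt.Println(ba.Size(), len(ba.Elems), ba.Get(0), ba.Get(1), ba.Get(9))
}
```

Ten signers need two bytes of `Elems` plus the one-byte counter, and nothing in the layout caps the number of signers.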
|
||||
|
||||
### Encoding the structs
|
||||
|
||||
We will use straightforward amino encoding. This is chosen for ease of compatibility in other languages.
|
||||
|
||||
### Future points of discussion
|
||||
|
||||
If desired, we can use ed25519 batch verification for all ed25519 keys.
|
||||
This is a future point of discussion, but would be backwards compatible as this information won't need to be marshalled.
|
||||
(There may even be cofactor concerns without ristretto)
|
||||
Aggregation of pubkeys / sigs in Schnorr sigs / BLS sigs is not backwards compatible, and would need to be a new ASM type.
|
||||
|
||||
## Status
|
||||
|
||||
Implemented (moved to cosmos-sdk)
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Supports multisignatures, in a way that won't require any special cases in our downstream verification code.
|
||||
- Easy to serialize / deserialize
|
||||
- Unbounded number of signers
|
||||
|
||||
### Negative
|
||||
|
||||
- Larger codebase, however this should reside in a subfolder of tendermint/crypto, as it provides no new interfaces. (Ref #https://github.com/tendermint/go-crypto/issues/136)
|
||||
- Space inefficient due to utilization of amino encoding
|
||||
- Suggested implementation requires a new struct for every ASM.
|
||||
|
||||
### Neutral
|
||||
@@ -1,104 +0,0 @@
|
||||
# ADR 020: Limiting txs size inside a block
|
||||
|
||||
## Changelog
|
||||
|
||||
13-08-2018: Initial Draft
|
||||
15-08-2018: Second version after Dev's comments
|
||||
28-08-2018: Third version after Ethan's comments
|
||||
30-08-2018: AminoOverheadForBlock => MaxAminoOverheadForBlock
|
||||
31-08-2018: Bounding evidence and chain ID
|
||||
13-01-2019: Add section on MaxBytes vs MaxDataBytes
|
||||
|
||||
## Context
|
||||
|
||||
We currently use MaxTxs to reap txs from the mempool when proposing a block,
|
||||
but enforce MaxBytes when unmarshaling a block, so we could easily propose a
|
||||
block that's too large to be valid.
|
||||
|
||||
We should just remove MaxTxs altogether and stick with MaxBytes, and have a
|
||||
`mempool.ReapMaxBytes`.
|
||||
|
||||
But we can't just reap BlockSize.MaxBytes, since MaxBytes is for the entire block,
|
||||
not for the txs inside the block. There's extra amino overhead + the actual
|
||||
headers on top of the actual transactions + evidence + last commit.
|
||||
We could also consider using a MaxDataBytes instead of or in addition to MaxBytes.
|
||||
|
||||
## MaxBytes vs MaxDataBytes
|
||||
|
||||
The [PR #3045](https://github.com/tendermint/tendermint/pull/3045) suggested
|
||||
additional clarity/justification was necessary here, with respect to the use
|
||||
of MaxDataBytes in addition to, or instead of, MaxBytes.
|
||||
|
||||
MaxBytes provides a clear limit on the total size of a block that requires no
|
||||
additional calculation if you want to use it to bound resource usage, and there
|
||||
has been considerable discussion about optimizing Tendermint around 1MB blocks.
|
||||
Regardless, we need some maximum on the size of a block so we can avoid
|
||||
unmarshaling blocks that are too big during the consensus, and it seems more
|
||||
straightforward to provide a single fixed number for this rather than a
|
||||
computation of "MaxDataBytes + everything else you need to make room for
|
||||
(signatures, evidence, header)". MaxBytes provides a simple bound so we can
|
||||
always say "blocks are less than X MB".
|
||||
|
||||
Having both MaxBytes and MaxDataBytes feels like unnecessary complexity. It's
|
||||
not particularly surprising for MaxBytes to imply the maximum size of the
|
||||
entire block (not just txs), one just has to know that a block includes header,
|
||||
txs, evidence, votes. For more fine grained control over the txs included in the
|
||||
block, there is the MaxGas. In practice, the MaxGas may be expected to do most of
|
||||
the tx throttling, and the MaxBytes to just serve as an upper bound on the total
|
||||
size. Applications can use MaxGas as a MaxDataBytes by just taking the gas for
|
||||
every tx to be its size in bytes.
|
||||
|
||||
## Proposed solution
|
||||
|
||||
Therefore, we should
|
||||
|
||||
1) Get rid of MaxTxs.
|
||||
2) Rename MaxTxsBytes to MaxBytes.
|
||||
|
||||
When we need to ReapMaxBytes from the mempool, we calculate the upper bound as follows:
|
||||
|
||||
```
|
||||
ExactLastCommitBytes = {number of validators currently enabled} * {MaxVoteBytes}
|
||||
MaxEvidenceBytesPerBlock = MaxBytes / 10
|
||||
ExactEvidenceBytes = cs.evpool.PendingEvidence(MaxEvidenceBytesPerBlock) * MaxEvidenceBytes
|
||||
|
||||
mempool.ReapMaxBytes(MaxBytes - MaxAminoOverheadForBlock - ExactLastCommitBytes - ExactEvidenceBytes - MaxHeaderBytes)
|
||||
```
|
||||
|
||||
where MaxVoteBytes, MaxEvidenceBytes, MaxHeaderBytes and MaxAminoOverheadForBlock
|
||||
are constants defined inside the `types` package:
|
||||
|
||||
- MaxVoteBytes - 170 bytes
|
||||
- MaxEvidenceBytes - 364 bytes
|
||||
- MaxHeaderBytes - 476 bytes (~276 bytes hashes + 200 bytes - 50 UTF-8 encoded
|
||||
symbols of chain ID 4 bytes each in the worst case + amino overhead)
|
||||
- MaxAminoOverheadForBlock - 8 bytes (assuming MaxHeaderBytes includes amino
|
||||
overhead for encoding header, MaxVoteBytes - for encoding vote, etc.)
|
||||
|
||||
ChainID needs to be bounded to 50 symbols max.
|
||||
|
||||
When reaping evidence, we use MaxBytes to calculate the upper bound (e.g. 1/10)
|
||||
to save some space for transactions.
|
||||
|
||||
NOTE while reaping the `max int` bytes in mempool, we should account that every
|
||||
transaction will take `len(tx)+aminoOverhead`, where aminoOverhead=1-4 bytes.
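
The bound and the reaping rule can be sketched together as follows. The constants are the ones listed above. The function names, the use of plain counts for validators and evidence, and the decision to charge the worst-case 4 bytes of amino overhead per transaction are assumptions made for this illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Size constants as listed in this ADR.
const (
	MaxVoteBytes             = 170
	MaxEvidenceBytes         = 364
	MaxHeaderBytes           = 476
	MaxAminoOverheadForBlock = 8

	// Assumption: charge the worst case of the 1-4 byte per-tx overhead.
	maxAminoOverheadPerTx = 4
)

// maxDataBytes returns how many bytes remain for transactions once the header,
// last commit, evidence, and block-level amino overhead are accounted for.
func maxDataBytes(maxBytes int64, validators, evidence int) (int64, error) {
	n := maxBytes - MaxAminoOverheadForBlock - MaxHeaderBytes -
		int64(validators)*MaxVoteBytes - int64(evidence)*MaxEvidenceBytes
	if n < 0 {
		return 0, errors.New("MaxBytes too small for header, last commit and evidence")
	}
	return n, nil
}

// reapMaxBytes takes transactions in order until the next one would exceed max,
// counting len(tx) plus the per-tx amino overhead for each.
func reapMaxBytes(txs [][]byte, max int64) [][]byte {
	var total int64
	var reaped [][]byte
	for _, tx := range txs {
		cost := int64(len(tx)) + maxAminoOverheadPerTx
		if total+cost > max {
			break
		}
		total += cost
		reaped = append(reaped, tx)
	}
	return reaped
}

func main() {
	limit, err := maxDataBytes(1048576, 100, 2) // 1MB block, 100 validators, 2 evidence
	if err != nil {
		panic(err)
	}
	fmt.Println("bytes available for txs:", limit)

	txs := [][]byte{make([]byte, 10), make([]byte, 10), make([]byte, 10)}
	fmt.Println("txs reaped with a 30-byte limit:", len(reapMaxBytes(txs, 30)))
}
```

With a 1MB block, 100 validators, and 2 pieces of evidence, 18212 bytes are reserved, which leaves 1030364 bytes for transactions.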
|
||||
|
||||
We should write a test that fails if the underlying structs got changed, but
|
||||
MaxXXX stayed the same.
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
* one way to limit the size of a block
|
||||
* fewer variables to configure
|
||||
|
||||
### Negative
|
||||
|
||||
* constants that need to be adjusted if the underlying structs got changed
|
||||
|
||||
### Neutral
|
||||
@@ -1,52 +0,0 @@
|
||||
# ADR 021: ABCI Events
|
||||
|
||||
## Changelog
|
||||
|
||||
- *2018-09-02* Remove ABCI errors component. Update description for events
|
||||
- *2018-07-12* Initial version
|
||||
|
||||
## Context
|
||||
|
||||
ABCI tags were first described in [ADR 002](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-002-event-subscription.md).
|
||||
They are key-value pairs that can be used to index transactions.
|
||||
|
||||
Currently, ABCI messages return a list of tags to describe an
|
||||
"event" that took place during the Check/DeliverTx/Begin/EndBlock,
|
||||
where each tag refers to a different property of the event, like the sending and receiving account addresses.
|
||||
|
||||
Since there is only one list of tags, recording data for multiple such events in
|
||||
a single Check/DeliverTx/Begin/EndBlock must be done using prefixes in the key
|
||||
space.
|
||||
|
||||
Alternatively, groups of tags that constitute an event can be separated by a
|
||||
special tag that denotes a break between the events. This would allow
|
||||
straightforward encoding of multiple events into a single list of tags without
|
||||
prefixing, at the cost of these "special" tags to separate the different events.
|
||||
|
||||
TODO: brief description of how the indexing works
|
||||
|
||||
## Decision
|
||||
|
||||
Instead of returning a list of tags, return a list of events, where
|
||||
each event is a list of tags. This way we naturally capture the concept of
|
||||
multiple events happening during a single ABCI message.
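
As an illustration of the data shape this decision implies, the sketch below groups tags into events and selects events by tag. The `Tag` and `Event` types, the string-typed keys and values, and the `matching` helper are hypothetical and do not reflect the actual ABCI message definitions.

```go
package main

import "fmt"

// Tag is a key-value pair describing one property of an event.
type Tag struct {
	Key, Value string
}

// Event is a list of tags that belong together.
type Event struct {
	Tags []Tag
}

// has reports whether the event carries the given tag.
func (e Event) has(key, value string) bool {
	for _, t := range e.Tags {
		if t.Key == key && t.Value == value {
			return true
		}
	}
	return false
}

// matching returns the events that carry the given tag.
func matching(events []Event, key, value string) []Event {
	var out []Event
	for _, e := range events {
		if e.has(key, value) {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	// Two transfers recorded during a single DeliverTx, without key prefixes.
	events := []Event{
		{Tags: []Tag{{"sender", "alice"}, {"recipient", "bob"}}},
		{Tags: []Tag{{"sender", "carol"}, {"recipient", "dave"}}},
	}
	for _, e := range matching(events, "sender", "alice") {
		fmt.Println(e.Tags)
	}
}
```

Because each event owns its tags, the sender and recipient of one transfer cannot be confused with those of another in the same ABCI message.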
|
||||
|
||||
TODO: describe impact on indexing and querying
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Ability to track distinct events separate from ABCI calls (DeliverTx/BeginBlock/EndBlock)
|
||||
- More powerful query abilities
|
||||
|
||||
### Negative
|
||||
|
||||
- More complex query syntax
|
||||
- More complex search implementation
|
||||
|
||||
### Neutral
|
||||
@@ -1,63 +0,0 @@
|
||||
# ADR 022: ABCI Errors
|
||||
|
||||
## Changelog
|
||||
|
||||
- *2018-09-01* Initial version
|
||||
|
||||
## Context
|
||||
|
||||
ABCI errors should provide an abstraction between application details
|
||||
and the client interface responsible for formatting & displaying errors to the user.
|
||||
|
||||
Currently, this abstraction consists of a single integer (the `code`), where any
|
||||
`code > 0` is considered an error (ie. invalid transaction) and all type
|
||||
information about the error is contained in the code. This integer is
|
||||
expected to be decoded by the client into a known error string, where any
|
||||
more specific data is contained in the `data`.
|
||||
|
||||
In a [previous conversation](https://github.com/tendermint/abci/issues/165#issuecomment-353704015),
|
||||
it was suggested that not all non-zero codes need to be errors, hence why it's called `code` and not `error code`.
|
||||
It is unclear exactly how the semantics of the `code` field will evolve, though
|
||||
better lite-client proofs (like discussed for tags
|
||||
[here](https://github.com/tendermint/tendermint/issues/1007#issuecomment-413917763))
|
||||
may play a role.
|
||||
|
||||
Note that having all type information in a single integer
|
||||
precludes an easy coordination method between "module implementers" and "client
|
||||
implementers", especially for apps with many "modules". With an unbounded error domain (such as a string), module
|
||||
implementers can pick a globally unique prefix & error code set, so client
|
||||
implementers could easily implement support for "module A" regardless of which
|
||||
particular blockchain network it was running in and which other modules were running with it. With
|
||||
only error codes, globally unique codes are difficult/impossible, as the space
|
||||
is finite and collisions are likely without an easy way to coordinate.
|
||||
|
||||
For instance, while trying to build an ecosystem of modules that can be composed into a single
|
||||
ABCI application, the Cosmos-SDK had to hack a higher level "codespace" into the
|
||||
single integer so that each module could have its own space to express its
|
||||
errors.
|
||||
|
||||
## Decision
|
||||
|
||||
Include a `string code_space` in all ABCI messages that have a `code`.
|
||||
This allows applications to namespace the codes so they can experiment with
|
||||
their own code schemes.
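
A client-side view of this decision might look like the following sketch, in which errors are keyed by the (codespace, code) pair so that two modules can reuse the same integer. The `Response` type, the `describe` helper, and the module names are hypothetical. Code 0 is treated as success, following the current convention.

```go
package main

import "fmt"

// Response is a simplified stand-in for an ABCI response carrying a code.
type Response struct {
	Code      uint32
	Codespace string
}

// errorKey identifies an error by its namespace and code.
type errorKey struct {
	Codespace string
	Code      uint32
}

// describe maps a response to a human-readable message using a client-side table.
func describe(r Response, known map[errorKey]string) string {
	if r.Code == 0 {
		return "ok"
	}
	if msg, found := known[errorKey{r.Codespace, r.Code}]; found {
		return msg
	}
	return fmt.Sprintf("unknown error %d in codespace %q", r.Code, r.Codespace)
}

func main() {
	// Hypothetical modules that both use code 1 for different errors.
	known := map[errorKey]string{
		{"bank", 1}:    "insufficient funds",
		{"staking", 1}: "unknown validator",
	}
	fmt.Println(describe(Response{Code: 1, Codespace: "bank"}, known))
	fmt.Println(describe(Response{Code: 1, Codespace: "staking"}, known))
}
```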
|
||||
|
||||
It is the responsibility of applications to limit the size of the `code_space`
|
||||
string.
|
||||
|
||||
How the codespace is hashed into block headers (ie. so it can be queried
|
||||
efficiently by lite clients) is left for a separate ADR.
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- No need for complex codespacing on a single integer
|
||||
- More expressive type system for errors
|
||||
|
||||
### Negative
|
||||
|
||||
- Another field in the response needs to be accounted for
|
||||
- Some redundancy with `code` field
|
||||
- May encourage more error/code type info to move to the `codespace` string, which
|
||||
could impact lite clients.
|
||||
@@ -1,183 +0,0 @@
|
||||
# ADR 023: ABCI `ProposeTx` Method
|
||||
|
||||
## Changelog
|
||||
|
||||
25-06-2018: Initial draft based on [#1776](https://github.com/tendermint/tendermint/issues/1776)
|
||||
|
||||
## Context
|
||||
|
||||
[#1776](https://github.com/tendermint/tendermint/issues/1776) was
|
||||
opened in relation to implementation of a Plasma child chain using Tendermint
|
||||
Core as consensus/replication engine.
|
||||
|
||||
Due to the requirements of [Minimal Viable Plasma (MVP)](https://ethresear.ch/t/minimal-viable-plasma/426) and [Plasma Cash](https://ethresear.ch/t/plasma-cash-plasma-with-much-less-per-user-data-checking/1298), it is necessary for ABCI apps to have a mechanism to handle the following cases (more may emerge in the near future):
|
||||
|
||||
1. `deposit` transactions on the Root Chain, which must consist of a block
|
||||
with a single transaction, where there are no inputs and only one output
|
||||
made in favour of the depositor. In this case, a `block` consists of
|
||||
a transaction with the following shape:
|
||||
|
||||
```
|
||||
[0, 0, 0, 0, #input1 - zeroed out
|
||||
0, 0, 0, 0, #input2 - zeroed out
|
||||
<depositor_address>, <amount>, #output1 - in favour of depositor
|
||||
0, 0, #output2 - zeroed out
|
||||
<fee>,
|
||||
]
|
||||
```
|
||||
|
||||
`exit` transactions may also be treated in a similar manner, wherein the
|
||||
input is the UTXO being exited on the Root Chain, and the output belongs to
|
||||
a reserved "burn" address, e.g., `0x0`. In such cases, it is favourable for
|
||||
the containing block to only hold a single transaction that may receive
|
||||
special treatment.
|
||||
|
||||
2. Other "internal" transactions on the child chain, which may be initiated
|
||||
unilaterally. The most basic example is a coinbase transaction
|
||||
implementing validator node incentives, but may also be app-specific. In
|
||||
these cases, it may be favourable for such transactions to
|
||||
be ordered in a specific manner, e.g., coinbase transactions will always be
|
||||
at index 0. In general, such strategies increase the determinism and
|
||||
predictability of blockchain applications.
|
||||
|
||||
While it is possible to deal with the cases enumerated above using the
|
||||
existing ABCI, the currently available options result in suboptimal workarounds. Two are
|
||||
explained in greater detail below.
|
||||
|
||||
### Solution 1: App state-based Plasma chain
|
||||
|
||||
In this workaround, the app maintains a `PlasmaStore` with a corresponding
|
||||
`Keeper`. The PlasmaStore is responsible for maintaining a second, separate
|
||||
blockchain that complies with the MVP specification, including `deposit`
|
||||
blocks and other "internal" transactions. These "virtual" blocks are then broadcast
|
||||
to the Root Chain.
|
||||
|
||||
This naive approach is, however, fundamentally flawed, as it by definition
|
||||
diverges from the canonical chain maintained by Tendermint. This is further
|
||||
exacerbated if the business logic for generating such transactions is
|
||||
potentially non-deterministic, as this should not even be done in
|
||||
`Begin/EndBlock`, which may, as a result, break consensus guarantees.
|
||||
|
||||
Additionally, this has serious implications for "watchers" - independent third parties,
|
||||
or even an auxiliary blockchain, responsible for ensuring that blocks recorded
|
||||
on the Root Chain are consistent with the Plasma chain's. Since, in this case,
|
||||
the Plasma chain is inconsistent with the canonical one maintained by Tendermint
|
||||
Core, it seems that there exists no compact means of verifying the legitimacy of
|
||||
the Plasma chain without replaying every state transition from genesis (!).
|
||||
|
||||
### Solution 2: Broadcast to Tendermint Core from ABCI app
|
||||
|
||||
This approach is inspired by `tendermint`, in which Ethereum transactions are
|
||||
relayed to Tendermint Core. It requires the app to maintain a client connection
|
||||
to the consensus engine.
|
||||
|
||||
Whenever an "internal" transaction needs to be created, the proposer of the
|
||||
current block broadcasts the transaction or transactions to Tendermint as
|
||||
needed in order to ensure that the Tendermint chain and Plasma chain are
|
||||
completely consistent.
|
||||
|
||||
This allows "internal" transactions to pass through the full consensus
|
||||
process, and can be validated in methods like `CheckTx`, i.e., signed by the
|
||||
proposer, is semantically correct, etc. Note that this involves informing
|
||||
the ABCI app of the block proposer, which was temporarily hacked in as a means
|
||||
of conducting this experiment, although this should not be necessary when the
|
||||
current proposer is passed to `BeginBlock`.
|
||||
|
||||
It is much easier to relay these transactions directly to the Root
|
||||
Chain smart contract and/or maintain a "compressed" auxiliary chain comprised
|
||||
of Plasma-friendly blocks that 100% reflect the canonical (Tendermint)
|
||||
blockchain. Unfortunately, this approach is not idiomatic (i.e., utilises the
|
||||
Tendermint consensus engine in unintended ways). Additionally, it does not
|
||||
allow the application developer to:
|
||||
|
||||
- Control the _ordering_ of transactions in the proposed block (e.g., index 0,
|
||||
or 0 to `n` for coinbase transactions)
|
||||
- Control the _number_ of transactions in the block (e.g., when a `deposit`
|
||||
block is required)
|
||||
|
||||
Since determinism is of utmost importance in blockchain engineering, this approach,
|
||||
while more viable, should also not be considered as fit for production.
|
||||
|
||||
## Decision
|
||||
|
||||
### `ProposeTx`
|
||||
|
||||
In order to address the difficulties described above, the ABCI interface must
|
||||
expose an additional method, tentatively named `ProposeTx`.
|
||||
|
||||
It should have the following signature:
|
||||
|
||||
```
|
||||
ProposeTx(RequestProposeTx) ResponseProposeTx
|
||||
```
|
||||
|
||||
Where `RequestProposeTx` and `ResponseProposeTx` are `message`s with the
|
||||
following shapes:
|
||||
|
||||
```
|
||||
message RequestProposeTx {
|
||||
int64 next_block_height = 1; // height of the block the proposed tx would be part of
|
||||
Validator proposer = 2; // the proposer details
|
||||
}
|
||||
|
||||
message ResponseProposeTx {
|
||||
int64 num_tx = 1; // the number of tx to include in proposed block
|
||||
repeated bytes txs = 2; // ordered transaction data to include in block
|
||||
bool exclusive = 3; // whether the block should include other transactions (from `mempool`)
|
||||
}
|
||||
```
|
||||
|
||||
`ProposeTx` would be called before `mempool.Reap` at this
|
||||
[line](https://github.com/tendermint/tendermint/blob/9cd9f3338bc80a12590631632c23c8dbe3ff5c34/consensus/state.go#L935).
|
||||
Depending on whether `exclusive` is `true` or `false`, the proposed
|
||||
transactions are then pushed on top of the transactions received from
|
||||
`mempool.Reap`.
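
The merge step described above can be sketched roughly as follows. This is a minimal illustration rather than the proposed implementation: `assembleTxs` is a hypothetical helper, `reaped` stands in for the result of `mempool.Reap`, and the sketch assumes that "pushed on top of" means the proposed transactions are placed ahead of the mempool transactions, and that `exclusive = true` means mempool transactions are left out entirely.

```go
package main

import "fmt"

// assembleTxs is a hypothetical helper (not part of Tendermint) that combines
// the transactions returned by ProposeTx with those reaped from the mempool.
// Assumption: proposed txs come first; when exclusive is true, mempool txs
// are dropped so the block contains only the proposed txs.
func assembleTxs(proposed [][]byte, exclusive bool, reaped [][]byte) [][]byte {
	txs := make([][]byte, 0, len(proposed)+len(reaped))
	txs = append(txs, proposed...)
	if !exclusive {
		txs = append(txs, reaped...)
	}
	return txs
}

func main() {
	proposed := [][]byte{[]byte("coinbase")}
	reaped := [][]byte{[]byte("tx1"), []byte("tx2")}

	// Non-exclusive: the coinbase tx sits at index 0, followed by mempool txs.
	fmt.Println(len(assembleTxs(proposed, false, reaped)))
	// Exclusive: e.g. a `deposit` block holding a single transaction.
	fmt.Println(len(assembleTxs(proposed, true, reaped)))
}
```

Under these assumptions, both the ordering and the number of transactions in the block follow directly from the application's response.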
|
||||
|
||||
### `DeliverTx`
|
||||
|
||||
Since the list of `tx` received from `ProposeTx` are _not_ passed through `CheckTx`,
|
||||
it is probably a good idea to provide a means of differentiating "internal" transactions
|
||||
from user-generated ones, in case the app developer needs/wants to take extra measures to
|
||||
ensure validity of the proposed transactions.
|
||||
|
||||
Therefore, the `RequestDeliverTx` message should be changed to provide an additional flag, like so:
|
||||
|
||||
```
|
||||
message RequestDeliverTx {
|
||||
bytes tx = 1;
|
||||
bool internal = 2;
|
||||
}
|
||||
```
|
||||
|
||||
Alternatively, an additional method `DeliverProposeTx` may be added as an accompaniment to
|
||||
`ProposeTx`. However, it is not clear at this stage if this additional overhead is necessary
|
||||
to preserve consensus guarantees given that a simple flag may suffice for now.
|
||||
|
||||
## Status
|
||||
|
||||
Pending
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Tendermint ABCI apps will be able to function as minimally viable Plasma chains.
|
||||
- It will thereby become possible to add an extension to `cosmos-sdk` to enable
|
||||
ABCI apps to support both IBC and Plasma, maximising interop.
|
||||
- ABCI apps will have great control and flexibility in managing blockchain state,
|
||||
without having to resort to non-deterministic hacks and/or unsafe workarounds
|
||||
|
||||
### Negative
|
||||
|
||||
- Maintenance overhead of exposing additional ABCI method
|
||||
- Potential security issues that may have been overlooked and must now be tested extensively
|
||||
|
||||
### Neutral
|
||||
|
||||
- ABCI developers must deal with increased (albeit nominal) API surface area.
|
||||
|
||||
## References
|
||||
|
||||
- [#1776 Plasma and "Internal" Transactions in ABCI Apps](https://github.com/tendermint/tendermint/issues/1776)
|
||||
- [Minimal Viable Plasma](https://ethresear.ch/t/minimal-viable-plasma/426)
|
||||
- [Plasma Cash: Plasma with much less per-user data checking](https://ethresear.ch/t/plasma-cash-plasma-with-much-less-per-user-data-checking/1298)
|
||||
@@ -1,234 +0,0 @@
|
||||
# ADR 024: SignBytes and validator types in privval
|
||||
|
||||
## Context
|
||||
|
||||
Currently, the messages exchanged between tendermint and a (potentially remote) signer/validator,
|
||||
namely votes, proposals, and heartbeats, are encoded as a JSON string
|
||||
(e.g., via `Vote.SignBytes(...)`) and then
|
||||
signed. JSON encoding is sub-optimal for both hardware wallets
|
||||
and for usage in Ethereum smart contracts. Both are laid out in detail in [issue#1622].
|
||||
|
||||
Also, there is currently no difference between sign requests and replies. In addition, there is no possibility
|
||||
for a remote signer to include an error code or message in case something went wrong.
|
||||
The messages exchanged between tendermint and a remote signer currently live in
|
||||
[privval/socket.go] and encapsulate the corresponding types in [types].
|
||||
|
||||
|
||||
[privval/socket.go]: https://github.com/tendermint/tendermint/blob/d419fffe18531317c28c29a292ad7d253f6cafdf/privval/socket.go#L496-L502
|
||||
[issue#1622]: https://github.com/tendermint/tendermint/issues/1622
|
||||
[types]: https://github.com/tendermint/tendermint/tree/master/types
|
||||
|
||||
|
||||
## Decision
|
||||
|
||||
- restructure vote, proposal, and heartbeat such that their encoding is easily parseable by
|
||||
hardware devices and smart contracts using a binary encoding format ([amino] in this case)
|
||||
- split up the messages exchanged between tendermint and remote signers into requests and
|
||||
responses (see details below)
|
||||
- include an error type in responses
|
||||
|
||||
### Overview
|
||||
```
|
||||
+--------------+ +----------------+
|
||||
| | SignXRequest | |
|
||||
|Remote signer |<---------------------+ tendermint |
|
||||
| (e.g. KMS) | | |
|
||||
| +--------------------->| |
|
||||
+--------------+ SignedXReply +----------------+
|
||||
|
||||
|
||||
SignXRequest {
|
||||
x: X
|
||||
}
|
||||
|
||||
SignedXReply {
|
||||
x: X
|
||||
sig: Signature // []byte
|
||||
err: Error{
|
||||
code: int
|
||||
desc: string
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
TODO: Alternatively, the type `X` might directly include the signature. A lot of places expect a vote with a
|
||||
signature and do not necessarily deal with "Replies".
|
||||
Still exploring what would work best here.
|
||||
This would look like (exemplified using X = Vote):
|
||||
```
|
||||
Vote {
|
||||
// all fields besides signature
|
||||
}
|
||||
|
||||
SignedVote {
|
||||
Vote Vote
|
||||
Signature []byte
|
||||
}
|
||||
|
||||
SignVoteRequest {
|
||||
Vote Vote
|
||||
}
|
||||
|
||||
SignedVoteReply {
|
||||
Vote SignedVote
|
||||
Err Error
|
||||
}
|
||||
```
|
||||
|
||||
**Note:** There was a related discussion around including a fingerprint of, or the whole, public key
|
||||
into each sign-request to tell the signer which corresponding private-key to
|
||||
use to sign the message. This is particularly relevant in the context of the KMS
|
||||
but is currently not considered in this ADR.
|
||||
|
||||
|
||||
[amino]: https://github.com/tendermint/go-amino/
|
||||
|
||||
### Vote
|
||||
|
||||
As explained in [issue#1622] `Vote` will be changed to contain the following fields
|
||||
(notation in protobuf-like syntax for easy readability):
|
||||
|
||||
```proto
|
||||
// vanilla protobuf / amino encoded
|
||||
message Vote {
|
||||
Version fixed32
|
||||
Height sfixed64
|
||||
Round sfixed32
|
||||
VoteType fixed32
|
||||
Timestamp Timestamp // << using protobuf definition
|
||||
BlockID BlockID // << as already defined
|
||||
ChainID string // at the end because length could vary a lot
|
||||
}
|
||||
|
||||
// this is an amino registered type; like currently privval.SignVoteMsg:
|
||||
// registered with "tendermint/socketpv/SignVoteRequest"
|
||||
message SignVoteRequest {
|
||||
Vote vote
|
||||
}
|
||||
|
||||
// amino registered type
|
||||
// registered with "tendermint/socketpv/SignedVoteReply"
|
||||
message SignedVoteReply {
|
||||
Vote Vote
|
||||
Signature Signature
|
||||
Err Error
|
||||
}
|
||||
|
||||
// we will use this type everywhere below
|
||||
message Error {
|
||||
Type uint // error code
|
||||
Description string // optional description
|
||||
}
|
||||
|
||||
```
|
||||
|
||||
The `ChainID` gets moved into the vote message directly. Previously, it was injected
|
||||
using the [Signable] interface method `SignBytes(chainID string) []byte`. Also, the
|
||||
signature won't be included directly, only in the corresponding `SignedVoteReply` message.
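
To illustrate why fixed-width fields are convenient for constrained devices, the sketch below packs the leading numeric fields of `Vote` at fixed byte offsets and reads the height back without decoding anything else. It is a simplified illustration only: it is not the amino or protobuf wire format (which also carries field tags), and `voteHeader`, `encodeHeader`, and `heightAt` are hypothetical names introduced here.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// voteHeader mirrors the fixed-width fields at the start of the proposed Vote.
// Hypothetical type, used only for this illustration.
type voteHeader struct {
	Version  uint32 // fixed32
	Height   int64  // sfixed64
	Round    int32  // sfixed32
	VoteType uint32 // fixed32
}

// headerSize is constant because every field has a fixed width.
const headerSize = 4 + 8 + 4 + 4

// encodeHeader writes each field little-endian at a fixed offset.
func encodeHeader(h voteHeader) []byte {
	buf := make([]byte, headerSize)
	binary.LittleEndian.PutUint32(buf[0:4], h.Version)
	binary.LittleEndian.PutUint64(buf[4:12], uint64(h.Height))
	binary.LittleEndian.PutUint32(buf[12:16], uint32(h.Round))
	binary.LittleEndian.PutUint32(buf[16:20], h.VoteType)
	return buf
}

// heightAt reads the height straight from its known offset, which is the kind
// of access a hardware signer or smart contract can do cheaply.
func heightAt(buf []byte) int64 {
	return int64(binary.LittleEndian.Uint64(buf[4:12]))
}

func main() {
	bz := encodeHeader(voteHeader{Version: 1, Height: 42, Round: 0, VoteType: 2})
	fmt.Println(len(bz), heightAt(bz))
}
```

Variable-length data such as `ChainID` comes last in the message for the same reason: it cannot shift the offsets of the fields before it.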
|
||||
|
||||
[Signable]: https://github.com/tendermint/tendermint/blob/d419fffe18531317c28c29a292ad7d253f6cafdf/types/signable.go#L9-L11
|
||||
|
||||
### Proposal
|
||||
|
||||
```proto
|
||||
// vanilla protobuf / amino encoded
|
||||
message Proposal {
|
||||
Height sfixed64
|
||||
Round sfixed32
|
||||
Timestamp Timestamp // << using protobuf definition
|
||||
BlockPartsHeader PartSetHeader // as already defined
|
||||
POLRound sfixed32
|
||||
POLBlockID BlockID // << as already defined
|
||||
}
|
||||
|
||||
// amino registered with "tendermint/socketpv/SignProposalRequest"
|
||||
message SignProposalRequest {
|
||||
Proposal proposal
|
||||
}
|
||||
|
||||
// amino registered with "tendermint/socketpv/SignProposalReply"
|
||||
message SignProposalReply {
|
||||
Prop Proposal
|
||||
Sig Signature
|
||||
Err Error // as defined above
|
||||
}
|
||||
```
|
||||
|
||||
### Heartbeat
|
||||
|
||||
**TODO**: clarify if heartbeat also needs a fixed offset and update the fields accordingly:
|
||||
|
||||
```proto
|
||||
message Heartbeat {
|
||||
ValidatorAddress Address
|
||||
ValidatorIndex int
|
||||
Height int64
|
||||
Round int
|
||||
Sequence int
|
||||
}
|
||||
// amino registered with "tendermint/socketpv/SignHeartbeatRequest"
|
||||
message SignHeartbeatRequest {
|
||||
Hb Heartbeat
|
||||
}
|
||||
|
||||
// amino registered with "tendermint/socketpv/SignHeartbeatReply"
|
||||
message SignHeartbeatReply {
|
||||
Hb Heartbeat
|
||||
Sig Signature
|
||||
Err Error // as defined above
|
||||
}
|
||||
|
||||
```
|
||||
|
||||
## PubKey
|
||||
|
||||
TBA - this needs further thought: e.g., what to do in the case of the KMS, which holds
|
||||
several keys? How does it know with which key to reply?
|
||||
|
||||
## SignBytes
|
||||
`SignBytes` will not require a `ChainID` parameter:
|
||||
|
||||
```golang
|
||||
type Signable interface {
|
||||
SignBytes() []byte
|
||||
}
|
||||
|
||||
```
|
||||
And the implementation for vote, heartbeat, proposal will look like:
|
||||
```golang
|
||||
// type T is one of vote, heartbeat, proposal
|
||||
func (tp *T) SignBytes() []byte {
|
||||
bz, err := cdc.MarshalBinary(tp)
|
||||
if err != nil {
|
||||
panic(err)
|
||||
}
|
||||
return bz
|
||||
}
|
||||
```
|
||||
|
||||
## Status
|
||||
|
||||
Partially Accepted
|
||||
|
||||
## Consequences
|
||||
|
||||
|
||||
|
||||
### Positive
|
||||
|
||||
The most relevant positive effect is that the signing bytes can easily be parsed by a
|
||||
hardware module and a smart contract. Besides that:
|
||||
|
||||
- clearer separation between requests and responses
|
||||
- added error messages enable better error handling
|
||||
|
||||
|
||||
### Negative
|
||||
|
||||
- relatively huge change / refactoring touching quite some code
|
||||
- lots of places assume a `Vote` with a signature included -> they will need to
|
||||
- need to modify some interfaces
|
||||
|
||||
### Neutral
|
||||
|
||||
not even the swiss are neutral
|
||||
@@ -1,150 +0,0 @@
|
||||
# ADR 025 Commit
|
||||
|
||||
## Context
|
||||
|
||||
Currently the `Commit` structure contains a lot of potentially redundant or unnecessary data.
|
||||
It contains a list of precommits from every validator, where the precommit
|
||||
includes the whole `Vote` structure. Thus each of the commit height, round,
|
||||
type, and blockID are repeated for every validator, and could be deduplicated,
|
||||
leading to very significant savings in block size.
|
||||
|
||||
```
|
||||
type Commit struct {
|
||||
BlockID BlockID `json:"block_id"`
|
||||
Precommits []*Vote `json:"precommits"`
|
||||
}
|
||||
|
||||
type Vote struct {
|
||||
ValidatorAddress Address `json:"validator_address"`
|
||||
ValidatorIndex int `json:"validator_index"`
|
||||
Height int64 `json:"height"`
|
||||
Round int `json:"round"`
|
||||
Timestamp time.Time `json:"timestamp"`
|
||||
Type byte `json:"type"`
|
||||
BlockID BlockID `json:"block_id"`
|
||||
Signature []byte `json:"signature"`
|
||||
}
|
||||
```
|
||||
|
||||
The original tracking issue for this is [#1648](https://github.com/tendermint/tendermint/issues/1648).
|
||||
We have discussed replacing the `Vote` type in `Commit` with a new `CommitSig`
|
||||
type, which includes at minimum the vote signature. The `Vote` type will
|
||||
continue to be used in the consensus reactor and elsewhere.
|
||||
|
||||
A primary question is what should be included in the `CommitSig` beyond the
|
||||
signature. One current constraint is that we must include a timestamp, since
|
||||
this is how we calculate BFT time, though we may be able to change this [in the
|
||||
future](https://github.com/tendermint/tendermint/issues/2840).
|
||||
|
||||
Other concerns here include:
|
||||
|
||||
- Validator Address [#3596](https://github.com/tendermint/tendermint/issues/3596) -
|
||||
Should the CommitSig include the validator address? It is very convenient to
|
||||
do so, but likely not necessary. This was also discussed in [#2226](https://github.com/tendermint/tendermint/issues/2226).
|
||||
- Absent Votes [#3591](https://github.com/tendermint/tendermint/issues/3591) -
|
||||
How to represent absent votes? Currently they are just present as `nil` in the
|
||||
Precommits list, which is actually problematic for serialization
|
||||
- Other BlockIDs [#3485](https://github.com/tendermint/tendermint/issues/3485) -
|
||||
How to represent votes for nil and for other block IDs? We currently allow
|
||||
votes for nil and votes for alternative block ids, but just ignore them
|
||||
|
||||
|
||||
## Decision
|
||||
|
||||
Deduplicate the fields and introduce `CommitSig`:
|
||||
|
||||
```
|
||||
type Commit struct {
|
||||
Height int64
|
||||
Round int
|
||||
BlockID BlockID `json:"block_id"`
|
||||
Precommits []CommitSig `json:"precommits"`
|
||||
}
|
||||
|
||||
type CommitSig struct {
|
||||
BlockID BlockIDFlag
|
||||
ValidatorAddress Address
|
||||
Timestamp time.Time
|
||||
Signature []byte
|
||||
}
|
||||
|
||||
|
||||
// indicate which BlockID the signature is for
|
||||
type BlockIDFlag int
|
||||
|
||||
const (
|
||||
BlockIDFlagAbsent BlockIDFlag = iota // vote is not included in the Commit.Precommits
|
||||
BlockIDFlagCommit // voted for the Commit.BlockID
|
||||
BlockIDFlagNil // voted for nil
|
||||
)
|
||||
|
||||
```
|
||||
|
||||
Re the concerns outlined in the context:
|
||||
|
||||
**Timestamp**: Leave the timestamp for now. Removing it and switching to
|
||||
proposer based time will take more analysis and work, and will be left for a
|
||||
future breaking change. In the meantime, the concerns with the current approach to
|
||||
BFT time [can be
|
||||
mitigated](https://github.com/tendermint/tendermint/issues/2840#issuecomment-529122431).
|
||||
|
||||
**ValidatorAddress**: we include it in the `CommitSig` for now. While this
|
||||
does increase the block size unnecessarily (20 bytes per validator), it has some ergonomic and debugging advantages:
|
||||
|
||||
- `Commit` contains everything necessary to reconstruct `[]Vote`, and doesn't depend on additional access to a `ValidatorSet`
|
||||
- Lite clients can check if they know the validators in a commit without
|
||||
re-downloading the validator set
|
||||
- Easy to see directly in a commit which validators signed what without having
|
||||
to fetch the validator set
|
||||
|
||||
If and when we change the `CommitSig` again, for instance to remove the timestamp,
|
||||
we can reconsider whether the ValidatorAddress should be removed.
|
||||
|
||||
**Absent Votes**: we include absent votes explicitly with no Signature or
|
||||
Timestamp but with the ValidatorAddress. This should resolve the serialization
|
||||
issues and make it easy to see which validator's votes failed to be included.
|
||||
|
||||
**Other BlockIDs**: We use a single byte to indicate which blockID a `CommitSig`
|
||||
is for. The only options are:
|
||||
- `Absent` - no vote received from this validator, so no signature
|
||||
- `Nil` - validator voted Nil - meaning they did not see a polka in time
|
||||
- `Commit` - validator voted for this block
|
||||
|
||||
Note this means we don't allow votes for any other blockIDs. If a signature is
|
||||
included in a commit, it is either for nil or the correct blockID. According to
|
||||
the Tendermint protocol and assumptions, there is no way for a correct validator to
|
||||
precommit for a conflicting blockID in the same round an actual commit was
|
||||
created. This was the consensus from
|
||||
[#3485](https://github.com/tendermint/tendermint/issues/3485)
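
As a rough sketch of how a `Vote` could be rebuilt from the deduplicated `Commit`, consider the following. It makes several simplifying assumptions that are not part of the ADR: `BlockID` and the validator address are reduced to strings, the `CommitSig` flag field is named `BlockIDFlag`, the vote type is omitted (it is always a precommit), and `voteAt` is a hypothetical method name.

```go
package main

import (
	"fmt"
	"time"
)

// BlockIDFlag indicates which BlockID a signature is for.
type BlockIDFlag byte

const (
	BlockIDFlagAbsent BlockIDFlag = iota // no vote received
	BlockIDFlagCommit                    // voted for Commit.BlockID
	BlockIDFlagNil                       // voted for nil
)

// BlockID is simplified to a string for this sketch.
type BlockID string

type CommitSig struct {
	BlockIDFlag      BlockIDFlag
	ValidatorAddress string
	Timestamp        time.Time
	Signature        []byte
}

type Commit struct {
	Height     int64
	Round      int
	BlockID    BlockID
	Precommits []CommitSig
}

type Vote struct {
	ValidatorAddress string
	ValidatorIndex   int
	Height           int64
	Round            int
	Timestamp        time.Time
	BlockID          BlockID // empty means a vote for nil
	Signature        []byte
}

// voteAt rebuilds the precommit of the validator at idx from the shared
// Commit fields. It reports false for absent votes, which carry no signature.
func (c Commit) voteAt(idx int) (Vote, bool) {
	sig := c.Precommits[idx]
	if sig.BlockIDFlag == BlockIDFlagAbsent {
		return Vote{}, false
	}
	v := Vote{
		ValidatorAddress: sig.ValidatorAddress,
		ValidatorIndex:   idx,
		Height:           c.Height,
		Round:            c.Round,
		Timestamp:        sig.Timestamp,
		Signature:        sig.Signature,
	}
	if sig.BlockIDFlag == BlockIDFlagCommit {
		v.BlockID = c.BlockID
	}
	return v, true
}

func main() {
	c := Commit{Height: 10, Round: 0, BlockID: "block-hash", Precommits: []CommitSig{
		{BlockIDFlag: BlockIDFlagCommit, ValidatorAddress: "val0", Signature: []byte{1}},
		{BlockIDFlag: BlockIDFlagAbsent, ValidatorAddress: "val1"},
		{BlockIDFlag: BlockIDFlagNil, ValidatorAddress: "val2", Signature: []byte{2}},
	}}
	for i := range c.Precommits {
		v, ok := c.voteAt(i)
		fmt.Println(i, ok, v.BlockID)
	}
}
```

Because the validator address travels with each `CommitSig`, no `ValidatorSet` lookup is needed in this reconstruction.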
|
||||
|
||||
We may want to consider supporting other blockIDs later, as a way to capture
|
||||
evidence that might be helpful. We should clarify if/when/how doing so would
|
||||
actually help first. To implement it, we could change the `Commit.BlockID`
|
||||
field to a slice, where the first entry is the correct block ID and the other
|
||||
entries are other BlockIDs that validators precommitted before. The BlockIDFlag
|
||||
enum can be extended to represent these additional block IDs on a per block
|
||||
basis.
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
Removing the Type/Height/Round/Index and the BlockID saves roughly 80 bytes per precommit.
|
||||
It varies because some integers are varint. The BlockID contains two 32-byte hashes and an integer,
|
||||
and the Height is 8-bytes.
|
||||
|
||||
For a chain with 100 validators, that's up to 8kB in savings per block!
|
||||
|
||||
|
||||
### Negative
|
||||
|
||||
- Large breaking change to the block and commit structure
|
||||
- Requires differentiating in code between the Vote and CommitSig objects, which may add some complexity (votes need to be reconstructed to be verified and gossiped)
|
||||
|
||||
### Neutral
|
||||
|
||||
- Commit.Precommits no longer contains nil values
|
||||
@@ -1,49 +0,0 @@
|
||||
# ADR 026: General Merkle Proof
|
||||
|
||||
## Context
|
||||
|
||||
We are using raw `[]byte` for Merkle proofs in `abci.ResponseQuery`. This makes it hard to handle multilayer Merkle proofs and general cases. Here, a new interface, `ProofOperator`, is defined. Users can define their own Merkle proof formats and layer them easily.
|
||||
|
||||
Goals:
|
||||
- Layer Merkle proofs without decoding/reencoding
|
||||
- Provide a general way to chain proofs
|
||||
- Make the proof format extensible, allowing third-party proof types
|
||||
|
||||
## Decision
|
||||
|
||||
### ProofOperator
|
||||
|
||||
`type ProofOperator` is an interface for Merkle proofs. The definition is:
|
||||
|
||||
```go
|
||||
type ProofOperator interface {
|
||||
Run([][]byte) ([][]byte, error)
|
||||
GetKey() []byte
|
||||
ProofOp() ProofOp
|
||||
}
|
||||
```
|
||||
|
||||
Since a proof can operate on various data types, `Run()` takes `[][]byte` as its argument, not `[]byte`. For example, a range proof's `Run()` can take multiple key-values as its argument. It then returns the root of the tree, calculated from the input values, for further processing.
|
||||
|
||||
`ProofOperator` does not have to be a Merkle proof - it can be a function that transforms the argument for an intermediate step, e.g., prepending the length to the `[]byte`.
|
||||
|
||||
### ProofOp
|
||||
|
||||
`type ProofOp` is a protobuf message which is a triple of `Type string`, `Key []byte`, and `Data []byte`. `ProofOperator` and `ProofOp` are interconvertible, using `ProofOperator.ProofOp()` and `OpDecoder()`, where `OpDecoder` is a function that each proof type can register for its own encoding scheme. For example, we can add a byte indicating the encoding scheme before the serialized proof, supporting JSON decoding.
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Layering becomes easier (no encoding/decoding at each step)
|
||||
- Third-party proof formats are supported
|
||||
|
||||
### Negative
|
||||
|
||||
- Larger size for abci.ResponseQuery
|
||||
- Unintuitive proof chaining (it is not clear what `Run()` is doing)
|
||||
- Additional code for registering `OpDecoder`s
|
||||
@@ -1,127 +0,0 @@
|
||||
# ADR 029: Check block txs before prevote
|
||||
|
||||
## Changelog
|
||||
|
||||
04-10-2018: Update with link to issue
|
||||
[#2384](https://github.com/tendermint/tendermint/issues/2384) and reason for rejection
|
||||
19-09-2018: Initial Draft
|
||||
|
||||
## Context
|
||||
|
||||
We currently check a tx's validity in two ways.
|
||||
|
||||
1. Through checkTx in mempool connection.
|
||||
2. Through deliverTx in consensus connection.
|
||||
|
||||
The first is called when an external tx comes in, so the node should be a proposer at this time. The second is called when an external block comes in and reaches the commit phase; the node doesn't need to be the proposer of the block, but it should check the txs in that block.
|
||||
|
||||
In the second situation, if there are many invalid txs in the block, the nodes discover this too late, and we would rather not record invalid txs in the blockchain either.
|
||||
|
||||
## Proposed solution
|
||||
|
||||
Therefore, we should find a way to check the txs' validity before sending out a prevote. Currently we have cs.isProposalComplete() to judge whether a block is complete. We can have
|
||||
|
||||
```
|
||||
func (blockExec *BlockExecutor) CheckBlock(block *types.Block) error {
|
||||
// check txs of block.
|
||||
for _, tx := range block.Txs {
|
||||
reqRes := blockExec.proxyApp.CheckTxAsync(tx)
|
||||
reqRes.Wait()
|
||||
if reqRes.Response == nil || reqRes.Response.GetCheckTx() == nil || reqRes.Response.GetCheckTx().Code != abci.CodeTypeOK {
|
||||
return errors.Errorf("tx %v check failed. response: %v", tx, reqRes.Response)
|
||||
}
|
||||
}
|
||||
return nil
|
||||
}
|
||||
```
|
||||
|
||||
such a method in BlockExecutor to check all txs' validity in that block.
|
||||
|
||||
However, this method should not be implemented like that, because checkTx shares the state used by the mempool in the app. Instead, we should define a new interface method, checkBlock, in Application so that the app uses the same state as deliverTx.
|
||||
|
||||
```
|
||||
type Application interface {
|
||||
// Info/Query Connection
|
||||
Info(RequestInfo) ResponseInfo // Return application info
|
||||
Query(RequestQuery) ResponseQuery // Query for state
|
||||
|
||||
// Mempool Connection
|
||||
CheckTx(tx []byte) ResponseCheckTx // Validate a tx for the mempool
|
||||
|
||||
// Consensus Connection
|
||||
InitChain(RequestInitChain) ResponseInitChain // Initialize blockchain with validators and other info from TendermintCore
|
||||
CheckBlock(RequestCheckBlock) ResponseCheckBlock
|
||||
BeginBlock(RequestBeginBlock) ResponseBeginBlock // Signals the beginning of a block
|
||||
DeliverTx(tx []byte) ResponseDeliverTx // Deliver a tx for full processing
|
||||
EndBlock(RequestEndBlock) ResponseEndBlock // Signals the end of a block, returns changes to the validator set
|
||||
Commit() ResponseCommit // Commit the state and return the application Merkle root hash
|
||||
}
|
||||
```
|
||||
|
||||
All apps should implement that method. For example, counter:
|
||||
|
||||
```
|
||||
func (app *CounterApplication) CheckBlock(block types.Request_CheckBlock) types.ResponseCheckBlock {
|
||||
if app.serial {
|
||||
app.originalTxCount = app.txCount //backup the txCount state
|
||||
for _, tx := range block.CheckBlock.Block.Txs {
|
||||
if len(tx) > 8 {
|
||||
return types.ResponseCheckBlock{
|
||||
Code: code.CodeTypeEncodingError,
|
||||
Log: fmt.Sprintf("Max tx size is 8 bytes, got %d", len(tx))}
|
||||
}
|
||||
tx8 := make([]byte, 8)
|
||||
copy(tx8[len(tx8)-len(tx):], tx)
|
||||
txValue := binary.BigEndian.Uint64(tx8)
|
||||
if txValue < uint64(app.txCount) {
|
||||
return types.ResponseCheckBlock{
|
||||
Code: code.CodeTypeBadNonce,
|
||||
Log: fmt.Sprintf("Invalid nonce. Expected >= %v, got %v", app.txCount, txValue)}
|
||||
}
|
||||
app.txCount++
|
||||
}
|
||||
}
|
||||
return types.ResponseCheckBlock{Code: code.CodeTypeOK}
|
||||
}
|
||||
```
|
||||
|
||||
In BeginBlock, the app should restore the state to the original state before checking the block:
|
||||
|
||||
```
|
||||
func (app *CounterApplication) DeliverTx(tx []byte) types.ResponseDeliverTx {
|
||||
if app.serial {
|
||||
app.txCount = app.originalTxCount //restore the txCount state
|
||||
}
|
||||
app.txCount++
|
||||
return types.ResponseDeliverTx{Code: code.CodeTypeOK}
|
||||
}
|
||||
```
|
||||
|
||||
The txCount is like the nonce in ethermint: it should be restored when entering the deliverTx phase. Some operations, such as checking the tx signature, do not need to be done again. deliverTx can therefore focus on how a tx is applied and skip the checks, because all of the checking has already been done in the checkBlock phase.
|
||||
|
||||
An optional optimization is to change deliverTx to deliverBlock. Because the block has already been checked by checkBlock, all the txs in it are valid. The app can therefore cache the block and, in the deliverBlock phase, simply apply the cached block. This optimization can reduce the network traffic incurred by deliverTx.
|
||||
|
||||
|
||||
|
||||
## Status
|
||||
|
||||
Rejected
|
||||
|
||||
## Decision
|
||||
|
||||
Performance impact is considered too great. See [#2384](https://github.com/tendermint/tendermint/issues/2384)
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- more robust defense against an adversary proposing a block full of invalid txs.
|
||||
|
||||
### Negative
|
||||
|
||||
- adds a new interface method; app logic needs to be adjusted to accommodate it.
|
||||
- sending all the tx data over the ABCI twice
|
||||
- potentially redundant validations (eg. signature checks in both CheckBlock and
|
||||
DeliverTx)
|
||||
|
||||
### Neutral
|
||||
@@ -1,458 +0,0 @@
|
||||
# ADR 030: Consensus Refactor
|
||||
|
||||
## Context
|
||||
|
||||
One of the biggest challenges this project faces is to prove that the
|
||||
implementations of the specifications are correct, much like we strive to
|
||||
formally verify our algorithms and protocols, we should work towards high
|
||||
confidence about the correctness of our program code. One of those is the core
|
||||
of Tendermint - Consensus - which currently resides in the `consensus` package.
|
||||
Over time there has been high friction making changes to the package due to the
|
||||
algorithm being scattered in a side-effectful container (the current
|
||||
`ConsensusState`). In order to test the algorithm a large object-graph needs to
|
||||
be set up and even then the non-deterministic parts of the container will
|
||||
prevent high certainty. Ideally we would have a 1-to-1 representation of the
|
||||
[spec](https://github.com/tendermint/spec), ready and easy to test for domain
|
||||
experts.
|
||||
|
||||
Addresses:
|
||||
|
||||
- [#1495](https://github.com/tendermint/tendermint/issues/1495)
|
||||
- [#1692](https://github.com/tendermint/tendermint/issues/1692)
|
||||
|
||||
## Decision
|
||||
|
||||
To remedy these issues we plan a gradual, non-invasive refactoring of the
|
||||
`consensus` package. Starting off by isolating the consensus algorithm into
|
||||
a pure function and a finite state machine to address the most pressing issue
|
||||
of lack of confidence. Doing so while leaving the rest of the package intact
|
||||
and have follow-up optional changes to improve the separation of concerns.
|
||||
|
||||
### Implementation changes
|
||||
|
||||
The core of Consensus can be modelled as a function with clearly defined inputs:
|
||||
|
||||
* `State` - data container for current round, height, etc.
|
||||
* `Event`- significant events in the network
|
||||
|
||||
producing clear outputs:
|
||||
|
||||
* `State` - updated input
|
||||
* `Message` - signal what actions to perform
|
||||
|
||||
```go
|
||||
type Event int
|
||||
|
||||
const (
|
||||
EventUnknown Event = iota
|
||||
EventProposal
|
||||
Majority23PrevotesBlock
|
||||
Majority23PrecommitBlock
|
||||
Majority23PrevotesAny
|
||||
Majority23PrecommitAny
|
||||
TimeoutNewRound
|
||||
TimeoutPropose
|
||||
TimeoutPrevotes
|
||||
TimeoutPrecommit
|
||||
)
|
||||
|
||||
type Message int
|
||||
|
||||
const (
|
||||
MessageUnknown Message = iota
|
||||
MessageProposal
|
||||
MessageVotes
|
||||
MessageDecision
|
||||
)
|
||||
|
||||
type State struct {
|
||||
height uint64
|
||||
round uint64
|
||||
step uint64
|
||||
lockedValue interface{} // TODO: Define proper type.
|
||||
lockedRound interface{} // TODO: Define proper type.
|
||||
validValue interface{} // TODO: Define proper type.
|
||||
validRound interface{} // TODO: Define proper type.
|
||||
// From the original notes: valid(v)
|
||||
valid interface{} // TODO: Define proper type.
|
||||
// From the original notes: proposer(h, r)
|
||||
proposer interface{} // TODO: Define proper type.
|
||||
}
|
||||
|
||||
func Consensus(Event, State) (State, Message) {
|
||||
// Consolidate implementation.
|
||||
}
|
||||
```
|
||||
|
||||
Tracking of relevant information to feed `Event` into the function and act on
|
||||
the output is left to the `ConsensusExecutor` (formerly `ConsensusState`).
|
||||
|
||||
The benefit for testing is that checking a sequence of events
against the algorithm could be as simple as the following example:
|
||||
|
||||
``` go
|
||||
func TestConsensusXXX(t *testing.T) {
|
||||
type expected struct {
|
||||
message Message
|
||||
state State
|
||||
}
|
||||
|
||||
// Setup order of events, initial state and expectation.
|
||||
var (
|
||||
events = []struct {
|
||||
event Event
|
||||
want expected
|
||||
}{
|
||||
// ...
|
||||
}
|
||||
state = State{
|
||||
// ...
|
||||
}
|
||||
)
|
||||
|
||||
for _, e := range events {
|
||||
var msg Message
state, msg = Consensus(e.event, state)
|
||||
|
||||
// Test message expectation.
|
||||
if msg != e.want.message {
|
||||
t.Fatalf("have %v, want %v", msg, e.want.message)
|
||||
}
|
||||
|
||||
// Test state expectation.
|
||||
if !reflect.DeepEqual(state, e.want.state) {
|
||||
t.Fatalf("have %v, want %v", state, e.want.state)
|
||||
}
|
||||
}
|
||||
}
|
||||
```
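To make the idea concrete, here is a minimal runnable sketch of a pure transition function. It is a toy, not the Tendermint algorithm: the names `Transition`, `Step`, `EventTimeoutPropose`, `MessagePrevoteBlock` and `MessagePrevoteNil` are assumptions made for illustration, and only two events are handled.

```go
package main

// Event and Message follow the enum style used above. The transition rules
// below are a toy subset for illustration, not the real consensus rules.
type Event int

const (
	EventUnknown Event = iota
	EventProposal
	EventTimeoutPropose
)

type Message int

const (
	MessageUnknown Message = iota
	MessagePrevoteBlock
	MessagePrevoteNil
)

type Step int

const (
	StepPropose Step = iota
	StepPrevote
)

// State is passed and returned by value, so Transition cannot mutate the
// caller's copy.
type State struct {
	Height uint64
	Round  uint64
	Step   Step
}

// Transition is a pure function: the same (event, state) pair always yields
// the same (state, message) pair, which is what makes table-driven tests easy.
func Transition(ev Event, st State) (State, Message) {
	if st.Step != StepPropose {
		return st, MessageUnknown // events are ignored outside the propose step
	}
	switch ev {
	case EventProposal:
		st.Step = StepPrevote
		return st, MessagePrevoteBlock
	case EventTimeoutPropose:
		st.Step = StepPrevote
		return st, MessagePrevoteNil
	}
	return st, MessageUnknown
}
```

Because the function takes and returns `State` by value, a test can feed the output of one call into the next and compare whole states with `==`.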
|
||||
|
||||
|
||||
## Consensus Executor
|
||||
|
||||
## Consensus Core
|
||||
|
||||
```go
|
||||
type Event interface{}
|
||||
|
||||
type EventNewHeight struct {
|
||||
Height int64
|
||||
ValidatorId int
|
||||
}
|
||||
|
||||
type EventNewRound HeightAndRound
|
||||
|
||||
type EventProposal struct {
|
||||
Height int64
|
||||
Round int
|
||||
Timestamp Time
|
||||
BlockID BlockID
|
||||
POLRound int
|
||||
Sender int
|
||||
}
|
||||
|
||||
type Majority23PrevotesBlock struct {
|
||||
Height int64
|
||||
Round int
|
||||
BlockID BlockID
|
||||
}
|
||||
|
||||
type Majority23PrecommitBlock struct {
|
||||
Height int64
|
||||
Round int
|
||||
BlockID BlockID
|
||||
}
|
||||
|
||||
type HeightAndRound struct {
|
||||
Height int64
|
||||
Round int
|
||||
}
|
||||
|
||||
type Majority23PrevotesAny HeightAndRound
|
||||
type Majority23PrecommitAny HeightAndRound
|
||||
type TimeoutPropose HeightAndRound
|
||||
type TimeoutPrevotes HeightAndRound
|
||||
type TimeoutPrecommit HeightAndRound
|
||||
|
||||
|
||||
type Message interface{}
|
||||
|
||||
type MessageProposal struct {
|
||||
Height int64
|
||||
Round int
|
||||
BlockID BlockID
|
||||
POLRound int
|
||||
}
|
||||
|
||||
type VoteType int
|
||||
|
||||
const (
|
||||
VoteTypeUnknown VoteType = iota
|
||||
Prevote
|
||||
Precommit
|
||||
)
|
||||
|
||||
|
||||
type MessageVote struct {
|
||||
Height int64
|
||||
Round int
|
||||
BlockID BlockID
|
||||
Type VoteType
|
||||
}
|
||||
|
||||
|
||||
type MessageDecision struct {
|
||||
Height int64
|
||||
Round int
|
||||
BlockID BlockID
|
||||
}
|
||||
|
||||
type TriggerTimeout struct {
|
||||
Height int64
|
||||
Round int
|
||||
Duration Duration
|
||||
}
|
||||
|
||||
|
||||
type RoundStep int
|
||||
|
||||
const (
|
||||
RoundStepUnknown RoundStep = iota
|
||||
RoundStepPropose
|
||||
RoundStepPrevote
|
||||
RoundStepPrecommit
|
||||
RoundStepCommit
|
||||
)
|
||||
|
||||
type State struct {
|
||||
Height int64
|
||||
Round int
|
||||
Step RoundStep
|
||||
LockedValue BlockID
|
||||
LockedRound int
|
||||
ValidValue BlockID
|
||||
ValidRound int
|
||||
ValidatorId int
|
||||
ValidatorSetSize int
|
||||
}
|
||||
|
||||
func proposer(height int64, round int) int {}
|
||||
func getValue() BlockID {}
|
||||
|
||||
func Consensus(event Event, state State) (State, Message, TriggerTimeout) {
|
||||
msg = nil
|
||||
timeout = nil
|
||||
switch event := event.(type) {
|
||||
case EventNewHeight:
|
||||
if event.Height > state.Height {
|
||||
state.Height = event.Height
|
||||
state.Round = -1
|
||||
state.Step = RoundStepPropose
|
||||
state.LockedValue = nil
|
||||
state.LockedRound = -1
|
||||
state.ValidValue = nil
|
||||
state.ValidRound = -1
|
||||
state.ValidatorId = event.ValidatorId
|
||||
}
|
||||
return state, msg, timeout
|
||||
|
||||
case EventNewRound:
|
||||
if event.Height == state.Height and event.Round > state.Round {
|
||||
state.Round = event.Round
|
||||
state.Step = RoundStepPropose
|
||||
if proposer(state.Height, state.Round) == state.ValidatorId {
|
||||
proposal = state.ValidValue
|
||||
if proposal == nil {
|
||||
proposal = getValue()
|
||||
}
|
||||
msg = MessageProposal { state.Height, state.Round, proposal, state.ValidRound }
|
||||
}
|
||||
timeout = TriggerTimeout { state.Height, state.Round, timeoutPropose(state.Round) }
|
||||
}
|
||||
return state, msg, timeout
|
||||
|
||||
case EventProposal:
|
||||
if event.Height == state.Height and event.Round == state.Round and
|
||||
event.Sender == proposer(state.Height, state.Round) and state.Step == RoundStepPropose {
|
||||
if event.POLRound >= state.LockedRound or event.BlockID == state.LockedValue or state.LockedRound == -1 {
|
||||
msg = MessageVote { state.Height, state.Round, event.BlockID, Prevote }
|
||||
}
|
||||
state.Step = RoundStepPrevote
|
||||
}
|
||||
return state, msg, timeout
|
||||
|
||||
case TimeoutPropose:
|
||||
if event.Height == state.Height and event.Round == state.Round and state.Step == RoundStepPropose {
|
||||
msg = MessageVote { state.Height, state.Round, nil, Prevote }
|
||||
state.Step = RoundStepPrevote
|
||||
}
|
||||
return state, msg, timeout
|
||||
|
||||
case Majority23PrevotesBlock:
|
||||
if event.Height == state.Height and event.Round == state.Round and state.Step >= RoundStepPrevote and event.Round > state.ValidRound {
|
||||
state.ValidRound = event.Round
|
||||
state.ValidValue = event.BlockID
|
||||
if state.Step == RoundStepPrevote {
|
||||
state.LockedRound = event.Round
|
||||
state.LockedValue = event.BlockID
|
||||
msg = MessageVote { state.Height, state.Round, event.BlockID, Precommit }
|
||||
state.Step = RoundStepPrecommit
|
||||
}
|
||||
}
|
||||
return state, msg, timeout
|
||||
|
||||
case Majority23PrevotesAny:
|
||||
if event.Height == state.Height and event.Round == state.Round and state.Step == RoundStepPrevote {
|
||||
timeout = TriggerTimeout { state.Height, state.Round, timeoutPrevote(state.Round) }
|
||||
}
|
||||
return state, msg, timeout
|
||||
|
||||
case TimeoutPrevotes:
|
||||
if event.Height == state.Height and event.Round == state.Round and state.Step == RoundStepPrevote {
|
||||
msg = MessageVote { state.Height, state.Round, nil, Precommit }
|
||||
state.Step = RoundStepPrecommit
|
||||
}
|
||||
return state, msg, timeout
|
||||
|
||||
case Majority23PrecommitBlock:
|
||||
if event.Height == state.Height {
|
||||
state.Step = RoundStepCommit
|
||||
state.LockedValue = event.BlockID
|
||||
}
|
||||
return state, msg, timeout
|
||||
|
||||
case Majority23PrecommitAny:
|
||||
if event.Height == state.Height and event.Round == state.Round {
|
||||
timeout = TriggerTimeout { state.Height, state.Round, timeoutPrecommit(state.Round) }
|
||||
}
|
||||
return state, msg, timeout
|
||||
|
||||
case TimeoutPrecommit:
|
||||
if event.Height == state.Height and event.Round == state.Round {
|
||||
state.Round = state.Round + 1
|
||||
}
|
||||
return state, msg, timeout
|
||||
}
|
||||
}
|
||||
|
||||
func ConsensusExecutor() {
|
||||
proposal = nil
|
||||
votes = HeightVoteSet { Height: 1 }
|
||||
state = State {
|
||||
Height: 1
|
||||
Round: 0
|
||||
Step: RoundStepPropose
|
||||
LockedValue: nil
|
||||
LockedRound: -1
|
||||
ValidValue: nil
|
||||
ValidRound: -1
|
||||
}
|
||||
|
||||
event = EventNewHeight {1, id}
|
||||
state, msg, timeout = Consensus(event, state)
|
||||
|
||||
event = EventNewRound {state.Height, 0}
|
||||
state, msg, timeout = Consensus(event, state)
|
||||
|
||||
if msg != nil {
|
||||
send msg
|
||||
}
|
||||
|
||||
if timeout != nil {
|
||||
trigger timeout
|
||||
}
|
||||
|
||||
for {
|
||||
select {
|
||||
case message := <- msgCh:
|
||||
switch msg := message.(type) {
|
||||
case MessageProposal:
|
||||
|
||||
case MessageVote:
|
||||
if msg.Height == state.Height {
|
||||
newVote = votes.AddVote(msg)
|
||||
if newVote {
|
||||
switch msg.Type {
|
||||
case Prevote:
|
||||
prevotes = votes.Prevotes(msg.Round)
|
||||
if prevotes.WeakCertificate() and msg.Round > state.Round {
|
||||
event = EventNewRound { msg.Height, msg.Round }
|
||||
state, msg, timeout = Consensus(event, state)
|
||||
state = handleStateChange(state, msg, timeout)
|
||||
}
|
||||
|
||||
if blockID, ok = prevotes.TwoThirdsMajority(); ok and blockID != nil {
|
||||
if msg.Round == state.Round and hasBlock(blockID) {
|
||||
event = Majority23PrevotesBlock { msg.Height, msg.Round, blockID }
|
||||
state, msg, timeout = Consensus(event, state)
|
||||
state = handleStateChange(state, msg, timeout)
|
||||
}
|
||||
if proposal != nil and proposal.POLRound == msg.Round and hasBlock(blockID) {
|
||||
event = EventProposal {
|
||||
Height: state.Height
|
||||
Round: state.Round
|
||||
BlockID: blockID
|
||||
POLRound: proposal.POLRound
|
||||
Sender: message.Sender
|
||||
}
|
||||
state, msg, timeout = Consensus(event, state)
|
||||
state = handleStateChange(state, msg, timeout)
|
||||
}
|
||||
}
|
||||
|
||||
if prevotes.HasTwoThirdsAny() and msg.Round == state.Round {
|
||||
event = Majority23PrevotesAny { msg.Height, msg.Round }
|
||||
state, msg, timeout = Consensus(event, state)
|
||||
state = handleStateChange(state, msg, timeout)
|
||||
}
|
||||
|
||||
case Precommit:
|
||||
|
||||
}
|
||||
}
|
||||
}
|
||||
case timeout := <- timeoutCh:
|
||||
|
||||
case block := <- blockCh:
|
||||
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func handleStateChange(state, msg, timeout) State {
|
||||
if state.Step == RoundStepCommit {
|
||||
state = ExecuteBlock(state.LockedValue)
|
||||
}
|
||||
if msg != nil {
|
||||
send msg
|
||||
}
|
||||
if timeout != nil {
|
||||
trigger timeout
|
||||
}
|
||||
}
|
||||
|
||||
```
|
||||
|
||||
### Implementation roadmap
|
||||
|
||||
* implement proposed implementation
|
||||
* replace currently scattered calls in `ConsensusState` with calls to the new
|
||||
`Consensus` function
|
||||
* rename `ConsensusState` to `ConsensusExecutor` to avoid confusion
|
||||
* propose design for improved separation and clear information flow between
|
||||
`ConsensusExecutor` and `ConsensusReactor`
|
||||
|
||||
## Status
|
||||
|
||||
Draft.
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- isolated implementation of the algorithm
|
||||
- improved testability - simpler to prove correctness
|
||||
- clearer separation of concerns - easier to reason about
|
||||
|
||||
### Negative
|
||||
|
||||
### Neutral
|
||||
@@ -1,247 +0,0 @@
|
||||
# ADR 033: pubsub 2.0
|
||||
|
||||
Author: Anton Kaliaev (@melekes)
|
||||
|
||||
## Changelog
|
||||
|
||||
02-10-2018: Initial draft
|
||||
|
||||
16-01-2019: Second version based on our conversation with Jae
|
||||
|
||||
17-01-2019: Third version explaining how new design solves current issues
|
||||
|
||||
25-01-2019: Fourth version to treat buffered and unbuffered channels differently
|
||||
|
||||
## Context
|
||||
|
||||
Since the initial version of the pubsub, there have been a number of issues
|
||||
raised: [#951], [#1879], [#1880]. Some of them are high-level issues questioning the
|
||||
core design choices made. Others are minor and mostly about the interface of
|
||||
`Subscribe()` / `Publish()` functions.
|
||||
|
||||
### Sync vs Async
|
||||
|
||||
Now, when publishing a message to subscribers, we can do it in a goroutine:
|
||||
|
||||
_using channels for data transmission_
|
||||
```go
|
||||
for each subscriber {
|
||||
out := subscriber.outc
|
||||
go func() {
|
||||
out <- msg
|
||||
}()
|
||||
}
|
||||
```
|
||||
|
||||
_by invoking callback functions_
|
||||
```go
|
||||
for each subscriber {
|
||||
go subscriber.callbackFn()
|
||||
}
|
||||
```
|
||||
|
||||
This gives us greater performance and allows us to avoid the "slow client problem"
|
||||
(when other subscribers have to wait for a slow subscriber). A pool of
|
||||
goroutines can be used to avoid uncontrolled memory growth.
|
||||
|
||||
In certain cases, this is what you want. But in our case, because we need
|
||||
strict ordering of events (if event A was published before B, the guaranteed
|
||||
delivery order will be A -> B), we can't publish msg in a new goroutine every time.
|
||||
|
||||
We can also have a goroutine per subscriber, although we'd need to be careful
with the number of subscribers. It's also more difficult to implement, and it's
unclear whether we'd benefit from it (because we'd be forced to create N additional
channels to distribute msg to these goroutines).
|
||||
|
||||
### Non-blocking send
|
||||
|
||||
There is also the question of whether we should have a non-blocking send.
|
||||
Currently, sends are blocking, so publishing to one client can block on
|
||||
publishing to another. This means a slow or unresponsive client can halt the
|
||||
system. Instead, we can use a non-blocking send:
|
||||
|
||||
```go
|
||||
for each subscriber {
|
||||
out := subscriber.outc
|
||||
select {
|
||||
case out <- msg:
|
||||
default:
|
||||
log("subscriber %v buffer is full, skipping...")
|
||||
}
|
||||
}
|
||||
```
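The loop above can be turned into a small runnable function. This is only a sketch: `publishNonBlocking` and the `chan string` message type are assumptions for illustration, and the function reports skipped subscribers to the caller instead of logging.

```go
package main

// publishNonBlocking offers msg to every subscriber channel without waiting.
// It returns the indexes of the subscribers that were skipped because their
// buffer was full (or, for unbuffered channels, no receiver was ready).
func publishNonBlocking(subs []chan string, msg string) []int {
	var skipped []int
	for i, out := range subs {
		select {
		case out <- msg:
			// The message was handed over or buffered immediately.
		default:
			// Sending would block, so skip this subscriber.
			skipped = append(skipped, i)
		}
	}
	return skipped
}
```

Returning the skipped indexes makes the trade-off visible: the publisher never stalls, but the skipped subscribers have silently lost a message unless the caller acts on the result.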
|
||||
|
||||
This fixes the "slow client problem", but there is no way for a slow client to
|
||||
know if it had missed a message. We could return a second channel and close it
|
||||
to indicate subscription termination. On the other hand, if we're going to
|
||||
stick with blocking send, **devs must always ensure subscriber's handling code
|
||||
does not block**, which is a hard task to put on their shoulders.
|
||||
|
||||
An interim option is to run a pool of goroutines for a single message and wait for all
goroutines to finish. This would solve the "slow client problem", but we'd still
have to wait `max(goroutine_X_time)` before we can publish the next message.
|
||||
|
||||
### Channels vs Callbacks
|
||||
|
||||
Yet another question is whether we should use channels for message transmission or
|
||||
call subscriber-defined callback functions. Callback functions give subscribers
|
||||
more flexibility - you can use mutexes in there, channels, spawn goroutines,
|
||||
anything you really want. But they also carry local scope, which can result in
|
||||
memory leaks and/or memory usage increase.
|
||||
|
||||
Go channels are the de facto standard for carrying data between goroutines.
|
||||
|
||||
### Why does `Subscribe()` accept an `out` channel?
|
||||
|
||||
Because in our tests, we create buffered channels (cap: 1). Alternatively, we
|
||||
can make capacity an argument and return a channel.
|
||||
|
||||
## Decision
|
||||
|
||||
### MsgAndTags
|
||||
|
||||
Use a `MsgAndTags` struct on the subscription channel to indicate what tags the
|
||||
msg matched.
|
||||
|
||||
```go
|
||||
type MsgAndTags struct {
|
||||
Msg interface{}
|
||||
Tags TagMap
|
||||
}
|
||||
```
|
||||
|
||||
### Subscription Struct
|
||||
|
||||
|
||||
Change `Subscribe()` function to return a `Subscription` struct:
|
||||
|
||||
```go
|
||||
type Subscription struct {
|
||||
// private fields
|
||||
}
|
||||
|
||||
func (s *Subscription) Out() <-chan MsgAndTags
|
||||
func (s *Subscription) Canceled() <-chan struct{}
|
||||
func (s *Subscription) Err() error
|
||||
```
|
||||
|
||||
`Out()` returns a channel onto which messages and tags are published.
|
||||
`Unsubscribe`/`UnsubscribeAll` do not close the channel, so that clients do not
receive a nil message.
|
||||
|
||||
`Canceled()` returns a channel that's closed when the subscription is terminated
|
||||
and is meant to be used in a select statement.
|
||||
|
||||
If the channel returned by `Canceled()` is not closed yet, `Err()` returns nil.
|
||||
If the channel is closed, `Err()` returns a non-nil error explaining why:
|
||||
`ErrUnsubscribed` if the subscriber chooses to unsubscribe,
|
||||
`ErrOutOfCapacity` if the subscriber is not pulling messages fast enough and the channel returned by `Out()` became full.
|
||||
After `Err()` returns a non-nil error, successive calls to `Err()` return the same error.
|
||||
|
||||
```go
|
||||
subscription, err := pubsub.Subscribe(...)
|
||||
if err != nil {
|
||||
// ...
|
||||
}
|
||||
for {
|
||||
select {
|
||||
case msgAndTags := <-subscription.Out():
|
||||
// ...
|
||||
case <-subscription.Canceled():
|
||||
return subscription.Err()
|
||||
}
}
|
||||
```
|
||||
|
||||
### Capacity and Subscriptions
|
||||
|
||||
Make the `Out()` channel buffered (with capacity 1) by default. In most cases, we want to
|
||||
terminate the slow subscriber. Only in rare cases, we want to block the pubsub
|
||||
(e.g. when debugging consensus). This should lower the chances of the pubsub
|
||||
being frozen.
|
||||
|
||||
```go
|
||||
// outCap can be used to set capacity of Out channel
|
||||
// (1 by default, must be greater than 0).
|
||||
Subscribe(ctx context.Context, clientID string, query Query, outCap... int) (Subscription, error) {
|
||||
```
|
||||
|
||||
Use a different function for an unbuffered channel:
|
||||
|
||||
```go
|
||||
// Subscription uses an unbuffered channel. Publishing will block.
|
||||
SubscribeUnbuffered(ctx context.Context, clientID string, query Query) (Subscription, error) {
|
||||
```
|
||||
|
||||
SubscribeUnbuffered should not be exposed to users.
|
||||
|
||||
### Blocking/Nonblocking
|
||||
|
||||
The publisher should treat these kinds of channels separately.
|
||||
It should block on unbuffered channels (for use with internal consensus events
|
||||
in the consensus tests) and not block on the buffered ones. If a client is too
|
||||
slow to keep up with its messages, its subscription is terminated:
|
||||
|
||||
for each subscription {
|
||||
out := subscription.outChan
|
||||
if cap(out) == 0 {
|
||||
// block on unbuffered channel
|
||||
out <- msg
|
||||
} else {
|
||||
// don't block on buffered channels
|
||||
select {
|
||||
case out <- msg:
|
||||
default:
|
||||
// set the error, notify on the cancel chan
|
||||
subscription.err = fmt.Errorf("client is too slow for msg")
|
||||
close(subscription.cancelChan)
|
||||
|
||||
// ... unsubscribe and close out
|
||||
}
|
||||
}
|
||||
}
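A runnable version of this loop might look as follows. It is a sketch under stated assumptions: the `subscription` struct, `newSubscription` and `publish` are hypothetical names, messages are plain strings, and only `ErrOutOfCapacity` is taken from the text above.

```go
package main

import "errors"

// ErrOutOfCapacity uses the error name from this ADR. Everything else in
// this sketch is hypothetical.
var ErrOutOfCapacity = errors.New("client is not pulling messages fast enough")

type subscription struct {
	out      chan string
	canceled chan struct{}
	err      error
}

// newSubscription creates a subscription whose Out channel has the given
// capacity. A capacity of 0 yields an unbuffered (blocking) subscription.
func newSubscription(outCap int) *subscription {
	return &subscription{
		out:      make(chan string, outCap),
		canceled: make(chan struct{}),
	}
}

// publish blocks on unbuffered subscriptions and cancels buffered ones whose
// channel is full. It returns the subscriptions that remain active.
func publish(subs []*subscription, msg string) []*subscription {
	active := make([]*subscription, 0, len(subs))
	for _, s := range subs {
		if cap(s.out) == 0 {
			s.out <- msg // block until the receiver takes the message
			active = append(active, s)
			continue
		}
		select {
		case s.out <- msg:
			active = append(active, s)
		default:
			// Set the error before closing, so a reader that observes the
			// closed channel also observes the error.
			s.err = ErrOutOfCapacity
			close(s.canceled)
		}
	}
	return active
}
```

Returning the active subscriptions stands in for the "unsubscribe" step: a caller would replace its subscriber list with the returned slice.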
|
||||
|
||||
### How does this new design solve the current issues?
|
||||
|
||||
[#951] ([#1880]):
|
||||
|
||||
Because of the non-blocking send, a situation where we deadlock is no longer
possible. If the client stops reading messages, it will be removed.
|
||||
|
||||
[#1879]:
|
||||
|
||||
MsgAndTags is used now instead of a plain message.
|
||||
|
||||
### Future problems and their possible solutions
|
||||
|
||||
[#2826]
|
||||
|
||||
One question I am still pondering: how to prevent pubsub from slowing
|
||||
down consensus. We can increase the pubsub queue size (which is 0 now). Also,
|
||||
it's probably a good idea to limit the total number of subscribers.
|
||||
|
||||
This can be done automatically. Say we set queue size to 1000 and, when it's >=
|
||||
80% full, refuse new subscriptions.
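The admission rule could be sketched as a small predicate. The function name `acceptSubscription` and the handling of a zero capacity are assumptions; the 80% threshold is the example figure from the text.

```go
package main

// acceptSubscription reports whether a new subscriber may be added, given how
// many messages are queued and the queue's capacity. New subscriptions are
// refused once the queue is at least 80% full.
func acceptSubscription(queued, capacity int) bool {
	if capacity <= 0 {
		// Assumption: with no queue configured there is nothing to measure,
		// so subscriptions are not refused.
		return true
	}
	// Integer arithmetic avoids floating-point rounding at the boundary.
	return queued*100 < capacity*80
}
```

With a buffered channel `q` as the queue, the check would be `acceptSubscription(len(q), cap(q))`.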
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- more idiomatic interface
|
||||
- subscribers know what tags msg was published with
|
||||
- subscribers are aware of the reason their subscription was canceled
|
||||
|
||||
### Negative
|
||||
|
||||
- (since v1) no concurrency when it comes to publishing messages
|
||||
|
||||
### Neutral
|
||||
|
||||
|
||||
[#951]: https://github.com/tendermint/tendermint/issues/951
|
||||
[#1879]: https://github.com/tendermint/tendermint/issues/1879
|
||||
[#1880]: https://github.com/tendermint/tendermint/issues/1880
|
||||
[#2826]: https://github.com/tendermint/tendermint/issues/2826
|
||||
@@ -1,72 +0,0 @@
|
||||
# ADR 034: PrivValidator file structure
|
||||
|
||||
## Changelog
|
||||
|
||||
03-11-2018: Initial Draft
|
||||
|
||||
## Context
|
||||
|
||||
For now, the PrivValidator file `priv_validator.json` contains mutable and immutable parts.
|
||||
Even in an insecure mode which does not encrypt the private key on disk, it is reasonable to separate
|
||||
the mutable part and immutable part.
|
||||
|
||||
References:
|
||||
[#1181](https://github.com/tendermint/tendermint/issues/1181)
|
||||
[#2657](https://github.com/tendermint/tendermint/issues/2657)
|
||||
[#2313](https://github.com/tendermint/tendermint/issues/2313)
|
||||
|
||||
## Proposed Solution
|
||||
|
||||
We can split mutable and immutable parts with two structs:
|
||||
```go
|
||||
// FilePVKey stores the immutable part of PrivValidator
|
||||
type FilePVKey struct {
|
||||
Address types.Address `json:"address"`
|
||||
PubKey crypto.PubKey `json:"pub_key"`
|
||||
PrivKey crypto.PrivKey `json:"priv_key"`
|
||||
|
||||
filePath string
|
||||
}
|
||||
|
||||
// FilePVLastSignState stores the mutable part of PrivValidator
|
||||
type FilePVLastSignState struct {
|
||||
Height int64 `json:"height"`
|
||||
Round int `json:"round"`
|
||||
Step int8 `json:"step"`
|
||||
Signature []byte `json:"signature,omitempty"`
|
||||
SignBytes cmn.HexBytes `json:"signbytes,omitempty"`
|
||||
|
||||
filePath string
|
||||
mtx sync.Mutex
|
||||
}
|
||||
```
|
||||
|
||||
Then we can combine `FilePVKey` with `FilePVLastSignState` and will get the original `FilePV`.
|
||||
|
||||
```go
|
||||
type FilePV struct {
|
||||
Key FilePVKey
|
||||
LastSignState FilePVLastSignState
|
||||
}
|
||||
```
|
||||
|
||||
As discussed, `FilePVKey` should be located in `config`, and `FilePVLastSignState` should be stored in `data`. The
|
||||
store path of each file should be specified in `config.yml`.
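The effect of the split can be illustrated by serializing the two parts independently. This is a simplified sketch: the field types are replaced by plain strings and integers (the real structs use `types.Address`, `crypto.PubKey` and so on), and `encodeSplit` is a hypothetical helper, not part of the proposal.

```go
package main

import "encoding/json"

// Simplified stand-ins for the structs above, using basic types only.
type FilePVKey struct {
	Address string `json:"address"`
	PubKey  string `json:"pub_key"`
	PrivKey string `json:"priv_key"`
}

type FilePVLastSignState struct {
	Height int64 `json:"height"`
	Round  int   `json:"round"`
	Step   int8  `json:"step"`
}

type FilePV struct {
	Key           FilePVKey
	LastSignState FilePVLastSignState
}

// encodeSplit serializes the immutable and mutable parts separately, so the
// bytes destined for the key file do not change when the sign state advances.
func encodeSplit(pv FilePV) (keyJSON, stateJSON []byte, err error) {
	if keyJSON, err = json.Marshal(pv.Key); err != nil {
		return nil, nil, err
	}
	if stateJSON, err = json.Marshal(pv.LastSignState); err != nil {
		return nil, nil, err
	}
	return keyJSON, stateJSON, nil
}
```

Only the state bytes need to be rewritten after each signature, while the key bytes can be written once.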
|
||||
|
||||
What we need to do next is to change the methods of `FilePV`.
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- separate the mutable and immutable parts of PrivValidator
|
||||
|
||||
### Negative
|
||||
|
||||
- need to add more config for file path
|
||||
|
||||
### Neutral
|
||||
@@ -1,40 +0,0 @@
|
||||
# ADR 035: Documentation
|
||||
|
||||
Author: @zramsay (Zach Ramsay)
|
||||
|
||||
## Changelog
|
||||
|
||||
### November 2nd 2018
|
||||
|
||||
- initial write-up
|
||||
|
||||
## Context
|
||||
|
||||
The Tendermint documentation has undergone several changes before settling on the current model. Originally, the documentation was hosted on the website and had to be updated asynchronously from the code. Along with the other repositories requiring documentation, the whole stack moved to using Read The Docs to automatically generate, publish, and host the documentation. This, however, was insufficient; the RTD site had advertisements, it wasn't easily accessible to devs, didn't collect metrics, was another set of external links, etc.
|
||||
|
||||
## Decision
|
||||
|
||||
For two reasons, the decision was made to use VuePress:
|
||||
|
||||
1) ability to get metrics (implemented on both Tendermint and SDK)
|
||||
2) host the documentation on the website as a `/docs` endpoint.
|
||||
|
||||
This is done while maintaining synchrony between the docs and code, i.e., the website is built whenever the docs are updated.
|
||||
|
||||
## Status
|
||||
|
||||
The two points above have been implemented; the `config.js` has a Google Analytics identifier and the documentation workflow has been up and running largely without problems for several months. Details about the documentation build & workflow can be found [here](../DOCS_README.md)
|
||||
|
||||
## Consequences
|
||||
|
||||
Because of the organizational separation between Tendermint & Cosmos, there is a challenge of "what goes where" for certain aspects of documentation.
|
||||
|
||||
### Positive
|
||||
|
||||
This architecture is largely positive relative to prior docs arrangements.
|
||||
|
||||
### Negative
|
||||
|
||||
A significant portion of the docs automation / build process is in private repos with limited access/visibility to devs. However, these tasks are handled by the SRE team.
|
||||
|
||||
### Neutral
|
||||
@@ -1,38 +0,0 @@
|
||||
# ADR 036: Empty Blocks via ABCI
|
||||
|
||||
## Changelog
|
||||
|
||||
- {date}: {changelog}
|
||||
|
||||
## Context
|
||||
|
||||
> This section contains all the context one needs to understand the current state, and why there is a problem. It should be as succinct as possible and introduce the high level idea behind the solution.
|
||||
|
||||
## Decision
|
||||
|
||||
> This section explains all of the details of the proposed solution, including implementation details.
|
||||
> It should also describe affects / corollary items that may need to be changed as a part of this.
|
||||
> If the proposed change will be large, please also indicate a way to do the change to maximize ease of review.
|
||||
> (e.g. the optimal split of things to do between separate PR's)
|
||||
|
||||
## Status
|
||||
|
||||
> A decision may be "proposed" if it hasn't been agreed upon yet, or "accepted" once it is agreed upon. If a later ADR changes or reverses a decision, it may be marked as "deprecated" or "superseded" with a reference to its replacement.
|
||||
|
||||
{Deprecated|Proposed|Accepted|Declined}
|
||||
|
||||
## Consequences
|
||||
|
||||
> This section describes the consequences, after applying the decision. All consequences should be summarized here, not just the "positive" ones.
|
||||
|
||||
### Positive
|
||||
|
||||
### Negative
|
||||
|
||||
### Neutral
|
||||
|
||||
## References
|
||||
|
||||
> Are there any relevant PR comments, issues that led up to this, or articles referenced for why we made the given design choice? If so link them here!
|
||||
|
||||
- {reference link}
|
||||
@@ -1,100 +0,0 @@
|
||||
# ADR 037: Deliver Block
|
||||
|
||||
Author: Daniil Lashin (@danil-lashin)
|
||||
|
||||
## Changelog
|
||||
|
||||
13-03-2019: Initial draft
|
||||
|
||||
## Context
|
||||
|
||||
Initial conversation: https://github.com/tendermint/tendermint/issues/2901
|
||||
|
||||
Some applications can handle transactions in parallel, or at least some
part of tx processing can be parallelized. Currently a developer cannot
execute txs in parallel because Tendermint delivers them sequentially.
|
||||
|
||||
## Decision
|
||||
|
||||
Tendermint currently has `BeginBlock`, `EndBlock`, `Commit`, and `DeliverTx` steps
while executing a block. This doc proposes merging these steps into one `DeliverBlock`
step. It will allow application developers to decide how they want to
execute transactions (in parallel or sequentially). It will also simplify and
speed up communication between the application and Tendermint.
|
||||
|
||||
As @jaekwon [mentioned](https://github.com/tendermint/tendermint/issues/2901#issuecomment-477746128)
in the discussion, not all applications will benefit from this solution. In some cases,
when an application handles transactions sequentially, it may slow down the blockchain,
because the application needs to wait until the full block is transmitted before it can start
processing it. Also, a complete change of ABCI would force all apps
to change their implementation completely. That's why I propose to introduce one more ABCI
type.
|
||||
|
||||
### Implementation Changes
|
||||
|
||||
In addition to the default application interface, which currently has this structure
|
||||
|
||||
```go
|
||||
type Application interface {
|
||||
// Info and Mempool methods...
|
||||
|
||||
// Consensus Connection
|
||||
InitChain(RequestInitChain) ResponseInitChain // Initialize blockchain with validators and other info from TendermintCore
|
||||
BeginBlock(RequestBeginBlock) ResponseBeginBlock // Signals the beginning of a block
|
||||
DeliverTx(tx []byte) ResponseDeliverTx // Deliver a tx for full processing
|
||||
EndBlock(RequestEndBlock) ResponseEndBlock // Signals the end of a block, returns changes to the validator set
|
||||
Commit() ResponseCommit // Commit the state and return the application Merkle root hash
|
||||
}
|
||||
```
|
||||
|
||||
this doc proposes to add one more:
|
||||
|
||||
```go
|
||||
type Application interface {
|
||||
// Info and Mempool methods...
|
||||
|
||||
// Consensus Connection
|
||||
InitChain(RequestInitChain) ResponseInitChain // Initialize blockchain with validators and other info from TendermintCore
|
||||
DeliverBlock(RequestDeliverBlock) ResponseDeliverBlock // Deliver full block
|
||||
Commit() ResponseCommit // Commit the state and return the application Merkle root hash
|
||||
}
|
||||
|
||||
type RequestDeliverBlock struct {
|
||||
Hash []byte
|
||||
Header Header
|
||||
Txs Txs
|
||||
LastCommitInfo LastCommitInfo
|
||||
ByzantineValidators []Evidence
|
||||
}
|
||||
|
||||
type ResponseDeliverBlock struct {
|
||||
ValidatorUpdates []ValidatorUpdate
|
||||
ConsensusParamUpdates *ConsensusParams
|
||||
Tags []kv.Pair
|
||||
TxResults []ResponseDeliverTx
|
||||
}
|
||||
|
||||
```
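As an illustration of what an application could do inside `DeliverBlock`, the following sketch executes all transactions of a block concurrently and returns the results in block order. The names `deliverBlock`, `TxResult` and `execTx` are hypothetical, and the sketch assumes the transactions are independent of each other; an application with conflicting txs would need its own scheduling.

```go
package main

import "sync"

// TxResult is a hypothetical, reduced stand-in for ResponseDeliverTx.
type TxResult struct {
	Code uint32
	Log  string
}

// deliverBlock runs execTx for every transaction in its own goroutine and
// returns the results in the same order as the transactions in the block.
func deliverBlock(txs [][]byte, execTx func(tx []byte) TxResult) []TxResult {
	results := make([]TxResult, len(txs))
	var wg sync.WaitGroup
	for i, tx := range txs {
		wg.Add(1)
		go func(i int, tx []byte) {
			defer wg.Done()
			// Each goroutine writes only to its own slot, so no lock is needed.
			results[i] = execTx(tx)
		}(i, tx)
	}
	wg.Wait() // all results are complete before the block response is built
	return results
}
```

Writing into a pre-sized slice by index keeps the output deterministic even though the execution order is not.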
|
||||
|
||||
Also, we will need to add a new config param that specifies which kind of ABCI the application uses.
|
||||
For example, it can be `abci_type`. Then we will have 2 types:
|
||||
- `advanced` - current ABCI
|
||||
- `simple` - proposed implementation
|
||||
|
||||
## Status
|
||||
|
||||
In review
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- much simpler introduction and tutorials for new developers (instead of implementing 5 methods they
will need to implement only 3)
|
||||
- txs can be handled in parallel
|
||||
- simpler interface
|
||||
- faster communications between Tendermint and application
|
||||
|
||||
### Negative
|
||||
|
||||
- Tendermint should now support 2 kinds of ABCI
|
||||
@@ -1,38 +0,0 @@
|
||||
# ADR 038: Non-zero start height
|
||||
|
||||
## Changelog
|
||||
|
||||
- {date}: {changelog}
|
||||
|
||||
## Context
|
||||
|
||||
> This section contains all the context one needs to understand the current state, and why there is a problem. It should be as succinct as possible and introduce the high level idea behind the solution.
|
||||
|
||||
## Decision
|
||||
|
||||
> This section explains all of the details of the proposed solution, including implementation details.
|
||||
> It should also describe affects / corollary items that may need to be changed as a part of this.
|
||||
> If the proposed change will be large, please also indicate a way to do the change to maximize ease of review.
|
||||
> (e.g. the optimal split of things to do between separate PR's)
|
||||
|
||||
## Status
|
||||
|
||||
> A decision may be "proposed" if it hasn't been agreed upon yet, or "accepted" once it is agreed upon. If a later ADR changes or reverses a decision, it may be marked as "deprecated" or "superseded" with a reference to its replacement.
|
||||
|
||||
{Deprecated|Proposed|Accepted|Declined}
|
||||
|
||||
## Consequences
|
||||
|
||||
> This section describes the consequences, after applying the decision. All consequences should be summarized here, not just the "positive" ones.
|
||||
|
||||
### Positive
|
||||
|
||||
### Negative
|
||||
|
||||
### Neutral
|
||||
|
||||
## References
|
||||
|
||||
> Are there any relevant PR comments, issues that led up to this, or articles referenced for why we made the given design choice? If so link them here!
|
||||
|
||||
- {reference link}
|
||||
@@ -1,159 +0,0 @@
|
||||
# ADR 039: Peer Behaviour Interface
|
||||
|
||||
## Changelog
|
||||
* 07-03-2019: Initial draft
|
||||
* 14-03-2019: Updates from feedback
|
||||
|
||||
## Context
|
||||
|
||||
The responsibility for signaling and acting upon peer behaviour lacks a single
|
||||
owning component and is heavily coupled with the network stack[<sup>1</sup>](#references). Reactors
|
||||
maintain a reference to the `p2p.Switch` which they use to call
|
||||
`switch.StopPeerForError(...)` when a peer misbehaves and
|
||||
`switch.MarkPeerAsGood(...)` when a peer contributes in some meaningful way.
|
||||
While the switch handles `StopPeerForError` internally, the `MarkPeerAsGood`
|
||||
method delegates to another component, `p2p.AddrBook`. This scheme of delegation
|
||||
across the `Switch` obscures the responsibility for handling peer behaviour
|
||||
and ties up the reactors in a larger dependency graph when testing.
|
||||
|
||||
## Decision
|
||||
|
||||
Introduce a `PeerBehaviour` interface and concrete implementations which
|
||||
provide methods for reactors to signal peer behaviour without direct
|
||||
coupling to `p2p.Switch`. Introduce an `ErrorBehaviourPeer` type to provide
|
||||
concrete reasons for stopping peers. Introduce a `GoodBehaviourPeer` type to provide
|
||||
concrete ways in which a peer contributes.
|
||||
|
||||
### Implementation Changes
|
||||
|
||||
PeerBehaviour then becomes an interface for signaling peer errors as well
|
||||
as for marking peers as `good`.
|
||||
|
||||
```go
|
||||
type PeerBehaviour interface {
|
||||
Behaved(peer Peer, reason GoodBehaviourPeer)
|
||||
Errored(peer Peer, reason ErrorBehaviourPeer)
|
||||
}
|
||||
```
|
||||
|
||||
Instead of signaling peers to stop with arbitrary reasons:
|
||||
`reason interface{}`
|
||||
|
||||
We introduce a concrete error type ErrorBehaviourPeer:
|
||||
```go
|
||||
type ErrorBehaviourPeer int
|
||||
|
||||
const (
|
||||
ErrorBehaviourUnknown ErrorBehaviourPeer = iota
|
||||
ErrorBehaviourBadMessage
|
||||
ErrorBehaviourMessageOutofOrder
|
||||
...
|
||||
)
|
||||
```
|
||||
|
||||
To provide additional information on the ways a peer contributed, we introduce
|
||||
the GoodBehaviourPeer type.
|
||||
|
||||
```go
|
||||
type GoodBehaviourPeer int
|
||||
|
||||
const (
|
||||
GoodBehaviourVote GoodBehaviourPeer = iota
|
||||
GoodBehaviourBlockPart
|
||||
...
|
||||
)
|
||||
```
|
||||
|
||||
As a first iteration we provide a concrete implementation which wraps
|
||||
the switch:
|
||||
```go
|
||||
type SwitchedPeerBehaviour struct {
|
||||
sw *Switch
|
||||
}
|
||||
|
||||
func (spb *SwitchedPeerBehaviour) Errored(peer Peer, reason ErrorBehaviourPeer) {
|
||||
spb.sw.StopPeerForError(peer, reason)
|
||||
}
|
||||
|
||||
func (spb *SwitchedPeerBehaviour) Behaved(peer Peer, reason GoodBehaviourPeer) {
|
||||
spb.sw.MarkPeerAsGood(peer)
|
||||
}
|
||||
|
||||
func NewSwitchedPeerBehaviour(sw *Switch) *SwitchedPeerBehaviour {
|
||||
return &SwitchedPeerBehaviour{
|
||||
sw: sw,
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Reactors, which are often difficult to unit test[<sup>2</sup>](#references) could use an implementation which exposes the signals produced by the reactor in
|
||||
manufactured scenarios:
|
||||
|
||||
```go
|
||||
type ErrorBehaviours map[Peer][]ErrorBehaviourPeer
|
||||
type GoodBehaviours map[Peer][]GoodBehaviourPeer
|
||||
|
||||
type StorePeerBehaviour struct {
|
||||
eb ErrorBehaviours
|
||||
gb GoodBehaviours
|
||||
}
|
||||
|
||||
func NewStorePeerBehaviour() *StorePeerBehaviour {
|
||||
return &StorePeerBehaviour{
|
||||
eb: make(ErrorBehaviours),
|
||||
gb: make(GoodBehaviours),
|
||||
}
|
||||
}
|
||||
|
||||
func (spb *StorePeerBehaviour) Errored(peer Peer, reason ErrorBehaviourPeer) {
|
||||
if _, ok := spb.eb[peer]; !ok {
|
||||
spb.eb[peer] = []ErrorBehaviourPeer{reason}
|
||||
} else {
|
||||
spb.eb[peer] = append(spb.eb[peer], reason)
|
||||
}
|
||||
}
|
||||
|
||||
func (mpb *StorePeerBehaviour) GetErrored() ErrorBehaviours {
|
||||
return mpb.eb
|
||||
}
|
||||
|
||||
|
||||
func (spb *StorePeerBehaviour) Behaved(peer Peer, reason GoodBehaviourPeer) {
|
||||
if _, ok := spb.gb[peer]; !ok {
|
||||
spb.gb[peer] = []GoodBehaviourPeer{reason}
|
||||
} else {
|
||||
spb.gb[peer] = append(spb.gb[peer], reason)
|
||||
}
|
||||
}
|
||||
|
||||
func (spb *StorePeerBehaviour) GetBehaved() GoodBehaviours {
|
||||
return spb.gb
|
||||
}
|
||||
```
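To illustrate how a reactor test might use a store-style implementation, here is a minimal, self-contained sketch. It assumes a simplified `Peer` (a plain string rather than `p2p.Peer`), and the `recorder` and `toyReactor` types are hypothetical stand-ins, not part of the proposed API.

```go
package main

import "fmt"

// Peer is a stand-in for p2p.Peer; a string ID keeps the sketch self-contained.
type Peer string

type ErrorBehaviourPeer int

const (
	ErrorBehaviourUnknown ErrorBehaviourPeer = iota
	ErrorBehaviourBadMessage
)

type GoodBehaviourPeer int

const (
	GoodBehaviourVote GoodBehaviourPeer = iota
	GoodBehaviourBlockPart
)

type PeerBehaviour interface {
	Behaved(peer Peer, reason GoodBehaviourPeer)
	Errored(peer Peer, reason ErrorBehaviourPeer)
}

// recorder is a hypothetical store-style implementation used only in tests.
type recorder struct {
	errored map[Peer][]ErrorBehaviourPeer
	behaved map[Peer][]GoodBehaviourPeer
}

func newRecorder() *recorder {
	return &recorder{
		errored: make(map[Peer][]ErrorBehaviourPeer),
		behaved: make(map[Peer][]GoodBehaviourPeer),
	}
}

// append works on a nil slice, so no existence check is needed.
func (r *recorder) Errored(p Peer, reason ErrorBehaviourPeer) {
	r.errored[p] = append(r.errored[p], reason)
}

func (r *recorder) Behaved(p Peer, reason GoodBehaviourPeer) {
	r.behaved[p] = append(r.behaved[p], reason)
}

// toyReactor is a hypothetical reactor that depends only on the interface.
type toyReactor struct{ pb PeerBehaviour }

// Receive reports a bad message for empty payloads and a vote otherwise.
func (tr toyReactor) Receive(p Peer, payload []byte) {
	if len(payload) == 0 {
		tr.pb.Errored(p, ErrorBehaviourBadMessage)
		return
	}
	tr.pb.Behaved(p, GoodBehaviourVote)
}

func main() {
	rec := newRecorder()
	r := toyReactor{pb: rec}
	r.Receive("peer-a", nil)
	r.Receive("peer-b", []byte("vote"))
	fmt.Println(len(rec.errored["peer-a"]), len(rec.behaved["peer-b"]))
}
```

Because the reactor only sees the `PeerBehaviour` interface, the test can inspect the recorded signals without constructing a `Switch`.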
|
||||
|
||||
## Status
|
||||
|
||||
Accepted
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
* De-couple signaling from acting upon peer behaviour.
|
||||
* Reduce the coupling of reactors and the Switch and the network
|
||||
stack
|
||||
* The responsibility of managing peer behaviour can be migrated to
|
||||
a single component instead of split between the switch and the
|
||||
address book.
|
||||
|
||||
### Negative
|
||||
|
||||
* The first iteration will simply wrap the Switch and introduce a
|
||||
level of indirection.
|
||||
|
||||
### Neutral
|
||||
|
||||
## References
|
||||
|
||||
1. Issue [#2067](https://github.com/tendermint/tendermint/issues/2067): P2P Refactor
|
||||
2. PR: [#3506](https://github.com/tendermint/tendermint/pull/3506): ADR 036: Blockchain Reactor Refactor
|
||||
@@ -1,534 +0,0 @@
|
||||
# ADR 040: Blockchain Reactor Refactor
|
||||
|
||||
## Changelog
|
||||
|
||||
19-03-2019: Initial draft
|
||||
|
||||
## Context
|
||||
|
||||
The Blockchain Reactor's high level responsibility is to enable peers who are far behind the current state of the
|
||||
blockchain to quickly catch up by downloading many blocks in parallel from its peers, verifying block correctness, and
|
||||
executing them against the ABCI application. We call the protocol executed by the Blockchain Reactor `fast-sync`.
|
||||
The current architecture diagram of the blockchain reactor can be found here:
|
||||
|
||||

|
||||
|
||||
The current architecture consists of dozens of routines and depends tightly on the `Switch`, making writing
|
||||
unit tests almost impossible. Current tests require setting up complex dependency graphs and dealing with concurrency.
|
||||
Note that having dozens of routines is overkill in this case, as most of the time the routines sit idle waiting for
|
||||
something to happen (a message to arrive or a timeout to expire). Due to the dependency on the `Switch`, testing relatively
|
||||
complex network scenarios and failures (for example, adding and removing peers) is a very complex task and frequently leads
|
||||
to complex tests with non-deterministic behavior ([#3400]). The inability to write proper tests makes confidence in
|
||||
the code low, and this has resulted in several issues (some have since been fixed and some are still open):
|
||||
[#3400], [#2897], [#2896], [#2699], [#2888], [#2457], [#2622], [#2026].
|
||||
|
||||
## Decision
|
||||
|
||||
To remedy these issues we plan a major refactor of the blockchain reactor. The proposed architecture is largely inspired
|
||||
by ADR-30 and is presented in the following diagram:
|
||||

|
||||
|
||||
We suggest a concurrency architecture where the core algorithm (we call it `Controller`) is extracted into a finite
|
||||
state machine. The active routine of the reactor is called `Executor` and is responsible for receiving and sending
|
||||
messages from/to peers and triggering timeouts. What messages should be sent and timeouts triggered is determined mostly
|
||||
by the `Controller`. The exception is the `Peer Heartbeat` mechanism, which is the responsibility of the `Executor`. The heartbeat
|
||||
mechanism is used to remove slow and unresponsive peers from the peer list. Writing unit tests is simpler with
|
||||
this architecture as most of the critical logic is part of the `Controller` function. We expect that the simpler concurrency
|
||||
architecture will not have a significant negative effect on the performance of this reactor (to be confirmed by
|
||||
experimental evaluation).
|
||||
|
||||
|
||||
### Implementation changes
|
||||
|
||||
We assume the following system model for the "fast sync" protocol:
|
||||
|
||||
* a node is connected to a random subset of all nodes that represents its peer set. Some nodes are correct and some
|
||||
might be faulty. We don't make assumptions about the ratio of faulty nodes, i.e., it is possible that all nodes in some
|
||||
peer set are faulty.
|
||||
* we assume that communication between correct nodes is synchronous, i.e., if a correct node `p` sends a message `m` to
|
||||
a correct node `q` at time `t`, then `q` will receive the message at time `t+Delta` at the latest, where `Delta` is a system
|
||||
parameter that is known by network participants. `Delta` is normally chosen to be an order of magnitude higher than
|
||||
the maximum real communication delay between correct nodes. Therefore if a correct node `p` sends a request message
|
||||
to a correct node `q` at time `t` and there is no corresponding reply by time `t + 2*Delta`, then `p` can assume
|
||||
that `q` is faulty. Note that the network assumptions for the consensus reactor are different (we assume a partially
|
||||
synchronous model there).
|
||||
|
||||
The requirements for the "fast sync" protocol are formally specified as follows:
|
||||
|
||||
- `Correctness`: If a correct node `p` is connected to a correct node `q` for a long enough period of time, then `p`
|
||||
  will eventually download all requested blocks from `q`.
|
||||
- `Termination`: If a set of peers of a correct node `p` is stable (no new nodes are added to the peer set of `p`) for
|
||||
  a long enough period of time, then the protocol eventually terminates.
|
||||
- `Fairness`: A correct node `p` sends requests for blocks to all peers from its peer set.
|
||||
|
||||
As explained above, the `Executor` is responsible for sending and receiving messages that are part of the `fast-sync`
|
||||
protocol. The following messages are exchanged as part of the `fast-sync` protocol:
|
||||
|
||||
``` go
|
||||
type Message int
|
||||
const (
|
||||
MessageUnknown Message = iota
|
||||
MessageStatusRequest
|
||||
MessageStatusResponse
|
||||
MessageBlockRequest
|
||||
MessageBlockResponse
|
||||
)
|
||||
```
|
||||
`MessageStatusRequest` is sent periodically to all peers as a request for a peer to provide its current height. It is
|
||||
part of the `Peer Heartbeat` mechanism and a failure to respond in time to this message results in a peer being removed
|
||||
from the peer set. Note that the `Peer Heartbeat` mechanism is used only while a peer is in `fast-sync` mode. We assume
|
||||
here the existence of a mechanism that allows a node to inform its peers that it is in `fast-sync` mode.
|
||||
|
||||
``` go
|
||||
type MessageStatusRequest struct {
|
||||
SeqNum int64 // sequence number of the request
|
||||
}
|
||||
```
|
||||
`MessageStatusResponse` is sent as a response to `MessageStatusRequest` to inform the requester about the peer's current
|
||||
height.
|
||||
|
||||
``` go
|
||||
type MessageStatusResponse struct {
|
||||
SeqNum int64 // sequence number of the corresponding request
|
||||
Height int64 // current peer height
|
||||
}
|
||||
```
|
||||
|
||||
`MessageBlockRequest` is used to make a request for a block and the corresponding commit certificate at a given height.
|
||||
|
||||
``` go
|
||||
type MessageBlockRequest struct {
|
||||
Height int64
|
||||
}
|
||||
```
|
||||
|
||||
`MessageBlockResponse` is a response for the corresponding block request. In addition to providing the block and the
|
||||
corresponding commit certificate, it also contains the peer's current height.
|
||||
|
||||
``` go
|
||||
type MessageBlockResponse struct {
|
||||
Height int64
|
||||
Block Block
|
||||
Commit Commit
|
||||
PeerHeight int64
|
||||
}
|
||||
```
|
||||
|
||||
In addition to sending and receiving messages and running the `Peer Heartbeat` mechanism, the `Executor` also manages timeouts
|
||||
that are triggered at the request of the `Controller`. The `Controller` is then informed once a timeout expires.
|
||||
|
||||
``` go
|
||||
type TimeoutTrigger int
|
||||
const (
|
||||
TimeoutUnknown TimeoutTrigger = iota
|
||||
TimeoutResponseTrigger
|
||||
TimeoutTerminationTrigger
|
||||
)
|
||||
```
|
||||
|
||||
The `Controller` can be modelled as a function with clearly defined inputs:
|
||||
|
||||
* `State` - current state of the node. Contains data about connected peers and their behavior, pending requests,
|
||||
  received blocks, etc.
|
||||
* `Event` - significant events in the network.
|
||||
|
||||
producing clear outputs:
|
||||
|
||||
* `State` - updated state of the node,
|
||||
* `MessageToSend` - signal what message to send and to which peer
|
||||
* `TimeoutTrigger` - signal that timeout should be triggered.
|
||||
|
||||
|
||||
We consider the following `Event` types:
|
||||
|
||||
``` go
|
||||
type Event int
|
||||
const (
|
||||
EventUnknown Event = iota
|
||||
EventStatusResponse
|
||||
EventBlockRequest
|
||||
EventBlockResponse
|
||||
EventRemovePeer
|
||||
EventTimeoutResponse
|
||||
EventTimeoutTermination
|
||||
)
|
||||
```
|
||||
|
||||
`EventStatusResponse` event is generated once `MessageStatusResponse` is received by the `Executor`.
|
||||
|
||||
``` go
|
||||
type EventStatusResponse struct {
|
||||
PeerID ID
|
||||
Height int64
|
||||
}
|
||||
```
|
||||
|
||||
`EventBlockRequest` event is generated once `MessageBlockRequest` is received by the `Executor`.
|
||||
|
||||
``` go
|
||||
type EventBlockRequest struct {
|
||||
Height int64
|
||||
PeerID ID
|
||||
}
|
||||
```
|
||||
`EventBlockResponse` event is generated upon reception of `MessageBlockResponse` message by the `Executor`.
|
||||
|
||||
``` go
|
||||
type EventBlockResponse struct {
|
||||
Height int64
|
||||
Block Block
|
||||
Commit Commit
|
||||
PeerID ID
|
||||
PeerHeight int64
|
||||
}
|
||||
```
|
||||
`EventRemovePeer` is generated by `Executor` to signal that the connection to a peer is closed due to peer misbehavior.
|
||||
|
||||
``` go
|
||||
type EventRemovePeer struct {
|
||||
PeerID ID
|
||||
}
|
||||
```
|
||||
`EventTimeoutResponse` is generated by `Executor` to signal that a timeout triggered by `TimeoutResponseTrigger` has
|
||||
expired.
|
||||
|
||||
``` go
|
||||
type EventTimeoutResponse struct {
|
||||
PeerID ID
|
||||
Height int64
|
||||
}
|
||||
```
|
||||
`EventTimeoutTermination` is generated by `Executor` to signal that a timeout triggered by `TimeoutTerminationTrigger`
|
||||
has expired.
|
||||
|
||||
``` go
|
||||
type EventTimeoutTermination struct {
|
||||
Height int64
|
||||
}
|
||||
```
|
||||
|
||||
`MessageToSend` is just a wrapper around the `Message` type that also contains the ID of the peer to which the message should be sent.
|
||||
|
||||
``` go
|
||||
type MessageToSend struct {
|
||||
PeerID ID
|
||||
Message Message
|
||||
}
|
||||
```
|
||||
|
||||
The Controller state machine can be in two modes: `ModeFastSync` when
|
||||
a node is trying to catch up with the network by downloading committed blocks,
|
||||
and `ModeConsensus` in which it executes the Tendermint consensus protocol. We
|
||||
consider that `fast sync` mode terminates once the Controller switches to
|
||||
`ModeConsensus`.
|
||||
|
||||
``` go
|
||||
type Mode int
|
||||
const (
|
||||
ModeUnknown Mode = iota
|
||||
ModeFastSync
|
||||
ModeConsensus
|
||||
)
|
||||
```
|
||||
The `Controller` manages the following state:
|
||||
|
||||
``` go
|
||||
type ControllerState struct {
|
||||
Height int64 // the first block that is not committed
|
||||
Mode Mode // mode of operation
|
||||
PeerMap map[ID]PeerStats // map of peer IDs to peer statistics
|
||||
MaxRequestPending int64 // maximum height of the pending requests
|
||||
FailedRequests []int64 // list of failed block requests
|
||||
PendingRequestsNum int // total number of pending requests
|
||||
Store []BlockInfo // contains list of downloaded blocks
|
||||
Executor BlockExecutor // store, verify and executes blocks
|
||||
}
|
||||
```
|
||||
|
||||
The `PeerStats` data structure keeps, for every peer, its current height and the height of its pending block request (`-1` if there is none).
|
||||
|
||||
``` go
|
||||
type PeerStats struct {
|
||||
Height int64
|
||||
PendingRequest int64 // a request sent to this peer
|
||||
}
|
||||
```
|
||||
|
||||
The `BlockInfo` data structure is used to store information (as part of the block store) about downloaded blocks: from which peer
|
||||
a block and the corresponding commit certificate are received.
|
||||
``` go
|
||||
type BlockInfo struct {
|
||||
Block Block
|
||||
Commit Commit
|
||||
PeerID ID // a peer from which we received the corresponding Block and Commit
|
||||
}
|
||||
```
|
||||
|
||||
The `Controller` is initialized by providing an initial height (`startHeight`) from which it will start downloading
|
||||
blocks from peers and the current state of the `BlockExecutor`.
|
||||
|
||||
``` go
|
||||
func NewControllerState(startHeight int64, executor BlockExecutor) ControllerState {
|
||||
state := ControllerState{}
|
||||
state.Height = startHeight
|
||||
state.Mode = ModeFastSync
|
||||
state.MaxRequestPending = startHeight - 1
|
||||
state.PendingRequestsNum = 0
|
||||
state.Executor = executor
|
||||
// initialize the remaining fields to empty data structures
state.PeerMap = make(map[ID]PeerStats)
state.FailedRequests = []int64{}
state.Store = []BlockInfo{}
|
||||
return state
|
||||
}
|
||||
```
|
||||
|
||||
The core protocol logic is given by the following function:
|
||||
|
||||
``` go
|
||||
func handleEvent(state ControllerState, event Event) (ControllerState, MessageToSend, TimeoutTrigger, Error) {
|
||||
msg = nil
|
||||
timeout = nil
|
||||
error = nil
|
||||
|
||||
switch state.Mode {
|
||||
case ModeConsensus:
|
||||
switch event := event.(type) {
|
||||
case EventBlockRequest:
|
||||
msg = createBlockResponseMessage(state, event)
|
||||
return state, msg, timeout, error
|
||||
default:
|
||||
error = "Only respond to BlockRequests while in ModeConsensus!"
|
||||
return state, msg, timeout, error
|
||||
}
|
||||
|
||||
case ModeFastSync:
|
||||
switch event := event.(type) {
|
||||
case EventBlockRequest:
|
||||
msg = createBlockResponseMessage(state, event)
|
||||
return state, msg, timeout, error
|
||||
|
||||
case EventStatusResponse:
|
||||
return handleEventStatusResponse(event, state)
|
||||
|
||||
case EventRemovePeer:
|
||||
return handleEventRemovePeer(event, state)
|
||||
|
||||
case EventBlockResponse:
|
||||
return handleEventBlockResponse(event, state)
|
||||
|
||||
case EventTimeoutResponse:
|
||||
return handleEventResponseTimeout(event, state)
|
||||
|
||||
case EventTimeoutTermination:
|
||||
// Termination timeout is triggered in case of empty peer set and in case there are no pending requests.
|
||||
// If this timeout expires and in the meantime no new peers are added or new pending requests are made
|
||||
// then `fast-sync` mode terminates by switching to `ModeConsensus`.
|
||||
// Note that termination timeout should be higher than the response timeout.
|
||||
if state.Height == event.Height && state.PendingRequestsNum == 0 { state.Mode = ModeConsensus }
|
||||
return state, msg, timeout, error
|
||||
|
||||
default:
|
||||
error = "Received unknown event type!"
|
||||
return state, msg, timeout, error
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
``` go
func createBlockResponseMessage(state ControllerState, event EventBlockRequest) MessageToSend {
	msgToSend = nil
	// peers we have not seen before start with no recorded height and no pending request
	peerStats, ok := state.PeerMap[event.PeerID]
	if !ok { peerStats = PeerStats{-1, -1} }
	if state.Executor.ContainsBlockWithHeight(event.Height) && event.Height > peerStats.Height {
		peerStats.Height = event.Height
		msg = MessageBlockResponse{
			Height:     event.Height,
			Block:      state.Executor.getBlock(event.Height),
			Commit:     state.Executor.getCommit(event.Height),
			PeerHeight: state.Height - 1,
		}
		msgToSend = MessageToSend { event.PeerID, msg }
	}
	state.PeerMap[event.PeerID] = peerStats
	return msgToSend
}
```
|
||||
|
||||
``` go
|
||||
func handleEventStatusResponse(event EventStatusResponse, state ControllerState) (ControllerState, MessageToSend, TimeoutTrigger, Error) {
|
||||
if _, ok := state.PeerMap[event.PeerID]; !ok {
|
||||
peerStats = PeerStats{ -1, -1 }
|
||||
} else {
|
||||
peerStats = state.PeerMap[event.PeerID]
|
||||
}
|
||||
|
||||
if event.Height > peerStats.Height { peerStats.Height = event.Height }
|
||||
// if there are no pending requests for this peer, try to send him a request for block
|
||||
if peerStats.PendingRequest == -1 {
|
||||
msg = createBlockRequestMessage(state, event.PeerID, peerStats.Height)
|
||||
// msg is nil if no request for block can be made to a peer at this point in time
|
||||
if msg != nil {
|
||||
peerStats.PendingRequest = msg.Height
|
||||
state.PendingRequestsNum++
|
||||
// when a request for a block is sent to a peer, a response timeout is triggered. If no corresponding block is sent by the peer
|
||||
// during response timeout period, then the peer is considered faulty and is removed from the peer set.
|
||||
timeout = ResponseTimeoutTrigger{ msg.PeerID, msg.Height, PeerTimeout }
|
||||
} else if state.PendingRequestsNum == 0 {
|
||||
// if there are no pending requests and no new request can be placed to the peer, termination timeout is triggered.
|
||||
// If termination timeout expires and we are still at the same height and there are no pending requests, the "fast-sync"
|
||||
// mode is finished and we switch to `ModeConsensus`.
|
||||
timeout = TerminationTimeoutTrigger{ state.Height, TerminationTimeout }
|
||||
}
|
||||
}
|
||||
state.PeerMap[event.PeerID] = peerStats
|
||||
return state, msg, timeout, error
|
||||
}
|
||||
```
|
||||
|
||||
``` go
|
||||
func handleEventRemovePeer(event EventRemovePeer, state ControllerState) (ControllerState, MessageToSend, TimeoutTrigger, Error) {
|
||||
if _, ok := state.PeerMap[event.PeerID]; ok {
|
||||
pendingRequest = state.PeerMap[event.PeerID].PendingRequest
|
||||
// if a peer is removed from the peer set, its pending request is declared failed and added to the `FailedRequests` list
|
||||
// so it can be retried.
|
||||
if pendingRequest != -1 {
add(state.FailedRequests, pendingRequest)
// only a peer with an outstanding request reduces the pending count
state.PendingRequestsNum--
}
|
||||
delete(state.PeerMap, event.PeerID)
|
||||
// if the peer set is empty after removal of this peer then termination timeout is triggered.
|
||||
if state.PeerMap.isEmpty() {
|
||||
timeout = TerminationTimeoutTrigger{ state.Height, TerminationTimeout }
|
||||
}
|
||||
} else { error = "Removing unknown peer!" }
|
||||
return state, msg, timeout, error
}
|
||||
```
|
||||
|
||||
``` go
|
||||
func handleEventBlockResponse(event EventBlockResponse, state ControllerState) (ControllerState, MessageToSend, TimeoutTrigger, Error) {
|
||||
if _, ok := state.PeerMap[event.PeerID]; ok {
|
||||
peerStats = state.PeerMap[event.PeerID]
|
||||
// when expected block arrives from a peer, it is added to the store so it can be verified and if correct executed after.
|
||||
if peerStats.PendingRequest == event.Height {
|
||||
peerStats.PendingRequest = -1
|
||||
state.PendingRequestsNum--
|
||||
if event.PeerHeight > peerStats.Height { peerStats.Height = event.PeerHeight }
|
||||
state.Store[event.Height] = BlockInfo{ event.Block, event.Commit, event.PeerID }
|
||||
// blocks are verified sequentially so adding a block to the store does not mean that it will be immediately verified
|
||||
// as some of the previous blocks might be missing.
|
||||
state = verifyBlocks(state) // it can lead to event.PeerID being removed from peer list
|
||||
if _, ok := state.PeerMap[event.PeerID]; ok {
|
||||
// we try to identify new request for a block that can be asked to the peer
|
||||
msg = createBlockRequestMessage(state, event.PeerID, peerStats.Height)
|
||||
if msg != nil {
|
||||
peerStats.PendingRequest = msg.Height
|
||||
state.PendingRequestsNum++
|
||||
// if request for block is made, response timeout is triggered
|
||||
timeout = ResponseTimeoutTrigger{ msg.PeerID, msg.Height, PeerTimeout }
|
||||
} else if state.PeerMap.isEmpty() || state.PendingRequestsNum == 0 {
|
||||
// if the peer map is empty (the peer can be removed as block verification failed) or there are no pending requests
|
||||
// termination timeout is triggered.
|
||||
timeout = TerminationTimeoutTrigger{ state.Height, TerminationTimeout }
|
||||
}
|
||||
}
|
||||
} else { error = "Received Block from wrong peer!" }
|
||||
} else { error = "Received Block from unknown peer!" }
|
||||
|
||||
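// only write back stats for a peer that is still in the peer set (it may be unknown or removed by verifyBlocks)
if _, ok := state.PeerMap[event.PeerID]; ok { state.PeerMap[event.PeerID] = peerStats }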
|
||||
return state, msg, timeout, error
|
||||
}
|
||||
```
|
||||
|
||||
``` go
|
||||
func handleEventResponseTimeout(event EventTimeoutResponse, state ControllerState) (ControllerState, MessageToSend, TimeoutTrigger, Error) {
|
||||
if _, ok := state.PeerMap[event.PeerID]; ok {
|
||||
peerStats = state.PeerMap[event.PeerID]
|
||||
// if a response timeout expires and the peer hasn't delivered the block, the peer is removed from the peer list and
|
||||
// the request is added to the `FailedRequests` so the block can be downloaded from other peer
|
||||
if peerStats.PendingRequest == event.Height {
|
||||
add(state.FailedRequests, peerStats.PendingRequest)
|
||||
delete(state.PeerMap, event.PeerID)
|
||||
state.PendingRequestsNum--
|
||||
// if peer set is empty, then termination timeout is triggered
|
||||
if state.PeerMap.isEmpty() {
|
||||
timeout = TerminationTimeoutTrigger{ state.Height, TerminationTimeout }
|
||||
}
|
||||
}
|
||||
}
|
||||
return state, msg, timeout, error
|
||||
}
|
||||
```
|
||||
|
||||
``` go
|
||||
func createBlockRequestMessage(state ControllerState, peerID ID, peerHeight int64) MessageToSend {
|
||||
msg = nil
|
||||
blockHeight = -1
|
||||
r = find request in state.FailedRequests such that r <= peerHeight // returns `nil` if there are no such request
|
||||
// if there is a height in failed requests that can be downloaded from the peer send request to it
|
||||
if r != nil {
|
||||
blockHeight = r
|
||||
delete(state.FailedRequests, r)
|
||||
} else if state.MaxRequestPending < peerHeight {
|
||||
// if height of the maximum pending request is smaller than peer height, then ask peer for next block
|
||||
state.MaxRequestPending++
|
||||
blockHeight = state.MaxRequestPending // increment state.MaxRequestPending and then return the new value
|
||||
}
|
||||
|
||||
if blockHeight > -1 { msg = MessageToSend { peerID, MessageBlockRequest { blockHeight } } }
|
||||
return msg
|
||||
}
|
||||
```
|
||||
|
||||
``` go
|
||||
func verifyBlocks(state ControllerState) ControllerState {
|
||||
done = false
|
||||
for !done {
|
||||
block = state.Store[state.Height]
|
||||
if block != nil {
|
||||
verified = verify block.Block using block.Commit // returns `true` if verification succeeds, `false` otherwise
|
||||
|
||||
if verified {
|
||||
block.Execute() // executing block is costly operation so it might make sense executing asynchronously
|
||||
state.Height++
|
||||
} else {
|
||||
// if block verification failed, then it is added to `FailedRequests` and the peer is removed from the peer set
|
||||
add(state.FailedRequests, state.Height)
|
||||
state.Store[state.Height] = nil
|
||||
if _, ok := state.PeerMap[block.PeerID]; ok {
|
||||
pendingRequest = state.PeerMap[block.PeerID].PendingRequest
|
||||
// if there is a pending request sent to the peer that is just to be removed from the peer set, add it to `FailedRequests`
|
||||
if pendingRequest != -1 {
|
||||
add(state.FailedRequests, pendingRequest)
|
||||
state.PendingRequestsNum--
|
||||
}
|
||||
delete(state.PeerMap, block.PeerID)
|
||||
}
|
||||
done = true
|
||||
}
|
||||
} else { done = true }
|
||||
}
|
||||
return state
|
||||
}
|
||||
```
|
||||
|
||||
In the proposed architecture the `Controller` is not an active task, i.e., it is called by the `Executor`. Depending on
|
||||
the values returned by the `Controller`, the `Executor` will send a message to some peer (`msg` != nil), trigger a
|
||||
timeout (`timeout` != nil) or deal with errors (`error` != nil).
|
||||
When a timeout is triggered, the `Executor` will provide the corresponding timeout event as input to the `Controller` once
|
||||
the timeout expires.
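The interaction between the two components can be sketched as follows. This is a simplified illustration, not the proposed implementation: the `executor` struct, the `controller` function type, the string-based `Event`, and `toyController` are all hypothetical, and the sketch records outputs in slices instead of actually sending messages or starting timers.

```go
package main

import "fmt"

// Simplified stand-ins for the types described above.
type Event struct {
	Kind   string
	PeerID string
}

type MessageToSend struct {
	PeerID string
	Body   string
}

type TimeoutTrigger struct{ Kind string }

// controller is a pure step function: state and event in, new state and signals out.
type controller func(state int, ev Event) (int, *MessageToSend, *TimeoutTrigger, error)

// executor owns the state and is the only active component.
type executor struct {
	state    int
	handle   controller
	sent     []MessageToSend
	timeouts []TimeoutTrigger
	errs     []error
}

// step feeds one event to the controller and dispatches whatever it returns.
func (e *executor) step(ev Event) {
	next, msg, timeout, err := e.handle(e.state, ev)
	e.state = next
	if msg != nil {
		e.sent = append(e.sent, *msg) // a real executor would send to the peer
	}
	if timeout != nil {
		e.timeouts = append(e.timeouts, *timeout) // a real executor would start a timer
	}
	if err != nil {
		e.errs = append(e.errs, err)
	}
}

// toyController requests a block on every status event and rejects anything else.
func toyController(state int, ev Event) (int, *MessageToSend, *TimeoutTrigger, error) {
	switch ev.Kind {
	case "status":
		msg := &MessageToSend{PeerID: ev.PeerID, Body: "block-request"}
		return state + 1, msg, &TimeoutTrigger{Kind: "response"}, nil
	default:
		return state, nil, nil, fmt.Errorf("unknown event %q", ev.Kind)
	}
}

func main() {
	e := &executor{handle: toyController}
	e.step(Event{Kind: "status", PeerID: "p1"})
	e.step(Event{Kind: "bogus", PeerID: "p1"})
	fmt.Println(e.state, len(e.sent), len(e.timeouts), len(e.errs))
}
```

Keeping the controller free of I/O is what allows it to be unit tested by feeding it events and asserting on the returned values.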
|
||||
|
||||
|
||||
## Status
|
||||
|
||||
Draft.
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- isolated implementation of the algorithm
|
||||
- improved testability - simpler to prove correctness
|
||||
- clearer separation of concerns - easier to reason
|
||||
|
||||
### Negative
|
||||
|
||||
### Neutral
|
||||
@@ -1,29 +0,0 @@
|
||||
# ADR 041: Application should be in charge of validator set
|
||||
|
||||
## Changelog
|
||||
|
||||
|
||||
## Context
|
||||
|
||||
Currently Tendermint is in charge of the validator set and proposer selection. The application can only update the validator set at EndBlock time.
|
||||
To support the light client, the application should make sure that at least 2/3 of the validators are the same in each round.
|
||||
|
||||
The application should have full control over validator set changes and proposer selection. In each round, the application can provide the ordered list of validators for the next rounds, along with their power. The proposer is the first in the list; if the proposer is offline, the next one can propose, and so on.
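The fallback rule described above can be sketched as follows. This is a minimal illustration only: the `Validator` type, the `pickProposer` helper, and the `online` predicate are hypothetical names and are not part of Tendermint or ABCI.

```go
package main

import "fmt"

// Validator is a hypothetical entry in the application-provided ordered list.
type Validator struct {
	Address string
	Power   int64
}

// pickProposer walks the ordered list and returns the first validator that is
// online, falling back to the next entry when an earlier one is unavailable.
func pickProposer(ordered []Validator, online func(Validator) bool) (Validator, bool) {
	for _, v := range ordered {
		if online(v) {
			return v, true
		}
	}
	return Validator{}, false
}

func main() {
	vals := []Validator{{"a", 10}, {"b", 5}, {"c", 1}}
	// Pretend the first validator is offline.
	p, ok := pickProposer(vals, func(v Validator) bool { return v.Address != "a" })
	fmt.Println(p.Address, ok)
}
```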
|
||||
|
||||
## Decision
|
||||
|
||||
## Status
|
||||
|
||||
## Consequences
|
||||
|
||||
Tendermint is no longer in charge of the validator set and its changes. The application should provide the correct information.
|
||||
However, Tendermint can provide a pseudo-randomness algorithm to help the application select the proposer in each round.
|
||||
|
||||
### Positive
|
||||
|
||||
### Negative
|
||||
|
||||
### Neutral
|
||||
|
||||
## References
|
||||
|
||||
@@ -1,235 +0,0 @@
|
||||
# ADR 042: State Sync Design
|
||||
|
||||
## Changelog
|
||||
|
||||
2019-06-27: Init by EB
|
||||
2019-07-04: Follow up by brapse
|
||||
|
||||
## Context
|
||||
StateSync is a feature which would allow a new node to receive a
|
||||
snapshot of the application state without downloading blocks or going
|
||||
through consensus. Once downloaded, the node could switch to FastSync
|
||||
and eventually participate in consensus. The goal of StateSync is to
|
||||
facilitate setting up a new node as quickly as possible.
|
||||
|
||||
## Considerations
|
||||
Because Tendermint doesn't know anything about the application state,
|
||||
StateSync will broker messages between nodes and through
|
||||
the ABCI to an opaque application. The implementation will have multiple
|
||||
touch points on both the Tendermint code base and the ABCI application.
|
||||
|
||||
* A StateSync reactor to facilitate peer communication - Tendermint
|
||||
* A Set of ABCI messages to transmit application state to the reactor - Tendermint
|
||||
* A Set of MultiStore APIs for exposing snapshot data to the ABCI - ABCI application
|
||||
* A Storage format with validation and performance considerations - ABCI application
|
||||
|
||||
### Implementation Properties
|
||||
Beyond the approach, any implementation of StateSync can be evaluated
|
||||
across different criteria:
|
||||
|
||||
* Speed: Expected throughput of producing and consuming snapshots
|
||||
* Safety: Cost of pushing invalid snapshots to a node
|
||||
* Liveness: Cost of preventing a node from receiving/constructing a snapshot
|
||||
* Effort: How much effort does an implementation require
|
||||
|
||||
### Implementation Question
|
||||
* What is the format of a snapshot
|
||||
* Complete snapshot
|
||||
* Ordered IAVL key ranges
|
||||
* Individually compressed chunks which can be validated
|
||||
* How is data validated
|
||||
* Trust a peer with its data blindly
|
||||
* Trust a majority of peers
|
||||
* Use light client validation to validate each chunk against consensus
|
||||
produced merkle tree root
|
||||
* What are the performance characteristics
|
||||
* Random vs sequential reads
|
||||
* How parallelizable is the scheduling algorithm
|
||||
|
||||
### Proposals
|
||||
Broadly speaking there are two approaches to this problem which have had
|
||||
varying degrees of discussion and progress. These approaches can be
|
||||
summarized as:
|
||||
|
||||
**Lazy:** Where snapshots are produced dynamically at request time. This
|
||||
solution would use the existing data structure.
|
||||
**Eager:** Where snapshots are produced periodically and served from disk at
|
||||
request time. This solution would create an auxiliary data structure
|
||||
optimized for batch read/writes.
|
||||
|
||||
Additionally, the proposals tend to vary on how they provide safety
|
||||
properties.
|
||||
|
||||
**LightClient** Where a client can acquire the merkle root from the block
|
||||
headers synchronized from a trusted validator set. Subsets of the application state,
|
||||
called chunks can therefore be validated on receipt to ensure each chunk
|
||||
is part of the merkle root.
|
||||
|
||||
**Majority of Peers** Where manifests of chunks along with checksums are
|
||||
downloaded and compared against versions provided by a majority of
|
||||
peers.
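
The majority-of-peers check can be sketched as follows. This is a minimal illustration, not part of any proposal: it assumes a manifest is an ordered list of chunk checksums, and the helper names `manifestDigest` and `majorityManifest` are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// manifestDigest returns a hex digest identifying a manifest, modeled here
// (as an assumption) as an ordered list of chunk checksums.
func manifestDigest(chunkSums []string) string {
	h := sha256.New()
	for _, s := range chunkSums {
		h.Write([]byte(s))
		h.Write([]byte{0}) // separator so ["ab","c"] differs from ["a","bc"]
	}
	return hex.EncodeToString(h.Sum(nil))
}

// majorityManifest returns the digest reported by a strict majority of
// peers, or false if no digest reaches that threshold.
func majorityManifest(reported map[string]string) (string, bool) {
	votes := make(map[string]int)
	for _, digest := range reported {
		votes[digest]++
	}
	for digest, n := range votes {
		if 2*n > len(reported) {
			return digest, true
		}
	}
	return "", false
}

func main() {
	good := manifestDigest([]string{"c0", "c1"})
	bad := manifestDigest([]string{"c0", "xx"})
	digest, ok := majorityManifest(map[string]string{"p1": good, "p2": good, "p3": bad})
	fmt.Println(ok, digest == good)
}
```

Note that the result is only as trustworthy as the peer list itself: if every entry in `reported` comes from peers controlled by one party, the majority is meaningless.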
|
||||
|
||||
#### Lazy StateSync
|
||||
An initial specification was published by Alexis Sellier.
|
||||
In this design, the state has a given `size` of primitive elements (like
|
||||
keys or nodes), each element is assigned a number from 0 to `size-1`,
|
||||
and chunks consist of a range of such elements. Ackratos raised
|
||||
[some concerns](https://docs.google.com/document/d/1npGTAa1qxe8EQZ1wG0a0Sip9t5oX2vYZNUDwr_LVRR4/edit)
|
||||
about this design, somewhat specific to the IAVL tree, and mainly concerning
|
||||
performance of random reads and of iterating through the tree to determine element numbers
|
||||
(ie. elements aren't indexed by the element number).
|
||||
|
||||
An alternative design was suggested by Jae Kwon in
|
||||
[#3639](https://github.com/tendermint/tendermint/issues/3639) where chunking
|
||||
happens lazily and in a dynamic way: nodes request key ranges from their peers,
|
||||
and peers respond with some subset of the
|
||||
requested range and with notes on how to request the rest in parallel from other
|
||||
peers. Unlike chunk numbers, keys can be verified directly. And if some keys in the
|
||||
range are omitted, proofs for the range will fail to verify.
|
||||
This way a node can start by requesting the entire tree from one peer,
|
||||
and that peer can respond with say the first few keys, and the ranges to request
|
||||
from other peers.
|
||||
|
||||
Additionally, per chunk validation tends to come more naturally to the
|
||||
Lazy approach since it tends to use the existing structure of the tree
|
||||
(ie. keys or nodes) rather than state-sync specific chunks. Such a
|
||||
design for tendermint was originally tracked in
|
||||
[#828](https://github.com/tendermint/tendermint/issues/828).
|
||||
|
||||
#### Eager StateSync
|
||||
Warp Sync, as implemented in OpenEthereum, is designed to rapidly
|
||||
download both blocks and state snapshots from peers. Data is carved into ~4MB
|
||||
chunks and snappy compressed. Hashes of snappy compressed chunks are stored in a
|
||||
manifest file which co-ordinates the state-sync. Obtaining a correct manifest
|
||||
file seems to require an honest majority of peers. This means you may not find
|
||||
out the state is incorrect until you download the whole thing and compare it
|
||||
with a verified block header.
|
||||
|
||||
A similar solution was implemented by Binance in
|
||||
[#3594](https://github.com/tendermint/tendermint/pull/3594)
|
||||
based on their initial implementation in
|
||||
[PR #3243](https://github.com/tendermint/tendermint/pull/3243)
|
||||
and [some learnings](https://docs.google.com/document/d/1npGTAa1qxe8EQZ1wG0a0Sip9t5oX2vYZNUDwr_LVRR4/edit).
|
||||
Note this still requires the honest majority peer assumption.
|
||||
|
||||
As an eager protocol, warp-sync can efficiently compress larger, more
|
||||
predictable chunks once per snapshot and service many new peers. By
|
||||
comparison lazy chunkers would have to compress each chunk at request
|
||||
time.
|
||||
|
||||
### Analysis of Lazy vs Eager
|
||||
Lazy and Eager have more in common than they differ. Both require
|
||||
reactors on the tendermint side, a set of ABCI messages and a method for
|
||||
serializing/deserializing snapshots facilitated by a SnapshotFormat.
|
||||
|
||||
The biggest difference between Lazy and Eager proposals is in the
|
||||
read/write patterns necessitated by serving a snapshot chunk.
|
||||
Specifically, Lazy State Sync performs random reads to the underlying data
|
||||
structure while Eager can optimize for sequential reads.
|
||||
|
||||
This distinction between approaches was demonstrated by Binance's
|
||||
[ackratos](https://github.com/ackratos) in their implementation of [Lazy
|
||||
State sync](https://github.com/tendermint/tendermint/pull/3243), the
|
||||
[analysis](https://docs.google.com/document/d/1npGTAa1qxe8EQZ1wG0a0Sip9t5oX2vYZNUDwr_LVRR4/)
|
||||
of the performance, and follow up implementation of [Warp
|
||||
Sync](http://github.com/tendermint/tendermint/pull/3594).
|
||||
|
||||
#### Comparing Security Models
|
||||
There are several different security models which have been
|
||||
discussed/proposed in the past but generally fall into two categories.
|
||||
|
||||
Light client validation: In which the node receiving data is expected to
|
||||
first perform a light client sync and have all the necessary block
|
||||
headers. With the trusted block header (trusted in the sense that it comes from a
validator set subject to [weak
subjectivity](https://github.com/tendermint/tendermint/pull/3795)), the node
can compare any subset of keys, called a chunk, against the merkle root.
|
||||
The advantage of light client validation is that the block headers are
|
||||
signed by validators which have something to lose for malicious
|
||||
behaviour. If a validator were to provide an invalid proof, they can be
|
||||
slashed.
|
||||
|
||||
Majority of peer validation: A manifest file containing a list of chunks
|
||||
along with checksums of each chunk is downloaded from a
|
||||
trusted source. That source can be a community resource similar to
|
||||
[sum.golang.org](https://sum.golang.org) or downloaded from the majority
|
||||
of peers. One disadvantage of the majority of peer security model is the
|
||||
vulnerability to eclipse attacks in which a malicious user looks to
|
||||
saturate a target node's peer list and produce a manufactured picture of
|
||||
majority.
|
||||
|
||||
A third option would be to include snapshot related data in the
|
||||
block header. This could include the manifest with related checksums and be
|
||||
secured through consensus. One challenge of this approach is to
|
||||
ensure that creating snapshots does not put undue burden on block
|
||||
proposers by synchronizing snapshot creation and block creation. One
|
||||
approach to minimizing the burden is for snapshots for height
|
||||
`H` to be included in block `H+n` where `n` is some number of blocks away,
|
||||
giving the block proposer enough time to complete the snapshot
|
||||
asynchronously.
|
||||
|
||||
## Proposal: Eager StateSync With Per Chunk Light Client Validation
|
||||
The conclusion after some consideration of the advantages/disadvantages of
|
||||
eager/lazy and different security models is to produce a state sync
|
||||
which eagerly produces snapshots and uses light client validation. This
|
||||
approach has the performance advantages of pre-computing efficient
|
||||
snapshots which can be streamed to new nodes on demand using sequential IO.
|
||||
Secondly, by using light client validation we can validate each chunk on
|
||||
receipt and avoid the potential eclipse attack of majority of peer based
|
||||
security.
|
||||
|
||||
### Implementation
|
||||
Tendermint is responsible for downloading and verifying chunks of
|
||||
AppState from peers. ABCI Application is responsible for taking
|
||||
AppStateChunk objects from TM and constructing a valid state tree whose
|
||||
root corresponds with the AppHash of syncing block. In particular we
|
||||
will need to implement:
|
||||
|
||||
* Build a new StateSync reactor which brokers message transmission between the peers
|
||||
and the ABCI application
|
||||
* A set of ABCI Messages
|
||||
* Design SnapshotFormat as an interface which can:
|
||||
* validate chunks
|
||||
* read/write chunks from file
|
||||
* read/write chunks to/from application state store
|
||||
* convert manifests into chunkRequest ABCI messages
|
||||
* Implement SnapshotFormat for cosmos-hub with concrete implementation for:
|
||||
* read/write chunks in a way which can be:
|
||||
* parallelized across peers
|
||||
* validated on receipt
|
||||
* read/write to/from IAVL+ tree
|
||||
|
||||

|
||||
|
||||
## Implementation Path
|
||||
* Create StateSync reactor based on [#3753](https://github.com/tendermint/tendermint/pull/3753)
|
||||
* Design SnapshotFormat with an eye towards cosmos-hub implementation
|
||||
* ABCI message to send/receive SnapshotFormat
|
||||
* IAVL+ changes to support SnapshotFormat
|
||||
* Deliver Warp sync (no chunk validation)
|
||||
* light client implementation for weak subjectivity
|
||||
* Deliver StateSync with chunk validation
|
||||
|
||||
## Status
|
||||
|
||||
Proposed
|
||||
|
||||
## Consequences
|
||||
|
||||
### Neutral
|
||||
|
||||
### Positive
|
||||
* Safe & performant state sync design substantiated with real world implementation experience
|
||||
* General interfaces allowing application specific innovation
|
||||
* Parallelizable implementation trajectory with reasonable engineering effort
|
||||
|
||||
### Negative
|
||||
* Static Scheduling lacks opportunity for real time chunk availability optimizations
|
||||
|
||||
## References
|
||||
[sync: Sync current state without full replay for Applications](https://github.com/tendermint/tendermint/issues/828) - original issue
|
||||
[tendermint state sync proposal 2](https://docs.google.com/document/d/1npGTAa1qxe8EQZ1wG0a0Sip9t5oX2vYZNUDwr_LVRR4/edit) - ackratos proposal
|
||||
[proposal 2 implementation](https://github.com/tendermint/tendermint/pull/3243) - ackratos implementation
|
||||
[WIP General/Lazy State-Sync pseudo-spec](https://github.com/tendermint/tendermint/issues/3639) - Jae Proposal
|
||||
[Warp Sync Implementation](https://github.com/tendermint/tendermint/pull/3594) - ackratos
|
||||
[Chunk Proposal](https://github.com/tendermint/tendermint/pull/3799) - Bucky proposed
|
||||
@@ -1,404 +0,0 @@
|
||||
# ADR 043: Blockchain Reactor Riri-Org
|
||||
|
||||
## Changelog
|
||||
|
||||
- 18-06-2019: Initial draft
|
||||
- 08-07-2019: Reviewed
|
||||
- 29-11-2019: Implemented
|
||||
- 14-02-2020: Updated with the implementation details
|
||||
|
||||
## Context
|
||||
|
||||
The blockchain reactor is responsible for two high level processes: sending/receiving blocks from peers and FastSync-ing blocks to catch up a node that is far behind. The goal of [ADR-40](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-040-blockchain-reactor-refactor.md) was to refactor these two processes by separating business logic currently wrapped up in go-channels into pure `handle*` functions. While the ADR specified what the final form of the reactor might look like, it lacked guidance on intermediary steps to get there.
|
||||
The following diagram illustrates the state of the [blockchain-reorg](https://github.com/tendermint/tendermint/pull/3561) reactor which will be referred to as `v1`.
|
||||
|
||||

|
||||
|
||||
While `v1` of the blockchain reactor has shown significant improvements in terms of simplifying the concurrency model, the current PR has run into a few roadblocks.
|
||||
|
||||
- The current PR is large and difficult to review.
|
||||
- Block gossiping and fast sync processes are highly coupled to the shared `Pool` data structure.
|
||||
- Peer communication is spread over multiple components, creating a complex dependency graph which must be mocked out during testing.
|
||||
- Timeouts modeled as stateful tickers introduce non-determinism in tests
|
||||
|
||||
This ADR is meant to specify the missing components and control necessary to achieve [ADR-40](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-040-blockchain-reactor-refactor.md).
|
||||
|
||||
## Decision
|
||||
|
||||
Partition the responsibilities of the blockchain reactor into a set of components which communicate exclusively with events. Events will contain timestamps allowing each component to track time as internal state. The internal state will be mutated by a set of `handle*` which will produce event(s). The integration between components will happen in the reactor and reactor tests will then become integration tests between components. This design will be known as `v2`.
|
||||
|
||||

|
||||
|
||||
### Fast Sync Related Communication Channels
|
||||
|
||||
The diagram below shows the fast sync routines and the types of channels and queues used to communicate with each other.
|
||||
In addition the per reactor channels used by the sendRoutine to send messages over the Peer MConnection are shown.
|
||||
|
||||

|
||||
|
||||
### Reactor changes in detail
|
||||
|
||||
The reactor will include a demultiplexing routine which will send each message to each sub routine for independent processing. Each sub routine will then select the messages it's interested in and call the specific handle function specified in [ADR-40](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-040-blockchain-reactor-refactor.md). The demuxRoutine acts as a "pacemaker", setting the time in which events are expected to be handled.
|
||||
|
||||
```go
|
||||
func demuxRoutine(msgs, schedulerMsgs, processorMsgs, ioMsgs chan Message) {
|
||||
timer := time.NewTicker(interval)
|
||||
for {
|
||||
select {
|
||||
case <-timer.C:
|
||||
now := evTimeCheck{time.Now()}
|
||||
schedulerMsgs <- now
|
||||
processorMsgs <- now
|
||||
ioMsgs <- now
|
||||
case msg:= <- msgs:
|
||||
msg.time = time.Now()
|
||||
// These channels should produce backpressure before
|
||||
// being full to avoid starving each other
|
||||
schedulerMsgs <- msg
|
||||
processorMsgs <- msg
|
||||
ioMsgs <- msg
|
||||
if msg == stop {
|
||||
return // a bare break would only leave the select, not the for loop
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func processRoutine(input chan Message, output chan Message) {
|
||||
processor := NewProcessor(..)
|
||||
for {
|
||||
msg := <- input
|
||||
switch msg := msg.(type) {
|
||||
case bcBlockRequestMessage:
|
||||
output <- processor.handleBlockRequest(msg)
|
||||
...
|
||||
case stop:
|
||||
processor.stop()
|
||||
return // a bare break would only leave the switch, not the for loop
|
||||
}
|
||||
}
}
|
||||
|
||||
func scheduleRoutine(input chan Message, output chan Message) {
|
||||
scheduler := NewScheduler(...)
|
||||
for {
|
||||
msg := <-input
switch msg := msg.(type) {
|
||||
case bcBlockResponseMessage:
|
||||
output <- scheduler.handleBlockResponse(msg)
|
||||
...
|
||||
case stop:
|
||||
scheduler.stop()
|
||||
return // a bare break would only leave the switch, not the for loop
|
||||
}
|
||||
}
|
||||
}
|
||||
```
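
A runnable reduction of the fan-out step is sketched below. It is an illustration under stated assumptions: messages are plain strings, shutdown is signalled by closing the input channel rather than by a `stop` message, and the helper names `demux` and `collect` are hypothetical.

```go
package main

import "fmt"

// demux copies every message from in to each output channel, in order, and
// closes the outputs once in is closed.
func demux(in <-chan string, outs ...chan string) {
	for msg := range in {
		for _, out := range outs {
			out <- msg // blocks when a consumer falls behind (backpressure)
		}
	}
	for _, out := range outs {
		close(out)
	}
}

// collect drains a channel into a slice.
func collect(ch <-chan string) []string {
	var got []string
	for msg := range ch {
		got = append(got, msg)
	}
	return got
}

func main() {
	in := make(chan string, 2)
	scheduler, processor := make(chan string, 2), make(chan string, 2)
	in <- "blockResponse"
	in <- "statusResponse"
	close(in)
	demux(in, scheduler, processor)
	fmt.Println(collect(scheduler), collect(processor))
}
```

Every sub routine sees the same messages in the same order, which is what allows each one to be tested in isolation by replaying a recorded stream.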
|
||||
|
||||
## Lifecycle management
|
||||
|
||||
A set of routines for individual processes allows processes to run in parallel with clear lifecycle management. `Start`, `Stop`, and `AddPeer` hooks currently present in the reactor will delegate to the sub-routines, allowing them to manage internal state independently without further coupling to the reactor.
|
||||
|
||||
```go
|
||||
func (r *BlockchainReactor) Start() {
|
||||
r.msgs = make(chan Message, maxInFlight)
|
||||
schedulerMsgs := make(chan Message)
|
||||
processorMsgs := make(chan Message)
|
||||
ioMsgs := make(chan Message)
|
||||
|
||||
go processRoutine(processorMsgs, r.msgs)
|
||||
go scheduleRoutine(schedulerMsgs, r.msgs)
|
||||
go ioRoutine(ioMsgs, r.msgs)
|
||||
...
|
||||
}
|
||||
|
||||
func (r *BlockchainReactor) Receive(...) {
|
||||
...
|
||||
r.msgs <- msg
|
||||
...
|
||||
}
|
||||
|
||||
func (r *BlockchainReactor) Stop() {
|
||||
...
|
||||
r.msgs <- stop
|
||||
...
|
||||
}
|
||||
|
||||
|
||||
func (r *BlockchainReactor) AddPeer(peer p2p.Peer) {
|
||||
...
|
||||
r.msgs <- bcAddPeerEv{peer.ID}
|
||||
...
|
||||
}
|
||||
|
||||
```
|
||||
|
||||
## IO handling
|
||||
|
||||
An IO handling routine within the reactor will isolate peer communication. Messages going through the ioRoutine will usually be one way, using `p2p` APIs. When a `p2p` API such as `trySend` returns an error, the ioRoutine can funnel that error back to the demuxRoutine for distribution to the other routines. For instance, errors from the ioRoutine can be consumed by the scheduler to inform better peer selection implementations.
|
||||
|
||||
```go
|
||||
func (r *BlockchainReactor) ioRoutine(ioMsgs chan Message, outMsgs chan Message) {
|
||||
...
|
||||
for {
|
||||
msg := <-ioMsgs
|
||||
switch msg := msg.(type) {
|
||||
case scBlockRequestMessage:
|
||||
queued := r.sendBlockRequestToPeer(...)
|
||||
if queued {
|
||||
outMsgs <- ioSendQueued{...}
|
||||
}
|
||||
case scStatusRequestMessage:
|
||||
r.sendStatusRequestToPeer(...)
|
||||
case bcPeerError:
|
||||
r.Switch.StopPeerForError(msg.src)
|
||||
...
|
||||
...
|
||||
case bcFinished:
|
||||
return // a bare break would only leave the switch, not the for loop
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
```
|
||||
|
||||
### Processor Internals
|
||||
|
||||
The processor is responsible for ordering, verifying and executing blocks. The Processor will maintain an internal cursor `height` referring to the last processed block. As a set of blocks arrive unordered, the Processor will check if it has the block at `height+1` necessary to process the next block. The processor also maintains the map `blockPeers` of heights to peers, to keep track of which peer provided the block at `height`. `blockPeers` can be used in `handleRemovePeer(...)` to reschedule all unprocessed blocks provided by a peer who has errored.
|
||||
|
||||
```go
|
||||
type Processor struct {
|
||||
height int64 // the height cursor
|
||||
state ...
|
||||
blocks [height]*Block // keep a set of blocks in memory until they are processed
|
||||
blockPeers [height]PeerID // keep track of which heights came from which peerID
|
||||
lastTouch timestamp
|
||||
}
|
||||
|
||||
func (proc *Processor) handleBlockResponse(peerID, block) {
|
||||
if block.height <= height {
// stale block; ignore it
|
||||
} else if blocks[block.height] {
|
||||
return errDuplicateBlock{}
|
||||
} else {
|
||||
blocks[block.height] = block
|
||||
}
|
||||
|
||||
if blocks[height] && blocks[height+1] {
|
||||
... = state.Validators.VerifyCommit(...)
|
||||
... = store.SaveBlock(...)
|
||||
state, err = blockExec.ApplyBlock(...)
|
||||
...
|
||||
if err == nil {
|
||||
delete blocks[height]
|
||||
height++
|
||||
lastTouch = msg.time
|
||||
return pcBlockProcessed{height-1}
|
||||
} else {
|
||||
... // Delete all unprocessed block from the peer
|
||||
return pcBlockProcessError{peerID, height}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
func (proc *Processor) handleRemovePeer(peerID) {
|
||||
events = []
|
||||
// Delete all unprocessed blocks from peerID
|
||||
for i = height; i < len(blocks); i++ {
|
||||
if blockPeers[i] == peerID {
|
||||
events = append(events, pcBlockReschedule{i})
delete blocks[i]
|
||||
}
|
||||
}
|
||||
return events
|
||||
}
|
||||
|
||||
func handleTimeCheckEv(time) {
|
||||
if time - lastTouch > timeout {
|
||||
// Timeout the processor
|
||||
...
|
||||
}
|
||||
}
|
||||
```
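
The buffering and ordering behaviour can be sketched in runnable form as follows. This is a simplification under stated assumptions: verification, storage and execution are stubbed out, a block is applied as soon as the block at `height+1` is available (the pseudocode above also waits for its successor in order to verify the commit), and peers are identified by plain strings.

```go
package main

import "fmt"

type block struct{ height int64 }

// processor buffers out-of-order blocks and applies them strictly in
// height order. Verification and execution are stubbed out.
type processor struct {
	height     int64            // last processed height
	blocks     map[int64]block  // received but unprocessed blocks
	blockPeers map[int64]string // which peer supplied each buffered block
}

func newProcessor(height int64) *processor {
	return &processor{height: height, blocks: map[int64]block{}, blockPeers: map[int64]string{}}
}

// handleBlockResponse stores the block and returns the heights that became
// processable as a result.
func (p *processor) handleBlockResponse(peerID string, b block) []int64 {
	if _, dup := p.blocks[b.height]; dup || b.height <= p.height {
		return nil // stale or duplicate block
	}
	p.blocks[b.height] = b
	p.blockPeers[b.height] = peerID
	var processed []int64
	for {
		next := p.height + 1
		if _, ok := p.blocks[next]; !ok {
			return processed
		}
		delete(p.blocks, next)
		delete(p.blockPeers, next)
		p.height = next
		processed = append(processed, next)
	}
}

// handleRemovePeer drops every buffered block from peerID and returns the
// heights that must be rescheduled.
func (p *processor) handleRemovePeer(peerID string) []int64 {
	var reschedule []int64
	for h, id := range p.blockPeers {
		if id == peerID {
			delete(p.blocks, h)
			delete(p.blockPeers, h)
			reschedule = append(reschedule, h)
		}
	}
	return reschedule
}

func main() {
	p := newProcessor(0)
	fmt.Println(p.handleBlockResponse("peerA", block{2}))
	fmt.Println(p.handleBlockResponse("peerB", block{1}))
	fmt.Println(p.handleBlockResponse("peerA", block{4}))
	fmt.Println(p.handleRemovePeer("peerA"))
}
```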
|
||||
|
||||
## Schedule
|
||||
|
||||
The Schedule maintains the internal state used for scheduling blockRequestMessages based on some scheduling algorithm. The schedule needs to maintain state on:
|
||||
|
||||
- The state `blockState` of every block seen up to maxHeight
|
||||
- The set of peers and their peer state `peerState`
|
||||
- which peers have which blocks
|
||||
- which blocks have been requested from which peers
|
||||
|
||||
```go
|
||||
type blockState int
|
||||
|
||||
const (
|
||||
blockStateNew blockState = iota
blockStatePending
blockStateReceived
blockStateProcessed
|
||||
)
|
||||
|
||||
type schedule struct {
|
||||
// a map of block heights to their blockState
|
||||
blockStates map[height]blockState
|
||||
|
||||
// a map of which blocks are available from which peers
|
||||
blockPeers map[height]map[p2p.ID]scPeer
|
||||
|
||||
// a map of peerID to schedule specific peer struct `scPeer`
|
||||
peers map[p2p.ID]scPeer
|
||||
|
||||
// a map of heights to the peer we are waiting for a response from
|
||||
pending map[height]scPeer
|
||||
|
||||
targetPending int // the number of blocks we want in blockStatePending
|
||||
targetReceived int // the number of blocks we want in blockStateReceived
|
||||
|
||||
peerTimeout int
|
||||
peerMinSpeed int
|
||||
}
|
||||
|
||||
func (sc *schedule) numBlockInState(state blockState) int {
|
||||
num := 0
|
||||
for i := sc.minHeight(); i <= sc.maxHeight(); i++ {
|
||||
if sc.blockStates[i] == state {
|
||||
num++
|
||||
}
|
||||
}
|
||||
return num
|
||||
}
|
||||
|
||||
|
||||
func (sc *schedule) popSchedule(maxRequest int) []scBlockRequestMessage {
|
||||
// We only want to schedule requests such that we have less than sc.targetPending and sc.targetReceived
|
||||
// This ensures we don't saturate the network or flood the processor with unprocessed blocks
|
||||
todo := min(sc.targetPending - sc.numBlockInState(blockStatePending), sc.targetReceived - sc.numBlockInState(blockStateReceived))
|
||||
events := []scBlockRequestMessage{}
|
||||
for i := sc.minHeight(); i < sc.maxHeight(); i++ {
|
||||
if todo == 0 {
|
||||
break
|
||||
}
|
||||
if sc.blockStates[i] == blockStateNew {
|
||||
peer := sc.selectPeer(sc.blockPeers[i])
|
||||
sc.blockStates[i] = blockStatePending
|
||||
sc.pending[i] = peer
|
||||
events = append(events, scBlockRequestMessage{peerID: peer.peerID, height: i})
|
||||
todo--
|
||||
}
|
||||
}
|
||||
return events
|
||||
}
|
||||
...
|
||||
|
||||
type scPeer struct {
|
||||
peerID p2p.ID
|
||||
numOutstandingRequest int
|
||||
lastTouched time.Time
|
||||
monitor flow.Monitor
|
||||
}
|
||||
|
||||
```
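
A runnable reduction of `popSchedule` is sketched below. It is an illustration under stated assumptions: only the `targetPending` limit is modelled, every peer is assumed to have every block, peer selection is a simple round-robin, and the `request` type and `numInState` helper are hypothetical names.

```go
package main

import "fmt"

type blockState int

const (
	blockStateNew blockState = iota
	blockStatePending
	blockStateReceived
	blockStateProcessed
)

type request struct {
	peerID string
	height int64
}

type schedule struct {
	minHeight, maxHeight int64
	blockStates          map[int64]blockState
	peers                []string // peers assumed to have every block
	targetPending        int
	next                 int // round-robin cursor (illustrative peer selection)
}

// numInState counts the blocks between minHeight and maxHeight in state s.
func (sc *schedule) numInState(s blockState) int {
	n := 0
	for h := sc.minHeight; h <= sc.maxHeight; h++ {
		if sc.blockStates[h] == s {
			n++
		}
	}
	return n
}

// popSchedule marks new blocks as pending, lowest height first, until the
// number of pending blocks reaches targetPending.
func (sc *schedule) popSchedule() []request {
	if len(sc.peers) == 0 {
		return nil
	}
	todo := sc.targetPending - sc.numInState(blockStatePending)
	var reqs []request
	for h := sc.minHeight; h <= sc.maxHeight && todo > 0; h++ {
		if sc.blockStates[h] != blockStateNew {
			continue
		}
		peer := sc.peers[sc.next%len(sc.peers)]
		sc.next++
		sc.blockStates[h] = blockStatePending
		reqs = append(reqs, request{peerID: peer, height: h})
		todo--
	}
	return reqs
}

func main() {
	sc := &schedule{minHeight: 1, maxHeight: 5, blockStates: map[int64]blockState{},
		peers: []string{"peerA", "peerB"}, targetPending: 3}
	fmt.Println(sc.popSchedule())
	fmt.Println(sc.popSchedule())
}
```

The second call returns nothing because three requests are already pending; further requests are only issued once responses move blocks out of the pending state.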
|
||||
|
||||
## Scheduler
|
||||
|
||||
The scheduler is configured to maintain a target `n` of in flight
|
||||
messages and will use feedback from `_blockResponseMessage`,
|
||||
`_statusResponseMessage` and `_peerError` to produce an optimal assignment
|
||||
of scBlockRequestMessage at each `timeCheckEv`.
|
||||
|
||||
```go
|
||||
|
||||
func handleStatusResponse(peerID, height, time) {
|
||||
schedule.touchPeer(peerID, time)
|
||||
schedule.setPeerHeight(peerID, height)
|
||||
}
|
||||
|
||||
func handleBlockResponseMessage(peerID, height, block, time) {
|
||||
schedule.touchPeer(peerID, time)
|
||||
schedule.markReceived(peerID, height, size(block))
|
||||
}
|
||||
|
||||
func handleNoBlockResponseMessage(peerID, height, time) {
|
||||
schedule.touchPeer(peerID, time)
|
||||
// reschedule that block, punish peer...
|
||||
...
|
||||
}
|
||||
|
||||
func handlePeerError(peerID) {
|
||||
// Remove the peer, reschedule the requests
|
||||
...
|
||||
}
|
||||
|
||||
func handleTimeCheckEv(time) {
|
||||
// clean peer list
|
||||
|
||||
events = []
|
||||
for peerID := range schedule.peersNotTouchedSince(time) {
|
||||
pending = schedule.pendingFrom(peerID)
|
||||
schedule.setPeerState(peerID, timedout)
|
||||
schedule.resetBlocks(pending)
|
||||
events = append(events, peerTimeout{peerID})
|
||||
}
|
||||
|
||||
events = append(events, schedule.popSchedule())
|
||||
|
||||
return events
|
||||
}
|
||||
```
|
||||
|
||||
## Peer
|
||||
|
||||
The Peer stores per-peer state based on messages received by the scheduler.
|
||||
|
||||
```go
|
||||
type Peer struct {
|
||||
lastTouched timestamp
|
||||
lastDownloaded timestamp
|
||||
pending map[height]struct{}
|
||||
height height // max height for the peer
|
||||
state {
|
||||
pending, // we know the peer but not the height
|
||||
active, // we know the height
|
||||
timeout // the peer has timed out
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Tests become deterministic
|
||||
- Simulation becomes a-temporal: no need to wait for a wall-time timeout
|
||||
- Peer Selection can be independently tested/simulated
|
||||
- Develop a general approach to refactoring reactors
|
||||
|
||||
### Negative
|
||||
|
||||
### Neutral
|
||||
|
||||
### Implementation Path
|
||||
|
||||
- Implement the scheduler, test the scheduler, review the scheduler
|
||||
- Implement the processor, test the processor, review the processor
|
||||
- Implement the demuxer, write integration test, review integration tests
|
||||
|
||||
## References
|
||||
|
||||
- [ADR-40](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-040-blockchain-reactor-refactor.md): The original blockchain reactor re-org proposal
|
||||
- [Blockchain re-org](https://github.com/tendermint/tendermint/pull/3561): The current blockchain reactor re-org implementation (v1)
|
||||
@@ -1,141 +0,0 @@
|
||||
# ADR 044: Lite Client with Weak Subjectivity
|
||||
|
||||
## Changelog
|
||||
* 13-07-2019: Initial draft
|
||||
* 14-08-2019: Address cwgoes comments
|
||||
|
||||
## Context
|
||||
|
||||
The concept of light clients was introduced in the Bitcoin white paper. It
|
||||
describes a watcher of distributed consensus process that only validates the
|
||||
consensus algorithm and not the state machine transactions within.
|
||||
|
||||
Tendermint light clients allow bandwidth & compute-constrained devices, such as smartphones, low-power embedded chips, or other blockchains to
|
||||
efficiently verify the consensus of a Tendermint blockchain. This forms the
|
||||
basis of safe and efficient state synchronization for new network nodes and
|
||||
inter-blockchain communication (where a light client of one Tendermint instance
|
||||
runs in another chain's state machine).
|
||||
|
||||
In a network that is expected to reliably punish validators for misbehavior
|
||||
by slashing bonded stake and where the validator set changes
|
||||
infrequently, clients can take advantage of this assumption to safely
|
||||
synchronize a lite client without downloading the intervening headers.
|
||||
|
||||
Light clients (and full nodes) operating in the Proof Of Stake context need a
|
||||
trusted block height from a trusted source that is no older than 1 unbonding
|
||||
window plus a configurable evidence submission synchrony bound. This is called “weak subjectivity”.
|
||||
|
||||
Weak subjectivity is required in Proof of Stake blockchains because it is
|
||||
costless for an attacker to buy up voting keys that are no longer bonded and
|
||||
fork the network at some point in its prior history. See Vitalik’s post at
|
||||
[Proof of Stake: How I Learned to Love Weak
|
||||
Subjectivity](https://blog.ethereum.org/2014/11/25/proof-stake-learned-love-weak-subjectivity/).
|
||||
|
||||
Currently, Tendermint provides a lite client implementation in the
|
||||
[light](https://github.com/tendermint/tendermint/tree/master/light) package. This
|
||||
lite client implements a bisection algorithm that tries to use a binary search
|
||||
to find the minimum number of block headers where the validator set voting
|
||||
power changes are less than 1/3rd. This interface does not support weak
|
||||
subjectivity at this time. The Cosmos SDK also does not support counterfactual
|
||||
slashing, nor does the lite client have any capacity to report evidence, making
|
||||
these systems *theoretically unsafe*.
|
||||
|
||||
NOTE: Tendermint provides a somewhat different (stronger) light client model
|
||||
than Bitcoin under eclipse, since the eclipsing node(s) can only fool the light
|
||||
client if they have two-thirds of the private keys from the last root-of-trust.
|
||||
|
||||
## Decision
|
||||
|
||||
### The Weak Subjectivity Interface
|
||||
|
||||
Add the weak subjectivity interface for when a new light client connects to the
|
||||
network or when a light client that has been offline for longer than the
|
||||
unbonding period connects to the network. Specifically, the node needs to
|
||||
initialize the following structure before syncing from user input:
|
||||
|
||||
```go
|
||||
type TrustOptions struct {
|
||||
// Required: only trust commits up to this old.
|
||||
// Should be equal to the unbonding period minus some delta for evidence reporting.
|
||||
TrustPeriod time.Duration `json:"trust-period"`
|
||||
|
||||
// Option 1: TrustHeight and TrustHash can both be provided
|
||||
// to force the trusting of a particular height and hash.
|
||||
// If the latest trusted height/hash is more recent, then this option is
|
||||
// ignored.
|
||||
TrustHeight int64 `json:"trust-height"`
|
||||
TrustHash []byte `json:"trust-hash"`
|
||||
|
||||
// Option 2: Callback can be set to implement a confirmation
|
||||
// step if the trust store is uninitialized, or expired.
|
||||
Callback func(height int64, hash []byte) error
|
||||
}
|
||||
```
|
||||
|
||||
The expectation is the user will get this information from a trusted source
|
||||
like a validator, a friend, or a secure website. A more user friendly
|
||||
solution with trust tradeoffs is that we establish an https based protocol with
|
||||
a default end point that populates this information. Also an on-chain registry
|
||||
of roots-of-trust (e.g. on the Cosmos Hub) seems likely in the future.
|
||||
|
||||
### Linear Verification
|
||||
|
||||
The linear verification algorithm requires downloading all headers
|
||||
between the `TrustHeight` and the `LatestHeight`. The lite client downloads the
|
||||
full header for the provided `TrustHeight` and then proceeds to download `N+1`
|
||||
headers and applies the [Tendermint validation
|
||||
rules](https://github.com/tendermint/tendermint/tree/master/spec/light-client/verification/README.md)
|
||||
to each block.
|
||||
|
||||
### Bisecting Verification
|
||||
|
||||
Bisecting Verification is a more bandwidth and compute intensive mechanism that
|
||||
in the most optimistic case requires a light client to only download two block
|
||||
headers to come into synchronization.
|
||||
|
||||
The bisection algorithm proceeds in the following fashion. The client downloads
|
||||
and verifies the full block header for `TrustHeight` and then fetches
|
||||
`LatestHeight` block header. The client then verifies the `LatestHeight`
|
||||
header. Finally the client attempts to verify the `LatestHeight` header with
|
||||
voting powers taken from `NextValidatorSet` in the `TrustHeight` header. This
|
||||
verification will succeed if the validators from `TrustHeight` still have > 2/3
|
||||
+1 of voting power in the `LatestHeight`. If this succeeds, the client is fully
|
||||
synchronized. If this fails, then the following Bisection Algorithm should be
|
||||
executed.
|
||||
|
||||
The Client tries to download the block at the mid-point block between
|
||||
`LatestHeight` and `TrustHeight` and attempts the same algorithm as above
|
||||
using `MidPointHeight` instead of `LatestHeight` and a different threshold -
|
||||
1/3 +1 of voting power for *non-adjacent headers*. In the case of failure,
|
||||
recursively perform the `MidPoint` verification until success then start over
|
||||
with an updated `NextValidatorSet` and `TrustHeight`.
|
||||
|
||||
If the client encounters a forged header, it should submit the header along
|
||||
with some other intermediate headers as the evidence of misbehavior to other
|
||||
full nodes. After that, it can retry the bisection using another full node. An
|
||||
optimal client will cache trusted headers from the previous run to minimize
|
||||
network usage.
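
The control flow of the bisection can be sketched as follows. This is a minimal illustration, not the light package implementation: header download and signature checks are replaced by a hypothetical `canTrust` predicate, and evidence submission and header caching are left out.

```go
package main

import "fmt"

// canTrust is a hypothetical predicate standing in for header verification:
// it reports whether a header at height `to` can be verified directly from
// a trusted header at height `from`.
type canTrust func(from, to int64) bool

// bisect returns the sequence of heights verified on the way from trusted
// to target, or false if verification cannot make progress.
func bisect(trusted, target int64, verify canTrust) ([]int64, bool) {
	var path []int64
	for trusted < target {
		to := target
		for !verify(trusted, to) {
			if to == trusted+1 {
				return path, false // adjacent header failed: cannot proceed
			}
			to = trusted + (to-trusted)/2 // retry with the mid-point
		}
		trusted = to
		path = append(path, to)
	}
	return path, true
}

func main() {
	// Toy rule: a header is verifiable only if it is at most 10 blocks ahead.
	verify := func(from, to int64) bool { return to-from <= 10 }
	fmt.Println(bisect(1, 30, verify))
}
```

In the optimistic case the first call to `verify` succeeds and the path has a single entry, which corresponds to downloading only the trusted and latest headers.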
|
||||
|
||||
---
|
||||
|
||||
Check out the formal specification
|
||||
[here](https://github.com/tendermint/tendermint/tree/master/spec/light-client).
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
* light client which is safe to use (it can go offline, but not for too long)
|
||||
|
||||
### Negative
|
||||
|
||||
* complexity of bisection
|
||||
|
||||
### Neutral
|
||||
|
||||
* social consensus can be prone to errors (for cases where a new light client
|
||||
joins a network or it has been offline for too long)
|
||||
@@ -1,140 +0,0 @@
|
||||
# ADR 45 - ABCI Evidence Handling
|
||||
|
||||
## Changelog
|
||||
* 21-09-2019: Initial draft
|
||||
|
||||
## Context
|
||||
|
||||
Evidence is a distinct component in a Tendermint block and has its own reactor
|
||||
for high priority gossiping. Currently, Tendermint supports only a single form of evidence, an explicit
|
||||
equivocation, where a validator signs conflicting blocks at the same
|
||||
height/round. It is detected in real-time in the consensus reactor, and gossiped
|
||||
through the evidence reactor. Evidence can also be submitted through the RPC.
|
||||
|
||||
Currently, Tendermint does not gracefully handle a fork on the main chain.
|
||||
If a fork is detected, the node panics. At this point manual intervention and
|
||||
social consensus are required to reconfigure. We'd like to do something more
|
||||
graceful here, but that's for another day.
|
||||
|
||||
It's possible to fool lite clients without there being a fork on the
|
||||
main chain - so-called Fork-Lite. See the
|
||||
[fork accountability](https://github.com/tendermint/tendermint/blob/master/spec/light-client/accountability/README.md)
|
||||
document for more details. For a sequential lite client, this can happen via
|
||||
equivocation or amnesia attacks. For a skipping lite client this can also happen
|
||||
via lunatic validator attacks. There must be some way for applications to punish
|
||||
all forms of misbehaviour.
|
||||
|
||||
The essential question is whether Tendermint should manage the evidence
|
||||
verification, or whether it should treat evidence more like a transaction (ie.
|
||||
arbitrary bytes) and let the application handle it (including all the signature
|
||||
checking).
|
||||
|
||||
Currently, evidence verification is handled by Tendermint. Once committed,
|
||||
[evidence is passed over
|
||||
ABCI](https://github.com/tendermint/tendermint/blob/master/proto/tendermint/abci/types.proto#L354)
|
||||
in BeginBlock in a reduced form that includes only
|
||||
the type of evidence, its height and timestamp, the validator it's from, and the
|
||||
total voting power of the validator set at the height. The app trusts Tendermint
|
||||
to perform the evidence verification, as the ABCI evidence does not contain the
|
||||
signatures and additional data for the app to verify itself.
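
To illustrate the reduced form described above, here is a hedged sketch. The type and field names (`fullEvidence`, `abciEvidence`, `reduce`) are hypothetical and are not the generated ABCI types; only the list of fields that survive the reduction is taken from the text.

```go
package main

import (
	"fmt"
	"time"
)

// fullEvidence is a hypothetical stand-in for what Tendermint verifies
// internally, including the signatures.
type fullEvidence struct {
	Kind             string
	Height           int64
	Time             time.Time
	ValidatorAddress []byte
	TotalVotingPower int64
	Signatures       [][]byte // checked by Tendermint, not forwarded to the app
}

// abciEvidence is the reduced form: type, height, timestamp, validator and
// total voting power. It carries no signatures, so the app cannot re-verify it.
type abciEvidence struct {
	Kind             string
	Height           int64
	Time             time.Time
	ValidatorAddress []byte
	TotalVotingPower int64
}

// reduce drops everything the application does not receive.
func reduce(ev fullEvidence) abciEvidence {
	return abciEvidence{
		Kind:             ev.Kind,
		Height:           ev.Height,
		Time:             ev.Time,
		ValidatorAddress: ev.ValidatorAddress,
		TotalVotingPower: ev.TotalVotingPower,
	}
}

func main() {
	ev := fullEvidence{
		Kind:             "duplicate_vote",
		Height:           42,
		Time:             time.Unix(1600000000, 0).UTC(),
		ValidatorAddress: []byte{0xAB},
		TotalVotingPower: 100,
		Signatures:       [][]byte{{1}, {2}},
	}
	fmt.Printf("%+v\n", reduce(ev))
}
```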
|
||||
|
||||
Arguments in favor of leaving evidence handling in Tendermint:
|
||||
|
||||
1) Attacks on full nodes must be detectable by full nodes in real time, ie. within the consensus reactor.
|
||||
So at the very least, any evidence involved in something that could fool a full
|
||||
node must be handled natively by Tendermint as there would otherwise be no way
|
||||
for the ABCI app to detect it (ie. we don't send all votes we receive during
|
||||
consensus to the app ... ).
|
||||
|
||||
2) Amnesia attacks cannot be easily detected - they require an interactive
|
||||
protocol among all the validators to submit justification for their past
|
||||
votes. Our best notion of [how to do this
|
||||
currently](https://github.com/tendermint/tendermint/blob/c67154232ca8be8f5c21dff65d154127adc4f7bb/docs/spec/consensus/fork-detection.md)
|
||||
is via a centralized
|
||||
monitor service that is trusted for liveness to aggregate data from
|
||||
current and past validators, but which produces a proof of misbehaviour (ie.
|
||||
via amnesia) that can be verified by anyone, including the blockchain.
|
||||
Validators must submit all the votes they saw for the relevant consensus
|
||||
height to justify their precommits. This is quite specific to the Tendermint
|
||||
protocol and may change if the protocol is upgraded. Hence it would be awkward
|
||||
to co-ordinate this from the app.
|
||||
|
||||
3) Evidence gossipping is similar to tx gossipping, but it should be higher
|
||||
priority. Since the mempool does not support any notion of priority yet,
|
||||
evidence is gossipped through a distinct Evidence reactor. If we just treated
|
||||
evidence like any other transaction, leaving it entirely to the application,
|
||||
Tendermint would have no way to know how to prioritize it, unless/until we
|
||||
significantly upgrade the mempool. Thus we would need to continue to treat evidence
|
||||
distinctly and update the ABCI to either support sending Evidence through
|
||||
CheckTx/DeliverTx, or to introduce new CheckEvidence/DeliverEvidence methods.
|
||||
In either case we'd need to make more changes to ABCI than if Tendermint
|
||||
handled things and we just added support for another evidence type that could be included
|
||||
in BeginBlock.
|
||||
|
||||
4) All ABCI application frameworks will benefit from most of the heavy lifting
|
||||
being handled by Tendermint, rather than each of them needing to re-implement
|
||||
all the evidence verification logic in each language.
|
||||
|
||||
Arguments in favor of moving evidence handling to the application:
|
||||
|
||||
5) Skipping lite clients require us to track the set of all validators that were
|
||||
bonded over some period in case validators that are unbonding but still
|
||||
slashable sign invalid headers to fool lite clients. The Cosmos-SDK
|
||||
staking/slashing modules track this, as it's used for slashing.
|
||||
Tendermint does not currently track this, though it does keep track of the
|
||||
validator set at every height. This leans in favour of managing evidence in
|
||||
the app to avoid redundantly managing the historical validator set data in
|
||||
Tendermint.
|
||||
|
||||
6) Applications supporting cross-chain validation will be required to process
|
||||
evidence from other chains. This data will come in the form of a transaction,
|
||||
but it means the app will be required to have all the functionality to process
|
||||
evidence, even if the evidence for its own chain is handled directly by
|
||||
Tendermint.
|
||||
|
||||
7) Evidence from lite clients may be large and constitute some form of DoS
|
||||
vector against full nodes. Putting it in transactions allows it to engage the application's fee
|
||||
mechanism to pay for the cost of execution in the event the evidence is false.
|
||||
This means the evidence submitter must be able to afford the fees for the
|
||||
submission, but of course it should be refunded if the evidence is valid.
|
||||
That said, the burden is mostly on full nodes, which don't necessarily benefit
|
||||
from fees.
|
||||
|
||||
|
||||
## Decision
|
||||
|
||||
The above mostly seems to suggest that evidence detection belongs in Tendermint.
|
||||
(5) does not impose particularly large obligations on Tendermint and (6) just
|
||||
means the app can use Tendermint libraries. That said, (7) is potentially
|
||||
cause for some concern, though it could still attack full nodes that weren't associated with validators
|
||||
(ie. that don't benefit from fees). This could be handled out of band, for instance by
|
||||
full nodes offering the light client service via payment channels or via some
|
||||
other payment service. This can also be mitigated by banning client IPs if they
|
||||
send bad data. Note the burden is on the client to actually send us a lot of
|
||||
data in the first place.
|
||||
|
||||
A separate ADR will describe how Tendermint will handle these new forms of
|
||||
evidence, in terms of how it will engage the monitoring protocol described in
|
||||
the [fork
|
||||
detection](https://github.com/tendermint/tendermint/blob/c67154232ca8be8f5c21dff65d154127adc4f7bb/docs/spec/consensus/fork-detection.md) document,
|
||||
and how it will track past validators and manage DoS issues.
|
||||
|
||||
## Status
|
||||
|
||||
Proposed.
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- No real changes to ABCI
|
||||
- Tendermint handles evidence for all apps
|
||||
|
||||
### Neutral
|
||||
|
||||
- Need to be careful about denial of service on the Tendermint RPC
|
||||
|
||||
### Negative
|
||||
|
||||
- Tendermint duplicates data by tracking all pubkeys that were validators during
|
||||
the unbonding period
|
||||
@@ -1,169 +0,0 @@
|
||||
# ADR 046: Lite Client Implementation
|
||||
|
||||
## Changelog
|
||||
* 13-02-2020: Initial draft
|
||||
* 26-02-2020: Cross-checking the first header
|
||||
* 28-02-2020: Bisection algorithm details
|
||||
* 31-03-2020: Verify signature got changed
|
||||
|
||||
## Context
|
||||
|
||||
A `Client` struct represents a light client, connected to a single blockchain.
|
||||
|
||||
The user has an option to verify headers using `VerifyHeader` or
|
||||
`VerifyHeaderAtHeight` or `Update` methods. The latter method downloads the
|
||||
latest header from primary and compares it with the currently trusted one.
|
||||
|
||||
```go
|
||||
type Client interface {
|
||||
// verify new headers
|
||||
VerifyHeaderAtHeight(height int64, now time.Time) (*types.SignedHeader, error)
|
||||
VerifyHeader(newHeader *types.SignedHeader, newVals *types.ValidatorSet, now time.Time) error
|
||||
Update(now time.Time) (*types.SignedHeader, error)
|
||||
|
||||
// get trusted headers & validators
|
||||
TrustedHeader(height int64) (*types.SignedHeader, error)
|
||||
TrustedValidatorSet(height int64) (valSet *types.ValidatorSet, heightUsed int64, err error)
|
||||
LastTrustedHeight() (int64, error)
|
||||
FirstTrustedHeight() (int64, error)
|
||||
|
||||
// query configuration options
|
||||
ChainID() string
|
||||
Primary() provider.Provider
|
||||
Witnesses() []provider.Provider
|
||||
|
||||
Cleanup() error
|
||||
}
|
||||
```
|
||||
|
||||
A new light client can either be created from scratch (via `NewClient`) or
|
||||
using the trusted store (via `NewClientFromTrustedStore`). When there's some
|
||||
data in the trusted store and `NewClient` is called, the light client will a)
|
||||
check if the stored header is more recent b) optionally ask the user whether it
|
||||
should rollback (no confirmation required by default).
|
||||
|
||||
```go
|
||||
func NewClient(
|
||||
chainID string,
|
||||
trustOptions TrustOptions,
|
||||
primary provider.Provider,
|
||||
witnesses []provider.Provider,
|
||||
trustedStore store.Store,
|
||||
options ...Option) (*Client, error) {
|
||||
```
|
||||
|
||||
`witnesses` as an argument (as opposed to an `Option`) is an intentional choice,
|
||||
made to increase security by default. At least one witness is required,
|
||||
although, right now, the light client does not check that primary != witness.
|
||||
When cross-checking a new header with witnesses, minimum number of witnesses
|
||||
required to respond: 1. Note the very first header (`TrustOptions.Hash`) is
|
||||
also cross-checked with witnesses for additional security.
|
||||
|
||||
Due to the nature of the bisection algorithm, some headers might be skipped. If the light
|
||||
client does not have a header for height `X` and `VerifyHeaderAtHeight(X)` or
|
||||
`VerifyHeader(H#X)` methods are called, these will perform either a) backwards
|
||||
verification from the latest header back to the header at height `X` or b)
|
||||
bisection verification from the first stored header to the header at height `X`.
|
||||
|
||||
`TrustedHeader`, `TrustedValidatorSet` only communicate with the trusted store.
|
||||
If some header is not there, an error will be returned indicating that
|
||||
verification is required.
|
||||
|
||||
```go
|
||||
type Provider interface {
|
||||
ChainID() string
|
||||
|
||||
SignedHeader(height int64) (*types.SignedHeader, error)
|
||||
ValidatorSet(height int64) (*types.ValidatorSet, error)
|
||||
}
|
||||
```
|
||||
|
||||
Provider is a full node usually, but can be another light client. The above
|
||||
interface is thin and can accommodate many implementations.
|
||||
|
||||
If provider (primary or witness) becomes unavailable for a prolonged period of
|
||||
time, it will be removed to ensure smooth operation.
|
||||
|
||||
Both `Client` and providers expose chain ID to track whether they are on the same
|
||||
chain. Note that when a chain upgrades or intentionally forks, its chain ID changes.
|
||||
|
||||
The light client stores headers & validators in the trusted store:
|
||||
|
||||
```go
|
||||
type Store interface {
|
||||
SaveSignedHeaderAndValidatorSet(sh *types.SignedHeader, valSet *types.ValidatorSet) error
|
||||
DeleteSignedHeaderAndValidatorSet(height int64) error
|
||||
|
||||
SignedHeader(height int64) (*types.SignedHeader, error)
|
||||
ValidatorSet(height int64) (*types.ValidatorSet, error)
|
||||
|
||||
LastSignedHeaderHeight() (int64, error)
|
||||
FirstSignedHeaderHeight() (int64, error)
|
||||
|
||||
SignedHeaderAfter(height int64) (*types.SignedHeader, error)
|
||||
|
||||
Prune(size uint16) error
|
||||
|
||||
Size() uint16
|
||||
}
|
||||
```
|
||||
|
||||
At the moment, the only implementation is the `db` store (wrapper around the KV
|
||||
database, used in Tendermint). In the future, remote adapters are possible
|
||||
(e.g. `Postgresql`).
|
||||
|
||||
```go
|
||||
func Verify(
|
||||
chainID string,
|
||||
trustedHeader *types.SignedHeader, // height=X
|
||||
trustedVals *types.ValidatorSet, // height=X or height=X+1
|
||||
untrustedHeader *types.SignedHeader, // height=Y
|
||||
untrustedVals *types.ValidatorSet, // height=Y
|
||||
trustingPeriod time.Duration,
|
||||
now time.Time,
|
||||
maxClockDrift time.Duration,
|
||||
trustLevel tmmath.Fraction) error {
|
||||
```
|
||||
|
||||
The pure function `Verify` is exposed for header verification. It handles both
|
||||
cases of adjacent and non-adjacent headers. In the former case, it compares the
|
||||
hashes directly (2/3+ signed transition). Otherwise, it verifies 1/3+
|
||||
(`trustLevel`) of trusted validators are still present in new validators.
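
The non-adjacent check can be illustrated with a small sketch. It assumes that "still present" is measured by voting power of trusted validators that signed the new header, and that the threshold is strict. The types `fraction` and `validator` and the function `overlapExceeds` are hypothetical names for this example, not the library's API.

```go
package main

import "fmt"

// fraction plays the role of trustLevel (for example 1/3); hypothetical type.
type fraction struct{ num, den int64 }

// validator is a simplified, hypothetical validator record.
type validator struct {
	address string
	power   int64
}

// overlapExceeds reports whether the trusted validators that also signed the
// new header hold strictly more than `level` of the trusted set's total power.
func overlapExceeds(trusted []validator, signers map[string]bool, level fraction) bool {
	var total, signed int64
	for _, v := range trusted {
		total += v.power
		if signers[v.address] {
			signed += v.power
		}
	}
	// signed/total > num/den, rearranged to stay in integer arithmetic.
	return signed*level.den > total*level.num
}

func main() {
	trusted := []validator{{"a", 10}, {"b", 10}, {"c", 10}}
	oneThird := fraction{1, 3}
	fmt.Println(overlapExceeds(trusted, map[string]bool{"a": true}, oneThird))
	fmt.Println(overlapExceeds(trusted, map[string]bool{"a": true, "b": true}, oneThird))
}
```

With three equally weighted validators, a single signer holds exactly 1/3 of the power and so does not pass a strict 1/3 threshold, while two signers do.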
|
||||
|
||||
While `Verify` function is certainly handy, `VerifyAdjacent` and
|
||||
`VerifyNonAdjacent` should be used most often to avoid logic errors.
|
||||
|
||||
### Bisection algorithm details
|
||||
|
||||
A non-recursive bisection algorithm was implemented despite the spec containing
|
||||
the recursive version. There are two major reasons:
|
||||
|
||||
1) Constant memory consumption => no risk of getting OOM (Out-Of-Memory) exceptions;
|
||||
2) Faster finality (see Fig. 1).
|
||||
|
||||
_Fig. 1: Differences between recursive and non-recursive bisections_
|
||||
|
||||

|
||||
|
||||
Specification of the non-recursive bisection can be found
|
||||
[here](https://github.com/tendermint/spec/blob/zm_non-recursive-verification/spec/consensus/light-client/non-recursive-verification.md).
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
* single `Client` struct, which is easy to use
|
||||
* flexible interfaces for header providers and trusted storage
|
||||
|
||||
### Negative
|
||||
|
||||
* `Verify` needs to be aligned with the current spec
|
||||
|
||||
### Neutral
|
||||
|
||||
* `Verify` function might be misused (called with non-adjacent headers in
|
||||
incorrectly implemented sequential verification)
|
||||
@@ -1,254 +0,0 @@
|
||||
# ADR 047: Handling evidence from light client
|
||||
|
||||
## Changelog
|
||||
* 18-02-2020: Initial draft
|
||||
* 24-02-2020: Second version
|
||||
* 13-04-2020: Add PotentialAmnesiaEvidence and a few remarks
|
||||
* 31-07-2020: Remove PhantomValidatorEvidence
|
||||
* 14-08-2020: Introduce light traces (listed now as an alternative approach)
|
||||
* 20-08-2020: Light client produces evidence when detected instead of passing to full node
|
||||
* 16-09-2020: Post-implementation revision
|
||||
* 15-03-2020: Amends for the case of a forward lunatic attack
|
||||
|
||||
### Glossary of Terms
|
||||
|
||||
- a `LightBlock` is the unit of data that a light client receives, verifies and stores.
|
||||
It is composed of a validator set, commit and header all at the same height.
|
||||
- a **Trace** is seen as an array of light blocks across a range of heights that were
|
||||
created as a result of skipping verification.
|
||||
- a **Provider** is a full node that a light client is connected to and serves the light
|
||||
client signed headers and validator sets.
|
||||
- `VerifySkipping` (sometimes known as bisection or verify non-adjacent) is a method the
|
||||
light client uses to verify a target header from a trusted header. The process involves verifying
|
||||
intermediate headers in between the two by making sure that 1/3 of the validators that signed
|
||||
the trusted header also signed the untrusted one.
|
||||
- **Light Bifurcation Point**: If the light client was to run `VerifySkipping` with two providers
|
||||
(i.e. a primary and a witness), the bifurcation point is the height that the headers
|
||||
from each of these providers are different yet valid. This signals that one of the providers
|
||||
may be trying to fool the light client.
|
||||
|
||||
## Context
|
||||
|
||||
The bisection method of header verification used by the light client exposes
|
||||
itself to a potential attack if any block within the light client's trusted period has
|
||||
a malicious group of validators with power that exceeds the light client's trust level
|
||||
(default is 1/3). To improve light client (and overall network) security, the light
|
||||
client has a detector component that compares the verified header provided by the
|
||||
primary against witness headers. This ADR outlines the process of mitigating attacks
|
||||
on the light client by using witness nodes to cross reference with.
|
||||
|
||||
## Alternative Approaches
|
||||
|
||||
A previously discussed approach to handling evidence was to pass all the data that the
|
||||
light client had witnessed when it had observed diverging headers for the full node to
|
||||
process. This was known as a light trace and had the following structure:
|
||||
|
||||
```go
|
||||
type ConflictingHeadersTrace struct {
|
||||
Headers []*types.SignedHeader
|
||||
}
|
||||
```
|
||||
|
||||
This approach has the advantage of not requiring as much processing on the light
|
||||
client side in the event that an attack happens. Although, this is not a significant
|
||||
difference as the light client would in any case have to validate all the headers
|
||||
from both witness and primary. Using traces would consume a large amount of bandwidth
|
||||
and adds a DDoS vector to the full node.
|
||||
|
||||
|
||||
## Decision
|
||||
|
||||
The light client will be divided into two components: a `Verifier` (either sequential or
|
||||
skipping) and a `Detector` (see [Informal's Detector](https://github.com/informalsystems/tendermint-rs/blob/master/docs/spec/lightclient/detection/detection.md)).
The detector will take the trace of headers from the primary and check it against all
|
||||
witnesses. For a witness with a diverging header, the detector will first verify the header
|
||||
by bisecting through all the heights defined by the trace that the primary provided. If valid,
|
||||
the light client will trawl through both traces and find the point of bifurcation where it
|
||||
can proceed to extract any evidence (as is discussed in detail later).
|
||||
|
||||
Upon successfully detecting the evidence, the light client will send it to both primary and
|
||||
witness before halting. It will not send evidence to other peers nor continue to verify the
|
||||
primary's header against any other header.
|
||||
|
||||
|
||||
## Detailed Design
|
||||
|
||||
The verification process of the light client will start from a trusted header and use a bisectional
|
||||
algorithm to verify up to a header at a given height. This becomes the verified header (does not
|
||||
mean that it is trusted yet). All headers that were verified in between are cached and known as
|
||||
intermediary headers and the entire array is sometimes referred to as a trace.
|
||||
|
||||
The light client's detector then takes all the headers and runs the detect function.
|
||||
|
||||
```golang
|
||||
func (c *Client) detectDivergence(primaryTrace []*types.LightBlock, now time.Time) error
|
||||
```
|
||||
|
||||
The function takes the last header it received (the target header) and compares it against all the witnesses
|
||||
it has through the following function:
|
||||
|
||||
```golang
|
||||
func (c *Client) compareNewHeaderWithWitness(errc chan error, h *types.SignedHeader,
|
||||
witness provider.Provider, witnessIndex int)
|
||||
```
|
||||
|
||||
The err channel is used to send back all the outcomes so that they can be processed in parallel.
|
||||
Invalid headers result in dropping the witness, lack of response or not having the headers is ignored
|
||||
just as headers that have the same hash. Headers, however,
|
||||
of a different hash then trigger the detection process between the primary and that particular witness.
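
The fan-out over witnesses can be sketched as below. This is an illustrative simplification, not the client's code: `witness`, `compareWithWitness`, and `countConflicts` are hypothetical names, headers are reduced to their hashes, and the "invalid header, drop the witness" outcome is left out for brevity.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

var (
	errNoHeader           = errors.New("witness has no header at this height")
	errConflictingHeaders = errors.New("witness returned a header with a different hash")
)

// witness returns the hash of its header at the given height (hypothetical).
type witness func(height int64) ([]byte, error)

// compareWithWitness reports exactly one outcome per witness on errc.
func compareWithWitness(errc chan<- error, height int64, hash []byte, w witness) {
	got, err := w(height)
	switch {
	case err != nil:
		errc <- nil // no response or missing header: ignored
	case bytes.Equal(got, hash):
		errc <- nil // same hash: nothing to do
	default:
		errc <- errConflictingHeaders // would trigger the detection process
	}
}

// countConflicts queries all witnesses in parallel and counts divergent ones.
func countConflicts(height int64, hash []byte, witnesses []witness) int {
	errc := make(chan error, len(witnesses))
	for _, w := range witnesses {
		go compareWithWitness(errc, height, hash, w)
	}
	conflicts := 0
	for range witnesses {
		if err := <-errc; errors.Is(err, errConflictingHeaders) {
			conflicts++
		}
	}
	return conflicts
}

func main() {
	same := func(int64) ([]byte, error) { return []byte("aa"), nil }
	different := func(int64) ([]byte, error) { return []byte("bb"), nil }
	silent := func(int64) ([]byte, error) { return nil, errNoHeader }
	fmt.Println(countConflicts(5, []byte("aa"), []witness{same, different, silent}))
}
```

The buffered channel lets every goroutine finish even if the collector stops reading early.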
|
||||
|
||||
This begins with verification of the witness's header via skipping verification which is run in tandem
|
||||
with locating the Light Bifurcation Point.
|
||||
|
||||

|
||||
|
||||
This is done with:
|
||||
|
||||
```golang
|
||||
func (c *Client) examineConflictingHeaderAgainstTrace(
|
||||
trace []*types.LightBlock,
|
||||
targetBlock *types.LightBlock,
|
||||
source provider.Provider,
|
||||
now time.Time,
|
||||
) ([]*types.LightBlock, *types.LightBlock, error)
|
||||
```
|
||||
|
||||
which performs the following
|
||||
|
||||
1. Checking that the trusted header is the same. Currently, they should not theoretically be different
|
||||
because witnesses cannot be added and removed after the client is initialized. But we do this anyway
|
||||
as a sanity check. If this fails we have to drop the witness.
|
||||
|
||||
2. Querying and verifying the witness's headers using bisection at the same heights of all the
|
||||
intermediary headers of the primary (In the above example this is A, B, C, D, F, H). If bisection fails
|
||||
or the witness stops responding then we can call the witness faulty and drop it.
|
||||
|
||||
3. We eventually reach a verified header by the witness which is not the same as the intermediary header
|
||||
(In the above example this is E). This is the point of bifurcation (This could also be the last header).
|
||||
|
||||
4. There is a unique case where the trace that is being examined against has blocks that have a greater
|
||||
height than the targetBlock. This can occur as part of a forward lunatic attack where the primary has
|
||||
provided a light block that has a height greater than the head of the chain (see Appendix B). In this
|
||||
case, the light client will verify the source's blocks up to the targetBlock and return the block in the
|
||||
trace that is directly after the targetBlock in height as the `ConflictingBlock`.
|
||||
|
||||
This function then returns the trace of blocks from the witness node between the common header and the
|
||||
divergent header of the primary as it is likely, as seen in the example to the right, that multiple
|
||||
headers were required in order to verify the divergent one. This trace will
|
||||
be used later (as is also described later in this document).
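
A simplified sketch of locating the bifurcation point follows. It only compares hashes at the heights of the primary's trace; the bisection verification of the witness's blocks (step 2) and the forward lunatic case (step 4) are deliberately omitted. `lightBlock`, `fetchFn`, and `findBifurcation` are hypothetical names for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// lightBlock is reduced to a height and a hash for this sketch.
type lightBlock struct {
	height int64
	hash   string
}

// fetchFn asks the witness for its block at a height (hypothetical).
type fetchFn func(height int64) (lightBlock, error)

// findBifurcation walks the primary's trace and returns the last block both
// providers agree on plus the first primary block the witness disagrees with.
func findBifurcation(primaryTrace []lightBlock, fromWitness fetchFn) (common, conflicting lightBlock, err error) {
	if len(primaryTrace) < 2 {
		return common, conflicting, errors.New("trace needs a trusted block and at least one more block")
	}
	for i, pb := range primaryTrace {
		wb, ferr := fromWitness(pb.height)
		if ferr != nil {
			return common, conflicting, fmt.Errorf("witness is faulty, drop it: %w", ferr)
		}
		if wb.hash == pb.hash {
			continue
		}
		if i == 0 {
			// Step 1: the trusted header itself must match.
			return common, conflicting, errors.New("trusted headers differ, drop the witness")
		}
		return primaryTrace[i-1], pb, nil
	}
	return common, conflicting, errors.New("no divergence found")
}

func main() {
	primary := []lightBlock{{1, "a"}, {4, "b"}, {7, "x"}, {10, "y"}}
	witnessChain := map[int64]string{1: "a", 4: "b", 7: "c", 10: "d"}
	fromWitness := func(h int64) (lightBlock, error) {
		hash, ok := witnessChain[h]
		if !ok {
			return lightBlock{}, errors.New("no block at height")
		}
		return lightBlock{h, hash}, nil
	}
	common, conflicting, err := findBifurcation(primary, fromWitness)
	fmt.Println(common.height, conflicting.height, err)
}
```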
|
||||
|
||||

|
||||
|
||||
Now that an attack has been detected, the light client must form evidence to prove it. There are
|
||||
three types of attacks that either the primary or witness could have done to try to fool the light client
|
||||
into verifying the wrong header: Lunatic, Equivocation and Amnesia. As the consequence is the same and
|
||||
the data required to prove it is also very similar, we bundle these attack styles together in a single
|
||||
evidence:
|
||||
|
||||
```golang
|
||||
type LightClientAttackEvidence struct {
|
||||
ConflictingBlock *LightBlock
|
||||
CommonHeight int64
|
||||
}
|
||||
```
|
||||
|
||||
The light client takes the stance of first suspecting the primary. Given the bifurcation point found
|
||||
above, it takes the two divergent headers and compares whether the one from the primary is valid with
|
||||
respect to the one from the witness. This is done by calling `isInvalidHeader()` which looks to see if
|
||||
any one of the deterministically derived header fields differ from one another. This could be one of
|
||||
`ValidatorsHash`, `NextValidatorsHash`, `ConsensusHash`, `AppHash`, and `LastResultsHash`.
|
||||
In this case we know it's a Lunatic attack and to help the witness verify it we send the height
|
||||
of the common header which is 1 in the example above or C in the example above that. If all these
|
||||
hashes are the same then we can infer that it is either Equivocation or Amnesia. In this case we send
|
||||
the height of the diverged headers because we know that the validator sets are the same, hence the
|
||||
malicious nodes are still bonded at that height. In the example above, this is height 10 and the
|
||||
example above that it is the height at E.
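
The decision described in this paragraph can be sketched as follows. This is a hypothetical re-creation for illustration, not the actual `isInvalidHeader()`; the `header` struct keeps only the five derived fields named above, with hashes shown as strings.

```go
package main

import "fmt"

// header keeps only the deterministically derived fields named in the text.
type header struct {
	Height             int64
	ValidatorsHash     string
	NextValidatorsHash string
	ConsensusHash      string
	AppHash            string
	LastResultsHash    string
}

// derivedFieldsDiffer reports whether any derived field differs between the
// two headers.
func derivedFieldsDiffer(a, b header) bool {
	return a.ValidatorsHash != b.ValidatorsHash ||
		a.NextValidatorsHash != b.NextValidatorsHash ||
		a.ConsensusHash != b.ConsensusHash ||
		a.AppHash != b.AppHash ||
		a.LastResultsHash != b.LastResultsHash
}

// evidenceHeight picks the height to attach to the evidence: the common
// height for a lunatic attack, otherwise the height of the diverged headers.
func evidenceHeight(commonHeight int64, fromWitness, conflicting header) (int64, string) {
	if derivedFieldsDiffer(fromWitness, conflicting) {
		return commonHeight, "lunatic"
	}
	return conflicting.Height, "equivocation or amnesia"
}

func main() {
	honest := header{Height: 10, ValidatorsHash: "v", NextValidatorsHash: "n", ConsensusHash: "c", AppHash: "app", LastResultsHash: "r"}
	forged := honest
	forged.AppHash = "forged"
	fmt.Println(evidenceHeight(1, honest, forged))
	fmt.Println(evidenceHeight(1, honest, honest))
}
```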
|
||||
|
||||
The light client now has the evidence and broadcasts it to the witness.
|
||||
|
||||
However, it could have been that the header the light client used from the witness against the primary
|
||||
was forged, so before halting the light client swaps the process and thus suspects the witness and
|
||||
uses the primary to create evidence. It calls `examineConflictingHeaderAgainstTrace` this time using
|
||||
the witness trace found earlier.
|
||||
If the primary was malicious it is likely that it will not respond but if it is innocent then the
|
||||
light client will produce the same evidence but this time the conflicting
|
||||
block will come from the witness node instead of the primary. The evidence is then formed and sent to
|
||||
the primary node.
|
||||
|
||||
This then ends the process and the verify function that was called at the start returns the error to
|
||||
the user.
|
||||
|
||||
For a detailed overview of how each of these three attacks can be conducted please refer to the
|
||||
[fork accountability spec](https://github.com/tendermint/tendermint/blob/master/spec/consensus/light-client/accountability.md).
|
||||
|
||||
## Full Node Verification
|
||||
|
||||
When a full node receives evidence from the light client it will need to verify
|
||||
it for itself before gossiping it to peers and trying to commit it on chain. This process is outlined
|
||||
in [ADR-059](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-059-evidence-composition-and-lifecycle.md).
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
* Light client has increased security against Lunatic, Equivocation and Amnesia attacks.
|
||||
* Do not need intermediate data structures to encapsulate the malicious behavior
|
||||
* Generalized evidence makes the code simpler
|
||||
|
||||
### Negative
|
||||
|
||||
* Breaking change on the light client from versions 0.33.8 and below. Previous
|
||||
versions will still send `ConflictingHeadersEvidence` but it won't be recognized
|
||||
by the full node. Light clients will however still refuse the header and shut down.
|
||||
* Amnesia attacks although detected, will not be able to be punished as it is not
|
||||
clear from the current information which nodes behaved maliciously.
|
||||
* Evidence module must handle both individual and grouped evidence.
|
||||
|
||||
### Neutral
|
||||
|
||||
## References
|
||||
|
||||
* [Fork accountability spec](https://github.com/tendermint/tendermint/blob/master/spec/consensus/light-client/accountability.md)
|
||||
* [ADR 056: Light client amnesia attacks](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-056-light-client-amnesia-attacks.md)
|
||||
* [ADR-059: Evidence Composition and Lifecycle](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-059-evidence-composition-and-lifecycle.md)
|
||||
* [Informal's Light Client Detector](https://github.com/informalsystems/tendermint-rs/blob/master/docs/spec/lightclient/detection/detection.md)
|
||||
|
||||
|
||||
## Appendix A
|
||||
|
||||
PhantomValidatorEvidence was used to capture when a validator that was still staked
|
||||
(i.e. within the bonded period) but was not in the current validator set had voted for a block.
|
||||
|
||||
In later discussions it was argued that although possible to keep phantom validator
|
||||
evidence, any phantom validator that could have the capacity to be involved
|
||||
in fooling a light client would have to be aided by 1/3+ lunatic validators.
|
||||
|
||||
It would also be very unlikely that the new validators injected by the lunatic attack
|
||||
would be validators that currently still have something staked.
|
||||
|
||||
Not only this but there was a large degree of extra computation required in storing all
|
||||
the currently staked validators that could possibly fall into the group of being
|
||||
a phantom validator. Given this, it was removed.
|
||||
|
||||
## Appendix B
|
||||
|
||||
A unique flavor of lunatic attack is a forward lunatic attack. This is where a malicious
|
||||
node provides a header with a height greater than the height of the blockchain. Thus there
|
||||
are no witnesses capable of rebutting the malicious header. Such an attack will also
|
||||
require an accomplice, i.e. at least one other witness to also return the same forged header.
|
||||
Although such attacks can be any arbitrary height ahead, they must still remain within the
|
||||
clock drift of the light client's real time. Therefore, to detect such an attack, a light
|
||||
client will wait for a time
|
||||
|
||||
```
|
||||
2 * MAX_CLOCK_DRIFT + LAG
|
||||
```
|
||||
|
||||
for a witness to provide the latest block it has. Given the time constraints, if the witness
|
||||
is operating at the head of the blockchain, it will have a header with an earlier height but
|
||||
a later timestamp. This can be used to prove that the primary has submitted a lunatic header
|
||||
which violates monotonically increasing time.
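
The check described in this appendix can be sketched as below. It is a minimal illustration under assumptions: `block`, `newBlock`, `detectionWait`, and `provesForwardLunatic` are hypothetical names, and `LAG` is passed in as a plain duration because the text does not define how it is derived.

```go
package main

import (
	"fmt"
	"time"
)

// block keeps only the fields needed for the check (hypothetical type).
type block struct {
	height    int64
	timestamp time.Time
}

// newBlock builds a block from a height and a Unix timestamp in seconds.
func newBlock(height, unixSeconds int64) block {
	return block{height: height, timestamp: time.Unix(unixSeconds, 0)}
}

// detectionWait is how long the light client waits for a witness:
// 2 * MAX_CLOCK_DRIFT + LAG.
func detectionWait(maxClockDrift, lag time.Duration) time.Duration {
	return 2*maxClockDrift + lag
}

// provesForwardLunatic reports whether the witness block shows that the
// primary's block violates monotonically increasing time: the witness block
// has an earlier height but a later timestamp.
func provesForwardLunatic(primary, witness block) bool {
	return witness.height < primary.height && witness.timestamp.After(primary.timestamp)
}

func main() {
	fmt.Println(detectionWait(10*time.Second, 5*time.Second))
	primary := newBlock(120, 1000)
	witness := newBlock(100, 1005)
	fmt.Println(provesForwardLunatic(primary, witness))
}
```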
|
||||
@@ -1,58 +0,0 @@
|
||||
# ADR 50: Improved Trusted Peering
|
||||
|
||||
## Changelog
|
||||
* 22-10-2019: Initial draft
|
||||
* 05-11-2019: Modify `maximum-dial-period` to `persistent-peers-max-dial-period`
|
||||
|
||||
## Context
|
||||
|
||||
When `max-num-inbound-peers` or `max-num-outbound-peers` of a node is reached, the node cannot spare more slots to any peer
|
||||
by inbound or outbound. Therefore, after a certain period of disconnection, any important peering can be lost indefinitely
|
||||
because all slots are consumed by other peers, and the node stops trying to dial the peer anymore.
|
||||
|
||||
This happens for two reasons: exponential backoff and the absence of an unconditional peering feature for trusted peers.
|
||||
|
||||
|
||||
## Decision
|
||||
|
||||
We would like to suggest solving this problem by introducing two parameters in `config.toml`, `unconditional-peer-ids` and
|
||||
`persistent-peers-max-dial-period`.
|
||||
|
||||
1) `unconditional-peer-ids`
|
||||
|
||||
A node operator inputs a list of IDs of peers that are allowed to connect, inbound or outbound, regardless of
|
||||
whether `max-num-inbound-peers` or `max-num-outbound-peers` of the node has been reached.
|
||||
|
||||
2) `persistent-peers-max-dial-period`
|
||||
|
||||
The interval between dials to each persistent peer will not exceed `persistent-peers-max-dial-period` during exponential backoff.
|
||||
Therefore, `dial-period` = min(`persistent-peers-max-dial-period`, `exponential-backoff-dial-period`)
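
As a rough illustration of the capped dial period, the sketch below assumes a simple doubling backoff starting from a base interval. The real backoff formula in Tendermint may differ; `dialPeriod` and its parameters are hypothetical names for this example.

```go
package main

import (
	"fmt"
	"time"
)

// dialPeriod returns min(maxPeriod, base * 2^attempt), where maxPeriod plays
// the role of persistent-peers-max-dial-period. It returns early once the cap
// is reached so the doubling cannot overflow.
func dialPeriod(attempt int, base, maxPeriod time.Duration) time.Duration {
	backoff := base
	for i := 0; i < attempt; i++ {
		backoff *= 2
		if backoff >= maxPeriod {
			return maxPeriod
		}
	}
	if backoff > maxPeriod {
		return maxPeriod
	}
	return backoff
}

func main() {
	for attempt := 0; attempt <= 5; attempt++ {
		fmt.Println(attempt, dialPeriod(attempt, time.Second, 10*time.Second))
	}
}
```

Under these assumptions, the wait between dials to a persistent peer never exceeds the configured maximum, no matter how many attempts have failed.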
|
||||
|
||||
Alternative approach
|
||||
|
||||
Persistent-peers is only for outbound, therefore it is not enough to cover the full utility of `unconditional-peer-ids`.
|
||||
[@creamers158](https://github.com/Creamers158) suggested putting id-only items into persistent-peers to be handled as
|
||||
`unconditional-peer-ids`, but that would need a very complicated struct exception to handle the different structure of items in persistent-peers.
|
||||
Therefore we decided to have `unconditional-peer-ids` to independently cover this use-case.
|
||||
|
||||
## Status
|
||||
|
||||
Proposed
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
A node operator can configure two new parameters in `config.toml` to ensure that Tendermint will allow connections
|
||||
from/to peers in `unconditional-peer-ids`. He/she can also ensure that every persistent peer will be dialed at least once in every
|
||||
`persistent-peers-max-dial-period` term. It achieves more stable and persistent peering for trusted peers.
|
||||
|
||||
### Negative
|
||||
|
||||
The new feature introduces two new parameters in `config.toml`, which need to be explained to node operators.
|
||||
|
||||
### Neutral
|
||||
|
||||
## References
|
||||
|
||||
* [Two p2p feature enhancement proposal](https://github.com/tendermint/tendermint/issues/4053)
|
||||
@@ -1,53 +0,0 @@
|
||||
# ADR 051: Double Signing Risk Reduction
|
||||
|
||||
## Changelog
|
||||
|
||||
* 27-11-2019: Initial draft
|
||||
* 13-01-2020: Separate into 2 ADRs. This ADR covers only Double Signing Protection; ADR-052 handles Tendermint Mode
|
||||
* 22-01-2020: change the title from "Double signing Protection" to "Double Signing Risk Reduction"
|
||||
|
||||
## Context
|
||||
|
||||
This ADR provides a risk reduction method for double-signing incidents caused by validator mistakes.
|
||||
- Validators often mistakenly run duplicate validator nodes, which causes double-signing incidents
|
||||
- The proposed feature reduces the risk of a mistaken double-signing incident by checking the recent N blocks before voting begins
|
||||
- Given the serious impact of a double-signing incident, it is reasonable to build multiple risk reduction algorithms into the node daemon
|
||||
|
||||
## Decision
|
||||
|
||||
We would like to suggest a double signing risk reduction method.
|
||||
|
||||
- Methodology: query recent consensus results to find out whether the node's consensus key has recently been used in consensus
|
||||
- When to check
|
||||
- When the state machine starts `ConsensusReactor` after fully synced
|
||||
- When the node is a validator (with privValidator)
|
||||
- When `cs.config.DoubleSignCheckHeight > 0`
|
||||
- How to check
|
||||
1. When a validator transitions from syncing status to fully synced status, the state machine checks the recent N blocks (`latest_height - double_sign_check_height`) to find out whether any consensus votes were signed with the validator's consensus key
|
||||
2. If votes from the validator's consensus key exist, exit the state machine program
|
||||
- Configuration
|
||||
- We would like to suggest introducing a `double_sign_check_height` parameter in `config.toml` and the cli, which sets how many blocks the state machine looks back to check for votes
|
||||
- <span v-pre>`double_sign_check_height = {{ .Consensus.DoubleSignCheckHeight }}`</span> in `config.toml`
|
||||
- `tendermint node --consensus.double_sign_check_height` in cli
|
||||
- The state machine skips the checking procedure when `double_sign_check_height == 0`
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Validators can avoid double-signing incidents caused by mistakes. (e.g. if another validator node is voting on consensus, starting a new validator node with the same consensus key will cause a panic stop of the state machine, because consensus votes with that consensus key are found in recent blocks)
|
||||
- We expect this method to prevent the majority of double-signing incidents caused by mistakes.
|
||||
|
||||
### Negative
|
||||
|
||||
- When the risk reduction method is on, restarting a validator node will panic because the node itself voted on consensus with the same consensus key. So, validators should stop the state machine, wait for some blocks, and then restart the state machine to avoid panic stop.
|
||||
|
||||
### Neutral
|
||||
|
||||
## References
|
||||
|
||||
- Issue [#4059](https://github.com/tendermint/tendermint/issues/4059) : double-signing protection
|
||||
@@ -1,85 +0,0 @@
|
||||
# ADR 052: Tendermint Mode
|
||||
|
||||
## Changelog
|
||||
|
||||
* 27-11-2019: Initial draft from ADR-051
|
||||
* 13-01-2020: Separate ADR Tendermint Mode from ADR-051
|
||||
* 29-03-2021: Update info regarding defaults
|
||||
|
||||
## Context
|
||||
|
||||
- Full mode: full mode does not have the capability to become a validator.
|
||||
- Validator mode: this mode is exactly the same as the existing state machine behavior. The node syncs without voting on consensus, and participates in consensus once fully synced
|
||||
- Seed mode: a lightweight seed node that maintains an address book over p2p, like [TenderSeed](https://gitlab.com/polychainlabs/tenderseed)
|
||||
|
||||
## Decision
|
||||
|
||||
We would like to suggest a simple Tendermint mode abstraction. These modes will live under one binary, and when initializing a node the user will be able to specify which node they would like to create.
|
||||
|
||||
- Which reactor, component to include for each node
|
||||
- full
|
||||
- switch, transport
|
||||
- reactors
|
||||
- mempool
|
||||
- consensus
|
||||
- evidence
|
||||
- blockchain
|
||||
- p2p/pex
|
||||
- statesync
|
||||
- rpc (safe connections only)
|
||||
- *~~no privValidator(priv_validator_key.json, priv_validator_state.json)~~*
|
||||
- validator
|
||||
- switch, transport
|
||||
- reactors
|
||||
- mempool
|
||||
- consensus
|
||||
- evidence
|
||||
- blockchain
|
||||
- p2p/pex
|
||||
- statesync
|
||||
- rpc (safe connections only)
|
||||
- with privValidator(priv_validator_key.json, priv_validator_state.json)
|
||||
- seed
|
||||
- switch, transport
|
||||
- reactor
|
||||
- p2p/pex
|
||||
- Configuration, cli command
|
||||
- We would like to suggest introducing a `mode` parameter in `config.toml` and the cli
|
||||
- <span v-pre>`mode = "{{ .BaseConfig.Mode }}"`</span> in `config.toml`
|
||||
- `tendermint start --mode validator` in cli
|
||||
- full | validator | seednode
|
||||
- There will be no default. Users will need to specify when they run `tendermint init`
|
||||
- RPC modification
|
||||
- `host:26657/status`
|
||||
- return empty `validator_info` when in full mode
|
||||
- no rpc server in seednode
|
||||
- Where to modify in codebase
|
||||
- Add switch for `config.Mode` on `node/node.go:DefaultNewNode`
|
||||
- If `config.Mode==validator`, call default `NewNode` (current logic)
|
||||
- If `config.Mode==full`, call `NewNode` with `nil` `privValidator` (do not load or generate one)
|
||||
- Need to add exception routine for `nil` `privValidator` to related functions
|
||||
- If `config.Mode==seed`, call `NewSeedNode` (seed node version of `node/node.go:NewNode`)
|
||||
- Need to add exception routine for `nil` `reactor`, `component` to related functions
|
||||
|
||||
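The mode switch described above can be sketched in Go as a function that maps a mode to the components a node would start. This is an illustration only: `nodePlan` and `planForMode` are hypothetical names, and the mode string `seed` follows the `config.Mode==seed` bullet (the cli list above spells it `seednode`, so the exact value is an assumption).

```go
package main

import "fmt"

// Mode names as used in the "Where to modify in codebase" bullets.
const (
	ModeFull      = "full"
	ModeValidator = "validator"
	ModeSeed      = "seed"
)

// nodePlan is a hypothetical summary of what a node starts in a given mode.
type nodePlan struct {
	Reactors          []string
	LoadPrivValidator bool
	ServeRPC          bool
}

// planForMode mirrors the component lists in this ADR.
func planForMode(mode string) (nodePlan, error) {
	all := []string{"mempool", "consensus", "evidence", "blockchain", "p2p/pex", "statesync"}
	switch mode {
	case ModeValidator:
		return nodePlan{Reactors: all, LoadPrivValidator: true, ServeRPC: true}, nil
	case ModeFull:
		// Same reactors as a validator, but no privValidator is loaded.
		return nodePlan{Reactors: all, ServeRPC: true}, nil
	case ModeSeed:
		// Only the PEX reactor, and no RPC server.
		return nodePlan{Reactors: []string{"p2p/pex"}}, nil
	default:
		return nodePlan{}, fmt.Errorf("unknown mode %q", mode)
	}
}

func main() {
	for _, m := range []string{ModeFull, ModeValidator, ModeSeed, "archive"} {
		plan, err := planForMode(m)
		fmt.Println(m, plan, err)
	}
}
```

Returning an error for an unknown mode, rather than falling back to a default, reflects the statement above that there is no default and users must choose a mode explicitly.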
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Node operators can choose mode when they run state machine according to the purpose of the node.
|
||||
- Modes can prevent mistakes because users have to specify which mode they want to run via a flag. (e.g. a user who wants to run a validator node must explicitly specify validator as the mode)
|
||||
- Different modes need different reactors, resulting in efficient resource usage.
|
||||
|
||||
### Negative
|
||||
|
||||
- Users need to study how each mode operates and which capabilities it has.
|
||||
|
||||
### Neutral
|
||||
|
||||
## References
|
||||
|
||||
- Issue [#2237](https://github.com/tendermint/tendermint/issues/2237) : Tendermint "mode"
|
||||
- [TenderSeed](https://gitlab.com/polychainlabs/tenderseed) : A lightweight Tendermint Seed Node.
|
||||
@@ -1,254 +0,0 @@
|
||||
# ADR 053: State Sync Prototype
|
||||
|
||||
State sync is now [merged](https://github.com/tendermint/tendermint/pull/4705). Up-to-date ABCI documentation is [available](https://github.com/tendermint/spec/pull/90), refer to it rather than this ADR for details.
|
||||
|
||||
This ADR outlines the plan for an initial state sync prototype, and is subject to change as we gain feedback and experience. It builds on discussions and findings in [ADR-042](./adr-042-state-sync.md), see that for background information.
|
||||
|
||||
## Changelog
|
||||
|
||||
* 2020-01-28: Initial draft (Erik Grinaker)
|
||||
|
||||
* 2020-02-18: Updates after initial prototype (Erik Grinaker)
|
||||
* ABCI: added missing `reason` fields.
|
||||
* ABCI: used 32-bit 1-based chunk indexes (was 64-bit 0-based).
|
||||
* ABCI: moved `RequestApplySnapshotChunk.chain_hash` to `RequestOfferSnapshot.app_hash`.
|
||||
* Gaia: snapshots must include node versions as well, both for inner and leaf nodes.
|
||||
* Added experimental prototype info.
|
||||
* Added open questions and implementation plan.
|
||||
|
||||
* 2020-03-29: Strengthened and simplified ABCI interface (Erik Grinaker)
|
||||
* ABCI: replaced `chunks` with `chunk_hashes` in `Snapshot`.
|
||||
* ABCI: removed `SnapshotChunk` message.
|
||||
* ABCI: renamed `GetSnapshotChunk` to `LoadSnapshotChunk`.
|
||||
* ABCI: chunks are now exchanged simply as `bytes`.
|
||||
* ABCI: chunks are now 0-indexed, for parity with `chunk_hashes` array.
|
||||
* Reduced maximum chunk size to 16 MB, and increased snapshot message size to 4 MB.
|
||||
|
||||
* 2020-04-29: Update with final released ABCI interface (Erik Grinaker)
|
||||
|
||||
## Context
|
||||
|
||||
State sync will allow a new node to receive a snapshot of the application state without downloading blocks or going through consensus. This bootstraps the node significantly faster than the current fast sync system, which replays all historical blocks.
|
||||
|
||||
Background discussions and justifications are detailed in [ADR-042](./adr-042-state-sync.md). Its recommendations can be summarized as:
|
||||
|
||||
* The application periodically takes full state snapshots (i.e. eager snapshots).
|
||||
|
||||
* The application splits snapshots into smaller chunks that can be individually verified against a chain app hash.
|
||||
|
||||
* Tendermint uses the light client to obtain a trusted chain app hash for verification.
|
||||
|
||||
* Tendermint discovers and downloads snapshot chunks in parallel from multiple peers, and passes them to the application via ABCI to be applied and verified against the chain app hash.
|
||||
|
||||
* Historical blocks are not backfilled, so state synced nodes will have a truncated block history.
|
||||
|
||||
## Tendermint Proposal
|
||||
|
||||
This describes the snapshot/restore process seen from Tendermint. The interface is kept as small and general as possible to give applications maximum flexibility.
|
||||
|
||||
### Snapshot Data Structure
|
||||
|
||||
A node can have multiple snapshots taken at various heights. Snapshots can be taken in different application-specified formats (e.g. MessagePack as format `1` and Protobuf as format `2`, or similarly for schema versioning). Each snapshot consists of multiple chunks containing the actual state data, for parallel downloads and reduced memory usage.
|
||||
|
||||
```proto
|
||||
message Snapshot {
|
||||
uint64 height = 1; // The height at which the snapshot was taken
|
||||
uint32 format = 2; // The application-specific snapshot format
|
||||
uint32 chunks = 3; // Number of chunks in the snapshot
|
||||
bytes hash = 4; // Arbitrary snapshot hash - should be equal only for identical snapshots
|
||||
bytes metadata = 5; // Arbitrary application metadata
|
||||
}
|
||||
```
|
||||
|
||||
Chunks are exchanged simply as `bytes`, and cannot be larger than 16 MB. `Snapshot` messages should be less than 4 MB.
|
||||
|
||||
### ABCI Interface
|
||||
|
||||
```proto
|
||||
// Lists available snapshots
|
||||
message RequestListSnapshots {}
|
||||
|
||||
message ResponseListSnapshots {
|
||||
repeated Snapshot snapshots = 1;
|
||||
}
|
||||
|
||||
// Offers a snapshot to the application
|
||||
message RequestOfferSnapshot {
|
||||
Snapshot snapshot = 1; // snapshot offered by peers
|
||||
bytes app_hash = 2; // light client-verified app hash for snapshot height
|
||||
}
|
||||
|
||||
message ResponseOfferSnapshot {
|
||||
Result result = 1;
|
||||
|
||||
enum Result {
|
||||
accept = 0; // Snapshot accepted, apply chunks
|
||||
abort = 1; // Abort all snapshot restoration
|
||||
reject = 2; // Reject this specific snapshot, and try a different one
|
||||
reject_format = 3; // Reject all snapshots of this format, and try a different one
|
||||
reject_sender = 4; // Reject all snapshots from the sender(s), and try a different one
|
||||
}
|
||||
}
|
||||
|
||||
// Loads a snapshot chunk
|
||||
message RequestLoadSnapshotChunk {
|
||||
uint64 height = 1;
|
||||
uint32 format = 2;
|
||||
uint32 chunk = 3; // Zero-indexed
|
||||
}
|
||||
|
||||
message ResponseLoadSnapshotChunk {
|
||||
bytes chunk = 1;
|
||||
}
|
||||
|
||||
// Applies a snapshot chunk
|
||||
message RequestApplySnapshotChunk {
|
||||
uint32 index = 1;
|
||||
bytes chunk = 2;
|
||||
string sender = 3;
|
||||
}
|
||||
|
||||
message ResponseApplySnapshotChunk {
|
||||
Result result = 1;
|
||||
repeated uint32 refetch_chunks = 2; // Chunks to refetch and reapply (regardless of result)
|
||||
repeated string reject_senders = 3; // Chunk senders to reject and ban (regardless of result)
|
||||
|
||||
enum Result {
|
||||
accept = 0; // Chunk successfully accepted
|
||||
abort = 1; // Abort all snapshot restoration
|
||||
retry = 2; // Retry chunk, combine with refetch and reject as appropriate
|
||||
retry_snapshot = 3; // Retry snapshot, combine with refetch and reject as appropriate
|
||||
reject_snapshot = 4; // Reject this snapshot, try a different one but keep sender rejections
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Taking Snapshots
|
||||
|
||||
Tendermint is not aware of the snapshotting process at all, it is entirely an application concern. The following guarantees must be provided:
|
||||
|
||||
* **Periodic:** snapshots must be taken periodically, not on-demand, for faster restores, lower load, and less DoS risk.
|
||||
|
||||
* **Deterministic:** snapshots must be deterministic, and identical across all nodes - typically by taking a snapshot at given height intervals.
|
||||
|
||||
* **Consistent:** snapshots must be consistent, i.e. not affected by concurrent writes - typically by using a data store that supports versioning and/or snapshot isolation.
|
||||
|
||||
* **Asynchronous:** snapshots must be asynchronous, i.e. not halt block processing and state transitions.
|
||||
|
||||
* **Chunked:** snapshots must be split into chunks of reasonable size (on the order of megabytes), and each chunk must be verifiable against the chain app hash.
|
||||
|
||||
* **Garbage collected:** snapshots must be garbage collected periodically.
|
||||
|
||||
### Restoring Snapshots
|
||||
|
||||
Nodes should have options for enabling state sync and/or fast sync, and be provided a trusted header hash for the light client.
|
||||
|
||||
When starting an empty node with state sync and fast sync enabled, snapshots are restored as follows:
|
||||
|
||||
1. The node checks that it is empty, i.e. that it has no state nor blocks.
|
||||
|
||||
2. The node contacts the given seeds to discover peers.
|
||||
|
||||
3. The node contacts a set of full nodes, and verifies the trusted block header using the given hash via the light client.
|
||||
|
||||
4. The node requests available snapshots via P2P from peers, via `RequestListSnapshots`. Peers will return the 10 most recent snapshots, one message per snapshot.
|
||||
|
||||
5. The node aggregates snapshots from multiple peers, ordered by height and format (in reverse). If there are mismatches between different snapshots, the one hosted by the largest number of peers is chosen. The node iterates over all snapshots in reverse order by height and format until it finds one that satisfies all of the following conditions:
|
||||
|
||||
* The snapshot height's block is considered trustworthy by the light client (i.e. snapshot height is greater than trusted header and within unbonding period of the latest trustworthy block).
|
||||
|
||||
* The snapshot's height or format hasn't been explicitly rejected by an earlier `RequestOfferSnapshot`.
|
||||
|
||||
* The application accepts the `RequestOfferSnapshot` call.
|
||||
|
||||
6. The node downloads chunks in parallel from multiple peers, via `RequestLoadSnapshotChunk`. Chunk messages cannot exceed 16 MB.
|
||||
|
||||
7. The node passes chunks sequentially to the app via `RequestApplySnapshotChunk`.
|
||||
|
||||
8. Once all chunks have been applied, the node compares the app hash to the chain app hash, and if they do not match it either errors or discards the state and starts over.
|
||||
|
||||
9. The node switches to fast sync to catch up blocks that were committed while restoring the snapshot.
|
||||
|
||||
10. The node switches to normal consensus mode.
|
||||
|
||||
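The snapshot selection in step 5 can be sketched in Go. This is a rough illustration, not the state sync reactor's actual code: the `snapshot` struct, its `Peers` counter, and the `acceptable` callback (standing in for the light client check, earlier rejections, and the `RequestOfferSnapshot` result) are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// snapshot is a simplified stand-in for the Snapshot message, plus a
// hypothetical count of how many peers advertised it.
type snapshot struct {
	Height uint64
	Format uint32
	Peers  int
}

// rankSnapshots orders candidates by height and format in reverse, and prefers
// the snapshot hosted by more peers when height and format are equal.
func rankSnapshots(in []snapshot) []snapshot {
	out := append([]snapshot(nil), in...) // copy, leave the input untouched
	sort.SliceStable(out, func(i, j int) bool {
		a, b := out[i], out[j]
		if a.Height != b.Height {
			return a.Height > b.Height
		}
		if a.Format != b.Format {
			return a.Format > b.Format
		}
		return a.Peers > b.Peers
	})
	return out
}

// pickSnapshot returns the first ranked snapshot accepted by the callback.
func pickSnapshot(in []snapshot, acceptable func(snapshot) bool) (snapshot, bool) {
	for _, s := range rankSnapshots(in) {
		if acceptable(s) {
			return s, true
		}
	}
	return snapshot{}, false
}

func main() {
	candidates := []snapshot{
		{Height: 1000, Format: 1, Peers: 4},
		{Height: 2000, Format: 1, Peers: 2},
		{Height: 2000, Format: 2, Peers: 3},
	}
	// Pretend the application rejected format 2 in an earlier offer.
	chosen, ok := pickSnapshot(candidates, func(s snapshot) bool { return s.Format != 2 })
	fmt.Println(chosen, ok)
}
```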
## Gaia Proposal
|
||||
|
||||
This describes the snapshot process seen from Gaia, using format version `1`. The serialization format is unspecified, but likely to be compressed Amino or Protobuf.
|
||||
|
||||
### Snapshot Metadata
|
||||
|
||||
In the initial version there is no snapshot metadata, so it is set to an empty byte buffer.
|
||||
|
||||
Once all chunks have been successfully built, snapshot metadata should be stored in a database and served via `RequestListSnapshots`.
|
||||
|
||||
### Snapshot Chunk Format
|
||||
|
||||
The Gaia data structure consists of a set of named IAVL trees. A root hash is constructed by taking the root hashes of each of the IAVL trees, then constructing a Merkle tree of the sorted name/hash map.
|
||||
|
||||
IAVL trees are versioned, but a snapshot only contains the version relevant for the snapshot height. All historical versions are ignored.
|
||||
|
||||
IAVL trees are insertion-order dependent, so key/value pairs must be set in an appropriate insertion order to produce the same tree branching structure. This insertion order can be found by doing a breadth-first scan of all nodes (including inner nodes) and collecting unique keys in order. However, the node hash also depends on the node's version, so snapshots must contain the inner nodes' version numbers as well.
|
||||
|
||||
For the initial prototype, each chunk consists of a complete dump of all node data for all nodes in an entire IAVL tree. Thus the number of chunks equals the number of persistent stores in Gaia. No incremental verification of chunks is done, only a final app hash comparison at the end of the snapshot restoration.
|
||||
|
||||
For a production version, it should be sufficient to store key/value/version for all nodes (leaf and inner) in insertion order, chunked in some appropriate way. If per-chunk verification is required, the chunk must also contain enough information to reconstruct the Merkle proofs all the way up to the root of the multistore, e.g. by storing a complete subtree's key/value/version data plus Merkle hashes of all other branches up to the multistore root. The exact approach will depend on tradeoffs between size, time, and verification. IAVL RangeProofs are not recommended, since these include redundant data such as proofs for intermediate and leaf nodes that can be derived from the above data.
|
||||
|
||||
Chunks should be built greedily by collecting node data up to some size limit (e.g. 10 MB) and serializing it. Chunk data is stored in the file system as `snapshots/<height>/<format>/<chunk>`, and a SHA-256 checksum is stored along with the snapshot metadata.
|
||||
|
||||
### Snapshot Scheduling
|
||||
|
||||
Snapshots should be taken at some configurable height interval, e.g. every 1000 blocks. All nodes should preferably have the same snapshot schedule, such that all nodes can serve chunks for a given snapshot.
|
||||
|
||||
Taking consistent snapshots of IAVL trees is greatly simplified by them being versioned: simply snapshot the version that corresponds to the snapshot height, while concurrent writes create new versions. IAVL pruning must not prune a version that is being snapshotted.
|
||||
|
||||
Snapshots must also be garbage collected after some configurable time, e.g. by keeping the latest `n` snapshots.
|
||||
|
||||
## Resolved Questions
|
||||
|
||||
* Is it OK for state-synced nodes to not have historical blocks nor historical IAVL versions?
|
||||
|
||||
> Yes, this is as intended. Maybe backfill blocks later.
|
||||
|
||||
* Do we need incremental chunk verification for first version?
|
||||
|
||||
> No, we'll start simple. Can add chunk verification via a new snapshot format without any breaking changes in Tendermint. For adversarial conditions, maybe consider support for whitelisting peers to download chunks from.
|
||||
|
||||
* Should the snapshot ABCI interface be a separate optional ABCI service, or mandatory?
|
||||
|
||||
> Mandatory, to keep things simple for now. It will therefore be a breaking change and push the release. For apps using the Cosmos SDK, we can provide a default implementation that does not serve snapshots and errors when trying to apply them.
|
||||
|
||||
* How can we make sure `ListSnapshots` data is valid? An adversary can provide fake/invalid snapshots to DoS peers.
|
||||
|
||||
> For now, just pick snapshots that are available on a large number of peers. Maybe support whitelisting. We may consider e.g. placing snapshot manifests on the blockchain later.
|
||||
|
||||
* Should we punish nodes that provide invalid snapshots? How?
|
||||
|
||||
> No, these are full nodes not validators, so we can't punish them. Just disconnect from them and ignore them.
|
||||
|
||||
* Should we call these snapshots? The SDK already uses the term "snapshot" for `PruningOptions.SnapshotEvery`, and state sync will introduce additional SDK options for snapshot scheduling and pruning that are not related to IAVL snapshotting or pruning.
|
||||
|
||||
> Yes. Hopefully these concepts are distinct enough that we can refer to state sync snapshots and IAVL snapshots without too much confusion.
|
||||
|
||||
* Should we store snapshot and chunk metadata in a database? Can we use the database for chunks?
|
||||
|
||||
> As a first approach, store metadata in a database and chunks in the filesystem.
|
||||
|
||||
* Should a snapshot at height H be taken before or after the block at H is processed? E.g. RPC `/commit` returns app_hash after _previous_ height, i.e. _before_ current height.
|
||||
|
||||
> After commit.
|
||||
|
||||
* Do we need to support all versions of blockchain reactor (i.e. fast sync)?
|
||||
|
||||
> We should remove the v1 reactor completely once v2 has stabilized.
|
||||
|
||||
* Should `ListSnapshots` be a streaming API instead of a request/response API?
|
||||
|
||||
> No, just use a max message size.
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## References
|
||||
|
||||
* [ADR-042](./adr-042-state-sync.md) and its references
|
||||
@@ -1,71 +0,0 @@
|
||||
# ADR 054: Crypto encoding (part 2)
|
||||
|
||||
## Changelog
|
||||
|
||||
2020-2-27: Created
|
||||
2020-4-16: Update
|
||||
|
||||
## Context
|
||||
|
||||
Amino has been a pain point of many users in the ecosystem. While Tendermint does not suffer greatly from the performance degradation introduced by amino, we are making an effort in moving the encoding format to a widely adopted format, [Protocol Buffers](https://developers.google.com/protocol-buffers). With this migration a new standard is needed for the encoding of keys. This will cause ecosystem wide breaking changes.
|
||||
|
||||
Currently amino encodes keys as `<PrefixBytes> <Length> <ByteArray>`.
|
||||
|
||||
## Decision
|
||||
|
||||
Previously Tendermint defined all the key types for use in Tendermint and the Cosmos-SDK. Going forward the Cosmos-SDK will define its own protobuf type for keys. This will allow Tendermint to only define the keys that are being used in the codebase (ed25519).
|
||||
There is the opportunity to only define the usage of ed25519 (`bytes`) and not have it be a `oneof`, but this would mean that the `oneof` work is only being postponed to a later date. When using the `oneof` protobuf type we will have to manually switch over the possible key types and then pass them to the interface which is needed.
|
||||
|
||||
The approach that will be taken to minimize headaches for users is one where all encoding of keys will shift to protobuf and where amino encoding is relied on, there will be custom marshal and unmarshal functions.
|
||||
|
||||
Protobuf messages:
|
||||
|
||||
```proto
|
||||
message PubKey {
|
||||
oneof key {
|
||||
bytes ed25519 = 1;
|
||||
}
}
|
||||
|
||||
message PrivKey {
|
||||
oneof sum {
|
||||
bytes ed25519 = 1;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
> Note: The places where backwards compatibility is needed are still unclear.
|
||||
|
||||
All modules currently do not rely on amino encoded bytes and keys are not amino encoded for genesis, therefore a hardfork upgrade is what will be needed to adopt these changes.
|
||||
|
||||
This work will be broken out into a few PRs and merged into a proto-breakage branch. All PRs will be reviewed prior to being merged:
|
||||
|
||||
1. Encoding of keys to protobuf and protobuf messages
|
||||
2. Move Tendermint types to protobuf, mainly the ones that are being encoded.
|
||||
3. Go one by one through the reactors and transition amino encoded messages to protobuf.
|
||||
4. Test with cosmos-sdk and/or testnets repo.
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
- Move keys to protobuf encoding, where backwards compatibility is needed, amino marshal and unmarshal functions will be used.
|
||||
|
||||
### Positive
|
||||
|
||||
- Protocol Buffer encoding will not change going forward.
|
||||
- Removing amino overhead from keys will help with the KMS.
|
||||
- Have a large ecosystem of supported languages.
|
||||
|
||||
### Negative
|
||||
|
||||
- Hardfork is required to integrate this into running chains.
|
||||
|
||||
### Neutral
|
||||
|
||||
## References
|
||||
|
||||
> Are there any relevant PR comments, issues that led up to this, or articles referenced for why we made the given design choice? If so link them here!
|
||||
|
||||
- {reference link}
|
||||
@@ -1,61 +0,0 @@
|
||||
# ADR 055: Protobuf Design
|
||||
|
||||
## Changelog
|
||||
|
||||
- 2020-4-15: Created (@marbar3778)
|
||||
- 2020-6-18: Updated (@marbar3778)
|
||||
|
||||
## Context
|
||||
|
||||
Currently we use [go-amino](https://github.com/tendermint/go-amino) throughout Tendermint. Amino is not being maintained anymore (April 15, 2020) by the Tendermint team and has been found to have issues:
|
||||
|
||||
- https://github.com/tendermint/go-amino/issues/286
|
||||
- https://github.com/tendermint/go-amino/issues/230
|
||||
- https://github.com/tendermint/go-amino/issues/121
|
||||
|
||||
These are a few of the known issues that users could run into.
|
||||
|
||||
Amino enables quick prototyping and development of features. While this is nice, amino does not provide the performance and developer convenience that is expected. For Tendermint to see wider adoption as a BFT protocol engine a transition to an adopted encoding format is needed. Below are some possible options that can be explored.
|
||||
|
||||
There are a few options to pick from:
|
||||
|
||||
- `Protobuf`: Protocol buffers are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler. It is supported in countless languages and has been proven in production for many years.
|
||||
|
||||
- `FlatBuffers`: FlatBuffers is an efficient cross platform serialization library. Flatbuffers are more efficient than Protobuf due to the fact that there is no parsing/unpacking to a second representation. FlatBuffers has been tested and used in production but is not widely adopted.
|
||||
|
||||
- `CapnProto`: Cap’n Proto is an insanely fast data interchange format and capability-based RPC system. Cap'n Proto does not have an encoding/decoding step. It has not seen wide adoption throughout the industry.
|
||||
|
||||
- @erikgrinaker - https://github.com/tendermint/tendermint/pull/4623#discussion_r401163501
|
||||
```
|
||||
Cap'n'Proto is awesome. It was written by one of the original Protobuf developers to fix some of its issues, and supports e.g. random access to process huge messages without loading them into memory and an (opt-in) canonical form which would be very useful when determinism is needed (e.g. in the state machine). That said, I suspect Protobuf is the better choice due to wider adoption, although it makes me kind of sad since Cap'n'Proto is technically better.
|
||||
```
|
||||
|
||||
## Decision
|
||||
|
||||
Transition Tendermint to Protobuf because of its performance and tooling. The ecosystem behind Protobuf is vast and has outstanding [support for many languages](https://developers.google.com/protocol-buffers/docs/tutorials).
|
||||
|
||||
We will make this possible by keeping the current types in their current form (handwritten) and creating a `/proto` directory in which all the `.proto` files will live. Where encoding is needed, on disk and over the wire, we will call util functions that convert the handwritten Go types to protobuf-generated types. This is in line with the recommended file structure from [buf](https://buf.build). You can find more information on this file structure [here](https://buf.build/docs/lint-checkers#file_layout).
|
||||
|
||||
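The conversion helpers mentioned above could follow a pattern like this Go sketch. Everything here is illustrative: `Checkpoint` is a made-up handwritten type, `CheckpointProto` stands in for a protobuf-generated struct, and the `ToProto` / `CheckpointFromProto` names and the validation rule are assumptions rather than the actual Tendermint functions.

```go
package main

import (
	"errors"
	"fmt"
)

// Checkpoint is a hypothetical handwritten domain type.
type Checkpoint struct {
	Height int64
	Hash   []byte
}

// CheckpointProto stands in for the protobuf-generated counterpart that
// would live under the /proto directory.
type CheckpointProto struct {
	Height int64
	Hash   []byte
}

// ToProto converts the domain type into its wire representation.
func (c Checkpoint) ToProto() *CheckpointProto {
	return &CheckpointProto{
		Height: c.Height,
		Hash:   append([]byte(nil), c.Hash...), // copy so the two values do not alias
	}
}

// CheckpointFromProto converts and validates data read from disk or the wire.
func CheckpointFromProto(p *CheckpointProto) (Checkpoint, error) {
	if p == nil {
		return Checkpoint{}, errors.New("nil CheckpointProto")
	}
	if p.Height < 0 {
		return Checkpoint{}, fmt.Errorf("negative height %d", p.Height)
	}
	return Checkpoint{Height: p.Height, Hash: append([]byte(nil), p.Hash...)}, nil
}

func main() {
	in := Checkpoint{Height: 42, Hash: []byte{0xde, 0xad}}
	out, err := CheckpointFromProto(in.ToProto())
	fmt.Println(out.Height, out.Hash, err)
}
```

Keeping validation in the `FromProto` direction means generated types stay plain data carriers, while invariants are enforced once at the boundary where untrusted bytes enter the handwritten types.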
By going with this design we will enable future changes to types and allow for a more modular codebase.
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Allows for modular types in the future
|
||||
- Less refactoring
|
||||
- Allows the proto files to be pulled into the spec repo in the future.
|
||||
- Performance
|
||||
- Tooling & support in multiple languages
|
||||
|
||||
### Negative
|
||||
|
||||
- When a developer is updating a type they need to make sure to update the proto type as well
|
||||
|
||||
### Neutral
|
||||
|
||||
## References
|
||||
@@ -1,170 +0,0 @@
|
||||
# ADR 056: Light client amnesia attacks
|
||||
|
||||
## Changelog
|
||||
|
||||
- 02.04.20: Initial Draft
|
||||
- 06.04.20: Second Draft
|
||||
- 10.06.20: Post Implementation Revision
|
||||
- 19.08.20: Short Term Amnesia Alteration
|
||||
- 01.10.20: Status of Amnesia for 0.34
|
||||
|
||||
## Context
|
||||
|
||||
Whilst most evidence of malicious behavior is self-evident, such that any individual can verify it independently, there are types of evidence, known collectively as global evidence, that require further collaboration from the network to accumulate enough information to create evidence that is individually verifiable and can therefore be processed through consensus. [Fork Accountability](https://github.com/tendermint/tendermint/blob/master/spec/consensus/light-client/accountability.md) has been coined to describe the entire process of detecting, proving and punishing malicious behavior. This ADR addresses specifically what a light client amnesia attack is, how it can be proven, and the current decision around handling light client amnesia attacks. For information on evidence handling by the light client, it is recommended to read [ADR 47](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-047-handling-evidence-from-light-client.md).
|
||||
|
||||
### Amnesia Attack
|
||||
|
||||
The schematic below explains a scenario where an amnesia attack can occur such that two sets of honest nodes, C1 and C2, commit different blocks.
|
||||
|
||||

|
||||
|
||||
1. C1 and F send PREVOTE messages for block A.
|
||||
2. C1 sends PRECOMMIT for round 1 for block A.
|
||||
3. A new round is started, C2 and F send PREVOTE messages for a different block B.
|
||||
4. C2 and F then send PRECOMMIT messages for block B.
|
||||
5. F later on creates PRECOMMITS for block A and combines it with those from C1 to form a block
|
||||
|
||||
|
||||
This forged block can then be used to fool light clients trying to verify it. It must be stressed that there are a few more hurdles or dimensions to the attack to consider. For a more detailed walkthrough refer to Appendix A.
|
||||
|
||||
## Decision
|
||||
|
||||
The decision surrounding amnesia attacks has both a short term and long term component. In the long term, a more sturdy protocol will need to be fleshed out and implemented. There are already draft documents outlining what such a protocol would look like and the resources it would require (see references). Prior revisions, however, outlined a protocol which had been implemented (see Appendix B). It was agreed that it still required greater consideration and review given its importance. It was therefore discussed, with the limited time frame set before 0.34, whether the protocol should be completely removed or whether some logic for handling the aforementioned scenarios should remain.
|
||||
|
||||
The latter of the two options meant storing a record of all votes in any height with which there was more than one round. This information would then be accessible for applications if they wanted to perform some off-chain verification and punishment.
|
||||
|
||||
In summary, this seemed like too much to ask of the application to implement only on a temporary basis, whilst not having the domain specific knowledge and considering such a difficult and unlikely attack. Therefore the short term decision is to identify when the attack has occurred and implement the detector algorithm highlighted in [ADR 47](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-047-handling-evidence-from-light-client.md) but to not implement any accountability protocol that would identify malicious validators and allow applications to punish them. This will hopefully change in the long term with the focus on eventually reaching a concrete and secure protocol for identifying and dealing with these attacks.
|
||||
|
||||
## Implications
|
||||
|
||||
- Light clients will still be able to detect amnesia attacks so long as the assumption of having at least one correct witness holds
|
||||
- Light clients will gossip the attack to witnesses and halt thus failing to validate the incorrect block (and therefore not being fooled)
|
||||
- Validators will propose and commit evidence of the amnesia attack on chain
|
||||
- No evidence will be passed to the application indicting any malicious validators, thus meaning that no malicious validators will be punished for performing the attack
|
||||
- If a light client's bubble of providers is entirely faulty, the light client will falsely validate amnesia attacks as well as any other 1/3+ light client attack.
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
Light clients are still able to prevent falsely validating a block.
|
||||
|
||||
Already implemented.
|
||||
|
||||
### Negative
|
||||
|
||||
Light clients where all witnesses are faulty can be subject to an amnesia attack and verify a forged block that is not part of the chain.
|
||||
|
||||
### Neutral
|
||||
|
||||
|
||||
## References
|
||||
|
||||
- [Fork accountability algorithm](https://docs.google.com/document/d/11ZhMsCj3y7zIZz4udO9l25xqb0kl7gmWqNpGVRzOeyY/edit)
|
||||
- [Fork accountability spec](https://github.com/tendermint/tendermint/blob/master/spec/consensus/light-client/accountability.md)
|
||||
|
||||
## Appendix A: Detailed Walkthrough of Performing a Light Client Amnesia Attack
|
||||
|
||||
As the attacker, a prerequisite to this attack is first to observe or attempt to craft a block where a subset (less than ⅓) of correct validators sent precommit votes for a proposal in an earlier round and later received ⅔ prevotes for a different proposal thus changing their lock and correctly sending precommit votes (and later committing) for the proposal in the latter round. The second prerequisite is to have at least ⅓ validating power in that height (or enough voting power to have ⅔+ when combined with the precommits of the earlier round).
|
||||
|
||||
To go back to how one may craft such a block, we begin with one of the validators in this cabal being the proposer. They propose a block with all the txs that they want to fool a light client with. The proposer then only relays this to the members of their cabal and a controlled subset of correct validators (less than ⅓). We will call ourselves f for faulty and c1 for this correct subset.
|
||||
|
||||
Attackers need to rely on the assistance of some form of a network partition or on the nature of the sporadic voting to conjure their desired environment. The attackers need at least ⅓ of the validating power of the remaining correct validators, we shall denote this as c2, to not see ⅔ prevotes and thus not be locked on a block when it comes to the next round. If we have less than ⅓ remaining validators that don’t see this first proposal, then we will not have enough voting power to reach ⅔+ prevotes (the sum of f and c2) in the following round and thus change the lock of c1 such that we correctly commit the block in the latter round yet have enough precommits in the earlier round to fool the light client. Remember this is our desired scenario: to save all these precommit votes for a different (in this case earlier) proposed block.
|
||||
|
||||
To try to break this down even further let’s go back to the first round. F sends c1 a proposal (and not c2), c1 in turn sends their prevotes to all whom they are connected to. This means that some will be received by c2. F then sends their prevotes just to c1. Now not all validators in c1 may be connected to each other, so perhaps some validators in c1 might not receive ⅔ (from their own cohort and from f) and thus not precommit. In other situations we may see a validator in c2 connected to all validators in c1. Therefore they too will receive ⅔ prevotes and thus precommit. We can conclude therefore that although targeting this c1 subset of validators, those that actually precommit may be somewhat different. The key is for the attackers to observe the n amount of precommits they need in round 1 where n is ⅔+ - f, whilst ensuring that n itself does not go over ⅓. If it does then less than ⅔ validators remain to be able to change the lock and commit the block in the later round.
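
The arithmetic behind n can be made concrete with a small sketch. It is only an illustration under stated assumptions: voting power is modelled as plain integers, "⅔+" is read as the smallest power strictly greater than two thirds of the total, and the names `quorum`, `precommitsNeeded` and `feasible` are hypothetical helpers that do not exist in Tendermint.

```go
package main

import "fmt"

// quorum returns the smallest voting power strictly greater than 2/3 of total
// (an assumed reading of "⅔+").
func quorum(total int64) int64 {
	return total*2/3 + 1
}

// precommitsNeeded returns n, the round-one precommit power the cabal must
// observe from correct validators: n = ⅔+ - f.
func precommitsNeeded(total, faulty int64) int64 {
	n := quorum(total) - faulty
	if n < 0 {
		return 0
	}
	return n
}

// feasible reports whether n stays at or below one third of the total power,
// the bound the walkthrough says must not be exceeded.
func feasible(total, faulty int64) bool {
	return precommitsNeeded(total, faulty)*3 <= total
}

func main() {
	// Hypothetical network with 100 units of voting power.
	for _, f := range []int64{30, 34, 40, 60} {
		fmt.Printf("f=%d n=%d feasible=%t\n", f, precommitsNeeded(100, f), feasible(100, f))
	}
}
```

With 100 units of total power, a cabal holding 40 needs n = 27, which is within the bound, whereas a cabal holding only 30 would need n = 37 and exceeds it. As the cabal's power grows, n shrinks.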
|
||||
|
||||
An extra dimension to this puzzle is the timeouts. Whilst c1 is relaying votes to its peers and these validators count closer towards the ⅔ threshold needed to send their precommit votes, the timeout could be reached at any moment, and the nodes will then precommit nil and ignore any late prevote messages.
|
||||
|
||||
This is all to say that such an attack is partly out of the attackers' hands. All they can do is tweak the subset of validators to which they first choose to gossip the proposal and modify the timings around when they send their prevotes until they reach the desired precondition: n precommits for an earlier proposal and ⅔ precommits for the later proposal. So this is up to the gods of non-deterministic behavior to help them out with their plight. I’m not going to allocate the hours to calculate the probability, but it could take on the order of thousands of blocks of trying before the precondition is met.
|
||||
|
||||
Obviously, the probability becomes substantially higher as the cabal’s voting power nears ⅔. This is because both n decreases and there is greater tolerance to send prevotes to a greater amount of validators without going overboard and reaching the ⅓ precommit threshold in the first round which would mean they would have to try again.
|
||||
|
||||
Once we’ve got our n, we can then forge the remaining signatures for that block (from the f) and bundle them all together and tada we have a forged signed header.
|
||||
|
||||
Now we’ve done that, it’s time to find some light clients to fool.
|
||||
|
||||
Also critical to this type of attack is that the light client that is connected to our nodes must request a light block at that specific height with which we forged this signed header but this shouldn’t be hard to do. To bring this back to a real context, say our faulty cabal, f, bought some groceries using atoms and then wanted to prove that they did, the grocery owner whips out their phone, runs the light client and f tells them the height they committed the transaction.
|
||||
|
||||
An important note here is that because the validator sets are the same between the canonical and the forged block, this attack also works on light clients that verify sequentially. In fact, they are especially vulnerable because they currently don’t run the detector function afterwards.
|
||||
|
||||
However, if our grocery owner verifies using the skipping algorithm, they will then run the detector and therefore they will compare with other witness nodes. Ideally for our attackers, if f has a lot of nodes exposing their rpc endpoints, then there is a chance that all the witnesses the light client has are faulty and thus we have a successful attack and the grocery owner has been fooled into handing f a few apples and carrots.
|
||||
|
||||
However, there is a greater chance, especially if the light client is connected to quite a few other nodes that a divergence will be detected. The light client will figure out there was an amnesia attack and send the evidence to the witness to commit on chain. The grocery owner will see that verification failed and won't hand over the apples or carrots but also f won't be punished for their villainous behavior. This means that they can go over to the hairdressers and see if they can pull off the same stunt again.
|
||||
|
||||
So this brings to the fore the current defenses that are in place. As long as there has not been a cabal of validators with greater than 1/3 power (or the trust level), the light client's verification algorithm will prevent any attempts to deceive it. Greater than this threshold and we rely on the detector as a second layer of defense to pick up on any attack. Its security is chiefly tied to the assumption that at least one of the witnesses is correct. If this fails then, as illustrated above, the light client can be susceptible to amnesia (as well as equivocation and lunatic) attacks.
|
||||
|
||||
The outstanding problem, if we indeed consider it big enough to be one, therefore lies in the incentivisation mechanism which is how f and other malicious validators are punished. This is decided by the application but it's up to Tendermint to identify them. With other forms of attacks the evidence lies in the precommits. But because an amnesia attack uses precommits from another round, which is information that is discarded by the consensus engine once the block is committed, it is difficult to understand which validators were in fact faulty.
|
||||
|
||||
If we cast our minds back to what I previously wrote, part of an amnesia attack depends on getting n precommits from an earlier round. These are then bundled with the malicious validators' own signatures. This means that neither the light client nor full nodes are capable of distinguishing which of the signatures were correctly created as part of Tendermint consensus and which were forged later on.
|
||||
|
||||
## Appendix B: Prior Amnesia Evidence Accountability Implementation
|
||||
|
||||
As these two attacks (amnesia and back to the past) can only be distinguished by confirming with all validators (to see if it is a full fork or a light fork), for the purpose of simplicity, these attacks will be treated as the same.
|
||||
|
||||
Currently, the evidence reactor is used to simply broadcast and store evidence. The idea of creating a new reactor for the specific task of verifying these attacks was briefly discussed, but it is decided that the current evidence reactor will be extended.
|
||||
|
||||
The process begins with a light client receiving conflicting headers (in the future this could also be a full node during fast sync or state sync), which it sends to a full node to analyze. As part of [evidence handling](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-047-handling-evidence-from-light-client.md), this is extracted into potential amnesia evidence when the validator voted in more than one round for a different block.
|
||||
|
||||
```golang
|
||||
type PotentialAmnesiaEvidence struct {
|
||||
VoteA *types.Vote
|
||||
VoteB *types.Vote
|
||||
|
||||
Heightstamp int64
|
||||
}
|
||||
```
|
||||
|
||||
*NOTE: There had been an earlier notion towards batching evidence against the entire set of validators all together but this has given way to individual processing predominantly to maintain consistency with the other forms of evidence. A more extensive breakdown can be found [here](https://github.com/tendermint/tendermint/issues/4729)*
|
||||
|
||||
The evidence will contain the precommit votes for a validator that voted for both rounds. If the validator voted in more than two rounds, then they will have multiple `PotentialAmnesiaEvidence` against them; hence it is possible that there are multiple pieces of evidence for a validator in a single height but not for a single round. The votes should all be valid, and the height and time that the infringement was made should be within:
|
||||
|
||||
`MaxEvidenceAge - ProofTrialPeriod`
|
||||
|
||||
This trial period will be discussed later.
|
||||
|
||||
Returning to the event of an amnesia attack, if we were to examine the behavior of the honest nodes, C1 and C2, in the schematic, C2 will not PRECOMMIT an earlier round, but it is likely, if a node in C1 were to receive +2/3 PREVOTE's or PRECOMMIT's for a higher round, that it would remove the lock and PREVOTE and PRECOMMIT for the later round. Therefore, unfortunately it is not a case of simply punishing all nodes that have double voted in the `PotentialAmnesiaEvidence`.
|
||||
|
||||
Instead we use the Proof of Lock Change (PoLC) referred to in the [consensus spec](https://github.com/tendermint/tendermint/blob/master/spec/consensus/consensus.md#terms). When an honest node votes again for a different block in a later round
|
||||
(which will only occur in very rare cases), it will generate the PoLC and store it in the evidence reactor for a time equal to the `MaxEvidenceAge`.
|
||||
|
||||
```golang
|
||||
type ProofOfLockChange struct {
|
||||
Votes []*types.Vote
|
||||
PubKey crypto.PubKey
|
||||
}
|
||||
```
|
||||
|
||||
This can be either evidence of +2/3 PREVOTES or PRECOMMITS (either warrants the honest node the right to vote) and is valid, among other checks, so long as the PRECOMMIT vote of the node in V2 came after all the votes in the `ProofOfLockChange` i.e. it received +2/3 votes for a block and then voted for that block thereafter (F is unable to prove this).
|
||||
|
||||
In the event that an honest node receives `PotentialAmnesiaEvidence` it will first `ValidateBasic()` and `Verify()` it and then will check if it is among the suspected nodes in the evidence. If so, it will retrieve the `ProofOfLockChange` and combine it with `PotentialAmnesiaEvidence` to form `AmnesiaEvidence`. All honest nodes that are part of the indicted group will have a time, measured in blocks, equal to `ProofTrialPeriod`, the aforementioned evidence parameter, to gossip their `AmnesiaEvidence` with their `ProofOfLockChange`.
|
||||
|
||||
```golang
|
||||
type AmnesiaEvidence struct {
|
||||
*types.PotentialAmnesiaEvidence
|
||||
Polc *types.ProofOfLockChange
|
||||
}
|
||||
```
|
||||
|
||||
If the node is not required to submit any proof then it will simply broadcast the `PotentialAmnesiaEvidence`, stamp the height at which it received the evidence and begin to wait out the trial period. It will ignore other `PotentialAmnesiaEvidence` gossiped at the same height and round.
|
||||
|
||||
If a node receives `AmnesiaEvidence` that contains a valid `ProofOfLockChange` it will add it to the evidence store and replace any `PotentialAmnesiaEvidence` of the same height and round. At this stage the amnesia evidence, now with a polc, is ready to be submitted to the chain. If a node receives `AmnesiaEvidence` with an empty polc it will ignore it, as each honest node will conduct its own trial period to be sure that time was given for any other honest nodes to respond.
|
||||
|
||||
There can only be one `AmnesiaEvidence` and one `PotentialAmnesiaEvidence` stored for each attack (i.e. for each height).
|
||||
|
||||
When `state.LastBlockHeight > PotentialAmnesiaEvidence.timestamp + ProofTrialPeriod`, nodes will upgrade the corresponding `PotentialAmnesiaEvidence` and attach an empty `ProofOfLockChange`. Then honest validators of the current validator set can begin proposing the block that contains the `AmnesiaEvidence`.
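
The upgrade condition above can be sketched as a single comparison. This is a minimal illustration, assuming that the stamp recorded on the evidence is a block height (as `Heightstamp` in the struct suggests). The constant value and the function name `trialExpired` are hypothetical.

```go
package main

import "fmt"

// proofTrialPeriod is a hypothetical value for the ProofTrialPeriod evidence
// parameter, measured in blocks.
const proofTrialPeriod int64 = 50

// trialExpired reports whether the trial period for evidence stamped at
// heightstamp is over, given the last committed block height.
func trialExpired(lastBlockHeight, heightstamp int64) bool {
	return lastBlockHeight > heightstamp+proofTrialPeriod
}

func main() {
	// Evidence first seen at height 100 (hypothetical numbers).
	fmt.Println(trialExpired(150, 100)) // still inside the trial period
	fmt.Println(trialExpired(151, 100)) // trial period is over
}
```

The comparison is strict, so the evidence is only upgraded once the chain has moved past the last block of the trial period.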
|
||||
|
||||
*NOTE: Even before the evidence is proposed and committed, the off-chain process of gossiping valid evidence could be
|
||||
enough for honest nodes to recognize the fork and halt.*
|
||||
|
||||
Other validators will vote `nil` if:
|
||||
|
||||
- The Amnesia Evidence is not valid
|
||||
- The Amnesia Evidence is not within their own trial period i.e. too soon.
|
||||
- They don't have the Amnesia Evidence and it has an empty polc (each validator needs to run their own trial period of the evidence)
|
||||
- Is of an AmnesiaEvidence that has already been committed to the chain.
|
||||
|
||||
Finally it is important to stress that the protocol of having a trial period addresses attacks where a validator voted again for a different block at a later round and time. In the event, however, that the validator voted for an earlier round after voting for a later round i.e. `VoteA.Timestamp < VoteB.Timestamp && VoteA.Round > VoteB.Round` then this action is inexcusable and can be punished immediately without the need of a trial period. In this case, PotentialAmnesiaEvidence will be instantly upgraded to AmnesiaEvidence.
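
The "inexcusable" case can be expressed as a predicate over the two votes. The sketch below is only illustrative: `vote` is a stripped-down stand-in for `types.Vote` that keeps just the two fields the condition uses, and `newVote` and `punishableImmediately` are hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

// vote is a simplified stand-in for types.Vote (assumption: only Round and
// Timestamp matter for this check).
type vote struct {
	Round     int32
	Timestamp time.Time
}

// newVote builds a vote from a round and a Unix timestamp in seconds.
func newVote(round int32, unixSec int64) vote {
	return vote{Round: round, Timestamp: time.Unix(unixSec, 0)}
}

// punishableImmediately mirrors the condition quoted in the text:
// VoteA.Timestamp < VoteB.Timestamp && VoteA.Round > VoteB.Round,
// i.e. the later vote in time is for an earlier round.
func punishableImmediately(a, b vote) bool {
	return a.Timestamp.Before(b.Timestamp) && a.Round > b.Round
}

func main() {
	laterRoundFirst := newVote(2, 10)   // round 2, cast first
	earlierRoundAfter := newVote(1, 20) // round 1, cast afterwards
	fmt.Println(punishableImmediately(laterRoundFirst, earlierRoundAfter))

	// The ordinary case (earlier round first) still needs the trial period.
	fmt.Println(punishableImmediately(newVote(1, 10), newVote(2, 20)))
}
```

Only the first pair would skip the trial period. The second pair is the situation that a `ProofOfLockChange` may legitimately explain.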
|
||||
@@ -1,90 +0,0 @@
|
||||
# ADR 057: RPC
|
||||
|
||||
## Changelog
|
||||
|
||||
- 19-05-2020: created
|
||||
|
||||
## Context
|
||||
|
||||
Currently the RPC layer of Tendermint is using a variant of the JSON-RPC protocol. This ADR is meant to serve as a pro/con list for possible alternatives and JSON-RPC.
|
||||
|
||||
There are currently two options being discussed: gRPC & JSON-RPC.
|
||||
|
||||
### JSON-RPC
|
||||
|
||||
JSON-RPC is a JSON-based RPC protocol. Tendermint has implemented its own variant of JSON-RPC which is not compatible with the [JSON-RPC 2.0 specification](https://www.jsonrpc.org/specification).
|
||||
|
||||
**Pros:**
|
||||
|
||||
- Easy to use & implement (by default)
|
||||
- Well-known and well-understood by users and integrators
|
||||
- Integrates reasonably well with web infrastructure (proxies, API gateways, service meshes, caches, etc)
|
||||
- human readable encoding (by default)
|
||||
|
||||
**Cons:**
|
||||
|
||||
- No schema support
|
||||
- RPC clients must be hand-written
|
||||
- Streaming not built into protocol
|
||||
- Underspecified types (e.g. numbers and timestamps)
|
||||
- Tendermint has its own implementation (not standards compliant, maintenance overhead)
|
||||
- High maintenance cost associated to this
|
||||
- Stdlib `jsonrpc` package only supports JSON-RPC 1.0, no dominant package for JSON-RPC 2.0
|
||||
- Tooling around documentation/specification (e.g. Swagger) could be better
|
||||
- JSON data is larger (offset by HTTP compression)
|
||||
- Serializing is slow ([~100% marshal, ~400% unmarshal](https://github.com/alecthomas/go_serialization_benchmarks)); insignificant in absolute terms
|
||||
- Specification was last updated in 2013 and is way behind Swagger/OpenAPI
|
||||
|
||||
### gRPC + gRPC-gateway (REST + Swagger)
|
||||
|
||||
gRPC is a high-performance RPC framework. It has been battle tested by a large number of users and is heavily relied on and maintained by countless large corporations.
|
||||
|
||||
**Pros:**
|
||||
|
||||
- Efficient data retrieval for users, lite clients and other protocols
|
||||
- Easily implemented in supported languages (Go, Dart, JS, TS, rust, Elixir, Haskell, ...)
|
||||
- Defined schema with richer type system (Protocol Buffers)
|
||||
- Can use common schemas and types across all protocols and data stores (RPC, ABCI, blocks, etc)
|
||||
- Established conventions for forwards- and backwards-compatibility
|
||||
- Bi-directional streaming
|
||||
- Servers and clients can be autogenerated in many languages (e.g. Tendermint-rs)
|
||||
- Auto-generated swagger documentation for REST API
|
||||
- Backwards and forwards compatibility guarantees enforced at the protocol level.
|
||||
- Can be used with different codecs (JSON, CBOR, ...)
|
||||
|
||||
**Cons:**
|
||||
|
||||
- Complex system involving cross-language schemas, code generation, and custom protocols
|
||||
- Type system does not always map cleanly to native language type system; integration woes
|
||||
- Many common types require Protobuf plugins (e.g. timestamps and duration)
|
||||
- Generated code may be non-idiomatic and hard to use
|
||||
- Migration will be disruptive and laborious
|
||||
|
||||
## Decision
|
||||
|
||||
> This section explains all of the details of the proposed solution, including implementation details.
|
||||
> It should also describe effects / corollary items that may need to be changed as a part of this.
|
||||
> If the proposed change will be large, please also indicate a way to do the change to maximize ease of review.
|
||||
> (e.g. the optimal split of things to do between separate PR's)
|
||||
|
||||
## Status
|
||||
|
||||
> A decision may be "proposed" if it hasn't been agreed upon yet, or "accepted" once it is agreed upon. If a later ADR changes or reverses a decision, it may be marked as "deprecated" or "superseded" with a reference to its replacement.
|
||||
|
||||
{Deprecated|Proposed|Accepted}
|
||||
|
||||
## Consequences
|
||||
|
||||
> This section describes the consequences, after applying the decision. All consequences should be summarized here, not just the "positive" ones.
|
||||
|
||||
### Positive
|
||||
|
||||
### Negative
|
||||
|
||||
### Neutral
|
||||
|
||||
## References
|
||||
|
||||
> Are there any relevant PR comments, issues that led up to this, or articles referenced for why we made the given design choice? If so link them here!
|
||||
|
||||
- {reference link}
|
||||
@@ -1,122 +0,0 @@
|
||||
# ADR 058: Event hashing
|
||||
|
||||
## Changelog
|
||||
|
||||
- 2020-07-17: initial version
|
||||
- 2020-07-27: fixes after Ismail and Ethan's comments
|
||||
- 2020-07-27: declined
|
||||
|
||||
## Context
|
||||
|
||||
Before [PR#4845](https://github.com/tendermint/tendermint/pull/4845),
|
||||
`Header#LastResultsHash` was a root of the Merkle tree built from `DeliverTx`
|
||||
results. Only `Code`, `Data` fields were included because `Info` and `Log`
|
||||
fields are non-deterministic.
|
||||
|
||||
At some point, we've added events to `ResponseBeginBlock`, `ResponseEndBlock`,
|
||||
and `ResponseDeliverTx` to give applications a way to attach some additional
|
||||
information to blocks / transactions.
|
||||
|
||||
Many applications seem to have started using them since.
|
||||
|
||||
However, before [PR#4845](https://github.com/tendermint/tendermint/pull/4845)
|
||||
there was no way to prove that certain events were a part of the result
|
||||
(_unless the application developer includes them into the state tree_).
|
||||
|
||||
Hence, [PR#4845](https://github.com/tendermint/tendermint/pull/4845) was
|
||||
opened. In it, `GasWanted` along with `GasUsed` are included when hashing
|
||||
`DeliverTx` results. Also, events from `BeginBlock`, `EndBlock` and `DeliverTx`
|
||||
results are hashed into the `LastResultsHash` as follows:
|
||||
|
||||
- Since we do not expect `BeginBlock` and `EndBlock` to contain many events,
|
||||
these will be Protobuf encoded and included in the Merkle tree as leaves.
|
||||
- `LastResultsHash` therefore is the root hash of a Merkle tree w/ 3 leafs:
|
||||
proto-encoded `ResponseBeginBlock#Events`, root hash of a Merkle tree built
|
||||
from `ResponseDeliverTx` responses (Log, Info and Codespace fields are
|
||||
ignored), and proto-encoded `ResponseEndBlock#Events`.
|
||||
- Order of events is unchanged - same as received from the ABCI application.
|
||||
|
||||
[Spec PR](https://github.com/tendermint/spec/pull/97/files)
|
||||
|
||||
While it's certainly good to be able to prove something, introducing new events
|
||||
or removing such becomes difficult because it breaks the `LastResultsHash`. It
|
||||
means that every time you add, remove or update an event, you'll need a
|
||||
hard-fork. And that is undoubtedly bad for applications, which are evolving and
|
||||
don't have a stable events set.
|
||||
|
||||
## Decision
|
||||
|
||||
As a middle ground approach, the proposal is to add the
|
||||
`Block#LastResultsEvents` consensus parameter that is a list of all events that
|
||||
are to be hashed in the header.
|
||||
|
||||
```
|
||||
@ proto/tendermint/abci/types.proto:295 @ message BlockParams {
|
||||
int64 max_bytes = 1;
|
||||
// Note: must be greater or equal to -1
|
||||
int64 max_gas = 2;
|
||||
// List of events, which will be hashed into the LastResultsHash
|
||||
repeated string last_results_events = 3;
|
||||
}
|
||||
```
|
||||
|
||||
Initially the list is empty. The ABCI application can change it via `InitChain`
|
||||
or `EndBlock`.
|
||||
|
||||
Example:
|
||||
|
||||
```go
|
||||
func (app *MyApp) DeliverTx(req types.RequestDeliverTx) types.ResponseDeliverTx {
|
||||
//...
|
||||
events := []abci.Event{
|
||||
{
|
||||
Type: "transfer",
|
||||
Attributes: []abci.EventAttribute{
|
||||
{Key: []byte("sender"), Value: []byte("Bob"), Index: true},
|
||||
},
|
||||
},
|
||||
}
|
||||
return types.ResponseDeliverTx{Code: code.CodeTypeOK, Events: events}
|
||||
}
|
||||
```
|
||||
|
||||
For "transfer" event to be hashed, the `LastResultsEvents` must contain a
|
||||
string "transfer".
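
The selection rule can be sketched as a simple filter over event types. This is a hedged illustration rather than the proposed implementation: `event` is a minimal stand-in for `abci.Event`, and `hashableEvents` is a hypothetical helper showing which events would be passed on to hashing for a given `LastResultsEvents` list.

```go
package main

import "fmt"

// event is a minimal stand-in for abci.Event (assumption: only Type is needed
// to decide whether an event is hashed).
type event struct {
	Type string
}

// hashableEvents returns, in their original order, the events whose type
// appears in lastResultsEvents.
func hashableEvents(events []event, lastResultsEvents []string) []event {
	allowed := make(map[string]struct{}, len(lastResultsEvents))
	for _, t := range lastResultsEvents {
		allowed[t] = struct{}{}
	}
	var out []event
	for _, e := range events {
		if _, ok := allowed[e.Type]; ok {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	events := []event{{Type: "transfer"}, {Type: "debug"}}
	// With "transfer" listed, only the transfer event is selected.
	fmt.Println(hashableEvents(events, []string{"transfer"}))
	// With the initial empty list, nothing is selected.
	fmt.Println(len(hashableEvents(events, nil)))
}
```

Because the list starts empty, adding a new event type to the application changes nothing in the header until the parameter is updated to include it.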
|
||||
|
||||
## Status
|
||||
|
||||
Declined
|
||||
|
||||
**Until there's more stability/motivation/use-cases/demand, the decision is to
|
||||
push this entirely application side and just have apps which want events to be
|
||||
provable to insert them into their application-side merkle trees. Of course
|
||||
this puts more pressure on their application state and makes event proving
|
||||
application specific, but it might help build up a better sense of use-cases
|
||||
and how this ought to ultimately be done by Tendermint.**
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
1. networks can perform parameter change proposals to update this list as new events are added
|
||||
2. allows networks to avoid having to do hard-forks
|
||||
3. events can still be added at-will to the application w/o breaking anything
|
||||
|
||||
### Negative
|
||||
|
||||
1. yet another consensus parameter
|
||||
2. more things to track in the tendermint state
|
||||
|
||||
## References
|
||||
|
||||
- [ADR 021](./adr-021-abci-events.md)
|
||||
- [Indexing transactions](../app-dev/indexing-transactions.md)
|
||||
|
||||
## Appendix A. Alternative proposals
|
||||
|
||||
The other proposal was to add `Hash bool` flag to the `Event`, similarly to
|
||||
`Index bool` EventAttribute's field. When `true`, Tendermint would hash it into
|
||||
the `LastResultsEvents`. The downside is that the logic is implicit and depends
|
||||
largely on the node's operator, who decides what application code to run. The
|
||||
above proposal makes it (the logic) explicit and easy to upgrade via
|
||||
governance.
|
||||
@@ -1,306 +0,0 @@
|
||||
# ADR 059: Evidence Composition and Lifecycle
|
||||
|
||||
## Changelog
|
||||
|
||||
- 04/09/2020: Initial Draft (Unabridged)
|
||||
- 07/09/2020: First Version
|
||||
- 13/03/2021: Amendment to accommodate forward lunatic attack
|
||||
- 29/06/2021: Add information about ABCI specific fields
|
||||
|
||||
## Scope
|
||||
|
||||
This document is designed to collate together and surface some predicaments involving evidence in Tendermint: both its composition and lifecycle. It then aims to find a solution to these. The scope does not extend to the verification nor detection of certain types of evidence but concerns itself mainly with the general form of evidence and how it moves from inception to application.
|
||||
|
||||
## Background
|
||||
|
||||
For a long time `DuplicateVoteEvidence`, formed in the consensus reactor, was the only evidence Tendermint had. It was produced whenever two votes from the same validator in the same round
|
||||
were observed and thus it was designed that each evidence was for a single validator. It was predicted that there may come more forms of evidence and thus `DuplicateVoteEvidence` was used as the model for the `Evidence` interface and also for the form of the evidence data sent to the application. It is important to note that Tendermint concerns itself just with the detection and reporting of evidence and it is the responsibility of the application to exercise punishment.
|
||||
|
||||
```go
|
||||
type Evidence interface { //existing
|
||||
Height() int64 // height of the offense
|
||||
Time() time.Time // time of the offense
|
||||
Address() []byte // address of the offending validator
|
||||
Bytes() []byte // bytes which comprise the evidence
|
||||
Hash() []byte // hash of the evidence
|
||||
Verify(chainID string, pubKey crypto.PubKey) error // verify the evidence
|
||||
Equal(Evidence) bool // check equality of evidence
|
||||
|
||||
ValidateBasic() error
|
||||
String() string
|
||||
}
|
||||
```
|
||||
|
||||
```go
|
||||
type DuplicateVoteEvidence struct {
|
||||
VoteA *Vote
|
||||
VoteB *Vote
|
||||
|
||||
timestamp time.Time // taken from the block time
|
||||
}
|
||||
```
|
||||
|
||||
Tendermint has now introduced a new type of evidence to protect light clients from being attacked. This `LightClientAttackEvidence` (see [here](https://github.com/informalsystems/tendermint-rs/blob/31ca3e64ce90786c1734caf186e30595832297a4/docs/spec/lightclient/attacks/evidence-handling.md) for more information) is vastly different to `DuplicateVoteEvidence` in that it is physically a much different size containing a complete signed header and validator set. It is formed within the light client, not the consensus reactor and requires a lot more information from state to verify (`VerifyLightClientAttack(commonHeader, trustedHeader *SignedHeader, commonVals *ValidatorSet)` vs `VerifyDuplicateVote(chainID string, pubKey PubKey)`). Finally it batches validators together (a single piece of evidence that implicates multiple malicious validators at a height) as opposed to having individual evidence (each piece of evidence is per validator per height). This evidence stretches the existing mould that was used to accommodate new types of evidence and has thus caused us to reconsider how evidence should be formatted and processed.
|
||||
|
||||
```go
|
||||
type LightClientAttackEvidence struct { // proposed struct in spec
|
||||
ConflictingBlock *LightBlock
|
||||
CommonHeight int64
|
||||
Type AttackType // enum: {Lunatic|Equivocation|Amnesia}
|
||||
|
||||
timestamp time.Time // taken from the block time at the common height
|
||||
}
|
||||
```
|
||||
*Note: These three attack types have been proven by the research team to be exhaustive*
|
||||
|
||||
## Possible Approaches for Evidence Composition
|
||||
|
||||
### Individual framework
|
||||
|
||||
Evidence remains on a per validator basis. This causes the least disruption to the current processes but requires that we break `LightClientAttackEvidence` into several pieces of evidence for each malicious validator. This not only has performance consequences in that there are n times as many database operations and that the gossiping of evidence will require more bandwidth than necessary (by requiring a header for each piece) but it potentially impacts our ability to validate it. In batch form, the full node can run the same process the light client did to see that 1/3 validating power was present in both the common block and the conflicting block whereas this becomes more difficult to verify individually without opening the possibility that malicious validators forge evidence against innocent validators. Not only that, but `LightClientAttackEvidence` also deals with amnesia attacks which unfortunately have the characteristic where we know the set of validators involved but not the subset that were actually malicious (more to be said about this later). And finally splitting the evidence into individual pieces makes it difficult to understand the severity of the attack (i.e. the total voting power involved in the attack).
|
||||
|
||||
#### An example of a possible implementation path
|
||||
|
||||
We would ignore amnesia evidence (as individually it's hard to make) and revert to the initial split we had before where `DuplicateVoteEvidence` is also used for light client equivocation attacks and thus we only need `LunaticEvidence`. We would also most likely need to remove `Verify` from the interface as this isn't really something that can be used.
|
||||
|
||||
``` go
|
||||
type LunaticEvidence struct { // individual lunatic attack
|
||||
header *Header
|
||||
commonHeight int64
|
||||
vote *Vote
|
||||
|
||||
timestamp time.Time // once again taken from the block time at the height of the common header
|
||||
}
|
||||
```
|
||||
|
||||
### Batch Framework
|
||||
|
||||
The last approach of this category would be to consider batch only evidence. This works fine with `LightClientAttackEvidence` but would require alterations to `DuplicateVoteEvidence` which would most likely mean that the consensus would send conflicting votes to a buffer in the evidence module which would then wrap all the votes together per height before gossiping them to other nodes and trying to commit it on chain. At a glance this may improve IO and verification speed and perhaps more importantly grouping validators gives the application and Tendermint a better overview of the severity of the attack.
|
||||
|
||||
However individual evidence has the advantage that it is easy to check if a node already has that evidence meaning we just need to check hashes to know that we've already verified this evidence before. Batching evidence would imply that each node may have a different combination of duplicate votes which may complicate things.
|
||||
|
||||
#### An example of a possible implementation path
|
||||
|
||||
`LightClientAttackEvidence` won't change but the evidence interface will need to look like the proposed one above and `DuplicateVoteEvidence` will need to change to encompass multiple double votes. A problem with batch evidence is that it needs to be unique to prevent people from submitting different permutations.
|
||||
|
||||
## Decision
|
||||
|
||||
The decision is to adopt a hybrid design.
|
||||
|
||||
We allow individual and batch evidence to coexist, meaning that verification is done depending on the evidence type and that the bulk of the work is done in the evidence pool itself (including forming the evidence to be sent to the application).
|
||||
|
||||
|
||||
## Detailed Design
|
||||
|
||||
Evidence has the following simple interface:
|
||||
|
||||
```go
|
||||
type Evidence interface { //proposed
|
||||
Height() int64 // height of the offense
|
||||
Bytes() []byte // bytes which comprise the evidence
|
||||
Hash() []byte // hash of the evidence
|
||||
ValidateBasic() error
|
||||
String() string
|
||||
}
|
||||
```
|
||||
|
||||
The changing of the interface is backwards compatible as these methods are all present in the previous version of the interface. However, networks will need to upgrade to be able to process the new evidence as verification has changed.
|
||||
|
||||
We have two concrete types of evidence that fulfil this interface:
|
||||
|
||||
```go
|
||||
type LightClientAttackEvidence struct {
|
||||
ConflictingBlock *LightBlock
|
||||
CommonHeight int64 // the last height at which the primary provider and witness provider had the same header
|
||||
|
||||
// abci specific information
|
||||
ByzantineValidators []*Validator // validators in the validator set that misbehaved in creating the conflicting block
|
||||
TotalVotingPower int64 // total voting power of the validator set at the common height
|
||||
Timestamp time.Time // timestamp of the block at the common height
|
||||
}
|
||||
```
|
||||
where the `Hash()` is the hash of the header and commonHeight.
|
||||
|
||||
Note: It was also discussed whether to include the commit hash, which captures the validators that signed the header. However, this would open the opportunity for someone to propose multiple permutations of the same evidence (through different commit signatures), hence it was omitted. Consequently, when it comes to verifying evidence in a block, for `LightClientAttackEvidence` we can't just check the hashes, because someone could have the same hash as us but a different commit where less than 1/3 of validators voted, which would be an invalid version of the evidence (see `fastCheck` for more details).
|
||||
|
||||
```go
|
||||
type DuplicateVoteEvidence struct {
|
||||
VoteA *Vote
|
||||
VoteB *Vote
|
||||
|
||||
// abci specific information
|
||||
TotalVotingPower int64
|
||||
ValidatorPower int64
|
||||
Timestamp time.Time
|
||||
}
|
||||
```
|
||||
where the `Hash()` is the hash of the two votes
|
||||
|
||||
For both of these types of evidence, `Bytes()` represents the proto-encoded byte array format of the evidence and `ValidateBasic` is
|
||||
an initial consistency check to make sure the evidence has a valid structure.
|
||||
|
||||
### The Evidence Pool
|
||||
|
||||
`LightClientAttackEvidence` is generated in the light client and `DuplicateVoteEvidence` in consensus. Both are sent to the evidence pool through `AddEvidence(ev Evidence) error`. The evidence pool's primary purpose is to verify evidence. It also gossips evidence to other peers' evidence pools and serves it to consensus so it can be committed on chain and the relevant information can be sent to the application in order to exercise punishment. When evidence is added, the pool first runs `Has(ev Evidence)` to check if it has already received it (by comparing hashes) and then `Verify(ev Evidence) error`. Once verified, the evidence pool stores it in its pending database. There are two databases: one for pending evidence that is not yet committed and another for the committed evidence (to avoid committing evidence twice).
|
||||
|
||||
#### Verification
|
||||
|
||||
`Verify()` does the following:
|
||||
|
||||
- Use the hash to see if we already have this evidence in our committed database.
|
||||
|
||||
- Use the height to check if the evidence hasn't expired.
|
||||
|
||||
- If it has expired then use the height to find the block header and check if the time has also expired in which case we drop the evidence
|
||||
|
||||
- Then proceed with a switch statement over the two evidence types:
|
||||
|
||||
For `DuplicateVote`:
|
||||
|
||||
- Check that height, round, type and validator address are the same
|
||||
|
||||
- Check that the Block ID is different
|
||||
|
||||
- Check the look up table for addresses to make sure there already isn't evidence against this validator
|
||||
|
||||
- Fetch the validator set and confirm that the address is in the set at the height of the attack
|
||||
|
||||
- Check that the chain ID and signature are valid.
|
||||
|
||||
For `LightClientAttack`
|
||||
|
||||
- Fetch the common signed header and val set from the common height and use skipping verification to verify the conflicting header
|
||||
|
||||
- Fetch the trusted signed header at the same height as the conflicting header and compare with the conflicting header to work out which type of attack it is and in doing so return the malicious validators. NOTE: If the node doesn't have the signed header at the height of the conflicting header, it instead fetches the latest header it has and checks to see if it can prove the evidence based on a violation of header time. This is known as forward lunatic attack.
|
||||
|
||||
- If equivocation, return the validators that signed the commits of both the trusted header and the conflicting header
|
||||
|
||||
- If lunatic, return the validators from the common val set that signed in the conflicting block
|
||||
|
||||
- If amnesia, return no validators (since we can't know which validators are malicious). This also means that we don't currently send amnesia evidence to the application, although we will introduce more robust amnesia evidence handling in future Tendermint Core releases
|
||||
|
||||
- Check that the hashes of the conflicting header and the trusted header are different
|
||||
|
||||
- In the case of a forward lunatic attack, where the trusted header height is less than the conflicting header height, the node checks that the time of the trusted header is later than the time of conflicting header. This proves that the conflicting header breaks monotonically increasing time. If the node doesn't have a trusted header with a later time then it is unable to validate the evidence for now.
|
||||
|
||||
- Lastly, for each validator, check the look up table to make sure there already isn't evidence against this validator
|
||||
|
||||
After verification we persist the evidence with the key `height/hash` to the pending evidence database in the evidence pool.
|
||||
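
The `DuplicateVote` checks listed above can be sketched roughly as follows. This is a minimal illustration rather than the actual Tendermint code: the `Vote` struct is a simplified stand-in, `alreadyAccused` is a hypothetical name for the address look-up table, the hex encoding of the hash in `pendingKey` is an assumption, and the validator-set, chain ID and signature checks are omitted.

```go
package main

import (
	"errors"
	"fmt"
)

// Vote is a simplified, hypothetical stand-in for Tendermint's vote type.
type Vote struct {
	Height           int64
	Round            int32
	Type             int32
	ValidatorAddress string
	BlockID          string
}

// verifyDuplicateVote applies the structural checks from the list above.
// alreadyAccused plays the role of the address look-up table (assumed name).
// Validator-set membership, chain ID and signature checks are left out.
func verifyDuplicateVote(a, b Vote, alreadyAccused map[string]bool) error {
	if a.Height != b.Height || a.Round != b.Round || a.Type != b.Type {
		return errors.New("votes are not for the same height, round and type")
	}
	if a.ValidatorAddress != b.ValidatorAddress {
		return errors.New("votes are from different validators")
	}
	if a.BlockID == b.BlockID {
		return errors.New("votes are for the same block ID")
	}
	if alreadyAccused[a.ValidatorAddress] {
		return errors.New("evidence against this validator already exists")
	}
	return nil
}

// pendingKey builds a `height/hash` key for the pending evidence database.
// Encoding the hash as upper-case hex is an assumption of this sketch.
func pendingKey(height int64, hash []byte) string {
	return fmt.Sprintf("%d/%X", height, hash)
}

func main() {
	a := Vote{Height: 10, Round: 0, Type: 1, ValidatorAddress: "val1", BlockID: "AAAA"}
	b := Vote{Height: 10, Round: 0, Type: 1, ValidatorAddress: "val1", BlockID: "BBBB"}

	if err := verifyDuplicateVote(a, b, map[string]bool{}); err != nil {
		fmt.Println("rejected:", err)
		return
	}
	fmt.Println("accepted, stored under key", pendingKey(a.Height, []byte{0xde, 0xad}))
}
```

Ordering the cheap structural comparisons before any state look-ups keeps the rejection path for malformed evidence inexpensive.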
|
||||
#### ABCI Evidence
|
||||
|
||||
Both evidence structures contain data (such as timestamp) that are necessary to be passed to the application but do not strictly constitute evidence of misbehaviour. As such, these fields are verified last. If any of these fields are invalid to a node i.e. they don't correspond with their state, nodes will reconstruct a new evidence struct from the existing fields and repopulate the abci specific fields with their own state data.
|
||||
|
||||
#### Broadcasting and receiving evidence
|
||||
|
||||
The evidence pool also runs a reactor that broadcasts the newly validated
|
||||
evidence to all connected peers.
|
||||
|
||||
Receiving evidence from other evidence reactors works in the same manner as receiving evidence from the consensus reactor or a light client.
|
||||
|
||||
|
||||
#### Proposing evidence on the block
|
||||
|
||||
When it comes to prevoting and precommitting a proposal that contains evidence, the full node will once again
|
||||
call upon the evidence pool to verify the evidence using `CheckEvidence(ev []Evidence)`:
|
||||
|
||||
This performs the following actions:
|
||||
|
||||
1. Loops through all the evidence to check that nothing has been duplicated
|
||||
|
||||
2. For each piece of evidence, run `fastCheck(ev Evidence)`, which works similarly to `Has`, except that for `LightClientAttackEvidence` with the
|
||||
same hash it then goes on to check that the validators it has are all signers in the commit of the conflicting header. If it doesn't pass fast check (because it hasn't seen the evidence before) then it will have to verify the evidence.
|
||||
|
||||
3. Runs `Verify(ev Evidence)`. Note: this also saves the evidence to the db as mentioned before.
|
||||
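
Step 1 above (rejecting a proposal that lists the same evidence twice) could look something like the following. This is a hedged sketch: the trimmed-down `Evidence` interface, the `rawEvidence` type and the `firstDuplicate` helper are illustrative names, not part of the Tendermint API.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Evidence is reduced to the single method this sketch needs.
type Evidence interface {
	Hash() []byte
}

// rawEvidence is a hypothetical evidence type whose hash is the SHA-256 of its bytes.
type rawEvidence []byte

func (r rawEvidence) Hash() []byte {
	h := sha256.Sum256(r)
	return h[:]
}

// firstDuplicate returns the index of the first item whose hash was already
// seen earlier in the list, or -1 if every item is unique.
func firstDuplicate(evs []Evidence) int {
	seen := make(map[string]struct{}, len(evs))
	for i, ev := range evs {
		key := string(ev.Hash())
		if _, ok := seen[key]; ok {
			return i
		}
		seen[key] = struct{}{}
	}
	return -1
}

func main() {
	evs := []Evidence{rawEvidence("a"), rawEvidence("b"), rawEvidence("a")}
	fmt.Println("first duplicate at index", firstDuplicate(evs))
}
```

Keying the set by hash mirrors how `Has` compares evidence, so the duplicate check needs no knowledge of the concrete evidence type.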
|
||||
|
||||
#### Updating application and pool
|
||||
|
||||
The final part of the lifecycle is when the block is committed and the `BlockExecutor` then updates state. As part of this process, the `BlockExecutor` gets the evidence pool to create a simplified format for the evidence to be sent to the application. This happens in `ApplyBlock` where the executor calls `Update(Block, State) []abci.Evidence`.
|
||||
|
||||
```go
|
||||
abciResponses.BeginBlock.ByzantineValidators = evpool.Update(block, state)
|
||||
```
|
||||
|
||||
Here is the format of the evidence that the application will receive. As seen above, this is stored as an array within `BeginBlock`.
|
||||
The changes to the application are minimal (it is still formed one for each malicious validator) with the exception of using an enum instead of a string for the evidence type.
|
||||
|
||||
```go
|
||||
type Evidence struct {
|
||||
// either LightClientAttackEvidence or DuplicateVoteEvidence as an enum (abci.EvidenceType)
|
||||
Type EvidenceType `protobuf:"varint,1,opt,name=type,proto3,enum=tendermint.abci.EvidenceType" json:"type,omitempty"`
|
||||
// The offending validator
|
||||
Validator Validator `protobuf:"bytes,2,opt,name=validator,proto3" json:"validator"`
|
||||
// The height when the offense occurred
|
||||
Height int64 `protobuf:"varint,3,opt,name=height,proto3" json:"height,omitempty"`
|
||||
// The corresponding time where the offense occurred
|
||||
Time time.Time `protobuf:"bytes,4,opt,name=time,proto3,stdtime" json:"time"`
|
||||
// Total voting power of the validator set in case the ABCI application does
|
||||
// not store historical validators.
|
||||
// https://github.com/tendermint/tendermint/issues/4581
|
||||
TotalVotingPower int64 `protobuf:"varint,5,opt,name=total_voting_power,json=totalVotingPower,proto3" json:"total_voting_power,omitempty"`
|
||||
}
|
||||
```
|
||||
|
||||
|
||||
This `Update()` function does the following:
|
||||
|
||||
- Increments state which keeps track of both the current time and height used for measuring expiry
|
||||
|
||||
- Marks evidence as committed and saves to db. This prevents validators from proposing committed evidence in the future
|
||||
Note: the db just saves the height and the hash. There is no need to save the entire committed evidence
|
||||
|
||||
- Forms ABCI evidence as such: (note for `DuplicateVoteEvidence` the validators array size is 1)
|
||||
```go
|
||||
for _, val := range evInfo.Validators {
|
||||
abciEv = append(abciEv, &abci.Evidence{
|
||||
Type: evType, // either DuplicateVote or LightClientAttack
|
||||
Validator: val, // the offending validator (which includes the address, pubkey and power)
|
||||
Height: evInfo.ev.Height(), // the height when the offense happened
|
||||
Time: evInfo.time, // the time when the offense happened
|
||||
TotalVotingPower: evInfo.totalVotingPower, // the total voting power of the validator set
|
||||
})
|
||||
}
|
||||
```
|
||||
|
||||
- Removes expired evidence from both pending and committed databases
|
||||
|
||||
The ABCI evidence is then sent via the `BlockExecutor` to the application.
|
||||
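
As described under Verification, evidence is only dropped once it has expired both by height and by time. A minimal sketch of that rule is shown below; the `expiryParams` struct and its field names are assumptions made for this example, as are the example values.

```go
package main

import (
	"fmt"
	"time"
)

// expiryParams holds the two expiry limits (field names are assumptions).
type expiryParams struct {
	MaxAgeNumBlocks int64
	MaxAgeDuration  time.Duration
}

// Example values used only for illustration.
var exampleParams = expiryParams{MaxAgeNumBlocks: 100, MaxAgeDuration: 48 * time.Hour}
var exampleNow = time.Date(2020, 9, 1, 0, 0, 0, 0, time.UTC)

// hoursBefore returns t minus n hours; a small helper for the example.
func hoursBefore(t time.Time, n int) time.Time {
	return t.Add(-time.Duration(n) * time.Hour)
}

// isExpired reports whether evidence is too old by BOTH measures:
// the number of blocks since the offense and the time since the offense.
func isExpired(p expiryParams, curHeight int64, curTime time.Time, evHeight int64, evTime time.Time) bool {
	ageBlocks := curHeight - evHeight
	ageDuration := curTime.Sub(evTime)
	return ageBlocks > p.MaxAgeNumBlocks && ageDuration > p.MaxAgeDuration
}

func main() {
	// Old by height (150 blocks) and by time (72h): expired.
	fmt.Println(isExpired(exampleParams, 1000, exampleNow, 850, hoursBefore(exampleNow, 72)))
	// Old by height but only 24h old: still valid.
	fmt.Println(isExpired(exampleParams, 1000, exampleNow, 850, hoursBefore(exampleNow, 24)))
}
```

Checking the height first is cheap because it needs no header look-up; the block time only has to be fetched when the height limit has already been exceeded.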
|
||||
#### Summary
|
||||
|
||||
To summarize, we can see the lifecycle of evidence as such:
|
||||
|
||||

|
||||
|
||||
Evidence is first detected and created in the light client and consensus reactor. It is verified and stored as `EvidenceInfo` and gossiped to the evidence pools in other nodes. The consensus reactor later communicates with the evidence pool to either retrieve evidence to be put into a block, or verify the evidence the consensus reactor has retrieved in a block. Lastly, when a block is added to the chain, the block executor sends the committed evidence back to the evidence pool so a pointer to the evidence can be stored in the evidence pool and it can update its height and time. Finally, it turns the committed evidence into ABCI evidence and through the block executor passes the evidence to the application so the application can handle it.
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
<!-- > This section describes the consequences, after applying the decision. All consequences should be summarized here, not just the "positive" ones. -->
|
||||
|
||||
### Positive
|
||||
|
||||
- Evidence is better contained to the evidence pool / module
|
||||
- LightClientAttack is kept together (easier for verification and bandwidth)
|
||||
- Variations on commit sigs in LightClientAttack don't lead to multiple permutations and multiple evidence
|
||||
- Address to evidence map prevents DOS attacks, where a single validator could DOS the network by flooding it with evidence submissions
|
||||
|
||||
### Negative
|
||||
|
||||
- Changes the `Evidence` interface and thus is a block-breaking change
|
||||
- Changes the ABCI `Evidence` and is thus an ABCI-breaking change
|
||||
- Unable to query evidence for address / time without evidence pool
|
||||
|
||||
### Neutral
|
||||
|
||||
|
||||
## References
|
||||
|
||||
<!-- > Are there any relevant PR comments, issues that led up to this, or articles referenced for why we made the given design choice? If so link them here! -->
|
||||
|
||||
- [LightClientAttackEvidence](https://github.com/informalsystems/tendermint-rs/blob/31ca3e64ce90786c1734caf186e30595832297a4/docs/spec/lightclient/attacks/evidence-handling.md)
|
||||
@@ -1,193 +0,0 @@
|
||||
# ADR 060: Go API Stability
|
||||
|
||||
## Changelog
|
||||
|
||||
- 2020-09-08: Initial version. (@erikgrinaker)
|
||||
|
||||
- 2020-09-09: Tweak accepted changes, add initial public API packages, add consequences. (@erikgrinaker)
|
||||
|
||||
- 2020-09-17: Clarify initial public API. (@erikgrinaker)
|
||||
|
||||
## Context
|
||||
|
||||
With the release of Tendermint 1.0 we will adopt [semantic versioning](https://semver.org). One major implication is a guarantee that we will not make backwards-incompatible changes until Tendermint 2.0 (except in pre-release versions). In order to provide this guarantee for our Go API, we must clearly define which of our APIs are public, and what changes are considered backwards-compatible.
|
||||
|
||||
Currently, we list packages that we consider public in our [README](https://github.com/tendermint/tendermint#versioning), but since we are still at version 0.x we do not provide any backwards compatibility guarantees at all.
|
||||
|
||||
### Glossary
|
||||
|
||||
* **External project:** a different Git/VCS repository or code base.
|
||||
|
||||
* **External package:** a different Go package, can be a child or sibling package in the same project.
|
||||
|
||||
* **Internal code:** code not intended for use in external projects.
|
||||
|
||||
* **Internal directory:** code under `internal/` which cannot be imported in external projects.
|
||||
|
||||
* **Exported:** a Go identifier starting with an uppercase letter, which can therefore be accessed by an external package.
|
||||
|
||||
* **Private:** a Go identifier starting with a lowercase letter, which therefore cannot be accessed by an external package unless via an exported field, variable, or function/method return value.
|
||||
|
||||
* **Public API:** any Go identifier that can be imported or accessed by an external project, except test code in `_test.go` files.
|
||||
|
||||
* **Private API:** any Go identifier that is not accessible via a public API, including all code in the internal directory.
|
||||
|
||||
## Alternative Approaches
|
||||
|
||||
- Split all public APIs out to separate Go modules in separate Git repositories, and consider all Tendermint code internal and not subject to API backwards compatibility at all. This was rejected, since it has been attempted by the Tendermint project earlier, resulting in too much dependency management overhead.
|
||||
|
||||
- Simply document which APIs are public and which are private. This is the current approach, but users should not be expected to self-enforce this, the documentation is not always up-to-date, and external projects will often end up depending on internal code anyway.
|
||||
|
||||
## Decision
|
||||
|
||||
From Tendermint 1.0, all internal code (except private APIs) will be placed in a root-level [`internal` directory](https://golang.org/cmd/go/#hdr-Internal_Directories), which the Go compiler will block for use by external projects. All exported items outside of the `internal` directory are considered a public API and subject to backwards compatibility guarantees, except files ending in `_test.go`.
|
||||
|
||||
The `crypto` package may be split out to a separate module in a separate repo. This is the main general-purpose package used by external projects, and is the only Tendermint dependency in e.g. IAVL which can cause some problems for projects depending on both IAVL and Tendermint. This will be decided after further discussion.
|
||||
|
||||
The `tm-db` package will remain a separate module in a separate repo.
|
||||
|
||||
## Detailed Design
|
||||
|
||||
### Public API
|
||||
|
||||
When preparing our public API for 1.0, we should keep these principles in mind:
|
||||
|
||||
- Limit the number of public APIs that we start out with - we can always add new APIs later, but we can't change or remove APIs once they're made public.
|
||||
|
||||
- Before an API is made public, do a thorough review of the API to make sure it covers any future needs, can accommodate expected changes, and follows good API design practices.
|
||||
|
||||
The following is the minimum set of public APIs that will be included in 1.0, in some form:
|
||||
|
||||
- `abci`
|
||||
- Packages used for constructing nodes: `config`, `libs/log`, and `version`
|
||||
- Client APIs, i.e. `rpc/client`, `light`, and `privval`.
|
||||
- `crypto` (possibly as a separate repo)
|
||||
|
||||
We may offer additional APIs as well, following further discussions internally and with other stakeholders. However, public APIs for providing custom components (e.g. reactors and mempools) are not planned for 1.0, but may be added in a later 1.x version if this is something we want to offer.
|
||||
|
||||
For comparison, the following is the number of Tendermint imports per package in the Cosmos SDK (excluding tests), which should be mostly satisfied by the planned APIs.
|
||||
|
||||
```
|
||||
1 github.com/tendermint/tendermint/abci/server
|
||||
73 github.com/tendermint/tendermint/abci/types
|
||||
2 github.com/tendermint/tendermint/cmd/tendermint/commands
|
||||
7 github.com/tendermint/tendermint/config
|
||||
68 github.com/tendermint/tendermint/crypto
|
||||
1 github.com/tendermint/tendermint/crypto/armor
|
||||
10 github.com/tendermint/tendermint/crypto/ed25519
|
||||
2 github.com/tendermint/tendermint/crypto/encoding
|
||||
3 github.com/tendermint/tendermint/crypto/merkle
|
||||
3 github.com/tendermint/tendermint/crypto/sr25519
|
||||
8 github.com/tendermint/tendermint/crypto/tmhash
|
||||
1 github.com/tendermint/tendermint/crypto/xsalsa20symmetric
|
||||
11 github.com/tendermint/tendermint/libs/bytes
|
||||
2 github.com/tendermint/tendermint/libs/bytes.HexBytes
|
||||
15 github.com/tendermint/tendermint/libs/cli
|
||||
2 github.com/tendermint/tendermint/libs/cli/flags
|
||||
2 github.com/tendermint/tendermint/libs/json
|
||||
30 github.com/tendermint/tendermint/libs/log
|
||||
1 github.com/tendermint/tendermint/libs/math
|
||||
11 github.com/tendermint/tendermint/libs/os
|
||||
4 github.com/tendermint/tendermint/libs/rand
|
||||
1 github.com/tendermint/tendermint/libs/strings
|
||||
5 github.com/tendermint/tendermint/light
|
||||
1 github.com/tendermint/tendermint/internal/mempool
|
||||
3 github.com/tendermint/tendermint/node
|
||||
5 github.com/tendermint/tendermint/internal/p2p
|
||||
4 github.com/tendermint/tendermint/privval
|
||||
10 github.com/tendermint/tendermint/proto/tendermint/crypto
|
||||
1 github.com/tendermint/tendermint/proto/tendermint/libs/bits
|
||||
24 github.com/tendermint/tendermint/proto/tendermint/types
|
||||
3 github.com/tendermint/tendermint/proto/tendermint/version
|
||||
2 github.com/tendermint/tendermint/proxy
|
||||
3 github.com/tendermint/tendermint/rpc/client
|
||||
1 github.com/tendermint/tendermint/rpc/client/http
|
||||
2 github.com/tendermint/tendermint/rpc/client/local
|
||||
3 github.com/tendermint/tendermint/rpc/core/types
|
||||
1 github.com/tendermint/tendermint/rpc/jsonrpc/server
|
||||
33 github.com/tendermint/tendermint/types
|
||||
2 github.com/tendermint/tendermint/types/time
|
||||
1 github.com/tendermint/tendermint/version
|
||||
```
|
||||
|
||||
### Backwards-Compatible Changes
|
||||
|
||||
In Go, [almost all API changes are backwards-incompatible](https://blog.golang.org/module-compatibility) and thus exported items in public APIs generally cannot be changed until Tendermint 2.0. The only backwards-compatible changes we can make to public APIs are:
|
||||
|
||||
- Adding a package.
|
||||
|
||||
- Adding a new identifier to the package scope (e.g. const, var, func, struct, interface, etc.).
|
||||
|
||||
- Adding a new method to a struct.
|
||||
|
||||
- Adding a new field to a struct, if the zero-value preserves any old behavior.
|
||||
|
||||
- Changing the order of fields in a struct.
|
||||
|
||||
- Adding a variadic parameter to a named function or struct method, if the function type itself is not assignable in any public APIs (e.g. a callback).
|
||||
|
||||
- Adding a new method to an interface, or a variadic parameter to an interface method, _if the interface already has a private method_ (which prevents external packages from implementing it).
|
||||
|
||||
- Widening a numeric type as long as it is a named type (e.g. `type Number int32` can change to `int64`, but not `int8` or `uint32`).
|
||||
|
||||
Note that public APIs can expose private types (e.g. via an exported variable, field, or function/method return value), in which case the exported fields and methods on these private types are also part of the public API and covered by its backwards compatibility guarantees. In general, private types should never be accessible via public APIs unless wrapped in an exported interface.
|
||||
|
||||
Also note that if we accept, return, export, or embed types from a dependency, we assume the backwards compatibility responsibility for that dependency, and must make sure any dependency upgrades comply with the above constraints.
|
||||
|
||||
We should run CI linters for minor version branches to enforce this, e.g. [apidiff](https://go.googlesource.com/exp/+/refs/heads/master/apidiff/README.md), [breakcheck](https://github.com/gbbr/breakcheck), and [apicompat](https://github.com/bradleyfalzon/apicompat).
|
||||
|
||||
#### Accepted Breakage
|
||||
|
||||
The above changes can still break programs in a few ways - these are _not_ considered backwards-incompatible changes, and users are advised to avoid this usage:
|
||||
|
||||
- If a program uses unkeyed struct literals (e.g. `Foo{"bar", "baz"}`) and we add fields or change the field order, the program will no longer compile or may have logic errors.
|
||||
|
||||
- If a program embeds two structs in a struct, and we add a new field or method to an embedded Tendermint struct which also exists in the other embedded struct, the program will no longer compile.
|
||||
|
||||
- If a program compares two structs (e.g. with `==`), and we add a new field of an incomparable type (slice, map, func, or struct that contains these) to a Tendermint struct which is compared, the program will no longer compile.
|
||||
|
||||
- If a program assigns a Tendermint function to an identifier, and we add a variadic parameter to the function signature, the program will no longer compile.
|
||||
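
To make the first and last cases above concrete, here is a small sketch showing usage that keeps compiling after a compatible change. `Config` and `Connect` are hypothetical names invented for this example; imagine that the `Timeout` field and the variadic `opts` parameter were both added in a later minor release.

```go
package main

import "fmt"

// Config is a hypothetical public struct. Timeout was added later;
// its zero value preserves the old behaviour.
type Config struct {
	Name    string
	Retries int
	Timeout int
}

// Connect is a hypothetical public function. The variadic opts parameter
// was added later; existing calls with one argument are unaffected.
func Connect(addr string, opts ...string) string {
	return fmt.Sprintf("%s (%d options)", addr, len(opts))
}

func main() {
	// Keyed literal: compiles before and after Timeout was added.
	// The unkeyed form Config{"node", 3} would no longer compile.
	c := Config{Name: "node", Retries: 3}
	fmt.Println(c.Name, c.Retries, c.Timeout)

	// Direct call: compiles before and after opts was added.
	// Assigning Connect to a variable of type func(string) string would not.
	fmt.Println(Connect("localhost:26656"))
}
```

Keyed literals and direct calls are therefore the usage patterns that stay safe across minor releases.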
|
||||
### Strategies for API Evolution
|
||||
|
||||
The API guarantees above can be fairly constraining, but are unavoidable given the Go language design. The following tricks can be employed where appropriate to allow us to make changes to the API:
|
||||
|
||||
- We can add a new function or method with a different name that takes additional parameters, and have the old function call the new one.
|
||||
|
||||
- Functions and methods can take an options struct instead of separate parameters, to allow adding new options - this is particularly suitable for functions that take many parameters and are expected to be extended, and especially for interfaces where we cannot add new methods with different parameters at all.
|
||||
|
||||
- Interfaces can include a private method, e.g. `interface { private() }`, to make them unimplementable by external packages and thus allow us to add new methods to the interface without breaking other programs. Of course, this can't be used for interfaces that should be implementable externally.
|
||||
|
||||
- We can use [interface upgrades](https://avtok.com/2014/11/05/interface-upgrades.html) to allow implementers of an existing interface to also implement a new interface, as long as the old interface can still be used - e.g. the new interface `BetterReader` may have a method `ReadBetter()`, and a function that takes a `Reader` interface as an input can check if the implementer also implements `BetterReader` and in that case call `ReadBetter()` instead of `Read()`.
|
||||
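
The interface-upgrade strategy described in the last bullet can be sketched as follows, using the `Reader` and `BetterReader` names from the text. The concrete types `oldImpl` and `newImpl` and the `Consume` function are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// Reader is the original public interface, which must not change.
type Reader interface {
	Read() string
}

// BetterReader is the optional upgrade that implementers may also satisfy.
type BetterReader interface {
	ReadBetter() string
}

// oldImpl only implements the original interface.
type oldImpl struct{}

func (oldImpl) Read() string { return "read" }

// newImpl implements both the original interface and the upgrade.
type newImpl struct{}

func (newImpl) Read() string       { return "read" }
func (newImpl) ReadBetter() string { return "read better" }

// Consume accepts any Reader and uses ReadBetter when it is available,
// falling back to Read for older implementations.
func Consume(r Reader) string {
	if br, ok := r.(BetterReader); ok {
		return br.ReadBetter()
	}
	return r.Read()
}

func main() {
	fmt.Println(Consume(oldImpl{}))
	fmt.Println(Consume(newImpl{}))
}
```

Because the upgrade is detected with a type assertion at run time, existing implementations of `Reader` keep working unchanged.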
|
||||
## Status
|
||||
|
||||
Accepted
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Users can safely upgrade with less fear of applications breaking, and know whether an upgrade only includes bug fixes or also functional enhancements
|
||||
|
||||
- External developers have a predictable and well-defined API to build on that will be supported for some time
|
||||
|
||||
- Less synchronization between teams, since there is a clearer contract and timeline for changes and they happen less frequently
|
||||
|
||||
- More documentation will remain accurate, since it's not chasing a moving target
|
||||
|
||||
- Less time will be spent on code churn and more time spent on functional improvements, both for the community and for our teams
|
||||
|
||||
### Negative
|
||||
|
||||
- Many improvements, changes, and bug fixes will have to be postponed until the next major version, possibly for a year or more
|
||||
|
||||
- The pace of development will slow down, since we must work within the existing API constraints, and spend more time planning public APIs
|
||||
|
||||
- External developers may lose access to some currently exported APIs and functionality
|
||||
|
||||
## References
|
||||
|
||||
- [#4451: Place internal APIs under internal package](https://github.com/tendermint/tendermint/issues/4451)
|
||||
|
||||
- [On Pluggability](https://docs.google.com/document/d/1G08LnwSyb6BAuCVSMF3EKn47CGdhZ5wPZYJQr4-bw58/edit?ts=5f609f11)
|
||||
@@ -1,109 +0,0 @@
|
||||
# ADR 061: P2P Refactor Scope
|
||||
|
||||
## Changelog
|
||||
|
||||
- 2020-10-30: Initial version (@erikgrinaker)
|
||||
|
||||
## Context
|
||||
|
||||
The `p2p` package responsible for peer-to-peer networking is rather old and has a number of weaknesses, including tight coupling, leaky abstractions, lack of tests, DoS vulnerabilities, poor performance, custom protocols, and incorrect behavior. A refactor has been discussed for several years ([#2067](https://github.com/tendermint/tendermint/issues/2067)).
|
||||
|
||||
Informal Systems are also building a Rust implementation of Tendermint, [Tendermint-rs](https://github.com/informalsystems/tendermint-rs), and plan to implement P2P networking support over the next year. As part of this work, they have requested adopting e.g. [QUIC](https://datatracker.ietf.org/doc/draft-ietf-quic-transport/) as a transport protocol instead of implementing the custom application-level `MConnection` stream multiplexing protocol that Tendermint currently uses.
|
||||
|
||||
This ADR summarizes recent discussion with stakeholders on the scope of a P2P refactor. Specific designs and implementations will be submitted as separate ADRs.
|
||||
|
||||
## Alternative Approaches
|
||||
|
||||
There have been recurring proposals to adopt [LibP2P](https://libp2p.io) instead of maintaining our own P2P networking stack (see [#3696](https://github.com/tendermint/tendermint/issues/3696)). While this appears to be a good idea in principle, it would be a highly breaking protocol change, there are indications that we might have to fork and modify LibP2P, and there are concerns about the abstractions used.
|
||||
|
||||
In discussions with Informal Systems we decided to begin with incremental improvements to the current P2P stack, add support for pluggable transports, and then gradually start experimenting with LibP2P as a transport layer. If this proves successful, we can consider adopting it for higher-level components at a later time.
|
||||
|
||||
## Decision
|
||||
|
||||
The P2P stack will be refactored and improved iteratively, in several phases:
|
||||
|
||||
* **Phase 1:** code and API refactoring, maintaining protocol compatibility as far as possible.
|
||||
|
||||
* **Phase 2:** additional transports and incremental protocol improvements.
|
||||
|
||||
* **Phase 3:** disruptive protocol changes.
|
||||
|
||||
The scope of phases 2 and 3 is still uncertain, and will be revisited once the preceding phases have been completed as we'll have a better sense of requirements and challenges.
|
||||
|
||||
## Detailed Design
|
||||
|
||||
Separate ADRs will be submitted for specific designs and changes in each phase, following research and prototyping. Below are objectives in order of priority.
|
||||
|
||||
### Phase 1: Code and API Refactoring
|
||||
|
||||
This phase will focus on improving the internal abstractions and implementations in the `p2p` package. As far as possible, it should not change the P2P protocol in a backwards-incompatible way.
|
||||
|
||||
* Cleaner, decoupled abstractions for e.g. `Reactor`, `Switch`, and `Peer`. [#2067](https://github.com/tendermint/tendermint/issues/2067) [#5287](https://github.com/tendermint/tendermint/issues/5287) [#3833](https://github.com/tendermint/tendermint/issues/3833)
|
||||
* Reactors should receive messages in separate goroutines or via buffered channels. [#2888](https://github.com/tendermint/tendermint/issues/2888)
|
||||
* Improved peer lifecycle management. [#3679](https://github.com/tendermint/tendermint/issues/3679) [#3719](https://github.com/tendermint/tendermint/issues/3719) [#3653](https://github.com/tendermint/tendermint/issues/3653) [#3540](https://github.com/tendermint/tendermint/issues/3540) [#3183](https://github.com/tendermint/tendermint/issues/3183) [#3081](https://github.com/tendermint/tendermint/issues/3081) [#1356](https://github.com/tendermint/tendermint/issues/1356)
|
||||
* Peer prioritization. [#2860](https://github.com/tendermint/tendermint/issues/2860) [#2041](https://github.com/tendermint/tendermint/issues/2041)
|
||||
* Pluggable transports, with `MConnection` as one implementation. [#5587](https://github.com/tendermint/tendermint/issues/5587) [#2430](https://github.com/tendermint/tendermint/issues/2430) [#805](https://github.com/tendermint/tendermint/issues/805)
|
||||
* Improved peer address handling.
|
||||
* Address book refactor. [#4848](https://github.com/tendermint/tendermint/issues/4848) [#2661](https://github.com/tendermint/tendermint/issues/2661)
|
||||
* Transport-agnostic peer addressing. [#5587](https://github.com/tendermint/tendermint/issues/5587) [#3782](https://github.com/tendermint/tendermint/issues/3782) [#3692](https://github.com/tendermint/tendermint/issues/3692)
|
||||
* Improved detection and advertisement of own address. [#5588](https://github.com/tendermint/tendermint/issues/5588) [#4260](https://github.com/tendermint/tendermint/issues/4260) [#3716](https://github.com/tendermint/tendermint/issues/3716) [#1727](https://github.com/tendermint/tendermint/issues/1727)
|
||||
* Support multiple IPs per peer. [#1521](https://github.com/tendermint/tendermint/issues/1521) [#2317](https://github.com/tendermint/tendermint/issues/2317)
|
||||
|
||||
The refactor should attempt to address the following secondary objectives: testability, observability, performance, security, quality-of-service, backpressure, and DoS resilience. Much of this will be revisited as explicit objectives in phase 2.
|
||||
|
||||
Ideally, the refactor should happen incrementally, with regular merges to `master` every few weeks. This will take more time overall, and cause frequent breaking changes to internal Go APIs, but it reduces the branch drift and gets the code tested sooner and more broadly.
|
||||
|
||||
### Phase 2: Additional Transports and Protocol Improvements
|
||||
|
||||
This phase will focus on protocol improvements and other breaking changes. The following are considered proposals that will need to be evaluated separately once the refactor is done. Additional proposals are likely to be added during phase 1.
|
||||
|
||||
* QUIC transport. [#198](https://github.com/tendermint/spec/issues/198)
|
||||
* Noise protocol for secret connection handshake. [#5589](https://github.com/tendermint/tendermint/issues/5589) [#3340](https://github.com/tendermint/tendermint/issues/3340)
|
||||
* Peer ID in connection handshake. [#5590](https://github.com/tendermint/tendermint/issues/5590)
|
||||
* Peer and service discovery (e.g. RPC nodes, state sync snapshots). [#5481](https://github.com/tendermint/tendermint/issues/5481) [#4583](https://github.com/tendermint/tendermint/issues/4583)
|
||||
* Rate-limiting, backpressure, and QoS scheduling. [#4753](https://github.com/tendermint/tendermint/issues/4753) [#2338](https://github.com/tendermint/tendermint/issues/2338)
|
||||
* Compression. [#2375](https://github.com/tendermint/tendermint/issues/2375)
|
||||
* Improved metrics and tracing. [#3849](https://github.com/tendermint/tendermint/issues/3849) [#2600](https://github.com/tendermint/tendermint/issues/2600)
|
||||
* Simplified P2P configuration options.
|
||||
|
||||
### Phase 3: Disruptive Protocol Changes
|
||||
|
||||
This phase covers speculative, wide-reaching proposals that are poorly defined and highly uncertain. They will be evaluated once the previous phases are done.
|
||||
|
||||
* Adopt LibP2P. [#3696](https://github.com/tendermint/tendermint/issues/3696)
|
||||
* Allow cross-reactor communication, possibly without channels.
|
||||
* Dynamic channel advertisement, as reactors are enabled/disabled. [#4394](https://github.com/tendermint/tendermint/issues/4394) [#1148](https://github.com/tendermint/tendermint/issues/1148)
|
||||
* Pubsub-style networking topology and pattern.
|
||||
* Support multiple chain IDs in the same network.
|
||||
|
||||
## Status
|
||||
|
||||
Accepted
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
* Cleaner, simpler architecture that's easier to reason about and test, and thus hopefully less buggy.
|
||||
|
||||
* Improved performance and robustness.
|
||||
|
||||
* Reduced maintenance burden and increased interoperability by the possible adoption of standardized protocols such as QUIC and Noise.
|
||||
|
||||
* Improved usability, with better observability, simpler configuration, and more automation (e.g. peer/service/address discovery, rate-limiting, and backpressure).
|
||||
|
||||
### Negative
|
||||
|
||||
* Maintaining our own P2P networking stack is resource-intensive.
|
||||
|
||||
* Abstracting away the underlying transport may prevent usage of advanced transport features.
|
||||
|
||||
* Breaking changes to APIs and protocols are disruptive to users.
|
||||
|
||||
## References
|
||||
|
||||
See issue links above.
|
||||
|
||||
- [#2067: P2P Refactor](https://github.com/tendermint/tendermint/issues/2067)
|
||||
|
||||
- [P2P refactor brainstorm document](https://docs.google.com/document/d/1FUTADZyLnwA9z7ndayuhAdAFRKujhh_y73D0ZFdKiOQ/edit?pli=1#)
|
||||
@@ -1,615 +0,0 @@
|
||||
# ADR 062: P2P Architecture and Abstractions
|
||||
|
||||
## Changelog
|
||||
|
||||
- 2020-11-09: Initial version (@erikgrinaker)
|
||||
|
||||
- 2020-11-13: Remove stream IDs, move peer errors onto channel, note on moving PEX into core (@erikgrinaker)
|
||||
|
||||
- 2020-11-16: Notes on recommended reactor implementation patterns, approve ADR (@erikgrinaker)
|
||||
|
||||
- 2021-02-04: Update with new P2P core and Transport API changes (@erikgrinaker).
|
||||
|
||||
## Context
|
||||
|
||||
In [ADR 061](adr-061-p2p-refactor-scope.md) we decided to refactor the peer-to-peer (P2P) networking stack. The first phase is to redesign and refactor the internal P2P architecture, while retaining protocol compatibility as far as possible.
|
||||
|
||||
## Alternative Approaches
|
||||
|
||||
Several variations of the proposed design were considered, including e.g. calling interface methods instead of passing messages (like the current architecture), merging channels with streams, exposing the internal peer data structure to reactors, being message format-agnostic via arbitrary codecs, and so on. This design was chosen because it has very loose coupling, is simpler to reason about and more convenient to use, avoids race conditions and lock contention for internal data structures, gives reactors better control of message ordering and processing semantics, and allows for QoS scheduling and backpressure in a very natural way.
|
||||
|
||||
[multiaddr](https://github.com/multiformats/multiaddr) was considered as a transport-agnostic peer address format over regular URLs, but it does not appear to have very widespread adoption, and advanced features like protocol encapsulation and tunneling do not appear to be immediately useful to us.
|
||||
|
||||
There were also proposals to use LibP2P instead of maintaining our own P2P stack, which were rejected (for now) in [ADR 061](adr-061-p2p-refactor-scope.md).
|
||||
|
||||
The initial version of this ADR had a byte-oriented multi-stream transport API, but this had to be abandoned/postponed to maintain backwards-compatibility with the existing MConnection protocol which is message-oriented. See the rejected RFC in [tendermint/spec#227](https://github.com/tendermint/spec/pull/227) for details.
|
||||
|
||||
## Decision
|
||||
|
||||
The P2P stack will be redesigned as a message-oriented architecture, primarily relying on Go channels for communication and scheduling. It will use a message-oriented transport to exchange binary messages with individual peers, bidirectional peer-addressable channels to send and receive Protobuf messages, a router to route messages between reactors and peers, and a peer manager to manage peer lifecycle information. Message passing is asynchronous with at-most-once delivery.
|
||||
|
||||
## Detailed Design
|
||||
|
||||
This ADR is primarily concerned with the architecture and interfaces of the P2P stack, not implementation details. The interfaces described here should therefore be considered a rough architecture outline, not a complete and final design.
|
||||
|
||||
Primary design objectives have been:
|
||||
|
||||
* Loose coupling between components, for a simpler, more robust, and test-friendly architecture.
|
||||
* Pluggable transports (not necessarily networked).
|
||||
* Better scheduling of messages, with improved prioritization, backpressure, and performance.
|
||||
* Centralized peer lifecycle and connection management.
|
||||
* Better peer address detection, advertisement, and exchange.
|
||||
* Wire-level backwards compatibility with current P2P network protocols, except where it proves too obstructive.
|
||||
|
||||
The main abstractions in the new stack are:
|
||||
|
||||
* `Transport`: An arbitrary mechanism to exchange binary messages with a peer across a `Connection`.
|
||||
* `Channel`: A bidirectional channel to asynchronously exchange Protobuf messages with peers using node ID addressing.
|
||||
* `Router`: Maintains transport connections to relevant peers and routes channel messages.
|
||||
* `PeerManager`: Manages peer lifecycle information, e.g. deciding which peers to dial and when, using a `peerStore` for storage.
|
||||
* Reactor: A design pattern loosely defined as "something which listens on a channel and reacts to messages".
|
||||
|
||||
These abstractions are illustrated in the following diagram (representing the internals of node A) and described in detail below.
|
||||
|
||||

|
||||
|
||||
### Transports
|
||||
|
||||
Transports are arbitrary mechanisms for exchanging binary messages with a peer. For example, a gRPC transport would connect to a peer over TCP/IP and send data using the gRPC protocol, while an in-memory transport might communicate with a peer running in another goroutine using internal Go channels. Note that transports don't have a notion of a "peer" or "node" as such - instead, they establish connections between arbitrary endpoint addresses (e.g. IP address and port number), to decouple them from the rest of the P2P stack.
|
||||
|
||||
Transports must satisfy the following requirements:
|
||||
|
||||
* Be connection-oriented, and support both listening for inbound connections and making outbound connections using endpoint addresses.
|
||||
|
||||
* Support sending binary messages with distinct channel IDs (although channels and channel IDs are a higher-level application protocol concept explained in the Router section, they are threaded through the transport layer as well for backwards compatibility with the existing MConnection protocol).
|
||||
|
||||
* Exchange the MConnection `NodeInfo` and public key via a node handshake, and possibly encrypt or sign the traffic as appropriate.
|
||||
|
||||
The initial transport is a port of the MConnection protocol currently used by Tendermint, and should be backwards-compatible at the wire level. An in-memory transport for testing has also been implemented. There are plans to explore a QUIC transport that may replace the MConnection protocol.
|
||||
|
||||
The `Transport` interface is as follows:
|
||||
|
||||
```go
|
||||
// Transport is a connection-oriented mechanism for exchanging data with a peer.
|
||||
type Transport interface {
|
||||
// Protocols returns the protocols supported by the transport. The Router
|
||||
// uses this to pick a transport for an Endpoint.
|
||||
Protocols() []Protocol
|
||||
|
||||
// Endpoints returns the local endpoints the transport is listening on, if any.
|
||||
// How to listen is transport-dependent, e.g. MConnTransport uses Listen() while
|
||||
// MemoryTransport starts listening via MemoryNetwork.CreateTransport().
|
||||
Endpoints() []Endpoint
|
||||
|
||||
// Accept waits for the next inbound connection on a listening endpoint, blocking
|
||||
// until either a connection is available or the transport is closed. On closure,
|
||||
// io.EOF is returned and further Accept calls are futile.
|
||||
Accept() (Connection, error)
|
||||
|
||||
// Dial creates an outbound connection to an endpoint.
|
||||
Dial(context.Context, Endpoint) (Connection, error)
|
||||
|
||||
// Close stops accepting new connections, but does not close active connections.
|
||||
Close() error
|
||||
}
|
||||
```
|
||||
|
||||
How the transport configures listening is transport-dependent, and not covered by the interface. This typically happens during transport construction, where a single instance of the transport is created and set to listen on an appropriate network interface before being passed to the router.
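The `Protocols()` comment in the interface above says the router uses it to pick a transport for an endpoint. A minimal sketch of that lookup might look as follows. The names `protocolLister`, `stubTransport`, and `pickTransport` are hypothetical and exist only for this illustration; they are not part of the proposed API.

```go
package main

import "fmt"

// Protocol identifies a transport protocol, as in the ADR.
type Protocol string

// protocolLister is a hypothetical subset of the Transport interface,
// reduced to the one method this sketch needs.
type protocolLister interface {
	Protocols() []Protocol
}

// stubTransport is a stand-in transport used only for illustration.
type stubTransport struct {
	name      string
	protocols []Protocol
}

func (t stubTransport) Protocols() []Protocol { return t.protocols }

// pickTransport returns the index of the first transport that supports
// the given protocol, or -1 if none does.
func pickTransport(transports []protocolLister, p Protocol) int {
	for i, t := range transports {
		for _, supported := range t.Protocols() {
			if supported == p {
				return i
			}
		}
	}
	return -1
}

func main() {
	transports := []protocolLister{
		stubTransport{name: "mconn", protocols: []Protocol{"mconn"}},
		stubTransport{name: "memory", protocols: []Protocol{"memory"}},
	}
	fmt.Println(pickTransport(transports, "memory"))
	fmt.Println(pickTransport(transports, "quic"))
}
```

A first-match policy is the simplest option; a real router may well need a different rule when several transports support the same protocol.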
|
||||
|
||||
#### Endpoints
|
||||
|
||||
`Endpoint` represents a transport endpoint (e.g. an IP address and port). A connection always has two endpoints: one at the local node and one at the remote peer. Outbound connections to remote endpoints are made via `Dial()`, and inbound connections to listening endpoints are returned via `Accept()`.
|
||||
|
||||
The `Endpoint` struct is:
|
||||
|
||||
```go
|
||||
// Endpoint represents a transport connection endpoint, either local or remote.
|
||||
//
|
||||
// Endpoints are not necessarily networked (see e.g. MemoryTransport) but all
|
||||
// networked endpoints must use IP as the underlying transport protocol to allow
|
||||
// e.g. IP address filtering. Either IP or Path (or both) must be set.
|
||||
type Endpoint struct {
|
||||
// Protocol specifies the transport protocol.
|
||||
Protocol Protocol
|
||||
|
||||
// IP is an IP address (v4 or v6) to connect to. If set, this defines the
|
||||
// endpoint as a networked endpoint.
|
||||
IP net.IP
|
||||
|
||||
// Port is a network port (either TCP or UDP). If 0, a default port may be
|
||||
// used depending on the protocol.
|
||||
Port uint16
|
||||
|
||||
// Path is an optional transport-specific path or identifier.
|
||||
Path string
|
||||
}
|
||||
|
||||
// Protocol identifies a transport protocol.
|
||||
type Protocol string
|
||||
```
|
||||
|
||||
Endpoints are arbitrary transport-specific addresses, but if they are networked they must use IP addresses and thus rely on IP as a fundamental packet routing protocol. This enables policies for address discovery, advertisement, and exchange - for example, a private `192.168.0.0/24` IP address should only be advertised to peers on that IP network, while the public address `8.8.8.8` may be advertised to all peers. Similarly, any port numbers if given must represent TCP and/or UDP port numbers, in order to use [UPnP](https://en.wikipedia.org/wiki/Universal_Plug_and_Play) to autoconfigure e.g. NAT gateways.
|
||||
|
||||
Non-networked endpoints (without an IP address) are considered local, and will only be advertised to other peers connecting via the same protocol. For example, the in-memory transport used for testing uses `Endpoint{Protocol: "memory", Path: "foo"}` as an address for the node "foo", and this should only be advertised to other nodes using `Protocol: "memory"`.
|
||||
|
||||
#### Connections
|
||||
|
||||
A connection represents an established transport connection between two endpoints (i.e. two nodes), which can be used to exchange binary messages with logical channel IDs (corresponding to the higher-level channel IDs used in the router). Connections are set up either via `Transport.Dial()` (outbound) or `Transport.Accept()` (inbound).
|
||||
|
||||
Once a connection is established, `Connection.Handshake()` must be called to perform a node handshake, exchanging node info and public keys to verify node identities. Node handshakes should not really be part of the transport layer (they are an application protocol concern); this exists for backwards compatibility with the existing MConnection protocol, which conflates the two. `NodeInfo` is part of the existing MConnection protocol, but does not appear to be documented in the specification -- refer to the Go codebase for details.
|
||||
|
||||
The `Connection` interface is shown below. It omits certain additions that are currently implemented for backwards compatibility with the legacy P2P stack and are planned to be removed before the final release.
|
||||
|
||||
```go
|
||||
// Connection represents an established connection between two endpoints.
|
||||
type Connection interface {
|
||||
// Handshake executes a node handshake with the remote peer. It must be
|
||||
// called once the connection is established, and returns the remote peer's
|
||||
// node info and public key. The caller is responsible for validation.
|
||||
Handshake(context.Context, NodeInfo, crypto.PrivKey) (NodeInfo, crypto.PubKey, error)
|
||||
|
||||
// ReceiveMessage returns the next message received on the connection,
|
||||
// blocking until one is available. Returns io.EOF if closed.
|
||||
ReceiveMessage() (ChannelID, []byte, error)
|
||||
|
||||
// SendMessage sends a message on the connection. Returns io.EOF if closed.
|
||||
SendMessage(ChannelID, []byte) error
|
||||
|
||||
// LocalEndpoint returns the local endpoint for the connection.
|
||||
LocalEndpoint() Endpoint
|
||||
|
||||
// RemoteEndpoint returns the remote endpoint for the connection.
|
||||
RemoteEndpoint() Endpoint
|
||||
|
||||
// Close closes the connection.
|
||||
Close() error
|
||||
}
|
||||
```
|
||||
|
||||
This ADR initially proposed a byte-oriented multi-stream connection API that follows more typical networking API conventions (using e.g. `io.Reader` and `io.Writer` interfaces which easily compose with other libraries). This would also allow moving the responsibility for message framing, node handshakes, and traffic scheduling to the common router instead of reimplementing this across transports, and would allow making better use of multi-stream protocols such as QUIC. However, this would require minor breaking changes to the MConnection protocol which were rejected, see [tendermint/spec#227](https://github.com/tendermint/spec/pull/227) for details. This should be revisited when starting work on a QUIC transport.
|
||||
|
||||
### Peer Management
|
||||
|
||||
Peers are other Tendermint nodes. Each peer is identified by a unique `NodeID` (tied to the node's private key).
|
||||
|
||||
#### Peer Addresses
|
||||
|
||||
Nodes have one or more `NodeAddress` addresses, expressed as URLs, at which they can be reached. Examples of node addresses:
|
||||
|
||||
* `mconn://nodeid@host.domain.com:25567/path`
|
||||
* `memory:nodeid`
|
||||
|
||||
Addresses are resolved into one or more transport endpoints, e.g. by resolving DNS hostnames into IP addresses. Peers should always be expressed as address URLs rather than endpoints (which are a lower-level transport construct).
|
||||
|
||||
```go
|
||||
// NodeID is a hex-encoded crypto.Address. It must be lowercased
|
||||
// (for uniqueness) and of length 40.
|
||||
type NodeID string
|
||||
|
||||
// NodeAddress is a node address URL. It differs from a transport Endpoint in
|
||||
// that it contains the node's ID, and that the address hostname may be resolved
|
||||
// into multiple IP addresses (and thus multiple endpoints).
|
||||
//
|
||||
// If the URL is opaque, i.e. of the form "scheme:opaque", then the opaque part
|
||||
// is expected to contain a node ID.
|
||||
type NodeAddress struct {
|
||||
NodeID NodeID
|
||||
Protocol Protocol
|
||||
Hostname string
|
||||
Port uint16
|
||||
Path string
|
||||
}
|
||||
|
||||
// ParseNodeAddress parses a node address URL into a NodeAddress, normalizing
|
||||
// and validating it.
|
||||
func ParseNodeAddress(urlString string) (NodeAddress, error)
|
||||
|
||||
// Resolve resolves a NodeAddress into a set of Endpoints, e.g. by expanding
|
||||
// out a DNS hostname to IP addresses.
|
||||
func (a NodeAddress) Resolve(ctx context.Context) ([]Endpoint, error)
|
||||
```
|
||||
|
||||
#### Peer Manager
|
||||
|
||||
The P2P stack needs to track a lot of internal state about peers, such as their addresses, connection state, priorities, availability, failures, retries, and so on. This responsibility has been separated out to a `PeerManager`, which tracks this state for the `Router` (but does not maintain the actual transport connections themselves, which is the router's responsibility).
|
||||
|
||||
The `PeerManager` is a synchronous state machine, where all state transitions are serialized (implemented as synchronous method calls holding an exclusive mutex lock). Most peer state is intentionally kept internal, stored in a `peerStore` database that persists it as appropriate, and the external interfaces pass the minimum amount of information necessary in order to avoid shared state between router goroutines. This design significantly simplifies the model, making it much easier to reason about and test than if it was baked into the asynchronous ball of concurrency that the P2P networking core must necessarily be. As peer lifecycle events are expected to be relatively infrequent, this should not significantly impact performance either.
|
||||
|
||||
The `Router` uses the `PeerManager` to request which peers to dial and evict, and reports in with peer lifecycle events such as connections, disconnections, and failures as they occur. The manager can reject these events (e.g. reject an inbound connection) by returning errors. This happens as follows:
|
||||
|
||||
* Outbound connections, via `Transport.Dial`:
|
||||
* `DialNext()`: returns a peer address to dial, or blocks until one is available.
|
||||
* `DialFailed()`: reports a peer dial failure.
|
||||
* `Dialed()`: reports a peer dial success.
|
||||
* `Ready()`: reports the peer as routed and ready.
|
||||
* `Disconnected()`: reports a peer disconnection.
|
||||
|
||||
* Inbound connections, via `Transport.Accept`:
|
||||
* `Accepted()`: reports an inbound peer connection.
|
||||
* `Ready()`: reports the peer as routed and ready.
|
||||
* `Disconnected()`: reports a peer disconnection.
|
||||
|
||||
* Evictions, via `Connection.Close`:
|
||||
* `EvictNext()`: returns a peer to disconnect, or blocks until one is available.
|
||||
* `Disconnected()`: reports a peer disconnection.
|
||||
|
||||
These calls have the following interface:
|
||||
|
||||
```go
|
||||
// DialNext returns a peer address to dial, blocking until one is available.
|
||||
func (m *PeerManager) DialNext(ctx context.Context) (NodeAddress, error)
|
||||
|
||||
// DialFailed reports a dial failure for the given address.
|
||||
func (m *PeerManager) DialFailed(address NodeAddress) error
|
||||
|
||||
// Dialed reports a successful outbound connection to the given address.
|
||||
func (m *PeerManager) Dialed(address NodeAddress) error
|
||||
|
||||
// Accepted reports a successful inbound connection from the given node.
|
||||
func (m *PeerManager) Accepted(peerID NodeID) error
|
||||
|
||||
// Ready reports the peer as fully routed and ready for use.
|
||||
func (m *PeerManager) Ready(peerID NodeID) error
|
||||
|
||||
// EvictNext returns a peer ID to disconnect, blocking until one is available.
|
||||
func (m *PeerManager) EvictNext(ctx context.Context) (NodeID, error)
|
||||
|
||||
// Disconnected reports a peer disconnection.
|
||||
func (m *PeerManager) Disconnected(peerID NodeID) error
|
||||
```
|
||||
|
||||
Internally, the `PeerManager` uses a numeric peer score to prioritize peers, e.g. when deciding which peers to dial next. The scoring policy has not yet been implemented, but should take into account e.g. node configuration such as `persistent_peers`, uptime and connection failures, performance, and so on. The manager will also attempt to automatically upgrade to better-scored peers by evicting lower-scored peers when a better one becomes available (e.g. when a persistent peer comes back online after an outage).
|
||||
|
||||
The `PeerManager` should also have an API for reporting peer behavior from reactors that affects its score (e.g. signing a block increases the score, double-voting decreases it or even bans the peer), but this has not yet been designed and implemented.
|
||||
|
||||
Additionally, the `PeerManager` provides `PeerUpdates` subscriptions that will receive `PeerUpdate` events whenever significant peer state changes happen. Reactors can use these e.g. to know when peers are connected or disconnected, and take appropriate action. This is currently fairly minimal:
|
||||
|
||||
```go
|
||||
// Subscribe subscribes to peer updates. The caller must consume the peer updates
|
||||
// in a timely fashion and close the subscription when done, to avoid stalling the
|
||||
// PeerManager as delivery is semi-synchronous, guaranteed, and ordered.
|
||||
func (m *PeerManager) Subscribe() *PeerUpdates
|
||||
|
||||
// PeerUpdate is a peer update event sent via PeerUpdates.
|
||||
type PeerUpdate struct {
|
||||
NodeID NodeID
|
||||
Status PeerStatus
|
||||
}
|
||||
|
||||
// PeerStatus is a peer status.
|
||||
type PeerStatus string
|
||||
|
||||
const (
|
||||
PeerStatusUp PeerStatus = "up" // Connected and ready.
|
||||
PeerStatusDown PeerStatus = "down" // Disconnected.
|
||||
)
|
||||
|
||||
// PeerUpdates is a real-time peer update subscription.
|
||||
type PeerUpdates struct { ... }
|
||||
|
||||
// Updates returns a channel for consuming peer updates.
|
||||
func (pu *PeerUpdates) Updates() <-chan PeerUpdate
|
||||
|
||||
// Close closes the peer updates subscription.
|
||||
func (pu *PeerUpdates) Close()
|
||||
```
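As an illustration of how a reactor might consume these events, the sketch below keeps a set of connected peers up to date. It is an assumption-laden example: `trackPeers` is a hypothetical helper, and it reads from a plain Go channel that the example closes itself, rather than from a real `PeerUpdates` subscription.

```go
package main

import "fmt"

type NodeID string
type PeerStatus string

const (
	PeerStatusUp   PeerStatus = "up"   // Connected and ready.
	PeerStatusDown PeerStatus = "down" // Disconnected.
)

// PeerUpdate mirrors the event type defined in the ADR.
type PeerUpdate struct {
	NodeID NodeID
	Status PeerStatus
}

// trackPeers (hypothetical) consumes updates until the channel is closed
// and returns the set of peers that are currently up.
func trackPeers(updates <-chan PeerUpdate) map[NodeID]bool {
	connected := map[NodeID]bool{}
	for update := range updates {
		switch update.Status {
		case PeerStatusUp:
			connected[update.NodeID] = true
		case PeerStatusDown:
			delete(connected, update.NodeID)
		}
	}
	return connected
}

func main() {
	updates := make(chan PeerUpdate, 3)
	updates <- PeerUpdate{NodeID: "a", Status: PeerStatusUp}
	updates <- PeerUpdate{NodeID: "b", Status: PeerStatusUp}
	updates <- PeerUpdate{NodeID: "a", Status: PeerStatusDown}
	close(updates)

	connected := trackPeers(updates)
	fmt.Println(len(connected), connected["b"])
}
```

In a real reactor this loop would typically run in its own goroutine and select on a shutdown signal as well, since the subscription comment above requires updates to be consumed promptly.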
|
||||
|
||||
The `PeerManager` will also be responsible for providing peer information to the PEX reactor that can be gossiped to other nodes. This requires an improved system for peer address detection and advertisement that, for example, reliably detects peer and self addresses and only gossips private network addresses to other peers on the same network. This system has not yet been fully designed or implemented.
|
||||
|
||||
### Channels
|
||||
|
||||
While low-level data exchange happens via the `Transport`, the high-level API is based on a bidirectional `Channel` that can send and receive Protobuf messages addressed by `NodeID`. A channel is identified by an arbitrary `ChannelID` identifier, and can exchange Protobuf messages of one specific type (since the type to unmarshal into must be predefined). Message delivery is asynchronous and at-most-once.
|
||||
|
||||
The channel can also be used to report peer errors, e.g. when receiving an invalid or malicious message. This may cause the peer to be disconnected or banned depending on `PeerManager` policy, but should probably be replaced by a broader peer behavior API that can also report good behavior.
|
||||
|
||||
A `Channel` has this interface:
|
||||
|
||||
```go
|
||||
// ChannelID is an arbitrary channel ID.
|
||||
type ChannelID uint16
|
||||
|
||||
// Channel is a bidirectional channel to exchange Protobuf messages with peers.
|
||||
type Channel struct {
|
||||
ID ChannelID // Channel ID.
|
||||
In <-chan Envelope // Inbound messages (peers to reactors).
|
||||
Out chan<- Envelope // Outbound messages (reactors to peers).
|
||||
Error chan<- PeerError // Peer error reporting.
|
||||
messageType proto.Message // Channel's message type, for e.g. unmarshaling.
|
||||
}
|
||||
|
||||
// Close closes the channel, also closing Out and Error.
|
||||
func (c *Channel) Close() error
|
||||
|
||||
// Envelope specifies the message receiver and sender.
|
||||
type Envelope struct {
|
||||
From NodeID // Sender (empty if outbound).
|
||||
To NodeID // Receiver (empty if inbound).
|
||||
Broadcast bool // Send to all connected peers, ignoring To.
|
||||
Message proto.Message // Message payload.
|
||||
}
|
||||
|
||||
// PeerError is a peer error reported via the Error channel.
|
||||
type PeerError struct {
|
||||
NodeID NodeID
|
||||
Err error
|
||||
}
|
||||
```
|
||||
|
||||
A channel can reach any connected peer, and will automatically (un)marshal the Protobuf messages. Message scheduling and queueing is a `Router` implementation concern, and can use any number of algorithms such as FIFO, round-robin, priority queues, etc. Since message delivery is not guaranteed, both inbound and outbound messages may be dropped, buffered, reordered, or blocked as appropriate.
|
||||
|
||||
Since a channel can only exchange messages of a single type, it is often useful to use a wrapper message type with e.g. a Protobuf `oneof` field that specifies a set of inner message types that it can contain. The channel can automatically perform this (un)wrapping if the outer message type implements the `Wrapper` interface (see [Reactor Example](#reactor-example) for an example):
|
||||
|
||||
```go
|
||||
// Wrapper is a Protobuf message that can contain a variety of inner messages.
|
||||
// If a Channel's message type implements Wrapper, the channel will
|
||||
// automatically (un)wrap passed messages using the container type, such that
|
||||
// the channel can transparently support multiple message types.
|
||||
type Wrapper interface {
|
||||
proto.Message
|
||||
|
||||
// Wrap will take a message and wrap it in this one.
|
||||
Wrap(proto.Message) error
|
||||
|
||||
// Unwrap will unwrap the inner message contained in this message.
|
||||
Unwrap() (proto.Message, error)
|
||||
}
|
||||
```
|
||||
|
||||
### Routers
|
||||
|
||||
The router executes P2P networking for a node, taking instructions from and reporting events to the `PeerManager`, maintaining transport connections to peers, and routing messages between channels and peers.
|
||||
|
||||
Practically all concurrency in the P2P stack has been moved into the router and reactors, while as many other responsibilities as possible have been moved into separate components such as the `Transport` and `PeerManager` that can remain largely synchronous. Limiting concurrency to a single core component makes it much easier to reason about since there is only a single concurrency structure, while the remaining components can be serial, simple, and easily testable.
|
||||
|
||||
The `Router` has a very minimal API, since it is mostly driven by `PeerManager` and `Transport` events:
|
||||
|
||||
```go
|
||||
// Router maintains peer transport connections and routes messages between
|
||||
// peers and channels.
|
||||
type Router struct {
|
||||
// Some details have been omitted below.
|
||||
|
||||
logger log.Logger
|
||||
options RouterOptions
|
||||
nodeInfo NodeInfo
|
||||
privKey crypto.PrivKey
|
||||
peerManager *PeerManager
|
||||
transports []Transport
|
||||
|
||||
peerMtx sync.RWMutex
|
||||
peerQueues map[NodeID]queue
|
||||
|
||||
channelMtx sync.RWMutex
|
||||
channelQueues map[ChannelID]queue
|
||||
}
|
||||
|
||||
// OpenChannel opens a new channel for the given message type. The caller must
|
||||
// close the channel when done, before stopping the Router. messageType is the
|
||||
// type of message passed through the channel.
|
||||
func (r *Router) OpenChannel(id ChannelID, messageType proto.Message) (*Channel, error)
|
||||
|
||||
// Start starts the router, connecting to peers and routing messages.
|
||||
func (r *Router) Start() error
|
||||
|
||||
// Stop stops the router, disconnecting from all peers and stopping message routing.
|
||||
func (r *Router) Stop() error
|
||||
```
|
||||
|
||||
All Go channel sends in the `Router` and reactors are blocking (the router also selects on signal channels for closure and shutdown). The responsibility for message scheduling, prioritization, backpressure, and load shedding is centralized in a core `queue` interface that is used at contention points (i.e. from all peers to a single channel, and from all channels to a single peer):
|
||||
|
||||
```go
|
||||
// queue does QoS scheduling for Envelopes, enqueueing and dequeueing according
|
||||
// to some policy. Queues are used at contention points, i.e.:
|
||||
// - Receiving inbound messages to a single channel from all peers.
|
||||
// - Sending outbound messages to a single peer from all channels.
|
||||
type queue interface {
|
||||
// enqueue returns a channel for submitting envelopes.
|
||||
enqueue() chan<- Envelope
|
||||
|
||||
// dequeue returns a channel ordered according to some queueing policy.
|
||||
dequeue() <-chan Envelope
|
||||
|
||||
// close closes the queue. After this call enqueue() will block, so the
|
||||
// caller must select on closed() as well to avoid blocking forever. The
|
||||
// enqueue() and dequeue() channels will not be closed.
|
||||
close()
|
||||
|
||||
// closed returns a channel that's closed when the scheduler is closed.
|
||||
closed() <-chan struct{}
|
||||
}
|
||||
```
|
||||
|
||||
The current implementation is `fifoQueue`, which is a simple unbuffered lossless queue that passes messages in the order they were received and blocks until the message is delivered (i.e. it is a Go channel). The router will need a more sophisticated queueing policy, but this has not yet been implemented.
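As a rough illustration of the description above, a channel-backed FIFO queue could look like the following minimal sketch. The `Envelope` stand-in, the field names, and the `trySend` helper are assumptions made for this example and are not taken from the actual implementation; the point is only to show why callers must select on `closed()` when enqueueing.

```go
package main

import (
	"fmt"
	"sync"
)

// Envelope is a simplified stand-in for the p2p Envelope type (assumption).
type Envelope struct {
	Message string
}

// fifoQueue is a sketch of an unbuffered, lossless FIFO queue: enqueue and
// dequeue expose the two ends of the same Go channel.
type fifoQueue struct {
	queueCh   chan Envelope
	closeCh   chan struct{}
	closeOnce sync.Once
}

func newFIFOQueue() *fifoQueue {
	return &fifoQueue{
		queueCh: make(chan Envelope),
		closeCh: make(chan struct{}),
	}
}

func (q *fifoQueue) enqueue() chan<- Envelope { return q.queueCh }
func (q *fifoQueue) dequeue() <-chan Envelope { return q.queueCh }
func (q *fifoQueue) closed() <-chan struct{}  { return q.closeCh }

// close only closes the signal channel; the queue channel itself stays open,
// and calling close more than once is safe.
func (q *fifoQueue) close() { q.closeOnce.Do(func() { close(q.closeCh) }) }

// trySend (hypothetical helper) blocks until the envelope is delivered or the
// queue is closed, whichever happens first.
func trySend(q *fifoQueue, e Envelope) bool {
	select {
	case q.enqueue() <- e:
		return true
	case <-q.closed():
		return false
	}
}

func main() {
	q := newFIFOQueue()
	received := make(chan Envelope)
	go func() { received <- <-q.dequeue() }()

	fmt.Println(trySend(q, Envelope{Message: "ping"}), (<-received).Message)

	q.close()
	fmt.Println(trySend(q, Envelope{Message: "late"}))
}
```

Because the queue channel is unbuffered, a sender without the `closed()` case would block forever once the consumer has gone away, which is the situation the `close()` comment in the interface warns about.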
|
||||
|
||||
The internal `Router` goroutine structure and design is described in the `Router` GoDoc, which is included below for reference:
|
||||
|
||||
```go
|
||||
// On startup, three main goroutines are spawned to maintain peer connections:
|
||||
//
|
||||
// dialPeers(): in a loop, calls PeerManager.DialNext() to get the next peer
|
||||
// address to dial and spawns a goroutine that dials the peer, handshakes
|
||||
// with it, and begins to route messages if successful.
|
||||
//
|
||||
// acceptPeers(): in a loop, waits for an inbound connection via
|
||||
// Transport.Accept() and spawns a goroutine that handshakes with it and
|
||||
// begins to route messages if successful.
|
||||
//
|
||||
// evictPeers(): in a loop, calls PeerManager.EvictNext() to get the next
|
||||
// peer to evict, and disconnects it by closing its message queue.
|
||||
//
|
||||
// When a peer is connected, an outbound peer message queue is registered in
|
||||
// peerQueues, and routePeer() is called to spawn off two additional goroutines:
|
||||
//
|
||||
// sendPeer(): waits for an outbound message from the peerQueues queue,
|
||||
// marshals it, and passes it to the peer transport which delivers it.
|
||||
//
|
||||
// receivePeer(): waits for an inbound message from the peer transport,
|
||||
// unmarshals it, and passes it to the appropriate inbound channel queue
|
||||
// in channelQueues.
|
||||
//
|
||||
// When a reactor opens a channel via OpenChannel, an inbound channel message
|
||||
// queue is registered in channelQueues, and a channel goroutine is spawned:
|
||||
//
|
||||
// routeChannel(): waits for an outbound message from the channel, looks
|
||||
// up the recipient peer's outbound message queue in peerQueues, and submits
|
||||
// the message to it.
|
||||
//
|
||||
// All channel sends in the router are blocking. It is the responsibility of the
|
||||
// queue interface in peerQueues and channelQueues to prioritize and drop
|
||||
// messages as appropriate during contention to prevent stalls and ensure good
|
||||
// quality of service.
|
||||
```
|
||||
|
||||
### Reactor Example
|
||||
|
||||
While reactors are a first-class concept in the current P2P stack (i.e. there is an explicit `p2p.Reactor` interface), they will simply be a design pattern in the new stack, loosely defined as "something which listens on a channel and reacts to messages".
|
||||
|
||||
Since reactors have very few formal constraints, they can be implemented in a variety of ways. There is currently no recommended pattern for implementing reactors, to avoid overspecification and scope creep in this ADR. However, prototyping and developing a reactor pattern should be done early during implementation, to make sure reactors built using the `Channel` interface can satisfy the needs for convenience, deterministic tests, and reliability.
|
||||
|
||||
Below is a trivial example of a simple echo reactor implemented as a function. The reactor will exchange the following Protobuf messages:
|
||||
|
||||
```protobuf
|
||||
message EchoMessage {
|
||||
oneof inner {
|
||||
PingMessage ping = 1;
|
||||
PongMessage pong = 2;
|
||||
}
|
||||
}
|
||||
|
||||
message PingMessage {
|
||||
string content = 1;
|
||||
}
|
||||
|
||||
message PongMessage {
|
||||
string content = 1;
|
||||
}
|
||||
```
|
||||
|
||||
Implementing the `Wrapper` interface for `EchoMessage` allows transparently passing `PingMessage` and `PongMessage` through the channel, where it will automatically be (un)wrapped in an `EchoMessage`:
|
||||
|
||||
```go
|
||||
func (m *EchoMessage) Wrap(inner proto.Message) error {
|
||||
switch inner := inner.(type) {
|
||||
case *PingMessage:
|
||||
m.Inner = &EchoMessage_PingMessage{Ping: inner}
|
||||
case *PongMessage:
|
||||
m.Inner = &EchoMessage_PongMessage{Pong: inner}
|
||||
default:
|
||||
return fmt.Errorf("unknown message %T", inner)
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
func (m *EchoMessage) Unwrap() (proto.Message, error) {
|
||||
switch inner := m.Inner.(type) {
|
||||
case *EchoMessage_PingMessage:
|
||||
return inner.Ping, nil
|
||||
case *EchoMessage_PongMessage:
|
||||
return inner.Pong, nil
|
||||
default:
|
||||
return nil, fmt.Errorf("unknown message %T", inner)
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The reactor itself would be implemented e.g. like this:
|
||||
|
||||
```go
|
||||
// RunEchoReactor wires up an echo reactor to a router and runs it.
|
||||
func RunEchoReactor(router *p2p.Router, peerManager *p2p.PeerManager) error {
|
||||
channel, err := router.OpenChannel(1, &EchoMessage{})
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
defer channel.Close()
|
||||
peerUpdates := peerManager.Subscribe()
|
||||
defer peerUpdates.Close()
|
||||
|
||||
return EchoReactor(context.Background(), channel, peerUpdates)
|
||||
}
|
||||
|
||||
// EchoReactor provides an echo service, pinging all known peers until the given
|
||||
// context is canceled.
|
||||
func EchoReactor(ctx context.Context, channel *p2p.Channel, peerUpdates *p2p.PeerUpdates) error {
|
||||
ticker := time.NewTicker(5 * time.Second)
|
||||
defer ticker.Stop()
|
||||
|
||||
for {
|
||||
select {
|
||||
// Send ping message to all known peers every 5 seconds.
|
||||
case <-ticker.C:
|
||||
channel.Out <- Envelope{
|
||||
Broadcast: true,
|
||||
Message: &PingMessage{Content: "👋"},
|
||||
}
|
||||
|
||||
// When we receive a message from a peer, either respond to ping, output
|
||||
// pong, or report peer error on unknown message type.
|
||||
case envelope := <-channel.In:
|
||||
switch msg := envelope.Message.(type) {
|
||||
case *PingMessage:
|
||||
channel.Out <- Envelope{
|
||||
To: envelope.From,
|
||||
Message: &PongMessage{Content: msg.Content},
|
||||
}
|
||||
|
||||
case *PongMessage:
|
||||
fmt.Printf("%q replied with %q\n", envelope.From, msg.Content)
|
||||
|
||||
default:
|
||||
channel.Error <- PeerError{
|
||||
PeerID: envelope.From,
|
||||
Err: fmt.Errorf("unexpected message %T", msg),
|
||||
}
|
||||
}
|
||||
|
||||
// Output info about any peer status changes.
|
||||
case peerUpdate := <-peerUpdates:
|
||||
fmt.Printf("Peer %q changed status to %q\n", peerUpdate.PeerID, peerUpdate.Status)
|
||||
|
||||
// Exit when context is canceled.
|
||||
case <-ctx.Done():
|
||||
return nil
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Status
|
||||
|
||||
Partially implemented ([#5670](https://github.com/tendermint/tendermint/issues/5670))
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
* Reduced coupling and simplified interfaces should lead to better understandability, increased reliability, and more testing.
|
||||
|
||||
* Using message passing via Go channels gives better control of backpressure and quality-of-service scheduling.
|
||||
|
||||
* Peer lifecycle and connection management is centralized in a single entity, making it easier to reason about.
|
||||
|
||||
* Detection, advertisement, and exchange of node addresses will be improved.
|
||||
|
||||
* Additional transports (e.g. QUIC) can be implemented and used in parallel with the existing MConn protocol.
|
||||
|
||||
* The P2P protocol will not be broken in the initial version, if possible.
|
||||
|
||||
### Negative
|
||||
|
||||
* Fully implementing the new design as intended is likely to require breaking changes to the P2P protocol at some point, although the initial implementation shouldn't.
|
||||
|
||||
* Gradually migrating the existing stack and maintaining backwards-compatibility will be more labor-intensive than simply replacing the entire stack.
|
||||
|
||||
* A complete overhaul of P2P internals is likely to cause temporary performance regressions and bugs as the implementation matures.
|
||||
|
||||
* Hiding peer management information inside the `PeerManager` may prevent certain functionality or require additional deliberate interfaces for information exchange, as a tradeoff to simplify the design, reduce coupling, and avoid race conditions and lock contention.
|
||||
|
||||
### Neutral
|
||||
|
||||
* Implementation details around e.g. peer management, message scheduling, and peer and endpoint advertisement are not yet determined.
|
||||
|
||||
## References
|
||||
|
||||
* [ADR 061: P2P Refactor Scope](adr-061-p2p-refactor-scope.md)
|
||||
* [#5670 p2p: internal refactor and architecture redesign](https://github.com/tendermint/tendermint/issues/5670)
|
||||
@@ -1,109 +0,0 @@
|
||||
# ADR 063: Privval gRPC
|
||||
|
||||
## Changelog
|
||||
|
||||
- 23/11/2020: Initial Version (@marbar3778)
|
||||
|
||||
## Context
|
||||
|
||||
Validators use remote signers to help secure their keys. This system is Tendermint's recommended way to secure validators, but the path to integration with Tendermint's private validator client is plagued with custom protocols.
|
||||
|
||||
Tendermint uses its own custom secure connection protocol (`SecretConnection`) and a raw tcp/unix socket connection protocol. Until recently, the secure connection protocol was exposed to man-in-the-middle attacks, and it can take longer to integrate when not using Go. The raw tcp connection protocol is less custom, but has been causing minor issues for users.
|
||||
|
||||
Migrating Tendermint's private validator client to a widely adopted protocol, gRPC, will ease the maintenance and integration burden experienced with the current protocol.
|
||||
|
||||
## Decision
|
||||
|
||||
After discussion with multiple stakeholders, [gRPC](https://grpc.io/) was chosen to replace the current private validator protocol. gRPC is a widely adopted protocol in the micro-service and cloud infrastructure world. gRPC uses [protocol-buffers](https://developers.google.com/protocol-buffers) to describe its services, providing a language-agnostic implementation. Tendermint already uses protobuf for on-disk and over-the-wire encoding, which makes the integration with gRPC simpler.
|
||||
|
||||
## Alternative Approaches
|
||||
|
||||
- JSON-RPC: We did not consider JSON-RPC because Tendermint uses protobuf extensively, making gRPC a natural choice.
|
||||
|
||||
## Detailed Design
|
||||
|
||||
With the recent integration of [Protobuf](https://developers.google.com/protocol-buffers) into Tendermint, the changes needed to migrate from the current private validator protocol to gRPC are not large.
|
||||
|
||||
The [service definition](https://grpc.io/docs/what-is-grpc/core-concepts/#service-definition) for gRPC will be defined as:
|
||||
|
||||
```proto
|
||||
service PrivValidatorAPI {
  rpc GetPubKey(tendermint.proto.privval.PubKeyRequest) returns (tendermint.proto.privval.PubKeyResponse);
  rpc SignVote(tendermint.proto.privval.SignVoteRequest) returns (tendermint.proto.privval.SignedVoteResponse);
  rpc SignProposal(tendermint.proto.privval.SignProposalRequest) returns (tendermint.proto.privval.SignedProposalResponse);
}

// PubKeyRequest is a request for the signer's public key.
message PubKeyRequest {
  string chain_id = 1;
}

// PubKeyResponse is a response message containing the public key.
message PubKeyResponse {
  tendermint.crypto.PublicKey pub_key = 1 [(gogoproto.nullable) = false];
}

// SignVoteRequest is a request to sign a vote.
message SignVoteRequest {
  tendermint.types.Vote vote = 1;
  string chain_id = 2;
}

// SignedVoteResponse is a response containing a signed vote or an error.
message SignedVoteResponse {
  tendermint.types.Vote vote = 1 [(gogoproto.nullable) = false];
}

// SignProposalRequest is a request to sign a proposal.
message SignProposalRequest {
  tendermint.types.Proposal proposal = 1;
  string chain_id = 2;
}

// SignedProposalResponse is a response containing a signed proposal or an error.
message SignedProposalResponse {
  tendermint.types.Proposal proposal = 1 [(gogoproto.nullable) = false];
}
|
||||
```
|
||||
|
||||
> Note: Remote Signer errors are removed in favor of [gRPC status error codes](https://grpc.io/docs/guides/error/).
|
||||
|
||||
In previous versions of the remote signer, Tendermint acted as the server and the remote signer as the client. The client established a long-lived connection, which gave the server a way to make requests to the client. The new version simplifies this: Tendermint is the client and the remote signer is the server, which follows a conventional client-server architecture.
|
||||
|
||||
#### Keep Alive
|
||||
|
||||
If you have worked on the private validator system, you will see that we are removing the `PingRequest` and `PingResponse` messages. These messages were used to keep the connection alive. gRPC has a [keep alive feature](https://github.com/grpc/grpc/blob/master/doc/keepalive.md) that will be added alongside the integration to provide the same functionality.
|
||||
|
||||
#### Metrics
|
||||
|
||||
Remote signers are crucial to operating secure and consistently available validators. In the past, there were no metrics to tell the operator that something was wrong other than the node not signing. Metrics will be integrated into the client and the provided server with [prometheus](https://github.com/grpc-ecosystem/go-grpc-prometheus), and will be included in the node's prometheus export for node operators.
|
||||
|
||||
#### Security
|
||||
|
||||
[TLS](https://en.wikipedia.org/wiki/Transport_Layer_Security) is widely adopted with the use of gRPC. There are two forms of TLS authentication, one-way and two-way. In one-way TLS the client verifies the identity of the server, while in two-way TLS both parties verify each other. For Tendermint's use case, having both parties identify each other adds an extra layer of security. This requires users to generate both client and server certificates for a TLS connection.
|
||||
|
||||
An insecure option will be provided for users who do not wish to secure the connection.
|
||||
|
||||
#### Upgrade Path
|
||||
|
||||
This is a largely breaking change for validator operators. The optimal upgrade path would be to release gRPC in a minor release and allow key management systems to migrate to the new protocol. The current system (raw tcp/unix) would then be removed in the next major release. This allows users to migrate to the new system without having to coordinate upgrading the key management system alongside a network upgrade.
|
||||
|
||||
The upgrade of [tmkms](https://github.com/iqlusioninc/tmkms) will be coordinated with Iqlusion. They will be able to make the necessary upgrades to allow users to migrate to gRPC from the current protocol.
|
||||
|
||||
## Status
|
||||
|
||||
|
||||
Implemented
|
||||
|
||||
### Positive
|
||||
|
||||
- Use an adopted standard for secure communication. (TLS)
|
||||
- Use an adopted communication protocol. (gRPC)
|
||||
- Requests are multiplexed onto the tcp connection. (http/2)
|
||||
- Language agnostic service definition.
|
||||
|
||||
### Negative
|
||||
|
||||
- Users will need to generate certificates to use TLS. (Added step)
|
||||
- Users will need to find a key management system that supports gRPC.
|
||||
|
||||
### Neutral
|
||||
@@ -1,90 +0,0 @@
|
||||
# ADR 064: Batch Verification
|
||||
|
||||
## Changelog
|
||||
|
||||
- January 28, 2021: Created (@marbar3778)
|
||||
|
||||
## Context
|
||||
|
||||
Tendermint uses public/private key cryptography for validator signing. When a block is proposed and voted on, validators sign a message representing acceptance of the block; rejection is signaled via a nil vote. These signatures are also used to verify that previous blocks are correct when a node is syncing. Currently, Tendermint requires each signature to be verified individually, which slows down block times.
|
||||
|
||||
Batch Verification is the process of taking many messages, keys, and signatures, combining them, and verifying them all at once. The public key can be the same, in which case a single user is signing many messages. In our case each public key is unique: each validator has its own key and contributes a unique message. The algorithm can vary from curve to curve, but the performance benefit over verifying each message, public key, and signature individually is shared.
|
||||
|
||||
## Alternative Approaches
|
||||
|
||||
- Signature aggregation
|
||||
- Signature aggregation is an alternative to batch verification. Signature aggregation leads to fast verification and smaller block sizes. At the time of writing this ADR there is ongoing work to enable signature aggregation in Tendermint. We have opted not to introduce it at this time because every validator signs a unique message.
|
||||
Signing a unique message prevents aggregation before verification. For example if we were to implement signature aggregation with BLS, there could be a potential slow down of 10x-100x in verification speeds.
|
||||
|
||||
## Decision
|
||||
|
||||
Adopt Batch Verification.
|
||||
|
||||
## Detailed Design
|
||||
|
||||
A new interface will be introduced. This interface will have two methods, `Add` and `Verify`, and verifiers will be created through a `NewBatchVerifier` constructor.
|
||||
|
||||
```go
|
||||
type BatchVerifier interface {
|
||||
Add(key crypto.Pubkey, signature, message []byte) error // Add appends an entry into the BatchVerifier.
|
||||
Verify() bool // Verify verifies all the entries in the BatchVerifier. If the verification fails it is unknown which entry failed and each entry will need to be verified individually.
|
||||
}
|
||||
```
|
||||
|
||||
- `NewBatchVerifier` creates a new verifier. This verifier will be populated with entries to be verified.
|
||||
- `Add` adds an entry to the Verifier. Add accepts a public key and two byte slices (signature and message).
|
||||
- `Verify` verifies all the entries. If the underlying API does not reset the Verifier to its initial (empty) state at the end of Verify, the reset should be done here. This prevents accidentally reusing the verifier with entries from a previous verification.
|
||||
|
||||
Above there is mention of an entry. An entry can be constructed in many ways depending on the needs of the underlying curve. A simple approach would be:
|
||||
|
||||
```go
|
||||
type entry struct {
|
||||
pubKey crypto.Pubkey
|
||||
signature []byte
|
||||
message []byte
|
||||
}
|
||||
```
|
||||
|
||||
The main reason this approach is being taken is to prevent simple mistakes. Some APIs allow the user to create three slices and pass them to a batch verification function, but this relies on the user to safely generate all the slices (see example below). We would like to minimize the possibility of making a mistake.
|
||||
|
||||
```go
|
||||
func Verify(keys []crypto.Pubkey, signatures, messages[][]byte) bool
|
||||
```
|
||||
|
||||
This change will not affect users in any way other than faster verification times.
|
||||
|
||||
This new API will be used for verification in both consensus and block syncing. Within the current Verify functions there will be a check to see if the key type supports the BatchVerification API. If it does, batch verification will be executed; if not, single signature verification will be used.
|
||||
|
||||
#### Consensus
|
||||
|
||||
The process within consensus will be to wait for 2/3+ of the votes to be received. Once they are received, `Verify()` will be called to batch verify all the messages. Messages that arrive after 2/3+ has been verified will be verified individually.
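The flow described above can be sketched as a small buffer that tracks accumulated voting power and reports which verification path each incoming vote would take. All names here (`vote`, `voteBuffer`, `hasTwoThirdsMajority`) are hypothetical and exist only for illustration; the sketch also assumes voting powers small enough that the multiplication cannot overflow.

```go
package main

import "fmt"

// vote is a simplified stand-in carrying only what the example needs.
type vote struct {
	validator string
	power     int64
}

// hasTwoThirdsMajority reports whether votedPower is strictly more than two
// thirds of totalPower, using integer arithmetic to avoid rounding.
func hasTwoThirdsMajority(votedPower, totalPower int64) bool {
	return totalPower > 0 && votedPower*3 > totalPower*2
}

// voteBuffer collects votes until 2/3+ of the voting power has been seen.
type voteBuffer struct {
	totalPower int64
	seenPower  int64
	pending    []vote
	batched    bool
}

// add records a vote and returns the path it would take:
//   - "buffer": still below the threshold, keep waiting
//   - "batch":  threshold just crossed, batch verify everything pending
//   - "single": arrived after the batch, verify individually
func (b *voteBuffer) add(v vote) string {
	if b.batched {
		return "single"
	}
	b.pending = append(b.pending, v)
	b.seenPower += v.power
	if hasTwoThirdsMajority(b.seenPower, b.totalPower) {
		b.batched = true
		return "batch"
	}
	return "buffer"
}

func main() {
	b := &voteBuffer{totalPower: 4}
	for _, name := range []string{"a", "b", "c", "d"} {
		fmt.Println(name, b.add(vote{validator: name, power: 1}))
	}
}
```

With four validators of equal power, the third vote crosses the threshold (3 of 4 is more than two thirds) and the fourth vote falls through to individual verification.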
|
||||
|
||||
#### Block Sync & Light Client
|
||||
|
||||
The process for block sync & light client verification will be to verify only 2/3+ in a batch style. Since these processes are not participating in consensus there is no need to wait for more messages.
|
||||
|
||||
If batch verification fails for any reason, it will not be known which entry caused the failure. Verification will need to revert to single signature verification.
|
||||
|
||||
Starting out, only ed25519 will support batch verification.
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
### Positive
|
||||
|
||||
- Faster verification times, if the curve supports it
|
||||
|
||||
### Negative
|
||||
|
||||
- No way to see which key failed verification
|
||||
- A failure means reverting back to single signature verification.
|
||||
|
||||
### Neutral
|
||||
|
||||
## References
|
||||
|
||||
[Ed25519 Library](https://github.com/hdevalence/ed25519consensus)
|
||||
[Ed25519 spec](https://ed25519.cr.yp.to/)
|
||||
[Signature Aggregation for votes](https://github.com/tendermint/tendermint/issues/1319)
|
||||
[Proposer-based timestamps](https://github.com/tendermint/tendermint/issues/2840)
|
||||
@@ -1,425 +0,0 @@
|
||||
# ADR 065: Custom Event Indexing
|
||||
|
||||
- [ADR 065: Custom Event Indexing](#adr-065-custom-event-indexing)
|
||||
- [Changelog](#changelog)
|
||||
- [Status](#status)
|
||||
- [Context](#context)
|
||||
- [Alternative Approaches](#alternative-approaches)
|
||||
- [Decision](#decision)
|
||||
- [Detailed Design](#detailed-design)
|
||||
- [EventSink](#eventsink)
|
||||
- [Supported Sinks](#supported-sinks)
|
||||
- [`KVEventSink`](#kveventsink)
|
||||
- [`PSQLEventSink`](#psqleventsink)
|
||||
- [Configuration](#configuration)
|
||||
- [Future Improvements](#future-improvements)
|
||||
- [Consequences](#consequences)
|
||||
- [Positive](#positive)
|
||||
- [Negative](#negative)
|
||||
- [Neutral](#neutral)
|
||||
- [References](#references)
|
||||
|
||||
## Changelog
|
||||
|
||||
- April 1, 2021: Initial Draft (@alexanderbez)
|
||||
- April 28, 2021: Specify search capabilities are only supported through the KV indexer (@marbar3778)
|
||||
- May 19, 2021: Update the SQL schema and the eventsink interface (@jayt106)
|
||||
- Aug 30, 2021: Update the SQL schema and the psql implementation (@creachadair)
|
||||
- Oct 5, 2021: Clarify goals and implementation changes (@creachadair)
|
||||
|
||||
## Status
|
||||
|
||||
Accepted
|
||||
|
||||
## Context
|
||||
|
||||
Currently, Tendermint Core supports block and transaction event indexing through
|
||||
the `tx_index.indexer` configuration. Events are captured in transactions and
|
||||
are indexed via a `TxIndexer` type. Events are captured in blocks, specifically
|
||||
from `BeginBlock` and `EndBlock` application responses, and are indexed via a
|
||||
`BlockIndexer` type. Both of these types are managed by a single `IndexerService`
|
||||
which is responsible for consuming events and sending those events off to be
|
||||
indexed by the respective type.
|
||||
|
||||
In addition to indexing, Tendermint Core also supports the ability to query for
|
||||
both indexed transaction and block events via Tendermint's RPC layer. The ability
|
||||
to query for these indexed events facilitates a great multitude of upstream client
|
||||
and application capabilities, e.g. block explorers, IBC relayers, and auxiliary
|
||||
data availability and indexing services.
|
||||
|
||||
Currently, Tendermint only supports indexing via a `kv` indexer, which is supported
|
||||
by an underlying embedded key/value store database. The `kv` indexer implements
|
||||
its own indexing and query mechanisms. While the former is somewhat trivial,
|
||||
providing a rich and flexible query layer is not as trivial and has caused many
|
||||
issues and UX concerns for upstream clients and applications.
|
||||
|
||||
The fragile nature of the proprietary `kv` query engine and the potential
|
||||
performance and scaling issues that arise when a large number of consumers are
|
||||
introduced, motivate the need for a more robust and flexible indexing and query
|
||||
solution.
|
||||
|
||||
## Alternative Approaches
|
||||
|
||||
With regards to alternative approaches to a more robust solution, the only serious
|
||||
contender that was considered was to transition to using [SQLite](https://www.sqlite.org/index.html).
|
||||
|
||||
While the approach would work, it locks us into a specific query language and
|
||||
storage layer, so in some ways it's only a bit better than our current approach.
|
||||
In addition, the implementation would require the introduction of CGO into the
|
||||
Tendermint Core stack, whereas right now CGO is only introduced depending on
|
||||
the database used.
|
||||
|
||||
## Decision
|
||||
|
||||
We will adopt a similar approach to that of the Cosmos SDK's `KVStore` state
|
||||
listening described in [ADR-038](https://github.com/cosmos/cosmos-sdk/blob/master/docs/architecture/adr-038-state-listening.md).
|
||||
|
||||
We will implement the following changes:
|
||||
|
||||
- Introduce a new interface, `EventSink`, that all data sinks must implement.
|
||||
- Augment the existing `tx_index.indexer` configuration to now accept a series
|
||||
of one or more indexer types, i.e., sinks.
|
||||
- Combine the current `TxIndexer` and `BlockIndexer` into a single `KVEventSink`
|
||||
that implements the `EventSink` interface.
|
||||
- Introduce an additional `EventSink` implementation that is backed by
|
||||
[PostgreSQL](https://www.postgresql.org/).
|
||||
- Implement the necessary schemas to support both block and transaction event indexing.
|
||||
- Update `IndexerService` to use a series of `EventSinks`.
|
||||
|
||||
In addition:
|
||||
|
||||
- The Postgres indexer implementation will _not_ implement the proprietary `kv`
|
||||
query language. Users wishing to write queries against the Postgres indexer
|
||||
will connect to the underlying DBMS directly and use SQL queries based on the
|
||||
indexing schema.
|
||||
|
||||
Future custom indexer implementations will not be required to support the
|
||||
proprietary query language either.
|
||||
|
||||
- For now, the existing `kv` indexer will be left in place with its current
|
||||
query support, but will be marked as deprecated in a subsequent release, and
|
||||
the documentation will be updated to encourage users who need to query the
|
||||
event index to migrate to the Postgres indexer.
|
||||
|
||||
- In the future we may remove the `kv` indexer entirely, or replace it with a
|
||||
different implementation; that decision is deferred as future work.
|
||||
|
||||
- In the future, we may remove the index query endpoints from the RPC service
|
||||
entirely; that decision is deferred as future work, but recommended.
|
||||
|
||||
|
||||
## Detailed Design
|
||||
|
||||
### EventSink
|
||||
|
||||
We introduce the `EventSink` interface type that all supported sinks must implement.
|
||||
The interface is defined as follows:
|
||||
|
||||
```go
|
||||
type EventSink interface {
|
||||
IndexBlockEvents(types.EventDataNewBlockHeader) error
|
||||
IndexTxEvents([]*abci.TxResult) error
|
||||
|
||||
SearchBlockEvents(context.Context, *query.Query) ([]int64, error)
|
||||
SearchTxEvents(context.Context, *query.Query) ([]*abci.TxResult, error)
|
||||
|
||||
GetTxByHash([]byte) (*abci.TxResult, error)
|
||||
HasBlock(int64) (bool, error)
|
||||
|
||||
Type() EventSinkType
|
||||
Stop() error
|
||||
}
|
||||
```
|
||||
|
||||
The `IndexerService` will accept a list of one or more `EventSink` types. During
|
||||
the `OnStart` method it will call the appropriate APIs on each `EventSink` to
|
||||
index both block and transaction events.
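As a rough sketch of how a service might fan events out to several sinks, consider the reduced example below. The `blockSink` interface is a deliberately cut-down, hypothetical stand-in for `EventSink` (it takes a bare height instead of the real event types), and the choice to keep indexing into the remaining sinks after one fails is an assumption of this example, not something this ADR specifies.

```go
package main

import (
	"errors"
	"fmt"
)

// blockSink is a reduced, hypothetical stand-in for the EventSink interface.
type blockSink interface {
	IndexBlockEvents(height int64) error
	Type() string
}

// memSink is an in-memory sink used only to make the example runnable.
type memSink struct {
	name    string
	heights []int64
	fail    bool
}

func (s *memSink) IndexBlockEvents(height int64) error {
	if s.fail {
		return errors.New("sink unavailable")
	}
	s.heights = append(s.heights, height)
	return nil
}

func (s *memSink) Type() string { return s.name }

// indexBlock forwards a block to every configured sink. It visits all sinks
// even if one fails and returns the first error encountered, if any.
func indexBlock(sinks []blockSink, height int64) error {
	var firstErr error
	for _, s := range sinks {
		if err := s.IndexBlockEvents(height); err != nil && firstErr == nil {
			firstErr = fmt.Errorf("%s sink: %w", s.Type(), err)
		}
	}
	return firstErr
}

func main() {
	kv := &memSink{name: "kv"}
	psql := &memSink{name: "psql", fail: true}

	err := indexBlock([]blockSink{kv, psql}, 10)
	fmt.Println(len(kv.heights), err)
}
```

Keeping the sinks behind a single small interface is what allows the service to treat the key/value and PostgreSQL backends uniformly.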
|
||||
|
||||
### Supported Sinks
|
||||
|
||||
We will initially support two `EventSink` types out of the box.
|
||||
|
||||
#### `KVEventSink`
|
||||
|
||||
This type of `EventSink` is a combination of the `TxIndexer` and `BlockIndexer`
|
||||
indexers, both of which are backed by a single embedded key/value database.
|
||||
|
||||
A bulk of the existing business logic will remain the same, but the existing APIs
|
||||
will be mapped to the new `EventSink` API. Both types will be removed in favor of a single
|
||||
`KVEventSink` type.
|
||||
|
||||
The `KVEventSink` will be the only `EventSink` enabled by default, so from a UX
|
||||
perspective, operators should not notice a difference apart from a configuration
|
||||
change.
|
||||
|
||||
We omit `EventSink` implementation details as it should be fairly straightforward
|
||||
to map the existing business logic to the new APIs.
|
||||
|
||||
#### `PSQLEventSink`
|
||||
|
||||
This type of `EventSink` indexes block and transaction events into a [PostgreSQL](https://www.postgresql.org/)
|
||||
database. We define and automatically migrate the following schema when the
|
||||
`IndexerService` starts.
|
||||
|
||||
The postgres eventsink will not support `tx_search`, `block_search`, `GetTxByHash` and `HasBlock`.
|
||||
|
||||
```sql
|
||||
-- Table Definition ----------------------------------------------
|
||||
|
||||
-- The blocks table records metadata about each block.
|
||||
-- The block record does not include its events or transactions (see tx_results).
|
||||
CREATE TABLE blocks (
|
||||
rowid BIGSERIAL PRIMARY KEY,
|
||||
|
||||
height BIGINT NOT NULL,
|
||||
chain_id VARCHAR NOT NULL,
|
||||
|
||||
-- When this block header was logged into the sink, in UTC.
|
||||
created_at TIMESTAMPTZ NOT NULL,
|
||||
|
||||
UNIQUE (height, chain_id)
|
||||
);
|
||||
|
||||
-- Index blocks by height and chain, since we need to resolve block IDs when
|
||||
-- indexing transaction records and transaction events.
|
||||
CREATE INDEX idx_blocks_height_chain ON blocks(height, chain_id);
|
||||
|
||||
-- The tx_results table records metadata about transaction results. Note that
|
||||
-- the events from a transaction are stored separately.
|
||||
CREATE TABLE tx_results (
|
||||
rowid BIGSERIAL PRIMARY KEY,
|
||||
|
||||
-- The block to which this transaction belongs.
|
||||
block_id BIGINT NOT NULL REFERENCES blocks(rowid),
|
||||
-- The sequential index of the transaction within the block.
|
||||
index INTEGER NOT NULL,
|
||||
-- When this result record was logged into the sink, in UTC.
|
||||
created_at TIMESTAMPTZ NOT NULL,
|
||||
-- The hex-encoded hash of the transaction.
|
||||
tx_hash VARCHAR NOT NULL,
|
||||
-- The protobuf wire encoding of the TxResult message.
|
||||
tx_result BYTEA NOT NULL,
|
||||
|
||||
UNIQUE (block_id, index)
|
||||
);
|
||||
|
||||
-- The events table records events. All events (both block and transaction) are
|
||||
-- associated with a block ID; transaction events also have a transaction ID.
|
||||
CREATE TABLE events (
|
||||
rowid BIGSERIAL PRIMARY KEY,
|
||||
|
||||
-- The block and transaction this event belongs to.
|
||||
-- If tx_id is NULL, this is a block event.
|
||||
block_id BIGINT NOT NULL REFERENCES blocks(rowid),
|
||||
tx_id BIGINT NULL REFERENCES tx_results(rowid),
|
||||
|
||||
-- The application-defined type label for the event.
|
||||
type VARCHAR NOT NULL
|
||||
);
|
||||
|
||||
-- The attributes table records event attributes.
|
||||
CREATE TABLE attributes (
|
||||
event_id BIGINT NOT NULL REFERENCES events(rowid),
|
||||
key VARCHAR NOT NULL, -- bare key
|
||||
composite_key VARCHAR NOT NULL, -- composed type.key
|
||||
value VARCHAR NULL,
|
||||
|
||||
UNIQUE (event_id, key)
|
||||
);
|
||||
|
||||
-- A joined view of events and their attributes. Events that do not have any
|
||||
-- attributes are represented as a single row with empty key and value fields.
|
||||
CREATE VIEW event_attributes AS
|
||||
SELECT block_id, tx_id, type, key, composite_key, value
|
||||
FROM events LEFT JOIN attributes ON (events.rowid = attributes.event_id);
|
||||
|
||||
-- A joined view of all block events (those having tx_id NULL).
|
||||
CREATE VIEW block_events AS
|
||||
SELECT blocks.rowid as block_id, height, chain_id, type, key, composite_key, value
|
||||
FROM blocks JOIN event_attributes ON (blocks.rowid = event_attributes.block_id)
|
||||
WHERE event_attributes.tx_id IS NULL;
|
||||
|
||||
-- A joined view of all transaction events.
|
||||
CREATE VIEW tx_events AS
|
||||
SELECT height, index, chain_id, type, key, composite_key, value, tx_results.created_at
|
||||
FROM blocks JOIN tx_results ON (blocks.rowid = tx_results.block_id)
|
||||
JOIN event_attributes ON (tx_results.rowid = event_attributes.tx_id)
|
||||
WHERE event_attributes.tx_id IS NOT NULL;
|
||||
```
|
||||
|
||||
The `PSQLEventSink` will implement the `EventSink` interface as follows
|
||||
(some details omitted for brevity):
|
||||
|
||||
```go
|
||||
func NewEventSink(connStr, chainID string) (*EventSink, error) {
|
||||
db, err := sql.Open(driverName, connStr)
|
||||
// ...
|
||||
|
||||
return &EventSink{
|
||||
store: db,
|
||||
chainID: chainID,
|
||||
}, nil
|
||||
}
|
||||
|
||||
func (es *EventSink) IndexBlockEvents(h types.EventDataNewBlockHeader) error {
|
||||
ts := time.Now().UTC()
|
||||
|
||||
return runInTransaction(es.store, func(tx *sql.Tx) error {
|
||||
// Add the block to the blocks table and report back its row ID for use
|
||||
// in indexing the events for the block.
|
||||
blockID, err := queryWithID(tx, `
|
||||
INSERT INTO blocks (height, chain_id, created_at)
|
||||
VALUES ($1, $2, $3)
|
||||
ON CONFLICT DO NOTHING
|
||||
RETURNING rowid;
|
||||
`, h.Header.Height, es.chainID, ts)
|
||||
// ...
|
||||
|
||||
// Insert the special block meta-event for height.
|
||||
if err := insertEvents(tx, blockID, 0, []abci.Event{
|
||||
makeIndexedEvent(types.BlockHeightKey, fmt.Sprint(h.Header.Height)),
|
||||
}); err != nil {
|
||||
return fmt.Errorf("block meta-events: %w", err)
|
||||
}
|
||||
// Insert all the block events. Order is important here.
|
||||
if err := insertEvents(tx, blockID, 0, h.ResultBeginBlock.Events); err != nil {
|
||||
return fmt.Errorf("begin-block events: %w", err)
|
||||
}
|
||||
if err := insertEvents(tx, blockID, 0, h.ResultEndBlock.Events); err != nil {
|
||||
return fmt.Errorf("end-block events: %w", err)
|
||||
}
|
||||
return nil
|
||||
})
|
||||
}
|
||||
|
||||
func (es *EventSink) IndexTxEvents(txrs []*abci.TxResult) error {
|
||||
ts := time.Now().UTC()
|
||||
|
||||
for _, txr := range txrs {
|
||||
// Encode the result message in protobuf wire format for indexing.
|
||||
resultData, err := proto.Marshal(txr)
|
||||
// ...
|
||||
|
||||
// Index the hash of the underlying transaction as a hex string.
|
||||
txHash := fmt.Sprintf("%X", types.Tx(txr.Tx).Hash())
|
||||
|
||||
if err := runInTransaction(es.store, func(tx *sql.Tx) error {
|
||||
// Find the block associated with this transaction.
|
||||
blockID, err := queryWithID(tx, `
|
||||
SELECT rowid FROM blocks WHERE height = $1 AND chain_id = $2;
|
||||
`, txr.Height, es.chainID)
|
||||
// ...
|
||||
|
||||
// Insert a record for this tx_result and capture its ID for indexing events.
|
||||
txID, err := queryWithID(tx, `
|
||||
INSERT INTO tx_results (block_id, index, created_at, tx_hash, tx_result)
|
||||
VALUES ($1, $2, $3, $4, $5)
|
||||
ON CONFLICT DO NOTHING
|
||||
RETURNING rowid;
|
||||
`, blockID, txr.Index, ts, txHash, resultData)
|
||||
// ...
|
||||
|
||||
// Insert the special transaction meta-events for hash and height.
|
||||
if err := insertEvents(tx, blockID, txID, []abci.Event{
|
||||
makeIndexedEvent(types.TxHashKey, txHash),
|
||||
makeIndexedEvent(types.TxHeightKey, fmt.Sprint(txr.Height)),
|
||||
}); err != nil {
|
||||
return fmt.Errorf("indexing transaction meta-events: %w", err)
|
||||
}
|
||||
// Index any events packaged with the transaction.
|
||||
if err := insertEvents(tx, blockID, txID, txr.Result.Events); err != nil {
|
||||
return fmt.Errorf("indexing transaction events: %w", err)
|
||||
}
|
||||
return nil
|
||||
|
||||
}); err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
// SearchBlockEvents is not implemented by this sink, and reports an error for all queries.
|
||||
func (es *EventSink) SearchBlockEvents(ctx context.Context, q *query.Query) ([]int64, error)
|
||||
|
||||
// SearchTxEvents is not implemented by this sink, and reports an error for all queries.
|
||||
func (es *EventSink) SearchTxEvents(ctx context.Context, q *query.Query) ([]*abci.TxResult, error)
|
||||
|
||||
// GetTxByHash is not implemented by this sink, and reports an error for all queries.
|
||||
func (es *EventSink) GetTxByHash(hash []byte) (*abci.TxResult, error)
|
||||
|
||||
// HasBlock is not implemented by this sink, and reports an error for all queries.
|
||||
func (es *EventSink) HasBlock(h int64) (bool, error)
|
||||
```
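
The listing above calls `makeIndexedEvent` without showing it. A minimal sketch of what such a helper might do follows, assuming the keys passed in are composite strings of the form `type.key` (matching the `composite_key` column in the schema). The `Event` and `EventAttribute` types here are simplified, hypothetical stand-ins for the ABCI types, not the real definitions.

```go
package main

import (
	"fmt"
	"strings"
)

// EventAttribute and Event are simplified stand-ins for the ABCI event
// types (assumption: only the fields needed for this sketch are modeled).
type EventAttribute struct {
	Key   string
	Value string
	Index bool
}

type Event struct {
	Type       string
	Attributes []EventAttribute
}

// makeIndexedEvent builds an event from a composite key of the form
// "type.key". A key without a "." yields an event with no attributes.
func makeIndexedEvent(compositeKey, value string) Event {
	i := strings.Index(compositeKey, ".")
	if i < 0 {
		return Event{Type: compositeKey}
	}
	return Event{
		Type: compositeKey[:i],
		Attributes: []EventAttribute{
			{Key: compositeKey[i+1:], Value: value, Index: true},
		},
	}
}

func main() {
	ev := makeIndexedEvent("tx.height", "42")
	fmt.Printf("%s %s=%s\n", ev.Type, ev.Attributes[0].Key, ev.Attributes[0].Value)
}
```

Splitting on the first `.` keeps the event type and the bare attribute key separate, which is what the `events.type` and `attributes.key` columns expect.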
|
||||
|
||||
### Configuration
|
||||
|
||||
The current `tx_index.indexer` configuration would be changed to accept a list
|
||||
of supported `EventSink` types instead of a single value.
|
||||
|
||||
Example:
|
||||
|
||||
```toml
[tx_index]

indexer = [
  "kv",
  "psql"
]
```
|
||||
|
||||
If the `indexer` list contains the `null` indexer, then no indexers will be used
|
||||
regardless of what other values may exist.
|
||||
|
||||
Additional configuration parameters might be required depending on what event
|
||||
sinks are supplied to `tx_index.indexer`. The `psql` sink will require an additional
|
||||
connection configuration.
|
||||
|
||||
```toml
|
||||
[tx_index]
|
||||
|
||||
indexer = [
|
||||
"kv",
|
||||
"psql"
|
||||
]
|
||||
|
||||
psql_conn = "postgresql://<user>:<password>@<host>:<port>/<db>?<opts>"
|
||||
```
|
||||
|
||||
Any invalid or misconfigured `tx_index` configuration should yield an error as
|
||||
early as possible.
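
The configuration rules in this section could be checked at startup roughly as follows. This is only a sketch: `TxIndexConfig`, `resolveSinks`, and the field names are hypothetical, and the set of accepted sink names (`null`, `kv`, `psql`) is taken from the examples in this section.

```go
package main

import (
	"errors"
	"fmt"
)

// TxIndexConfig is a hypothetical, simplified mirror of the [tx_index]
// configuration section.
type TxIndexConfig struct {
	Indexer  []string
	PsqlConn string
}

// resolveSinks validates the configuration and returns the sinks to enable.
// Unknown sink names are rejected, the "null" indexer disables indexing
// regardless of the other entries, and "psql" requires a connection string.
func resolveSinks(cfg TxIndexConfig) ([]string, error) {
	seen := make(map[string]bool, len(cfg.Indexer))
	for _, name := range cfg.Indexer {
		switch name {
		case "null", "kv", "psql":
			seen[name] = true
		default:
			return nil, fmt.Errorf("unsupported event sink type %q", name)
		}
	}
	if seen["null"] {
		return nil, nil
	}
	if seen["psql"] && cfg.PsqlConn == "" {
		return nil, errors.New("the psql sink requires a connection string")
	}
	return cfg.Indexer, nil
}

func main() {
	sinks, err := resolveSinks(TxIndexConfig{
		Indexer:  []string{"kv", "psql"},
		PsqlConn: "postgresql://user:pass@localhost:5432/db",
	})
	fmt.Println(sinks, err)

	_, err = resolveSinks(TxIndexConfig{Indexer: []string{"psql"}})
	fmt.Println(err)
}
```

Running the check before any sink is constructed is one way to surface a misconfiguration before the node starts indexing.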
|
||||
|
||||
## Future Improvements
|
||||
|
||||
Although not technically required to maintain feature parity with the current
|
||||
existing Tendermint indexer, it would be beneficial for operators to have a method
|
||||
of performing a "re-index". Specifically, Tendermint operators could invoke an
|
||||
RPC method that allows the Tendermint node to perform a re-indexing of all block
|
||||
and transaction events between two given heights, H<sub>1</sub> and H<sub>2</sub>,
|
||||
so long as the block store contains the blocks and transaction results for all
|
||||
the heights specified in a given range.
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- A more robust and flexible indexing and query engine for indexing and searching
|
||||
block and transaction events.
|
||||
- The ability to not have to support a custom indexing and query engine beyond
|
||||
the legacy `kv` type.
|
||||
- The ability to offload/proxy indexing and querying to the underlying sink.
|
||||
- Scalability and reliability that essentially comes "for free" from the underlying
|
||||
sink, if it supports it.
|
||||
|
||||
### Negative
|
||||
|
||||
- The need to support multiple and potentially a growing set of custom `EventSink`
|
||||
types.
|
||||
|
||||
### Neutral
|
||||
|
||||
## References
|
||||
|
||||
- [Cosmos SDK ADR-038](https://github.com/cosmos/cosmos-sdk/blob/master/docs/architecture/adr-038-state-listening.md)
|
||||
- [PostgreSQL](https://www.postgresql.org/)
|
||||
- [SQLite](https://www.sqlite.org/index.html)
|
||||
@@ -1,140 +0,0 @@
|
||||
# ADR 66: End-to-End Testing
|
||||
|
||||
## Changelog
|
||||
|
||||
- 2020-09-07: Initial draft (@erikgrinaker)
|
||||
- 2020-09-08: Minor improvements (@erikgrinaker)
|
||||
- 2021-04-12: Renamed from RFC 001 (@tessr)
|
||||
|
||||
## Authors
|
||||
|
||||
- Erik Grinaker (@erikgrinaker)
|
||||
|
||||
## Context
|
||||
|
||||
The current set of end-to-end tests under `test/` are very limited, mostly focusing on P2P testing in a standard configuration. They do not test various configurations (e.g. fast sync reactor versions, state sync, block pruning, genesis vs InitChain setup), nor do they test various network topologies (e.g. sentry node architecture). This leads to poor test coverage, which has caused several serious bugs to go unnoticed.
|
||||
|
||||
We need an end-to-end test suite that can run a large number of combinations of configuration options, genesis settings, network topologies, ABCI interactions, and failure scenarios and check that the network is still functional. This ADR outlines the basic requirements and design for such a system.
|
||||
|
||||
This ADR will not cover comprehensive chaos testing, only a few simple scenarios (e.g. abrupt process termination and network partitioning). Chaos testing of the core consensus algorithm should be implemented e.g. via Jepsen tests or a similar framework, or alternatively be added to these end-to-end tests at a later time. Similarly, malicious or adversarial behavior is out of scope for the first implementation, but may be added later.
|
||||
|
||||
## Proposal
|
||||
|
||||
### Functional Coverage
|
||||
|
||||
The following lists the functionality we would like to test:
|
||||
|
||||
#### Environments
|
||||
|
||||
- **Topology:** single node, 4 nodes (seeds and persistent), sentry architecture, NAT (UPnP)
|
||||
- **Networking:** IPv4, IPv6
|
||||
- **ABCI connection:** UNIX socket, TCP, gRPC
|
||||
- **PrivVal:** file, UNIX socket, TCP
|
||||
|
||||
#### Node/App Configurations
|
||||
|
||||
- **Database:** goleveldb, cleveldb, boltdb, rocksdb, badgerdb
|
||||
- **Fast sync:** disabled, v0, v2
|
||||
- **State sync:** disabled, enabled
|
||||
- **Block pruning:** none, keep 20, keep 1, keep random
|
||||
- **Role:** validator, full node
|
||||
- **App persistence:** enabled, disabled
|
||||
- **Node modes:** validator, full, light, seed
|
||||
|
||||
#### Geneses
|
||||
|
||||
- **Validators:** none (InitChain), given
|
||||
- **Initial height:** 1, 1000
|
||||
- **App state:** none, given
|
||||
|
||||
#### Behaviors
|
||||
|
||||
- **Recovery:** stop/start, power cycling, validator outage, network partition, total network loss
|
||||
- **Validators:** add, remove, change power
|
||||
- **Evidence:** injection of DuplicateVoteEvidence and LightClientAttackEvidence
|
||||
|
||||
### Functional Combinations
|
||||
|
||||
Running separate tests for all combinations of the above functionality is not feasible, as there are millions of them. However, the functionality can be grouped into three broad classes:
|
||||
|
||||
- **Global:** affects the entire network, needing a separate testnet for each combination (e.g. topology, network protocol, genesis settings)
|
||||
|
||||
- **Local:** affects a single node, and can be varied per node in a testnet (e.g. ABCI/privval connections, database backend, block pruning)
|
||||
|
||||
- **Temporal:** can be run after each other in the same testnet (e.g. recovery and validator changes)
|
||||
|
||||
Thus, we can run separate testnets for all combinations of global options (on the order of 100). In each testnet, we run nodes with randomly generated node configurations optimized for broad coverage (i.e. if one node is using GoLevelDB, then no other node should use it if possible). And in each testnet, we sequentially and randomly pick nodes to stop/start, power cycle, add/remove, disconnect, and so on.
|
||||
|
||||
All of the settings should be specified in a testnet configuration (or alternatively the seed that generated it) such that it can be retrieved from CI and debugged locally.
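
As an illustration of the approach described here, the sketch below enumerates every combination of a few global options and uses a seed to assign database backends to nodes so that backends repeat as little as possible. The `Option` type, the function names, and the option values are hypothetical; a real generator would cover many more dimensions.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Option is a named configuration dimension with its possible values.
type Option struct {
	Name   string
	Values []string
}

// combinations returns the cartesian product of the given options, one map
// per testnet.
func combinations(opts []Option) []map[string]string {
	result := []map[string]string{{}}
	for _, opt := range opts {
		var next []map[string]string
		for _, partial := range result {
			for _, v := range opt.Values {
				combo := make(map[string]string, len(partial)+1)
				for k, pv := range partial {
					combo[k] = pv
				}
				combo[opt.Name] = v
				next = append(next, combo)
			}
		}
		result = next
	}
	return result
}

// pickBackends assigns each node a database backend by cycling through a
// seed-shuffled list, so a backend repeats only after all have been used.
func pickBackends(seed int64, nodes int, backends []string) []string {
	if len(backends) == 0 {
		return nil
	}
	r := rand.New(rand.NewSource(seed))
	shuffled := append([]string(nil), backends...)
	r.Shuffle(len(shuffled), func(i, j int) {
		shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
	})
	out := make([]string, nodes)
	for i := range out {
		out[i] = shuffled[i%len(shuffled)]
	}
	return out
}

func main() {
	global := []Option{
		{Name: "topology", Values: []string{"single", "quad"}},
		{Name: "ip", Values: []string{"ipv4", "ipv6"}},
		{Name: "initial_height", Values: []string{"1", "1000"}},
	}
	fmt.Println(len(combinations(global)), "testnets")
	fmt.Println(pickBackends(42, 4, []string{"goleveldb", "cleveldb", "boltdb", "rocksdb", "badgerdb"}))
}
```

Because the per-node choices depend only on the seed, recording the seed alongside the global options is enough to regenerate the same testnet locally.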
|
||||
|
||||
A custom ABCI application will have to be built that can exhibit the necessary behavior (e.g. make validator changes, prune blocks, enable/disable persistence, and so on).
|
||||
|
||||
### Test Stages
|
||||
|
||||
Given a test configuration, the test runner has the following stages:
|
||||
|
||||
- **Setup:** configures the Docker containers and networks, but does not start them.
|
||||
|
||||
- **Initialization:** starts the Docker containers, performs fast sync/state sync. Accommodates different start heights.
|
||||
|
||||
- **Perturbation:** adds/removes validators, restarts nodes, perturbs networking, etc. Liveness and readiness are checked between each operation.
|
||||
|
||||
- **Testing:** runs RPC tests independently against all network nodes, making sure data matches expectations and invariants hold.
|
||||
|
||||
### Tests
|
||||
|
||||
The general approach will be to put the network through a sequence of operations (see stages above), check basic liveness and readiness after each operation, and then once the network stabilizes run an RPC test suite against each node in the network.
|
||||
|
||||
The test suite will do black-box testing against a single node's RPC service. We will be testing the behavior of the network as a whole, e.g. that a fast synced node correctly catches up to the chain head and serves basic block data via RPC. Thus the tests will not send e.g. P2P messages or examine the node database, as these are considered internal implementation details - if the network behaves correctly, presumably the internal components function correctly. Comprehensive component testing (e.g. each and every RPC method parameter) should be done via unit/integration tests.
|
||||
|
||||
The tests must take into account the node configuration (e.g. some nodes may be pruned, others may not be validators), and should somehow be provided access to expected data (i.e. complete block headers for the entire chain).
|
||||
|
||||
The test suite should use the Tendermint RPC client and the Tendermint light client, to exercise the client code as well.
|
||||
|
||||
### Implementation Considerations
|
||||
|
||||
The testnets should run in Docker Compose, both locally and in CI. This makes it easier to reproduce test failures locally. Supporting multiple test-runners (e.g. on VMs or Kubernetes) is out of scope. The same image should be used for all tests, with configuration passed via a mounted volume.
|
||||
|
||||
There do not appear to be any off-the-shelf solutions that would do this for us, so we will have to roll our own on top of Docker Compose. This gives us more flexibility, but is estimated to be a few weeks of work.
|
||||
|
||||
Testnets should be configured via a YAML file. These are used as inputs for the test runner, which e.g. generates Docker Compose configurations from them. An additional layer on top should generate these testnet configurations from a YAML file that specifies all the option combinations to test.
|
||||
|
||||
Comprehensive testnets should run against master nightly. However, a small subset of representative testnets should run for each pull request, e.g. a four-node IPv4 network with state sync and fast sync.
|
||||
|
||||
Tests should be written using the standard Go test framework (and e.g. Testify), with a helper function to fetch info from the test configuration. The test runner will run the tests separately for each network node, and the test must vary its expectations based on the node's configuration.
|
||||
|
||||
It should be possible to launch a specific testnet and run individual test cases from the IDE or local terminal against it.
|
||||
|
||||
If possible, the existing `testnet` command should be extended to set up the network topologies needed by the end-to-end tests.
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Comprehensive end-to-end test coverage of basic Tendermint functionality, exercising common code paths in the same way that users would
|
||||
|
||||
- Test environments can easily be reproduced locally and debugged via standard tooling
|
||||
|
||||
### Negative
|
||||
|
||||
- Limited coverage of consensus correctness testing (e.g. Jepsen)
|
||||
|
||||
- No coverage of malicious or adversarial behavior
|
||||
|
||||
- Have to roll our own test framework, which takes engineering resources
|
||||
|
||||
- Possibly slower CI times, depending on which tests are run
|
||||
|
||||
- Operational costs and overhead, e.g. infrastructure costs and system maintenance
|
||||
|
||||
### Neutral
|
||||
|
||||
- No support for alternative infrastructure platforms, e.g. Kubernetes or VMs
|
||||
|
||||
## References
|
||||
|
||||
- [#5291: new end-to-end test suite](https://github.com/tendermint/tendermint/issues/5291)
|
||||
@@ -1,303 +0,0 @@
|
||||
# ADR 067: Mempool Refactor
|
||||
|
||||
- [ADR 067: Mempool Refactor](#adr-067-mempool-refactor)
|
||||
- [Changelog](#changelog)
|
||||
- [Status](#status)
|
||||
- [Context](#context)
|
||||
- [Current Design](#current-design)
|
||||
- [Alternative Approaches](#alternative-approaches)
|
||||
- [Prior Art](#prior-art)
|
||||
- [Ethereum](#ethereum)
|
||||
- [Diem](#diem)
|
||||
- [Decision](#decision)
|
||||
- [Detailed Design](#detailed-design)
|
||||
- [CheckTx](#checktx)
|
||||
- [Mempool](#mempool)
|
||||
- [Eviction](#eviction)
|
||||
- [Gossiping](#gossiping)
|
||||
- [Performance](#performance)
|
||||
- [Future Improvements](#future-improvements)
|
||||
- [Consequences](#consequences)
|
||||
- [Positive](#positive)
|
||||
- [Negative](#negative)
|
||||
- [Neutral](#neutral)
|
||||
- [References](#references)
|
||||
|
||||
## Changelog
|
||||
|
||||
- April 19, 2021: Initial Draft (@alexanderbez)
|
||||
|
||||
## Status
|
||||
|
||||
Accepted
|
||||
|
||||
## Context
|
||||
|
||||
Tendermint Core has a reactor and data structure, mempool, that facilitates the
|
||||
ephemeral storage of uncommitted transactions. Honest nodes participating in a
|
||||
Tendermint network gossip these uncommitted transactions to each other if they
|
||||
pass the application's `CheckTx`. In addition, block proposers select from the
|
||||
mempool a subset of uncommitted transactions to include in the next block.
|
||||
|
||||
Currently, the mempool in Tendermint Core is designed as a FIFO queue. In other
|
||||
words, transactions are included in blocks as they are received by a node. There
|
||||
currently is no explicit and prioritized ordering of these uncommitted transactions.
|
||||
This presents a few technical and UX challenges for operators and applications.
|
||||
|
||||
Namely, validators are not able to prioritize transactions by their fees or any
|
||||
incentive aligned mechanism. In addition, the lack of prioritization also leads
|
||||
to cascading effects in terms of DoS and various attack vectors on networks,
|
||||
e.g. [cosmos/cosmos-sdk#8224](https://github.com/cosmos/cosmos-sdk/discussions/8224).
|
||||
|
||||
Thus, Tendermint Core needs the ability for an application and its users to
|
||||
prioritize transactions in a flexible and performant manner. Specifically, we're
|
||||
aiming to either improve, maintain or add the following properties in the
|
||||
Tendermint mempool:
|
||||
|
||||
- Allow application-determined transaction priority.
|
||||
- Allow efficient concurrent reads and writes.
|
||||
- Allow block proposers to reap transactions efficiently by priority.
|
||||
- Maintain a fixed mempool capacity by transaction size and evict lower priority
|
||||
transactions to make room for higher priority transactions.
|
||||
- Allow transactions to be gossiped by priority efficiently.
|
||||
- Allow operators to specify a maximum TTL for transactions in the mempool before
|
||||
they're automatically evicted if not selected for a block proposal in time.
|
||||
- Ensure the design allows for future extensions, such as replace-by-priority and
|
||||
allowing multiple pending transactions per sender, to be incorporated easily.
|
||||
|
||||
Note, not all of these properties will be addressed by the proposed changes in
|
||||
this ADR. However, this proposal will ensure that any unaddressed properties
|
||||
can be addressed in an easy and extensible manner in the future.
|
||||
|
||||
### Current Design
|
||||
|
||||

|
||||
|
||||
At the core of the `v0` mempool reactor is a concurrent linked-list. This is the
|
||||
primary data structure that contains `Tx` objects that have passed `CheckTx`.
|
||||
When a node receives a transaction from another peer, it executes `CheckTx`, which
|
||||
obtains a read-lock on the `*CListMempool`. If the transaction passes `CheckTx`
|
||||
locally on the node, it is added to the `*CList` by obtaining a write-lock. It
|
||||
is also added to the `cache` and `txsMap`, both of which obtain their own respective
|
||||
write-locks and map a reference from the transaction hash to the `Tx` itself.
|
||||
|
||||
Transactions are continuously gossiped to peers whenever a new transaction is added
|
||||
to a local node's `*CList`, where the node at the front of the `*CList` is selected.
|
||||
Another transaction will not be gossiped until the `*CList` notifies the reader
|
||||
that there are more transactions to gossip.
|
||||
|
||||
When a proposer attempts to propose a block, they will execute `ReapMaxBytesMaxGas`
|
||||
on the reactor's `*CListMempool`. This call obtains a read-lock on the `*CListMempool`
|
||||
and selects as many transactions as possible starting from the front of the `*CList`
|
||||
moving to the back of the list.
|
||||
|
||||
When a block is finally committed, a caller invokes `Update` on the reactor's
|
||||
`*CListMempool` with all the selected transactions. Note, the caller must also
|
||||
explicitly obtain a write-lock on the reactor's `*CListMempool`. This call
|
||||
will remove all the supplied transactions from the `txsMap` and the `*CList`, both
|
||||
of which obtain their own respective write-locks. In addition, the transaction
|
||||
may also be removed from the `cache` which obtains its own write-lock.
|
||||
|
||||
## Alternative Approaches
|
||||
|
||||
When considering which approach to take for a priority-based flexible and
|
||||
performant mempool, there are two core candidates. The first candidate is less
|
||||
invasive in the required set of protocol and implementation changes, which
|
||||
simply extends the existing `CheckTx` ABCI method. The second candidate essentially
|
||||
involves the introduction of new ABCI method(s) and would require a higher degree
|
||||
of complexity in protocol and implementation changes, some of which may either
|
||||
overlap or conflict with the upcoming introduction of [ABCI++](https://github.com/tendermint/tendermint/blob/master/docs/rfc/rfc-013-abci%2B%2B.md).
|
||||
|
||||
For more information on the various approaches and proposals, please see the
|
||||
[mempool discussion](https://github.com/tendermint/tendermint/discussions/6295).
|
||||
|
||||
## Prior Art
|
||||
|
||||
### Ethereum
|
||||
|
||||
The Ethereum mempool, specifically [Geth](https://github.com/ethereum/go-ethereum),
|
||||
contains a mempool, `*TxPool`, that contains various mappings indexed by account,
|
||||
such as a `pending` which contains all processable transactions for accounts
|
||||
prioritized by nonce. It also contains a `queue` which is the exact same mapping
|
||||
except it contains not currently processable transactions. The mempool also
|
||||
contains a `priced` index of type `*txPricedList` that is a priority queue based
|
||||
on transaction price.
|
||||
|
||||
### Diem
|
||||
|
||||
The [Diem mempool](https://github.com/diem/diem/blob/master/mempool/README.md#implementation-details)
|
||||
contains a similar approach to the one we propose. Specifically, the Diem mempool
|
||||
contains a mapping from `Account:[]Tx`. On top of this primary mapping from account
|
||||
to a list of transactions, are various indexes used to perform certain actions.
|
||||
|
||||
The main index, `PriorityIndex`, is an ordered queue of transactions that are
|
||||
“consensus-ready” (i.e., they have a sequence number which is sequential to the
|
||||
current sequence number for the account). This queue is ordered by gas price so
|
||||
that if a client is willing to pay more (than other clients) per unit of
|
||||
execution, then they can enter consensus earlier.
|
||||
|
||||
## Decision
|
||||
|
||||
To incorporate a priority-based flexible and performant mempool in Tendermint Core,
|
||||
we will introduce new fields, `priority` and `sender`, into the `ResponseCheckTx`
|
||||
type.
|
||||
|
||||
We will introduce a new versioned mempool reactor, `v1` and assume an implicit
|
||||
version of the current mempool reactor as `v0`. In the new `v1` mempool reactor,
|
||||
we largely keep the functionality the same as `v0` except we augment the underlying
|
||||
data structures. Specifically, we keep a mapping of senders to transaction objects.
|
||||
On top of this mapping, we index transactions to provide the ability to efficiently
|
||||
gossip and reap transactions by priority.
|
||||
|
||||
## Detailed Design
|
||||
|
||||
### CheckTx
|
||||
|
||||
We introduce the following new fields into the `ResponseCheckTx` type:
|
||||
|
||||
```diff
message ResponseCheckTx {
  uint32 code = 1;
  bytes data = 2;
  string log = 3; // nondeterministic
  string info = 4; // nondeterministic
  int64 gas_wanted = 5 [json_name = "gas_wanted"];
  int64 gas_used = 6 [json_name = "gas_used"];
  repeated Event events = 7 [(gogoproto.nullable) = false, (gogoproto.jsontag) = "events,omitempty"];
  string codespace = 8;
+ int64 priority = 9;
+ string sender = 10;
}
```
|
||||
|
||||
It is entirely up to the application to determine how these fields are populated
|
||||
and with what values, e.g. the `sender` could be the signer and fee payer
|
||||
of the transaction, the `priority` could be the cumulative sum of the fee(s).
|
||||
|
||||
Only `sender` is required, while `priority` can be omitted, which would result in
|
||||
using the default value of zero.
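
As one possible illustration of how an application might populate these fields, the sketch below reports the signer as `sender` and the sum of the fee amounts as `priority`. The `Coin` and `ResponseCheckTx` types are simplified, hypothetical stand-ins, and summing amounts across denominations is a deliberate simplification for the example.

```go
package main

import "fmt"

// Coin and ResponseCheckTx are simplified stand-ins for the application's
// fee type and the ABCI response (assumption: only relevant fields shown).
type Coin struct {
	Denom  string
	Amount int64
}

type ResponseCheckTx struct {
	Code     uint32
	Priority int64
	Sender   string
}

// checkTx rejects transactions without a signer (non-zero code) and
// otherwise reports the signer as sender and the sum of all fee amounts as
// priority. With no fees, priority keeps its default value of zero.
func checkTx(signer string, fees []Coin) ResponseCheckTx {
	if signer == "" {
		return ResponseCheckTx{Code: 1}
	}
	var total int64
	for _, fee := range fees {
		total += fee.Amount
	}
	return ResponseCheckTx{Priority: total, Sender: signer}
}

func main() {
	res := checkTx("alice", []Coin{
		{Denom: "uatom", Amount: 300},
		{Denom: "stake", Amount: 200},
	})
	fmt.Println(res.Sender, res.Priority)
}
```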
|
||||
|
||||
### Mempool
|
||||
|
||||
The existing concurrent-safe linked-list will be replaced by a thread-safe map
|
||||
of `<sender:*Tx>`, i.e. a mapping from `sender` to a single `*Tx` object, where
|
||||
each `*Tx` is the next valid and processable transaction from the given `sender`.
|
||||
|
||||
On top of this mapping, we index all transactions by priority using a thread-safe
|
||||
priority queue, i.e. a [max heap](https://en.wikipedia.org/wiki/Min-max_heap).
|
||||
When a proposer is ready to select transactions for the next block proposal,
|
||||
transactions are selected from this priority index by highest priority order.
|
||||
When a transaction is selected and reaped, it is removed from this index and
|
||||
from the `<sender:*Tx>` mapping.
|
||||
|
||||
We define `Tx` as the following data structure:
|
||||
|
||||
```go
type Tx struct {
	// Tx represents the raw binary transaction data.
	Tx []byte

	// Priority defines the transaction's priority as specified by the application
	// in the ResponseCheckTx response.
	Priority int64

	// Sender defines the transaction's sender as specified by the application in
	// the ResponseCheckTx response.
	Sender string

	// Index defines the current index in the priority queue index. Note, if
	// multiple Tx indexes are needed, this field will be removed and each Tx
	// index will have its own wrapped Tx type.
	Index int
}
```
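
To make the interplay between the `<sender:*Tx>` mapping and the priority index concrete, here is a minimal sketch built on Go's `container/heap`. The `Mempool`, `txQueue`, `Add`, and `Reap` names are hypothetical, and locking is omitted for brevity, so this version is not thread-safe.

```go
package main

import (
	"container/heap"
	"fmt"
)

// Tx mirrors the structure defined above.
type Tx struct {
	Tx       []byte
	Priority int64
	Sender   string
	Index    int
}

// txQueue is a max-heap of transactions ordered by Priority.
type txQueue []*Tx

func (q txQueue) Len() int           { return len(q) }
func (q txQueue) Less(i, j int) bool { return q[i].Priority > q[j].Priority }
func (q txQueue) Swap(i, j int) {
	q[i], q[j] = q[j], q[i]
	q[i].Index = i
	q[j].Index = j
}

func (q *txQueue) Push(x interface{}) {
	tx := x.(*Tx)
	tx.Index = len(*q)
	*q = append(*q, tx)
}

func (q *txQueue) Pop() interface{} {
	old := *q
	n := len(old)
	tx := old[n-1]
	old[n-1] = nil // drop the reference so the Tx can be collected
	tx.Index = -1
	*q = old[:n-1]
	return tx
}

// Mempool keeps one Tx per sender plus a priority index over all of them.
type Mempool struct {
	bySender map[string]*Tx
	queue    txQueue
}

func NewMempool() *Mempool { return &Mempool{bySender: make(map[string]*Tx)} }

// Add inserts tx unless its sender already has a pending transaction.
func (m *Mempool) Add(tx *Tx) bool {
	if _, ok := m.bySender[tx.Sender]; ok {
		return false
	}
	m.bySender[tx.Sender] = tx
	heap.Push(&m.queue, tx)
	return true
}

// Reap removes and returns up to limit transactions, highest priority first.
func (m *Mempool) Reap(limit int) []*Tx {
	var out []*Tx
	for len(out) < limit && m.queue.Len() > 0 {
		tx := heap.Pop(&m.queue).(*Tx)
		delete(m.bySender, tx.Sender)
		out = append(out, tx)
	}
	return out
}

func main() {
	m := NewMempool()
	m.Add(&Tx{Sender: "alice", Priority: 5})
	m.Add(&Tx{Sender: "bob", Priority: 9})
	for _, tx := range m.Reap(2) {
		fmt.Println(tx.Sender, tx.Priority)
	}
}
```

Maintaining `Index` inside `Swap`, `Push`, and `Pop` is what would let a later `heap.Remove` or `heap.Fix` call locate a specific transaction in the queue without scanning it.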
|
||||
|
||||
### Eviction
|
||||
|
||||
If `CheckTx` succeeds for a new `Tx` while the mempool is currently
|
||||
full, we must check if there exists a `Tx` of lower priority that can be evicted
|
||||
to make room for the new `Tx` with higher priority and with sufficient size
|
||||
capacity left.
|
||||
|
||||
If such a `Tx` exists, we find it by obtaining a read lock and sorting the
|
||||
priority queue index. Once sorted, we find the first `Tx` with lower priority and
|
||||
size such that the new `Tx` would fit within the mempool's size limit. We then
|
||||
remove this `Tx` from the priority queue index as well as the `<sender:*Tx>`
|
||||
mapping.
|
||||
|
||||
This will require additional `O(n)` space and `O(n*log(n))` runtime complexity. Note that the space complexity does not depend on the size of the tx.
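
The eviction search described above might look like the following sketch. It is an illustration under assumptions: `findEvictable` is a hypothetical name, capacity is measured in bytes of raw transaction data, and locking is left out.

```go
package main

import (
	"fmt"
	"sort"
)

// Tx carries only the fields needed for the eviction decision.
type Tx struct {
	Tx       []byte
	Priority int64
	Sender   string
}

// findEvictable returns the lowest-priority transaction whose removal would
// let newTx fit within maxBytes, or nil if no such transaction exists. It
// sorts a copy of the pointer slice, which is where the O(n) extra space
// and O(n*log(n)) time come from.
func findEvictable(txs []*Tx, usedBytes, maxBytes int, newTx *Tx) *Tx {
	sorted := make([]*Tx, len(txs))
	copy(sorted, txs)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Priority < sorted[j].Priority
	})
	for _, tx := range sorted {
		if tx.Priority >= newTx.Priority {
			break // the rest have equal or higher priority
		}
		if usedBytes-len(tx.Tx)+len(newTx.Tx) <= maxBytes {
			return tx
		}
	}
	return nil
}

func main() {
	pool := []*Tx{
		{Tx: make([]byte, 10), Priority: 3, Sender: "a"},
		{Tx: make([]byte, 40), Priority: 5, Sender: "b"},
		{Tx: make([]byte, 50), Priority: 9, Sender: "c"},
	}
	incoming := &Tx{Tx: make([]byte, 30), Priority: 7, Sender: "d"}
	if victim := findEvictable(pool, 100, 100, incoming); victim != nil {
		fmt.Println("evict", victim.Sender)
	}
}
```

Only pointers are copied for the sort, so the extra space is proportional to the number of transactions rather than to their sizes.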
|
||||
|
||||
### Gossiping
|
||||
|
||||
We keep the existing thread-safe linked list as an additional index. Using this
|
||||
index, we can efficiently gossip transactions in the same manner as they are
|
||||
gossiped now (FIFO).
|
||||
|
||||
Gossiping transactions will not require locking any other indexes.
|
||||
|
||||
### Performance
|
||||
|
||||
Performance should largely remain unaffected apart from the space overhead of
|
||||
keeping an additional priority queue index and the case where we need to evict
|
||||
transactions from the priority queue index. There should be no reads which
|
||||
block writes on any index.
|
||||
|
||||
## Future Improvements
|
||||
|
||||
There are a few considerable ways in which the proposed design can be improved or
|
||||
expanded upon, namely in transaction gossiping and in the ability to support
|
||||
multiple transactions from the same `sender`.
|
||||
|
||||
With regards to transaction gossiping, we need to empirically validate whether we
|
||||
need to gossip by priority. In addition, the current method of gossiping may not
|
||||
be the most efficient. Specifically, broadcasting all the transactions a node
|
||||
has in its mempool to its peers. Rather, we should explore the ability to
|
||||
gossip transactions on a request/response basis similar to Ethereum and other
|
||||
protocols. Not only does this reduce bandwidth and complexity, but also allows
|
||||
for us to explore gossiping by priority or other dimensions more efficiently.
|
||||
|
||||
Allowing for multiple transactions from the same `sender` is important and will
|
||||
most likely be a needed feature in the future development of the mempool, but for
|
||||
now it suffices to have the preliminary design agreed upon. Having the ability
|
||||
to support multiple transactions per `sender` will require careful thought with
|
||||
regards to the interplay of the corresponding ABCI application. Regardless, the
|
||||
proposed design should allow for adaptations to support this feature in a
|
||||
non-contentious and backwards compatible manner.
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Transactions are allowed to be prioritized by the application.
|
||||
|
||||
### Negative
|
||||
|
||||
- Increased size of the `ResponseCheckTx` Protocol Buffer type.
|
||||
- Causal ordering is NOT maintained.
|
||||
- It is possible that certain transactions broadcasted in a particular order may
|
||||
pass `CheckTx` but not end up being committed in a block because they fail
|
||||
`CheckTx` later. e.g. Consider Tx<sub>1</sub> that sends funds from existing
|
||||
account Alice to a _new_ account Bob with priority P<sub>1</sub> and then later
|
||||
Bob's _new_ account sends funds back to Alice in Tx<sub>2</sub> with P<sub>2</sub>,
|
||||
such that P<sub>2</sub> > P<sub>1</sub>. If executed in this order, both
|
||||
transactions will pass `CheckTx`. However, when a proposer is ready to select
|
||||
transactions for the next block proposal, they will select Tx<sub>2</sub> before
|
||||
Tx<sub>1</sub> and thus Tx<sub>2</sub> will _fail_ because Tx<sub>1</sub> must
|
||||
be executed first. This is because there is a _causal ordering_,
|
||||
Tx<sub>1</sub> ➝ Tx<sub>2</sub>. These types of situations should be rare as
|
||||
most transactions are not causally ordered and can be circumvented by simply
|
||||
trying again at a later point in time or by ensuring the "child" priority is
|
||||
lower than the "parent" priority. In other words, if parents always have
|
||||
priorities that are higher than their children, then the new mempool design will
|
||||
maintain causal ordering.
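The Alice/Bob scenario above can be reproduced with a toy model. The following is a minimal sketch, not the mempool implementation: the `tx` type, the `execute` and `proposeByPriority` helpers, and the balance map are all hypothetical names introduced for illustration, and "execution" is reduced to a simple balance check.

```go
package main

import (
	"fmt"
	"sort"
)

// tx is a hypothetical, simplified transfer transaction.
type tx struct {
	name     string
	from, to string
	amount   int64
	priority int64
}

// execute applies a transfer and reports whether the sender could afford it.
func execute(balances map[string]int64, t tx) bool {
	if balances[t.from] < t.amount {
		return false
	}
	balances[t.from] -= t.amount
	balances[t.to] += t.amount
	return true
}

// proposeByPriority executes transactions in descending priority order,
// mimicking a proposer that reaps the highest-priority transactions first.
func proposeByPriority(balances map[string]int64, txs []tx) (committed, failed []string) {
	ordered := append([]tx(nil), txs...)
	sort.SliceStable(ordered, func(i, j int) bool {
		return ordered[i].priority > ordered[j].priority
	})
	for _, t := range ordered {
		if execute(balances, t) {
			committed = append(committed, t.name)
		} else {
			failed = append(failed, t.name)
		}
	}
	return committed, failed
}

func main() {
	balances := map[string]int64{"alice": 100} // Bob's account does not exist yet.
	txs := []tx{
		{name: "tx1", from: "alice", to: "bob", amount: 50, priority: 1},
		{name: "tx2", from: "bob", to: "alice", amount: 10, priority: 2}, // P2 > P1
	}
	committed, failed := proposeByPriority(balances, txs)
	fmt.Println(committed, failed) // prints "[tx1] [tx2]"
}
```

Swapping the two priorities so that the parent outranks the child makes both transfers succeed, which is the mitigation described above.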
|
||||
|
||||
### Neutral
|
||||
|
||||
- A transaction that passed `CheckTx` and entered the mempool can later be evicted
|
||||
at a future point in time if a higher priority transaction entered while the
|
||||
mempool was full.
|
||||
|
||||
## References
|
||||
|
||||
- [ABCI++](https://github.com/tendermint/tendermint/blob/master/docs/rfc/rfc-013-abci%2B%2B.md)
|
||||
- [Mempool Discussion](https://github.com/tendermint/tendermint/discussions/6295)
|
||||
@@ -1,97 +0,0 @@
|
||||
# ADR 068: Reverse Sync
|
||||
|
||||
## Changelog
|
||||
|
||||
- 20 April 2021: Initial Draft (@cmwaters)
|
||||
|
||||
## Status
|
||||
|
||||
Accepted
|
||||
|
||||
## Context
|
||||
|
||||
The advent of state sync and block pruning gave rise to the opportunity for full nodes to participate in consensus without needing complete block history. This also introduced a problem with respect to evidence handling. Nodes that didn't have all the blocks within the evidence age were incapable of validating evidence, thus halting if that evidence was committed on chain.
|
||||
|
||||
[ADR 068](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-068-reverse-sync.md) was published in response to this problem and modified the spec to add a minimum block history invariant. This predominantly sought to extend state sync so that it was capable of fetching and storing the `Header`, `Commit` and `ValidatorSet` (essentially a `LightBlock`) of the last `n` heights, where `n` was calculated from the evidence age.
|
||||
|
||||
This ADR sets out to describe the design of this state sync extension as well as modifications to the light client provider and the merging of tm store.
|
||||
|
||||
## Decision
|
||||
|
||||
The state sync reactor will be extended by introducing 2 new P2P messages (and a new channel).
|
||||
|
||||
```protobuf
|
||||
message LightBlockRequest {
|
||||
uint64 height = 1;
|
||||
}
|
||||
|
||||
message LightBlockResponse {
|
||||
tendermint.types.LightBlock light_block = 1;
|
||||
}
|
||||
```
|
||||
|
||||
This will be used by the "reverse sync" protocol that will fetch, verify and store prior light blocks such that the node can safely participate in consensus.
|
||||
|
||||
Furthermore this allows for a new light client provider which offers the ability for the `StateProvider` to use the underlying P2P stack instead of RPC.
|
||||
|
||||
## Detailed Design
|
||||
|
||||
This section will focus first on the reverse sync (here we call it `backfill`) mechanism as a standalone protocol and then look to describe how it integrates within the state sync reactor and how we define the new p2p light client provider.
|
||||
|
||||
```go
|
||||
// Backfill fetches, verifies, and stores necessary history
|
||||
// to participate in consensus and validate evidence.
|
||||
func (r *Reactor) backfill(state State) error {}
|
||||
```
|
||||
|
||||
`State` is used to work out how far to go back, namely we need all light blocks that have:
|
||||
- a height: `h >= state.LastBlockHeight - state.ConsensusParams.Evidence.MaxAgeNumBlocks`
|
||||
- a time: `t >= state.LastBlockTime - state.ConsensusParams.Evidence.MaxAgeDuration`
|
||||
|
||||
Reverse Sync relies on two components: A `Dispatcher` and a `BlockQueue`. The `Dispatcher` is a pattern taken from a similar [PR](https://github.com/tendermint/tendermint/pull/4508). It is wired to the `LightBlockChannel` and allows for concurrent light block requests by shifting through a linked list of peers. This abstraction has the nice quality that it can also be used as an array of light providers for a P2P based light client.
|
||||
|
||||
The `BlockQueue` is a data structure that allows for multiple workers to fetch light blocks, serializing them for the main thread which picks them off the end of the queue, verifies the hashes and persists them.
|
||||
|
||||
### Integration with State Sync
|
||||
|
||||
Reverse sync is a blocking process that runs directly after syncing state and before transitioning into either fast sync or consensus.
|
||||
|
||||
Previously, the state sync service was not connected to any db; instead it passed the state and commit back to the node. For reverse sync, state sync will be given access to both the `StateStore` and `BlockStore` to be able to write `Header`s, `Commit`s and `ValidatorSet`s and read them so as to serve other state syncing peers.
|
||||
|
||||
This also means adding new methods to these respective stores in order to persist them.
|
||||
|
||||
### P2P Light Client Provider
|
||||
|
||||
As mentioned previously, the `Dispatcher` is capable of handling requests to multiple peers. We can therefore simply peel off a `blockProvider` instance which is assigned to each peer. By giving it the chain ID, the `blockProvider` is capable of doing a basic validation of the light block before returning it to the client.
|
||||
|
||||
It's important to note that because state sync doesn't have access to the evidence channel, it is incapable of allowing the light client to report evidence; thus `ReportEvidence` is a no-op. This is not too much of a concern for reverse sync but will need to be addressed for pure p2p light clients.
|
||||
|
||||
### Pruning
|
||||
|
||||
A final small note concerns pruning. This ADR will introduce changes that will not allow an application to prune blocks that are within the evidence age.
|
||||
|
||||
## Future Work
|
||||
|
||||
This ADR tries to remain within the scope of extending state sync; however, the changes made open the door for several areas to be followed up:
|
||||
- Properly integrate p2p messaging in the light client package. This will require adding the evidence channel so the light client is capable of reporting evidence. We may also need to rethink the providers model (i.e. currently providers are only added on start up).
|
||||
- Merge and clean up the tendermint stores (state, block and evidence). This ADR adds new methods to both the state and block store for saving headers, commits and validator sets. This doesn't quite fit with the current struct (i.e. only `BlockMeta`s instead of `Header`s are saved). We should explore consolidating this for the sake of atomicity and the opportunity for batching. There are also other areas for changes such as the way we store block parts. See [here](https://github.com/tendermint/tendermint/issues/5383) and [here](https://github.com/tendermint/tendermint/issues/4630) for more context.
|
||||
- Explore opportunistic reverse sync. Technically we don't need to reverse sync if no evidence is observed. I've tried to design the protocol such that it could be possible to move it across to the evidence package if we see fit. Thus only when evidence is seen where we don't have the necessary data, do we perform a reverse sync. The problem with this is that imagine we are in consensus and some evidence pops up requiring us to first fetch and verify the last 10,000 blocks. There's no way a node could do this (sequentially) and vote before the round finishes. Also as we don't punish invalid evidence, a malicious node could easily spam the chain just to get a bunch of "stateless" nodes to perform a bunch of useless work.
|
||||
- Explore full reverse sync. Currently we only fetch light blocks. There might be benefits in the future to fetch and persist entire blocks especially if we give control to the application to do this.
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- All nodes should have sufficient history to validate all types of evidence
|
||||
- State syncing nodes can use the p2p layer for light client verification of state. This has better UX and could be faster but I haven't benchmarked.
|
||||
|
||||
### Negative
|
||||
|
||||
- Introduces more code = more maintenance
|
||||
|
||||
### Neutral
|
||||
|
||||
## References
|
||||
|
||||
- [Reverse Sync RFC](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-068-reverse-sync.md)
|
||||
- [Original Issue](https://github.com/tendermint/tendermint/issues/5617)
|
||||
@@ -1,268 +0,0 @@
|
||||
# ADR 069: Flexible Node Initialization
|
||||
|
||||
## Changelog
|
||||
|
||||
- 2021-06-09: Initial Draft (@tychoish)
|
||||
|
||||
- 2021-07-21: Major Revision (@tychoish)
|
||||
|
||||
## Status
|
||||
|
||||
Proposed.
|
||||
|
||||
## Context
|
||||
|
||||
In an effort to support [Go-API-Stability](./adr-060-go-api-stability.md),
|
||||
during the 0.35 development cycle, we have attempted to reduce the API
|
||||
surface area by moving most of the interface of the `node` package into
|
||||
unexported functions, as well as moving the reactors to an `internal`
|
||||
package. Having this coincide with the 0.35 release made a lot of sense
|
||||
because these interfaces were _already_ changing as a result of the `p2p`
|
||||
[refactor](./adr-061-p2p-refactor-scope.md), so it made sense to think a bit
|
||||
more about how tendermint exposes this API.
|
||||
|
||||
While the interfaces of the P2P layer and most of the node package are already
|
||||
internalized, this precludes some operational patterns that are important to
|
||||
users who use tendermint as a library. Specifically, introspecting the
|
||||
tendermint node service and replacing components is not supported in the latest
|
||||
version of the code, and some of these use cases would require maintaining a
|
||||
vendor copy of the code. Adding these features requires rather extensive
|
||||
(internal/implementation) changes to the `node` and `rpc` packages, and this
|
||||
ADR describes a model for changing the way that tendermint nodes initialize, in
|
||||
service of providing this kind of functionality.
|
||||
|
||||
We consider node initialization, because the current implementation
|
||||
provides strong connections between all components, as well as between
|
||||
the components of the node and the RPC layer, and being able to think
|
||||
about the interactions of these components will help enable these
|
||||
features and help define the requirements of the node package.
|
||||
|
||||
## Alternative Approaches
|
||||
|
||||
These alternatives are presented to frame the design space and to
|
||||
contextualize the decision in terms of product requirements. These
|
||||
ideas are not inherently bad, and may even be possible or desirable
|
||||
in the (distant) future, and merely provide additional context for how
|
||||
we, in the moment, came to our decision(s).
|
||||
|
||||
### Do Nothing
|
||||
|
||||
The current implementation is functional and sufficient for the vast
|
||||
majority of use cases (e.g., all users of the Cosmos-SDK as well as
|
||||
anyone who runs tendermint and the ABCI application in separate
|
||||
processes). In the current implementation, and even previous versions,
|
||||
modifying node initialization or injecting custom components required
|
||||
copying most of the `node` package, which required such users
|
||||
to maintain a vendored copy of tendermint.
|
||||
|
||||
While this is (likely) not tenable in the long term, as users do want
|
||||
more modularity, and the current service implementation is brittle and
|
||||
difficult to maintain, in the short term it may be possible to delay
|
||||
implementation somewhat. Eventually, however, we will need to make the
|
||||
`node` package easier to maintain and reason about.
|
||||
|
||||
### Generic Service Pluggability
|
||||
|
||||
One possible system design would export interfaces (in the Golang
|
||||
sense) for all components of the system, to permit runtime dependency
|
||||
injection of all components in the system, so that users can compose
|
||||
tendermint nodes of arbitrary user-supplied components.
|
||||
|
||||
Although this level of customization would provide benefits, it would be a huge
|
||||
undertaking (particularly with regards to API design work) that we do not have
|
||||
scope for at the moment. Eventually providing support for some kinds of
|
||||
pluggability may be useful, so the current solution does not explicitly
|
||||
foreclose the possibility of this alternative.
|
||||
|
||||
### Abstract Dependency Based Startup and Shutdown
|
||||
|
||||
The main proposal in this document makes tendermint node initialization simpler
|
||||
and more abstract, but the system lacks a number of
|
||||
features which daemon/service initialization could provide, such as a
|
||||
system allowing the authors of services to control initialization and shutdown order
|
||||
of components using dependency relationships.
|
||||
|
||||
Such a system could work by allowing services to declare
|
||||
initialization order dependencies to other reactors (by ID, perhaps)
|
||||
so that the node could decide the initialization based on the
|
||||
dependencies declared by services rather than requiring the node to
|
||||
encode this logic directly.
|
||||
|
||||
This level of configuration is probably more complicated than is needed. Given
|
||||
that the authors of components in the current implementation of tendermint
|
||||
already *do* need to know about other components, a dependency-based system
|
||||
would probably be overly-abstract at this stage.
|
||||
|
||||
## Decisions
|
||||
|
||||
- To the greatest extent possible, factor the code base so that
|
||||
packages are responsible for their own initialization, and minimize
|
||||
the amount of code in the `node` package itself.
|
||||
|
||||
- As a design goal, reduce direct coupling and dependencies between
|
||||
components in the implementation of `node`.
|
||||
|
||||
- Begin iterating on a more-flexible internal framework for
|
||||
initializing tendermint nodes to make the initialization process
|
||||
less hard-coded by the implementation of the node objects.
|
||||
|
||||
- Reactors should not need to expose their interfaces *within* the
|
||||
implementation of the node type.
|
||||
|
||||
- This refactoring should be entirely opaque to users.
|
||||
|
||||
- These node initialization changes should not require a
|
||||
reevaluation of the `service.Service` or a generic initialization
|
||||
orchestration framework.
|
||||
|
||||
- Do not proactively provide a system for injecting
|
||||
components/services within a tendermint node, though make it
|
||||
possible to retrofit this kind of pluggability in the future if
|
||||
needed.
|
||||
|
||||
- Prioritize implementation of p2p-based statesync reactor to obviate
|
||||
need for users to inject a custom state-sync provider.
|
||||
|
||||
## Detailed Design
|
||||
|
||||
The [current
|
||||
nodeImpl](https://github.com/tendermint/tendermint/blob/master/node/node.go#L47)
|
||||
includes direct references to the implementations of each of the
|
||||
reactors, which should be replaced by references to `service.Service`
|
||||
objects. This will require moving construction of the [rpc
|
||||
service](https://github.com/tendermint/tendermint/blob/master/node/node.go#L771)
|
||||
into the constructor of
|
||||
[makeNode](https://github.com/tendermint/tendermint/blob/master/node/node.go#L126). One
|
||||
possible implementation of this would be to eliminate the current
|
||||
`ConfigureRPC` method on the node package and instead [configure it
|
||||
here](https://github.com/tendermint/tendermint/pull/6798/files#diff-375d57e386f20eaa5f09f02bb9d28bfc48ac3dca18d0325f59492208219e5618R441).
|
||||
|
||||
To avoid adding complexity to the `node` package, we will add a
|
||||
composite service implementation to the `service` package
|
||||
that implements `service.Service` and is composed of a sequence of
|
||||
underlying `service.Service` objects and handles their
|
||||
startup/shutdown in the specified sequential order.
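One way such a composite might look is sketched below. This is an illustration under stated assumptions, not the proposed implementation: the `Service` interface here is a reduced, hypothetical stand-in for `service.Service` (only `Start` and `Stop`), and both the reverse-order shutdown and the rollback on a failed start are design assumptions of this sketch.

```go
package main

import "fmt"

// Service is a simplified, hypothetical stand-in for service.Service.
type Service interface {
	Start() error
	Stop() error
}

// compositeService starts its children in the given order.
type compositeService struct {
	services []Service
}

// Start starts each child in order. If one fails, the children that were
// already started are stopped in reverse order (an assumption of this sketch).
func (c *compositeService) Start() error {
	for i, s := range c.services {
		if err := s.Start(); err != nil {
			for j := i - 1; j >= 0; j-- {
				_ = c.services[j].Stop()
			}
			return fmt.Errorf("starting service %d: %w", i, err)
		}
	}
	return nil
}

// Stop stops children in reverse order and returns the first error seen.
func (c *compositeService) Stop() error {
	var firstErr error
	for i := len(c.services) - 1; i >= 0; i-- {
		if err := c.services[i].Stop(); err != nil && firstErr == nil {
			firstErr = err
		}
	}
	return firstErr
}

// recorder is a toy service that logs its lifecycle calls.
type recorder struct {
	name string
	log  *[]string
}

func (r recorder) Start() error { *r.log = append(*r.log, "start "+r.name); return nil }
func (r recorder) Stop() error  { *r.log = append(*r.log, "stop "+r.name); return nil }

func main() {
	var log []string
	c := &compositeService{services: []Service{
		recorder{"a", &log},
		recorder{"b", &log},
	}}
	_ = c.Start()
	_ = c.Stop()
	fmt.Println(log) // prints "[start a start b stop b stop a]"
}
```

Because the composite itself satisfies the same interface, the `node` package could hold a single value and remain unaware of how many services sit behind it.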
|
||||
|
||||
Consensus, blocksync (*née* fast sync), and statesync all depend on
|
||||
each other, and have significant initialization dependencies that are
|
||||
presently encoded in the `node` package. As part of this change, a
|
||||
new package/component (likely named `blocks` located at
|
||||
`internal/blocks`) will encapsulate the initialization of these block
|
||||
management areas of the code.
|
||||
|
||||
### Injectable Component Option
|
||||
|
||||
This section briefly describes a possible implementation for
|
||||
user-supplied services running within a node. This should not be
|
||||
implemented unless user-supplied components are a hard requirement for
|
||||
a user.
|
||||
|
||||
In order to allow components to be replaced, a new public function
|
||||
will be added to the public interface of `node` with a signature that
|
||||
resembles the following:
|
||||
|
||||
```go
|
||||
func NewWithServices(conf *config.Config,
|
||||
logger log.Logger,
|
||||
cf proxy.ClientCreator,
|
||||
gen *types.GenesisDoc,
|
||||
srvs []service.Service,
|
||||
) (service.Service, error) {
|
||||
```
|
||||
|
||||
The `service.Service` objects will be initialized in the order supplied, after
|
||||
all pre-configured/default services have started (and shut down in reverse
|
||||
order). The given services may implement additional interfaces, allowing them
|
||||
to replace specific default services. `NewWithServices` will validate input
|
||||
service lists with the following rules:
|
||||
|
||||
- None of the services may already be running.
|
||||
- The caller may not supply more than one replacement reactor for a given
|
||||
default service type.
|
||||
|
||||
If callers violate any of these rules, `NewWithServices` will return
|
||||
an error. To retract support for this kind of operation in the future,
|
||||
the function can be modified to *always* return an error.
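The two validation rules could be checked along the following lines. This is only a sketch: the reduced `Service` interface, the `replacer` interface with its `Replaces` method, and `validateServices` are hypothetical names invented for this example; the ADR does not specify how a service declares which default it replaces.

```go
package main

import "fmt"

// Service is a reduced, hypothetical stand-in for service.Service.
type Service interface {
	IsRunning() bool
}

// replacer is a hypothetical optional interface: a service that implements
// it declares which default service type it replaces.
type replacer interface {
	Replaces() string
}

// validateServices enforces the two rules: no service may already be
// running, and at most one replacement may target a given default service.
func validateServices(srvs []Service) error {
	seen := make(map[string]struct{})
	for i, s := range srvs {
		if s.IsRunning() {
			return fmt.Errorf("service %d is already running", i)
		}
		r, ok := s.(replacer)
		if !ok {
			continue // an additional service, not a replacement
		}
		kind := r.Replaces()
		if _, dup := seen[kind]; dup {
			return fmt.Errorf("more than one replacement supplied for %q", kind)
		}
		seen[kind] = struct{}{}
	}
	return nil
}

// extra is a toy additional service; replacement is a toy replacement.
type extra struct{ running bool }

func (e extra) IsRunning() bool { return e.running }

type replacement struct{ kind string }

func (replacement) IsRunning() bool    { return false }
func (r replacement) Replaces() string { return r.kind }

func main() {
	ok := []Service{extra{}, replacement{"mempool"}}
	dup := []Service{replacement{"mempool"}, replacement{"mempool"}}
	fmt.Println(validateServices(ok) == nil, validateServices(dup) != nil) // prints "true true"
}
```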
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- The node package will become easier to maintain.
|
||||
|
||||
- It will become easier to add additional services within tendermint
|
||||
nodes.
|
||||
|
||||
- It will become possible to replace default components in the node
|
||||
package without vendoring the tendermint repo and modifying internal
|
||||
code.
|
||||
|
||||
- The current end-to-end (e2e) test suite will be able to prevent any
|
||||
regressions, and the new functionality can be thoroughly unit tested.
|
||||
|
||||
- The scope of this project is very narrow, which minimizes risk.
|
||||
|
||||
### Negative
|
||||
|
||||
- This increases our reliance on the `service.Service` interface which
|
||||
is probably not an interface that we want to fully commit to.
|
||||
|
||||
- This proposal implements a fairly minimal set of functionality and
|
||||
leaves open the possibility for many additional features which are
|
||||
not included in the scope of this proposal.
|
||||
|
||||
### Neutral
|
||||
|
||||
N/A
|
||||
|
||||
## Open Questions
|
||||
|
||||
- To what extent does this new initialization framework need to accommodate
|
||||
the legacy p2p stack? Would it be possible to delay a great deal of this
|
||||
work to the 0.36 cycle to avoid this complexity?
|
||||
|
||||
- Answer: _depends on timing_, and the requirement to ship pluggable reactors in 0.35.
|
||||
|
||||
- Where should additional public types be exported for the 0.35
|
||||
release?
|
||||
|
||||
Related to the general project of API stabilization we want to deprecate
|
||||
the `types` package, and move its contents into a new `pkg` hierarchy;
|
||||
however, the design of the `pkg` interface is currently underspecified.
|
||||
If `types` is going to remain for the 0.35 release, then we should consider
|
||||
the impact of using multiple organizing modalities for this code within a
|
||||
single release.
|
||||
|
||||
## Future Work
|
||||
|
||||
- Improve or simplify the `service.Service` interface. There are some
|
||||
pretty clear limitations with this interface as written (there's no
|
||||
way to time out slow startup or shutdown, the cycle between the
|
||||
`service.BaseService` and `service.Service` implementations is
|
||||
troubling, the default panic in `OnReset` seems troubling.)
|
||||
|
||||
- As part of the refactor of `service.Service` have all services/nodes
|
||||
respect the lifetime of a `context.Context` object, and avoid the
|
||||
current practice of creating `context.Context` objects in p2p and
|
||||
reactor code. This would be required for in-process multi-tenancy.
|
||||
|
||||
- Support explicit dependencies between components and allow for
|
||||
parallel startup, so that different reactors can start up at the same
|
||||
time, where possible.
|
||||
|
||||
## References
|
||||
|
||||
- [the component
|
||||
graph](https://peter.bourgon.org/go-for-industrial-programming/#the-component-graph)
|
||||
as a framing for internal service construction.
|
||||
|
||||
## Appendix
|
||||
|
||||
### Dependencies
|
||||
|
||||
There's a relationship between the blockchain and consensus reactor
|
||||
described by the following dependency graph, which makes replacing some of
|
||||
these components more difficult relative to other reactors or
|
||||
components.
|
||||
|
||||

|
||||
@@ -1,333 +0,0 @@
|
||||
# ADR 71: Proposer-Based Timestamps
|
||||
|
||||
## Changelog
|
||||
|
||||
- July 15 2021: Created by @williambanfield
|
||||
- Aug 4 2021: Draft completed by @williambanfield
|
||||
- Aug 5 2021: Draft updated to include data structure changes by @williambanfield
|
||||
- Aug 20 2021: Language edits completed by @williambanfield
|
||||
- Oct 25 2021: Update the ADR to match updated spec from @cason by @williambanfield
|
||||
- Nov 10 2021: Additional language updates by @williambanfield per feedback from @cason
|
||||
- Feb 2 2022: Synchronize logic for timely with latest version of the spec by @williambanfield
|
||||
|
||||
## Status
|
||||
|
||||
**Accepted**
|
||||
|
||||
## Context
|
||||
|
||||
Tendermint currently provides a monotonically increasing source of time known as [BFTTime](https://github.com/tendermint/tendermint/blob/master/spec/consensus/bft-time.md).
|
||||
This mechanism for producing a source of time is reasonably simple.
|
||||
Each correct validator adds a timestamp to each `Precommit` message it sends.
|
||||
The timestamp it sends is either the validator's current known Unix time or one millisecond greater than the previous block time, depending on which value is greater.
|
||||
When a block is produced, the proposer chooses the block timestamp as the weighted median of the times in all of the `Precommit` messages the proposer received.
|
||||
The weighting is proportional to the amount of voting power, or stake, a validator has on the network.
|
||||
This mechanism for producing timestamps is both deterministic and byzantine fault tolerant.
|
||||
|
||||
This current mechanism for producing timestamps has a few drawbacks.
|
||||
Validators do not have to agree at all on how close the selected block timestamp is to their own currently known Unix time.
|
||||
Additionally, any amount of voting power `>1/3` may directly control the block timestamp.
|
||||
As a result, it is quite possible that the timestamp is not particularly meaningful.
|
||||
|
||||
These drawbacks present issues in the Tendermint protocol.
|
||||
Timestamps are used by light clients to verify blocks.
|
||||
Light clients rely on correspondence between their own currently known Unix time and the block timestamp to verify blocks they see.
|
||||
However, their currently known Unix time may be greatly divergent from the block timestamp as a result of the limitations of `BFTTime`.
|
||||
|
||||
The proposer-based timestamps specification suggests an alternative approach for producing block timestamps that remedies these issues.
|
||||
Proposer-based timestamps alter the current mechanism for producing block timestamps in two main ways:
|
||||
|
||||
1. The block proposer is amended to offer up its currently known Unix time as the timestamp for the next block instead of the `BFTTime`.
|
||||
1. Correct validators only approve the proposed block timestamp if it is close enough to their own currently known Unix time.
|
||||
|
||||
The result of these changes is a more meaningful timestamp that cannot be controlled by `<= 2/3` of the validator voting power.
|
||||
This document outlines the necessary code changes in Tendermint to implement the corresponding [proposer-based timestamps specification](https://github.com/tendermint/tendermint/tree/master/spec/consensus/proposer-based-timestamp).
|
||||
|
||||
## Alternative Approaches
|
||||
|
||||
### Remove timestamps altogether
|
||||
|
||||
Computer clocks are bound to skew for a variety of reasons.
|
||||
Using timestamps in our protocol means either accepting the timestamps as not reliable or impacting the protocol’s liveness guarantees.
|
||||
This design requires impacting the protocol’s liveness in order to make the timestamps more reliable.
|
||||
An alternate approach is to remove timestamps altogether from the block protocol.
|
||||
`BFTTime` is deterministic but may be arbitrarily inaccurate.
|
||||
However, having a reliable source of time is quite useful for applications and protocols built on top of a blockchain.
|
||||
|
||||
We therefore decided not to remove the timestamp.
|
||||
Applications often wish for some transactions to occur on a certain day, on a regular period, or after some time following a different event.
|
||||
All of these require some meaningful representation of agreed upon time.
|
||||
The following protocols and application features require a reliable source of time:
|
||||
* Tendermint Light Clients [rely on correspondence between their known time](https://github.com/tendermint/tendermint/blob/master/spec/light-client/verification/README.md#definitions-1) and the block time for block verification.
|
||||
* Tendermint Evidence validity is determined [either in terms of heights or in terms of time](https://github.com/tendermint/tendermint/blob/8029cf7a0fcc89a5004e173ec065aa48ad5ba3c8/spec/consensus/evidence.md#verification).
|
||||
* Unbonding of staked assets in the Cosmos Hub [occurs after a period of 21 days](https://github.com/cosmos/governance/blob/ce75de4019b0129f6efcbb0e752cd2cc9e6136d3/params-change/Staking.md#unbondingtime).
|
||||
* IBC packets can use either a [timestamp or a height to timeout packet delivery](https://docs.cosmos.network/v0.44/ibc/overview.html#acknowledgements).
|
||||
|
||||
Finally, inflation distribution in the Cosmos Hub uses an approximation of time to calculate an annual percentage rate.
|
||||
This approximation of time is calculated using [block heights with an estimated number of blocks produced in a year](https://github.com/cosmos/governance/blob/master/params-change/Mint.md#blocksperyear).
|
||||
Proposer-based timestamps will allow this inflation calculation to use a more meaningful and accurate source of time.
|
||||
|
||||
|
||||
## Decision
|
||||
|
||||
Implement proposer-based timestamps and remove `BFTTime`.
|
||||
|
||||
## Detailed Design
|
||||
|
||||
### Overview
|
||||
|
||||
Implementing proposer-based timestamps will require a few changes to Tendermint’s code.
|
||||
These changes will be to the following components:
|
||||
* The `internal/consensus/` package.
|
||||
* The `state/` package.
|
||||
* The `Vote`, `CommitSig` and `Header` types.
|
||||
* The consensus parameters.
|
||||
|
||||
### Changes to `CommitSig`
|
||||
|
||||
The [CommitSig](https://github.com/tendermint/tendermint/blob/a419f4df76fe4aed668a6c74696deabb9fe73211/types/block.go#L604) struct currently contains a timestamp.
|
||||
This timestamp is the current Unix time known to the validator when it issued a `Precommit` for the block.
|
||||
This timestamp is no longer used and will be removed in this change.
|
||||
|
||||
`CommitSig` will be updated as follows:
|
||||
|
||||
```diff
|
||||
type CommitSig struct {
|
||||
BlockIDFlag BlockIDFlag `json:"block_id_flag"`
|
||||
ValidatorAddress Address `json:"validator_address"`
|
||||
-- Timestamp time.Time `json:"timestamp"`
|
||||
Signature []byte `json:"signature"`
|
||||
}
|
||||
```
|
||||
|
||||
### Changes to `Vote` messages
|
||||
|
||||
`Precommit` and `Prevote` messages use a common [Vote struct](https://github.com/tendermint/tendermint/blob/a419f4df76fe4aed668a6c74696deabb9fe73211/types/vote.go#L50).
|
||||
This struct currently contains a timestamp.
|
||||
This timestamp is set using the [voteTime](https://github.com/tendermint/tendermint/blob/e8013281281985e3ada7819f42502b09623d24a0/internal/consensus/state.go#L2241) function and therefore vote times correspond to the current Unix time known to the validator, provided this time is greater than the timestamp of the previous block.
|
||||
For precommits, this timestamp is used to construct the [CommitSig that is included in the block in the LastCommit](https://github.com/tendermint/tendermint/blob/e8013281281985e3ada7819f42502b09623d24a0/types/block.go#L754) field.
|
||||
For prevotes, this field is currently unused.
|
||||
Proposer-based timestamps will use the timestamp that the proposer sets into the block and will therefore no longer require that a timestamp be included in the vote messages.
|
||||
This timestamp is therefore no longer useful as part of consensus and may optionally be dropped from the message.
|
||||
|
||||
`Vote` will be updated as follows:
|
||||
|
||||
```diff
|
||||
type Vote struct {
|
||||
Type tmproto.SignedMsgType `json:"type"`
|
||||
Height int64 `json:"height"`
|
||||
Round int32 `json:"round"`
|
||||
BlockID BlockID `json:"block_id"` // zero if vote is nil.
|
||||
-- Timestamp time.Time `json:"timestamp"`
|
||||
ValidatorAddress Address `json:"validator_address"`
|
||||
ValidatorIndex int32 `json:"validator_index"`
|
||||
Signature []byte `json:"signature"`
|
||||
}
|
||||
```
|
||||
|
||||
### New consensus parameters
|
||||
|
||||
The proposer-based timestamp specification includes a pair of new parameters that must be the same among all validators.
|
||||
These parameters are `PRECISION` and `MSGDELAY`.
|
||||
|
||||
The `PRECISION` and `MSGDELAY` parameters are used to determine if the proposed timestamp is acceptable.
|
||||
A validator will only Prevote a proposal if the proposal timestamp is considered `timely`.
|
||||
A proposal timestamp is considered `timely` if it is within `PRECISION` and `MSGDELAY` of the Unix time known to the validator.
|
||||
More specifically, a proposal timestamp is `timely` if `proposalTimestamp - PRECISION ≤ validatorLocalTime ≤ proposalTimestamp + PRECISION + MSGDELAY`.
|
||||
|
||||
Because the `PRECISION` and `MSGDELAY` parameters must be the same across all validators, they will be added to the [consensus parameters](https://github.com/tendermint/spec/blob/master/proto/tendermint/types/params.proto#L11) as [durations](https://developers.google.com/protocol-buffers/docs/reference/google.protobuf#google.protobuf.Duration).
|
||||
|
||||
The consensus parameters will be updated to include a new `Synchrony` field as follows:
|
||||
|
||||
```diff
|
||||
type ConsensusParams struct {
|
||||
Block BlockParams `json:"block"`
|
||||
Evidence EvidenceParams `json:"evidence"`
|
||||
Validator ValidatorParams `json:"validator"`
|
||||
Version VersionParams `json:"version"`
|
||||
++ Synchrony SynchronyParams `json:"synchrony"`
|
||||
}
|
||||
```
|
||||
|
||||
```go
|
||||
type SynchronyParams struct {
|
||||
MessageDelay time.Duration `json:"message_delay"`
|
||||
Precision time.Duration `json:"precision"`
|
||||
}
|
||||
```
|
||||
|
||||
### Changes to the block proposal step
|
||||
|
||||
#### Proposer selects block timestamp
|
||||
|
||||
Tendermint currently uses the `BFTTime` algorithm to produce the block's `Header.Timestamp`.
|
||||
The [proposal logic](https://github.com/tendermint/tendermint/blob/68ca65f5d79905abd55ea999536b1a3685f9f19d/internal/state/state.go#L269) sets the weighted median of the times in the `LastCommit.CommitSigs` as the proposed block's `Header.Timestamp`.
|
||||
|
||||
In proposer-based timestamps, the proposer will still set a timestamp into the `Header.Timestamp`.
|
||||
The timestamp the proposer sets into the `Header` depends on whether the block has previously received a [polka](https://github.com/tendermint/tendermint/blob/053651160f496bb44b107a434e3e6482530bb287/docs/introduction/what-is-tendermint.md#consensus-overview).
|
||||
|
||||
#### Proposal of a block that has not previously received a polka
|
||||
|
||||
If a proposer is proposing a new block then it will set the Unix time currently known to the proposer into the `Header.Timestamp` field.
|
||||
The proposer will also set this same timestamp into the `Timestamp` field of the `Proposal` message that it issues.
|
||||
|
||||
#### Re-proposal of a block that has previously received a polka
|
||||
|
||||
If a proposer is re-proposing a block that has previously received a polka on the network, then the proposer does not update the `Header.Timestamp` of that block.
|
||||
Instead, the proposer simply re-proposes the exact same block.
|
||||
This way, the proposed block has the exact same block ID as the previously proposed block and the validators that have already received that block do not need to attempt to receive it again.
|
||||
|
||||
The proposer will set the re-proposed block's `Header.Timestamp` as the `Proposal` message's `Timestamp`.
|
||||
|
||||
#### Proposer waits
|
||||
|
||||
Block timestamps must be monotonically increasing.
|
||||
In `BFTTime`, if a validator’s clock was behind, the [validator added 1 millisecond to the previous block’s time and used that in its vote messages](https://github.com/tendermint/tendermint/blob/e8013281281985e3ada7819f42502b09623d24a0/internal/consensus/state.go#L2246).
|
||||
A goal of adding proposer-based timestamps is to enforce some degree of clock synchronization, so having a mechanism that completely ignores the validator's Unix time no longer works.
|
||||
Validator clocks will not be perfectly in sync.
|
||||
Therefore, the proposer’s current known Unix time may be less than the previous block's `Header.Time`.
|
||||
If the proposer’s current known Unix time is less than the previous block's `Header.Time`, the proposer will sleep until its known Unix time exceeds it.
|
||||
|
||||
This change will require amending the [defaultDecideProposal](https://github.com/tendermint/tendermint/blob/822893615564cb20b002dd5cf3b42b8d364cb7d9/internal/consensus/state.go#L1180) method.
|
||||
This method should now schedule a timeout that fires when the proposer’s time is greater than the previous block's `Header.Time`.
|
||||
When the timeout fires, the proposer will finally issue the `Proposal` message.
|
||||
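The wait described above amounts to computing how long the proposer must delay before its clock passes the previous block's time. The sketch below is an illustration under assumptions: `proposalDelay` is a hypothetical helper, not a function in `defaultDecideProposal`, and adding one nanosecond is simply one way to make the proposer's time strictly greater than the previous block time.

```go
package main

import (
	"fmt"
	"time"
)

// proposalDelay returns how long a proposer should wait before issuing its
// Proposal so that its local time is strictly after prevBlockTime.
// It returns 0 when the local clock is already past the previous block time.
func proposalDelay(now, prevBlockTime time.Time) time.Duration {
	if now.After(prevBlockTime) {
		return 0
	}
	// Wait out the remaining gap plus one nanosecond so the bound is strict.
	return prevBlockTime.Sub(now) + time.Nanosecond
}

// base is an arbitrary reference instant used only by the example.
var base = time.Date(2022, time.January, 1, 0, 0, 0, 0, time.UTC)

func main() {
	// The previous block is stamped 250ms ahead of the proposer's clock.
	prev := base.Add(250 * time.Millisecond)
	delay := proposalDelay(base, prev)
	fmt.Println("wait before proposing:", delay)

	// A real proposer would schedule a timeout for this delay rather than
	// print it, for example with time.AfterFunc and a callback that sends
	// the Proposal message (the callback is left out of this sketch).
}
```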
|
||||
### Changes to proposal validation rules
|
||||
|
||||
The rules for validating a proposed block will be modified to implement proposer-based timestamps.
|
||||
We will change the validation logic to ensure that a proposal is `timely`.
|
||||
|
||||
Per the proposer-based timestamps spec, `timely` only needs to be checked if a block has not received a +2/3 majority of `Prevotes` in a round.
|
||||
If a block received a +2/3 majority of prevotes in a previous round, then +2/3 of the voting power considered the block's timestamp near enough to their own currently known Unix time in that round.
|
||||
|
||||
The validation logic will be updated to check `timely` for blocks that did not previously receive +2/3 prevotes in a round.
|
||||
Receiving +2/3 prevotes in a round is frequently referred to as a 'polka' and we will use this term for simplicity.
|
||||
|
||||
#### Current timestamp validation logic
|
||||
|
||||
To provide a better understanding of the changes needed to timestamp validation, we will first detail how timestamp validation works currently in Tendermint.
|
||||
|
||||
The [validateBlock function](https://github.com/tendermint/tendermint/blob/c3ae6f5b58e07b29c62bfdc5715b6bf8ae5ee951/state/validation.go#L14) currently [validates the proposed block timestamp in three ways](https://github.com/tendermint/tendermint/blob/c3ae6f5b58e07b29c62bfdc5715b6bf8ae5ee951/state/validation.go#L118).
|
||||
First, the validation logic checks that this timestamp is greater than the previous block’s timestamp.
|
||||
|
||||
Second, it validates that the block timestamp is correctly calculated as the weighted median of the timestamps in the [block’s LastCommit](https://github.com/tendermint/tendermint/blob/e8013281281985e3ada7819f42502b09623d24a0/types/block.go#L48).
|
||||
|
||||
Finally, the validation logic authenticates the timestamps in the `LastCommit.CommitSig`.
|
||||
The cryptographic signature in each `CommitSig` is created by signing a hash of fields in the block with the voting validator’s private key.
|
||||
One of the items in this `signedBytes` hash is the timestamp in the `CommitSig`.
|
||||
To authenticate the `CommitSig` timestamp, the validator authenticating votes builds a hash of fields that includes the `CommitSig` timestamp and checks this hash against the signature.
|
||||
This takes place in the [VerifyCommit function](https://github.com/tendermint/tendermint/blob/e8013281281985e3ada7819f42502b09623d24a0/types/validation.go#L25).
|
||||
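To make the second check above concrete, here is a sketch of a voting-power-weighted median over commit timestamps. It shows one common definition of a weighted median (the earliest time at which cumulative weight exceeds half of the total) and is not claimed to reproduce the exact tie-breaking of Tendermint's `MedianTime`. The `weightedTime` type and `weightedMedian` function are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// weightedTime pairs a vote timestamp with the voting power behind it.
// It is a hypothetical type for this sketch.
type weightedTime struct {
	Time   time.Time
	Weight int64
}

// weightedMedian returns the earliest timestamp at which the cumulative
// voting power exceeds half of the total. It returns the zero time when
// there are no votes or no voting power.
func weightedMedian(votes []weightedTime) time.Time {
	// Sort a copy so the caller's slice is left untouched.
	sorted := make([]weightedTime, len(votes))
	copy(sorted, votes)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].Time.Before(sorted[j].Time)
	})

	var total int64
	for _, v := range sorted {
		total += v.Weight
	}

	var cumulative int64
	for _, v := range sorted {
		cumulative += v.Weight
		if cumulative > total/2 {
			return v.Time
		}
	}
	return time.Time{}
}

// base is an arbitrary reference instant used only by the example.
var base = time.Date(2022, time.January, 1, 0, 0, 0, 0, time.UTC)

func main() {
	votes := []weightedTime{
		{Time: base.Add(2 * time.Second), Weight: 1},
		{Time: base, Weight: 1},
		{Time: base.Add(time.Second), Weight: 1},
	}
	fmt.Println(weightedMedian(votes).Sub(base))
}
```

Because the result depends on voting power rather than vote count, a single validator holding most of the power pulls the median toward its own timestamp.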
|
||||
#### Remove unused timestamp validation logic
|
||||
|
||||
`BFTTime` validation is no longer applicable and will be removed.
|
||||
This means that validators will no longer check that the block timestamp is a weighted median of `LastCommit` timestamps.
|
||||
Specifically, we will remove the call to [MedianTime in the validateBlock function](https://github.com/tendermint/tendermint/blob/4db71da68e82d5cb732b235eeb2fd69d62114b45/state/validation.go#L117).
|
||||
The `MedianTime` function can be completely removed.
|
||||
|
||||
Since `CommitSig`s will no longer contain a timestamp, the validator authenticating a commit will no longer include the `CommitSig` timestamp in the hash of fields it builds to check against the cryptographic signature.
|
||||
|
||||
#### Timestamp validation when a block has not received a polka
|
||||
|
||||
The [POLRound](https://github.com/tendermint/tendermint/blob/68ca65f5d79905abd55ea999536b1a3685f9f19d/types/proposal.go#L29) in the `Proposal` message indicates the round in which the block received a polka.
|
||||
A negative value in the `POLRound` field indicates that the block has not previously received a polka on the network.
|
||||
Therefore, the validation logic will check for `timely` when `POLRound < 0`.
|
||||
|
||||
When a validator receives a `Proposal` message, the validator will check that the `Proposal.Timestamp` is at most `PRECISION` greater than the current Unix time known to the validator, and at most `PRECISION + MSGDELAY` less than the current Unix time known to the validator.
|
||||
If the timestamp is not within these bounds, the proposed block will not be considered `timely`.
|
||||
|
||||
Once a full block matching the `Proposal` message is received, the validator will also check that the timestamp in the `Header.Timestamp` of the block matches this `Proposal.Timestamp`.
|
||||
Using the `Proposal.Timestamp` to check `timely` allows for the `MSGDELAY` parameter to be more finely tuned since `Proposal` messages do not change sizes and are therefore faster to gossip than full blocks across the network.
|
||||
|
||||
A validator will also check that the proposed timestamp is greater than the timestamp of the block for the previous height.
|
||||
If the timestamp is not greater than the previous block's timestamp, the block will not be considered valid, which is the same as the current logic.
|
||||
|
||||
#### Timestamp validation when a block has received a polka
|
||||
|
||||
When a block is re-proposed that has already received a +2/3 majority of `Prevote`s on the network, the `Proposal` message for the re-proposed block is created with a `POLRound` that is `>= 0`.
|
||||
A validator will not check that the `Proposal` is `timely` if the `Proposal` message has a non-negative `POLRound`.
|
||||
If the `POLRound` is non-negative, each validator will simply ensure that it received the `Prevote` messages for the proposed block in the round indicated by `POLRound`.
|
||||
|
||||
If the validator does not receive `Prevote` messages for the proposed block before the proposal timeout, then it will prevote nil.
|
||||
Validators already check that +2/3 prevotes were seen in `POLRound`, so this does not represent a change to the prevote logic.
|
||||
|
||||
A validator will also check that the proposed timestamp is greater than the timestamp of the block for the previous height.
|
||||
If the timestamp is not greater than the previous block's timestamp, the block will not be considered valid, which is the same as the current logic.
|
||||
|
||||
Additionally, this validation logic can be updated to check that the `Proposal.Timestamp` matches the `Header.Timestamp` of the proposed block, but it is less relevant since checking that votes were received is sufficient to ensure the block timestamp is correct.
|
||||
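The two cases above, with and without a polka, can be arranged as a single decision. The following sketch is one possible arrangement under stated assumptions: the `proposal` struct is a pared-down stand-in for the real `Proposal` message, `checkProposalTime` and its error values are hypothetical, and the `timely` and `sawPolka` inputs are assumed to have been computed by the caller.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// proposal carries only the fields this sketch needs.
type proposal struct {
	POLRound  int32
	Timestamp time.Time
}

// Hypothetical error values for each way the check can fail.
var (
	errNotAfterPrevious  = errors.New("timestamp is not after the previous block's timestamp")
	errNoPolka           = errors.New("no +2/3 prevotes seen for the block in POLRound")
	errTimestampMismatch = errors.New("proposal timestamp does not match the block header timestamp")
	errNotTimely         = errors.New("proposal timestamp is not timely")
)

// checkProposalTime applies the timestamp rules for a received proposal.
// timely is the result of the timely predicate for p.Timestamp, and sawPolka
// reports whether the validator saw +2/3 prevotes for the block in p.POLRound.
func checkProposalTime(p proposal, headerTime, prevBlockTime time.Time, timely, sawPolka bool) error {
	// Both cases require a timestamp greater than the previous block's.
	if !p.Timestamp.After(prevBlockTime) {
		return errNotAfterPrevious
	}
	// Re-proposal of a block with a polka: timely is not checked.
	if p.POLRound >= 0 {
		if !sawPolka {
			return errNoPolka
		}
		return nil
	}
	// New block: the header must carry the proposal timestamp, and the
	// timestamp must be timely.
	if !p.Timestamp.Equal(headerTime) {
		return errTimestampMismatch
	}
	if !timely {
		return errNotTimely
	}
	return nil
}

// base is an arbitrary reference instant used only by the example.
var base = time.Date(2022, time.January, 1, 0, 0, 0, 0, time.UTC)

func main() {
	fresh := proposal{POLRound: -1, Timestamp: base.Add(time.Second)}
	fmt.Println(checkProposalTime(fresh, fresh.Timestamp, base, false, false))

	reproposed := proposal{POLRound: 0, Timestamp: base.Add(time.Second)}
	fmt.Println(checkProposalTime(reproposed, reproposed.Timestamp, base, false, true))
}
```

In this arrangement a failed check would lead the validator to prevote nil. The sketch leaves that step, and the proposal timeout, to the caller.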
|
||||
#### Relaxation of the 'Timely' check
|
||||
|
||||
The `Synchrony` parameters, `MessageDelay` and `Precision`, provide a means to bound the timestamp of a proposed block.
|
||||
Selecting values that are too small presents a possible liveness issue for the network.
|
||||
If a Tendermint network selects a `MessageDelay` parameter that does not accurately reflect the time to broadcast a proposal message to all of the validators on the network, nodes will begin rejecting proposals from otherwise correct proposers because these proposals will appear to be too far in the past.
|
||||
|
||||
`MessageDelay` and `Precision` are planned to be configured as `ConsensusParams`.
|
||||
A very common way to update `ConsensusParams` is by executing a transaction included in a block that specifies new values for them.
|
||||
However, if the network is unable to produce blocks because of this liveness issue, no such transaction may be executed.
|
||||
To prevent this dangerous condition, we will add a relaxation mechanism to the `Timely` predicate.
|
||||
If consensus takes more than 10 rounds to produce a block for any reason, the `MessageDelay` will be doubled.
|
||||
This doubling will continue for each subsequent 10 rounds of consensus.
|
||||
This will enable chains that selected too small of a value for the `MessageDelay` parameter to eventually issue a transaction and readjust the parameters to more accurately reflect the broadcast time.
|
||||
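The doubling rule above can be sketched as a function of the configured `MessageDelay` and the current round. This is an illustration only: `relaxedMessageDelay` is a hypothetical name, rounds are assumed to be zero-indexed so that the first doubling applies from round 10, and the saturation at the maximum duration is an added safeguard against overflow rather than something this ADR specifies.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// relaxedMessageDelay returns the MessageDelay to use in the given round,
// doubling the configured value once for every 10 completed rounds.
func relaxedMessageDelay(configured time.Duration, round int32) time.Duration {
	if configured <= 0 {
		return configured
	}
	delay := configured
	for i := int32(0); i < round/10; i++ {
		// Saturate instead of overflowing int64 nanoseconds.
		if delay > math.MaxInt64/2 {
			return math.MaxInt64
		}
		delay *= 2
	}
	return delay
}

func main() {
	configured := 2 * time.Second
	for _, round := range []int32{0, 9, 10, 25} {
		fmt.Printf("round %d: %v\n", round, relaxedMessageDelay(configured, round))
	}
}
```

Only `MessageDelay` is relaxed in this sketch. `Precision` stays at its configured value, in line with the discussion that follows.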
|
||||
This liveness issue is not as problematic for chains with very small `Precision` values.
|
||||
Operators can more easily readjust local validator clocks to be more aligned.
|
||||
Additionally, chains that wish to increase a small `Precision` value can still take advantage of the `MessageDelay` relaxation, waiting for the `MessageDelay` value to grow significantly and issuing proposals with timestamps that are far in the past of their peers.
|
||||
|
||||
For more discussion of this, see [issue 371](https://github.com/tendermint/spec/issues/371).
|
||||
|
||||
### Changes to the prevote step
|
||||
|
||||
Currently, a validator will prevote a proposal in one of three cases:
|
||||
|
||||
* Case 1: Validator has no locked block and receives a valid proposal.
|
||||
* Case 2: Validator has a locked block and receives a valid proposal matching its locked block.
|
||||
* Case 3: Validator has a locked block, sees a valid proposal not matching its locked block but sees +2/3 prevotes for the proposal’s block, either in the current round or in a round greater than or equal to the round in which it locked its locked block.
|
||||
|
||||
The only change we will make to the prevote step is to what a validator considers a valid proposal as detailed above.
|
||||
|
||||
### Changes to the precommit step
|
||||
|
||||
The precommit step will not require much modification.
|
||||
Its proposal validation rules will change in the same ways that validation will change in the prevote step with the exception of the `timely` check: precommit validation will never check that the timestamp is `timely`.
|
||||
|
||||
### Remove voteTime Completely
|
||||
|
||||
[voteTime](https://github.com/tendermint/tendermint/blob/822893615564cb20b002dd5cf3b42b8d364cb7d9/internal/consensus/state.go#L2229) is a mechanism for calculating the next `BFTTime` given both the validator's current known Unix time and the previous block timestamp.
|
||||
If the previous block timestamp is greater than the validator's current known Unix time, then `voteTime` returns a value one millisecond greater than the previous block timestamp.
|
||||
This logic is used in multiple places and is no longer needed for proposer-based timestamps.
|
||||
It should therefore be removed completely.
|
||||
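For reference, the behavior being removed can be paraphrased as follows. This is a simplified sketch based on the description above, not a copy of the linked method: `legacyVoteTime` is a hypothetical name, it takes both times as arguments instead of reading consensus state, and it assumes the result must always be at least one millisecond after the previous block time.

```go
package main

import (
	"fmt"
	"time"
)

// timeIota is the minimum increment between block times in this sketch.
const timeIota = time.Millisecond

// legacyVoteTime returns the validator's local time, unless the local clock
// is not yet past the previous block time plus timeIota, in which case it
// returns the previous block time plus timeIota.
func legacyVoteTime(now, prevBlockTime time.Time) time.Time {
	minVoteTime := prevBlockTime.Add(timeIota)
	if now.After(minVoteTime) {
		return now
	}
	return minVoteTime
}

// base is an arbitrary reference instant used only by the example.
var base = time.Date(2022, time.January, 1, 0, 0, 0, 0, time.UTC)

func main() {
	// Local clock ahead of the previous block: the local time is used.
	fmt.Println(legacyVoteTime(base.Add(5*time.Second), base).Sub(base))
	// Local clock behind the previous block: the clock reading is ignored.
	fmt.Println(legacyVoteTime(base, base.Add(time.Second)).Sub(base))
}
```

The second case is the one proposer-based timestamps set out to eliminate: the returned value has no relation to the validator's own clock.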
|
||||
## Future Improvements
|
||||
|
||||
* Implement BLS signature aggregation.
|
||||
By removing fields from the `Precommit` messages, we are able to aggregate signatures.
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
* `<2/3` of validators can no longer influence block timestamps.
|
||||
* Block timestamp will have stronger correspondence to real time.
|
||||
* Improves the reliability of light client block verification.
|
||||
* Enables BLS signature aggregation.
|
||||
* Enables evidence handling to use time instead of height for evidence validity.
|
||||
|
||||
### Neutral
|
||||
|
||||
* Alters Tendermint’s liveness properties.
|
||||
Liveness now requires that all correct validators have synchronized clocks within a bound.
|
||||
Liveness will now also require that validators’ clocks move forward, which was not required under `BFTTime`.
|
||||
|
||||
### Negative
|
||||
|
||||
* May increase the length of the propose step if there is a large skew between the previous proposer and the current proposer’s local Unix time.
|
||||
This skew will be bounded by the `PRECISION` value, so it is unlikely to be too large.
|
||||
|
||||
* Current chains with block timestamps far in the future will either need to pause consensus until after the erroneous block timestamp or must maintain synchronized but very inaccurate clocks.
|
||||
|
||||
## References
|
||||
|
||||
* [PBTS Spec](https://github.com/tendermint/tendermint/tree/master/spec/consensus/proposer-based-timestamp)
|
||||
* [BFTTime spec](https://github.com/tendermint/spec/blob/master/spec/consensus/bft-time.md)
|
||||
* [Issue 371](https://github.com/tendermint/spec/issues/371)
|
||||
@@ -1,105 +0,0 @@
|
||||
# ADR 72: Restore Requests for Comments
|
||||
|
||||
## Changelog
|
||||
|
||||
- 20-Aug-2021: Initial draft (@creachadair)
|
||||
|
||||
## Status
|
||||
|
||||
Implemented
|
||||
|
||||
## Context
|
||||
|
||||
In the past, we kept a collection of Request for Comments (RFC) documents in `docs/rfc`.
|
||||
Prior to the creation of the ADR process, these documents were used to document
|
||||
design and implementation decisions about Tendermint Core. The RFC directory
|
||||
was removed in favor of ADRs, in commit 3761aa69 (PR
|
||||
[\#6345](https://github.com/tendermint/tendermint/pull/6345)).
|
||||
|
||||
For issues where an explicit design decision or implementation change is
|
||||
required, an ADR is generally preferable to an open-ended RFC: An ADR is
|
||||
relatively narrowly-focused, identifies a specific design or implementation
|
||||
question, and documents the consensus answer to that question.
|
||||
|
||||
Some discussions are more open-ended, however, or don't require a specific
|
||||
decision to be made (yet). Such conversations are still valuable to document,
|
||||
and several members of the Tendermint team have been doing so by writing gists
|
||||
or Google docs to share them around. That works well enough in the moment, but
|
||||
gists do not support any kind of collaborative editing, and both gists and docs
|
||||
are hard to discover after the fact. Google docs have much better collaborative
|
||||
editing, but are worse for discoverability, especially when contributors span
|
||||
different Google accounts.
|
||||
|
||||
Discoverability is important, because these kinds of open-ended discussions are
|
||||
useful to people who come later -- either as new team members or as outside
|
||||
contributors seeking to use and understand the thoughts behind our designs and
|
||||
the architectural decisions that arose from those discussions.
|
||||
|
||||
With these in mind, I propose that:
|
||||
|
||||
- We re-create a new, initially empty `docs/rfc` directory in the repository,
|
||||
and use it to capture these kinds of open-ended discussions as a supplement to
|
||||
ADRs.
|
||||
|
||||
- Unlike in the previous RFC scheme, documents in this new directory will
|
||||
_not_ be used directly for decision-making. This is the key difference
|
||||
between an RFC and an ADR.
|
||||
|
||||
Instead, an RFC will exist to document background, articulate general
|
||||
principles, and serve as a historical record of discussion and motivation.
|
||||
|
||||
In this system, an RFC may _only_ result in a decision indirectly, via ADR
|
||||
documents created in response to the RFC.
|
||||
|
||||
**In short:** If a decision is required, write an ADR; otherwise if a
|
||||
sufficiently broad discussion is needed, write an RFC.
|
||||
|
||||
Just so that there is a consistent format, I also propose that:
|
||||
|
||||
- RFC files are named `rfc-XXX-title.{md,rst,txt}` and are written in plain
|
||||
text, Markdown, or reStructuredText.
|
||||
|
||||
- Like an ADR, an RFC should include a high-level change log at the top of the
|
||||
document, and sections for:
|
||||
|
||||
* Abstract: A brief, high-level synopsis of the topic.
|
||||
* Background: Any background necessary to understand the topic.
|
||||
* Discussion: Detailed discussion of the issue being considered.
|
||||
|
||||
- Unlike an ADR, an RFC does _not_ include sections for Decisions, Detailed
|
||||
Design, or evaluation of proposed solutions. If an RFC leads to a proposal
|
||||
for an actual architectural change, that must be recorded in an ADR in the
|
||||
usual way, and may refer back to the RFC in its References section.
|
||||
|
||||
## Alternative Approaches
|
||||
|
||||
Leaving aside implementation details, the main alternative to this proposal is
|
||||
to leave things as they are now, with ADRs as the only log of record and other
|
||||
discussions being held informally in whatever medium is convenient at the time.
|
||||
|
||||
## Decision
|
||||
|
||||
(pending)
|
||||
|
||||
## Detailed Design
|
||||
|
||||
- Create a new `docs/rfc` directory in the `tendermint` repository. Note that
|
||||
this proposal intentionally does _not_ pull back the previous contents of
|
||||
that path from Git history, as those documents were appropriately merged into
|
||||
the ADR process.
|
||||
|
||||
- Create a `README.md` for RFCs that explains the rules and their relationship
|
||||
to ADRs.
|
||||
|
||||
- Create an `rfc-template.md` file for RFC files.
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- We will have a more discoverable place to record open-ended discussions that
|
||||
do not immediately result in a design change.
|
||||
|
||||
### Negative
|
||||
|
||||
- Potentially some people could be confused about the RFC/ADR distinction.
|
||||
@@ -1,235 +0,0 @@
|
||||
# ADR 073: Adopt LibP2P
|
||||
|
||||
## Changelog
|
||||
|
||||
- 2021-11-02: Initial Draft (@tychoish)
|
||||
|
||||
## Status
|
||||
|
||||
Proposed.
|
||||
|
||||
## Context
|
||||
|
||||
|
||||
As part of the 0.35 development cycle, the Tendermint team completed
|
||||
the first phase of the work described in ADRs 61 and 62, which included a
|
||||
large scale refactoring of the reactors and the p2p message
|
||||
routing. This replaced the switch and many of the other legacy
|
||||
components without breaking protocol or network-level
|
||||
interoperability and left the legacy connection/socket handling code.
|
||||
|
||||
Following the release, the team has reexamined the state of the code
|
||||
and the design, as well as Tendermint's requirements. The notes
|
||||
from that process are available in the [P2P Roadmap
|
||||
RFC][rfc].
|
||||
|
||||
This ADR supersedes the decisions made in ADRs 61 and 62, but
|
||||
builds on the completed portions of this work. Previously, the
|
||||
boundaries of peer management, message handling, and the higher level
|
||||
business logic (e.g., "the reactors") were intermingled, and core
|
||||
elements of the p2p system were responsible for the orchestration of
|
||||
higher-level business logic. Refactoring the legacy components
|
||||
made it more obvious that this entanglement of responsibilities
|
||||
had outsized influence on the entire implementation, making
|
||||
it difficult to iterate within the current abstractions.
|
||||
It would not be viable to maintain interoperability with legacy
|
||||
systems while also achieving many of our broader objectives.
|
||||
|
||||
LibP2P is a thoroughly-specified implementation of a peer-to-peer
|
||||
networking stack, designed specifically for systems such as
|
||||
ours. Adopting LibP2P as the basis of Tendermint will allow the
|
||||
Tendermint team to focus more of their time on other differentiating
|
||||
aspects of the system, and make it possible for the ecosystem as a
|
||||
whole to take advantage of tooling and efforts of the LibP2P
|
||||
platform.
|
||||
|
||||
## Alternative Approaches
|
||||
|
||||
As discussed in the [P2P Roadmap RFC][rfc], the primary alternative would be to
|
||||
continue development of Tendermint's home-grown peer-to-peer
|
||||
layer. While that would give the Tendermint team maximal control
|
||||
over the peer system, the current design is unexceptional on its
|
||||
own merits, and the prospective maintenance burden for this system
|
||||
exceeds our tolerances for the medium term.
|
||||
|
||||
Tendermint can and should differentiate itself not on the basis of
|
||||
its networking implementation or peer management tools, but by providing
|
||||
a consistent operator experience, a battle-tested consensus algorithm,
|
||||
and an ergonomic user experience.
|
||||
|
||||
## Decision
|
||||
|
||||
Tendermint will adopt libp2p during the 0.37 development cycle,
|
||||
replacing the bespoke Tendermint P2P stack. This will remove the
|
||||
`Endpoint`, `Transport`, `Connection`, and `PeerManager` abstractions
|
||||
and leave the reactors, `p2p.Router` and `p2p.Channel`
|
||||
abstractions.
|
||||
|
||||
LibP2P may obviate the need for a dedicated peer exchange (PEX)
|
||||
reactor, which would also in turn obviate the need for a dedicated
|
||||
seed mode. If this is the case, then all of this functionality would
|
||||
be removed.
|
||||
|
||||
If it turns out (based on the advice of Protocol Labs) that it makes
|
||||
sense to maintain separate pubsub or gossipsub topics
|
||||
per-message-type, then the `Router` abstraction could also
|
||||
be entirely subsumed.
|
||||
|
||||
## Detailed Design
|
||||
|
||||
### Implementation Changes
|
||||
|
||||
The seams in the P2P implementation between the higher level
|
||||
constructs (reactors), the routing layer (`Router`) and the lower
|
||||
level connection and peer management code make this operation
|
||||
relatively straightforward to implement. A key
|
||||
goal in this design is to minimize the impact on the reactors
|
||||
(potentially entirely), and completely remove the lower level
|
||||
components (e.g., `Transport`, `Connection` and `PeerManager`) using the
|
||||
separation afforded by the `Router` layer. The current state of the
|
||||
code makes these changes relatively surgical, and limited to a small
|
||||
number of methods:
|
||||
|
||||
- `p2p.Router.OpenChannel` will still return a `Channel` structure
|
||||
which will continue to serve as a pipe between the reactors and the
|
||||
`Router`. The implementation will no longer need the queue
|
||||
implementation, and will instead start goroutines that
|
||||
are responsible for routing the messages from the channel to libp2p
|
||||
fundamentals, replacing the current `p2p.Router.routeChannel`.
|
||||
|
||||
- The current `p2p.Router.dialPeers` and `p2p.Router.acceptPeers`
|
||||
are responsible for establishing outbound and inbound connections,
|
||||
respectively. These methods will be removed, along with
|
||||
`p2p.Router.openConnection`, and the libp2p connection manager will
|
||||
be responsible for maintaining network connectivity.
|
||||
|
||||
- The `p2p.Channel` interface will change to replace Go
|
||||
channels with a more functional interface for sending messages.
|
||||
New methods on this object will take contexts to support safe
|
||||
cancellation, and return errors, and will block rather than
|
||||
running asynchronously. The `Out` channel through which
|
||||
reactors send messages to peers will be replaced by a `Send`
|
||||
method, and the `Error` channel will be replaced by an `Error`
|
||||
method.
|
||||
|
||||
- Reactors will be passed an interface that will allow them to
|
||||
access Peer information from libp2p. This will supplant the
|
||||
`p2p.PeerUpdates` subscription.
|
||||
|
||||
- Add some kind of heartbeat message at the application level
|
||||
(e.g., with a reactor), potentially connected to libp2p's DHT to be
|
||||
used by reactors for service discovery, message targeting, or other
|
||||
features.
|
||||
|
||||
- Replace the existing/legacy handshake protocol with [Noise](http://www.noiseprotocol.org/noise.html).
|
||||
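The `p2p.Channel` change described in the list above, blocking methods that take a context and return an error in place of exposed Go channels, might look roughly like the sketch below. Everything here is hypothetical: the `Channel` interface shape, the simplified `Envelope` and `PeerError` types, and the in-memory `memChannel` are illustrations of the idea and do not reflect the final Tendermint or libp2p API.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Envelope and PeerError are simplified stand-ins for the p2p types.
type Envelope struct {
	To      string
	Message string
}

type PeerError struct {
	NodeID string
	Err    error
}

// Channel is a hypothetical method-based interface that replaces the
// exposed Out and Error Go channels with blocking calls.
type Channel interface {
	Send(ctx context.Context, e Envelope) error
	Error(ctx context.Context, pe PeerError) error
}

// memChannel is an in-memory implementation used only to exercise the interface.
type memChannel struct {
	out  chan Envelope
	errs chan PeerError
}

func newMemChannel(size int) *memChannel {
	return &memChannel{out: make(chan Envelope, size), errs: make(chan PeerError, size)}
}

// Send blocks until the envelope is queued or the context is canceled.
func (c *memChannel) Send(ctx context.Context, e Envelope) error {
	select {
	case <-ctx.Done():
		return ctx.Err()
	case c.out <- e:
		return nil
	}
}

// Error blocks until the peer error is queued or the context is canceled.
func (c *memChannel) Error(ctx context.Context, pe PeerError) error {
	select {
	case <-ctx.Done():
		return ctx.Err()
	case c.errs <- pe:
		return nil
	}
}

// runDemo sends one message, reports one peer error, then cancels the
// context and tries to send again while the buffer is full.
func runDemo() (sendErr, reportErr, canceledErr error) {
	var ch Channel = newMemChannel(1)
	ctx, cancel := context.WithCancel(context.Background())

	sendErr = ch.Send(ctx, Envelope{To: "peer-a", Message: "hello"})
	reportErr = ch.Error(ctx, PeerError{NodeID: "peer-b", Err: errors.New("bad message")})

	cancel()
	canceledErr = ch.Send(ctx, Envelope{To: "peer-a", Message: "late"})
	return sendErr, reportErr, canceledErr
}

func main() {
	sendErr, reportErr, canceledErr := runDemo()
	fmt.Println(sendErr, reportErr, canceledErr)
}
```

With this shape a reactor learns immediately whether a send was accepted or abandoned, instead of writing to a channel and discovering failures elsewhere.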
|
||||
This project will initially use the TCP-based transport protocols within
|
||||
libp2p. QUIC is also available as an option that we may implement later.
|
||||
We will not support mixed networks in the initial release, but will
|
||||
revisit that possibility later if there is a demonstrated need.
|
||||
|
||||
### Upgrade and Compatibility
|
||||
|
||||
Because the routers and all current P2P libraries are `internal`
|
||||
packages and not part of the public API, the only changes to the public
|
||||
API surface area of Tendermint will be different configuration
|
||||
file options, replacing the current P2P options with options relevant
|
||||
to libp2p.
|
||||
|
||||
However, it will not be possible to run a network with both networking
|
||||
stacks active at once, so the upgrade to the libp2p-based version of Tendermint
|
||||
will need to be coordinated between all nodes of the network. This is
|
||||
consistent with the expectations around upgrades for Tendermint moving
|
||||
forward, and will help manage both the complexity of the project and
|
||||
the implementation timeline.
|
||||
|
||||
## Open Questions
|
||||
|
||||
- What is the role of Protocol Labs in the implementation of libp2p in
|
||||
tendermint, both during the initial implementation and on an ongoing
|
||||
basis thereafter?
|
||||
|
||||
- Should all P2P traffic for a given node be pushed to a single topic,
|
||||
so that a topic maps to a specific ChainID, or should
|
||||
each reactor (or type of message) have its own topic? How many
|
||||
topics can a libp2p network support? Is there testing that validates
|
||||
the capabilities?
|
||||
|
||||
- Tendermint presently provides a very coarse QoS-like functionality
|
||||
using priorities based on message-type.
|
||||
This intuitively/theoretically ensures that evidence and consensus
|
||||
messages don't get starved by blocksync/statesync messages. It's
|
||||
unclear if we can or should attempt to replicate this with libp2p.
|
||||
|
||||
- What kind of QoS functionality does libp2p provide and what kind of
|
||||
metrics does libp2p provide about its QoS functionality?
|
||||
|
||||
- Is it possible to store additional (and potentially arbitrary)
|
||||
information into the DHT as part of the heartbeats between nodes,
|
||||
such as the latest height, and then access that in the
|
||||
reactors? How frequently can the DHT be updated?
|
||||
|
||||
- Does it make sense to have reactors continue to consume inbound
|
||||
messages from a Channel (`In`) or is there another interface or
|
||||
pattern that we should consider?
|
||||
|
||||
- We should avoid exposing Go channels when possible, and likely
|
||||
some kind of alternate iterator makes sense for processing
|
||||
messages within the reactors.
|
||||
|
||||
- What are the security and protocol implications of tracking
|
||||
information from peer heartbeats and exposing that to reactors?
|
||||
|
||||
- How much (or how little) configuration can Tendermint provide for
|
||||
libp2p, particularly on the first release?
|
||||
|
||||
- In general, we should not support byo-functionality for libp2p
|
||||
components within Tendermint, and reduce the configuration surface
|
||||
area, as much as possible.
|
||||
|
||||
- What are the best ways to provide request/response semantics for
|
||||
reactors on top of libp2p? Will it be possible to add
|
||||
request/response semantics in a future release or is there
|
||||
anticipatory work that needs to be done as part of the initial
|
||||
release?
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Reduce the maintenance burden for the Tendermint Core team by
|
||||
removing a large swath of legacy code that has proven to be
|
||||
difficult to modify safely.
|
||||
|
||||
- Remove the responsibility for maintaining and developing the entire
|
||||
peer management system (p2p) and stack.
|
||||
|
||||
- By providing users with a more stable peer and networking system,
|
||||
Tendermint can improve operator experience and network stability.
|
||||
|
||||
### Negative
|
||||
|
||||
- By deferring to library implementations for peer management and
|
||||
networking, Tendermint loses some flexibility for innovating at the
|
||||
peer and networking level. However, Tendermint should be innovating
|
||||
primarily at the consensus layer, and libp2p does not preclude
|
||||
optimization or development in the peer layer.
|
||||
|
||||
- Libp2p is a large dependency and Tendermint would become dependent
|
||||
upon Protocol Labs' release cycle and prioritization for bug
|
||||
fixes. If this proves onerous, it's possible to maintain a vendor
|
||||
fork of relevant components as needed.
|
||||
|
||||
### Neutral
|
||||
|
||||
- N/A
|
||||
|
||||
## References
|
||||
|
||||
- [ADR 61: P2P Refactor Scope][adr61]
|
||||
- [ADR 62: P2P Architecture][adr62]
|
||||
- [P2P Roadmap RFC][rfc]
|
||||
|
||||
[adr61]: ./adr-061-p2p-refactor-scope.md
|
||||
[adr62]: ./adr-062-p2p-architecture.md
|
||||
[rfc]: ../rfc/rfc-000-p2p-roadmap.rst
|
||||
@@ -1,203 +0,0 @@
|
||||
# ADR 74: Migrate Timeout Parameters to Consensus Parameters
|
||||
|
||||
## Changelog
|
||||
|
||||
- 03-Jan-2022: Initial draft (@williambanfield)
|
||||
- 13-Jan-2022: Updated to indicate work on upgrade path needed (@williambanfield)
|
||||
|
||||
## Status
|
||||
|
||||
Proposed
|
||||
|
||||
## Context
|
||||
|
||||
### Background
|
||||
|
||||
Tendermint's consensus timeout parameters are currently configured locally by each validator
|
||||
in the validator's [config.toml][config-toml].
|
||||
This means that the validators on a Tendermint network may have different timeouts
|
||||
from each other. There is no reason for validators on the same network to configure
|
||||
different timeout values. Proper functioning of the Tendermint consensus algorithm
|
||||
relies on these parameters being uniform across validators.
|
||||
|
||||
The configurable values are as follows:
|
||||
|
||||
* `TimeoutPropose`
|
||||
* How long the consensus algorithm waits for a proposal block before issuing a prevote.
|
||||
* If no proposal arrives by `TimeoutPropose`, then the consensus algorithm will issue a nil prevote.
|
||||
* `TimeoutProposeDelta`
|
||||
* How much the `TimeoutPropose` grows each round.
|
||||
* `TimeoutPrevote`
|
||||
* How long the consensus algorithm waits after receiving +2/3 prevotes with
|
||||
no quorum for a value before issuing a precommit for nil.
|
||||
(See the [arXiv paper][arxiv-paper], Algorithm 1, Line 34)
|
||||
* `TimeoutPrevoteDelta`
|
||||
* How much the `TimeoutPrevote` increases with each round.
|
||||
* `TimeoutPrecommit`
|
||||
* How long the consensus algorithm waits after receiving +2/3 precommits that
|
||||
do not have a quorum for a value before entering the next round.
|
||||
(See the [arXiv paper][arxiv-paper], Algorithm 1, Line 47)
|
||||
* `TimeoutPrecommitDelta`
|
||||
* How much the `TimeoutPrecommit` increases with each round.
|
||||
* `TimeoutCommit`
|
||||
* How long the consensus algorithm waits after committing a block but before starting the new height.
|
||||
* This gives a validator a chance to receive slow precommits.
|
||||
* `SkipTimeoutCommit`
|
||||
* Make progress as soon as the node has 100% of the precommits.
|
||||
|
||||
|
||||
### Overview of Change
|
||||
|
||||
We will consolidate the timeout parameters and migrate them from the node-local
|
||||
`config.toml` file into the network-global consensus parameters.
|
||||
|
||||
The 8 timeout parameters will be consolidated down to 6. These will be as follows:
|
||||
|
||||
* `TimeoutPropose`
|
||||
* Same as current `TimeoutPropose`.
|
||||
* `TimeoutProposeDelta`
|
||||
* Same as current `TimeoutProposeDelta`.
|
||||
* `TimeoutVote`
|
||||
* How long validators wait for votes in both the prevote
|
||||
and precommit phase of the consensus algorithm. This parameter subsumes
|
||||
the current `TimeoutPrevote` and `TimeoutPrecommit` parameters.
|
||||
* `TimeoutVoteDelta`
|
||||
* How much the `TimeoutVote` will grow each successive round.
|
||||
This parameter subsumes the current `TimeoutPrevoteDelta` and `TimeoutPrecommitDelta`
|
||||
parameters.
|
||||
* `TimeoutCommit`
|
||||
* Same as current `TimeoutCommit`.
|
||||
* `BypassCommitTimeout`
|
||||
* Same as current `SkipTimeoutCommit`, renamed for clarity.
|
||||
|
||||
A safe default will be provided by Tendermint for each of these parameters and
|
||||
networks will be able to update the parameters as they see fit. Local updates
|
||||
to these parameters will no longer be possible; instead, the application will control
|
||||
updating the parameters. Applications using the Cosmos SDK will automatically be
|
||||
able to change the values of these consensus parameters [via a governance proposal][cosmos-sdk-consensus-params].
|
||||
|
||||
This change is low-risk. While parameters are locally configurable, many running chains
|
||||
do not change them from their default values. For example, initializing
|
||||
a node on Osmosis, Terra, and the Cosmos Hub using their `init` command produces
|
||||
a `config.toml` with Tendermint's default values for these parameters.
|
||||
|
||||
### Why this parameter consolidation?
|
||||
|
||||
Reducing the number of parameters is good for UX. Fewer superfluous parameters make
|
||||
running and operating a Tendermint network less confusing.
|
||||
|
||||
The Prevote and Precommit messages are similar in size and require similar amounts
|
||||
of processing, so there is no strong need for them to be configured separately.
|
||||
|
||||
The `TimeoutPropose` parameter governs how long Tendermint will wait for the proposed
|
||||
block to be gossiped. Blocks are much larger than votes and therefore tend to be
|
||||
gossiped much more slowly. It therefore makes sense to keep `TimeoutPropose` and
|
||||
the `TimeoutProposeDelta` as parameters separate from the vote timeouts.
|
||||
|
||||
`TimeoutCommit` is used by chains to ensure that the network waits for the votes from
|
||||
slower validators before proceeding to the next height. Without this timeout, the votes
|
||||
from slower validators would consistently not be included in blocks and those validators
|
||||
would not be counted as 'up' from the chain's perspective. Being down damages a validator's
|
||||
reputation and causes potential stakers to think twice before delegating to that validator.
|
||||
|
||||
`TimeoutCommit` also prevents the network from producing the next height as soon as validators
|
||||
on the fastest hardware with a summed voting power of +2/3 of the network's total have
|
||||
completed execution of the block. Allowing the network to proceed as soon as the fastest
|
||||
+2/3 completed execution would have a cumulative effect over heights, eventually
|
||||
leaving slower validators unable to participate in consensus at all. `TimeoutCommit`
|
||||
therefore allows networks to have greater variability in hardware. Additional
|
||||
discussion of this can be found in [tendermint issue 5911][tendermint-issue-5911-comment]
|
||||
and [spec issue 359][spec-issue-359].
|
||||
|
||||
## Alternative Approaches
|
||||
|
||||
### Hardcode the parameters
|
||||
|
||||
Many Tendermint networks run on similar cloud-hosted infrastructure. Therefore,
|
||||
they have similar bandwidth and machine resources. The timings for propagating votes
|
||||
and blocks are likely to be reasonably similar across networks. As a result, the
|
||||
timeout parameters are good candidates for being hardcoded. Hardcoding the timeouts
|
||||
in Tendermint would mean entirely removing these parameters from any configuration
|
||||
that could be altered by either an application or a node operator. Instead,
|
||||
Tendermint would ship with a set of timeouts and all applications using Tendermint
|
||||
would use this exact same set of values.
|
||||
|
||||
While Tendermint nodes often run with similar bandwidth and on similar cloud-hosted
|
||||
machines, there are enough points of variability to make configuring
|
||||
consensus timeouts meaningful. Namely, Tendermint network topologies are likely to be
|
||||
very different from chain to chain. Additionally, applications may vary greatly in
|
||||
how long the `Commit` phase may take. Applications that perform more work during `Commit`
|
||||
require a longer `TimeoutCommit` to allow the application to complete its work
|
||||
and be prepared for the next height.
|
||||
|
||||
## Decision
|
||||
|
||||
The decision has been made to implement this work, with the caveat that the
|
||||
specific mechanism for introducing the new parameters to chains is still under discussion.
|
||||
|
||||
## Detailed Design
|
||||
|
||||
### New Consensus Parameters
|
||||
|
||||
A new `TimeoutParams` `message` will be added to the [params.proto file][consensus-params-proto].
|
||||
This message will have the following form:
|
||||
|
||||
```proto
|
||||
message TimeoutParams {
|
||||
google.protobuf.Duration propose = 1;
|
||||
google.protobuf.Duration propose_delta = 2;
|
||||
google.protobuf.Duration vote = 3;
|
||||
google.protobuf.Duration vote_delta = 4;
|
||||
google.protobuf.Duration commit = 5;
|
||||
bool bypass_commit_timeout = 6;
|
||||
}
|
||||
```
|
||||
|
||||
This new message will be added as a field into the [`ConsensusParams`
|
||||
message][consensus-params-proto]. The same default values that are [currently
|
||||
set for these parameters][current-timeout-defaults] in the local configuration
|
||||
file will be used as the defaults for these new consensus parameters in the
|
||||
[consensus parameter defaults][default-consensus-params].
|
||||
|
||||
The new consensus parameters will be subject to the same
|
||||
[validity rules][time-param-validation] as the current configuration values,
|
||||
namely, each value must be non-negative.
|
||||
|
||||
### Migration
|
||||
|
||||
The new `ConsensusParameters` will be added during an upcoming release. In this
|
||||
release, the old `config.toml` parameters will cease to control the timeouts and
|
||||
an error will be logged on nodes that continue to specify these values. The specific
|
||||
mechanism by which these parameters will be added to a chain is being discussed in
|
||||
[RFC-009][rfc-009] and will be decided ahead of the next release.
|
||||
|
||||
The specific mechanism for adding these parameters depends on work related to
|
||||
[soft upgrades][soft-upgrades], which is still ongoing.
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
* Timeout parameters will be equal across all of the validators in a Tendermint network.
|
||||
* Remove superfluous timeout parameters.
|
||||
|
||||
### Negative
|
||||
|
||||
### Neutral
|
||||
|
||||
* Timeout parameters require consensus to change.
|
||||
|
||||
## References
|
||||
|
||||
[consensus-params-proto]: https://github.com/tendermint/spec/blob/a00de7199f5558cdd6245bbbcd1d8405ccfb8129/proto/tendermint/types/params.proto#L11
|
||||
[hashed-params]: https://github.com/tendermint/tendermint/blob/7cdf560173dee6773b80d1c574a06489d4c394fe/types/params.go#L49
|
||||
[default-consensus-params]: https://github.com/tendermint/tendermint/blob/7cdf560173dee6773b80d1c574a06489d4c394fe/types/params.go#L79
|
||||
[current-timeout-defaults]: https://github.com/tendermint/tendermint/blob/7cdf560173dee6773b80d1c574a06489d4c394fe/config/config.go#L955
|
||||
[config-toml]: https://github.com/tendermint/tendermint/blob/5cc980698a3402afce76b26693ab54b8f67f038b/config/toml.go#L425-L440
|
||||
[cosmos-sdk-consensus-params]: https://github.com/cosmos/cosmos-sdk/issues/6197
|
||||
[time-param-validation]: https://github.com/tendermint/tendermint/blob/7cdf560173dee6773b80d1c574a06489d4c394fe/config/config.go#L1038
|
||||
[tendermint-issue-5911-comment]: https://github.com/tendermint/tendermint/issues/5911#issuecomment-973560381
|
||||
[spec-issue-359]: https://github.com/tendermint/spec/issues/359
|
||||
[arxiv-paper]: https://arxiv.org/pdf/1807.04938.pdf
|
||||
[soft-upgrades]: https://github.com/tendermint/spec/pull/222
|
||||
[rfc-009]: https://github.com/tendermint/tendermint/pull/7524
|
||||
@@ -1,684 +0,0 @@
|
||||
# ADR 075: RPC Event Subscription Interface
|
||||
|
||||
## Changelog
|
||||
|
||||
- 01-Mar-2022: Update long-polling interface (@creachadair).
|
||||
- 10-Feb-2022: Updates to reflect implementation.
|
||||
- 26-Jan-2022: Marked accepted.
|
||||
- 22-Jan-2022: Updated and expanded (@creachadair).
|
||||
- 20-Nov-2021: Initial draft (@creachadair).
|
||||
|
||||
---
|
||||
## Status
|
||||
|
||||
Accepted
|
||||
|
||||
---
|
||||
## Background & Context
|
||||
|
||||
For context, see [RFC 006: Event Subscription][rfc006].
|
||||
|
||||
The [Tendermint RPC service][rpc-service] permits clients to subscribe to the
|
||||
event stream generated by a consensus node. This allows clients to observe the
|
||||
state of the consensus network, including details of the consensus algorithm
|
||||
state machine, proposals, transaction delivery, and block completion. The
|
||||
application may also attach custom key-value attributes to events to expose
|
||||
application-specific details to clients.
|
||||
|
||||
The event subscription API in the RPC service currently comprises three methods:
|
||||
|
||||
1. `subscribe`: A request to subscribe to the events matching a specific
|
||||
[query expression][query-grammar]. Events can be filtered by their key-value
|
||||
attributes, including custom attributes provided by the application.
|
||||
|
||||
2. `unsubscribe`: A request to cancel an existing subscription based on its
|
||||
query expression.
|
||||
|
||||
3. `unsubscribe_all`: A request to cancel all existing subscriptions belonging
|
||||
to the client.
|
||||
|
||||
There are some important technical and UX issues with the current RPC event
|
||||
subscription API. The rest of this ADR outlines these problems in detail, and
|
||||
proposes a new API scheme intended to address them.
|
||||
|
||||
### Issue 1: Persistent connections
|
||||
|
||||
To subscribe to a node's event stream, a client needs a persistent connection
|
||||
to the node. Unlike the other methods of the service, for which each call is
|
||||
serviced by a short-lived HTTP round trip, subscription delivers a continuous
|
||||
stream of events to the client by hijacking the HTTP channel for a websocket.
|
||||
The stream (and hence the HTTP request) persists until either the subscription
|
||||
is explicitly cancelled, or the connection is closed.
|
||||
|
||||
There are several problems with this API:
|
||||
|
||||
1. **Expensive per-connection state**: The server must maintain a substantial
|
||||
amount of state per subscriber client:
|
||||
|
||||
- The current implementation uses a [WebSocket][ws] for each active
|
||||
subscriber. The connection must be maintained even if there are no
|
||||
matching events for a given client.
|
||||
|
||||
The server can drop idle connections to save resources, but doing so
|
||||
terminates all subscriptions on those connections and forces those clients
|
||||
to re-connect, adding additional resource churn for the server.
|
||||
|
||||
- In addition, the server maintains a separate buffer of undelivered events
|
||||
for each client. This is to reduce the dual risks that a client will miss
|
||||
events, and that a slow client could "push back" on the publisher,
|
||||
impeding the progress of consensus.
|
||||
|
||||
Because event traffic is quite bursty, queues can potentially take up a
|
||||
lot of memory. Moreover, each subscriber may have a different filter
|
||||
query, so the server winds up having to duplicate the same events among
|
||||
multiple subscriber queues. Not only does this add memory pressure, but it
|
||||
does so most at the worst possible time, i.e., when the server is already
|
||||
under load from high event traffic.
|
||||
|
||||
2. **Operational access control is difficult**: The server's websocket
|
||||
interface exposes _all_ the RPC service endpoints, not only the subscription
|
||||
methods. This includes methods that allow callers to inject arbitrary
|
||||
transactions (`broadcast_tx_*`) and evidence (`broadcast_evidence`) into the
|
||||
network, remove transactions (`remove_tx`), and request arbitrary amounts of
|
||||
chain state.
|
||||
|
||||
Filtering requests to the GET endpoint is straightforward: A reverse proxy
|
||||
like [nginx][nginx] can easily filter methods by URL path. Filtering POST
|
||||
requests takes a bit more work, but can be managed with a filter program
|
||||
that speaks [FastCGI][fcgi] and parses JSON-RPC request bodies.
|
||||
|
||||
Filtering the websocket interface requires a dedicated proxy implementation.
|
||||
Although nginx can [reverse-proxy websockets][rp-ws], it does not support
|
||||
filtering websocket traffic via FastCGI. The operator would need to either
|
||||
implement a custom [nginx extension module][ng-xm] or build and run a
|
||||
standalone proxy that implements websocket and filters each session. Apart
|
||||
from the work, this also makes the system even more resource intensive, as
|
||||
well as introducing yet another connection that could potentially time out
|
||||
or stall on full buffers.
|
||||
|
||||
Even for the simple case of restricting access to only event subscription,
|
||||
there is no easy solution currently: Once a caller has access to the
|
||||
websocket endpoint, it has complete access to the RPC service.
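
To make the POST-filtering idea above concrete, here is a minimal sketch of the check such a filter program might perform on a JSON-RPC request body. It is not part of Tendermint or nginx; the allowlist contents and the `methodAllowed` name are assumptions. The same check cannot be applied to websocket traffic without a proxy that terminates the websocket session, which is the difficulty described above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// allowedMethods is a hypothetical allowlist limited to the three
// subscription methods named in this ADR.
var allowedMethods = map[string]bool{
	"subscribe":       true,
	"unsubscribe":     true,
	"unsubscribe_all": true,
}

// methodAllowed parses a single JSON-RPC request body and reports whether
// its method is on the allowlist. Bodies that do not decode as a single
// request object (including batch arrays) are rejected.
func methodAllowed(body []byte) bool {
	var req struct {
		Method string `json:"method"`
	}
	if err := json.Unmarshal(body, &req); err != nil {
		return false
	}
	return allowedMethods[req.Method]
}

func main() {
	fmt.Println(methodAllowed([]byte(`{"jsonrpc":"2.0","id":1,"method":"subscribe"}`)))
	fmt.Println(methodAllowed([]byte(`{"jsonrpc":"2.0","id":2,"method":"broadcast_tx_sync"}`)))
}
```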
|
||||
|
||||
### Issue 2: Inconvenient client API
|
||||
|
||||
The subscription interface has some inconvenient features for the client as
|
||||
well as the server. These include:
|
||||
|
||||
1. **Non-standard protocol:** The RPC service is mostly [JSON-RPC 2.0][jsonrpc2],
|
||||
but the subscription interface diverges from the standard.
|
||||
|
||||
In a standard JSON-RPC 2.0 call, the client initiates a request to the
|
||||
server with a unique ID, and the server concludes the call by sending a
|
||||
reply for that ID. The `subscribe` implementation, however, sends multiple
|
||||
responses to the client's request:
|
||||
|
||||
- The client sends `subscribe` with some ID `x` and the desired query.
|
||||
|
||||
- The server responds with ID `x` and an empty confirmation response.
|
||||
|
||||
- The server then (repeatedly) sends event result responses with ID `x`, one
|
||||
for each item with a matching event.
|
||||
|
||||
Standard JSON-RPC clients will reject the subsequent replies, as they
|
||||
announce a request ID (`x`) that is already complete. This means a caller
|
||||
has to implement Tendermint-specific handling for these responses.
|
||||
|
||||
Moreover, the result format is different between the initial confirmation
|
||||
and the subsequent responses. This means a caller has to implement special
|
||||
logic for decoding the first response versus the subsequent ones.
|
||||
|
||||
2. **No way to detect data loss:** The subscriber connection can be terminated
|
||||
for many reasons. Even ignoring ordinary network issues (e.g., packet loss):
|
||||
|
||||
- The server will drop messages and/or close the websocket if its write
|
||||
buffer fills, or if the queue of undelivered matching events is not
|
||||
drained fast enough. The client has no way to discover that messages were
|
||||
dropped even if the connection remains open.
|
||||
|
||||
- Either the client or the server may close the websocket if the websocket
|
||||
PING and PONG exchanges are not handled correctly, or frequently enough.
|
||||
Even if correctly implemented, this may fail if the system is under high
|
||||
load and cannot service those control messages in a timely manner.
|
||||
|
||||
When the connection is terminated, the server drops all the subscriptions
|
||||
for that client (as if it had called `unsubscribe_all`). Even if the client
|
||||
reconnects, any events that were published during the period between the
|
||||
disconnect and re-connect and re-subscription will be silently lost, and the
|
||||
client has no way to discover that it missed some relevant messages.
|
||||
|
||||
3. **No way to replay old events:** Even if a client knew it had missed some
|
||||
events (due to a disconnection, for example), the API provides no way for
|
||||
the client to "play back" events it may have missed.
|
||||
|
||||
4. **Large response sizes:** Some event data can be quite large, and there can
|
||||
be substantial duplication across items. The API allows the client to select
|
||||
_which_ events are reported, but has no way to control which parts of a
|
||||
matching event it wishes to receive.
|
||||
|
||||
This can be costly on the server (which has to marshal those data into
|
||||
JSON), the network, and the client (which has to unmarshal the result and
|
||||
then pick through for the components that are relevant to it).
|
||||
|
||||
Besides being inefficient, this also contributes to some of the persistent
|
||||
connection issues mentioned above, e.g., filling up the websocket write
|
||||
buffer and forcing the server to queue potentially several copies of a large
|
||||
value in memory.
|
||||
|
||||
5. **Client identity is tied to network address:** The Tendermint event API
|
||||
identifies each subscriber by a (Client ID, Query) pair. In the RPC service,
|
||||
the query is provided by the client, but the client ID is set to the TCP
|
||||
address of the client (typically "host:port" or "ip:port").
|
||||
|
||||
This means that even if the server did _not_ drop subscriptions immediately
|
||||
when the websocket connection is closed, a client may not be able to
|
||||
reattach to its existing subscription. Dialing a new connection is likely
|
||||
to result in a different port (and, depending on their own proxy setup,
|
||||
possibly a different public IP).
|
||||
|
||||
In isolation, this problem would be easy to work around with a new
|
||||
subscription parameter, but it would require several other changes to the
|
||||
handling of event subscriptions for that workaround to become useful.
|
||||
|
||||
---
|
||||
## Decision
|
||||
|
||||
To address the described problems, we will:
|
||||
|
||||
1. Introduce a new API for event subscription to the Tendermint RPC service.
|
||||
The proposed API is described in [Detailed Design](#detailed-design) below.
|
||||
|
||||
2. This new API will target the Tendermint v0.36 release, during which the
|
||||
current ("streaming") API will remain available as-is, but deprecated.
|
||||
|
||||
3. The streaming API will be entirely removed in release v0.37, which will
|
||||
require all users of event subscription to switch to the new API.
|
||||
|
||||
> **Point for discussion:** Given that ABCI++ and PBTS are the main priorities
|
||||
> for v0.36, it would be fine to slip the first phase of this work to v0.37.
|
||||
> Unless there is a time problem, however, the proposed design does not disrupt
|
||||
> the work on ABCI++ or PBTS, and will not increase the scope of breaking
|
||||
> changes. Therefore the plan is to begin in v0.36 and slip only if necessary.
|
||||
|
||||
---
|
||||
## Detailed Design
|
||||
|
||||
### Design Goals
|
||||
|
||||
Specific goals of this design include:
|
||||
|
||||
1. Remove the need for a persistent connection to each subscription client.
|
||||
Subscribers should use the same HTTP request flow for event subscription
|
||||
requests as for other RPC calls.
|
||||
|
||||
2. The server retains minimal state (possibly none) per-subscriber. In
|
||||
particular:
|
||||
|
||||
- The server does not buffer unconsumed writes nor queue undelivered events
|
||||
on a per-client basis.
|
||||
- A client that stalls or goes idle does not cost the server any resources.
|
||||
- Any event data that is buffered or stored is shared among _all_
|
||||
subscribers, and is not duplicated per client.
|
||||
|
||||
3. Slow clients have no impact (or minimal impact) on the rate of progress of
|
||||
the consensus algorithm, beyond the ambient overhead of servicing individual
|
||||
RPC requests.
|
||||
|
||||
4. Clients can tell when they have missed events matching their subscription,
|
||||
within some reasonable (configurable) window of time, and can "replay"
|
||||
events within that window to catch up.
|
||||
|
||||
5. Nice to have: It should be easy to use the event subscription API from
|
||||
existing standard tools and libraries, including command-line use for
|
||||
testing and experimentation.
|
||||
|
||||
### Definitions
|
||||
|
||||
- The **event stream** of a node is a single, time-ordered, heterogeneous
|
||||
stream of event items.
|
||||
|
||||
- Each **event item** comprises an **event datum** (for example, block header
|
||||
metadata for a new-block event), and zero or more optional **events**.
|
||||
|
||||
- An **event** means the [ABCI `Event` data type][abci-event], which comprises
|
||||
a string type and zero or more string key-value **event attributes**.
|
||||
|
||||
The use of the new terms "event item" and "event datum" is to avert confusion
|
||||
between the values that are published to the event bus (what we call here
|
||||
"event items") and the ABCI `Event` data type.
|
||||
|
||||
- The node assigns each event item a unique identifier string called a
|
||||
**cursor**. A cursor must be unique among all events published by a single
|
||||
node, but it is not required to be unique globally across nodes.
|
||||
|
||||
Cursors are time-ordered so that given event items A and B, if A was
|
||||
published before B, then cursor(A) < cursor(B) in lexicographic order.
|
||||
|
||||
A minimum viable cursor implementation is a tuple consisting of a timestamp
|
||||
and a sequence number (e.g., `16CCC798FB5F4670-0123`). However, it may also
|
||||
be useful to append basic type information to a cursor, to allow efficient
|
||||
filtering (e.g., `16CCC87E91869050-0091:BeginBlock`).
|
||||
|
||||
The initial implementation will use the minimum viable format.
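
One way to obtain the ordering property is to encode both parts of the tuple as fixed-width, zero-padded hexadecimal, so that string comparison agrees with numeric comparison. The sketch below assumes a 64-bit timestamp and a 16-bit sequence number, both rendered in hex; the ADR's examples are consistent with that layout but do not confirm it, and `makeCursor` is a hypothetical name.

```go
package main

import "fmt"

// makeCursor formats a timestamp and sequence number as fixed-width,
// zero-padded uppercase hex. Because every cursor has the same width,
// lexicographic order of the strings matches (timestamp, seq) order.
func makeCursor(timestamp uint64, seq uint16) string {
	return fmt.Sprintf("%016X-%04X", timestamp, seq)
}

func main() {
	a := makeCursor(0x16CCC798FB5F4670, 0x0123)
	b := makeCursor(0x16CCC798FB5F4670, 0x0124) // same timestamp, later sequence
	c := makeCursor(0x16CCC87E91869050, 0x0091) // later timestamp
	fmt.Println(a, b, c)
	fmt.Println(a < b, b < c)
}
```

Without the zero padding, a shorter timestamp string could sort after a longer one, and the `cursor(A) < cursor(B)` guarantee would no longer hold.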
|
||||
|
||||
### Discussion
|
||||
|
||||
The node maintains an **event log**, a shared ordered record of the events
|
||||
published to its event bus within an operator-configurable time window. The
|
||||
initial implementation will store the event log in-memory, and the operator
|
||||
will be given two per-node configuration settings. Note, these names are
|
||||
provisional:
|
||||
|
||||
- `[rpc] event-log-window-size`: A duration before the latest published event,
|
||||
during which the node will retain event items published. Setting this value
|
||||
to zero disables event subscription.
|
||||
|
||||
- `[rpc] event-log-max-items`: A maximum number of event items that the node
|
||||
will retain within the time window. If the number of items exceeds this
|
||||
value, the node discards the oldest items in the window. Setting this value
|
||||
to zero means that no limit is imposed on the number of items.
|
||||
|
||||
The node will retain all events within the time window, provided they do not
|
||||
exceed the maximum number. These config parameters allow the operator to
|
||||
loosely regulate how much memory and storage the node allocates to the event
|
||||
log. The client can use the server reply to tell whether the events it wants
|
||||
are still available from the event log.
|
||||
|
||||
The event log is shared among all subscribers to the node.
|
||||
|
||||
> **Discussion point:** Should events persist across node restarts?
|
||||
>
|
||||
> The current event API does not persist events across restarts, so this new
|
||||
> design does not either. Note, however, that we may "spill" older event data
|
||||
> to disk as a way of controlling memory use. Such usage is ephemeral, however,
|
||||
> and does not need to be tracked as node data (e.g., it could be temp files).
|
||||
|
||||
### Query API
|
||||
|
||||
To retrieve event data, the client will call the (new) RPC method `events`.
|
||||
The parameters of this method will correspond to the following Go types:
|
||||
|
||||
```go
|
||||
type EventParams struct {
|
||||
// Optional filter spec. If nil or empty, all items are eligible.
|
||||
Filter *Filter `json:"filter"`
|
||||
|
||||
// The maximum number of eligible results to return.
|
||||
// If zero or negative, the server will report a default number.
|
||||
MaxResults int `json:"max_results"`
|
||||
|
||||
// Return only items after this cursor. If empty, the limit is just
|
||||
// before the beginning of the event log.
|
||||
After string `json:"after"`
|
||||
|
||||
// Return only items before this cursor. If empty, the limit is just
|
||||
// after the head of the event log.
|
||||
Before string `json:"before"`
|
||||
|
||||
// Wait for up to this long for events to be available.
|
||||
WaitTime time.Duration `json:"wait_time"`
|
||||
}
|
||||
|
||||
type Filter struct {
|
||||
Query string `json:"query"`
|
||||
}
|
||||
```
|
||||
|
||||
> **Discussion point:** The initial implementation will not cache filter
|
||||
> queries for the client. If this turns out to be a performance issue in
|
||||
> production, the service can keep a small shared cache of compiled queries.
|
||||
> Given the improvements from #7319 et seq., this should not be necessary.
|
||||
|
||||
> **Discussion point:** For the initial implementation, the new API will use
|
||||
> the existing query language as-is. Future work may extend the Filter message
|
||||
> with a more structured and/or expressive query surface, but that is beyond
|
||||
> the scope of this design.
|
||||
|
||||
The semantics of the request are as follows: An item in the event log is
|
||||
**eligible** for a query if:
|
||||
|
||||
- It is newer than the `after` cursor (if set).
|
||||
- It is older than the `before` cursor (if set).
|
||||
- It matches the filter (if set).
|
||||
|
||||
Among the eligible items in the log, the server returns up to `max_results` of
|
||||
the newest items, in reverse order of cursor. If `max_results` is unset the
|
||||
server chooses a number to return, and will cap `max_results` at a sensible
|
||||
limit.
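
The eligibility and ordering rules above can be sketched as a single selection pass over the log. The types, the `selectItems` name, and the default and cap values are all hypothetical; the real server would apply a compiled query rather than the plain match function used here.

```go
package main

import "fmt"

// item is a simplified event log entry.
type item struct {
	Cursor string
	Type   string
}

const (
	defaultResults = 10  // hypothetical server default
	maxResultsCap  = 100 // hypothetical server-side cap
)

// selectItems scans a log ordered oldest to newest and returns up to
// maxResults eligible items, newest first. The second result reports
// whether at least one older eligible item was left out.
func selectItems(log []item, after, before string, match func(item) bool, maxResults int) ([]item, bool) {
	if maxResults <= 0 {
		maxResults = defaultResults
	}
	if maxResults > maxResultsCap {
		maxResults = maxResultsCap
	}
	var out []item
	for i := len(log) - 1; i >= 0; i-- {
		it := log[i]
		if after != "" && it.Cursor <= after {
			continue // not newer than the after cursor
		}
		if before != "" && it.Cursor >= before {
			continue // not older than the before cursor
		}
		if match != nil && !match(it) {
			continue // rejected by the filter
		}
		if len(out) == maxResults {
			return out, true
		}
		out = append(out, it)
	}
	return out, false
}

func main() {
	log := []item{{"01", "NewBlock"}, {"02", "Tx"}, {"03", "NewBlock"}, {"04", "Tx"}, {"05", "NewBlock"}}
	items, more := selectItems(log, "01", "", nil, 2)
	fmt.Println(items, more)
}
```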
|
||||
|
||||
The `wait_time` parameter is used to effect polling. If `before` is empty and
|
||||
no items are available, the server will wait for up to `wait_time` for matching
|
||||
items to arrive at the head of the log. If `wait_time` is zero or negative, the
|
||||
server will wait for a default (positive) interval.
|
||||
|
||||
If `before` is non-empty, `wait_time` is ignored: new results are only added to
|
||||
the head of the log, so there is no need to wait. This allows the client to
|
||||
poll for new data, and "page" backward through matching event items. This is
|
||||
discussed in more detail below.
|
||||
|
||||
The server will set a sensible cap on the maximum `wait_time`, overriding
|
||||
client-requested intervals longer than that.
|
||||
|
||||
A successful reply from the `events` request corresponds to the following Go
|
||||
types:
|
||||
|
||||
```go
|
||||
type EventReply struct {
|
||||
// The items matching the request parameters, from newest
|
||||
// to oldest, if any were available within the timeout.
|
||||
Items []*EventItem `json:"items"`
|
||||
|
||||
// This is true if there is at least one older matching item
|
||||
// available in the log that was not returned.
|
||||
More bool `json:"more"`
|
||||
|
||||
// The cursor of the oldest item in the log at the time of this reply,
|
||||
// or "" if the log is empty.
|
||||
Oldest string `json:"oldest"`
|
||||
|
||||
// The cursor of the newest item in the log at the time of this reply,
|
||||
// or "" if the log is empty.
|
||||
Newest string `json:"newest"`
|
||||
}
|
||||
|
||||
type EventItem struct {
|
||||
// The cursor of this item.
|
||||
Cursor string `json:"cursor"`
|
||||
|
||||
// The encoded event data for this item.
|
||||
// The type identifies the structure of the value.
|
||||
Data struct {
|
||||
Type string `json:"type"`
|
||||
Value json.RawMessage `json:"value"`
|
||||
} `json:"data"`
|
||||
}
|
||||
```
|
||||
|
||||
The `oldest` and `newest` fields of the reply report the cursors of the oldest
|
||||
and newest items (of any kind) recorded in the event log at the time of the
|
||||
reply, or are `""` if the log is empty.
|
||||
|
||||
The `data` field contains the type-specific event datum. The datum carries any
|
||||
ABCI events that may have been defined.
|
||||
|
||||
> **Discussion point**: Based on [issue #7273][i7273], I did not include a
|
||||
> separate field in the response for the ABCI events, since it duplicates data
|
||||
> already stored elsewhere in the event data.
|
||||
|
||||
The semantics of the reply are as follows:
|
||||
|
||||
- If `items` is non-empty:
|
||||
|
||||
- Items are ordered from newest to oldest.
|
||||
|
||||
- If `more` is true, there is at least one additional, older item in the
|
||||
event log that was not returned (in excess of `max_results`).
|
||||
|
||||
In this case the client can fetch the next page by setting `before` in a
|
||||
new request, to the cursor of the oldest item fetched (i.e., the last one
|
||||
in `items`).
|
||||
|
||||
- Otherwise (if `more` is false), all the matching results have been
|
||||
reported (pagination is complete).
|
||||
|
||||
- The first element of `items` identifies the newest item considered.
|
||||
Subsequent poll requests can set `after` to this cursor to skip items
|
||||
that were already retrieved.
|
||||
|
||||
- If `items` is empty:
|
||||
|
||||
- If the `before` was set in the request, there are no further eligible
|
||||
items for this query in the log (pagination is complete).
|
||||
|
||||
This is just a safety case; the client can detect this without issuing
|
||||
another call by consulting the `more` field of the previous reply.
|
||||
|
||||
- If the `before` was empty in the request, no eligible items were
|
||||
available before the `wait_time` expired. The client may poll again to
|
||||
wait for more event items.
|
||||
|
||||
A client can store cursor values to detect data loss and to recover from
|
||||
crashes and connectivity issues:
|
||||
|
||||
- After a crash, the client requests events after the newest cursor it has
|
||||
seen. If the reply indicates that cursor is no longer in range, the client
|
||||
may (conservatively) conclude some event data may have been lost.
|
||||
|
||||
- On the other hand, if it _is_ in range, the client can then page back through
|
||||
the results that it missed, and then resume polling. As long as its recovery
|
||||
cursor does not age out before it finishes, the client can be sure it has all
|
||||
the relevant results.
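
The paging and recovery rules above can be sketched as a small client loop. This is a minimal illustration, not the Tendermint client: `fetchFunc`, `catchUp`, and `fakeLog` are hypothetical names, the fake server uses zero-padded strings as cursors, and the loss check assumes that cursor ordering can be observed by comparing the strings directly.

```go
package main

import "fmt"

// Item and Reply mirror only the reply fields discussed above.
type Item struct {
	Cursor string
}

type Reply struct {
	Items  []Item
	More   bool
	Oldest string
	Newest string
}

// fetchFunc stands in for one call to the events method (hypothetical signature).
type fetchFunc func(before, after string, maxItems int) Reply

// catchUp pages backward from the head of the log, collecting every item
// newer than after. It returns the items newest-first, the cursor to use as
// `after` in the next poll, and whether data may have been lost because
// after is older than the oldest item still in the log.
func catchUp(fetch fetchFunc, after string, maxItems int) (items []Item, next string, lost bool) {
	before := ""
	for page := 0; ; page++ {
		rsp := fetch(before, after, maxItems)
		if page == 0 {
			// Assumption: cursors compare as strings in log order.
			lost = after != "" && rsp.Oldest != "" && after < rsp.Oldest
		}
		items = append(items, rsp.Items...)
		if len(rsp.Items) == 0 || !rsp.More {
			break // pagination is complete
		}
		// Continue from the oldest item fetched so far.
		before = rsp.Items[len(rsp.Items)-1].Cursor
	}
	next = after
	if len(items) > 0 {
		next = items[0].Cursor // newest item seen
	}
	return items, next, lost
}

// fakeLog is an in-memory stand-in for the server; cursors are newest first.
func fakeLog(cursors []string) fetchFunc {
	return func(before, after string, maxItems int) Reply {
		var rsp Reply
		if len(cursors) > 0 {
			rsp.Newest, rsp.Oldest = cursors[0], cursors[len(cursors)-1]
		}
		for _, c := range cursors {
			if before != "" && c >= before {
				continue // not older than before
			}
			if after != "" && c <= after {
				continue // not newer than after
			}
			if len(rsp.Items) == maxItems {
				rsp.More = true // at least one more eligible item exists
				break
			}
			rsp.Items = append(rsp.Items, Item{Cursor: c})
		}
		return rsp
	}
}

func main() {
	fetch := fakeLog([]string{"0007", "0006", "0005", "0004", "0003"})
	items, next, lost := catchUp(fetch, "0004", 2)
	fmt.Println(len(items), next, lost)
}
```

A real client would replace `fakeLog` with an RPC call and would also set a `wait_time` when polling the head of the log.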
|
||||
|
||||
### Other Notes
|
||||
|
||||
- The new API supports two general "modes" of operation:
|
||||
|
||||
1. In ordinary operation, clients will **long-poll** the head of the event
|
||||
log for new events matching their criteria (by setting a `wait_time` and
|
||||
no `before`).
|
||||
|
||||
2. If there are more events than the client requested, or if the client needs
|
||||
to read older events to recover from a stall or crash, clients will
|
||||
**page** backward through the event log (by setting `before` and `after`).
|
||||
|
||||
- While the new API requires explicit polling by the client, it makes better
|
||||
use of the node's existing HTTP infrastructure (e.g., connection pools).
|
||||
Moreover, the direct implementation is easier to use from standard tools and
|
||||
client libraries for HTTP and JSON-RPC.
|
||||
|
||||
Explicit polling does shift the burden of timeliness to the client. That is
|
||||
arguably preferable, however, given that the RPC service is ancillary to the
|
||||
node's primary goal, viz., consensus. The details of polling can be easily
|
||||
hidden from client applications with simple libraries.
|
||||
|
||||
- The format of a cursor is considered opaque to the client. Clients must not
|
||||
parse cursor values, but they may rely on their ordering properties.
|
||||
|
||||
- To maintain the event log, the server must prune items outside the time
|
||||
window and in excess of the item limit.
|
||||
|
||||
The initial implementation will do this by checking the tail of the event log
|
||||
after each new item is published. If the number of items in the log exceeds
|
||||
the item limit, it will delete oldest items until the log is under the limit;
|
||||
then it will discard any items older than the time window before the latest.
|
||||
|
||||
To minimize coordination interference between the publisher (the event bus)
|
||||
and the subscribers (the `events` service handlers), the event log will be
|
||||
stored as a persistent linear queue with shared structure (a cons list). A
|
||||
single reader-writer mutex will guard the "head" of the queue where new
|
||||
items are published:
|
||||
|
||||
- **To publish a new item**, the publisher acquires the write lock, conses a
|
||||
new item to the front of the existing queue, and replaces the head pointer
|
||||
with the new item.
|
||||
|
||||
- **To scan the queue**, a reader acquires the read lock, captures the head
|
||||
pointer, and then releases the lock. The rest of its request can be served
|
||||
without holding a lock, since the queue structure will not change.
|
||||
|
||||
When a reader wants to wait, it will yield the lock and wait on a condition
|
||||
that is signaled when the publisher swings the pointer.
|
||||
|
||||
- **To prune the queue**, the publisher (who is the sole writer) will track
|
||||
the queue length and the age of the oldest item separately. When the
|
||||
length and/or age exceed the configured bounds, it will construct a new
|
||||
queue spine on the same items, discarding out-of-band values.
|
||||
|
||||
Pruning can be done while the publisher already holds the write lock, or
|
||||
could be done outside the lock entirely: Once the new queue is constructed,
|
||||
the lock can be re-acquired to swing the pointer. This costs some extra
|
||||
allocations for the cons cells, but avoids duplicating any event items.
|
||||
The pruning step is a simple linear scan down the first (up to) max-items
|
||||
elements of the queue, to find the breakpoint of age and length.
|
||||
|
||||
Moreover, the publisher can amortize the cost of pruning by item count, if
|
||||
necessary, by pruning length "more aggressively" than the configuration
|
||||
requires (e.g., reducing to 3/4 of the maximum rather than 1/1).
|
||||
|
||||
The state of the event log before the publisher acquires the lock:
|
||||

|
||||
|
||||
After the publisher has added a new item and pruned old ones:
|
||||

|
||||
|
||||
### Migration Plan
|
||||
|
||||
This design requires that clients eventually migrate to the new event
|
||||
subscription API, but provides a full release cycle with both APIs in place to
|
||||
make this burden more tractable. The migration strategy is broadly:
|
||||
|
||||
**Phase 1**: Release v0.36.
|
||||
|
||||
- Implement the new `events` endpoint, keeping the existing methods as they are.
|
||||
- Update the Go clients to support the new `events` endpoint, and handle polling.
|
||||
- Update the old endpoints to log annoyingly about their own deprecation.
|
||||
- Write tutorials about how to migrate client usage.
|
||||
|
||||
At or shortly after release, we should proactively update the Cosmos SDK to use
|
||||
the new API, to remove a disincentive to upgrading.
|
||||
|
||||
**Phase 2**: Release v0.37
|
||||
|
||||
- During development, we should actively seek out any existing users of the
|
||||
streaming event subscription API and help them migrate.
|
||||
- Possibly also: Spend some time writing clients for JS, Rust, et al.
|
||||
- Release: Delete the old implementation and all the websocket support code.
|
||||
|
||||
> **Discussion point**: Even though the plan is to keep the existing service,
|
||||
> we might take the opportunity to restrict the websocket endpoint to _only_
|
||||
> the event streaming service, removing the other endpoints. To minimize the
|
||||
> disruption for users in the v0.36 cycle, I have decided not to do this for
|
||||
> the first phase.
|
||||
>
|
||||
> If we wind up pushing this design into v0.37, however, we should re-evaluate
|
||||
> this partial turn-down of the websocket.
|
||||
|
||||
### Future Work
|
||||
|
||||
- This design does not immediately address the problem of allowing the client
|
||||
to control which data are reported back for event items. That concern is
|
||||
deferred to future work. However, it would be straightforward to extend the
|
||||
filter and/or the request parameters to allow more control.
|
||||
|
||||
- The node currently stores a subset of event data (specifically the block and
|
||||
transaction events) for use in reindexing. While these data are redundant
|
||||
with the event log described in this document, they are not sufficient to
|
||||
cover event subscription, as they omit other event types.
|
||||
|
||||
In the future we should investigate consolidating or removing event data from
|
||||
the state store entirely. For now this issue is out of scope for purposes of
|
||||
updating the RPC API. We may be able to piggyback on the database unification
|
||||
plans (see [RFC 001][rfc001]) to store the event log separately, so its
|
||||
pruning policy does not need to be tied to the block and state stores.
|
||||
|
||||
- This design reuses the existing filter query language from the old API. In
|
||||
the future we may want to use a more structured and/or expressive query. The
|
||||
Filter object can be extended with more fields as needed to support this.
|
||||
|
||||
- Some users have trouble communicating with the RPC service because of
|
||||
configuration problems like improperly-set CORS policies. While this design
|
||||
does not address those issues directly, we might want to revisit how we set
|
||||
policies in the RPC service to make it less susceptible to confusing errors
|
||||
caused by misconfiguration.
|
||||
|
||||
---
|
||||
## Consequences
|
||||
|
||||
- ✅ Reduces the number of transport options for RPC. Supports [RFC 002][rfc002].
|
||||
- ️✅ Removes the primary non-standard use of JSON-RPC.
|
||||
- ⛔️ Forces clients to migrate to a different API (eventually).
|
||||
- ↕️ API requires clients to poll, but this reduces client state on the server.
|
||||
- ↕️ We have to maintain both implementations for a whole release, but this
|
||||
gives clients time to migrate.
|
||||
|
||||
---
|
||||
## Alternative Approaches
|
||||
|
||||
The following alternative approaches were considered:
|
||||
|
||||
1. **Leave it alone.** Since existing tools mostly already work with the API as
|
||||
it stands today, we could leave it alone and do our best to improve its
|
||||
performance and reliability.
|
||||
|
||||
Based on many issues reported by users and node operators (e.g.,
|
||||
[#3380][i3380], [#6439][i6439], [#6729][i6729], [#7247][i7247]), the
|
||||
problems described here affect even the existing use that works. Investing
|
||||
further incremental effort in the existing API is unlikely to address these
|
||||
issues.
|
||||
|
||||
2. **Design a better streaming API.** Instead of polling, we might try to
|
||||
design a better "streaming" API for event subscription.
|
||||
|
||||
A significant advantage of switching away from streaming is to remove the
|
||||
need for persistent connections between the node and subscribers. A new
|
||||
streaming protocol design would lose that advantage, and would still need a
|
||||
way to let clients recover and replay.
|
||||
|
||||
This approach might look better if we decided to use a different protocol
|
||||
for event subscription, say gRPC instead of JSON-RPC. That choice, however,
|
||||
would be just as breaking for existing clients, for marginal benefit.
|
||||
Moreover, this option increases both the complexity and the resource cost on
|
||||
the node implementation.
|
||||
|
||||
Given that resource consumption and complexity are important considerations,
|
||||
this option was not chosen.
|
||||
|
||||
3. **Defer to an external event broker.** We might remove the entire event
|
||||
subscription infrastructure from the node, and define an optional interface
|
||||
to allow the node to publish all its events to an external event broker,
|
||||
such as Apache Kafka.
|
||||
|
||||
This has the advantage of greatly simplifying the node, but at a great cost
|
||||
to the node operator: To enable event subscription in this design, the
|
||||
operator has to stand up and maintain a separate process in communion with
|
||||
the node, and configuration changes would have to be coordinated across
|
||||
both.
|
||||
|
||||
Moreover, this approach would be highly disruptive to existing client use,
|
||||
and migration would probably require switching to third-party libraries.
|
||||
Despite the potential benefits for the node itself, the costs to operators
|
||||
and clients seem too large for this to be the best option.
|
||||
|
||||
Publishing to an external event broker might be a worthwhile future project,
|
||||
if there is any demand for it. That decision is out of scope for this design,
|
||||
as it interacts with the design of the indexer as well.
|
||||
|
||||
---
|
||||
## References
|
||||
|
||||
- [RFC 006: Event Subscription][rfc006]
|
||||
- [Tendermint RPC service][rpc-service]
|
||||
- [Event query grammar][query-grammar]
|
||||
- [RFC 6455: The WebSocket protocol][ws]
|
||||
- [JSON-RPC 2.0 Specification][jsonrpc2]
|
||||
- [Nginx proxy server][nginx]
|
||||
- [Proxying websockets][rp-ws]
|
||||
- [Extension modules][ng-xm]
|
||||
- [FastCGI][fcgi]
|
||||
- [RFC 001: Storage Engines & Database Layer][rfc001]
|
||||
- [RFC 002: Interprocess Communication in Tendermint][rfc002]
|
||||
- Issues:
|
||||
- [rpc/client: test that client resubscribes upon disconnect][i3380] (#3380)
|
||||
- [Too high memory usage when creating many events subscriptions][i6439] (#6439)
|
||||
- [Tendermint emits events faster than clients can pull them][i6729] (#6729)
|
||||
- [indexer: unbuffered event subscription slow down the consensus][i7247] (#7247)
|
||||
- [rpc: remove duplication of events when querying][i7273] (#7273)
|
||||
|
||||
[rfc006]: https://github.com/tendermint/tendermint/blob/master/docs/rfc/rfc-006-event-subscription.md
|
||||
[rpc-service]: https://github.com/tendermint/tendermint/blob/master/rpc/openapi/openapi.yaml
|
||||
[query-grammar]: https://pkg.go.dev/github.com/tendermint/tendermint@master/internal/pubsub/query/syntax
|
||||
[ws]: https://datatracker.ietf.org/doc/html/rfc6455
|
||||
[jsonrpc2]: https://www.jsonrpc.org/specification
|
||||
[nginx]: https://nginx.org/en/docs/
|
||||
[fcgi]: http://www.mit.edu/~yandros/doc/specs/fcgi-spec.html
|
||||
[rp-ws]: https://nginx.org/en/docs/http/websocket.html
|
||||
<!-- markdown-link-check-disable-next-line -->
|
||||
[ng-xm]: https://www.nginx.com/resources/wiki/extending/
|
||||
[abci-event]: https://pkg.go.dev/github.com/tendermint/tendermint/abci/types#Event
|
||||
[rfc001]: https://github.com/tendermint/tendermint/blob/master/docs/rfc/rfc-001-storage-engine.rst
|
||||
[rfc002]: https://github.com/tendermint/tendermint/blob/master/docs/rfc/rfc-002-ipc-ecosystem.md
|
||||
[i3380]: https://github.com/tendermint/tendermint/issues/3380
|
||||
[i6439]: https://github.com/tendermint/tendermint/issues/6439
|
||||
[i6729]: https://github.com/tendermint/tendermint/issues/6729
|
||||
[i7247]: https://github.com/tendermint/tendermint/issues/7247
|
||||
[i7273]: https://github.com/tendermint/tendermint/issues/7273
|
||||
@@ -1,112 +0,0 @@
|
||||
# ADR 076: Combine Spec and Tendermint Repositories
|
||||
|
||||
## Changelog
|
||||
|
||||
- 2022-02-04: Initial Draft. (@tychoish)
|
||||
|
||||
## Status
|
||||
|
||||
Accepted.
|
||||
|
||||
## Context
|
||||
|
||||
While the specification for Tendermint was originally in the same
|
||||
repository as the Go implementation, at some point the specification
|
||||
was split from the core repository and maintained separately from the
|
||||
implementation. While this makes sense in promoting a conceptual
|
||||
separation of specification and implementation, in practice this
|
||||
separation was a premature optimization, apparently aimed at supporting
|
||||
alternate implementations of Tendermint.
|
||||
|
||||
The operational and documentary burden of maintaining a separate
|
||||
spec repo has not returned value to justify its cost. There are no active
|
||||
projects to develop alternate implementations of Tendermint based on the
|
||||
common specification, and having separate repositories creates an ongoing
|
||||
burden to coordinate versions, documentation, and releases.
|
||||
|
||||
## Decision
|
||||
|
||||
The specification repository will be merged back into the Tendermint
|
||||
core repository.
|
||||
|
||||
Stakeholders including representatives from the maintainers of the
|
||||
spec, the Go implementation, and the Tendermint Rust library, agreed
|
||||
to merge the repositories in the Tendermint core dev meeting on 27
|
||||
January 2022, including @williambanfield @cmwaters @creachadair and
|
||||
@thanethomson.
|
||||
|
||||
## Alternative Approaches
|
||||
|
||||
The main alternative we considered was to keep separate repositories,
|
||||
and to introduce a coordinated versioning scheme between the two, so
|
||||
that users could figure out which spec versions go with which versions
|
||||
of the core implementation.
|
||||
|
||||
We decided against this on the grounds that it would further complicate
|
||||
the release process for _both_ repositories, without mitigating any of
|
||||
the other existing issues.
|
||||
|
||||
## Detailed Design
|
||||
|
||||
Clone and merge the master branch of the `tendermint/spec` repository
|
||||
as a branch of the `tendermint/tendermint` repository, to ensure the commit history
|
||||
of both repositories remains intact.
|
||||
|
||||
### Implementation Instructions
|
||||
|
||||
1. Within the `tendermint` repository, execute the following commands
|
||||
to add a new branch with the history of the master branch of `spec`:
|
||||
|
||||
```bash
git remote add spec git@github.com:tendermint/spec.git
git fetch spec
git checkout -b spec-master spec/master
mkdir spec
git ls-tree -z --name-only HEAD | xargs -0 -I {} git mv {} spec/
git commit -m "spec: organize specification prior to merge"
git checkout -b spec-merge-mainline origin/master
git merge --allow-unrelated-histories spec-master
```
|
||||
|
||||
This merges the spec into the `tendermint/tendermint` repository as
|
||||
a normal branch. This commit can also be backported to the 0.35
|
||||
branch, if needed.
|
||||
|
||||
2. Migrate outstanding issues from `tendermint/spec` to the
|
||||
`tendermint/tendermint` repository.
|
||||
|
||||
3. In the specification repository, add a redirect to the README and mark
|
||||
the repository as archived.
|
||||
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
Easier maintenance for the specification will obviate a number of
|
||||
complicated and annoying versioning problems, and will help prevent the
|
||||
possibility of the specification and the implementation drifting apart.
|
||||
|
||||
Additionally, co-locating the specification will help encourage
|
||||
cross-pollination and collaboration, between engineers focusing on the
|
||||
specification and the protocol and engineers focusing on the implementation.
|
||||
|
||||
### Negative
|
||||
|
||||
Co-locating the spec and Go implementation has the potential effect of
|
||||
prioritizing the Go implementation with regard to the spec, and
|
||||
making it difficult to think about alternate implementations of the
|
||||
Tendermint algorithm. Although we may want to foster additional
|
||||
Tendermint implementations in the future, this isn't an active goal
|
||||
in our current roadmap, and *not* merging these repos doesn't
|
||||
change the fact that the Go implementation of Tendermint is already the
|
||||
primary implementation.
|
||||
|
||||
### Neutral
|
||||
|
||||
N/A
|
||||
|
||||
## References
|
||||
|
||||
- https://github.com/tendermint/spec
|
||||
- https://github.com/tendermint/tendermint
|
||||
@@ -1,109 +0,0 @@
|
||||
# ADR 077: Configurable Block Retention
|
||||
|
||||
## Changelog
|
||||
|
||||
- 2020-03-23: Initial draft (@erikgrinaker)
|
||||
- 2020-03-25: Use local config for snapshot interval (@erikgrinaker)
|
||||
- 2020-03-31: Use ABCI commit response for block retention hint
|
||||
- 2020-04-02: Resolved open questions
|
||||
- 2021-02-11: Migrate to tendermint repo (Originally [RFC 001](https://github.com/tendermint/spec/pull/84))
|
||||
|
||||
## Author(s)
|
||||
|
||||
- Erik Grinaker (@erikgrinaker)
|
||||
|
||||
## Context
|
||||
|
||||
Currently, all Tendermint nodes contain the complete sequence of blocks from genesis up to some height (typically the latest chain height). This will no longer be true when the following features are released:
|
||||
|
||||
- [Block pruning](https://github.com/tendermint/tendermint/issues/3652): removes historical blocks and associated data (e.g. validator sets) up to some height, keeping only the most recent blocks.
|
||||
|
||||
- [State sync](https://github.com/tendermint/tendermint/issues/828): bootstraps a new node by syncing state machine snapshots at a given height, but not historical blocks and associated data.
|
||||
|
||||
To maintain the integrity of the chain, the use of these features must be coordinated such that necessary historical blocks will not become unavailable or lost forever. In particular:
|
||||
|
||||
- Some nodes should have complete block histories, for auditability, querying, and bootstrapping.
|
||||
|
||||
- The majority of nodes should retain blocks longer than the Cosmos SDK unbonding period, for light client verification.
|
||||
|
||||
- Some nodes must take and serve state sync snapshots with snapshot intervals less than the block retention periods, to allow new nodes to state sync and then replay blocks to catch up.
|
||||
|
||||
- Applications may not persist their state on commit, and require block replay on restart.
|
||||
|
||||
- Only a minority of nodes can be state synced within the unbonding period, for light client verification and to serve block histories for catch-up.
|
||||
|
||||
However, it is unclear if and how we should enforce this. It may not be possible to technically enforce all of these without knowing the state of the entire network, but it may also be unrealistic to expect this to be enforced entirely through social coordination. This is especially unfortunate since the consequences of misconfiguration can be permanent chain-wide data loss.
|
||||
|
||||
## Proposal
|
||||
|
||||
Add a new field `retain_height` to the ABCI `ResponseCommit` message:
|
||||
|
||||
```proto
service ABCIApplication {
  rpc Commit(RequestCommit) returns (ResponseCommit);
}

message RequestCommit {}

message ResponseCommit {
  // reserve 1
  bytes  data          = 2; // the Merkle root hash
  uint64 retain_height = 3; // the oldest block height to retain
}
```
|
||||
|
||||
Upon ABCI `Commit`, which finalizes execution of a block in the state machine, Tendermint removes all data for heights lower than `retain_height`. This allows the state machine to control block retention, which is preferable since only it can determine the significance of historical blocks. By default (i.e. with `retain_height=0`) all historical blocks are retained.
|
||||
|
||||
Removed data includes not only blocks, but also headers, commit info, consensus params, validator sets, and so on. In the first iteration this will be done synchronously, since the number of heights removed for each run is assumed to be small (often 1) in the typical case. It can be made asynchronous at a later time if this is shown to be necessary.
|
||||
|
||||
Since `retain_height` is dynamic, it is possible for it to refer to a height which has already been removed. For example, commit at height 100 may return `retain_height=90` while commit at height 101 may return `retain_height=80`. This is allowed, and will be ignored - it is the application's responsibility to return appropriate values.
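
As a rough sketch of these rules, the decision made after each commit might look like the following. The function `pruneRange` and the notion of a stored `base` height are assumptions made for illustration; they are not names taken from the Tendermint code base.

```go
package main

import "fmt"

// pruneRange reports the half-open range [from, to) of heights to remove
// after a commit. base is the lowest height currently stored and
// retainHeight is the value returned in ResponseCommit.
//
// A retainHeight of 0 (the default) or one at or below base removes nothing,
// which covers the case where the application returns a height that has
// already been pruned.
func pruneRange(base, retainHeight uint64) (from, to uint64, ok bool) {
	if retainHeight <= base {
		return 0, 0, false
	}
	return base, retainHeight, true
}

func main() {
	base := uint64(1)
	commits := []struct{ height, retain uint64 }{
		{100, 90}, // prunes heights below 90
		{101, 80}, // refers to heights already removed: ignored
		{102, 0},  // default: retain everything
		{103, 95}, // prunes heights 90 through 94
	}
	for _, c := range commits {
		if from, to, ok := pruneRange(base, c.retain); ok {
			fmt.Printf("commit %d: prune heights [%d, %d)\n", c.height, from, to)
			base = to
		} else {
			fmt.Printf("commit %d: nothing to prune\n", c.height)
		}
	}
}
```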
|
||||
|
||||
State sync will eventually support backfilling heights, via e.g. a snapshot metadata field `backfill_height`, but in the initial version it will have a fully truncated block history.
|
||||
|
||||
## Cosmos SDK Example
|
||||
|
||||
As an example, we'll consider how the Cosmos SDK might make use of this. The specific details should be discussed in a separate SDK proposal.
|
||||
|
||||
The returned `retain_height` would be the lowest height that satisfies:
|
||||
|
||||
- Unbonding time: the time interval in which validators can be economically punished for misbehavior. Blocks in this interval must be auditable e.g. by the light client.
|
||||
|
||||
- IAVL snapshot interval: the block interval at which the underlying IAVL database is persisted to disk, e.g. every 10000 heights. Blocks since the last IAVL snapshot must be available for replay on application restart.
|
||||
|
||||
- State sync snapshots: blocks since the _oldest_ available snapshot must be available for state sync nodes to catch up (oldest because a node may be restoring an old snapshot while a new snapshot was taken).
|
||||
|
||||
- Local config: archive nodes may want to retain more or all blocks, e.g. via a local config option `min-retain-blocks`. There may also be a need to vary retention for other nodes, e.g. sentry nodes which do not need historical blocks.
|
||||
|
||||

|
||||
|
||||
## Status
|
||||
|
||||
Accepted
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Application-specified block retention allows the application to take all relevant factors into account and prevent necessary blocks from being accidentally removed.
|
||||
|
||||
- Node operators can independently decide whether they want to provide complete block histories (if local configuration for this is provided) and snapshots.
|
||||
|
||||
### Negative
|
||||
|
||||
- Social coordination is required to run archival nodes; failure to do so may lead to permanent loss of historical blocks.
|
||||
|
||||
- Social coordination is required to run snapshot nodes; failure to do so may lead to inability to run state sync, and inability to bootstrap new nodes at all if no archival nodes are online.
|
||||
|
||||
### Neutral
|
||||
|
||||
- Reduced block retention requires application changes, and cannot be controlled directly in Tendermint.
|
||||
|
||||
- Application-specified block retention may set a lower bound on disk space requirements for all nodes.
|
||||
|
||||
## References
|
||||
|
||||
- State sync ADR: <https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-053-state-sync-prototype.md>
|
||||
|
||||
- State sync issue: <https://github.com/tendermint/tendermint/issues/828>
|
||||
|
||||
- Block pruning issue: <https://github.com/tendermint/tendermint/issues/3652>
|
||||
@@ -1,82 +0,0 @@
|
||||
# ADR 078: Non-Zero Genesis
|
||||
|
||||
## Changelog
|
||||
|
||||
- 2020-07-26: Initial draft (@erikgrinaker)
|
||||
- 2020-07-28: Use weak chain linking, i.e. `predecessor` field (@erikgrinaker)
|
||||
- 2020-07-31: Drop chain linking (@erikgrinaker)
|
||||
- 2020-08-03: Add `State.InitialHeight` (@erikgrinaker)
|
||||
- 2021-02-11: Migrate to tendermint repo (Originally [RFC 002](https://github.com/tendermint/spec/pull/119))
|
||||
|
||||
## Author(s)
|
||||
|
||||
- Erik Grinaker (@erikgrinaker)
|
||||
|
||||
## Context
|
||||
|
||||
The recommended upgrade path for block protocol-breaking upgrades is currently to hard fork the
|
||||
chain (see e.g. [`cosmoshub-3` upgrade](https://blog.cosmos.network/cosmos-hub-3-upgrade-announcement-39c9da941aee)).
|
||||
This is done by halting all validators at a predetermined height, exporting the application
|
||||
state via application-specific tooling, and creating an entirely new chain using the exported
|
||||
application state.
|
||||
|
||||
As far as Tendermint is concerned, the upgraded chain is a completely separate chain, with e.g.
|
||||
a new chain ID and genesis file. Notably, the new chain starts at height 1, and has none of the
|
||||
old chain's block history. This causes problems for integrators, e.g. coin exchanges and
|
||||
wallets, that assume a monotonically increasing height for a given blockchain. Users also find
|
||||
it confusing that a given height can now refer to distinct states depending on the chain
|
||||
version.
|
||||
|
||||
An ideal solution would be to always retain block backwards compatibility in such a way that chain
|
||||
history is never lost on upgrades. However, this may require a significant amount of engineering
|
||||
work that is not viable for the planned Stargate release (Tendermint 0.34), and may prove too
|
||||
restrictive for future development.
|
||||
|
||||
As a first step, allowing the new chain to start from an initial height specified in the genesis
|
||||
file would at least provide monotonically increasing heights. There was a proposal to include the
|
||||
last block header of the previous chain as well, but since the genesis file is not verified and
|
||||
hashed (only specific fields are) this would not be trustworthy.
|
||||
|
||||
External tooling will be required to map historical heights onto e.g. archive nodes that contain
|
||||
blocks from previous chain versions. Tendermint will not include any such functionality.
|
||||
|
||||
## Proposal
|
||||
|
||||
Tendermint will allow chains to start from an arbitrary initial height:
|
||||
|
||||
- A new field `initial_height` is added to the genesis file, defaulting to `1`. It can be set to any
|
||||
non-negative integer, and `0` is considered equivalent to `1`.
|
||||
|
||||
- A new field `InitialHeight` is added to the ABCI `RequestInitChain` message, with the same value
|
||||
and semantics as the genesis field.
|
||||
|
||||
- A new field `InitialHeight` is added to the `state.State` struct, where `0` is considered invalid.
|
||||
Including the field here simplifies implementation, since the genesis value does not have to be
|
||||
propagated throughout the code base separately, but it is not strictly necessary.
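
The defaulting and validation rules above can be sketched as follows. The types here are trimmed, hypothetical stand-ins for the genesis document and `state.State`, and treating a last block height of 0 as "no blocks yet" is an assumption of the sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// genesisDoc is a trimmed, hypothetical stand-in for the genesis file.
type genesisDoc struct {
	ChainID       string
	InitialHeight int64
}

// complete applies the genesis rule: negative values are rejected and 0 is
// treated as equivalent to 1.
func (g *genesisDoc) complete() error {
	if g.InitialHeight < 0 {
		return errors.New("initial_height cannot be negative")
	}
	if g.InitialHeight == 0 {
		g.InitialHeight = 1
	}
	return nil
}

// chainState is a trimmed, hypothetical stand-in for state.State.
type chainState struct {
	InitialHeight   int64
	LastBlockHeight int64 // 0 means no block has been committed yet
}

// nextHeight returns the height of the next block. Unlike the genesis file,
// an InitialHeight of 0 is invalid here.
func (s chainState) nextHeight() (int64, error) {
	if s.InitialHeight <= 0 {
		return 0, errors.New("state has an invalid initial height")
	}
	if s.LastBlockHeight == 0 {
		return s.InitialHeight, nil
	}
	return s.LastBlockHeight + 1, nil
}

func main() {
	g := &genesisDoc{ChainID: "demo-2", InitialHeight: 250000}
	if err := g.complete(); err != nil {
		fmt.Println("invalid genesis:", err)
		return
	}
	h, err := chainState{InitialHeight: g.InitialHeight}.nextHeight()
	fmt.Println(h, err)
}
```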
|
||||
|
||||
ABCI applications may have to be updated to handle arbitrary initial heights, otherwise the initial
|
||||
block may fail.
|
||||
|
||||
## Status
|
||||
|
||||
Accepted
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Heights can be unique throughout the history of a "logical" chain, across hard fork upgrades.
|
||||
|
||||
### Negative
|
||||
|
||||
- Upgrades still cause loss of block history.
|
||||
|
||||
- Integrators will have to map height ranges to specific archive nodes/networks to query history.
|
||||
|
||||
### Neutral
|
||||
|
||||
- There is no explicit link to the last block of the previous chain.
|
||||
|
||||
## References
|
||||
|
||||
- [#2543: Allow genesis file to start from non-zero height w/ prev block header](https://github.com/tendermint/tendermint/issues/2543)
|
||||
@@ -1,57 +0,0 @@
|
||||
# ADR 079: Ed25519 Verification
|
||||
|
||||
## Changelog
|
||||
|
||||
- 2020-08-21: Initial RFC
|
||||
- 2021-02-11: Migrate RFC to tendermint repo (Originally [RFC 003](https://github.com/tendermint/spec/pull/144))
|
||||
|
||||
## Author(s)
|
||||
|
||||
- Marko (@marbar3778)
|
||||
|
||||
## Context
|
||||
|
||||
Ed25519 keys are currently the only supported key type for Tendermint validators. Tendermint-Go wraps the ed25519 key implementation from the Go standard library. As more clients are implemented to communicate with the canonical Tendermint implementation (Tendermint-Go), different implementations of ed25519 will be used. Because [RFC 8032](https://www.rfc-editor.org/rfc/rfc8032.html) does not guarantee implementation compatibility, Tendermint clients must come to an agreement on how to guarantee it. [Zcash](https://z.cash/) has multiple implementations of its client and has identified this as a problem as well. The team at Zcash has made a proposal to address this issue, [Zcash improvement proposal 215](https://zips.z.cash/zip-0215).
|
||||
|
||||
## Proposal
|
||||
|
||||
- Tendermint-Go would adopt [hdevalence/ed25519consensus](https://github.com/hdevalence/ed25519consensus).
|
||||
- This library implements `ed25519.Verify()` in accordance with ZIP 215. Tendermint-Go will continue to use `crypto/ed25519` for signing and key generation.
|
||||
|
||||
- Tendermint-rs would adopt [ed25519-zebra](https://github.com/ZcashFoundation/ed25519-zebra)
|
||||
- related [issue](https://github.com/informalsystems/tendermint-rs/issues/355)
|
||||
|
||||
Signature verification is one of the major bottlenecks of Tendermint-Go. Batch verification cannot be used unless it follows the same consensus rules as individual verification; ZIP 215 makes verification safe in consensus-critical areas.
|
||||
|
||||
This change constitutes a breaking change and therefore must be done in a major release. No changes to validator keys or operations will be needed for this change to be enabled.
|
||||
|
||||
This change has no impact on signature aggregation. To enable signature aggregation, Tendermint will have to use a different signature scheme (Schnorr, BLS, ...). Secondly, this change will enable safe batch verification for the Tendermint-Go client. Batch verification for the Rust client is already supported in the library being used.
|
||||
|
||||
As part of accepting this proposal, it would be best to engage a third party to conduct a security review of the Go library.
|
||||
|
||||
## Status
|
||||
|
||||
Proposed
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Consistent signature verification across implementations
|
||||
- Enable safe batch verification
|
||||
|
||||
### Negative
|
||||
|
||||
#### Tendermint-Go
|
||||
|
||||
- Third-party dependency
|
||||
- library has not gone through a security review.
|
||||
- unclear maintenance schedule
|
||||
- Fragmentation of ed25519 key handling in the Go implementation: verification is done using a third-party library while the rest
|
||||
uses the Go standard library
|
||||
|
||||
### Neutral
|
||||
|
||||
## References
|
||||
|
||||
[It's 255:19AM. Do you know what your validation criteria are?](https://hdevalence.ca/blog/2020-10-04-its-25519am)
|
||||
@@ -1,203 +0,0 @@
|
||||
# ADR 080: ReverseSync - fetching historical data
|
||||
|
||||
## Changelog
|
||||
|
||||
- 2021-02-11: Migrate to tendermint repo (Originally [RFC 005](https://github.com/tendermint/spec/pull/224))
|
||||
- 2021-04-19: Use P2P to gossip necessary data for reverse sync.
|
||||
- 2021-03-03: Simplify proposal to the state sync case.
|
||||
- 2021-02-17: Add notes on asynchronicity of processes.
|
||||
- 2020-12-10: Rename backfill blocks to reverse sync.
|
||||
- 2020-11-25: Initial draft.
|
||||
|
||||
## Author(s)
|
||||
|
||||
- Callum Waters (@cmwaters)
|
||||
|
||||
## Context
|
||||
|
||||
Two new features: [Block pruning](https://github.com/tendermint/tendermint/issues/3652)
|
||||
and [State sync](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-042-state-sync.md)
|
||||
meant nodes no longer needed a complete history of the blockchain. This
|
||||
introduced some challenges of its own which were covered and subsequently
|
||||
tackled with [RFC-001](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-077-block-retention.md).
|
||||
The RFC allowed applications to set a block retention height; an upper bound on
|
||||
what blocks would be pruned. However, nodes that state sync past this upper bound
|
||||
(which is necessary as snapshots must be saved within the trusting period for
|
||||
the assisting light client to verify) have no means of backfilling the blocks
|
||||
to meet the retention limit. This could be a problem as nodes that state sync and
|
||||
then eventually switch to consensus (or fast sync) may not have the block and
|
||||
validator history to verify evidence, causing them to panic if they see a 2/3
|
||||
commit on what the node believes to be an invalid block.
|
||||
|
||||
Thus, this RFC sets out to instil a minimum block history invariant amongst
|
||||
honest nodes.
|
||||
|
||||
## Proposal
|
||||
|
||||
A backfill mechanism can simply be defined as an algorithm for fetching,
|
||||
verifying and storing, headers and validator sets of a height prior to the
|
||||
current base of the node's blockchain. In matching the terminology used for
|
||||
other data retrieving protocols (i.e. fast sync and state sync), we
|
||||
call this method **ReverseSync**.
|
||||
|
||||
We will define the mechanism in four sections:
|
||||
|
||||
- Usage
|
||||
- Design
|
||||
- Verification
|
||||
- Termination
|
||||
|
||||
### Usage
|
||||
|
||||
For now, we focus purely on the case of a state syncing node, which, after
|
||||
syncing to a height will need to verify historical data in order to be capable
|
||||
of processing new blocks. We can denote the earliest height that the node will
|
||||
need to verify and store in order to be able to verify any evidence that might
|
||||
arise as the `max_historical_height`/`time`. Both height and time are necessary
|
||||
as this maps to the BFT time used for evidence expiration. After acquiring
|
||||
`State`, we calculate these parameters as:
|
||||
|
||||
```go
|
||||
max_historical_height = max(state.InitialHeight, state.LastBlockHeight - state.ConsensusParams.EvidenceAgeHeight)
|
||||
max_historical_time = max(GenesisTime, state.LastBlockTime.Sub(state.ConsensusParams.EvidenceAgeTime))
|
||||
```
|
||||
|
||||
Before starting either fast sync or consensus, we then run the following
|
||||
synchronous process:
|
||||
|
||||
```go
|
||||
func ReverseSync(max_historical_height int64, max_historical_time time.Time) error
|
||||
```
|
||||
|
||||
Where we fetch and verify blocks until a block `A` where
|
||||
`A.Height <= max_historical_height` and `A.Time <= max_historical_time`.
|
||||
|
||||
Upon successfully reverse syncing, a node can now safely continue. As this
|
||||
feature is only used as part of state sync, one can think of this as merely an
|
||||
extension to it.
|
||||
|
||||
In the future we may want to extend this functionality to allow nodes to fetch
|
||||
historical blocks for reasons of accountability or data accessibility.
|
||||
|
||||
### Design
|
||||
|
||||
This section will provide a high level overview of some of the more important
|
||||
characteristics of the design, saving the more tedious details for an ADR.
|
||||
|
||||
#### P2P
|
||||
|
||||
Implementation of this RFC will require the addition of a new channel and two
|
||||
new messages.
|
||||
|
||||
```proto
|
||||
message LightBlockRequest {
|
||||
uint64 height = 1;
|
||||
}
|
||||
```
|
||||
|
||||
```proto
|
||||
message LightBlockResponse {
|
||||
Header header = 1;
|
||||
Commit commit = 2;
|
||||
ValidatorSet validator_set = 3;
|
||||
}
|
||||
```
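
How a peer might answer the request can be sketched in Go. This is an illustration only: the struct mirrors below replace the real `Header`, `Commit` and `ValidatorSet` types with strings, and the `store` type and its `handle` method are hypothetical names. The RFC does not say how a peer signals that it lacks a height; the boolean return is an assumption made for this sketch.

```go
package main

import "fmt"

// lightBlockRequest mirrors the proposed LightBlockRequest message.
type lightBlockRequest struct {
	Height uint64
}

// lightBlockResponse mirrors the proposed LightBlockResponse message.
// The three fields are plain strings here purely for illustration.
type lightBlockResponse struct {
	Header       string
	Commit       string
	ValidatorSet string
}

// store is a hypothetical in-memory index of the light blocks a peer holds.
type store map[uint64]lightBlockResponse

// handle looks up the requested height. The second return value is false
// when the peer does not hold that height (an assumed convention).
func (s store) handle(req lightBlockRequest) (lightBlockResponse, bool) {
	resp, ok := s[req.Height]
	return resp, ok
}

// exampleStore returns a peer that only holds height 7.
func exampleStore() store {
	return store{
		7: {Header: "header-7", Commit: "commit-7", ValidatorSet: "validators-7"},
	}
}

func main() {
	s := exampleStore()
	resp, ok := s.handle(lightBlockRequest{Height: 7})
	fmt.Println(resp.Header, ok)
	_, ok = s.handle(lightBlockRequest{Height: 3})
	fmt.Println("height 3 available:", ok)
}
```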
|
||||
|
||||
The P2P path may also enable P2P networked light clients and a state sync that
|
||||
also doesn't need to rely on RPC.
|
||||
|
||||
### Verification
|
||||
|
||||
ReverseSync is used to fetch the following data structures:
|
||||
|
||||
- `Header`
|
||||
- `Commit`
|
||||
- `ValidatorSet`
|
||||
|
||||
Nodes will also need to be able to verify these. This can be achieved by first
|
||||
retrieving the header at the base height from the block store. From this trusted
|
||||
header, the node hashes each of the three data structures and checks that they are correct.
|
||||
|
||||
1. The trusted header's last block ID matches the hash of the new header
|
||||
|
||||
```go
|
||||
header[height].LastBlockID == hash(header[height-1])
|
||||
```
|
||||
|
||||
2. The trusted header's last commit hash matches the hash of the new commit
|
||||
|
||||
```go
|
||||
header[height].LastCommitHash == hash(commit[height-1])
|
||||
```
|
||||
|
||||
3. Given that the node now trusts the new header, check that the header's validator set
|
||||
hash matches the hash of the validator set
|
||||
|
||||
```go
|
||||
header[height-1].ValidatorsHash == hash(validatorSet[height-1])
|
||||
```
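
The three checks above can be combined into a single backward verification step. The sketch below is a simplification under stated assumptions: headers, commits and validator sets are reduced to byte slices hashed with SHA-256, whereas the real types use their own hashing, and `header`, `lightBlock`, `hashHeader` and `verifyPrevious` are hypothetical names.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"errors"
	"fmt"
)

// header is a simplified stand-in for a block header.
type header struct {
	Height         int64
	LastBlockID    []byte // hash of the previous header
	LastCommitHash []byte // hash of the previous commit
	ValidatorsHash []byte // hash of this height's validator set
	Payload        []byte // stands in for all remaining header fields
}

// lightBlock bundles the three structures fetched for one height.
type lightBlock struct {
	Header       header
	Commit       []byte
	ValidatorSet []byte
}

func hashBytes(b []byte) []byte {
	sum := sha256.Sum256(b)
	return sum[:]
}

// hashHeader hashes a simple textual encoding of every header field.
func hashHeader(h header) []byte {
	var buf bytes.Buffer
	fmt.Fprintf(&buf, "%d|%x|%x|%x|%x", h.Height, h.LastBlockID,
		h.LastCommitHash, h.ValidatorsHash, h.Payload)
	return hashBytes(buf.Bytes())
}

// verifyPrevious applies checks 1 to 3 to the block directly below trusted.
func verifyPrevious(trusted header, prev lightBlock) error {
	if prev.Header.Height != trusted.Height-1 {
		return errors.New("unexpected height")
	}
	if !bytes.Equal(trusted.LastBlockID, hashHeader(prev.Header)) {
		return errors.New("header hash does not match last block ID")
	}
	if !bytes.Equal(trusted.LastCommitHash, hashBytes(prev.Commit)) {
		return errors.New("commit hash does not match last commit hash")
	}
	if !bytes.Equal(prev.Header.ValidatorsHash, hashBytes(prev.ValidatorSet)) {
		return errors.New("validator set hash does not match header")
	}
	return nil
}

// exampleChain builds a trusted header at height 10 and a consistent
// light block at height 9.
func exampleChain() (header, lightBlock) {
	prev := lightBlock{
		Header:       header{Height: 9, Payload: []byte("block 9")},
		Commit:       []byte("commit for height 9"),
		ValidatorSet: []byte("validators at height 9"),
	}
	prev.Header.ValidatorsHash = hashBytes(prev.ValidatorSet)
	trusted := header{
		Height:         10,
		LastBlockID:    hashHeader(prev.Header),
		LastCommitHash: hashBytes(prev.Commit),
		Payload:        []byte("block 10"),
	}
	return trusted, prev
}

func main() {
	trusted, prev := exampleChain()
	fmt.Println("verification error:", verifyPrevious(trusted, prev))
}
```

Once a block passes, its header becomes the trusted header for the next step down, so trust propagates backward one height at a time from the base of the block store.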
|
||||
|
||||
### Termination
|
||||
|
||||
ReverseSync draws a lot of parallels with fast sync. An important consideration
|
||||
for fast sync that also extends to ReverseSync is termination. ReverseSync will
|
||||
finish its task when one of the following conditions has been met:
|
||||
|
||||
1. It reaches a block `A` where `A.Height <= max_historical_height` and
|
||||
`A.Time <= max_historical_time`.
|
||||
2. None of its peers reports having the block at the height below the
|
||||
process's current block.
|
||||
3. A global timeout.
|
||||
|
||||
This implies that we can't guarantee adequate history and thus the term
|
||||
"invariant" can't be used in the strictest sense. In the case that the first
|
||||
condition isn't met, the node will log an error and optimistically attempt
|
||||
to continue with either fast sync or consensus.
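
The three termination conditions can be sketched as a single loop. The names here (`block`, `fetchFunc`, `reverseSync`, `runExample`) are hypothetical, the global timeout is modelled with a `context.Context`, and verification of each fetched block is left out to keep the focus on termination.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// block carries only the fields the termination logic needs.
type block struct {
	Height int64
	Time   time.Time
}

// fetchFunc is a hypothetical peer query. It returns false when no peer
// reports having the block at the given height.
type fetchFunc func(height int64) (block, bool)

var (
	errNoPeers = errors.New("no peer has the next block")
	errTimeout = errors.New("reverse sync timed out")
)

// reverseSync walks backward from base until a termination condition holds.
// It returns the lowest block reached and a non-nil error when condition 1
// was not the reason for stopping.
func reverseSync(ctx context.Context, base block, maxHeight int64,
	maxTime time.Time, fetch fetchFunc) (block, error) {
	current := base
	for {
		// Condition 1: both the height and the time bound are satisfied.
		if current.Height <= maxHeight && !current.Time.After(maxTime) {
			return current, nil
		}
		// Condition 3: the global timeout (or a cancellation) has fired.
		if ctx.Err() != nil {
			return current, errTimeout
		}
		// Condition 2: no peer can serve the next block down.
		next, ok := fetch(current.Height - 1)
		if !ok {
			return current, errNoPeers
		}
		current = next // verification of next is omitted in this sketch
	}
}

// runExample syncs down from height 100 towards height 90 against fake
// peers that only hold blocks at or above lowest.
func runExample(lowest int64, cancelled bool) (int64, error) {
	t0 := time.Unix(0, 0)
	at := func(h int64) block {
		return block{Height: h, Time: t0.Add(time.Duration(h) * time.Second)}
	}
	fetch := func(h int64) (block, bool) {
		if h < lowest {
			return block{}, false
		}
		return at(h), true
	}
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if cancelled {
		cancel() // simulate the timeout having already expired
	}
	last, err := reverseSync(ctx, at(100), 90, at(90).Time, fetch)
	return last.Height, err
}

func main() {
	h, err := runExample(1, false)
	fmt.Println(h, err)
	h, err = runExample(95, false)
	fmt.Println(h, err)
}
```

A caller following the RFC would log the error returned in the second and third cases and still proceed to fast sync or consensus, which is why the function returns the lowest block reached alongside the error.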
|
||||
|
||||
## Alternative Solutions
|
||||
|
||||
The need for a minimum block history invariant stems purely from the need to
|
||||
validate evidence (although there may be some application relevant needs as
|
||||
well). Because of this, an alternative could be to simply trust whatever the
|
||||
2/3+ majority has agreed upon and in the case where a node is at the head of the
|
||||
blockchain, you simply abstain from voting.
|
||||
|
||||
As it stands, if 2/3+ vote on evidence a node can't verify, then, just as when
|
||||
2/3+ vote on a header that a node sees as invalid (perhaps due to a different
|
||||
app hash), the node will halt.
|
||||
|
||||
Another alternative is the method with which the relevant data is retrieved.
|
||||
Instead of introducing new messages to the P2P layer, RPC could have been used
|
||||
instead.
|
||||
|
||||
The aforementioned data is already available via the following RPC endpoints:
|
||||
`/commit` for `Header`s and `/validators` for `ValidatorSet`s. It was
|
||||
decided predominantly due to the instability of the current RPC infrastructure
|
||||
that P2P be used instead.
|
||||
|
||||
## Status
|
||||
|
||||
Proposed
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- Ensures a minimum block history invariant for honest nodes. This will allow
|
||||
nodes to verify evidence.
|
||||
|
||||
### Negative
|
||||
|
||||
- Statesync will be slower as more processing is required.
|
||||
|
||||
### Neutral
|
||||
|
||||
- By having validator sets served through p2p, this would make it easier to
|
||||
extend p2p support to light clients and state sync.
|
||||
- In the future, it may also be possible to extend this feature to allow for
|
||||
nodes to freely fetch and verify prior blocks.
|
||||
|
||||
## References
|
||||
|
||||
- [RFC-001: Block retention](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-077-block-retention.md)
|
||||
- [Original issue](https://github.com/tendermint/tendermint/issues/4629)
|
||||
@@ -1,201 +0,0 @@
|
||||
# ADR 081: Protocol Buffers Management
|
||||
|
||||
## Changelog
|
||||
|
||||
- 2022-02-28: First draft
|
||||
|
||||
## Status
|
||||
|
||||
Accepted
|
||||
|
||||
[Tracking issue](https://github.com/tendermint/tendermint/issues/8121)
|
||||
|
||||
## Context
|
||||
|
||||
At present, we manage the [Protocol Buffers] schema files ("protos") that define
|
||||
our wire-level data formats within the Tendermint repository itself (see the
|
||||
[`proto`](../../proto/) directory). Recently, we have been making use of [Buf],
|
||||
both locally and in CI, in order to generate Go stubs, and lint and check
|
||||
`.proto` files for breaking changes.
|
||||
|
||||
The version of Buf used at the time of this decision was `v1beta1`, and it was
|
||||
discussed in [\#7975] and in weekly calls as to whether we should upgrade to
|
||||
`v1` and harmonize our approach with that used by the Cosmos SDK. The team
|
||||
managing the Cosmos SDK was primarily interested in having our protos versioned
|
||||
and easily accessible from the [Buf] registry.
|
||||
|
||||
The three main sets of stakeholders for the `.proto` files and their needs, as
|
||||
currently understood, are as follows.
|
||||
|
||||
1. Tendermint needs Go code generated from `.proto` files.
|
||||
2. Consumers of Tendermint's `.proto` files, specifically projects that want to
|
||||
interoperate with Tendermint and need to generate code for their own
|
||||
programming language, want to be able to access these files in a reliable and
|
||||
efficient way.
|
||||
3. The Tendermint Core team wants to provide stable interfaces that are as easy
|
||||
as possible to maintain, on which consumers can depend, and to be able to
|
||||
notify those consumers promptly when those interfaces change. To this end, we
|
||||
want to:
|
||||
1. Prevent any breaking changes from being introduced in minor/patch releases
|
||||
of Tendermint. Only major version updates should be able to contain
|
||||
breaking interface changes.
|
||||
2. Prevent generated code from diverging from the Protobuf schema files.
|
||||
|
||||
There was also discussion surrounding the notion of automated documentation
|
||||
generation and hosting, but it is not clear at this time whether this would be
|
||||
that valuable to any of our stakeholders. What will, of course, be valuable at
|
||||
minimum would be better documentation (in comments) of the `.proto` files
|
||||
themselves.
|
||||
|
||||
## Alternative Approaches
|
||||
|
||||
### Meeting stakeholders' needs
|
||||
|
||||
1. Go stub generation from protos. We could use:
|
||||
1. [Buf]. This approach has been rather cumbersome up to this point, and it
|
||||
is not clear what Buf really provides beyond that which `protoc` provides
|
||||
to justify the additional complexity in configuring Buf for stub
|
||||
generation.
|
||||
2. [protoc] - the Protocol Buffers compiler.
|
||||
2. Notification of breaking changes:
|
||||
1. Buf in CI for all pull requests to *release* branches only (and not on
|
||||
`master`).
|
||||
2. Buf in CI on every pull request to every branch (this was the case at the
|
||||
time of this decision, and the team decided that the signal-to-noise ratio
|
||||
for this approach was too low to be of value).
|
||||
3. `.proto` linting:
|
||||
1. Buf in CI on every pull request
|
||||
4. `.proto` formatting:
|
||||
1. [clang-format] locally and a [clang-format GitHub Action] in CI to check
|
||||
that files are formatted properly on every pull request.
|
||||
5. Sharing of `.proto` files in a versioned, reliable manner:
|
||||
1. Consumers could simply clone the Tendermint repository, check out a
|
||||
specific commit, tag or branch and manually copy out all of the `.proto`
|
||||
files they need. This requires no effort from the Tendermint Core team and
|
||||
will continue to be an option for consumers. The drawback of this approach
|
||||
is that it requires manual coding/scripting to implement and is brittle in
|
||||
the face of bigger changes.
|
||||
2. Uploading our `.proto` files to Buf's registry on every release. This is
|
||||
by far the most seamless for consumers of our `.proto` files, but requires
|
||||
the dependency on Buf. This has the additional benefit that the Buf
|
||||
registry will automatically [generate and host
|
||||
documentation][buf-docs-gen] for these protos.
|
||||
3. We could create a process that, upon release, creates a `.zip` file
|
||||
containing our `.proto` files.
|
||||
|
||||
### Popular alternatives to Buf
|
||||
|
||||
[Prototool] was not considered as it appears deprecated, and the ecosystem seems
|
||||
to be converging on Buf at this time.
|
||||
|
||||
### Tooling complexity
|
||||
|
||||
The more tools we have in our build/CI processes, the more complex and fragile
|
||||
repository/CI management becomes, and the longer it takes to onboard new team
|
||||
members. Maintainability is a core concern here.
|
||||
|
||||
### Buf sustainability and costs
|
||||
|
||||
One of the primary considerations regarding the usage of Buf is whether, for
|
||||
example, access to its registry will eventually become a
|
||||
paid-for/subscription-based service and whether this is valuable enough for us
|
||||
and the ecosystem to pay for such a service. At this time, it appears as though
|
||||
Buf will never charge for hosting open source projects' protos.
|
||||
|
||||
Another consideration was Buf's sustainability as a project - what happens when
|
||||
their resources run out? Will there be a strong and broad enough open source
|
||||
community to continue maintaining it?
|
||||
|
||||
### Local Buf usage options
|
||||
|
||||
Local usage of Buf (i.e. not in CI) can be accomplished in two ways:
|
||||
|
||||
1. Installing the relevant tools individually.
|
||||
2. By way of its [Docker image][buf-docker].
|
||||
|
||||
Local installation of Buf requires developers to manually keep their toolchains
|
||||
up-to-date. The Docker option comes with a number of complexities, including
|
||||
how the file system permissions of code generated by a Docker container differ
|
||||
between platforms (e.g. on Linux, Buf-generated code ends up being owned by
|
||||
`root`).
|
||||
|
||||
The trouble with the Docker-based approach is that we make use of the
|
||||
[gogoprotobuf] plugin for `protoc`. Continuing to use the Docker-based approach
|
||||
to using Buf will mean that we will have to continue building our own custom
|
||||
Docker image with embedded gogoprotobuf.
|
||||
|
||||
Along these lines, we could eventually consider coming up with a [Nix]- or
|
||||
[redo]-based approach to developer tooling to ensure tooling consistency across
|
||||
the team and for anyone who wants to be able to contribute to Tendermint.
|
||||
|
||||
## Decision
|
||||
|
||||
1. We will adopt Buf for now for proto generation, linting, breakage checking
|
||||
and its registry (mainly in CI, with optional usage locally).
|
||||
2. Failing CI when checking for breaking changes in `.proto` files will only
|
||||
happen when performing minor/patch releases.
|
||||
3. Local tooling will be favored over Docker-based tooling.
|
||||
|
||||
## Detailed Design
|
||||
|
||||
We currently aim to:
|
||||
|
||||
1. Update to Buf `v1` to facilitate linting, breakage checking and uploading to
|
||||
the Buf registry.
|
||||
2. Configure CI appropriately for proto management:
|
||||
1. Uploading protos to the Buf registry on every release (e.g. the
|
||||
[approach][cosmos-sdk-buf-registry-ci] used by the Cosmos SDK).
|
||||
2. Linting on every pull request (e.g. the
|
||||
[approach][cosmos-sdk-buf-linting-ci] used by the Cosmos SDK). The linter
|
||||
passing should be considered a requirement for accepting PRs.
|
||||
3. Checking for breaking changes in minor/patch version releases and failing
|
||||
CI accordingly - see [\#8003].
|
||||
4. Add [clang-format GitHub Action] to check `.proto` file formatting. Format
|
||||
checking should be considered a requirement for accepting PRs.
|
||||
3. Update the Tendermint [`Makefile`](../../Makefile) to primarily facilitate
|
||||
local Protobuf stub generation, linting, formatting and breaking change
|
||||
checking. More specifically:
|
||||
1. This includes removing the dependency on Docker and introducing the
|
||||
dependency on local toolchain installation. CI-based equivalents, where
|
||||
relevant, will rely on specific GitHub Actions instead of the Makefile.
|
||||
2. Go code generation will rely on `protoc` directly.
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
|
||||
- We will still offer Go stub generation, proto linting and breakage checking.
|
||||
- Breakage checking will only happen on minor/patch releases to increase the
|
||||
signal-to-noise ratio in CI.
|
||||
- Versioned protos will be made available via Buf's registry upon every release.
|
||||
|
||||
### Negative
|
||||
|
||||
- Developers/contributors will need to install the relevant Protocol
|
||||
Buffers-related tooling (Buf, gogoprotobuf, clang-format) locally in order to
|
||||
build, lint, format and check `.proto` files for breaking changes.
|
||||
|
||||
### Neutral
|
||||
|
||||
## References
|
||||
|
||||
- [Protocol Buffers]
|
||||
- [Buf]
|
||||
- [\#7975]
|
||||
- [protoc] - The Protocol Buffers compiler
|
||||
|
||||
[Protocol Buffers]: https://developers.google.com/protocol-buffers
|
||||
[Buf]: https://buf.build/
|
||||
[\#7975]: https://github.com/tendermint/tendermint/pull/7975
|
||||
[protoc]: https://github.com/protocolbuffers/protobuf
|
||||
[clang-format]: https://clang.llvm.org/docs/ClangFormat.html
|
||||
[clang-format GitHub Action]: https://github.com/marketplace/actions/clang-format-github-action
|
||||
[buf-docker]: https://hub.docker.com/r/bufbuild/buf
|
||||
[cosmos-sdk-buf-registry-ci]: https://github.com/cosmos/cosmos-sdk/blob/e6571906043b6751951a42b6546431b1c38b05bd/.github/workflows/proto-registry.yml
|
||||
[cosmos-sdk-buf-linting-ci]: https://github.com/cosmos/cosmos-sdk/blob/e6571906043b6751951a42b6546431b1c38b05bd/.github/workflows/proto.yml#L15
|
||||
[\#8003]: https://github.com/tendermint/tendermint/issues/8003
|
||||
[Nix]: https://nixos.org/
|
||||
[gogoprotobuf]: https://github.com/gogo/protobuf
|
||||
[Prototool]: https://github.com/uber/prototool
|
||||
[buf-docs-gen]: https://docs.buf.build/bsr/documentation
|
||||
[redo]: https://redo.readthedocs.io/en/latest/
|
||||
@@ -1,101 +0,0 @@
|
||||
# ADR {ADR-NUMBER}: {TITLE}
|
||||
|
||||
## Changelog
|
||||
|
||||
- {date}: {changelog}
|
||||
|
||||
## Status
|
||||
|
||||
> An architecture decision is considered "proposed" when a PR containing the ADR
|
||||
> is submitted. When merged, an ADR must have a status associated with it, which
|
||||
> must be one of: "Accepted", "Rejected", "Deprecated" or "Superseded".
|
||||
>
|
||||
> An accepted ADR's implementation status must be tracked via a tracking issue,
|
||||
> milestone or project board (only one of these is necessary). For example:
|
||||
>
|
||||
> Accepted
|
||||
>
|
||||
> [Tracking issue](https://github.com/tendermint/tendermint/issues/123)
|
||||
> [Milestone](https://github.com/tendermint/tendermint/milestones/123)
|
||||
> [Project board](https://github.com/orgs/tendermint/projects/123)
|
||||
>
|
||||
> Rejected ADRs are captured as a record of recommendations that we specifically
|
||||
> do not (and possibly never) want to implement. The ADR itself must, for
|
||||
> posterity, include reasoning as to why it was rejected.
|
||||
>
|
||||
> If an ADR is deprecated, simply write "Deprecated" in this section. If an ADR
|
||||
> is superseded by one or more other ADRs, provide a local reference to those
|
||||
> ADRs, e.g.:
|
||||
>
|
||||
> Superseded by [ADR 123](./adr-123.md)
|
||||
|
||||
Accepted | Rejected | Deprecated | Superseded by
|
||||
|
||||
## Context
|
||||
|
||||
> This section contains all the context one needs to understand the current state,
|
||||
> and why there is a problem. It should be as succinct as possible and introduce
|
||||
> the high level idea behind the solution.
|
||||
|
||||
## Alternative Approaches
|
||||
|
||||
> This section contains information around alternative options that are considered
|
||||
> before making a decision. It should contain an explanation on why the alternative
|
||||
> approach(es) were not chosen.
|
||||
|
||||
## Decision
|
||||
|
||||
> This section records the decision that was made.
|
||||
> It is best to record as much info as possible from the discussion that happened.
|
||||
> This aids in not having to go back to the Pull Request to get the needed information.
|
||||
|
||||
## Detailed Design
|
||||
|
||||
> This section does not need to be filled in at the start of the ADR, but must
|
||||
> be completed prior to the merging of the implementation.
|
||||
>
|
||||
> Here are some common questions that get answered as part of the detailed design:
|
||||
>
|
||||
> - What are the user requirements?
|
||||
>
|
||||
> - What systems will be affected?
|
||||
>
|
||||
> - What new data structures are needed, what data structures will be changed?
|
||||
>
|
||||
> - What new APIs will be needed, what APIs will be changed?
|
||||
>
|
||||
> - What are the efficiency considerations (time/space)?
|
||||
>
|
||||
> - What are the expected access patterns (load/throughput)?
|
||||
>
|
||||
> - Are there any logging, monitoring or observability needs?
|
||||
>
|
||||
> - Are there any security considerations?
|
||||
>
|
||||
> - Are there any privacy considerations?
|
||||
>
|
||||
> - How will the changes be tested?
|
||||
>
|
||||
> - If the change is large, how will the changes be broken up for ease of review?
|
||||
>
|
||||
> - Will these changes require a breaking (major) release?
|
||||
>
|
||||
> - Does this change require coordination with the SDK or other?
|
||||
|
||||
## Consequences
|
||||
|
||||
> This section describes the consequences, after applying the decision. All
|
||||
> consequences should be summarized here, not just the "positive" ones.
|
||||
|
||||
### Positive
|
||||
|
||||
### Negative
|
||||
|
||||
### Neutral
|
||||
|
||||
## References
|
||||
|
||||
> Are there any relevant PR comments, issues that led up to this, or articles
|
||||
> referenced for why we made the given design choice? If so link them here!
|
||||
|
||||
- {reference link}
|
||||
|
@@ -35,7 +35,7 @@ little overview what they do.
|
||||
- `abci-client` As mentioned in [Application Architecture Guide](../app-dev/app-architecture.md), Tendermint acts as an ABCI
|
||||
client with respect to the application and maintains 3 connections:
|
||||
mempool, consensus and query. The code used by Tendermint Core can
|
||||
be found [here](https://github.com/tendermint/tendermint/tree/master/abci/client).
|
||||
be found [here](https://github.com/tendermint/tendermint/blob/v0.36.x/abci/client).
|
||||
- `blockchain` Provides storage, pool (a group of peers), and reactor
|
||||
for both storing and exchanging blocks between peers.
|
||||
- `consensus` The heart of Tendermint core, which is the
|
||||
@@ -43,16 +43,16 @@ little overview what they do.
|
||||
"submodules": `wal` (write-ahead logging) for ensuring data
|
||||
integrity and `replay` to replay blocks and messages on recovery
|
||||
from a crash.
|
||||
[here](https://github.com/tendermint/tendermint/blob/master/types/events.go).
|
||||
[here](https://github.com/tendermint/tendermint/blob/v0.36.x/types/events.go).
|
||||
You can subscribe to them by calling `subscribe` RPC method. Refer
|
||||
to [RPC docs](../tendermint-core/rpc.md) for additional information.
|
||||
- `mempool` Mempool module handles all incoming transactions, whether
|
||||
they are coming from peers or the application.
|
||||
- `p2p` Provides an abstraction around peer-to-peer communication. For
|
||||
more details, please check out the
|
||||
[README](https://github.com/tendermint/tendermint/tree/master/spec/p2p).
|
||||
[README](https://github.com/tendermint/tendermint/blob/v0.36.x/spec/p2p).
|
||||
- `rpc-server` RPC server. For implementation details, please read the
|
||||
[doc.go](https://github.com/tendermint/tendermint/blob/master/rpc/jsonrpc/doc.go).
|
||||
[doc.go](https://github.com/tendermint/tendermint/blob/v0.36.x/rpc/jsonrpc/doc.go).
|
||||
- `state` Represents the latest state and execution submodule, which
|
||||
executes blocks against the application.
|
||||
- `statesync` Provides a way to quickly sync a node with pruned history.
|
||||
|
||||