Commit Graph

182 Commits

Author SHA1 Message Date
Botond Dénes
0c381572fd repair::row_level: pin table for local reads
The repair reader depends on the table object being alive, while it is
reading. However, for local reads, there was no synchronization between
the lifecycle of the repair reader and that of the table. In some cases
this can result in use-after-free. Solve by using the table's existing
mechanism for lifecycle extension: `read_in_progress()`.

For the non-local reader, when the local node's shard configuration is
different from the remote one's, this problem is already solved, as the
multishard streaming reader already pins table objects on the used
shards. This creates an inconsistency that might be surprising (in a bad
way): one reader takes care of pinning the needed resources while the
other one doesn't. I was torn on how to reconcile this, and decided to
go with the simplest solution: explicitly pinning the table for local
reads, i.e. preserving the inconsistency. It was suggested that this
inconsistency be remedied by building resource pinning into the local
reader as well [1], but there is opposition to this [2]. Adding a wrapper
reader which does just the resource pinning seems excessive, both in
code and runtime overhead.
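
For illustration, a minimal self-contained model of the pinning idea (the
types below are simplified stand-ins; the real read_in_progress() returns
the table's lifecycle-extension object rather than a shared_ptr):

    #include <cassert>
    #include <memory>

    class table {
        std::shared_ptr<int> _alive = std::make_shared<int>(0); // stand-in marker
    public:
        using read_guard = std::shared_ptr<int>;
        // Returns an object that keeps the table pinned while it is held.
        read_guard read_in_progress() { return _alive; }
        long ongoing_reads() const { return _alive.use_count() - 1; }
    };

    class local_repair_reader {
        table::read_guard _pin; // extends the table's lifecycle for this read
    public:
        explicit local_repair_reader(table& t) : _pin(t.read_in_progress()) {}
    };

    int main() {
        table t;
        {
            local_repair_reader reader(t);
            assert(t.ongoing_reads() == 1); // table is pinned while the reader lives
        }
        assert(t.ongoing_reads() == 0);     // pin released together with the reader
    }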

Spotted while investigating repair-related crashes which occurred during
interrupted repairs.

Fixes: #4342

[1] https://github.com/scylladb/scylla/issues/4342#issuecomment-474271050
[2] https://github.com/scylladb/scylla/issues/4342#issuecomment-474331657

Tests: none, this is a trivial fix for a not-yet-seen-in-the-wild bug.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <8e84ece8343468960d4e161467ecd9bb10870c27.1553072505.git.bdenes@scylladb.com>
2019-03-20 14:45:22 +02:00
Asias He
a949ccee82 repair: Reject combination of -dc and -hosts options
4 nodes in the cluster
n1, n2 in dc1
n3, n4 in dc2

dc1 RF=2, dc2 RF=2.

If we run

    nodetool repair -hosts 127.0.0.1,127.0.0.3 -dc "dc1,dc2" multi

on n1, the -hosts option will be ignored and only the -dc option will
be used to choose which hosts to repair with. In this case, n1 to n4
will be repaired.

If the user wants to select specific hosts to repair with, there is no
need to specify the -dc option; using the -hosts option is enough.

Reject the combination so as not to surprise the user.

In https://issues.apache.org/jira/browse/CASSANDRA-9876, the same logic
is introduced as well.
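
For illustration, a hedged sketch of the rejection (the option struct and
function names here are hypothetical; the real check lives in the repair
option parsing code):

    #include <stdexcept>
    #include <string>
    #include <vector>

    struct repair_options {
        std::vector<std::string> data_centers; // from the -dc option
        std::vector<std::string> hosts;        // from the -hosts option
    };

    // Fail up front instead of silently ignoring -hosts.
    void validate(const repair_options& opts) {
        if (!opts.data_centers.empty() && !opts.hosts.empty()) {
            throw std::invalid_argument(
                "Cannot combine the -dc and -hosts options; "
                "use -hosts alone to repair with specific hosts");
        }
    }

    int main() {
        repair_options opts;
        opts.hosts = {"127.0.0.1"};
        validate(opts); // ok: only -hosts given
    }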

Refs #3836
Message-Id: <e95ac1099f98dd53bb9d6534316005ea3577e639.1551406529.git.asias@scylladb.com>
2019-03-02 16:42:29 +02:00
Tomasz Grabiec
1a63a313c8 Merge "repair: Rename names to be consistent with rpc verb" from Asias

Some of the function names were not updated after we changed the rpc verb
names. Rename them to make them consistent with the rpc verb names.

* seastar-dev.git asias/row_level_repair_rename_consistent_with_rpc_verb/v1:
  repair: Rename request_sync_boundary to get_sync_boundary
  repair: Rename request_full_row_hashes to get_full_row_hashes
  repair: Rename request_combined_row_hash to get_combined_row_hash
  repair: Rename request_row_diff to get_row_diff
  repair: Rename send_row_diff to put_row_diff
  repair: Update function name in docs/row_level_repair.md
2019-02-26 13:01:36 +01:00
Asias He
62104902db repair: Rename send_row_diff to put_row_diff
Make it consistent with the row level repair rpc verb.
2019-02-25 15:13:39 +08:00
Asias He
6e4ea1b3c4 repair: Rename request_row_diff to get_row_diff
Make it consistent with the row level repair rpc verb.
2019-02-25 15:13:39 +08:00
Asias He
5b29fb30ac repair: Rename request_combined_row_hash to get_combined_row_hash
Make it consistent with the row level repair rpc verb.
2019-02-25 15:13:39 +08:00
Asias He
6f6c4878d5 repair: Rename request_full_row_hashes to get_full_row_hashes
Make it consistent with the row level repair rpc verb.
2019-02-25 15:13:39 +08:00
Asias He
02ddfa393e repair: Rename request_sync_boundary to get_sync_boundary
Make it consistent with the row level repair rpc verb.
2019-02-25 15:13:39 +08:00
Rafael Ávila de Espíndola
fd5ea2df5a Avoid including cryptopp headers
cryptopp's config.h has the following pragma:

 #pragma GCC diagnostic ignored "-Wunused-function"

It is not wrapped in a push/pop. Because of that, including cryptopp
headers disables that warning in scylla code too.

The issue has been reported as
https://github.com/weidai11/cryptopp/issues/793

To work around it, this patch uses a pimpl to have a single .cc file
that has to include cryptopp headers.

While at it, it also reduces the differences and code duplication
between the md5 and sha1 hashers.
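
For illustration, a single-file sketch of the pimpl split (in the tree this
would be a header plus one .cc; only that .cc would include the cryptopp
headers, so the stray pragma cannot leak to the rest of the code):

    #include <cstddef>
    #include <memory>

    // --- header part: no cryptopp include, nothing leaks to users ---------
    class sha1_hasher {
        struct impl;                    // complete type known only to the .cc
        std::unique_ptr<impl> _impl;
    public:
        sha1_hasher();
        ~sha1_hasher();
        void update(const char* data, size_t size);
    };

    // --- .cc part: the single place allowed to include <cryptopp/sha.h> ---
    struct sha1_hasher::impl {
        // CryptoPP::SHA1 sha;          // real member in the actual .cc
        size_t bytes_seen = 0;          // stand-in so this sketch compiles
    };
    sha1_hasher::sha1_hasher() : _impl(std::make_unique<impl>()) {}
    sha1_hasher::~sha1_hasher() = default;
    void sha1_hasher::update(const char*, size_t size) { _impl->bytes_seen += size; }

    int main() { sha1_hasher h; h.update("abc", 3); }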

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-02-20 08:03:46 -08:00
Avi Kivity
468f8c7ee7 Merge "Print a warning if a row is too large" from Rafael
"
This is a first step in fixing #3988.
"

* 'espindola/large-row-warn-only-v4' of https://github.com/espindola/scylla:
  Rename large_partition_handler
  Print a warning if a row is too large
  Remove default parameter value
  Rename _threshold_bytes to _partition_threshold_bytes
  keys: add schema-aware printing for clustering_key_prefix
2019-02-03 13:57:42 +02:00
Asias He
9d9ecda619 repair: Log keyspace and table name in repair_cf_range
When a repair failed, we saw logs like:

   repair - Checksum of range (8235770168569320790, 8235957818553794560] on
   127.0.0.1 failed: std::bad_alloc (std::bad_alloc)

It is hard to tell which keyspace and table failed.

To fix, log the keyspace and table name; they are useful to know when debugging.

Fixes #4166
Message-Id: <8424d314125b88bf5378ea02a703b0f82c2daeda.1548818669.git.asias@scylladb.com>
2019-01-31 12:36:46 +02:00
Rafael Ávila de Espíndola
625080b414 Rename large_partition_handler
Now that it also handles large rows, rename it to large_data_handler.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-01-28 15:03:14 -08:00
Piotr Jastrzebski
fab1b7a3a2 Fix cross shard cf usage in repair
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2019-01-24 18:13:49 +01:00
Asias He
4b9e1a9f1d repair: Add row level metrics
Number of rows sent and received
- tx_row_nr
- rx_row_nr

Bytes of rows sent and received
- tx_row_bytes
- rx_row_bytes

Number of row hashes sent and received
- tx_hashes_nr
- rx_hashes_nr

Number of rows read from disk
- row_from_disk_nr

Bytes of rows read from disk
- row_from_disk_bytes
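
For illustration, a sketch of how such counters could be registered,
assuming Seastar's metrics API (the group name and descriptions are
illustrative; older Seastar versions use make_derive instead of
make_counter):

    #include <cstdint>
    #include <seastar/core/metrics.hh>
    #include <seastar/core/metrics_registration.hh>

    namespace sm = seastar::metrics;

    struct row_level_repair_metrics {
        uint64_t tx_row_nr = 0;       // rows sent
        uint64_t rx_row_nr = 0;       // rows received
        uint64_t tx_row_bytes = 0;    // bytes of rows sent
        uint64_t rx_row_bytes = 0;    // bytes of rows received
        sm::metric_groups _metrics;

        row_level_repair_metrics() {
            _metrics.add_group("repair", {
                sm::make_counter("tx_row_nr", tx_row_nr,
                    sm::description("Total number of rows sent on this shard")),
                sm::make_counter("rx_row_nr", rx_row_nr,
                    sm::description("Total number of rows received on this shard")),
                sm::make_counter("tx_row_bytes", tx_row_bytes,
                    sm::description("Total bytes of rows sent on this shard")),
                sm::make_counter("rx_row_bytes", rx_row_bytes,
                    sm::description("Total bytes of rows received on this shard")),
            });
        }
    };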

Message-Id: <d1ee6b8ae8370857fe45f88b6c13087ea217d381.1547603905.git.asias@scylladb.com>
2019-01-16 14:04:57 +02:00
Duarte Nunes
04a14b27e4 Merge 'Add handling staging sstables to /upload dir' from Piotr
"
This series adds generating view updates from sstables added through the
/upload directory if their tables have accompanying materialized views.
Said sstables are left in the /upload directory until updates are generated
from them, and are treated just like staging sstables from the /staging dir.
If there are no views for a given table, sstables are simply moved
from the /upload dir to the datadir without any changes.

Tests: unit (release)
"

* 'add_handling_staging_sstables_to_upload_dir_5' of https://github.com/psarna/scylla:
  all: rename view_update_from_staging_generator
  distributed_loader: fix indentation
  service: add generating view updates from uploaded sstables
  init: pass view update generator to storage service
  sstables: treat sstables in upload dir as needing view build
  sstables,table: rename is_staging to requires_view_building
  distributed_loader: use proper directory for opening SSTable
  db,view: make throttling optional for view_update_generator
2019-01-15 18:19:27 +00:00
Piotr Sarna
0eb703dc80 all: rename view_update_from_staging_generator
The new name, view_update_generator, is both more concise
and correct, since we now generate from directories
other than "/staging".
2019-01-15 17:31:47 +01:00
Piotr Sarna
08a42d47a5 repair: add stream phasing to row level repair
In order to allow other services to wait for incoming streams
to finish, row level repair uses stream phasing when creating
new sstables from incoming data.

Fixes scylladb#4032
2019-01-15 10:28:21 +01:00
Duarte Nunes
fa2b0384d2 Replace std::experimental types with C++17 std version.
Replace stdx::optional and stdx::string_view with the C++ std
counterparts.

Some instances of boost::variant were also replaced with std::variant,
namely those that called seastar::visit.

Scylla now requires GCC 8 to compile.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20190108111141.5369-1-duarte@scylladb.com>
2019-01-08 13:16:36 +02:00
Asias He
1de24c8495 repair: Use mf.visit() in fragment_hasher
When a new fragment type is added, the code will fail to compile instead
of producing runtime errors.
Message-Id: <cf10200e4185c779aad15da3a776a5b79f5323af.1546930796.git.asias@scylladb.com>
2019-01-08 12:02:42 +02:00
Avi Kivity
f02c64cadf streaming: stream_session: remove include of db/view/view_update_from_staging_generator.hh
This header, which is easily replaced with a forward declaration,
introduces a dependency on database.hh everywhere. Remove it and scatter
includes of database.hh in source files that really need it.
2019-01-05 17:33:25 +02:00
Piotr Sarna
bc74ac6f09 repair: add staging sstables support to row level repair
In some cases, sstables created during row level repair
should be enqueued as staging in order to generate
view updates from them.

Fixes #4034
2019-01-03 08:36:45 +01:00
Piotr Sarna
a0003c52cf main,repair: add params to row level repair init
Row level repair needs references to system distributed keyspace
and view update generator in order to enqueue some sstables
as staging.
2019-01-03 08:31:41 +01:00
Avi Kivity
c96fc1d585 Merge "Introduce row level repair" from Asias
"
=== How the partition level repair works

- The repair master decides which ranges to work on.
- The repair master splits the ranges into sub-ranges which contain around 100
partitions.
- The repair master computes the checksum of the 100 partitions and asks the
related peers to compute the checksum of the 100 partitions.
- If the checksums match, the data in this sub-range is synced.
- If the checksums mismatch, the repair master fetches the data from all the peers
and sends the merged data back to the peers.

=== Major problems with partition level repair

- A mismatch of a single row in any of the 100 partitions causes all 100
partitions to be transferred. A single partition can be very large, not to
mention the size of 100 partitions.

- Checksumming (finding the mismatch) and streaming (fixing the mismatch) read
the same data twice.

=== Row level repair

Row level checksum and synchronization: detect row level mismatches and
transfer only the mismatched rows.

=== How the row level repair works

- To solve the problem of reading data twice

Read the data only once for both checksum and synchronization between nodes.

We work on a small range which contains only a few megabytes of rows.
We read all the rows within the small range into memory, find the
mismatches and send the mismatched rows between peers.

We need to find a sync boundary among the nodes such that the range between
boundaries contains only N bytes of rows.

- To solve the problem of sending unnecessary data.

We need to find the mismatched rows between nodes and only send the delta.
This is called the set reconciliation problem, which is a common problem in
distributed systems.

For example:
Node1 has set1 = {row1, row2, row3}
Node2 has set2 = {      row2, row3}
Node3 has set3 = {row1, row2, row4}

To repair:
Node1 fetches nothing from Node2 (set2 - set1), fetches row4 (set3 - set1) from Node3.
Node1 sends row1 and row4 (set1 + set2 + set3 - set2) to Node2
Node1 sends row3 (set1 + set2 + set3 - set3) to Node3.
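
A self-contained illustration of the set arithmetic above, using plain
std::set<std::string> in place of repair rows:

    #include <algorithm>
    #include <cassert>
    #include <iterator>
    #include <set>
    #include <string>

    using row_set = std::set<std::string>;

    // Rows present in `remote` but missing from `local` (remote - local).
    row_set missing_from(const row_set& local, const row_set& remote) {
        row_set diff;
        std::set_difference(remote.begin(), remote.end(),
                            local.begin(), local.end(),
                            std::inserter(diff, diff.begin()));
        return diff;
    }

    int main() {
        row_set set1 = {"row1", "row2", "row3"};   // repair master (Node1)
        row_set set2 = {"row2", "row3"};           // Node2
        row_set set3 = {"row1", "row2", "row4"};   // Node3

        // Step B: the master pulls what it is missing.
        assert(missing_from(set1, set2).empty());              // nothing from Node2
        assert(missing_from(set1, set3) == (row_set{"row4"})); // row4 from Node3

        // The master now holds the union of all rows.
        row_set all = set1;
        all.insert(set3.begin(), set3.end());

        // Step C: the master pushes to each peer what that peer is missing.
        assert(missing_from(set2, all) == (row_set{"row1", "row4"}));
        assert(missing_from(set3, all) == (row_set{"row3"}));
    }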

=== How to implement repair with set reconciliation

- Step A: Negotiate sync boundary

class repair_sync_boundary {
    dht::decorated_key pk;
    position_in_partition position;
};

Read rows from disk into row buffers until their size is larger than N
bytes. Return the repair_sync_boundary of the last mutation_fragment we
read from disk. The smallest repair_sync_boundary across all the nodes is
set as the current_sync_boundary.
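
A simplified sketch of Step A (the real repair_sync_boundary compares a
dht::decorated_key and a position_in_partition via the schema; here a pair
of integers stands in):

    #include <algorithm>
    #include <cassert>
    #include <cstdint>
    #include <tuple>
    #include <vector>

    struct repair_sync_boundary {
        int64_t token;      // stands in for dht::decorated_key pk
        int64_t position;   // stands in for position_in_partition position
        bool operator<(const repair_sync_boundary& o) const {
            return std::tie(token, position) < std::tie(o.token, o.position);
        }
    };

    // Each node reports the boundary of the last fragment it buffered after
    // reading ~N bytes; the master picks the smallest, so that no node's
    // buffer extends past the agreed range.
    repair_sync_boundary negotiate(const std::vector<repair_sync_boundary>& proposed) {
        return *std::min_element(proposed.begin(), proposed.end());
    }

    int main() {
        std::vector<repair_sync_boundary> proposed = {
            {100, 5},   // master
            {90, 2},    // peer 1
            {120, 0},   // peer 2
        };
        repair_sync_boundary current = negotiate(proposed);
        assert(current.token == 90 && current.position == 2);
    }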

- Step B: Get missing rows from peer nodes so that the repair master contains all the rows

Request combined hashes from all nodes between last_sync_boundary and
current_sync_boundary. If the combined hashes from all nodes are identical,
the data is synced; go to Step A. If not, request the full hashes from the peers.

At this point, the repair master knows exactly what rows are missing. Request the
missing rows from peer nodes.

Now the local node contains all the rows.

- Step C: Send missing rows to the peer nodes

Since the local node also knows which rows the peer nodes own, it sends the
missing rows to the peer nodes.

=== What the RPC API looks like

- repair_range_start()

Step A:
- request_sync_boundary()

Step B:
- request_combined_row_hashes()
- request_full_row_hashes()
- request_row_diff()

Step C:
- send_row_diff()

- repair_range_stop()

=== Performance evaluation

We created a cluster of 3 Scylla nodes on AWS using i3.xlarge instances. We
created a keyspace with a replication factor of 3 and inserted 1 billion
rows into each of the 3 nodes. Each node has 241 GiB of data.
We tested 3 cases below.

1) 0% synced: one of the nodes has zero data. The other two nodes have 1 billion identical rows.

Time to repair:
   old = 87 min
   new = 70 min (rebuild took 50 minutes)
   improvement = 19.54%

2) 100% synced: all of the 3 nodes have 1 billion identical rows.
Time to repair:
   old = 43 min
   new = 24 min
   improvement = 44.18%

3) 99.9% synced: each node has 1 billion identical rows and 1 billion * 0.1% distinct rows.

Time to repair:
   old: 211 min
   new: 44 min
   improvement: 79.15%

Bytes sent on wire for repair:
   old: tx= 162 GiB,  rx = 90 GiB
   new: tx = 1.15 GiB, rx = 0.57 GiB
   improvement: tx = 99.29%, rx = 99.36%

It is worth noting that row level repair sends and receives exactly the
number of rows needed in theory.

In this test case, the repair master needs to receive 2 million rows and
send 4 million rows. Here are the details: each node has 1 billion *
0.1% distinct rows, that is 1 million rows. So the repair master receives 1
million rows from repair slave 1 and 1 million rows from repair slave 2.
The repair master sends its own 1 million rows plus the 1 million rows
received from repair slave 1 to repair slave 2, and sends its own 1
million rows plus the 1 million rows received from repair slave 2 to
repair slave 1.

In the results, we saw that the rows on the wire were as expected.

tx_row_nr  = 1000505 + 999619 + 1001257 + 998619 (4 shards, the numbers are for each shard) = 4'000'000
rx_row_nr  =  500233 + 500235 +  499559 + 499973 (4 shards, the numbers are for each shard) = 2'000'000

Fixes: #3033

Tests: dtests/repair_additional_test.py
"

* 'asias/row_level_repair_v7' of github.com:cloudius-systems/seastar-dev: (51 commits)
  repair: Enable row level repair
  repair: Add row_level_repair
  repair: Add docs for row level repair
  repair: Add repair_init_messaging_service_handler
  repair: Add repair_meta
  repair: Add repair_writer
  repair: Add repair_reader
  repair: Add repair_row
  repair: Add fragment_hasher
  repair: Add decorated_key_with_hash
  repair: Add get_random_seed
  repair: Add get_common_diff_detect_algorithm
  repair: Add shard_config
  repair: Add suportted_diff_detect_algorithms
  repair: Add repair_stats to repair_info
  repair: Introduce repair_stats
  flat_mutation_reader:  Add make_generating_reader
  storage_service: Introduce ROW_LEVEL_REPAIR feature
  messaging_service: Add RPC verbs for row level repair
  repair: Export the repair logger
  ...
2018-12-25 13:13:00 +02:00
Botond Dénes
1865e5da41 treewide: remove include database.hh from headers where possible
Many headers don't really need to include database.hh, the include can
be replaced by forward declarations and/or including the actually needed
headers directly. Some headers don't need this include at all.

Each header was verified to be compilable on its own after the change,
by including it into an empty `.cc` file and compiling it. `.cc` files
that used to get `database.hh` through headers that no longer include it
were changed to include it themselves.
2018-12-14 08:03:57 +02:00
Asias He
b9e0db801d repair: Enable row level repair
Finally, enable the new row level repair if the cluster supports it. If not,
fall back to the old partition level repair.

Fixes #3033
2018-12-12 16:49:01 +08:00
Asias He
d372317e99 repair: Add row_level_repair
=== How the partition level repair works

- The repair master decides which ranges to work on.
- The repair master splits the ranges into sub-ranges which contain around 100
partitions.
- The repair master computes the checksum of the 100 partitions and asks the
related peers to compute the checksum of the 100 partitions.
- If the checksums match, the data in this sub-range is synced.
- If the checksums mismatch, the repair master fetches the data from all the peers
and sends the merged data back to the peers.

=== Major problems with partition level repair

- A mismatch of a single row in any of the 100 partitions causes all 100
partitions to be transferred. A single partition can be very large, not to
mention the size of 100 partitions.

- Checksumming (finding the mismatch) and streaming (fixing the mismatch) read
the same data twice.

=== Row level repair

Row level checksum and synchronization: detect row level mismatches and
transfer only the mismatched rows.

=== How the row level repair works

- To solve the problem of reading data twice

Read the data only once for both checksum and synchronization between nodes.

We work on a small range which contains only a few megabytes of rows.
We read all the rows within the small range into memory, find the
mismatches and send the mismatched rows between peers.

We need to find a sync boundary among the nodes such that the range between
boundaries contains only N bytes of rows.

- To solve the problem of sending unnecessary data.

We need to find the mismatched rows between nodes and only send the delta.
This is called the set reconciliation problem, which is a common problem in
distributed systems.

For example:
Node1 has set1 = {row1, row2, row3}
Node2 has set2 = {      row2, row3}
Node3 has set3 = {row1, row2, row4}

To repair:
Node1 fetches nothing from Node2 (set2 - set1), fetches row4 (set3 - set1) from Node3.
Node1 sends row1 and row4 (set1 + set2 + set3 - set2) to Node2
Node1 sends row3 (set1 + set2 + set3 - set3) to Node3.

=== How to implement repair with set reconciliation

- Step A: Negotiate sync boundary

class repair_sync_boundary {
    dht::decorated_key pk;
    position_in_partition position;
};

Read rows from disk into row buffers until their size is larger than N
bytes. Return the repair_sync_boundary of the last mutation_fragment we
read from disk. The smallest repair_sync_boundary across all the nodes is
set as the current_sync_boundary.

- Step B: Get missing rows from peer nodes so that the repair master contains all the rows

Request combined hashes from all nodes between last_sync_boundary and
current_sync_boundary. If the combined hashes from all nodes are identical,
the data is synced; go to Step A. If not, request the full hashes from the peers.

At this point, the repair master knows exactly what rows are missing. Request the
missing rows from peer nodes.

Now the local node contains all the rows.

- Step C: Send missing rows to the peer nodes

Since the local node also knows which rows the peer nodes own, it sends the
missing rows to the peer nodes.

=== What the RPC API looks like

- repair_range_start()

Step A:
- request_sync_boundary()

Step B:
- request_combined_row_hashes()
- request_full_row_hashes()
- request_row_diff()

Step C:
- send_row_diff()

- repair_range_stop()

=== Performance evaluation

We created a cluster of 3 Scylla nodes on AWS using i3.xlarge instances. We
created a keyspace with a replication factor of 3 and inserted 1 billion
rows into each of the 3 nodes. Each node has 241 GiB of data.
We tested 3 cases below.

1) 0% synced: one of the nodes has zero data. The other two nodes have 1 billion identical rows.

Time to repair:
   old = 87 min
   new = 70 min (rebuild took 50 minutes)
   improvement = 19.54%

2) 100% synced: all of the 3 nodes have 1 billion identical rows.
Time to repair:
   old = 43 min
   new = 24 min
   improvement = 44.18%

3) 99.9% synced: each node has 1 billion identical rows and 1 billion * 0.1% distinct rows.

Time to repair:
   old: 211 min
   new: 44 min
   improvement: 79.15%

Bytes sent on wire for repair:
   old: tx= 162 GiB,  rx = 90 GiB
   new: tx = 1.15 GiB, rx = 0.57 GiB
   improvement: tx = 99.29%, rx = 99.36%

It is worth noting that row level repair sends and receives exactly the
number of rows needed in theory.

In this test case, the repair master needs to receive 2 million rows and
send 4 million rows. Here are the details: each node has 1 billion *
0.1% distinct rows, that is 1 million rows. So the repair master receives 1
million rows from repair slave 1 and 1 million rows from repair slave 2.
The repair master sends its own 1 million rows plus the 1 million rows
received from repair slave 1 to repair slave 2, and sends its own 1
million rows plus the 1 million rows received from repair slave 2 to
repair slave 1.

In the results, we saw that the rows on the wire were as expected.

tx_row_nr  = 1000505 + 999619 + 1001257 + 998619 (4 shards, the numbers are for each shard) = 4'000'000
rx_row_nr  =  500233 + 500235 +  499559 + 499973 (4 shards, the numbers are for each shard) = 2'000'000

Fixes #3033
2018-12-12 16:49:01 +08:00
Asias He
fab31efae1 repair: Add repair_init_messaging_service_handler
This patch implements all the rpc handlers for row level repair.
2018-12-12 16:49:01 +08:00
Asias He
3c80727d51 repair: Add repair_meta
This patch introduces the repair_meta class, which is the core class for
row level repair.

For each range to repair, repair_meta objects are created on both the repair
master and the repair slaves. They store the metadata for the row level
repair algorithm, e.g., the current sync boundary, the buffer used to
hold the rows the peers are working on, the reader used to read data from
sstables and the writer used to write data to sstables.

This patch also implements the RPC verbs for row level repair, for
example, REPAIR_ROW_LEVEL_START/REPAIR_ROW_LEVEL_STOP to start/stop
row level repair for a range, REPAIR_GET_SYNC_BOUNDARY to get the sync
boundary the peers want to work on, REPAIR_GET_ROW_DIFF to get missing rows
from the repair slaves and REPAIR_PUT_ROW_DIFF to put missing rows to the
repair slaves.
2018-12-12 16:49:01 +08:00
Asias He
65099bac85 repair: Add repair_writer
repair_writer uses multishard_writer to apply the mutation_fragments to
sstables. The repair master needs one such writer for each of the repair
slaves. The repair slave needs one writer for the repair master.
2018-12-12 16:49:01 +08:00
Asias He
5b75f64e0e repair: Add repair_reader
repair_reader is used to read data from disk. It is simply a local
flat_mutation_reader for the repair master. It is more
complicated for the repair slave.

The repair slaves have to follow what the repair master reads from disk.

For example,

Assume the repair master has 2 shards and the repair slave has 3 shards.
Repair master on shard 0 asks repair slave on shard 0 to read range [0,100).
Repair master on shard 1 asks repair slave on shard 1 to read range [0,100).

The repair master on shard 0 will only read the data that belongs to shard 0
within range [0,100). Since the master and the slave have different shard
counts, the repair slave on shard 0 has to use the multishard reader to
collect data from all the shards. It cannot pass range [0, 100) to the
multishard reader, otherwise it would read more data than the repair master.
Instead, the repair slave uses a sharder with the sharding configuration of
the repair master to generate the sub-ranges belonging to shard 0 of the
repair master.

If the repair master and the slave have the same sharding configuration, a
simple local reader is enough for the repair slave.
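
A simplified illustration of why the slave must follow the master's
sharding (the real code uses dht::sharder over tokens; the modulo mapping
over integer keys below is only a stand-in): the slave enumerates, inside
the requested range, only the keys that the master's configuration assigns
to the requesting master shard.

    #include <cassert>
    #include <cstdint>
    #include <vector>

    // Stand-in for the master's sharding configuration: shard = key % shard_count.
    struct shard_config {
        unsigned shard_count;
        unsigned shard_of(uint64_t key) const { return key % shard_count; }
    };

    // Keys in [begin, end) that belong to `master_shard` under the master's
    // configuration -- what the slave's reader must restrict itself to.
    std::vector<uint64_t> keys_for_master_shard(uint64_t begin, uint64_t end,
                                                const shard_config& master,
                                                unsigned master_shard) {
        std::vector<uint64_t> keys;
        for (uint64_t k = begin; k < end; ++k) {
            if (master.shard_of(k) == master_shard) {
                keys.push_back(k);
            }
        }
        return keys;
    }

    int main() {
        shard_config master{2};   // master has 2 shards; the slave may have 3
        // Master shard 0 asked for [0, 10): the slave must read only the keys
        // the master's config maps to shard 0, or it would read more data
        // than the repair master.
        auto keys = keys_for_master_shard(0, 10, master, 0);
        assert(keys == (std::vector<uint64_t>{0, 2, 4, 6, 8}));
    }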
2018-12-12 16:49:01 +08:00
Asias He
27128d132d repair: Add repair_row
repair_row is the in-memory representation of the "row" that the row level
repair works on. It represents a mutation_fragment that is read from the
flat_mutation_reader. The hash of a repair_row is the combination of the
mutation_fragment hash and the partition_key hash.
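
For illustration, a minimal sketch of combining the two hashes (the mixing
function below is boost-style hash_combine, not the exact xx_hasher-based
code used in repair):

    #include <cstdint>

    uint64_t hash_combine(uint64_t seed, uint64_t value) {
        return seed ^ (value + 0x9e3779b97f4a7c15ULL + (seed << 6) + (seed >> 2));
    }

    // repair_row's hash: partition key hash mixed with the mutation_fragment hash.
    uint64_t repair_row_hash(uint64_t partition_key_hash, uint64_t fragment_hash) {
        return hash_combine(partition_key_hash, fragment_hash);
    }

    int main() {
        return repair_row_hash(/*partition_key_hash=*/1, /*fragment_hash=*/2) ? 0 : 1;
    }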
2018-12-12 16:49:01 +08:00
Asias He
3e7b1d2ef4 repair: Add fragment_hasher
It is used to calculate the hash of a mutation_fragment.
2018-12-12 16:49:01 +08:00
Asias He
e135871e4a repair: Add decorated_key_with_hash
Represents a decorated_key and its hash, so that we do not need to
calculate the hash more than once if the decorated_key is used more than once.
2018-12-12 16:49:01 +08:00
Asias He
16c1b26937 repair: Add get_random_seed
Get a random uint64_t number as the seed for the repair row hashing.
The seed is passed to xx_hasher.

We add the randomization when hashing rows so that the next time we run
repair, the same row produces a different hash value.
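
For illustration, a minimal sketch of obtaining such a per-repair seed
(std::random_device/std::mt19937_64 here are illustrative; the real code
feeds the seed to xx_hasher):

    #include <cstdint>
    #include <cstdio>
    #include <random>

    // One fresh 64-bit seed per repair run, so identical rows hash
    // differently across runs.
    uint64_t get_random_seed() {
        static thread_local std::mt19937_64 engine{std::random_device{}()};
        return engine();
    }

    int main() {
        std::printf("%llu\n", (unsigned long long)get_random_seed());
    }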
2018-12-12 16:49:01 +08:00
Asias He
54888ac52c repair: Add get_common_diff_detect_algorithm
It is used to find the common difference detection algorithms supported
by the repair master and the repair slaves.

It is up to the repair master to choose which algorithm to use.
2018-12-12 16:49:01 +08:00
Asias He
0b294d5829 repair: Add shard_config
It is used to store the shard configuration.
2018-12-12 16:49:01 +08:00
Asias He
a36b0966cf repair: Add suportted_diff_detect_algorithms
It returns a vector of row level repair difference detection algorithms
supported by this node.

We are going to implement the "send_full_set" in the following patches.
2018-12-12 16:49:01 +08:00
Asias He
42f2cd8dc5 repair: Add repair_stats to repair_info
Also add update_statistics() to update current stats.
2018-12-12 16:49:01 +08:00
Asias He
43c04302f3 repair: Introduce repair_stats
It is used by row level repair to track repair statistics.
2018-12-12 16:49:01 +08:00
Asias He
8cfdcf435e repair: Export the repair logger
It will be used by the row level repair soon.
2018-12-12 16:49:01 +08:00
Asias He
e62aeae2db repair: Export repair_info
It will be used by the row level repair soon.
2018-12-12 16:49:01 +08:00
Asias He
6be3b35d52 repair: Export estimate_partitions
It will be used by row level repair soon.
2018-12-12 16:49:01 +08:00
Asias He
1a0bc8acf1 repair: Add struct hash<node_repair_meta_id> for node_repair_meta_id
2018-12-12 16:49:01 +08:00
Asias He
28d090ffda repair: Add struct hash<repair_hash> for repair_hash
2018-12-12 16:49:01 +08:00
Asias He
ce70225b1c repair: Introduce row_level_diff_detect_algorithm
It specifies the algorithm that is used to find the row difference in
repair.
2018-12-12 16:49:01 +08:00
Asias He
e9251df478 repair: Introduce partition_key_and_mutation_fragments
Represents a partition_key and the frozen_mutation_fragments within that
partition.
2018-12-12 16:49:01 +08:00
Asias He
5d5a1beaec repair: Introduce node_repair_meta_id
It uses an IP address and a repair_meta_id to identify a repair
instance started by the row level repair.
2018-12-12 16:49:01 +08:00
Asias He
edd72e10ac repair: Introduce get_sync_boundary_response
The return value of the REPAIR_GET_SYNC_BOUNDARY verb. It will be used
in the row level repair code soon.
2018-12-12 16:49:01 +08:00
Asias He
95b9a889cf repair: Introduce repair_hash
It represents the hash value of a repair row.
2018-12-12 16:49:01 +08:00
Asias He
3e86b7a646 repair: Introduce repair_sync_boundary
Represents the position of a mutation_fragment read from a flat_mutation_reader.
Repair nodes negotiate a small sub-range, identified by two
repair_sync_boundary objects, to work on in each round.
2018-12-12 16:49:01 +08:00