Commit Graph

169 Commits

Author SHA1 Message Date
Asias He
4b9e1a9f1d repair: Add row level metrics
Number of rows sent and received
- tx_row_nr
- rx_row_nr

Bytes of rows sent and received
- tx_row_bytes
- rx_row_bytes

Number of row hashes sent and received
- tx_hashes_nr
- rx_hashes_nr

Number of rows read from disk
- row_from_disk_nr

Bytes of rows read from disk
- row_from_disk_bytes
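
For illustration, a per-shard counter block mirroring these metric names could look like the sketch below; the struct and the helper function are hypothetical, not the actual Scylla metrics registration.

#include <cstdint>

// Hypothetical per-shard counter block mirroring the metric names above.
struct repair_row_stats {
    uint64_t tx_row_nr = 0;           // rows sent
    uint64_t rx_row_nr = 0;           // rows received
    uint64_t tx_row_bytes = 0;        // bytes of rows sent
    uint64_t rx_row_bytes = 0;        // bytes of rows received
    uint64_t tx_hashes_nr = 0;        // row hashes sent
    uint64_t rx_hashes_nr = 0;        // row hashes received
    uint64_t row_from_disk_nr = 0;    // rows read from disk
    uint64_t row_from_disk_bytes = 0; // bytes of rows read from disk
};

// Example: bump the counters on the path where rows leave this node.
void on_rows_sent(repair_row_stats& stats, uint64_t rows, uint64_t bytes) {
    stats.tx_row_nr += rows;
    stats.tx_row_bytes += bytes;
}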

Message-Id: <d1ee6b8ae8370857fe45f88b6c13087ea217d381.1547603905.git.asias@scylladb.com>
2019-01-16 14:04:57 +02:00
Duarte Nunes
04a14b27e4 Merge 'Add handling staging sstables to /upload dir' from Piotr
"
This series adds generating view updates from sstables added through the
/upload directory if their tables have accompanying materialized views.
Said sstables are left in the /upload directory until updates are generated
from them, and are treated just like staging sstables from the /staging dir.
If there are no views for a given table, sstables are simply moved
from the /upload dir to the data dir without any changes.

Tests: unit (release)
"

* 'add_handling_staging_sstables_to_upload_dir_5' of https://github.com/psarna/scylla:
  all: rename view_update_from_staging_generator
  distributed_loader: fix indentation
  service: add generating view updates from uploaded sstables
  init: pass view update generator to storage service
  sstables: treat sstables in upload dir as needing view build
  sstables,table: rename is_staging to requires_view_building
  distributed_loader: use proper directory for opening SSTable
  db,view: make throttling optional for view_update_generator
2019-01-15 18:19:27 +00:00
Piotr Sarna
0eb703dc80 all: rename view_update_from_staging_generator
The new name, view_update_generator, is both more concise
and correct, since we now generate from directories
other than "/staging".
2019-01-15 17:31:47 +01:00
Piotr Sarna
08a42d47a5 repair: add stream phasing to row level repair
In order to allow other services to wait for incoming streams
to finish, row level repair uses stream phasing when creating
new sstables from incoming data.
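
Conceptually, the phasing means a waiter blocks until every sstable write that started before the wait has finished, while later writes are unaffected. A minimal self-contained sketch of that idea (not the actual Scylla/seastar phased-barrier API):

#include <condition_variable>
#include <cstdint>
#include <mutex>

// Toy phased wait: wait_for_started_writes() returns once every write that
// had already started when it was called has finished; writes started later
// do not delay it.
class write_phase {
    std::mutex _m;
    std::condition_variable _cv;
    uint64_t _started = 0;
    uint64_t _finished = 0;
public:
    void start_write()  { std::lock_guard<std::mutex> l(_m); ++_started; }
    void finish_write() { std::lock_guard<std::mutex> l(_m); ++_finished; _cv.notify_all(); }
    void wait_for_started_writes() {
        std::unique_lock<std::mutex> l(_m);
        uint64_t target = _started;
        _cv.wait(l, [&] { return _finished >= target; });
    }
};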

Fixes scylladb#4032
2019-01-15 10:28:21 +01:00
Duarte Nunes
fa2b0384d2 Replace std::experimental types with C++17 std version.
Replace stdx::optional and stdx::string_view with the C++ std
counterparts.

Some instances of boost::variant were also replaced with std::variant,
namely those that called seastar::visit.

Scylla now requires GCC 8 to compile.
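
For illustration, the kind of mechanical substitution this implies (the function below is made up; only the std types are the point):

#include <optional>
#include <string>
#include <string_view>
#include <variant>

// Before: stdx::optional<int> and stdx::string_view from std::experimental.
// After: the C++17 standard library equivalents.
std::optional<int> parse_port(std::string_view s) {
    if (s.empty()) {
        return std::nullopt;              // was stdx::nullopt
    }
    return std::stoi(std::string(s));
}

// boost::variant instances that were visited with seastar::visit become:
std::variant<int, double> example_value = 3;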

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20190108111141.5369-1-duarte@scylladb.com>
2019-01-08 13:16:36 +02:00
Asias He
1de24c8495 repair: Use mf.visit() in fragment_hasher
When a new fragment type is added, the code will fail to compile instead
of producing runtime errors.
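
The effect is the same as an exhaustive std::visit over a closed variant; a stand-in sketch (the fragment types and hashing calls below are illustrative, not the real mutation_fragment API):

#include <variant>

struct clustering_row {};
struct range_tombstone {};
struct partition_end {};

using fragment = std::variant<clustering_row, range_tombstone, partition_end>;

struct fragment_hasher_visitor {
    void operator()(const clustering_row&) { /* hash the row */ }
    void operator()(const range_tombstone&) { /* hash the tombstone */ }
    void operator()(const partition_end&) { /* hash the end marker */ }
    // Adding a new alternative to `fragment` without adding an overload here
    // makes the std::visit() call below a compile-time error.
};

void hash_fragment(const fragment& f) {
    std::visit(fragment_hasher_visitor{}, f);
}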

Message-Id: <cf10200e4185c779aad15da3a776a5b79f5323af.1546930796.git.asias@scylladb.com>
2019-01-08 12:02:42 +02:00
Avi Kivity
f02c64cadf streaming: stream_session: remove include of db/view/view_update_from_staging_generator.hh
This header, which is easily replaced with a forward declaration,
introduces a dependency on database.hh everywhere. Remove it and scatter
includes of database.hh in source files that really need it.
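
The pattern, with made-up names for illustration: a header that only holds references or pointers to database can forward-declare it, and only the .cc files that touch its members include database.hh.

// some_service.hh -- hypothetical header: no #include "database.hh" needed.
class database;                      // forward declaration is enough here

class some_service {
public:
    explicit some_service(database& db) : _db(db) {}
private:
    database& _db;                   // references/pointers only need a declaration
};

// some_service.cc -- only the translation unit that uses database members
// pays for the heavy include:
//   #include "database.hh"
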
2019-01-05 17:33:25 +02:00
Piotr Sarna
bc74ac6f09 repair: add staging sstables support to row level repair
In some cases, sstables created during row level repair
should be enqueued as staging in order to generate
view updates from them.

Fixes #4034
2019-01-03 08:36:45 +01:00
Piotr Sarna
a0003c52cf main,repair: add params to row level repair init
Row level repair needs references to system distributed keyspace
and view update generator in order to enqueue some sstables
as staging.
2019-01-03 08:31:41 +01:00
Avi Kivity
c96fc1d585 Merge "Introduce row level repair" from Asias
"
=== How the partition level repair works

- The repair master decides which ranges to work on.
- The repair master splits the ranges into sub-ranges which contain around
100 partitions each.
- The repair master computes the checksum of the 100 partitions and asks the
related peers to compute the checksum of the 100 partitions.
- If the checksums match, the data in this sub-range is synced.
- If the checksums mismatch, the repair master fetches the data from all the
peers and sends the merged data back to the peers.

=== Major problems with partition level repair

- A mismatch of a single row in any of the 100 partitions causes 100
partitions to be transferred. A single partition can be very large. Not to
mention the size of 100 partitions.

- Checksum (find the mismatch) and streaming (fix the mismatch) will read the
same data twice

=== Row level repair

Row level checksum and synchronization: detect row level mismatch and transfer
only the mismatch

=== How the row level repair works

- To solve the problem of reading data twice

Read the data only once for both checksum and synchronization between nodes.

We work on a small range which contains only a few megabytes of rows.
We read all the rows within the small range into memory, find the
mismatch and send the mismatched rows between peers.

We need to find a sync boundary among the nodes, so that the range between
boundaries contains only N bytes of rows.

- To solve the problem of sending unnecessary data.

We need to find the mismatched rows between nodes and only send the delta.
This is known as the set reconciliation problem, a common problem in
distributed systems.

For example:
Node1 has set1 = {row1, row2, row3}
Node2 has set2 = {      row2, row3}
Node3 has set3 = {row1, row2, row4}

To repair:
Node1 fetches nothing from Node2 (set2 - set1), fetches row4 (set3 - set1) from Node3.
Node1 sends row1 and row4 (set1 + set2 + set3 - set2) to Node2
Node1 sends row3 (set1 + set2 + set3 - set3) to Node3.
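
For illustration only, the same example worked out with std::set_difference, using row names as stand-ins for actual rows:

#include <algorithm>
#include <iostream>
#include <iterator>
#include <set>
#include <string>

int main() {
    std::set<std::string> set1 = {"row1", "row2", "row3"};   // Node1
    std::set<std::string> set2 = {"row2", "row3"};           // Node2
    std::set<std::string> set3 = {"row1", "row2", "row4"};   // Node3

    auto diff = [](const std::set<std::string>& a, const std::set<std::string>& b) {
        std::set<std::string> d;
        std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                            std::inserter(d, d.begin()));
        return d;
    };

    // What Node1 fetches: rows the peers have that Node1 does not.
    auto fetch_from_node2 = diff(set2, set1);   // {}      -> nothing
    auto fetch_from_node3 = diff(set3, set1);   // {row4}

    // Full set after fetching = set1 + set2 + set3.
    std::set<std::string> full = set1;
    full.insert(set2.begin(), set2.end());
    full.insert(set3.begin(), set3.end());

    // What Node1 sends: rows in the full set that each peer is missing.
    auto send_to_node2 = diff(full, set2);      // {row1, row4}
    auto send_to_node3 = diff(full, set3);      // {row3}

    for (auto& r : send_to_node2) std::cout << "to Node2: " << r << "\n";
    for (auto& r : send_to_node3) std::cout << "to Node3: " << r << "\n";
}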

=== How to implement repair with set reconciliation

- Step A: Negotiate sync boundary

class repair_sync_boundary {
    dht::decorated_key pk;
    position_in_partition position;
};

Each node reads rows from disk into row buffers until the size is larger
than N bytes, and returns the repair_sync_boundary of the last
mutation_fragment it read from disk. The smallest repair_sync_boundary of
all the nodes is set as the current_sync_boundary.
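
A rough sketch of that negotiation with simplified stand-in types (the real boundary holds a dht::decorated_key and a position_in_partition):

#include <algorithm>
#include <cstdint>
#include <vector>

// Simplified stand-ins: a token for the partition key and an offset within it.
struct sync_boundary {
    uint64_t token;      // stands in for dht::decorated_key
    uint64_t position;   // stands in for position_in_partition
};

bool operator<(const sync_boundary& a, const sync_boundary& b) {
    return a.token != b.token ? a.token < b.token : a.position < b.position;
}

// Each node reads ~N bytes of rows and reports the boundary it reached;
// the master picks the smallest so no node overshoots the common window.
sync_boundary negotiate(const std::vector<sync_boundary>& reported) {
    return *std::min_element(reported.begin(), reported.end());
}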

- Step B: Get missing rows from peer nodes so that repair master contains all the rows

Request combined hashes from all nodes between last_sync_boundary and
current_sync_boundary. If the combined hashes from all nodes are identical,
the data is synced; go to Step A. If not, request the full hashes from the peers.

At this point, the repair master knows exactly which rows it is missing. Request
the missing rows from the peer nodes.

Now, the local node contains all the rows.
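
A rough sketch of the Step B decisions with illustrative types and names (not the Scylla API): compare combined hashes first, and only compute the per-peer missing sets when they disagree.

#include <cstdint>
#include <set>
#include <vector>

using row_hash = uint64_t;

// Cheap check: if every node's combined hash over the window matches,
// the window is already in sync and we can move on to the next boundary.
bool window_in_sync(const std::vector<row_hash>& combined_hashes_per_node) {
    if (combined_hashes_per_node.empty()) {
        return true;
    }
    for (auto h : combined_hashes_per_node) {
        if (h != combined_hashes_per_node.front()) {
            return false;
        }
    }
    return true;
}

// Expensive path: given the full hash sets, the rows the master is missing
// from a peer are exactly the hashes the peer has and the master does not.
std::set<row_hash> hashes_to_request(const std::set<row_hash>& master,
                                     const std::set<row_hash>& peer) {
    std::set<row_hash> missing;
    for (auto h : peer) {
        if (!master.count(h)) {
            missing.insert(h);
        }
    }
    return missing;
}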

- Step C: Send missing rows to the peer nodes

Since the local node also knows what the peer nodes own, it sends the rows
they are missing to the peer nodes.

=== What the RPC API looks like

- repair_range_start()

Step A:
- request_sync_boundary()

Step B:
- request_combined_row_hashes()
- request_full_row_hashes()
- request_row_diff()

Step C:
- send_row_diff()

- repair_range_stop()
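
Put together, the master-side control flow per range reads roughly like the sketch below; the verb names are taken from the list above, everything else is a simplification:

#include <cstdio>

// Stubs standing in for the RPC verbs listed above.
void repair_range_start() {}
bool request_sync_boundary() {            // false once the range is exhausted
    static int rounds = 3;
    return rounds-- > 0;
}
bool request_combined_row_hashes() {      // true: combined hashes all match
    static int calls = 0;
    return ++calls % 2 == 0;
}
void request_full_row_hashes() {}
void request_row_diff() {}
void send_row_diff() {}
void repair_range_stop() {}

void repair_one_range() {
    repair_range_start();
    // Step A: negotiate the next sync boundary until the range is covered.
    while (request_sync_boundary()) {
        // Step B: cheap combined-hash comparison first...
        if (!request_combined_row_hashes()) {
            // ...and only on a mismatch fetch full hashes and missing rows.
            request_full_row_hashes();
            request_row_diff();
            // Step C: push the rows each peer is missing.
            send_row_diff();
        }
    }
    repair_range_stop();
}

int main() {
    repair_one_range();
    std::puts("range repaired");
}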

=== Performance evaluation

We created a cluster of 3 Scylla nodes on AWS using i3.xlarge instances. We
created a keyspace with a replication factor of 3 and inserted 1 billion
rows into each of the 3 nodes. Each node has 241 GiB of data.
We tested the 3 cases below.

1) 0% synced: one of the nodes has zero data. The other two nodes have 1 billion identical rows.

Time to repair:
   old = 87 min
   new = 70 min (rebuild took 50 minutes)
   improvement = 19.54%

2) 100% synced: all of the 3 nodes have 1 billion identical rows.
Time to repair:
   old = 43 min
   new = 24 min
   improvement = 44.18%

3) 99.9% synced: each node has 1 billion identical rows and 1 billion * 0.1% distinct rows.

Time to repair:
   old: 211 min
   new: 44 min
   improvement: 79.15%

Bytes sent on wire for repair:
   old: tx = 162 GiB,  rx = 90 GiB
   new: tx = 1.15 GiB, rx = 0.57 GiB
   improvement: tx = 99.29%, rx = 99.36%

It is worth noting that row level repair sends and receives exactly the
number of rows needed in theory.

In this test case, the repair master needs to receive 2 million rows and
send 4 million rows. Here are the details: each node has 1 billion *
0.1% distinct rows, that is 1 million rows. So the repair master receives
1 million rows from repair slave 1 and 1 million rows from repair slave 2.
The repair master sends its own 1 million rows plus the 1 million rows
received from repair slave 1 to repair slave 2, and sends its own 1
million rows plus the 1 million rows received from repair slave 2 to
repair slave 1.

In the results, the number of rows on the wire was as expected.

tx_row_nr  = 1000505 + 999619 + 1001257 + 998619 (4 shards, the numbers are for each shard) = 4'000'000
rx_row_nr  =  500233 + 500235 +  499559 + 499973 (4 shards, the numbers are for each shard) = 2'000'000

Fixes: #3033

Tests: dtests/repair_additional_test.py
"

* 'asias/row_level_repair_v7' of github.com:cloudius-systems/seastar-dev: (51 commits)
  repair: Enable row level repair
  repair: Add row_level_repair
  repair: Add docs for row level repair
  repair: Add repair_init_messaging_service_handler
  repair: Add repair_meta
  repair: Add repair_writer
  repair: Add repair_reader
  repair: Add repair_row
  repair: Add fragment_hasher
  repair: Add decorated_key_with_hash
  repair: Add get_random_seed
  repair: Add get_common_diff_detect_algorithm
  repair: Add shard_config
  repair: Add suportted_diff_detect_algorithms
  repair: Add repair_stats to repair_info
  repair: Introduce repair_stats
  flat_mutation_reader:  Add make_generating_reader
  storage_service: Introduce ROW_LEVEL_REPAIR feature
  messaging_service: Add RPC verbs for row level repair
  repair: Export the repair logger
  ...
2018-12-25 13:13:00 +02:00
Botond Dénes
1865e5da41 treewide: remove include database.hh from headers where possible
Many headers don't really need to include database.hh, the include can
be replaced by forward declarations and/or including the actually needed
headers directly. Some headers don't need this include at all.

Each header was verified to be compilable on its own after the change,
by including it into an empty `.cc` file and compiling it. `.cc` files
that used to get `database.hh` through headers that no longer include it
were changed to include it themselves.
2018-12-14 08:03:57 +02:00
Asias He
b9e0db801d repair: Enable row level repair
Finally, enable new row level repair if the cluster supports it. If not,
fallback to the old partition level repair.

Fixes #3033
2018-12-12 16:49:01 +08:00
Asias He
d372317e99 repair: Add row_level_repair
=== How the partition level repair works

- The repair master decides which ranges to work on.
- The repair master splits the ranges into sub-ranges which contain around
100 partitions each.
- The repair master computes the checksum of the 100 partitions and asks the
related peers to compute the checksum of the 100 partitions.
- If the checksums match, the data in this sub-range is synced.
- If the checksums mismatch, the repair master fetches the data from all the
peers and sends the merged data back to the peers.

=== Major problems with partition level repair

- A mismatch of a single row in any of the 100 partitions causes 100
partitions to be transferred. A single partition can be very large. Not to
mention the size of 100 partitions.

- Checksum (find the mismatch) and streaming (fix the mismatch) will read the
same data twice

=== Row level repair

Row level checksum and synchronization: detect row level mismatch and transfer
only the mismatch

=== How the row level repair works

- To solve the problem of reading data twice

Read the data only once for both checksum and synchronization between nodes.

We work on a small range which contains only a few megabytes of rows.
We read all the rows within the small range into memory, find the
mismatch and send the mismatched rows between peers.

We need to find a sync boundary among the nodes, so that the range between
boundaries contains only N bytes of rows.

- To solve the problem of sending unnecessary data.

We need to find the mismatched rows between nodes and only send the delta.
This is known as the set reconciliation problem, a common problem in
distributed systems.

For example:
Node1 has set1 = {row1, row2, row3}
Node2 has set2 = {      row2, row3}
Node3 has set3 = {row1, row2, row4}

To repair:
Node1 fetches nothing from Node2 (set2 - set1), fetches row4 (set3 - set1) from Node3.
Node1 sends row1 and row4 (set1 + set2 + set3 - set2) to Node2
Node1 sends row3 (set1 + set2 + set3 - set3) to Node3.

=== How to implement repair with set reconciliation

- Step A: Negotiate sync boundary

class repair_sync_boundary {
    dht::decorated_key pk;
    position_in_partition position;
};

Each node reads rows from disk into row buffers until the size is larger
than N bytes, and returns the repair_sync_boundary of the last
mutation_fragment it read from disk. The smallest repair_sync_boundary of
all the nodes is set as the current_sync_boundary.

- Step B: Get missing rows from peer nodes so that repair master contains all the rows

Request combined hashes from all nodes between last_sync_boundary and
current_sync_boundary. If the combined hashes from all nodes are identical,
the data is synced; go to Step A. If not, request the full hashes from the peers.

At this point, the repair master knows exactly which rows it is missing. Request
the missing rows from the peer nodes.

Now, the local node contains all the rows.

- Step C: Send missing rows to the peer nodes

Since the local node also knows what the peer nodes own, it sends the rows
they are missing to the peer nodes.
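
As an illustration with simplified types (not the actual implementation), the Step C bookkeeping amounts to sending each peer whatever part of the now-complete row set it did not report:

#include <cstdint>
#include <map>
#include <set>
#include <string>

using row_hash = uint64_t;

// After Step B the master holds the complete row set for the window and
// remembers which hashes each peer reported. Step C sends each peer the
// rows it is missing.
std::map<std::string, std::set<row_hash>>
rows_to_send(const std::set<row_hash>& master_rows,
             const std::map<std::string, std::set<row_hash>>& peer_rows) {
    std::map<std::string, std::set<row_hash>> out;
    for (auto& [peer, have] : peer_rows) {
        for (auto h : master_rows) {
            if (!have.count(h)) {
                out[peer].insert(h);
            }
        }
    }
    return out;
}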

=== What the RPC API looks like

- repair_range_start()

Step A:
- request_sync_boundary()

Step B:
- request_combined_row_hashes()
- request_full_row_hashes()
- request_row_diff()

Step C:
- send_row_diff()

- repair_range_stop()

=== Performance evaluation

We created a cluster of 3 Scylla nodes on AWS using i3.xlarge instances. We
created a keyspace with a replication factor of 3 and inserted 1 billion
rows into each of the 3 nodes. Each node has 241 GiB of data.
We tested the 3 cases below.

1) 0% synced: one of the nodes has zero data. The other two nodes have 1 billion identical rows.

Time to repair:
   old = 87 min
   new = 70 min (rebuild took 50 minutes)
   improvement = 19.54%

2) 100% synced: all of the 3 nodes have 1 billion identical rows.
Time to repair:
   old = 43 min
   new = 24 min
   improvement = 44.18%

3) 99.9% synced: each node has 1 billion identical rows and 1 billion * 0.1% distinct rows.

Time to repair:
   old: 211 min
   new: 44 min
   improvement: 79.15%

Bytes sent on wire for repair:
   old: tx = 162 GiB,  rx = 90 GiB
   new: tx = 1.15 GiB, rx = 0.57 GiB
   improvement: tx = 99.29%, rx = 99.36%

It is worth noting that row level repair sends and receives exactly the
number of rows needed in theory.

In this test case, the repair master needs to receive 2 million rows and
send 4 million rows. Here are the details: each node has 1 billion *
0.1% distinct rows, that is 1 million rows. So the repair master receives
1 million rows from repair slave 1 and 1 million rows from repair slave 2.
The repair master sends its own 1 million rows plus the 1 million rows
received from repair slave 1 to repair slave 2, and sends its own 1
million rows plus the 1 million rows received from repair slave 2 to
repair slave 1.

In the results, the number of rows on the wire was as expected.

tx_row_nr  = 1000505 + 999619 + 1001257 + 998619 (4 shards, the numbers are for each shard) = 4'000'000
rx_row_nr  =  500233 + 500235 +  499559 + 499973 (4 shards, the numbers are for each shard) = 2'000'000

Fixes #3033
2018-12-12 16:49:01 +08:00
Asias He
fab31efae1 repair: Add repair_init_messaging_service_handler
This patch implements all the RPC handlers for row level repair.
2018-12-12 16:49:01 +08:00
Asias He
3c80727d51 repair: Add repair_meta
This patch introduces the repair_meta class, which is the core class for
row level repair.

For each range to repair, repair_meta objects are created on both the
repair master and the repair slaves. It stores the metadata for the row
level repair algorithm, e.g., the current sync boundary, the buffer used
to hold the rows the peers are working on, the reader to read data from
sstables and the writer to write data to sstables.

This patch also implements the RPC verbs for row level repair, for
example, REPAIR_ROW_LEVEL_START/REPAIR_ROW_LEVEL_STOP to start/stop
row level repair for a range, REPAIR_GET_SYNC_BOUNDARY to get the sync
boundary peers want to work on, REPAIR_GET_ROW_DIFF to get missing rows
from repair slaves and REPAIR_PUT_ROW_DIFF to push missing rows to repair
slaves.
2018-12-12 16:49:01 +08:00
Asias He
65099bac85 repair: Add repair_writer
repair_writer uses multishard_writer to apply the mutation_fragments to
sstables. The repair master needs one such writer for each of the repair
slaves. The repair slave needs one writer for the repair master.
2018-12-12 16:49:01 +08:00
Asias He
5b75f64e0e repair: Add repair_reader
repair_reader is used to read data from disk. It is simply a local
flat_mutation_reader for the repair master. It is more complicated for
the repair slave.

The repair slaves have to follow what the repair master reads from disk.

For example, assume the repair master has 2 shards and the repair slave
has 3 shards:

Repair master on shard 0 asks repair slave on shard 0 to read range [0,100).
Repair master on shard 1 asks repair slave on shard 1 to read range [0,100).

Repair master on shard 0 will only read the data that belongs to shard 0
within range [0,100). Since master and slave have different shard counts,
repair slave on shard 0 has to use the multishard reader to collect data
from all the shards. It cannot pass range [0,100) to the multishard
reader, otherwise it would read more data than the repair master.
Instead, the repair slave uses a sharder with the sharding configuration
of the repair master to generate the sub-ranges that belong to shard 0
of the repair master.

If repair master and slave have the same sharding configuration, a simple
local reader is enough for the repair slave.
2018-12-12 16:49:01 +08:00
Asias He
27128d132d repair: Add repair_row
repair_row is the in-memory representation of a "row" that the row level
repair works on. It represents a mutation_fragment that is read from the
flat_mutation_reader. The hash of a repair_row is the combination of the
mutation_fragment hash and the partition_key hash.
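
A hand-wavy sketch of such a combination (the mixing constant is the one used by boost-style hash_combine; the real code hashes with xx_hasher and its own combining):

#include <cstdint>

// Hypothetical: combine the (cached) partition key hash with the
// mutation_fragment hash to get the per-row hash used for comparison.
uint64_t combine_hashes(uint64_t pk_hash, uint64_t fragment_hash) {
    // 64-bit variant of the boost::hash_combine mixing step.
    return pk_hash ^ (fragment_hash + 0x9e3779b97f4a7c15ULL + (pk_hash << 6) + (pk_hash >> 2));
}
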
2018-12-12 16:49:01 +08:00
Asias He
3e7b1d2ef4 repair: Add fragment_hasher
It is used to calculate the hash of a mutation_fragment.
2018-12-12 16:49:01 +08:00
Asias He
e135871e4a repair: Add decorated_key_with_hash
Represents a decorated_key and the hash for it, so that we do not need to
calculate the hash more than once if the decorated_key is used more than
once.
2018-12-12 16:49:01 +08:00
Asias He
16c1b26937 repair: Add get_random_seed
Get a random uint64_t number as the seed for the repair row hashing.
The seed is passed to xx_hasher.

We add the randomization when hashing rows so that when we run repair
the next time, the same row produces a different hash value.
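
Generating such a seed is a one-liner with the standard library; a sketch (how it is fed into xx_hasher is omitted):

#include <cstdint>
#include <random>

// One random 64-bit seed per repair run; feeding it into the row hasher
// makes the hash of the same row differ from run to run.
uint64_t get_random_seed() {
    static thread_local std::mt19937_64 engine(std::random_device{}());
    return engine();
}
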
2018-12-12 16:49:01 +08:00
Asias He
54888ac52c repair: Add get_common_diff_detect_algorithm
It is used to find the common difference detection algorithms supported
by the repair master and the repair slaves.

It is up to the repair master to choose which algorithm to use.
2018-12-12 16:49:01 +08:00
Asias He
0b294d5829 repair: Add shard_config
It is used to store the shard configuration.
2018-12-12 16:49:01 +08:00
Asias He
a36b0966cf repair: Add suportted_diff_detect_algorithms
It returns a vector of row level repair difference detection algorithms
supported by this node.

We are going to implement the "send_full_set" in the following patches.
2018-12-12 16:49:01 +08:00
Asias He
42f2cd8dc5 repair: Add repair_stats to repair_info
Also add update_statistics() to update current stats.
2018-12-12 16:49:01 +08:00
Asias He
43c04302f3 repair: Introduce repair_stats
It is used by row level repair to track repair statistics.
2018-12-12 16:49:01 +08:00
Asias He
8cfdcf435e repair: Export the repair logger
It will be used by the row level repair soon.
2018-12-12 16:49:01 +08:00
Asias He
e62aeae2db repair: Export repair_info
It will be used by the row level repair soon.
2018-12-12 16:49:01 +08:00
Asias He
6be3b35d52 repair: Export estimate_partitions
It will be used by row level repair soon.
2018-12-12 16:49:01 +08:00
Asias He
1a0bc8acf1 repair: Add struct hash<node_repair_meta_id> for node_repair_meta_id 2018-12-12 16:49:01 +08:00
Asias He
28d090ffda repair: Add struct hash<repair_hash> for repair_hash 2018-12-12 16:49:01 +08:00
Asias He
ce70225b1c repair: Introduce row_level_diff_detect_algorithm
It specifies the algorithm that is used to find the row difference in
repair.
2018-12-12 16:49:01 +08:00
Asias He
e9251df478 repair: Introduce partition_key_and_mutation_fragments
Represents a partition_key and the frozen_mutation_fragments within that
partition.
2018-12-12 16:49:01 +08:00
Asias He
5d5a1beaec repair: Introduce node_repair_meta_id
It uses an IP address and a repair_meta_id to identify a repair
instance started by the row level repair.
2018-12-12 16:49:01 +08:00
Asias He
edd72e10ac repair: Introduce get_sync_boundary_response
The return value of the REPAIR_GET_SYNC_BOUNDARY verb. It will be used
in the row level repair code soon.
2018-12-12 16:49:01 +08:00
Asias He
95b9a889cf repair: Introduce repair_hash
It represents the hash value of a repair row.
2018-12-12 16:49:01 +08:00
Asias He
3e86b7a646 repair: Introduce repair_sync_boundary
Represents the position of a mutation_fragment read from a flat mutation
reader. Repair nodes negotiate a small sub-range, identified by two
repair_sync_boundary values, to work on in each round.
2018-12-12 16:49:01 +08:00
Avi Kivity
85e9b0d78d repair: remove unneeded config.hh inclusion 2018-12-09 20:11:38 +02:00
Avi Kivity
51ce53738f repair: convert sprint() to format()
sprint() recently became more strict, throwing on sprint("%s", 5). Replace
with the more modern format().

Mechanically converted with https://github.com/avikivity/unsprint.
2018-11-01 13:16:17 +00:00
Avi Kivity
6488b017c3 repair: fix bad format string syntax
Some sprint() calls use the fmt language instead of the printf syntax. Convert
them all the way to format().
2018-11-01 13:16:17 +00:00
Asias He
7f826d3343 streaming: Expose reason for streaming
On receiving a mutation_fragment or a mutation triggered by a streaming
operation, we pass an enum stream_reason to notify the receiver what
the streaming is used for, so the receiver can decide on further
operations, e.g., sending view updates, beyond applying the streamed
data to disk.

Fixes #3276
Message-Id: <f15ebcdee25e87a033dcdd066770114a499881c0.1539498866.git.asias@scylladb.com>
2018-10-15 22:03:28 +01:00
Botond Dénes
eb357a385d flat_mutation_reader: make timeout opt-out rather than opt-in
Currently timeout is opt-in, that is, all methods that even have it
default it to `db::no_timeout`. This means that ensuring timeout is used
where it should be is completely up to the author and the reviewers of
the code. As humans are notoriously prone to mistakes, this has resulted
in a very inconsistent usage of timeout, with many clients of
`flat_mutation_reader` passing the timeout only to some members and only
on certain call sites. This is small wonder considering that some core
operations like `operator()()` only recently received a timeout
parameter and others like `peek()` didn't even have one until this
patch. Both of these methods call `fill_buffer()` which potentially
talks to the lower layers and is supposed to propagate the timeout.
All this makes the `flat_mutation_reader`'s timeout effectively useless.

To bring order to this chaos, make the timeout parameter mandatory on
all `flat_mutation_reader` methods that need it. This ensures that
humans now get a reminder from the compiler when they forget to pass the
timeout. Clients can still opt out of passing a timeout by passing
`db::no_timeout` (the previous default value), but this will now be
explicit and developers should think before typing it.

There were surprisingly few core call sites to fix up. Where a timeout
was available nearby I propagated it to the reader; where I couldn't, I
passed `db::no_timeout`. Authors of the latter kind of code (view,
streaming and repair are some of the notable examples) should consider
propagating down a timeout if needed. In the test code (the vast
majority of the changes) I just used `db::no_timeout` everywhere.
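
To illustrate the shape of the change with simplified stand-in types (not the real Scylla signatures):

#include <chrono>

// Illustration only: a stand-in clock and sentinel for db::no_timeout.
using timeout_clock = std::chrono::steady_clock;
inline constexpr auto no_timeout = timeout_clock::time_point::max();

struct reader {
    // The timeout parameter used to have a default (= no_timeout); dropping
    // the default turns a forgotten timeout into a compile error at the call site.
    void fill_buffer(timeout_clock::time_point) {}
    void peek(timeout_clock::time_point) {}
};

void caller(reader& r, timeout_clock::time_point deadline) {
    r.fill_buffer(deadline);   // propagate a real deadline where one exists
    r.peek(no_timeout);        // explicit, reviewable opt-out otherwise
}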

Tests: unit(release, debug)

Signed-off-by: Botond Dénes <bdenes@scylladb.com>

Message-Id: <1edc10802d5eb23de8af28c9f48b8d3be0f1a468.1536744563.git.bdenes@scylladb.com>
2018-09-20 11:31:24 +02:00
Duarte Nunes
a025bf6a7d Merge seastar upstream
Seastar introduced a "compat" namespace, which conflicts with Scylla's
own "compat" namespaces. The merge thus includes changes to scope
uses of Scylla's "compat" namespaces.

* seastar 8ad870f...9bb1611  (5):
  > util/variant_utils: Ensure variant_cast behaves well with rvalues
  > util/std-compat: Fix infinite recursion
  > doc/tutorial: Undo namespace changes
  > util/variant_utils: Add cast_variant()
  > Add compatbility with C++17's library types

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2018-08-14 13:07:09 +01:00
Nadav Har'El
5e47061438 repair: fix small error-handling logic mistake
As noticed by Tomasz Grabiec, we test a future's available() after
having already waited for it with when_all(), which is pointless.

The code after the wrong if() exchanges the contents of a token-range
between this node and several other live neighbors; we can't do this
exchange if either this node is broken or there is no other live neighbor.
So this is what we needed to test, and !available() should have been failed().

Also the test for live_neighbors_checksum.empty() added in commit 7c873f0d1f
is unnecessary - we build live_neighbors and live_neighbors_checksum
together, so if one of them is empty, so is the other.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20180710114940.26027-1-nyh@scylladb.com>
2018-07-10 15:04:03 +03:00
Nadav Har'El
3194ce16b3 repair: fix combination of "-pr" and "-local" repair options
When nodetool repair is used with the combination of the "-pr" (primary
range) and "-local" (only repair with nodes in the same DC) options,
Scylla needs to define the "primary ranges" differently: Rather than
assign one node in the entire cluster to be the primary owner of every
token, we need one node in each data-center - so that a "-local"
repair will cover all the tokens.

Fixes #3557.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20180701132445.21685-1-nyh@scylladb.com>
2018-07-01 16:39:33 +03:00
Vladimir Krivopalov
acdce55572 Inject CryptoPP namespace where Crypto++ byte typedef is used.
In Crypto++ v6, the `byte` typedef has been moved from the global
namespace to the CryptoPP:: namespace.
To make Scylla code compile with both old and new versions, bring the
namespace in so that the code works regardless of the scope of `byte`
definition.
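
A sketch of the pattern (the header path and the hashing call are just an example, not the exact Scylla code):

#include <cstddef>
#include <cryptopp/sha.h>

// In Crypto++ >= 6, `byte` lives in the CryptoPP namespace; before that it
// was global. Bringing the namespace in makes unqualified `byte` resolve
// either way.
using namespace CryptoPP;

void hash_block(const byte* data, size_t len, byte out[SHA256::DIGESTSIZE]) {
    SHA256 h;
    h.CalculateDigest(out, data, len);
}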

Fixes #3252

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
Message-Id: <60e7bfe868b778b1c9bbe15d7247db64b61bd406.1520272198.git.vladimir@scylladb.com>
2018-03-05 20:43:07 +02:00
Pekka Enberg
bd365a10d3 Merge "Add an API to get all active repairs" from Amnon
"This series adds an API to return the active repairs by their IDs.

 After this series a call to:

   curl -X GET --header "Accept: application/json" "http://localhost:10000/storage_service/active_repair/"

 Will return an array with the ids of the active repairs.

 Fixes #3193"

* 'amnon/get_active_repairs_v3' of github.com:scylladb/seastar-dev:
  API: Add get active repair api
  repair: Add a get_active_repairs function to return the active repair
2018-02-19 15:32:17 +02:00
Amnon Heiman
3f2eae35fd repair: Add a get_active_repairs function to return the active repair
This patch adds a function that returns an array with the ids of the
active repairs by filtering the RUNNING ones in the repair tracker status.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2018-02-14 11:43:37 +02:00
Duarte Nunes
7ba63b1521 atomic_cell_hash: Add specialization for atomic_cell_or_collection
Replace the atomic_cell_or_collection::feed_hash() member function
with the specialization of appending_hash, and use that instead.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2018-02-01 00:22:51 +00:00
Duarte Nunes
a0d748c71c range_tombstone: Replace feed_hash() member function with appending_hash
Replace range_tombstone::feed_hash() with the specialization of
appending_hash, so that we can use the general feed_hash() function.
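
Schematically, with simplified names and a hypothetical Hasher that exposes update() (the real framework differs in detail), the pattern replaces a feed_hash() member with a free specialization:

#include <cstdint>

// Simplified model of the hashing framework: a generic appending_hash
// template that specializations fill in, plus a feed_hash() helper.
template <typename T> struct appending_hash;

template <typename Hasher, typename T>
void feed_hash(Hasher& h, const T& value) {
    appending_hash<T>{}(h, value);
}

struct range_tombstone_like {        // stand-in for range_tombstone
    int64_t start;
    int64_t end;
};

// The specialization replaces a feed_hash() member function on the type.
template <> struct appending_hash<range_tombstone_like> {
    template <typename Hasher>
    void operator()(Hasher& h, const range_tombstone_like& rt) const {
        h.update(reinterpret_cast<const char*>(&rt.start), sizeof(rt.start));
        h.update(reinterpret_cast<const char*>(&rt.end), sizeof(rt.end));
    }
};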

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2018-02-01 00:22:50 +00:00