Commit Graph

400 Commits

Author SHA1 Message Date
Botond Dénes
6ca0464af5 mutation_fragment: add schema and permit
We want to start tracking the memory consumption of mutation fragments.
For this we need schema and permit during construction, and on each
modification, so the memory consumption can be recalculated and pass to
the permit.

In this patch we just add the new parameters and go through the insane
churn of updating all call sites. They will be used in the next patch.
2020-09-28 11:27:23 +03:00
Botond Dénes
3fab83b3a1 flat_mutation_reader: impl: add reader_permit parameter
Not used yet, this patch does all the churn of propagating a permit
to each impl.

In the next patch we will use it to track to track the memory
consumption of `_buffer`.
2020-09-28 10:53:48 +03:00
Pavel Emelyanov
9a15ebfe6a repair: Move CHECKSUM_RANGE verb into repair/
The verb is sent by repair code, so it should be registered
in the same place, not in main. Also -- the verb should be
unregistered on stop.

The global messaging service instance is made similarly to the
row-level one, as there's no ready to use repair service.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-09-17 09:52:48 +03:00
Pavel Emelyanov
d5769346d7 repair: Toss messaging init/uninit calls
There goal is to make it possible to reg/unreg not only row-level
verbs. While at it -- equip the init call with sharded<database>&
argument, it will be needed by the next patch.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-09-17 09:52:48 +03:00
Avi Kivity
253a7640e3 Merge 'Clean up old cluster features' from Piotr Sarna
"
This series follows the suggestion from https://github.com/scylladb/scylla/pull/7203#issuecomment-689499773 discussion and deprecates a number of cluster features. The deprecation does not remove any features from the strings sent via gossip to other nodes, but it removes all checks for these features from code, assuming that the checks are always true. This assumption is quite safe for features introduced over 2 years ago, because the official upgrade path only allows upgrading from a previous official release, and these feature bits were introduced many release cycles ago.
All deprecated features were picked from a `git blame` output which indicated that they come from 2018:
```git
e46537b7d3 2016-05-31 11:44:17 +0200   RANGE_TOMBSTONES_FEATURE = "RANGE_TOMBSTONES";
85c092c56c 2016-07-11 10:59:40 +0100   LARGE_PARTITIONS_FEATURE = "LARGE_PARTITIONS";
02bc0d2ab3 2016-12-09 22:09:30 +0100   MATERIALIZED_VIEWS_FEATURE = "MATERIALIZED_VIEWS";
67ca6959bd 2017-01-30 19:50:13 +0000   COUNTERS_FEATURE = "COUNTERS";
815c91a1b8 2017-04-12 10:14:38 +0300   INDEXES_FEATURE = "INDEXES";
d2a2a6d471 2017-08-03 10:53:22 +0300   DIGEST_MULTIPARTITION_READ_FEATURE = "DIGEST_MULTIPARTITION_READ";
ecd2bf128b 2017-09-01 09:55:02 +0100   CORRECT_COUNTER_ORDER_FEATURE = "CORRECT_COUNTER_ORDER";
713d75fd51 2017-09-14 19:15:41 +0200   SCHEMA_TABLES_V3 = "SCHEMA_TABLES_V3";
2f513514cc 2017-11-29 11:57:09 +0000   CORRECT_NON_COMPOUND_RANGE_TOMBSTONES = "CORRECT_NON_COMPOUND_RANGE_TOMBSTONES";
0be3bd383b 2017-12-04 13:55:36 +0200   WRITE_FAILURE_REPLY_FEATURE = "WRITE_FAILURE_REPLY";
0bab3e59c2 2017-11-30 00:16:34 +0000   XXHASH_FEATURE = "XXHASH";
fbc97626c4 2018-01-14 21:28:58 -0500   ROLES_FEATURE = "ROLES";
802be72ca6 2018-03-18 06:25:52 +0100   LA_SSTABLE_FEATURE = "LA_SSTABLE_FORMAT";
71e22fe981 2018-05-25 10:37:54 +0800   STREAM_WITH_RPC_STREAM = "STREAM_WITH_RPC_STREAM";
```
Tests: unit(dev)
           manual(verifying with cqlsh that the feature strings are indeed still set)
"

Closes #7234.

* psarna-clean_up_features:
  gms: add comments for deprecated features
  gms: remove unused feature bits
  streaming: drop checks for RPC stream support
  roles: drop checks for roles schema support
  service: drop checks for xxhash support
  service: drop checks for write failure reply support
  sstables: drop checks for non-compound range tombstones support
  service: drop checks for v3 schema support
  repair: drop checks for large partitions support
  service: drop checks for digest multipartition read support
  sstables: drop checks for correct counter order support
  cql3: drop checks for materialized views support
  cql3: drop checks for counters support
  cql3: drop checks for indexing support
2020-09-16 10:53:25 +03:00
Piotr Sarna
7c8728dd73 Merge 'Add progress metrics for replace decommission removenode'
from Asias.

This series follows "repair: Add progress metrics for node ops #6842"
and adds the metrics for the remaining node operations,
i.e., replace, decommission and removenode.

Fixes #1244, #6733

* asias-repair_progress_metrics_replace_decomm_removenode:
  repair: Add progress metrics for removenode ops
  repair: Add progress metrics for decommission ops
  repair: Add progress metrics for replace ops
2020-09-15 12:19:11 +02:00
Benny Halevy
0dc45529c8 abstract_replication_strategy: get_ranges_in_thread: copy _token_metadata if func may yield
Change 94995acedb added yielding to abstract_replication_strategy::do_get_ranges.
And 07e253542d used get_ranges_in_thread in compaction_manager.

However, there is nothing to prevent token_metadata, and in particular its
`_sorted_tokens` from changing while iterating over them in do_get_ranges if the latter yields.

Therefore copy the the replication strategy `_token_metadata` in `get_ranges_in_thread(inet_address ep)`.

If the caller provides `token_metadata` to get_ranges_in_thread, then the caller
must make sure that we can safely yield while accessing token_metadata (like
in `do_rebuild_replace_with_repair`).

Fixes #7044

Test: unit(dev)

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20200915074555.431088-1-bhalevy@scylladb.com>
2020-09-15 11:33:55 +03:00
Piotr Sarna
9e6098a422 repair: drop checks for large partitions support
Large partitions are supported for over 2 years and upgrades are only
allowed from versions which already have the support, so the checks
are hereby dropped.
2020-09-14 12:07:20 +02:00
Pavel Emelyanov
a89c7198c2 range_tombstone_list: Introduce and use pop_as<>()
The method extracts an element from the list, constructs
a desired object from it and frees. This is common usage
of range_tombstone_list. Having a helper helps encapsulating
the exact collection inside the class.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-09-07 23:17:41 +03:00
Pavel Emelyanov
f19ade31ee repair: Mark some partition_hasher methods noexcept
The net patch will change the way range tombstones are
fed into hasher. To make sure the codeflow doesn't
become exception-unsafe, mark the relevant methods as
nont-throwing.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-09-07 23:17:41 +03:00
Asias He
8b4530a643 repair: Add progress metrics for removenode ops
The following metric is added:

scylla_node_maintenance_operations_removenode_finished_percentage{shard="0",type="gauge"} 0.650000

It is the number of finished percentage for removenode operation so
far.

Fixes #1244, #6733
2020-08-31 14:43:39 +08:00
Asias He
25e03233f1 repair: Add progress metrics for decommission ops
The following metric is added:

scylla_node_maintenance_operations_decommission_finished_percentage{shard="0",type="gauge"}
0.650000

It is the number of finished percentage for decommission operation so
far.

Fixes #1244, #6733
2020-08-31 14:43:39 +08:00
Asias He
80cb157669 repair: Add progress metrics for replace ops
The following metric is added:

scylla_node_maintenance_operations_replace_finished_percentage{shard="0",type="gauge"} 0.650000

It is the number of finished percentage for replace operation so far.

Fixes #1244, #6733
2020-08-31 14:03:05 +08:00
Avi Kivity
7416b3c34b Merge 'scylla-gdb.py: Add scylla repairs command' from Asias
"
This series adds scylla repairs command to help debug repair.

Fixes #7103
"

* asias-repair_help_debug_scylla_repairs_cmd:
  scylla-gdb.py: Add scylla repairs command
  repair: Add repair_state to track repair states
  scylla-gdb.py: Print the pointers of elements in boost_intrusive_list_printer
  scylla-gdb.py: Add printer for gms::inet_address
  scylla-gdb.py: Fix a typo in boost_intrusive_list
  repair: Fix the incorrect comments for _all_nodes
  repair: Add row_level_repair object pointer in repair_meta
  repair: Add counter for reads issued and finished for repair_reader
2020-08-26 13:57:31 +03:00
Avi Kivity
6ff12b7f79 repair: apply_rows_on_follower(): remove copy of repair_rows list
We copy a list, which was reported to generate a 15ms stall.

This is easily fixed by moving it instead, which is safe since this is
the last use of the variable.

Fixes #7115.
2020-08-26 11:52:39 +03:00
Asias He
ab57cea783 repair: Add repair_state to track repair states
Use repair_state to track the major state of repair from the beginning
to the end of repair.

With this patch, we can easily know at which state both the repair
master and followers are. It is very helpful when debugging a repair
hang issue.

Refs #7103
2020-08-26 11:19:25 +08:00
Asias He
9ee86bb5a0 repair: Fix the incorrect comments for _all_nodes
The _all_nodes field contains both the peer nodes and the node itself.

Refs #7103
2020-08-26 10:12:07 +08:00
Asias He
656ff93d49 repair: Add row_level_repair object pointer in repair_meta
It is helpful to track back the row_level_repair object for repair
master when debugging.

Refs #7103
2020-08-26 10:12:07 +08:00
Asias He
283c3dae0a repair: Add counter for reads issued and finished for repair_reader
It is helpful to check the reader blocks forever when debugging a repair
hang.

Refs #7103
2020-08-26 10:12:07 +08:00
Asias He
e86881be99 repair: Print repair reason in repair stats log
It is useful to distinguish if the repair is a regular repair or used
for node operations.

In addition, log the keyspace and tables are repaired.

Fixes #7086
2020-08-25 11:05:47 +03:00
Pavel Emelyanov
06f4828b93 db: Factor out get_local_ranges helper
Storage service and repair code have identical helpers to get local
ranges for keyspace. Move this helper's code onto database, later it
will be reused by one more place.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-08-21 14:58:40 +03:00
Pavel Emelyanov
24eaf827c0 migration_manager: Add messaging service as argument to get_schema_definition
There are 4 places that call this helper:

- storage proxy. Callers are rpc verb handlers and already have the proxy
  at hands from which they can get the messaging service instance
- repair. There's local-global messaging instance at hands, and the caller
  is in verb handler too
- streaming. The caller is verb handler, which is unregistered on stop, so
  the messaging service instance can be captured
- migration manager itself. The caller already uses "this", so the messaging
  service instance can be get from it

The better approach would be to make get_schema_definition be the method of
migration_manager, but the manager is stopped for real on shutdown, thus
referencing it from the callers might not be safe and needs revisiting. At
the same time the messaging service is always alive, so using its reference
is safe.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-08-19 20:50:53 +03:00
Pavel Emelyanov
704880d564 repair: Stop using global messaging_service references
Now all the users of messaging service have the needed reference.

Again, the messaging service is not really stopped at the end, so its usage
is safe regardless of whether repair stuff itself leaks on stop or not.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-08-19 20:50:53 +03:00
Pavel Emelyanov
d7e90dbfa9 repair: Keep sharded messaging service reference on repair_meta
The reference comes from repair_info and storage_service calls, both
had been already patched for that.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-08-19 20:50:53 +03:00
Pavel Emelyanov
285648620b repair: Keep sharded messaging service reference on repair_info
This reference comes from the API that already has it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-08-19 20:50:53 +03:00
Pavel Emelyanov
74494bac87 repair: Keep reference on messaging in row-level code
The row-level repair keeps its statics for needed services, same as the
streaming does. Treat the messaging service the same way to stop using
the global one in the next patches.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-08-19 20:50:53 +03:00
Pavel Emelyanov
45c31eadb3 repair: Push the sharded<messaging_service> reference down to sync_data_using_repair
This function needs the messaging service inside, but the closest place where it
can get one from is the storage_service API handlers. Temporarily move the call for
global messaging service into storage service, its turn for this cleanup will
come later.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-08-19 20:50:52 +03:00
Pavel Emelyanov
6b0f4d5c8d repair: Use existing sharded db reference
The db.invoke_on_all's lambda tries to get the sharded db reference via
the global storage service. This can be done in a much nicer way.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-08-19 20:50:52 +03:00
Pavel Emelyanov
3d2e3203f7 repair: Mark repair.cc local functions as static
Just a cleanup to facilitate code reading.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-08-19 20:50:52 +03:00
Piotr Jastrzebski
c001374636 codebase wide: replace count with contains
C++20 introduced `contains` member functions for maps and sets for
checking whether an element is present in the collection. Previously
`count` function was often used in various ways.

`contains` does not only express the intend of the code better but also
does it in more unified way.

This commit replaces all the occurences of the `count` with the
`contains`.

Tests: unit(dev)

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <b4ef3b4bc24f49abe04a2aba0ddd946009c9fcb2.1597314640.git.piotr@scylladb.com>
2020-08-15 20:26:02 +03:00
Avi Kivity
736863c385 Merge "repair: Add progress metrics for node ops" from Asias
"
This series adds progress metrics for the node operations. Metrics for bootstrap and rebuild progress are added as a starter. I will add more for the remaining operations after getting feedback.

With this the Scylla Monitor and Scylla Manager can know the progress of the bootstrap and other node operations. E.g.,

    scylla_node_ops_bootstrap_nr_ranges_finished{shard="0",type="derive"} 50
    scylla_node_ops_bootstrap_nr_ranges_total{shard="0",type="derive"} 1040

Fixes #1244, #6733
"

* 'repair_progress_metrics_v3' of github.com:asias/scylla:
  repair: Add progress metrics for repair ops
  repair: Add progress metrics for rebuild ops
  repair: Add progress metrics for bootstrap ops
2020-08-12 11:42:14 +03:00
Avi Kivity
8853eddaf6 Merge 'repair: Track repair_meta created on both repair follower and master' from Asias
"
It is pretty hard to find the repair_meta object when debugging a core.
This patch makes it is easier by putting repair_meta object created by
both repair follower and master into a map.

Fixes #7009
"

* asias-repair_make_debug_eaiser_track_all_repair_metas:
  repair: Add repair_meta_tracker to track repair_meta for followers and masters
  repair: Move thread local object _repair_metas out of the function
2020-08-12 11:01:32 +03:00
Asias He
e9a520a22b repair: Add repair_meta_tracker to track repair_meta for followers and masters
It is pretty hard to find the repair_meta object when debugging a core.
This patch makes it is easier by putting repair_meta object created by
both repair follower and master into boost intrusive list.

Fixes #7009
2020-08-12 15:44:22 +08:00
Asias He
58f4c730b0 repair: Move thread local object _repair_metas out of the function
It is a lot of pain to access _repair_metas when debugging.

Refs #7009
2020-08-12 11:23:18 +08:00
Avi Kivity
4547949420 Merge "Fix repair stalls in get_sync_boundary and apply_rows_on_master_in_thread" from Asias
"
This path set fixes stalls in repair that are caused by std::list merge and clear operations during test_latency_read_with_nemesis test.

Fixes #6940
Fixes #6975
Fixes #6976
"

* 'fix_repair_list_stall_merge_clear_v2' of github.com:asias/scylla:
  repair: Fix stall in apply_rows_on_master_in_thread and apply_rows_on_follower
  repair: Use clear_gently in get_sync_boundary to avoid stall
  utils: Add clear_gently
  repair: Use merge_to_gently to merge two lists
  utils: Add merge_to_gently
2020-08-11 14:52:23 +03:00
Asias He
c65ad02fcd repair: Fix stall in apply_rows_on_master_in_thread and apply_rows_on_follower
The row_diff list in apply_rows_on_master_in_thread and
apply_rows_on_follower can be large. Modify do_apply_rows to remove the
row from the list when the row is consumed to avoid stall when the list
is destroyed.

Fixes #6975
2020-08-11 19:37:47 +08:00
Asias He
9f4b3a5fa6 repair: Use clear_gently in get_sync_boundary to avoid stall
The _row_buf and _working_row_buf list can be large. Use
clear_gently helper to avoid stalls.

Fixes #6940
2020-08-11 19:37:47 +08:00
Piotr Jastrzebski
80e3923b3c codebase wide: replace find(...) != end() with contains
C++20 introduced `contains` member functions for maps and sets for
checking whether an element is present in the collection. Previously
the code pattern looked like:

<collection>.find(<element>) != <collection>.end()

In C++20 the same can be expressed with:

<collection>.contains(<element>)

This is not only more concise but also expresses the intend of the code
more clearly.

This commit replaces all the occurences of the old pattern with the new
approach.

Tests: unit(dev)

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <f001bbc356224f0c38f06ee2a90fb60a6e8e1980.1597132302.git.piotr@scylladb.com>
2020-08-11 13:28:50 +03:00
Asias He
97d47bffa5 repair: Add progress metrics for repair ops
The following metric is added:

scylla_node_maintenance_operations_repair_finished_percentage{shard="0",type="gauge"} 0.650000

It is the number of finished percentage for all ongoing repair operations.

When all ongoing repair operations finish, the percentage stays at 100%.

Fixes #1244, #6733
2020-08-11 18:15:10 +08:00
Asias He
53fee789f0 repair: Use merge_to_gently to merge two lists
During a performance test, test_latency_read_with_nemesis during manager
repair, it experienced a stall of 73 ms:

```
 (inlined by) std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > >::operator=(repair_row const&) at /usr/include/c++/9/bits/stl_iterator.h:515
 (inlined by) std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > > std::__copy_move<false, false, std::bidirectional_iterator_tag>::__copy_m<std::_List_iterator<repair_row>, std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > > >(std::_List_iterator<repair_row>, std::_List_iterator<repair_row>, std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > >) at /usr/include/c++/9/bits/stl_algobase.h:312
 (inlined by) std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > > std::__copy_move_a<false, std::_List_iterator<repair_row>, std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > > >(std::_List_iterator<repair_row>, std::_List_iterator<repair_row>, std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > >) at /usr/include/c++/9/bits/stl_algobase.h:404
 (inlined by) std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > > std::__copy_move_a2<false, std::_List_iterator<repair_row>, std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > > >(std::_List_iterator<repair_row>, std::_List_iterator<repair_row>, std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > >) at /usr/include/c++/9/bits/stl_algobase.h:440
 (inlined by) std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > > std::copy<std::_List_iterator<repair_row>, std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > > >(std::_List_iterator<repair_row>, std::_List_iterator<repair_row>, std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > >) at /usr/include/c++/9/bits/stl_algobase.h:474
 (inlined by) std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > > std::__merge<std::_List_iterator<repair_row>, std::_List_iterator<repair_row>, std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > >, __gnu_cxx::__ops::_Iter_comp_iter<repair_meta::apply_rows_on_master_in_thread(std::__cxx11::list<partition_key_and_mutation_fragments, std::allocator<partition_key_and_mutation_fragments> >, gms::inet_address, seastar::bool_class<update_working_row_buf_tag>, seastar::bool_class<update_peer_row_hash_sets_tag>, unsigned int)::{lambda(repair_row const&, repair_row const&)#1}> >(std::_List_iterator<repair_row>, std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > >, std::_List_iterator<repair_row>, std::_List_iterator<repair_row>, __gnu_cxx::__ops::_Iter_comp_iter<repair_meta::apply_rows_on_master_in_thread(std::__cxx11::list<partition_key_and_mutation_fragments, std::allocator<partition_key_and_mutation_fragments> >, gms::inet_address, seastar::bool_class<update_working_row_buf_tag>, seastar::bool_class<update_peer_row_hash_sets_tag>, unsigned int)::{lambda(repair_row const&, repair_row const&)#1}>, __gnu_cxx::__ops::_Iter_comp_iter<repair_meta::apply_rows_on_master_in_thread(std::__cxx11::list<partition_key_and_mutation_fragments, std::allocator<partition_key_and_mutation_fragments> >, gms::inet_address, seastar::bool_class<update_working_row_buf_tag>, seastar::bool_class<update_peer_row_hash_sets_tag>, unsigned int)::{lambda(repair_row const&, repair_row const&)#1}>) at /usr/include/c++/9/bits/stl_algo.h:4923
 (inlined by) std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > > std::merge<std::_List_iterator<repair_row>, std::_List_iterator<repair_row>, std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > >, repair_meta::apply_rows_on_master_in_thread(std::__cxx11::list<partition_key_and_mutation_fragments, std::allocator<partition_key_and_mutation_fragments> >, gms::inet_address, seastar::bool_class<update_working_row_buf_tag>, seastar::bool_class<update_peer_row_hash_sets_tag>, unsigned int)::{lambda(repair_row const&, repair_row const&)#1}>(std::_List_iterator<repair_row>, std::back_insert_iterator<std::__cxx11::list<repair_row, std::allocator<repair_row> > >, std::_List_iterator<repair_row>, std::_List_iterator<repair_row>, repair_meta::apply_rows_on_master_in_thread(std::__cxx11::list<partition_key_and_mutation_fragments, std::allocator<partition_key_and_mutation_fragments> >, gms::inet_address, seastar::bool_class<update_working_row_buf_tag>, seastar::bool_class<update_peer_row_hash_sets_tag>, unsigned int)::{lambda(repair_row const&, repair_row const&)#1}, repair_meta::apply_rows_on_master_in_thread(std::__cxx11::list<partition_key_and_mutation_fragments, std::allocator<partition_key_and_mutation_fragments> >, gms::inet_address, seastar::bool_class<update_working_row_buf_tag>, seastar::bool_class<update_peer_row_hash_sets_tag>, unsigned int)::{lambda(repair_row const&, repair_row const&)#1}) at /usr/include/c++/9/bits/stl_algo.h:5018
 (inlined by) repair_meta::apply_rows_on_master_in_thread(std::__cxx11::list<partition_key_and_mutation_fragments, std::allocator<partition_key_and_mutation_fragments> >, gms::inet_address, seastar::bool_class<update_working_row_buf_tag>, seastar::bool_class<update_peer_row_hash_sets_tag>, unsigned int) at ./repair/row_level.cc:1242
repair_meta::get_row_diff_source_op(seastar::bool_class<update_peer_row_hash_sets_tag>, gms::inet_address, unsigned int, seastar::rpc::sink<repair_hash_with_cmd>&, seastar::rpc::source<repair_row_on_wire_with_cmd>&) at ./repair/row_level.cc:1608
repair_meta::get_row_diff_with_rpc_stream(std::unordered_set<repair_hash, std::hash<repair_hash>, std::equal_to<repair_hash>, std::allocator<repair_hash> >, seastar::bool_class<needs_all_rows_tag>, seastar::bool_class<update_peer_row_hash_sets_tag>, gms::inet_address, unsigned int) at ./repair/row_level.cc:1674
row_level_repair::get_missing_rows_from_follower_nodes(repair_meta&) at ./repair/row_level.cc:2413
```

The problem was that when std::merge() ran out of one range, it copied the second range.

To fix, use the new merge_to_gently helper.

Fixes #6976
2020-08-11 10:37:34 +08:00
Asias He
e3c2d08f4f repair: Add progress metrics for rebuild ops
The following metric is added:

scylla_node_maintenance_operations_rebuild_finished_percentage{shard="0",type="gauge"} 0.650000

It is the number of finished percentage for rebuild operation so far.

Fixes #1244, #6733
2020-08-10 15:45:37 +08:00
Asias He
b23f65d1d9 repair: Add progress metrics for bootstrap ops
The following metric is added:

scylla_node_maintenance_operations_bootstrap_finished_percentage{shard="0",type="gauge"} 0.850000

It is the number of finished percentage for bootstrap operation so far.

Fixes #1244, #6733
2020-08-10 15:45:37 +08:00
Asias He
e6f640441a repair: Fix race between create_writer and wait_for_writer_done
We saw scylla hit user after free in repair with the following procedure during tests:

- n1 and n2 in the cluster

- n2 ran decommission

- n2 sent data to n1 using repair

- n2 was killed forcely

- n1 tried to remove repair_meta for n1

- n1 hit use after free on repair_meta object

This was what happened on n1:

1) data was received -> do_apply_rows was called -> yield before create_writer() was called

2) repair_meta::stop() was called -> wait_for_writer_done() / do_wait_for_writer_done was called
   with _writer_done[node_idx] not engaged

3) step 1 resumed, create_writer() was called and _repair_writer object was referenced

4) repair_meta::stop() finished, repair_meta object and its member _repair_writer was destroyed

5) The fiber created by create_writer() at step 3 hit use after free on _repair_writer object

To fix, we should call wait_for_writer_done() after any pending
operations were done which were protected by repair_meta::_gate. This
prevents wait for writer done finishes before the writer is in the
process of being created.

Fixes: #6853
Fixes: #6868
Backports: 4.0, 4.1, 4.2
2020-07-28 11:53:40 +03:00
Botond Dénes
fe127a2155 sstables: clamp estimated_partitions to [1, +inf) in writers
In some cases estimated number of partitions can be 0, which is albeit a
legit estimation result, breaks many low-level sstable writer code, so
some of these have assertions to ensure estimated partitions is > 0.
To avoid hitting this assert all users of the sstable writers do the
clamping, to ensure estimated partitions is at least 1. However leaving
this to the callers is error prone as #6913 has shown it. As this
clamping is standard practice, it is better to do it in the writers
themselves, avoiding this problem altogether. This is exactly what this
patch does. It also adds two unit tests, one that reproduces the crash
in #6913, and another one that ensures all sstable writers are fine with
estimated partitions being 0 now. Call sites previously doing the
clamping are changed to not do it, it is unnecessary now as the writer
does it itself.

Fixes #6913

Tests: unit(dev)
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200724120227.267184-1-bdenes@scylladb.com>
2020-07-27 09:19:37 +02:00
Pavel Emelyanov
5060063cd6 messaging: Add missing per-service unregistering methods
5 services register handlers in messaging, but not all of them
have clear unregistration methods.

Summary:

migration_manager: everything is in place, no changes
gossiper: ditto
proxy: some verbs unregistration is missing
repair: no unregistration at all
streaming: ditto

This patch adds the needed unregistration methods.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-22 16:34:00 +03:00
Asias He
28f8798464 repair: Do not use libfmt format specifiers if not needed
We recently saw a weird log message:

   WARN  2020-07-19 10:22:46,678 [shard 0] repair - repair id [id=4,
   uuid=0b1092a1-061f-4691-b0ac-547b281ef09d] failed: std::runtime_error
   ({shard 0: fmt::v6::format_error (invalid type specifier), shard 1:
   fmt::v6::format_error (invalid type specifier)})

It turned out we have:

   throw std::runtime_error(format("repair id {:d} on shard {:d} failed to
   repair {:d} sub ranges", id, shard, nr_failed_ranges));

in the code, but we changed the id from integer to repair_uniq_id class.

We do not really need to specify the format specifiers for numbers.

Fixes #6874
2020-07-20 12:52:36 +03:00
Pavel Emelyanov
92f58f62f2 headers:: Remove flat_mutation_reader.hh from several other headers
All they can live with forward declaration of the f._m._r. plus a
seastar header in commitlog code.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-17 17:54:47 +03:00
Pavel Emelyanov
8618a02815 migration_manager: Remove db/schema_tables.hh inclustion into header
The schema_tables.hh -> migration_manager.hh couple seems to work as one
of "single header for everyhing" creating big blot for many seemingly
unrelated .hh's.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-17 17:54:43 +03:00
Asias He
4d7faac350 repair: Add uuid to a repair job
Currently, repair uses an integer to identify a repair job. The repair
id starts from 1 since node restart. As a result, different repair jobs
will have same id across restart.

To make the id more unique across restart, we can use an uuid in
addition to the integer id. We can not drop the use of the integer id
completely since the http api and nodetool use it.

Fixes #6786
2020-07-16 11:03:19 +03:00
Asias He
38d964352d repair: Relax node selection in bootstrap when nodes are less than RF
Consider a cluster with two nodes:

 - n1 (dc1)
 - n2 (dc2)

A third node is bootstrapped:

 - n3 (dc2)

The n3 fails to bootstrap as follows:

 [shard 0] init - Startup failed: std::runtime_error
 (bootstrap_with_repair: keyspace=system_distributed,
 range=(9183073555191895134, 9196226903124807343], no existing node in
 local dc)

The system_distributed keyspace is using SimpleStrategy with RF 3. For
the keyspace that does not use NetworkTopologyStrategy, we should not
require the source node to be in the same DC.

Fixes: #6744
Backports: 4.0 4.1, 4.2
2020-07-14 11:54:34 +02:00