Commit Graph

112 Commits

Author SHA1 Message Date
Piotr Dulikowski
39771967bb hinted handoff: fix race - decomission vs. endpoint mgr init
This patch fixes a race between two methods in hints manager: drain_for
and store_hint.

The first method is called when a node leaves the cluster, and it
'drains' end point hints manager for that node (sends out all hints for
that node). If this method is called when the local node is being
decomissioned or removed, it instead drains hints managers for all
endpoints.

In the case of decomission/remove, drain_for first calls
parallel_for_each on all current ep managers and tells them to drain
their hints. Then, after all of them complete, _ep_managers.clear() is
called.

End point hints managers are created lazily and inserted into
_ep_managers map the first time a hint is stored for that node. If
this happens between parallel_for_each and _ep_managers.clear()
described above, the clear operation will destroy the new ep manager
without draining it first. This is a bug and will trigger an assert in
ep manager's destructor.

To solve this, a new flag for the hints manager is added which is set
when it drains all ep managers on removenode/decommission, and prevents
further hints from being written.

Fixes #7257

Closes #7278
2020-09-24 14:51:24 +03:00
Piotr Jastrzebski
c001374636 codebase wide: replace count with contains
C++20 introduced `contains` member functions for maps and sets for
checking whether an element is present in the collection. Previously
`count` function was often used in various ways.

`contains` does not only express the intend of the code better but also
does it in more unified way.

This commit replaces all the occurences of the `count` with the
`contains`.

Tests: unit(dev)

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <b4ef3b4bc24f49abe04a2aba0ddd946009c9fcb2.1597314640.git.piotr@scylladb.com>
2020-08-15 20:26:02 +03:00
Nadav Har'El
8135647906 merge: Add metrics to semaphores
Merged pull request https://github.com/scylladb/scylla/pull/7018
by Piotr Sarna:

This series addresses various issues with metrics and semaphores - it mainly adds missing metrics, which makes it possible to see the length of the queues attached to the semaphores. In case of view building and view update generation, metrics was not present in these services at all, so a first, basic implementation is added.

More precise semaphore metrics would ease the testing and development of load shedding and admission control.

	view_builder: add metrics
	db, view: add view update generator metrics
	hints: track resource_manager sending queue length
	hints: add drain queue length to metrics
	table: add metrics for sstable deletion semaphore
	database: remove unused semaphore
2020-08-12 12:39:59 +03:00
Piotr Sarna
180a1505fd hints: track resource_manager sending queue length
The number of tasks waiting for a hint to be sent is now tracked.
2020-08-11 17:43:53 +02:00
Piotr Sarna
58a9fa7d2e hints: add drain queue length to metrics
The number of tasks waiting for a drain is now tracked.
2020-08-11 17:43:53 +02:00
Piotr Jastrzebski
80e3923b3c codebase wide: replace find(...) != end() with contains
C++20 introduced `contains` member functions for maps and sets for
checking whether an element is present in the collection. Previously
the code pattern looked like:

<collection>.find(<element>) != <collection>.end()

In C++20 the same can be expressed with:

<collection>.contains(<element>)

This is not only more concise but also expresses the intend of the code
more clearly.

This commit replaces all the occurences of the old pattern with the new
approach.

Tests: unit(dev)

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <f001bbc356224f0c38f06ee2a90fb60a6e8e1980.1597132302.git.piotr@scylladb.com>
2020-08-11 13:28:50 +03:00
Piotr Dulikowski
b955793088 hinted handoff: disable warnings about segments left on disk
When a mutation is written to the commitlog, a rp_handle object is
returned which keeps a reference to commitlog segment. A segment is
"dirty" when its reference count is not zero, otherwise it is "clean".

When commitlog object is being destroyed, a warning is being printed
for every dirty segment. On the other hand, clean segments are deleted.

In case of standard mutation writing path, the rp_handle moves
responsibility for releasing the reference to the memtable to which the
mutation is written. When the memtable is flushed to disk, all
references accumulated in the memtable are released. In this context, it
makes sense to warn about dirty segments, because such segments contain
mutations that are not written to sstables, and need to be replayed.

However, hinted handoff uses a different workflow - it recreates a
commitlog object periodically. When a hint is written to commitlog, the
rp_handle reference is not released, so that segments with hints are not
deleted when destroying the commitlog. When commitlog is created again,
we get a list of saved segments with hints that we can try to send at a
later time.

Although this is intended behavior, now that releasing the hints
commitlog is done properly, it causes the mentioned warning to
periodically appear in the logs.

This patch adds a parameter for the commitlog that allows to disable
this warning. It is only used when creating hinted handoff commitlogs.
2020-07-07 19:40:42 +02:00
Piotr Dulikowski
002e6c4056 hinted handoff: release memory on commitlog termination
When commitlog is recreated in hints manager, only shutdown() method is
called, but not release(). Because of that, some internal commitlog
objects (`segment_manager` and `segment`s) may be left pointing to each
other through shared_ptr reference cycles, which may result in memory
leak when the parent commitlog object is destroyed.

This commit prevents memory leaks that may happen this way by calling
release() after shutdown() from the hints manager.

Fixes: #6409, #6776
2020-07-07 19:40:32 +02:00
Avi Kivity
de38091827 priority_manager: merge streaming_read and streaming_write classes into one class
Streaming is handled by just once group for CPU scheduling, so
separating it into read and write classes for I/O is artificial, and
inflates the resources we allow for streaming if both reads and writes
happen at the same time.

Merge both classes into one class ("streaming") and adjust callers. The
merged class has 200 shares, so it reduces streaming bandwidth if both
directions are active at the same time (which is rare; I think it only
happens in view building).
2020-06-22 15:09:04 +03:00
Piotr Dulikowski
e5b2218ad4 hinted handoff: use bool instead of send_state_set
After restart_segment was removed from send_state enum, send_state_set
now has only one possible element: segment_replay_failed.

This patch removes send_state_set and uses bool in its place instead.
2020-06-12 16:10:20 +02:00
Piotr Dulikowski
6b34bb1a43 hinted handoff: update replay position on commitlog failure
Hints manager uses commitlog framework to store and replay hints.

The commitlog::read_log_file function is used for replaying hints. It
reads commitlog entries and passes them to a callback. In case of hints
manager, the callback calls manager::send_one_hint function.

In case something goes wrong during this process, sending of that file
is attempted again later. If the error was caused by hints that failed
to be sent (e.g. due to network error), then we also advance
_last_not_complete_rp field to the position of the first hint that
failed. In the next retry, we will start reading from the commitlog from
that position.

However, current logic does not account for the case when an error
occurs in the commitlog::read_log_file function itself. If,
coincidentally, all hints sent by send_one_hint succeed, then we won't
advance the _last_not_complete_rp field and we may unnecessarily repeat
sending some of the hints that succeeded.

This patch adds the send_one_file_ctx::last_sent_rp field, which keeps
track of the last commitlog position for which a hint was attempted to
be sent. In case read_log_file throws an error but all send_one_hint
calls succeed, then it will be used to update _last_not_complete_rp.
This will reduce the amount of hints that are resent in this case to
only one.

Tests:

- unit(dev)
- dtest(hintedhandoff_additional_test, dev)
2020-06-12 16:10:20 +02:00
Piotr Dulikowski
d369b538f0 hinted handoff: remove rps_set, use first_failed_rp instead
When sending hints from one file, rps_set is used to keep track of
positions of hints that are currently sent. If sending of a hint fails,
its position is not removed from rps_set. If some hints fail to be sent
while handling a hints file, the lowest position from rps_set is used
to calculate the position from where to start when sending of the file
is retried.

Keeping track of commitlog positions this way isn't necessary to
calculate this position. This patch removes rps_set and replaces it
with first_failed_rp - which is just a single
std::optional<db::replay_position>. This value is updated when a hint
send failure is detected.

This simplifies calculation of starting position for the next retry, and
allowed to remove some error handling logic related to an edge case when
inserting to rps_set fails.

- unit(dev)
- dtest(hintedhandoff_additional_test, dev)
2020-06-12 16:10:19 +02:00
Rafael Ávila de Espíndola
555d8fe520 build: Be consistent about system versus regular headers
We were not consistent about using '#include "foo.hh"' instead of
'#include <foo.hh>' for scylla's own headers. This patch fixes that
inconsistency and, to enforce it, changes the build to use -iquote
instead of -I to find those headers.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200608214208.110216-1-espindola@scylladb.com>
2020-06-10 15:49:51 +03:00
Benny Halevy
a96087165a hints: get_device_id: use seastar file_stat
This avoids potential use-after-move, since undefined c++ sequencing order
may std::move(f) in the lambda capture before evaluating f.stat().

Also, this makes use of a more generic library function that doesn't
require to open and hold on to the file in the application.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20200514152054.162168-1-bhalevy@scylladb.com>
2020-05-15 10:11:45 +02:00
Piotr Dulikowski
0c5ac0da98 hinted handoff: remove discarded hint positions from rps_set
Related commit: 85d5c3d

When attempting to send a hint, an exception might occur that results in
that hint being discarded (e.g. keyspace or table of the hint was
removed).

When such an exception is thrown, position of the hint will already be
stored in rps_set. We are only allowed to retain positions of hints that
failed to be sent and needed to be retried later. Dropping a hint is not
an error, therefore its position should be removed from rps_set - but
current logic does not do that.

Because of that bug, hint files with many discardable hints might cause
rps_set to grow large when the file is replayed. Furthermore, leaving
positions of such hints in rps_set might cause more hints than necessary
to be re-sent if some non-discarded hints fail to be sent.

This commit fixes the problem by removing positions of discarded hints
from rps_set.

Fixes #6433
2020-05-12 15:13:59 +02:00
Piotr Dulikowski
85d5c3d5ee hinted handoff: don't keep positions of old hints in rps_set
When sending hints from one file, rps_set field in send_one_file_ctx
keeps track of commitlog positions of hints that are being currently
sent, or have failed to be sent. At the end of the operation, if sending
of some hints failed, we will choose position of the earliest hint that
failed to be sent, and will retry sending that file later, starting from
that position. This position is stored in _last_not_complete_rp.

Usually, this set has a bounded size, because we impose a limit of at
most 128 hints being sent concurrently. Because we do not attempt to
send any more hints after a failure is detected, rps_set should not have
more than 128 elements at a time.

Due to a bug, commitlog positions of old hints (older than
gc_grace_seconds of the destination table) were inserted into rps_set
but not removed after checking their age. This could cause rps_set to
grow very large when replaying a file with old hints.

Moreover, if the file mixed expired and non-expired hints (which could
happen if it had hints to two tables with different gc_grace_seconds),
and sending of some non-expired hints failed, then positions of expired
hints could influence calculation _last_not_complete_rp, and more hints
than necessary would be resent on the next retry.

This simple patch removes commitlog position of a hint from rps_set when
it is detected to be too old.

Fixes #6422
2020-05-11 11:33:31 +02:00
Vlad Zolotarov
b83e84b467 db::hints:: optimize with_file_update_mutex()
Avoid extra shared_ptr copy.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Reviewed-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20200311214313.2988-1-vladz@scylladb.com>
2020-04-16 09:01:40 +03:00
Avi Kivity
88ade3110f treewide: replace calls to engine().some_api() with some_api()
This removes the need to include reactor.hh, a source of compile
time bloat.

In some places, the call is qualified with seastar:: in order
to resolve ambiguities with a local name.

Includes are adjusted to make everything compile. We end up
having 14 translation units including reactor.hh, primarily for
deprecated things like reactor::at_exit().

Ref #1
2020-04-05 12:46:04 +03:00
Avi Kivity
1799cfa88a logalloc: use namespace-scope seastar::idle_cpu_handler and related rather than reactor scope
This allows us to drop a #include <reactor.hh>, reducing compile time.

Several translation units that lost access to required declarations
are updated with the required includes (this can be an include of
reactor.hh itself, in case the translation unit that lost it got it
indirectly via logalloc.hh)

Ref #1.
2020-04-05 12:45:08 +03:00
Botond Dénes
240b5e0594 frozen_schema: key() remove unused schema parameter
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200402092249.680210-1-bdenes@scylladb.com>
2020-04-02 14:43:35 +02:00
Rafael Ávila de Espíndola
c5795e8199 everywhere: Replace engine().cpu_id() with this_shard_id()
This is a bit simpler and might allow removing a few includes of
reactor.hh.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200326194656.74041-1-espindola@scylladb.com>
2020-03-27 11:40:03 +03:00
Rafael Ávila de Espíndola
eca0ac5772 everywhere: Update for deprecated apply functions
Now apply is only for tuples, for varargs use invoke.

This depends on the seastar changes adding invoke.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200324163809.93648-1-espindola@scylladb.com>
2020-03-25 08:49:53 +02:00
Piotr Dulikowski
41d82e39ea storage proxy: rename mutate_hint_from_scratch
Changes the name of storage_proxy::mutate_hint_from_scratch function to
another name, whose meaning is more clear: send_hint_to_all_replicas.

Tests: unit(dev)
2020-02-24 17:30:22 +02:00
Avi Kivity
6c7aa18238 Merge "Introduce schema::get_partitioner" from Piotr
"
Introduce schema::get_partitioner and use it instead of dht::global_partitioner.

Fixes #5493

Tests: unit(dev, release, debug)
"

* 'per_table_partitioner_prep' of https://github.com/haaawk/scylla: (35 commits)
  cdc: stop using partitioners
  partitioner_test: stop calling set_global_partitioner
  storage_service: stop calling global_partitioner()
  mutation_writer_test: stop calling global_partitioner()
  schema: reduce number of global_partitioner() calls
  test_services: stop calling global_partitioner()
  sstable_utils: stop calling global_partitioner()
  sstable_resharding_test: stop depending on global partitioner
  sstable_mutation_test: stop calling global_partitioner()
  sstable_data_file_test: stop calling global_partitioner()
  random_schema: stop taking partitioner in constructor
  mutation_reader_test: stop calling global_partitioner()
  multishard_mutation_query_test: stop calling global_partitioner()
  row_level repair: stop calling global_partitioner()
  distribute_reader_and_consume_on_shards: don't take partitioner
  thrift: reduce global_partitioner() calls
  binary_search: stop calling global_partitioner()
  index_entry: stop calling global_partitioner()
  mc writer: stop calling global_partitioner()
  sstable: stop calling global_partitioner()
  ...
2020-02-17 18:12:53 +02:00
Piotr Dulikowski
01084a79b8 hh: send orphaned hints on HINT_MUTATION verb
When replaying a hint with a destination node that is no longer in the
cluster, it will be sent with cl=ALL to all its new replicas. Before
this patch, the MUTATION verb was used, which causes such hints to be
handled on the same connection and with the same priority as regular
writes. This can cause problems when a large number of hints is
orphaned and they are scheduled to be sent at once. Such situation
may happen when replacing a dead node - all nodes that accumulated hints
for the dead node will now send them with cl=ALL to their new replicas.

This patch changes the verb used to send such hints to HINT_MUTATION.
This verb is handled on a separate connection and with streaming
scheduling group, which gives them similar priority to non-orphaned
hints.

Refs: #4712

Tests: unit(dev)
2020-02-17 14:45:22 +01:00
Piotr Jastrzebski
2d7532f87f dht: add dht::get_token
and replace all calls to dht::global_partitioner().get_token

dht::get_token is better because it takes schema and uses it
to obtain partitioner instead of using a global partitioner.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-02-17 10:59:15 +01:00
Pavel Emelyanov
d1775dd701 utils: Move disk-error-handler into it
The disk-error-handler is purely auxiliary thing that helps
propagating IO errors to the rest of the code. It well
deserves not sitting in the root namespace.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20200207112443.18475-1-xemul@scylladb.com>
2020-02-09 17:26:52 +02:00
Rafael Ávila de Espíndola
e4b8f52237 commitlog: Simplify the return of read_log_file
This function really just wants to signal it is done, so return a
future<>.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200128172847.31513-1-espindola@scylladb.com>
2020-01-30 12:00:29 +02:00
Asias He
343986a70b gossiper: Introduce gossip STATUS_UNKNOWN
When a node does not have gossip STATUS application_state, we currently
use an empty string to present such state in get_gossip_status.

It is better to use an explicit "UNKNOWN" to present it. It makes the
log easier to understand when the status is unknown.

 Before:

   'gossip - InetAddress n2 is now UP, status ='

 After:

   'gossip - InetAddress n2 is now UP, status = UNKNOWN'

This patch is safe because the STATUS_UNKNOWN is never sent over the
cluster. So the presentation is only internal to the node.

Fixes #5520
2020-01-20 10:59:14 +02:00
Gleb Natapov
e0bc4aa098 commitlog: add sync method to entry_writer
If the method returns true commitlog should sync to file immediately
after writing the entry and wait for flush to complete before returning.
2020-01-15 12:15:42 +02:00
Benny Halevy
d1bcb39e7f hinted handoff: log message after removing hints directory (#5372)
To be used by dtest as an indicator that endpoint's hints
were drained and hints directory is removed.

Refs #5354

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2019-12-12 01:16:19 +02:00
Piotr Dulikowski
77d2ceaeba storage_proxy: handle hints through separate rpc verb 2019-12-05 00:51:52 +01:00
Piotr Sarna
9c5a5a5ac2 treewide: add names to semaphores
By default, semaphore exceptions bring along very little context:
either that a semaphore was broken or that it timed out.
In order to make debugging easier without introducing significant
runtime costs, a notion of named semaphore is added.
A named semaphore is simply a semaphore with statically defined
name, which is present in its errors, bringing valuable context.
A semaphore defined as:

  auto sem = semaphore(0);

will present the following message when it breaks:
"Semaphore broken"
However, a named semaphore:

  auto named_sem = named_semaphore(0, named_semaphore_exception_factory{"io_concurrency_sem"});

will present a message with at least some debugging context:

  "Semaphore broken: io_concurrency_sem"

It's not much, but it would really help in pinpointing bugs
without having to inspect core dumps.

At the same time, it does not incur any costs for normal
semaphore operations (except for its creation), but instead
only uses more CPU in case an error is actually thrown,
which is considered rare and not to be on the hot path.

Refs #4999

Tests: unit(dev), manual: hardcoding a failure in view building code
2019-11-26 15:14:21 +02:00
Avi Kivity
623071020e commitlog: change variadic stream in read_log_file to future<struct>
Since seastar::streams are based on future/promise, variadic streams
suffer the same fate as variadic futures - deprecation and eventual
removal.

This patch therefore replaces a variadic stream in commitlog::read_log_file()
with a non-variadic stream, via a helper struct.

Tests: unit (dev)
2019-10-29 19:25:12 +01:00
Avi Kivity
3cb081eb84 Merge " hinted handoff: fix races during shutdown and draining" from Vlad
"
Fix races that may lead to use-after-free events and file system level exceptions
during shutdown and drain.

The root cause of use-after-free events in question is that space_watchdog blocks on
end_point_hints_manager::file_update_mutex() and we need to make sure this mutex is alive as long as
it's accessed even if the corresponding end_point_hints_manager instance
is destroyed in the context of manager::drain_for().

File system exceptions may occur when space_watchdog attempts to scan a
directory while it's being deleted from the drain_for() context.
In case of such an exception new hints generation is going to be blocked
- including for materialized views, till the next space_watchdog round (in 1s).

Issues that are fixed are #4685 and #4836.

Tested as follows:
 1) Patched the code in order to trigger the race with (a lot) higher
    probability and running slightly modified hinted handoff replace
    dtest with a debug binary for 100 times. Side effect of this
    testing was discovering of #4836.
 2) Using the same patch as above tested that there are no crashes and
    nodes survive stop/start sequences (they were not without this series)
    in the context of all hinted handoff dtests. Ran the whole set of
    tests with dev binary for 10 times.
"

* 'hinted_handoff_race_between_drain_for_and_space_watchdog_no_global_lock-v2' of https://github.com/vladzcloudius/scylla:
  hinted handoff: fix a race on a directory removal between space_watchdog and drain_for()
  hinted handoff: make taking file_update_mutex safe
  db::hints::manager::drain_for(): fix alignment
  db::hints::manager: serialize calls to drain_for()
  db::hints: cosmetics: identation and missing method qualifier
2019-10-03 14:38:00 +03:00
Botond Dénes
fddd9a88dd treewide: silence discarded future warnings for legit discards
This patch silences those future discard warnings where it is clear that
discarding the future was actually the intent of the original author,
*and* they did the necessary precautions (handling errors). The patch
also adds some trivial error handling (logging the error) in some
places, which were lacking this, but otherwise look ok. No functional
changes.
2019-08-26 18:54:44 +03:00
Vlad Zolotarov
d253846c91 hinted handoff: fix a race on a directory removal between space_watchdog and drain_for()
The endpoint directories scanned by space_watchdog may get deleted
by the manager::drain_for().

If a deleted directory is given to a lister::scan_dir() this will end up
in an exception and as a result a space_watchdog will skip this round
and hinted handoff is going to be disabled (for all agents including MVs)
for the whole space_watchdog round.

Let's make sure this doesn't happen by serializing the scanning and deletion
using end_point_hints_manager::file_update_mutex.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2019-08-20 11:46:46 -04:00
Vlad Zolotarov
b34c36baa2 hinted handoff: make taking file_update_mutex safe
end_point_hints_manager::file_update_mutex is taken by space_watchdog
but while space_watchdog is waiting for it the corresponding
end_point_hints_manager instance may get destroyed by manager::drain_for()
or by manager::stop().

This will end up in a use-after-free event.

Let's change the end_point_hints_manager's API in a way that would prevent
such an unsafe locking:

   - Introduce the with_file_update_mutex().
   - Make end_point_hints_manager::file_update_mutex() method private.

Fixes #4685
Fixes #4836

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2019-08-20 11:26:19 -04:00
Vlad Zolotarov
dbad9fcc7d db::hints::manager::drain_for(): fix alignment
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2019-08-20 10:58:36 -04:00
Vlad Zolotarov
7a12b46fc9 db::hints::manager: serialize calls to drain_for()
If drain_for() is running together with itself: one instance for the local
node and one for some other node, erasing of elements from the _ep_managers
map may lead to a use-after-free event.

Let's serialize drain_for() calls with a semaphore.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2019-08-20 10:58:36 -04:00
Vlad Zolotarov
09600f1779 db::hints: cosmetics: identation and missing method qualifier
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2019-08-20 10:58:36 -04:00
Gleb Natapov
6a4207f202 Pass service permit to storage_proxy
Current cql transport code acquire a permit before processing a query and
release it when the query gets a reply, but some quires leave work behind.
If the work is allowed to accumulate without any limit a server may
eventually run out of memory. To prevent that the permit system should
account for the background work as well. The patch is a first step in
this direction. It passes a permit down to storage proxy where it will
be later hold by background work.
2019-08-12 10:20:43 +03:00
Piotr Sarna
ac7531d8d9 db,hints: decouple in-flight hints limits from resource manager
The resource manager is used to manage common resources between
various hints managers. In-flight hints used to be one of the shared
resources, but it proves to cause starvation, when one manager eats
the whole limit - which may be especially painful if the background
materialized views hints manager starves the regular hints manager,
which can in turn start failing user writes because of admission control.
This patch makes the limit per-manager again,
which effectively reverts the limit to its original behavior.

Fixes #4483
Message-Id: <8498768e8bccbfa238e6a021f51ec0fa0bf3f7f9.1559649491.git.sarna@scylladb.com>
2019-07-12 19:21:26 +03:00
Vlad Zolotarov
f07c341efc hints_manager: rename the state::ep_state_is_not_normal enum value
Rename this state value to better reflect the reality:
state::ep_state_is_not_normal -> state::ep_state_left_the_ring

The manager gets to this state when the destination Node has left the ring.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2019-05-08 15:46:47 -04:00
Vlad Zolotarov
93ba700458 hinted handoff: fix the logic that detects that the destination node is in DN state
When node is in a DN state its gossiper state may be NORMAL, SHUTDOWN
or "" depending on the use case.

In addition to that if node has been removed from the ring its state is
also going to be removed from the gossiper_state map.

Let's consider the above when deciding if node is in the DN state.

Fixes #4461

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2019-05-08 14:53:01 -04:00
Vlad Zolotarov
274b9d8069 hinted_handoff: sender::can_send(): optimize gossiper::is_alive(ep) check
gossiper::is_alive() has a lot of not needed checks (e.g. is_me(ep)) that
are irrelevant for HH use case and we may safely skip them.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2019-04-25 23:16:07 -04:00
Vlad Zolotarov
74b4076ceb hinted handoff: end_point_hints_manager::sender: use _gossiper instead of _shard_manager.local_gossiper()
sender has its own reference to the local gossiper - use it.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2019-04-25 23:04:02 -04:00
Benny Halevy
5a99023d4a treewide: use lambda for io_check of *touch_directory
To prepare for a seastar change that adds an optional file_permissions
parameter to touch_directory and recursive_touch_directory.
This change messes up the call to io_check since the compiler can't
derive the Func&& argument.  Therefore, use a lambda function instead
to wrap the call to {recursive_,}touch_directory.

Ref #4395

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20190421085502.24729-1-bhalevy@scylladb.com>
2019-04-21 12:04:39 +03:00
Vlad Zolotarov
db2ba0df61 hinted handoff: discard corrupted segments
If we discover that a current segment is corrupted there is nothing we
can do about it.

This patch does the following:
1) Drops the corrupted segment and moves to the next one.
2) Logs such events as ERRORs.
3) Introduces a new metrics that accounts such event.

Fixes #4364

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2019-04-09 15:54:20 -04:00
Vlad Zolotarov
00fe2acb35 hinted handoff: disable "reuse_segments"
Hinted handoff doesn't utilize this feature (which was developed with a
commitlog in mind).
Since it's enabled by default we need to explicitly disable it.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2019-04-09 11:13:41 -04:00