Commit Graph

2229 Commits

Author SHA1 Message Date
Piotr Dulikowski
d41d39bbcd hints: add functions for creating and waiting for sync points
Adds functions that create per-shard sync points and wait for them.
2021-08-09 09:24:36 +02:00
Piotr Dulikowski
e18b29765a hints: add hint sync point structure
Adds a sync_point structure. A sync point is a (possibly incomplete)
mapping from hint queues to a replay position in each queue. Users will
be able to create sync points consisting of the last written positions
of some hint queues, so that they can wait until hint replay in all of
those queues reaches that point.

The sync point supports serialization - first it is serialized with the
help of IDL to a binary form, and then converted to a hexadecimal
string. Deserialization is also possible.
2021-08-09 09:24:36 +02:00
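The sync point described in e18b29765a can be pictured with a minimal sketch. Everything here is an assumption for illustration: the `replay_position` and `sync_point` types, the use of an endpoint string as the queue key, and the byte layout are hypothetical, and the real code serializes through IDL before hex-encoding, whereas this sketch writes the hexadecimal form directly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a commitlog replay position.
struct replay_position {
    uint64_t segment_id = 0;
    uint32_t offset = 0;
    bool operator==(const replay_position& o) const {
        return segment_id == o.segment_id && offset == o.offset;
    }
};

// Hypothetical sync point: hint queue (keyed by endpoint name) -> position.
using sync_point = std::map<std::string, replay_position>;

// Appends the low `bytes` bytes of v, big-endian, as lowercase hex.
inline void put_hex(std::string& out, uint64_t v, int bytes) {
    static const char digits[] = "0123456789abcdef";
    for (int i = bytes - 1; i >= 0; --i) {
        unsigned b = static_cast<unsigned>((v >> (8 * i)) & 0xff);
        out.push_back(digits[b >> 4]);
        out.push_back(digits[b & 0xf]);
    }
}

// Reads `bytes` bytes of hex starting at pos and advances pos.
inline uint64_t get_hex(const std::string& in, std::size_t& pos, int bytes) {
    if (pos + 2 * static_cast<std::size_t>(bytes) > in.size()) {
        throw std::invalid_argument("truncated sync point");
    }
    uint64_t v = 0;
    for (int i = 0; i < 2 * bytes; ++i) {
        char c = in[pos++];
        unsigned d;
        if (c >= '0' && c <= '9') d = c - '0';
        else if (c >= 'a' && c <= 'f') d = c - 'a' + 10;
        else throw std::invalid_argument("bad hex digit");
        v = (v << 4) | d;
    }
    return v;
}

// Layout (an assumption): entry count, then per entry the key length,
// key bytes, segment id and offset.
inline std::string encode(const sync_point& sp) {
    std::string out;
    put_hex(out, sp.size(), 4);
    for (const auto& [endpoint, rp] : sp) {
        put_hex(out, endpoint.size(), 4);
        for (unsigned char c : endpoint) put_hex(out, c, 1);
        put_hex(out, rp.segment_id, 8);
        put_hex(out, rp.offset, 4);
    }
    return out;
}

inline sync_point decode(const std::string& in) {
    std::size_t pos = 0;
    sync_point sp;
    uint64_t count = get_hex(in, pos, 4);
    for (uint64_t i = 0; i < count; ++i) {
        uint64_t len = get_hex(in, pos, 4);
        std::string endpoint;
        for (uint64_t j = 0; j < len; ++j) {
            endpoint.push_back(static_cast<char>(get_hex(in, pos, 1)));
        }
        replay_position rp;
        rp.segment_id = get_hex(in, pos, 8);
        rp.offset = static_cast<uint32_t>(get_hex(in, pos, 4));
        sp[endpoint] = rp;
    }
    return sp;
}
```

A hexadecimal string is convenient here because it can be handed to a user and passed back later without escaping concerns; decoding an encoded sync point should yield an equal map.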
Piotr Dulikowski
70df9973f3 hints: make it possible to wait until hints are replayed
Adds the infrastructure needed to wait, for a given endpoint manager,
until hints are replayed up to a specified position. An abort source
must be specified which, if triggered, cancels waiting for hint replay.

If the endpoint manager is stopped, current waiters are dismissed with
an exception.
2021-08-09 09:24:36 +02:00
Piotr Dulikowski
93f244426d hints: track the RP of the last replayed position
Keeps track of a position which serves as an upper bound for positions
of already replayed hints - i.e. all hints with replay positions
strictly lower than it are considered replayed.

In order to accurately track this bound during hint replay, a std::map
is introduced which contains positions of hints which are currently
being sent.
2021-08-09 09:24:36 +02:00
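The bound tracking described in 93f244426d can be sketched roughly as follows. This is not the actual implementation: `replay_tracker`, its method names, and the flattening of a replay position to a single integer are all hypothetical, and the sketch assumes hints start sending in increasing position order and ignores failures.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical: a replay position flattened to one integer.
using rp_t = uint64_t;

class replay_tracker {
    // Positions of hints currently being sent, with a count per position.
    std::map<rp_t, unsigned> _in_flight;
    // One past the highest position handed to the sender so far.
    rp_t _next_unsent = 0;
public:
    void on_send_start(rp_t rp) {
        ++_in_flight[rp];
        if (rp + 1 > _next_unsent) {
            _next_unsent = rp + 1;
        }
    }

    void on_send_done(rp_t rp) {
        auto it = _in_flight.find(rp);
        if (it != _in_flight.end() && --it->second == 0) {
            _in_flight.erase(it);
        }
    }

    // Every hint with a position strictly lower than the returned value
    // is considered replayed.
    rp_t replayed_bound() const {
        // std::map is ordered, so begin() is the lowest in-flight position.
        return _in_flight.empty() ? _next_unsent : _in_flight.begin()->first;
    }
};
```

An ordered map makes the lowest in-flight position cheap to read, which is what keeps the bound accurate when hints complete out of order: the bound cannot pass a position that is still being sent.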
Piotr Dulikowski
03e2e671cd hints: track the RP of the last written hint
The position of the last written hint is now tracked by the endpoint
hints manager.

When the manager is constructed and no hints are replayed yet, the last
written hint position is initialized to the beginning of a fake segment
with ID corresponding to the current number of milliseconds since the
epoch. This choice makes sure that, in case a new hint sync point is
created before any hints are written, the position recorded for that
hint queue will be larger than all replay positions in segments
currently stored on disk.
2021-08-09 09:24:36 +02:00
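The initialization described in 03e2e671cd might look like the sketch below. The `replay_position` struct, its ordering, and the `initial_last_written_rp` helper are hypothetical, and the comparison only makes the point under the assumption that IDs of segments already on disk are smaller than the current millisecond timestamp.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <tuple>

// Hypothetical stand-in for a commitlog replay position.
struct replay_position {
    uint64_t segment_id;
    uint32_t offset;
};

// Orders positions by segment first, then by offset within the segment.
inline bool operator<(const replay_position& a, const replay_position& b) {
    return std::tie(a.segment_id, a.offset) < std::tie(b.segment_id, b.offset);
}

// Beginning of a fake segment whose ID is the number of milliseconds
// since the epoch at the given time.
inline replay_position initial_last_written_rp(
        std::chrono::system_clock::time_point now) {
    using namespace std::chrono;
    auto ms = duration_cast<milliseconds>(now.time_since_epoch()).count();
    return replay_position{static_cast<uint64_t>(ms), 0};
}
```

With this starting value, a sync point created before any hint is written still records a position that compares greater than positions in older segments, so waiting on it covers everything already on disk.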
Piotr Dulikowski
27d0d598fd hints: change last_attempted_rp to last_succeeded_rp
Instead of tracking the last position for which hint sending is
attempted, the last successfully replayed position is tracked.

The previous variable was used to calculate the position from which hint
replay should restart in case of an error, in the following way:

    _last_not_complete_rp = ctx_ptr->first_failed_rp.value_or(
        ctx_ptr->last_attempted_rp.value_or(_last_not_complete_rp));

Now, this formula uses the last_succeeded_rp in place of
last_attempted_rp. This change does not have an effect on the choice of
the starting position of the next retry:

- If the hint at `last_attempted_rp` has succeeded, in the new algorithm
  the same position will be recorded in `last_succeeded_rp`, and the
  formula will yield the same result.
- If the hint at `last_attempted_rp` has failed, it will be accounted
  for in `first_failed_rp`, so the formula will yield the same result.

The motivation for this change is that in the next commits of this PR we
will start tracking the position of the last replayed hint per hint
queue, and the meaning of the new variable makes it more useful - when
there are no failed hints in the hint sending attempt, last_succeeded_rp
gives us information that hints _up to this position_ were replayed; the
last_attempted_rp variable can only tell us that hints _before that
position_ were replayed successfully.
2021-08-09 09:24:36 +02:00
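The equivalence argued in 27d0d598fd can be checked with a small model. This is only a sketch: `outcome`, `send_ctx`, `run_attempt` and the two `restart_*` helpers are hypothetical, positions are plain integers, and hints are assumed to be attempted one after another in increasing position order.

```cpp
#include <cassert>
#include <optional>
#include <vector>

using rp_t = int; // hypothetical: replay position as a plain integer

struct outcome {
    rp_t rp;
    bool ok;
};

// Tracks both the old and the new variable so they can be compared.
struct send_ctx {
    std::optional<rp_t> first_failed_rp;
    std::optional<rp_t> last_attempted_rp;
    std::optional<rp_t> last_succeeded_rp;
};

// Simulates one hint sending attempt over hints in increasing rp order.
inline send_ctx run_attempt(const std::vector<outcome>& hints) {
    send_ctx ctx;
    for (const outcome& h : hints) {
        ctx.last_attempted_rp = h.rp;
        if (h.ok) {
            ctx.last_succeeded_rp = h.rp;
        } else if (!ctx.first_failed_rp) {
            ctx.first_failed_rp = h.rp;
        }
    }
    return ctx;
}

// Old formula: falls back to the last attempted position.
inline rp_t restart_old(const send_ctx& ctx, rp_t last_not_complete_rp) {
    return ctx.first_failed_rp.value_or(
        ctx.last_attempted_rp.value_or(last_not_complete_rp));
}

// New formula: falls back to the last succeeded position.
inline rp_t restart_new(const send_ctx& ctx, rp_t last_not_complete_rp) {
    return ctx.first_failed_rp.value_or(
        ctx.last_succeeded_rp.value_or(last_not_complete_rp));
}
```

In this model the two formulas can only differ when no hint failed, and in that case every attempted hint succeeded, so the last attempted and last succeeded positions coincide.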
Piotr Dulikowski
08a7d79ffc hints: rearrange error handling logic for hint sending
Instead of calling the `on_hint_send_failure` method inside the hint
sending task in places where an error occurs, we now let the exceptions
be returned and handle them inside a single `then_wrapped` attached to
the hint sending task.

Apart from the `then_wrapped`, there is one more place which calls
`on_hint_send_failure` - in the exception handler for the future which
spawns the asynchronous hint sending task. It needs to be kept separate
because it is a part of a separate task.
2021-08-09 09:24:36 +02:00
Piotr Dulikowski
45b04c94e0 hints: sort segments by ID, divide into foreign and local
The endpoint hints manager keeps a commitlog instance which is used to
write hints into new segments. This instance is re-created every 10
seconds, which causes the previous instance to leave its segments on
disk.

On the other hand, the hints sender keeps a list of segments to replay
which is updated only after it becomes empty. The list is repopulated
with segments returned by the commitlog::get_segments_to_replay() method,
which does not specify the order of the segments returned.

As a preparation for the upcoming hint sync points feature, this commit
changes the order in which segments are replayed:

- First, segments written by other shards are replayed. Such segments
  may appear in the queue because of segment rebalancing which is done
  at startup.
  The purpose of replaying "foreign" segments first is that they are
  problematic for hint sync points. For each hint queue, a hint sync
  point encodes a replay position of the last written hint on the local
  shard. Accounting for foreign segments precisely would make the
  implementation more complicated. To make things simpler, waiting for
  sync points will always make sure that all foreign segments are
  replayed. This might sometimes cause more hints to be waited on than
  necessary if a restart occurs in the meantime.
- Segments written by the local shard are replayed later, in order of
  their IDs. This makes sure that local hints are replayed in the order
  they were written to segments, and will make it possible to use replay
  positions to track progress of hint replay.
2021-08-09 09:24:36 +02:00
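The replay order introduced in 45b04c94e0 can be sketched as a partition followed by a sort. The `segment` struct and the `replay_order` function are hypothetical; in particular, how the real code determines which shard wrote a segment is not shown in this log, so the sketch simply stores the shard number next to the ID.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of a hint segment file.
struct segment {
    unsigned shard;   // shard that wrote the segment
    uint64_t id;      // segment ID
    std::string path; // file name, for illustration only
};

// Returns the segments in replay order: segments written by other shards
// first, then local segments sorted by ID.
inline std::vector<segment> replay_order(std::vector<segment> segs,
                                         unsigned this_shard) {
    // Move foreign segments to the front, keeping their relative order.
    auto first_local = std::stable_partition(
        segs.begin(), segs.end(),
        [this_shard](const segment& s) { return s.shard != this_shard; });
    // Local segments are replayed in the order they were written.
    std::sort(first_local, segs.end(),
              [](const segment& a, const segment& b) { return a.id < b.id; });
    return segs;
}
```

Sorting only the local part reflects the reasoning in the commit message: local replay positions need a well-defined order to track progress, while foreign segments are simply drained before them.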
Piotr Dulikowski
f83699bb7c Revert "db/hints: allow to forcefully update segment list on flush"
This reverts commit e48739a6da.

This commit removes the functionality from the endpoint hints manager
which made it possible to flush hints immediately and forcefully update
the list of segments to replay.

The new implementation of waiting for hints will be based on replay
positions returned by the commitlog API and it won't be necessary to
forcefully update the segment list when creating a sync point.
2021-08-09 09:24:36 +02:00
Piotr Dulikowski
9c1d4e7e6c Revert "db/hints: add a metric for counting processed files"
This reverts commit 5a49fe74bb.

This commit removes a metric which tracks how many segments were
replayed during the current runtime. It was necessary for the current
"wait for hints" mechanism, which is being replaced with a different
one, so the metric can be removed.
2021-08-09 09:24:36 +02:00
Piotr Dulikowski
3b851a5ebd Revert "db/hints: make it possible to wait until current hints are sent"
This reverts commit 427bbf6d86.

This commit removes the infrastructure which makes it possible to wait
until current hints are replayed in a given hint queue.

It will be replaced with a different mechanism in later commits.
2021-08-09 09:24:36 +02:00
Piotr Dulikowski
4a35d138f6 Revert "storage_proxy: add functions for syncing with hints queue"
This reverts commit 244738b0d5.

This commit removes create_hint_queue_sync_point and
check_hint_queue_sync_point functions from storage_proxy, which were
used to wait until local hints are sent out to particular nodes.

Similar methods will be reintroduced later in this PR, with a completely
different implementation.
2021-08-09 09:24:36 +02:00
Piotr Dulikowski
0d74dee683 Revert "messaging_service: add verbs for hint sync points"
This reverts commit 82c419870a.

This commit removes the HINT_SYNC_POINT_CREATE and HINT_SYNC_POINT_CHECK
rpc verbs.

The upcoming HTTP API for waiting for hint replay will be restricted
to waiting for hints on the node handling the request, so there is no
need for new verbs.
2021-08-09 09:24:36 +02:00
Piotr Dulikowski
ff453d80ff Revert "config: add wait_for_hint_replay_before_repair option"
This reverts commit 86d831b319.

This commit removes the wait_for_hints_before_repair option. Because a
previous commit in this series removes the logic from repair which
caused it to wait for hints to be replayed, this option is now useless.

We can safely remove this option because it is not present in any
release yet.
2021-08-09 09:24:36 +02:00
Piotr Dulikowski
e3c32c897a Revert "hints: dismiss segment waiters when hint queue can't send"
This reverts commit 9d68824327.

First, we are reverting existing infrastructure for waiting for hints in
order to replace it with a different one, therefore this commit needs to
be reverted as well.

Second, errors during hint replay can occur naturally and don't
necessarily indicate that no progress can be made - for example, when
the target node is heavily loaded and some hints time out. The "waiting
for hints" operation becomes a user-issued command, so it's not as vital
to ensure liveness.
2021-08-09 09:06:23 +02:00
Avi Kivity
3b5e312800 db: schema_tables: clean up read_schema_partition_for_keyspace() coroutine captures
read_schema_partition_for_keyspace() copies some parameters to capture them
in a coroutine, but the same can be achieved more cleanly by changing the
reference parameters to value parameters, so do that.

Test: unit (dev)

Closes #9154
2021-08-08 12:55:10 +03:00
Asias He
6350a19f73 compaction: Move compaction_strategy.hh to compaction dir
The top dir is a mess. Move compaction_strategy.hh and
compaction_strategy_type.hh to the new home.
2021-08-07 08:06:37 +08:00
Avi Kivity
885ca2158e db: schema_tables: reindent
Following conversion to coroutines in fc91e90c59, remove extra
indents and braces left to make the change clearer.

One variable had to be renamed since without the braces it
duplicated another variable in the same block.

Test: unit (dev)

Closes #9125
2021-08-02 22:36:57 +02:00
Nadav Har'El
fc91e90c59 Merge 'db: schema_tables: coroutinize' from Avi Kivity
schema_tables is quite hairy, but can be easily simplified with coroutines.

In addition to switching future-returning functions to coroutines, we also
switch Seastar threads to coroutines. This is less of a clear-cut win; the
motivation is to reduce the chances of someone calling a function that
expects to run in a thread from a non-thread context. This sometimes works
by accident, but when it doesn't, it's pretty bad. So a uniform calling convention
has some benefit.

I left the extra indents in, since the indent-fixing patch is hard to rebase in case
a rebase is needed. I will follow up with an indent fix post merge.

Test: unit (dev, debug, release)

Closes #9118

* github.com:scylladb/scylla:
  db: schema_tables: drop now redundant #includes
  db: schema_tables: coroutinize drop_column_mapping()
  db: schema_tables: coroutinize column_mapping_exists()
  db: schema_tables: coroutinize get_column_mapping()
  db: schema_tables: coroutinize read_table_mutations()
  db: schema_tables: coroutinize create_views_from_schema_partition()
  db: schema_tables: coroutinize create_views_from_table_row()
  db: schema_tables: unpeel lw_shared_ptr in create_Tables_from_tables_partition()
  db: schema_tables: coroutinize create_tables_from_tables_partition()
  db: schema_tables: coroutinize create_table_from_name()
  db: schema_tables: coroutinize read_table_mutations()
  db: schema_tables: coroutinize merge_keyspaces()
  db: schema_tables: coroutinize do_merge_schema()
  db: schema_tables: futurize and coroutinize merge_functions()
  db: schema_tables: futurize and coroutinize user_types_to_drop::drop
  db: schema_tables: futurize and coroutinize merge_types()
  db: schema_tables: futurize and coroutinize merge_tables_and_views()
  db: schema_tables: coroutinize store_column_mapping()
  db: schema_tables: futurize and coroutinize read_tables_for_keyspaces()
  db: schema_tables: coroutinize read_table_names_of_keyspace()
  db: schema_tables: coroutinize recalculate_schema_version()
  db: schema_tables: coroutinize merge_schema()
  db: schema_tables: introduce and use with_merge_lock()
  db: schema_tables: coroutinize update_schema_version_and_announce()
  db: schema_tables: coroutinize read_keyspace_mutation()
  db: schema_tables: coroutinize read_schema_partition_for_table()
  db: schema_tables: coroutinize read_schema_partition_for_keyspace()
  db: schema_tables: coroutinize query_partition_mutation()
  db: schema_tables: coroutinize read_schema_for_keyspaces()
  db: schema_tables: coroutinize convert_schema_to_mutations()
  db: schema_tables: coroutinize calculate_schema_digest()
  db: schema_tables: coroutinize save_system_schema()
2021-08-02 13:43:53 +03:00
Tomasz Grabiec
c3ada1a145 Merge "count row (sstables/row cache/memtables) and range (memtables) tombstone reads" from Michael
Fixes #7749.
2021-08-01 23:13:18 +02:00
Avi Kivity
ca59754e68 db: schema_tables: drop now redundant #includes
2021-08-01 20:13:15 +03:00
Avi Kivity
40fdbf9558 db: schema_tables: coroutinize drop_column_mapping()
2021-08-01 20:13:15 +03:00
Avi Kivity
7d46300af2 db: schema_tables: coroutinize column_mapping_exists()
2021-08-01 20:13:15 +03:00
Avi Kivity
74b2200f4d db: schema_tables: coroutinize get_column_mapping()
2021-08-01 20:13:15 +03:00
Avi Kivity
f19ca7aaaa db: schema_tables: coroutinize read_table_mutations()
2021-08-01 20:13:15 +03:00
Avi Kivity
81a2be17b6 db: schema_tables: coroutinize create_views_from_schema_partition()
2021-08-01 20:13:15 +03:00
Avi Kivity
15f2fd2a23 db: schema_tables: coroutinize create_views_from_table_row()
2021-08-01 20:13:15 +03:00
Avi Kivity
0843d441ff db: schema_tables: unpeel lw_shared_ptr in create_Tables_from_tables_partition()
The tables local is a lw_shared_ptr which is created and then dereferenced
before returning. It can be unpeeled to the pointed-to type, resulting in
one less allocation.
2021-08-01 20:13:15 +03:00
Avi Kivity
66054d24c4 db: schema_tables: coroutinize create_tables_from_tables_partition()
2021-08-01 20:13:15 +03:00
Avi Kivity
82ba3c5f4a db: schema_tables: coroutinize create_table_from_name()
2021-08-01 20:13:15 +03:00
Avi Kivity
862f491605 db: schema_tables: coroutinize read_table_mutations()
2021-08-01 20:13:15 +03:00
Avi Kivity
91c1a29808 db: schema_tables: coroutinize merge_keyspaces()
2021-08-01 20:13:15 +03:00
Avi Kivity
78fc05922b db: schema_tables: coroutinize do_merge_schema()
It is now using an internal thread, so unpeel it and replace
future::get() with co_await.
2021-08-01 20:13:15 +03:00
Avi Kivity
9680d9e76c db: schema_tables: futurize and coroutinize merge_functions()
Right now, merge_functions() expects to be called in a thread.
Remove that requirement by converting it into a coroutine and returning
a future.

De-threading helps reduce errors where something expects to be called
in a thread, but isn't.
2021-08-01 20:13:15 +03:00
Avi Kivity
9cbae212bf db: schema_tables: futurize and coroutinize user_types_to_drop::drop
user_types_to_drop::drop is a function object returning void, and expecting
to be called in a thread. Make it return a future and convert the
only value it is initialized with into a coroutine.

De-threading helps reduce errors where something expects to be called
in a thread, but isn't.
2021-08-01 20:13:15 +03:00
Avi Kivity
e5f28fc746 db: schema_tables: futurize and coroutinize merge_types()
Right now, merge_types() expects to be called in a thread.
Remove that requirement by converting it into a coroutine and returning
a future.

The [[nodiscard]] attribute is moved from the function to the
return type, since the function now returns a future which is
nodiscard anyway.

The lambda returned is not coroutinized (yet) since it's part
of the user_types_to_drop inner function that still returns void
and expects to be called in a thread.

De-threading helps reduce errors where something expects to be called
in a thread, but isn't.
2021-08-01 20:13:15 +03:00
Avi Kivity
c9584d50ee db: schema_tables: futurize and coroutinize merge_tables_and_views()
Right now, merge_tables_and_views() expects to be called in a thread.
Remove that requirement by converting it into a coroutine and returning
a future.

De-threading helps reduce errors where something expects to be called
in a thread, but isn't.
2021-08-01 20:13:15 +03:00
Avi Kivity
80fe158387 db: schema_tables: coroutinize store_column_mapping()
2021-08-01 20:13:15 +03:00
Avi Kivity
ee8b02f437 db: schema_tables: futurize and coroutinize read_tables_for_keyspaces()
Right now, read_tables_for_keyspaces() expects to be called in a thread.
Remove that requirement by converting it into a coroutine and returning
a future.

De-threading helps reduce errors where something expects to be called
in a thread, but isn't.
2021-08-01 20:13:15 +03:00
Avi Kivity
cd1003daad db: schema_tables: coroutinize read_table_names_of_keyspace()
2021-08-01 20:13:15 +03:00
Avi Kivity
000f7eabd5 db: schema_tables: coroutinize recalculate_schema_version()
2021-08-01 20:13:15 +03:00
Avi Kivity
95d33e9e86 db: schema_tables: coroutinize merge_schema()
2021-08-01 20:13:15 +03:00
Avi Kivity
25548f46dd db: schema_tables: introduce and use with_merge_lock()
Rather than open-coding merge_lock()/merge_unlock() pairs, introduce
and use a helper. This helps in coroutinization, since coroutines
don't support RAII with destructors that wait.
2021-08-01 20:13:15 +03:00
Avi Kivity
7b731ae2c6 db: schema_tables: coroutinize update_schema_version_and_announce()
2021-08-01 20:13:15 +03:00
Avi Kivity
385e0dcc2e db: schema_tables: coroutinize read_keyspace_mutation()
2021-08-01 20:13:15 +03:00
Avi Kivity
ef5df86b1f db: schema_tables: coroutinize read_schema_partition_for_table()
2021-08-01 20:13:15 +03:00
Avi Kivity
8841c2ba10 db: schema_tables: coroutinize read_schema_partition_for_keyspace()
Two reference parameters are copied rather than changing the signature,
to avoid a compile-the-world. It can be cleaned up post-merge.
2021-08-01 20:09:00 +03:00
Michael Livshin
f364666d4a row_cache: count read row tombstones
Refs #7749.

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2021-08-01 19:41:11 +03:00
Avi Kivity
d1876488f7 db: schema_tables: coroutinize query_partition_mutation()
2021-08-01 19:17:13 +03:00
Avi Kivity
35f9caf6a9 db: schema_tables: coroutinize read_schema_for_keyspaces()
2021-08-01 19:17:09 +03:00