Commit Graph

2914 Commits

Benny Halevy
46e2a7c83b database: add truncate_table_on_all_shards
As a first step to decouple truncate from flush
and snapshot. (A sketch of the all-shards pattern follows below.)

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-08-07 12:53:05 +03:00
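Illustrative only: a minimal sketch of the all-shards pattern the commit title suggests, using seastar's `sharded<>` container. `database_sketch` and `truncate_locally` are stand-ins, not the actual Scylla API.

```
#include <seastar/core/future.hh>
#include <seastar/core/sharded.hh>
#include <seastar/core/sstring.hh>

struct database_sketch {  // stand-in for replica::database
    seastar::future<> truncate_locally(seastar::sstring ks, seastar::sstring cf) {
        return seastar::make_ready_future<>();  // per-shard truncate step
    }
};

// Run the per-shard truncate step on every shard; whether to flush or
// snapshot first becomes the caller's separate decision.
seastar::future<> truncate_table_on_all_shards(
        seastar::sharded<database_sketch>& db,
        seastar::sstring ks, seastar::sstring cf) {
    return db.invoke_on_all([ks, cf] (database_sketch& local) {
        return local.truncate_locally(ks, cf);
    });
}
```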
Botond Dénes
fbbe2529c1 Merge "Remove global snitch usage from consistency_level.cc" from Pavel Emelyanov
"
Several helpers in this .cc file need to get the datacenter for
endpoints. For that they use the global snitch, because there is no
other place to get that data from.

The whole dc/rack info is now moving to topology, so this set of
patches changes consistency_level.cc to get the topology instead. This
is done in two ways. First, the helpers that have a keyspace at hand
can get the topology via the keyspace's effective_replication_map.

Two difficult cases are db::is_local() and db::count_local_endpoints(),
because both have just an inet_address at hand. Those are turned into
methods of topology itself; all their callers already deal with token
metadata and can get the topology from it (see the sketch after this
entry).
"

* 'br-consistency-level-over-topology' of https://github.com/xemul/scylla:
  consistency_level: Remove is_local() and count_local_endpoints()
  storage_proxy: Use topology::local_endpoints_count()
  storage_proxy: Use proxy's topology for DC checks
  storage_proxy: Keep shared_ptr<proxy> on digest_read_resolver
  storage_proxy: Use topology local_dc_filter in its methods
  storage_proxy: Mark some digest_read_resolver methods private
  forwarding_service: Use topology local_dc_filter
  storage_service: Use topology local_dc_filter
  consistency_level: Use topology local_dc_filter
  consistency-level: Call count_local_endpoints from topology
  consistency_level: Get datacenter from topology
  replication_strategy: Remove hold snitch reference
  effective_replication_map: Get datacenter from topology
  topology: Add local-dc detection sugar
2022-08-05 13:31:55 +03:00
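A hedged sketch of the two lookup paths described above. All names (`get_effective_replication_map`, `get_topology`, `get_datacenter`) are illustrative and may not match the exact Scylla signatures.

```
#include <string>
#include <unordered_map>

using inet_address = std::string;  // stand-in for gms::inet_address

// Stand-in for locator::topology: dc/rack info now lives here.
struct topology_sketch {
    std::unordered_map<inet_address, std::string> dc;
    std::string local_dc;

    const std::string& get_datacenter(const inet_address& ep) const {
        return dc.at(ep);
    }
    // Formerly db::is_local(): now a method of topology itself,
    // reachable from the token metadata the callers already hold.
    bool is_local(const inet_address& ep) const {
        return dc.at(ep) == local_dc;
    }
};

// Path 1, for helpers with a keyspace at hand (accessor names assumed):
//   const auto& topo = ks.get_effective_replication_map()->get_topology();
//   auto dc = topo.get_datacenter(endpoint);
```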
Pavel Emelyanov
9c662ee0e5 storage_proxy: Use topology::local_endpoints_count()
A continuation of the previous patches -- now all the code that needs
this helper has a proxy pointer at hand.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-08-05 12:19:48 +03:00
Pavel Emelyanov
9a50d318b6 storage_proxy: Use proxy's topology for DC checks
Several proxy helper classes need to filter endpoints by datacenter.
Since they now have a shared_ptr<proxy> on board, they can get the
topology via the proxy's token metadata.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-08-05 12:19:48 +03:00
Pavel Emelyanov
183a2d5a83 storage_proxy: Keep shared_ptr<proxy> on digest_read_resolver
It will be needed to get the token metadata from the proxy. The
resolver in question is created and maintained by abstract_read_executor,
which already has a shared_ptr<proxy>, so it simply hands over a copy
(sketched after this entry).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-08-05 12:19:48 +03:00
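A sketch of the ownership change with simplified types: the resolver stores a copy of the executor's shared pointer, so the proxy is kept alive for as long as the resolver exists.

```
#include <memory>
#include <utility>

struct storage_proxy_sketch {};  // stand-in; holds token metadata etc.

class digest_read_resolver_sketch {
    std::shared_ptr<storage_proxy_sketch> _proxy;  // keeps the proxy alive
public:
    explicit digest_read_resolver_sketch(std::shared_ptr<storage_proxy_sketch> p)
        : _proxy(std::move(p)) {}
    // ...the resolver can now reach the token metadata via _proxy...
};

// abstract_read_executor already holds a shared_ptr, so constructing the
// resolver just copies it:
//   digest_read_resolver_sketch resolver(_proxy);
```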
Pavel Emelyanov
e1ea801b67 storage_proxy: Use topology local_dc_filter in its methods
The proxy has a token metadata pointer, so it can use its topology
reference to filter endpoints by datacenter (an illustrative sketch
follows this entry).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-08-05 12:19:47 +03:00
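What a `local_dc_filter`-style predicate might look like in use. The predicate shape is an assumption; only the idea (filter endpoints through a topology-provided functor) comes from the commits.

```
#include <functional>
#include <string>
#include <vector>

using inet_address = std::string;  // stand-in for gms::inet_address

// Assumed shape: the topology hands out a predicate matching the local DC.
using dc_filter = std::function<bool(const inet_address&)>;

// Keep only the endpoints the filter accepts (the local datacenter).
std::vector<inet_address> filter_local(const dc_filter& is_local_dc,
                                       std::vector<inet_address> eps) {
    std::erase_if(eps, [&] (const inet_address& ep) {
        return !is_local_dc(ep);
    });
    return eps;
}
```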
Pavel Emelyanov
6f515f852d storage_proxy: Mark some digest_read_resolver methods private
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-08-05 12:19:47 +03:00
Pavel Emelyanov
9a19414c62 forwarding_service: Use topology local_dc_filter
The service needs to filter out non-local endpoints. It carries a
token metadata pointer and can get the topology from it to fulfill
this goal.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-08-05 12:19:47 +03:00
Pavel Emelyanov
2423e1c642 storage_service: Use topology local_dc_filter
The storage-service API calls use the db::is_local() helper to filter
out tokens from non-local datacenters. In all those places the topology
is available from the token metadata pointer.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-08-05 12:19:47 +03:00
Pavel Emelyanov
527b345079 Merge 'storage_proxy: introduce a remote "subservice"' from Kamil Braun
Introduce a `remote` class that handles all remote communication in `storage_proxy`: sending and receiving RPCs, checking the state of other nodes by accessing the gossiper, and fetching schema.

The `remote` object lives inside `storage_proxy` and right now it's initialized and destroyed together with `storage_proxy`.

The long game here is to split the initialization of `storage_proxy` into two steps:
- the first step, which constructs `storage_proxy`, initializes it "locally" and does not require references to `messaging_service` and `gossiper`.
- the second step will take those references and add the `remote` part to `storage_proxy`.

This will allow us to remove some cycles from the service (de)initialization order and in general clean it up a bit. We'll be able to start `storage_proxy` right after the `database` (without messaging/gossiper). Similar refactors are planned for `query_processor`. (A condensed sketch of the two-step split follows this entry.)

Closes #11088

* github.com:scylladb/scylladb:
  service: storage_proxy: pass `migration_manager*` to `init_messaging_service`
  service: storage_proxy: `remote`: make `_gossiper` a const reference
  gms: gossiper: mark some member functions const
  db: consistency_level: `filter_for_query`: take `const gossiper&`
  replica: table: `get_hit_rate`: take `const gossiper&`
  gms: gossiper: move `endpoint_filter` to `storage_proxy` module
  service: storage_proxy: pass `shared_ptr<gossiper>` to `start_hints_manager`
  service: storage_proxy: establish private section in `remote`
  service: storage_proxy: remove `migration_manager` pointer
  service: storage_proxy: remove calls to `storage_proxy::remote()` from `remote`
  service: storage_proxy: remove `_gossiper` field
  alternator: ttl: pass `gossiper&` to `expiration_service`
  service: storage_proxy: move `truncate_blocking` implementation to `remote`
  service: storage_proxy: introduce `is_alive` helper
  service: storage_proxy: remove `_messaging` reference
  service: storage_proxy: move `connection_dropped` to `remote`
  service: storage_proxy: make `encode_replica_exception_for_rpc` a static function
  service: storage_proxy: move `handle_write` to `remote`
  service: storage_proxy: move `handle_paxos_prune` to `remote`
  service: storage_proxy: move `handle_paxos_accept` to `remote`
  service: storage_proxy: move `handle_paxos_prepare` to `remote`
  service: storage_proxy: move `handle_truncate` to `remote`
  service: storage_proxy: move `handle_read_digest` to `remote`
  service: storage_proxy: move `handle_read_mutation_data` to `remote`
  service: storage_proxy: move `handle_read_data` to `remote`
  service: storage_proxy: move `handle_mutation_failed` to `remote`
  service: storage_proxy: move `handle_mutation_done` to `remote`
  service: storage_proxy: move `handle_paxos_learn` to `remote`
  service: storage_proxy: move `receive_mutation_handler` to `remote`
  service: storage_proxy: move `handle_counter_mutation` to `remote`
  service: storage_proxy: remove `get_local_shared_storage_proxy`
  service: storage_proxy: (de)register RPC handlers in `remote`
  service: storage_proxy: introduce `remote`
2022-08-04 17:50:20 +03:00
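A condensed, hypothetical sketch of the two-step initialization described above; member names and `start_remote`/`stop_remote` are illustrative.

```
#include <memory>

namespace netw { class messaging_service; }
namespace gms { class gossiper; }

class storage_proxy_sketch {
    // All RPC/gossip/schema-fetch logic lives here.
    class remote {
    public:
        remote(storage_proxy_sketch&, netw::messaging_service&, gms::gossiper&) {
            // register RPC handlers here
        }
        ~remote() { /* unregister RPC handlers */ }
    };
    std::unique_ptr<remote> _remote;  // absent until step two
public:
    storage_proxy_sketch() = default;  // step one: local-only initialization

    // Step two: once messaging and gossip exist, attach the remote part.
    void start_remote(netw::messaging_service& ms, gms::gossiper& g) {
        _remote = std::make_unique<remote>(*this, ms, g);
    }
    void stop_remote() { _remote.reset(); }
};
```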
Kamil Braun
0a4e701b50 service: storage_proxy: pass migration_manager* to init_messaging_service
`migration_manager` lifetime is longer than the lifetime of "storage
proxy's messaging service part" - that is, `init_messaging_service` is
called after `migration_manager` is started, and `uninit_messaging_service`
is called before `migration_manager` is stopped. Thus we don't need to
hold an owning pointer to `migration_manager` here.

Later, when `init_messaging_service` actually constructs `remote`,
this will be a reference, not a pointer.

Also observe that `_mm` in `remote` is only used in handlers, and
handlers are unregistered before `_mm` is nullified, which ensures that
no handler is running when `_mm` is nullified (see the sketch after
this entry). This argument shows why the code made sense regardless of
our switch from shared_ptr to a raw pointer.
2022-08-04 12:19:43 +02:00
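A sketch of the ordering invariant from the last paragraph: handlers are torn down (and drained) before `_mm` is cleared, so no handler can observe a null pointer. Names are hypothetical.

```
#include <seastar/core/future.hh>

struct migration_manager_sketch {};

struct remote_sketch {
    migration_manager_sketch* _mm = nullptr;  // non-owning: lifetime guaranteed

    seastar::future<> unregister_rpc_handlers() {
        // drains in-flight handlers before resolving (sketch)
        return seastar::make_ready_future<>();
    }

    seastar::future<> uninit_messaging_service() {
        // 1. Unregister handlers and wait for running ones to finish...
        return unregister_rpc_handlers().then([this] {
            // 2. ...only then drop _mm: no handler can see it as null.
            _mm = nullptr;
        });
    }
};
```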
Kamil Braun
a08be82ce2 service: storage_proxy: remote: make _gossiper a const reference 2022-08-04 12:19:43 +02:00
Kamil Braun
566e5f2a4f gms: gossiper: move endpoint_filter to storage_proxy module
The function only uses one public function of `gossiper` (`is_alive`)
and is used only in one place in `storage_proxy`.

Make it a static function private to the `storage_proxy` module.

The function used a `default_random_engine` field in `gossiper` for
generating random numbers. Turn this field into a static `thread_local`
variable inside the function (shown below) - no other `gossiper` members
used the field.
2022-08-04 12:16:09 +02:00
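The `thread_local` pattern this commit describes, in isolation; `pick_random` is a made-up function name.

```
#include <cstddef>
#include <random>

// Each thread (shard) gets its own engine; no gossiper state is needed.
size_t pick_random(size_t n) {
    static thread_local std::default_random_engine re{std::random_device{}()};
    return std::uniform_int_distribution<size_t>(0, n - 1)(re);
}
```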
Kamil Braun
078900042f service: storage_proxy: pass shared_ptr<gossiper> to start_hints_manager
No need to call `_remote->gossiper().shared_from_this()` from within
storage_proxy now.
2022-08-04 12:16:09 +02:00
Kamil Braun
d9d10d87ec service: storage_proxy: establish private section in remote
Only the (un)init, send_*, and `is_alive` functions are public,
plus a getter for gossiper.
2022-08-04 12:16:05 +02:00
Kamil Braun
7364d453dd service: storage_proxy: remove migration_manager pointer
The ownership is passed to `remote`, which now contains
a `shared_ptr<migration_manager>`.
2022-08-04 12:15:36 +02:00
Kamil Braun
bcc22ed1dc service: storage_proxy: remove calls to storage_proxy::remote() from remote
Capture `this` in the lambdas.
2022-08-04 12:15:36 +02:00
Kamil Braun
eddd3b8226 service: storage_proxy: remove _gossiper field
Access `gossiper` through `_remote`.
Later, all those accesses will handle a missing `remote` (see the
sketch after this entry).

Note that there are also accesses through the `remote()` internal getter.

The plan is as follows:
- direct accesses through `_remote` will be modified to handle missing
  `_remote` (these won't cause an error)
- `remote()` will throw if `_remote` is missing (`remote()` is only used
  for operations which actually need to send a message to a remote node).
2022-08-04 12:15:35 +02:00
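A sketch of the planned split between the tolerant and the throwing accessor; the class shape and exception type are assumptions.

```
#include <memory>
#include <stdexcept>

struct remote_sketch {};

class storage_proxy_sketch {
    std::unique_ptr<remote_sketch> _remote;
public:
    // Tolerant path: direct `_remote` accesses check for absence first.
    bool has_remote() const noexcept { return _remote != nullptr; }

    // Throwing path: only for operations that must reach a remote node.
    remote_sketch& remote() {
        if (!_remote) {
            throw std::runtime_error("storage_proxy: remote part not started");
        }
        return *_remote;
    }
};
```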
Kamil Braun
ab946e392f alternator: ttl: pass gossiper& to expiration_service
This allows us to remove the `gossiper()` getter from `storage_proxy`.
2022-08-04 12:12:43 +02:00
Kamil Braun
242e31d56e service: storage_proxy: move truncate_blocking implementation to remote
The truncate operation always truncates a table on the entire cluster,
even for local tables. And it always does it by sending RPCs (the node
sends an RPC to itself too). Thus it fits in the remote class.

If we want to add a possibility to "truncate locally only" and/or change
the behavior for local tables, we can add a branch in
`storage_proxy::truncate_blocking`.

Refs: #11087
2022-08-04 12:12:43 +02:00
Kamil Braun
3e73de9a40 service: storage_proxy: introduce is_alive helper
A helper is introduced both in `remote` and in `storage_proxy`.

The `storage_proxy` one calls the `remote` one. In the future it will
also handle a missing `remote`: it will then report only the local node
as alive and all other nodes as dead (see the sketch after this entry).

The change reduces the number of functions using the `_gossiper` field
in `storage_proxy`.
2022-08-04 12:12:41 +02:00
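A sketch of the planned fallback: when `remote` is missing, only the local node is reported alive. Types and names are stand-ins.

```
#include <cstdint>

using inet_address = uint64_t;  // stand-in for gms::inet_address

struct remote_sketch {
    bool is_alive(inet_address) const { return true; }  // asks the gossiper
};

bool is_alive(const remote_sketch* r, inet_address ep, inet_address me) {
    if (!r) {
        return ep == me;     // remote missing: all other nodes reported dead
    }
    return r->is_alive(ep);  // normal path: delegate to remote/gossiper
}
```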
Botond Dénes
df203a48af Merge "Remove reconnectable_snitch_helper" from Pavel Emelyanov
"
The helper is in charge of receiving INTERNAL_IP app state from
gossiper join/change notifications, updating system.peers with it,
and kicking the messaging service to update its preferred-ip cache
and initiate client reconnection.

Effectively this helper duplicates the topology-tracking code in the
storage-service notifiers. Removing it means less code and drops a
bunch of unwanted cross-component dependencies, in particular:

- one qctx call is gone
- snitch (almost) no longer needs to get messaging from gossiper
- public:private IP cache becomes local to messaging and can be
  moved to topology at low cost

A nice minor side effect: this helper was never unsubscribed from the
gossiper on stop or on snitch rename. Now that's all gone.
"

* 'br-remove-reconnectible-snitch-helper-2' of https://github.com/xemul/scylla:
  snitch: Remove reconnectable snitch helper
  snitch, storage_service: Move reconnect to internal_ip kick
  snitch, storage_service: Move system.peers preferred_ip update
  snitch: Export prefer-local
2022-08-04 13:06:05 +03:00
Kamil Braun
2aff2fea00 service: storage_proxy: remove _messaging reference
All uses of `messaging_service&` have been moved to `remote`.
2022-08-02 19:55:12 +02:00
Kamil Braun
cf931c7863 service: storage_proxy: move connection_dropped to remote 2022-08-02 19:55:12 +02:00
Kamil Braun
2203d4fa09 service: storage_proxy: make encode_replica_exception_for_rpc a static function
No need for this ugly template to be part of the `storage_proxy` header.
2022-08-02 19:55:12 +02:00
Kamil Braun
3499bc7731 service: storage_proxy: move handle_write to remote
It is a helper used by `receive_mutation_handler`
and `handle_paxos_learn`.
2022-08-02 19:55:12 +02:00
Kamil Braun
ba88ad8db0 service: storage_proxy: move handle_paxos_prune to remote 2022-08-02 19:55:12 +02:00
Kamil Braun
548767f91e service: storage_proxy: move handle_paxos_accept to remote 2022-08-02 19:55:12 +02:00
Kamil Braun
807c7f32de service: storage_proxy: move handle_paxos_prepare to remote 2022-08-02 19:55:12 +02:00
Kamil Braun
0e431e7c03 service: storage_proxy: move handle_truncate to remote 2022-08-02 19:55:12 +02:00
Kamil Braun
f8c1ba357f service: storage_proxy: move handle_read_digest to remote 2022-08-02 19:55:12 +02:00
Kamil Braun
43997af40f service: storage_proxy: move handle_read_mutation_data to remote 2022-08-02 19:55:12 +02:00
Kamil Braun
80586a0c7e service: storage_proxy: move handle_read_data to remote 2022-08-02 19:55:12 +02:00
Kamil Braun
00c0ee44bd service: storage_proxy: move handle_mutation_failed to remote 2022-08-02 19:55:12 +02:00
Kamil Braun
b9c436c6e0 service: storage_proxy: move handle_mutation_done to remote 2022-08-02 19:55:12 +02:00
Kamil Braun
178536d5d2 service: storage_proxy: move handle_paxos_learn to remote 2022-08-02 19:55:12 +02:00
Kamil Braun
f309886fac service: storage_proxy: move receive_mutation_handler to remote 2022-08-02 19:55:12 +02:00
Kamil Braun
fad14d2094 service: storage_proxy: move handle_counter_mutation to remote 2022-08-02 19:55:12 +02:00
Kamil Braun
93325a220f service: storage_proxy: remove get_local_shared_storage_proxy
Its remaining uses are trivial to remove.

Note: in `handle_counter_mutation` we had this piece of code:
```
            }).then([trace_state_ptr = std::move(trace_state_ptr), &mutations, cl, timeout] {
                auto sp = get_local_shared_storage_proxy();
                return sp->mutate_counters_on_leader(...);
```

Obtaining a `shared_ptr` to `storage_proxy` at this point is
no different from obtaining a regular pointer:
- The pointer is obtained inside `then` lambda body, not in the capture
  list. So if the goal of obtaining a `shared_ptr` here was to keep
  `storage_proxy` alive until the `then` lambda body is executed, that
  goal wasn't achieved because the pointer was obtained too late.
- The `shared_ptr` is destroyed as soon as `mutate_counters_on_leader`
  returns; it's not stored anywhere. So it doesn't prolong the lifetime
  of the service.

I replaced this with a simple capture of `this` in the lambda (shown below).
2022-08-02 19:55:12 +02:00
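For comparison, the replacement mentioned in the last sentence looks roughly like this (same elisions as the snippet above):

```
            }).then([this, trace_state_ptr = std::move(trace_state_ptr), &mutations, cl, timeout] {
                return mutate_counters_on_leader(...);
```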
Kamil Braun
5148eafbd6 service: storage_proxy: (de)register RPC handlers in remote 2022-08-02 19:55:12 +02:00
Kamil Braun
f174645ab5 service: storage_proxy: introduce remote
Move most accesses to `_messaging` to this struct
(functions that send RPCs).
2022-08-02 19:55:10 +02:00
Piotr Sarna
dd2417618e forward_service: limit the number of partition ranges fetched
The forward service uses a vector of ranges owned by a particular
shard in order to split and delegate the work. That number can grow
large, though, which can cause large allocations. This commit limits
the number of ranges handled at a time to 256 (a generic sketch of the
batching pattern follows this entry).

Fixes #10725

Closes #11182
2022-08-01 17:36:34 +03:00
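A generic sketch of the batching pattern; the 256 limit is from this commit, while the helper itself is illustrative, not the forward_service code.

```
#include <algorithm>
#include <cstddef>
#include <span>
#include <vector>

constexpr size_t max_ranges_per_batch = 256;  // limit set by this commit

// Process `ranges` in fixed-size batches so no single step needs to
// materialize or walk the whole vector at once.
template <typename Range, typename Fn>
void for_each_batch(const std::vector<Range>& ranges, Fn&& process) {
    for (size_t i = 0; i < ranges.size(); i += max_ranges_per_batch) {
        size_t n = std::min(max_ranges_per_batch, ranges.size() - i);
        process(std::span<const Range>(ranges.data() + i, n));
    }
}
```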
Pavel Emelyanov
22fdc03b71 storage_service: Relax confirm_replication()
This method is called from the REPLICATION_FINISHED handler and now
just logs a message. The verb is probably worth keeping for
compatibility, at least for some time. The logging itself can be moved
into the handler's lambda.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-07-29 11:47:37 +03:00
Pavel Emelyanov
c8f9d1237f storage_service: Remove _removing_node
This optional is always disengaged

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-07-29 11:47:11 +03:00
Pavel Emelyanov
4d08554a92 storage_service: Remove _replicating_nodes
The set in question is read-and-erase-only

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-07-29 11:45:42 +03:00
Avi Kivity
2c0932cc41 Merge 'Reduce the amount of per-table metrics' from Amnon Heiman
This series is the first step in the effort to reduce the number of metrics reported by Scylla.
The series focuses on the per-table metrics.

The combination of histograms, per-table reporting, and per-shard reporting makes the number of metrics in a cluster explode.
The series uses multiple tools to reduce the number of metrics:
1. Several metrics should be reported only for user tables, but the condition checking this was not updated when more non-user keyspaces were added.
2. Instead of a histogram per table per shard, it reports a summary per table per shard and a single histogram per node.
3. Histograms, summaries, and counters are reported only if they are used (for example, the cas-related metrics are not reported for tables that do not use cas).

Closes #11058

* github.com:scylladb/scylla:
  Add summary_test
  database: Reduce the number of per-table metrics
  replica/table.cc: Do not register per-table metrics for system
  histogram_metrics_helper.hh: Add to_metrics_summary function
  Unified histogram, estimated_histogram, rates, and summaries
  Split the timed_rate_moving_average into data and timer
  utils/histogram.hh: should_sample should use a bitmask
  estimated_histogram: add missing getter method
2022-07-27 22:01:08 +03:00
Amnon Heiman
99a060126d database: Reduce the number of per-table metrics
This patch reduces the number of metrics that are reported per table
when the per-table flag is on (a sketch of the registration idea
follows this entry).

When possible, it moves from time_estimated_histogram and
timed_rate_moving_average_and_histogram to use the unified timer.

Instead of a histogram per shard, it will now report a summary per shard
and a histogram per node.

Counters, histograms, and summaries will not be reported if they were
never used.

The API was updated accordingly so it would not break.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2022-07-27 16:58:52 +03:00
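A hedged sketch of the conditional-registration idea using seastar's metrics helpers; the exact Scylla wiring differs, and `is_user_table` and the metric name are illustrative.

```
#include <seastar/core/metrics.hh>

namespace sm = seastar::metrics;

// Register the per-table series only for user tables; system keyspaces
// get nothing, which keeps the metric cardinality down.
void register_table_metrics(sm::metric_groups& mg, bool is_user_table,
                            const uint64_t& reads) {
    if (!is_user_table) {
        return;
    }
    mg.add_group("table", {
        sm::make_counter("reads", reads,
                         sm::description("Number of reads on this table")),
    });
}
```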
Amnon Heiman
72414b613b Split the timed_rate_moving_average into data and timer
This patch splits the timed_rate_moving_average functionality in two:
a data class, rates_moving_average, and a wrapper class,
timed_rate_moving_average, that uses a timer to update the rates
periodically (see the sketch after this entry).

To make the transition as simple as possible, timed_rate_moving_average
keeps the original API.

A new helper class meter_timer was introduced to handle the timer update
functionality.

This change required minimal code adaptation in some other parts of the
code.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2022-07-26 15:59:33 +03:00
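A simplified sketch of the split: a pure data class plus a timer-driven wrapper that keeps the old API. The 5-second period and the internals are illustrative.

```
#include <chrono>
#include <cstdint>
#include <seastar/core/timer.hh>

// Data part: pure computation, no timer (sketch of rates_moving_average).
struct rates_moving_average_sketch {
    uint64_t count = 0;
    double rate = 0;
    void mark(uint64_t n = 1) { count += n; }
    void update() { /* recompute decayed rates from count (sketch) */ }
};

// Wrapper: keeps the original API but drives updates with a timer
// (sketch of timed_rate_moving_average plus the meter_timer helper).
class timed_rate_moving_average_sketch {
    rates_moving_average_sketch _data;
    seastar::timer<> _timer;
public:
    timed_rate_moving_average_sketch() {
        _timer.set_callback([this] { _data.update(); });
        _timer.arm_periodic(std::chrono::seconds(5));
    }
    void mark(uint64_t n = 1) { _data.mark(n); }
    const rates_moving_average_sketch& rates() const { return _data; }
};
```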
Pavel Emelyanov
b91f7e9ec4 snitch, storage_service: Move reconnect to internal_ip kick
The same thing as in the previous patch: when the gossiper issues an
on_join/on_change notification, the storage service can kick the
messaging service to update its internal_ip cache and reconnect to
the peer.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-07-26 13:48:46 +03:00
Pavel Emelyanov
1bf8b0dd92 snitch, storage_service: Move system.peers preferred_ip update
Currently the INTERNAL_IP state is updated by the reconnectable helper,
which subscribes to on_join/on_change events from the gossiper. The same
subscription exists in the storage service (it's a bit more elaborate
there, checking whether the node is part of the ring, which is fine).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-07-26 13:48:46 +03:00