"
There are several helpers in this .cc file that need to get the
datacenter of endpoints. For that they use the global snitch, because
there's no other place to get that data from.
The whole dc/rack info is now moving into topology, so this patch set
changes consistency_level.cc to get it from the topology. This is done
in two ways.
First, the helpers that have a keyspace at hand can get the topology
via the keyspace's effective_replication_map.
The two difficult cases are db::is_local() and db::count_local_endpoints(),
because both have just an inet_address at hand. Those are turned into
methods of topology itself; all their callers already deal with token
metadata and can get the topology from it.
"
* 'br-consistency-level-over-topology' of https://github.com/xemul/scylla:
consistency_level: Remove is_local() and count_local_endpoints()
storage_proxy: Use topology::local_endpoints_count()
storage_proxy: Use proxy's topology for DC checks
storage_proxy: Keep shared_ptr<proxy> on digest_read_resolver
storage_proxy: Use topology local_dc_filter in its methods
storage_proxy: Mark some digest_read_resolver methods private
forwarding_service: Use topology local_dc_filter
storage_service: Use topology local_dc_filter
consistency_level: Use topology local_dc_filter
consistency_level: Call count_local_endpoints from topology
consistency_level: Get datacenter from topology
replication_strategy: Remove hold snitch reference
effective_replication_map: Get datacenter from topology
topology: Add local-dc detection sugar
A continuation of the previous patches -- now all the code that needs
this helper has a proxy pointer at hand
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Several proxy helper classes need to filter endpoints by datacenter.
Since they now have a shared_ptr<proxy> on board, they can get the
topology via the proxy's token metadata
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It will be needed to get token metadata from the proxy. The resolver in
question is created and maintained by abstract_read_executor, which
already has a shared_ptr<proxy>, so it just hands over a copy
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The proxy has a token metadata pointer, so it can use its topology
reference to filter endpoints by datacenter
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The service needs to filter out non-local endpoints. It carries a token
metadata pointer and can get the topology from it to fulfill this goal
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The storage-service API calls use the db::is_local() helper to filter
out tokens from the non-local datacenter. In all those places the
topology is available from the token metadata pointer
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Introduce a `remote` class that handles all remote communication in `storage_proxy`: sending and receiving RPCs, checking the state of other nodes by accessing the gossiper, and fetching schema.
The `remote` object lives inside `storage_proxy` and right now it's initialized and destroyed together with `storage_proxy`.
The long game here is to split the initialization of `storage_proxy` into two steps:
- the first step, which constructs `storage_proxy`, initializes it "locally" and does not require references to `messaging_service` and `gossiper`.
- the second step will take those references and add the `remote` part to `storage_proxy`.
This will allow us to remove some cycles from the service (de)initialization order and in general clean it up a bit. We'll be able to start `storage_proxy` right after the `database` (without messaging/gossiper). Similar refactors are planned for `query_processor`.
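The planned two-step initialization could look roughly like this; the types and method names here are simplified stand-ins, not the real Scylla interfaces:

```cpp
#include <cassert>
#include <memory>

// Placeholder types standing in for the real services.
struct messaging_service {};
struct gossiper {};

// `remote` bundles everything that talks to other nodes.
struct remote {
    messaging_service& ms;
    const gossiper& g;
};

class storage_proxy {
    std::unique_ptr<remote> _remote; // empty until the second step
public:
    // step 1: "local" construction, no messaging/gossiper required
    storage_proxy() = default;

    // step 2: attach the remote part once messaging/gossiper exist
    void start_remote(messaging_service& ms, const gossiper& g) {
        _remote.reset(new remote{ms, g});
    }

    bool has_remote() const { return _remote != nullptr; }
};
```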
Closes #11088
* github.com:scylladb/scylladb:
service: storage_proxy: pass `migration_manager*` to `init_messaging_service`
service: storage_proxy: `remote`: make `_gossiper` a const reference
gms: gossiper: mark some member functions const
db: consistency_level: `filter_for_query`: take `const gossiper&`
replica: table: `get_hit_rate`: take `const gossiper&`
gms: gossiper: move `endpoint_filter` to `storage_proxy` module
service: storage_proxy: pass `shared_ptr<gossiper>` to `start_hints_manager`
service: storage_proxy: establish private section in `remote`
service: storage_proxy: remove `migration_manager` pointer
service: storage_proxy: remove calls to `storage_proxy::remote()` from `remote`
service: storage_proxy: remove `_gossiper` field
alternator: ttl: pass `gossiper&` to `expiration_service`
service: storage_proxy: move `truncate_blocking` implementation to `remote`
service: storage_proxy: introduce `is_alive` helper
service: storage_proxy: remove `_messaging` reference
service: storage_proxy: move `connection_dropped` to `remote`
service: storage_proxy: make `encode_replica_exception_for_rpc` a static function
service: storage_proxy: move `handle_write` to `remote`
service: storage_proxy: move `handle_paxos_prune` to `remote`
service: storage_proxy: move `handle_paxos_accept` to `remote`
service: storage_proxy: move `handle_paxos_prepare` to `remote`
service: storage_proxy: move `handle_truncate` to `remote`
service: storage_proxy: move `handle_read_digest` to `remote`
service: storage_proxy: move `handle_read_mutation_data` to `remote`
service: storage_proxy: move `handle_read_data` to `remote`
service: storage_proxy: move `handle_mutation_failed` to `remote`
service: storage_proxy: move `handle_mutation_done` to `remote`
service: storage_proxy: move `handle_paxos_learn` to `remote`
service: storage_proxy: move `receive_mutation_handler` to `remote`
service: storage_proxy: move `handle_counter_mutation` to `remote`
service: storage_proxy: remove `get_local_shared_storage_proxy`
service: storage_proxy: (de)register RPC handlers in `remote`
service: storage_proxy: introduce `remote`
`migration_manager` lifetime is longer than the lifetime of "storage
proxy's messaging service part" - that is, `init_messaging_service` is
called after `migration_manager` is started, and `uninit_messaging_service`
is called before `migration_manager` is stopped. Thus we don't need to
hold an owning pointer to `migration_manager` here.
Later, when `init_messaging_service` actually constructs `remote`,
this will be a reference, not a pointer.
Also observe that `_mm` in `remote` is only used in handlers, and
handlers are unregistered before `_mm` is nullified, which ensures that
handlers are not running when `_mm` is nullified. (This argument shows
why the code made sense regardless of our switch from shared_ptr to raw
ptr).
The function only uses one public function of `gossiper` (`is_alive`)
and is used only in one place in `storage_proxy`.
Make it a static function private to the `storage_proxy` module.
The function used a `default_random_engine` field in `gossiper` for
generating random numbers. Turn this field into a static `thread_local`
variable inside the function - no other `gossiper` members used the
field.
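The shape of that change can be sketched as follows; the function name and signature are illustrative, not the real `endpoint_filter` code:

```cpp
#include <cassert>
#include <cstddef>
#include <random>

// Sketch: the random engine moves from a gossiper member into the only
// function that used it, as a static thread_local variable.
size_t pick_random_index(size_t n) {
    static thread_local std::default_random_engine rng{std::random_device{}()};
    return std::uniform_int_distribution<size_t>(0, n - 1)(rng);
}
```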
Access `gossiper` through `_remote`.
Later, all these accesses will handle a missing `_remote`.
Note that there are also accesses through the internal `remote()` getter.
The plan is as follows:
- direct accesses through `_remote` will be modified to handle missing
`_remote` (these won't cause an error)
- `remote()` will throw if `_remote` is missing (`remote()` is only used
for operations which actually need to send a message to a remote node).
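The two planned access styles could be modeled like this (placeholder types; the real accessors may differ):

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

struct remote_part {};

class proxy_sketch {
    std::unique_ptr<remote_part> _remote;
public:
    // tolerant path: callers branch on availability, no error raised
    remote_part* remote_if_available() { return _remote.get(); }

    // strict path: only for operations that must message a remote node
    remote_part& remote() {
        if (!_remote) {
            throw std::runtime_error("remote part not initialized");
        }
        return *_remote;
    }

    void start_remote() { _remote = std::make_unique<remote_part>(); }
};
```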
The truncate operation always truncates a table on the entire cluster,
even for local tables. And it always does it by sending RPCs (the node
sends an RPC to itself too). Thus it fits in the remote class.
If we want to add a possibility to "truncate locally only" and/or change
the behavior for local tables, we can add a branch in
`storage_proxy::truncate_blocking`.
Refs: #11087
A helper is introduced both in `remote` and in `storage_proxy`; the
`storage_proxy` one calls the `remote` one. In the future it will also
handle a missing `remote`: it will then report only the local node as
alive and all other nodes as dead.
The change reduces the number of functions using the `_gossiper` field
in `storage_proxy`.
"
The helper is in charge of receiving the INTERNAL_IP app state from
gossiper join/change notifications, updating system.peers with it,
and kicking the messaging service to update its preferred-ip cache
and initiate client reconnection.
Effectively this helper duplicates the topology tracking code in the
storage-service notifiers. Removing it means less code and drops
a bunch of unwanted cross-component dependencies; in particular:
- one qctx call is gone
- the snitch (almost) no longer needs to get messaging from the gossiper
- the public:private IP cache becomes local to messaging and can be
  moved to topology at low cost
A nice minor side effect: this helper was never unsubscribed from the
gossiper on stop and on snitch rename. Now it's all gone.
"
* 'br-remove-reconnectible-snitch-helper-2' of https://github.com/xemul/scylla:
snitch: Remove reconnectable snitch helper
snitch, storage_service: Move reconnect to internal_ip kick
snitch, storage_service: Move system.peers preferred_ip update
snitch: Export prefer-local
Its remaining uses are trivial to remove.
Note: in `handle_counter_mutation` we had this piece of code:
```
}).then([trace_state_ptr = std::move(trace_state_ptr), &mutations, cl, timeout] {
auto sp = get_local_shared_storage_proxy();
return sp->mutate_counters_on_leader(...);
```
Obtaining a `shared_ptr` to `storage_proxy` at this point is
no different from obtaining a regular pointer:
- The pointer is obtained inside `then` lambda body, not in the capture
list. So if the goal of obtaining a `shared_ptr` here was to keep
`storage_proxy` alive until the `then` lambda body is executed, that
goal wasn't achieved because the pointer was obtained too late.
- The `shared_ptr` is destroyed as soon as `mutate_counters_on_leader`
  returns; it's not stored anywhere, so it doesn't prolong the lifetime
  of the service.
I replaced this with a simple capture of `this` in the lambda.
The forward service uses a vector of ranges owned by a particular
shard in order to split and delegate the work. The number of ranges
can grow large, though, which can cause large allocations.
This commit limits the number of ranges handled at a time to 256.
Fixes #10725
Closes #11182
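The batching idea can be sketched like this; the helper name and element type are hypothetical, only the 256-at-a-time limit comes from the commit:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Handle at most 256 ranges at a time, so a huge vector of ranges never
// turns into one large allocation of in-flight work.
constexpr size_t max_ranges_per_batch = 256;

// Calls process_batch(pointer, count) for each chunk; returns the
// number of batches processed.
template <typename T, typename Fn>
size_t for_each_batch(const std::vector<T>& ranges, Fn process_batch) {
    size_t batches = 0;
    for (size_t i = 0; i < ranges.size(); i += max_ranges_per_batch) {
        size_t n = std::min(max_ranges_per_batch, ranges.size() - i);
        process_batch(ranges.data() + i, n);
        ++batches;
    }
    return batches;
}
```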
This method is called from the REPLICATION_FINISHED handler and now
just logs a message. The verb is probably worth keeping for
compatibility, at least for some time. The logging itself can be
moved into the handler's lambda
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This series is the first step in the effort to reduce the number of metrics reported by Scylla.
The series focuses on the per-table metrics.
The combination of per-table, per-shard histograms makes the number of metrics in a cluster explode.
The following series uses multiple tools to reduce the number of metrics:
1. Several metrics should be reported only for user tables, but the condition checking this was not updated when more non-user keyspaces were added.
2. Instead of a histogram per table per shard, report a summary per table per shard and a single histogram per node.
3. Histograms, summaries, and counters are reported only if they are used (for example, the CAS-related metrics are not reported for tables that do not use CAS).
Closes #11058
* github.com:scylladb/scylla:
Add summary_test
database: Reduce the number of per-table metrics
replica/table.cc: Do not register per-table metrics for system
histogram_metrics_helper.hh: Add to_metrics_summary function
Unified histogram, estimated_histogram, rates, and summaries
Split the timed_rate_moving_average into data and timer
utils/histogram.hh: should_sample should use a bitmask
estimated_histogram: add missing getter method
This patch reduces the number of metrics that are reported per table
when the per-table flag is on.
When possible, it moves from time_estimated_histogram and
timed_rate_moving_average_and_histogram to the unified timer.
Instead of a histogram per shard, it will now report a summary per shard
and a histogram per node.
Counters, histograms, and summaries will not be reported if they were
never used.
The API was updated accordingly so it would not break.
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
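The shard/node split can be sketched as follows (a simplified model; the real code goes through the seastar metrics layer, and all names here are illustrative): each shard keeps only a small bucket array plus cheap counters, and the single exported histogram is the element-wise sum across shards.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr size_t nr_buckets = 8;
using buckets = std::array<uint64_t, nr_buckets>;

struct shard_metrics {
    buckets hist{};     // kept locally, not exported per shard
    uint64_t count = 0; // summary-style counters stay per shard (cheap)

    void record(size_t bucket) {
        hist[bucket]++;
        count++;
    }
};

// Exported once per node instead of once per shard: sum the shards.
buckets node_histogram(const std::vector<shard_metrics>& shards) {
    buckets total{};
    for (const auto& s : shards) {
        for (size_t i = 0; i < nr_buckets; i++) {
            total[i] += s.hist[i];
        }
    }
    return total;
}
```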
This patch splits the timed_rate_moving_average functionality in two:
a data class, rates_moving_average, and a wrapper class,
timed_rate_moving_average, that uses a timer to update the rates
periodically.
To make the transition as simple as possible, timed_rate_moving_average
keeps the original API.
A new helper class meter_timer was introduced to handle the timer update
functionality.
This change required minimal adaptation in other parts of the code.
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
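The split can be sketched like this; the class names follow the commit message, but the internals (the EWMA update, the manual tick standing in for meter_timer) are guessed for illustration:

```cpp
#include <cassert>
#include <cstdint>

// Pure data/logic class: no timer of its own.
struct rates_moving_average {
    double rate = 0.0;
    uint64_t count = 0;

    void mark(uint64_t n) { count += n; }

    // exponentially-weighted update, normally driven by a timer tick
    void update(double alpha, double instant_rate) {
        rate = alpha * instant_rate + (1.0 - alpha) * rate;
    }
};

// Wrapper keeping the original API; the periodic update is factored out
// (meter_timer in the real patch), stubbed here as a manual tick().
struct timed_rate_moving_average {
    rates_moving_average data;
    uint64_t last_count = 0;

    void mark(uint64_t n) { data.mark(n); }

    void tick(double alpha, double interval_seconds) {
        double instant = (data.count - last_count) / interval_seconds;
        last_count = data.count;
        data.update(alpha, instant);
    }
};
```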
The same as in the previous patch: when the gossiper issues an
on_join/on_change notification, the storage service can kick the
messaging service to update its internal_ip cache and reconnect to
the peer.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Currently the INTERNAL_IP state is updated by the reconnectable helper,
which subscribes to on_join/on_change events from the gossiper. The
same subscription exists in the storage service (it's a bit more
elaborate there, checking whether the node is part of the ring, which
is fine).
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>