If a partition range is not present locally,
`partition_ranges_owned_by_this_shard` assigns it to shard 0, which can
overload shard 0. To address this, this commit adds a `shard_id_hint`
to the mapreduce request. When `shard_id_hint` is set, the entire
partition range in the request is handled by the specified shard.
The `shard_id_hint` is set by the new tablet-aware mapreduce algorithm,
introduced in `dispatch_to_tablets`. This algorithm balances the
workload across shards, so the changes in this commit ensure that
load balancing is preserved, even during events such as tablet splits.
Fixes: scylladb#21831
Before these changes, shutting down a node could be prolonged because of
mapreduce_service. `mapreduce_service::stop()` uninitializes messaging
service, which includes waiting for all ongoing RPC handlers. We already
had a mechanism for cancelling local mapreduce tasks, but we were missing
one for cancelling external queries.
In this commit, we modify the signature of the request so it supports
cancelling via an abort source. We also provide a reproducer test
for the problem.
Fixesscylladb/scylladb#22337Closesscylladb/scylladb#22651
forward_service is nondescriptive and misnamed, as it does more than
forward requests. It's a classic map/reduce algorithm (and in fact one
of its parameters is "reducer"), so name it accordingly.
The name "forward" leaked into the wire protocol for the messaging
service RPC isolation cookie, so it's kept there. It's also maintained
in the name of the logger (for "nodetool setlogginglevel") for
compatibility with tests.
Closesscylladb/scylladb#19444