mirror of
https://github.com/scylladb/scylladb.git
synced 2026-04-23 01:50:35 +00:00
After this change, the load balancer can make progress while migrations are active. If the algorithm is called with active tablet migrations in tablet metadata, the load balancer treats them as if they were already completed. This allows the algorithm to make decisions incrementally; executing those decisions alongside the active migrations produces the desired result. Overload of shards is limited because the algorithm tracks streaming concurrency on both the source and target shards of active migrations and takes the concurrency limit into account when producing new migrations. The coordinator executes the load balancer on the edges of tablet state machine transitions, so new migrations can start as soon as tablets finish streaming. The load balancer is also invoked continuously for as long as it produces a non-empty plan, in order to saturate the cluster with streaming. A single make_plan() call is still not saturating, due to the way the algorithm is implemented.
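The behaviour described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the ScyllaDB implementation: shards are plain indices, load is a tablet count, and every name (`toy_cluster`, `toy_migration`, `toy_make_plan`, `toy_balance`) is hypothetical. It shows active migrations being counted as already completed for load purposes while still occupying a streaming slot on both their source and target shards, and the planner being re-invoked for as long as it returns a non-empty plan.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy types; not part of the real tablet allocator.
struct toy_migration {
    std::size_t src;
    std::size_t dst;
};

struct toy_cluster {
    std::vector<int> tablets_per_shard;  // committed tablet counts
    std::vector<toy_migration> active;   // migrations that are still streaming
};

// Produces new migrations. Active migrations are treated as already completed
// when computing load, but each one still uses a streaming slot on both shards.
std::vector<toy_migration> toy_make_plan(const toy_cluster& c, int streaming_limit) {
    assert(streaming_limit > 0);
    std::vector<int> load = c.tablets_per_shard;
    std::vector<int> streams(load.size(), 0);
    for (const auto& m : c.active) {
        --load[m.src];
        ++load[m.dst];
        ++streams[m.src];
        ++streams[m.dst];
    }
    const std::size_t none = load.size();
    std::vector<toy_migration> plan;
    for (;;) {
        // Among shards with a free streaming slot, pick the most and least loaded.
        std::size_t src = none, dst = none;
        for (std::size_t s = 0; s < load.size(); ++s) {
            if (streams[s] >= streaming_limit) {
                continue;
            }
            if (src == none || load[s] > load[src]) { src = s; }
            if (dst == none || load[s] < load[dst]) { dst = s; }
        }
        // Stop when no shard has a free slot or moving a tablet would not help.
        if (src == none || load[src] - load[dst] < 2) {
            break;
        }
        plan.push_back({src, dst});
        --load[src];
        ++load[dst];
        ++streams[src];
        ++streams[dst];
    }
    return plan;
}

// Re-invokes the planner while it returns a non-empty plan. Planned migrations
// are only initiated (recorded as active); they are committed later, which
// stands in for tablets finishing streaming.
void toy_balance(toy_cluster& c, int streaming_limit) {
    for (;;) {
        auto plan = toy_make_plan(c, streaming_limit);
        if (!plan.empty()) {
            c.active.insert(c.active.end(), plan.begin(), plan.end());
            continue;
        }
        if (c.active.empty()) {
            break;  // nothing planned and nothing streaming
        }
        for (const auto& m : c.active) {
            --c.tablets_per_shard[m.src];
            ++c.tablets_per_shard[m.dst];
        }
        c.active.clear();
    }
}
```

In this toy, starting from one shard holding all tablets, the first plan stops as soon as the source shard runs out of streaming slots, a second call made while those migrations are still active returns an empty plan, and further progress is made only after streaming finishes and slots are released.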
76 lines
2.3 KiB
C++
/*
 * Copyright (C) 2023-present ScyllaDB
 */

/*
 * SPDX-License-Identifier: AGPL-3.0-or-later
 */

#pragma once

#include "replica/database.hh"
#include "service/migration_manager.hh"
#include "locator/tablets.hh"
#include <any>

namespace service {

/// Represents intention to move a single tablet replica from src to dst.
struct tablet_migration_info {
    locator::global_tablet_id tablet;
    locator::tablet_replica src;
    locator::tablet_replica dst;
};

using migration_plan = utils::chunked_vector<tablet_migration_info>;

/// Returns a tablet migration plan that aims to achieve better load balance in the whole cluster.
/// The plan is computed based on information in the given token_metadata snapshot
/// and thus should be executed and reflected, at least as pending tablet transitions, in token_metadata
/// before this is called again.
///
/// For any given global_tablet_id there is at most one tablet_migration_info in the returned plan.
///
/// To achieve full balance, do:
///
///    while (true) {
///        auto plan = co_await balance_tablets(get_token_metadata());
///        if (plan.empty()) {
///            break;
///        }
///        co_await execute(plan);
///    }
///
/// It is ok to invoke the algorithm with already active tablet migrations. The algorithm will take them into account
/// when balancing the load as if they already succeeded. This means that applying a series of migration plans
/// produced by this function will give the same result regardless of whether they are fully executed or
/// only initiated by creating corresponding transitions in tablet metadata.
///
/// The algorithm takes care of limiting the streaming load on the system, also by taking active migrations into account.
///
future<migration_plan> balance_tablets(locator::token_metadata_ptr);

class tablet_allocator_impl;

class tablet_allocator {
public:
    class impl {
    public:
        virtual ~impl() = default;
    };
private:
    std::unique_ptr<impl> _impl;
    tablet_allocator_impl& impl();
public:
    tablet_allocator(service::migration_notifier& mn, replica::database& db);
public:
    future<> stop();
};

}

template <>
struct fmt::formatter<service::tablet_migration_info> : fmt::formatter<std::string_view> {
    auto format(const service::tablet_migration_info&, fmt::format_context& ctx) const -> decltype(ctx.out());
};