Most likely 817fdad uncovered the fact that our choice of primary replica was resonating with tablet allocation and we were ending up picking the same replica as primary within a scope instead of rotating primaryship among all replicas in the scope.
This created situations where for instance, restoring into a 9 nodes with primary_replica_only=true would put all data into 3 nodes, leaving the other 6 unused. The balancing of the dataset was performed by the subsequent repair step.
This PR fixes this by changing the formula for picking up the primary replica out of a set of eligible replicas from within the passed scope.
The PR also extends the testing scenarios in `test_backup.py` so we get to run restore for a set of topologies, for all combinations of scope, primary_replica_only and min_tablet_counts.
Most of the work was done by @bhalevy [here](https://github.com/scylladb/scylladb/compare/master...bhalevy:scylla:load-balance-primary-replica), this PR just splitted it and did touchups here and there.
Fixes#27281Closesscylladb/scylladb#27397
* github.com:scylladb/scylladb:
test: reduce dataset and number of test cases or debug builds
test: bump repair timeout up, it's sometimes not enough in CI
test: refactor test_refresh.py to match test_restore_with_streaming_scopes.
test: extend test_restore_with_streaming_scopes
test: Adjust test_restore_primary_replica_different_dc_scope_all
test: Refactor restoring code in test_backup to match SM pattern
test: add check_mutation_replicas calls after fresh creation of dataset
test: extend create_dataset to accept consistency_level
test: refactor check_mutation_replicas so it's more readable
test: make create_dataset async and refactor so it's configurable
test: use defaultdict in collect_mutations
test: add log marks to facilitate reusing server for restore
locator: tablets: Distribute data evenly among primary replicas during restore
Most likely 817fdad uncovered the fact that our choice of
primary replica was resonating with tablet allocation and we were ending up
picking the same replica as primary within a scope instead of rotating
primaryship among all replicas in the scope.
This created situations where for instance, restoring into a 9 nodes cluster
with primary_replica_only=true would put all data into 3 nodes, leaving
the other 6 unused. The balancing of the dataset was performed by the
subsequent repair step.
split from bhalevy/load-balance-primary-replica
Fixes#27281
Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>
repair: Implement auto repair for tablet repair
This patch implements the basic auto repair support for tablet repair.
It was decided to add no per table configuration for the initial
implementation, so two scylla yaml config options are introduced to set
the default auto repair configs for all the tablet tables.
- auto_repair_enabled_default
Set true to enable auto repair for tablet tables by default. The value
will be overridden by the per keyspace or per table configuration which
is not implemented yet.
- auto_repair_threshold_default_in_seconds
Set the default time in seconds for the auto repair threshold for tablet
tables. If the time since last repair is bigger than the configured
time, the tablet is eligible for auto repair. The value will be
overridden by the per keyspace or per table configuration which is not
implemented yet.
The following metrcis are added:
- auto_repair_needs_repair_nr
The number of tablets with auto repair enabled that needs repair
- auto_repair_enabled_nr
The number of tablets with auto repair enabled
The metrics are useful to tell if auto repair is falling behind.
In the future, more auto repair scheduling will be added, e.g.,
scheduling based on the repaired and unrepaired sstable set size,
tombstone ratio and so on, in addition to the time based scheduling.
Fixes SCYLLADB-99
New feature. No backport.
Closesscylladb/scylladb#27534
* github.com:scylladb/scylladb:
topology_coordinator: Add metrics for tablet repair
repair: Implement auto repair for tablet repair
This patch implements the basic auto repair support for tablet repair.
It was decided to add no per table configuration for the initial
implementation, so two scylla yaml config options are introduced to set
the default auto repair configs for all the tablet tables.
- auto_repair_enabled_default
Set true to enable auto repair for tablet tables by default. The value
will be overridden by the per keyspace or per table configuration which
is not implemented yet.
- auto_repair_threshold_default_in_seconds
Set the default time in seconds for the auto repair threshold for tablet
tables. If the time since last repair is bigger than the configured
time, the tablet is eligible for auto repair. The value will be
overridden by the per keyspace or per table configuration which is not
implemented yet.
The following metrcis are added:
- auto_repair_needs_repair_nr
The number of tablets with auto repair enabled that needs repair
- auto_repair_enabled_nr
The number of tablets with auto repair enabled
The metrics are useful to tell if auto repair is falling behind.
In the future, more auto repair scheduling will be added, e.g.,
scheduling based on the repaired and unrepaired sstable set size,
tombstone ratio and so on, in addition to the time based scheduling.
Fixes SCYLLADB-99
Allow creating materialized views and secondary indexes in a tablets keyspace only if it's RF-rack-valid, and enforce RF-rack-validity while the keyspace has views by restricting some operations:
* Altering a keyspace's RF if it would make the keyspace RF-rack-invalid
* Adding a node in a new rack
* Removing / Decommissioning the last node in a rack
Previously the config option `rf_rack_valid_keyspaces` was required for creating views. We now remove this restriction - it's not needed because we always maintain RF-rack-validity for keyspaces with views.
The restrictions are relevant only for keyspaces with numerical RF. Keyspace with rack-list-based RF are always RF-rack-valid.
Fixesscylladb/scylladb#23345
Fixes https://github.com/scylladb/scylladb/issues/26820
backport to relevant versions for materialized views with tablets since it depends on rf-rack validity
Closesscylladb/scylladb#26354
* github.com:scylladb/scylladb:
docs: update RF-rack restrictions
cql3: don't apply RF-rack restrictions on vector indexes
cql3: add warning when creating mv/index with tablets about rf-rack
service/tablet_allocator: always allow tablet merge of tables with views
locator: extend rf-rack validation for rack lists
test: test rf-rack validity when creating keyspace during node ops
locator: fix rf-rack validation during node join/remove
test: test topology restrictions for views with tablets
test: add test_topology_ops_with_rf_rack_valid
topology coordinator: restrict node join/remove to preserve RF-rack validity
topology coordinator: add validation to node remove
locator: extend rf-rack validation functions
view: change validate_view_keyspace to allow MVs if RF=Racks
db: enforce rf-rack-validity for keyspaces with views
replica/db: add enforce_rf_rack_validity_for_keyspace helper
db: remove enforce parameter from check_rf_rack_validity
test: adjust test to not break rf-rack validity
Currently, tablet allocation intentionally ignores current load (
introduced by the commit #1e407ab) which could cause identical shard
selection when allocating a small number of tablets in the same topology.
When a tablet allocator is asked to allocate N tablets (where N is smaller
than the number of shards on a node), it selects the first N lowest shards.
If multiple such tables are created, each allocator run picks the same
shards, leading to tablet imbalance across shards.
This change initializes the load sketch with the current shard load,
scaled into the [0,1] range, ensuring allocation still remains even
while starting from globally least-loaded shards.
Fixes https://github.com/scylladb/scylladb/issues/27620Closesscylladb/scylladb#27802
This patch adds a method to load_stats which searches for the tablet
size during tablet transition. In case of tablet migration, the tablet
will be searched on the leaving replica, and during rebuild we will
return the average tablet size of the pending replicas.
Extend the RF-rack validation in `assert_rf_rack_valid_keyspace` to
validate rack-list-based replication as well. Previously, validation was
done only for numeric replication.
If the replication is based on a rack list, we validate that all racks
that are required for replication are present in the topology rack map.
If some rack is needed for replication but is missing, or it doesn't
have normal token owner nodes, the validation fails with an error.
If a keyspace is created while a node is joining or being removed, it could
break the rf-rack invariant. For example:
1. We have 3 nodes in 3 racks, no keyspaces
2. A new node starts to join in a new rack - passes validation because
there are no keyspaces
3. Create a keyspace with rf=3 - passes validation because the joining
node is not a normal token owner yet
4. The new node becomes a normal token owner
5. The rf-rack invariant is broken. We have rf=3 and 4 racks
To fix this, we change the rf-rack check to consider a node as a token
owner if it's either a normal token owner or it has bootstrap tokens and
is about to become a normal token owner.
Now the condition can't be broken. Consider keyspace creation at
different stages of adding a node in our example:
* Before the node is assigned bootstrap tokens: the node is not
considered. We can create a keyspace with rf=3 as if the node doesn't
exist, and then node join will fail in the group0 operation that
assigns bootstrap tokens, because during this operation we check
rf-rack validity.
* Assigning bootstrap tokens is a single group0 operation that is
serialized with keyspace creation. During this operation we check that
adding the node as a token owner will maintain rf-rack validity for all
keyspaces.
* After the node is assigned bootstrap tokens and until it becomes a
normal token owner: it is considered as a transitioning token owner by
the rf-rack check and the rack is considered a transitioning rack. We
can't count the rack as a normal rack because the node join may still
fail and rollback. Trying to create a keyspace with either rf=3 or
rf=4 will fail because we can end up with either 3 or 4 racks.
Similarly, when removing a node, we validate that removing the node will
maintain rf-rack validity in the same group0 operation that changes the
node state to removing/decommissioning, after which the node becomes a
leaving endpoint, and it's not considered a normal token owner anymore
for the rf-rack check.
Extend the locator function assert_rf_rack_valid_keyspace to accept
arbitrary topology dc-rack maps and nodes instead of using the current
token metadata.
This allows us to add a new variant of the function that checks rf-rack
validity given a topology change that we want to apply. we will use it
to check that rf-rack validity will be maintained before applying the
topology change.
The possible topology changes for the check are node add and node remove
/ decommission. These operations can change the number of normal racks -
if a new node is added to a new rack, or the last node is removed from a
rack.
Extend the RF-rack-validity enforcement to keyspaces that have views,
regardless of the option `rf_rack_valid_keyspaces`.
Previously, RF-rack-validity was enforced when `rf_rack_valid_keyspaces`
was set for all keyspaces. Now we want to allow creating MVs in tablet
keyspaces that are RF-rack-valid and enforce the RF-rack-validity even
if the config option is not set.
We currently ignore the `_excluded` field in `node::clone()` and the verbose
formatter of `locator::node`. The first one is a bug that can have
unpredictable consequences on the system. The second one can be a minor
inconvenience during debugging.
We fix both places in this PR.
Fixes https://scylladb.atlassian.net/browse/SCYLLADB-72
This PR is a bugfix that should be backported to all supported branches.
Closesscylladb/scylladb#27265
* github.com:scylladb/scylladb:
locator/node: include _excluded in verbose formatter
locator/node: preserve _excluded in clone()
We currently ignore the `_excluded` field in `clone()`. Losing
information about exclusion can have unpredictable consequences. One
observed effect (that led to finding this issue) is that the
`/storage_service/nodes/excluded` API endpoint sometimes misses excluded
nodes.
Commit 6e4803a750 broke notification about expired erms held for too long since it resets the tracker without calling its destructor (where notification is triggered). Fix the assign operator to call the destructor like it should.
Fixes https://github.com/scylladb/scylladb/issues/27141Closesscylladb/scylladb#27140
* https://github.com/scylladb/scylladb:
test: test that expired erm that held for too long triggers notification
token_metadata: fix notification about expiring erm held for to long
Commit 6e4803a750 broke notification about expired erms held for too
long since it resets the tracker without calling its destructor (where
notification is triggered). Fix assign operator to call destructor.
Add precompiled header support to CMakeLists.txt and configure.py -
it improves compilation time by approximately 10%.
New header `stdafx.hh` is added, don't include it manually -
the compiler will include it for you. The header contains includes from
external libraries used by Scylla - seastar, standard library,
linux headers and zlib.
The feature is enabled by default, use CMake option `Scylla_USE_PRECOMPILED_HEADER`
or configure.py --disable-precompiled-header to disable.
The feature should be disabled, when trying to check headers - otherwise
you might get false negatives on missing includes from seastar / abseil and so on.
Note: following configuration needs to be added to ccache.conf:
sloppiness = pch_defines,time_macros,include_file_mtime,include_file_ctime
Closesscylladb/scylladb#26617
Currently we do not allow tablet merge if either of the tablets contain
a tablet repair request. This could block the tablet merge for a very
long time if the repair requests could not be scheduled and executed.
We can actually merge the repair tasks in most of the cases. This is
because most of the time all tablets are requested to be repaired by a
single API request, so they share the same task_id, request_type and
other parameters. We can merge the repair task info and executes the
repair after the merge. If they do not share the task info, we could
not merge and have to wait for the repair before merge, which is both
rare and ok.
Another case is that one of the tablet has a repair task info (t1) while
the other tablet (t2) does not have, it is possible the t2 has finished
repair by the same repair request or t2 is not requested to be repaired
at all. We allow merge in this case too to avoid blocking the tablet
merge, with the price of reparing a bit more.
Fixes#26844Closesscylladb/scylladb#26922
This PR extends the restore API so that it accepts primary_replica_only as parameter and it combines the concepts of primary-replica-only with scoped streaming so that with:
- `scope=all primary_replica_only=true` The restoring node will stream to the global primary replica only
- `scope=dc primary_replica_only=true` The restoring node will stream to the local primary replica only.
- `scope=rack primary_replica_only=true` The restoring node will stream only to the primary replica from within its own rack (with rf=#racks, the restoring node will stream only to itself)
- `scope=node primary_replica_only=true` is not allowed, the restoring node will always stream only to itself so the primary_replica_only parameter wouldn't make sense.
The PR also adjusts the `nodetool refresh` restriction on running restore with both primary_replica_only and scope, it adds primary_replica_only to `nodetool restore` and it adds cluster tests for primary replica within scope.
Fixes#26584Closesscylladb/scylladb#26609
* github.com:scylladb/scylladb:
Add cluster tests for checking scoped primary_replica_only streaming
Improve choice distribution for primary replica
Refactor cluster/object_store/test_backup
nodetool restore: add primary-replica-only option
nodetool refresh: Enable scope={all,dc,rack} with primary_replica_only
Enable scoped primary replica only streaming
Support primary_replica_only for native restore API
`topology_cooridinator::migrate_tablet_size()` was introduced in 10f07fb95a. It has a bug where the has_tablet_size() lambda always returns false because of bad comparison of iterators after a table and tablet search:
```
if (auto table_i = tables.find(gid.table); table_i != tables.find(gid.table)) {
if (auto size_i = table_i->second.find(trange); size_i != table_i->second.find(trange)) {
```
This change also fixes a problem where the `migrate_tablet_size()` would crash with a `std::out_of_range` if the pending node was not present in load_stats.
This change fixes these two problems and moves the functionality into a separate method of `load_stats`. It also adds tests for the new method.
A version containing this bug has not been released yet, so no backport is needed.
Closesscylladb/scylladb#26946
* github.com:scylladb/scylladb:
load_stats: add test for migrate_tablet_size()
load_stats: fix problem with tablet size migration
This patch fixes a bug with tablet size migration in load_stats.
has_tablet_size() lambda in topology_coordinator::migrate_tablet_size()
was returning false in all cases due to incorrect search iterator
comparison after a table and tablet saeach.
This change moves load_stats migrate_tablet_sizes() functionaility
into a separate method of load_stats.
I noticed during tests that `maybe_get_primary_replica`
would not distribute uniformly the choice of primary replica
because `info.replicas` on some shards would have an order whilst
on others it'd be ordered differently, thus making the function choose
a node as primary replica multiple times when it clearly could've
chosen a different nodes.
This patch sorts the replica set before passing it through the
scope filter.
Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>
Recent seastar update deprecated in/out streams usage pattern when a stream is default constructed early and them move-assigned with the proper one (see scylladb/seastar#3051). This PR fixes few places in Scylla that still use one.
Adopting newer seastar API, no need to backport
Closesscylladb/scylladb#26747
* github.com:scylladb/scylladb:
commitlog: Remove unused work::r stream variable
ec2_snitch: Fix indentation after previous patch
ec2_snitch: Coroutinize the aws_api_call_once()
sstable: Construct output_stream for data instantly
test: Don't reuse on-stack input stream
This change adds the ability to move tablets sizes in load_stats after a tablet migration or table resize (split/merge). This is needed because the size based load balancer needs to have tablet size data which is as accurate as possible, in order to work on fresh tablet size distribution and issue correct tablet migrations.
This is the second part of the size based load balancing changes:
- First part for tablet size collection via load_stats: #26035
- Second part reconcile load_stats: #26152
- The third part for load_sketch changes: #26153
- The fourth part which performs tablet load balancing based on tablet size: #26254
This is a new feature and backport is not needed.
Closesscylladb/scylladb#26152
* github.com:scylladb/scylladb:
load_balancer: load_stats reconcile after tablet migration and table resize
load_stats: change data structure which contains tablet sizes
Auto-exands numeric RF in CREATE/ALTER KEYSPACE statements for
new DCs specified in the statement.
Doesn't auto-expand existing options, as the rack choice may not be in
line with current replica placement. This requires co-locating tablet
replicas, and tracking of co-location state, which is not implemented yet.
Signed-off-by: Tomasz Grabiec <tgrabiec@scylladb.com>
The method connects a socket, grabs in/out streams from it then writes
HTTP request and reads+parses the response. For that it uses class
variables for socket and streams, but there's no real need for that --
all three actually exists throughput the method "lifetime".
To fix it, coroutinizes the method. The same could be achieved my moving
the connected socket and streams into do_with() context, but coroutine
is better than that.
(indentation is left broken)
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This change adds the ability to move tablets sizes in load_stats after a
tablet migration or table resize (split/merge). This is needed because
the size based load balancer needs to have tablet size data which is as
accurate as possible, in order to issue migrations which improve
load balance.
Currently, tablet_sstable_streamer::get_primary_endpoints is out of
sync with tablet_map::get_primary_replica. The get_primary_replica
optimizes the choice of the replica so that the work is fairly
distributes among nodes. Meanwhile, get_primary_endpoints always
chooses the first replica.
Use get_primary_replica for get_primary_endpoints.
Fixes: https://github.com/scylladb/scylladb/issues/21883.
Closesscylladb/scylladb#26385
Problems addressed by this PR
* Missing barrier before cleanup: If a node was bootstrapped before cleanup, some request coordinators could still be in `write_both_read_new` and send stale requests to replicas being cleaned up.
* Sessions not drained before cleanup: We lacked protection against stale streaming or repair operations.
* `sstable_vnodes_cleanup_fiber()` calling `flush_all_tables()` under group0 lock: This caused SCT test failures (see [this comment](https://github.com/scylladb/scylladb/issues/25333#issuecomment-3298859046) for details).
* Issues with `storage_proxy::start_write()` used by `sstable_vnodes_cleanup_fiber`:
* The result of `start_write()` was not held during `abstract_write_response_handler::apply_locally`, so coordinator-local writes were not properly awaited.
* Synchronization was racy — `start_write()` was not atomic with the fence check, allowing stale writes to sneak in if `fence_version` changed in between.
* It waited for all writes, including local tables and tablet-based tables, which is redundant because `sstable_vnodes_cleanup_fiber` does not apply to them.
* It also waited for writes with versions greater than the current `fence_version`, which is unnecessary.
Fixesscylladb/scylladb#26150
backport: this PR fixes several issues with the vnodes cleanup procedure, but it doesn't seem they are critical enough to deserve backporting
Closesscylladb/scylladb#26315
* https://github.com/scylladb/scylladb:
test_automatic_cleanup: add test_cleanup_waits_for_stale_writes
test_fencing: fix due to new version increment
test_automatic_cleanup: clean it up
storage_proxy: wait for closing sessions in sstable cleanup fiber
storage_proxy: rename await_pending_writes -> await_stale_pending_writes
storage_proxy: use run_fenceable_write
storage_proxy: abstract_write_response_handler: apply_locally: extract post fence check
storage_proxy: introduce run_fenceable_write
storage_proxy: move update_fence_version from shared_token_metadata
storage_proxy: fix start_write() operation scope in apply_locally
storage_proxy: move post fence check into handle_write
storage_proxy: move fencing into mutate_counter_on_leader_and_replicate
storage_proxy::handle_read: add fence check before get_schema
storage_service: rebrand cleanup_fiber to vnodes_cleanup_fiber
sstable_cleanup_fiber: use coroutine::parallel_for_each
storage_service: sstable_cleanup_fiber: move flush_all_tables out of the group0 lock
topology_coordinator: barrier before cleanup
topology_coordinator: small start_cleanup refactoring
global_token_metadata_barrier: add fenced flag
This patch changes the tablet size map in load_stats. Previously, this
data structure was:
std::unordered_map<range_based_tablet_id, uint64_t> tablet_sizes;
and is changed into:
std::unordered_map<table_id, std::unordered_map<dht::token_range, uint64_t>> tablet_sizes;
This allows for improved performance of tablet tablet size reconciliation.
Future commits will extend update_fence_version, and it is simpler to do
so if the function resides in storage_proxy. Additionally, fence_version
is the only field this function accesses, and it is used solely within
storage_proxy, making this change natural on its own.
The guard should stop refreshing the ERM when the number of tablets
changes. Tablet splits or merges invalidate the tablet_id field
(_tablet), which means the guard can no longer correctly protect
ongoing operations from tablet migrations.
Fixesscylladb/scylladb#26437
The series adds an experimental flag for strongly consistent tables and extends "CREATE KEYSPACE" ddl with `consistency` option that allows specifying the consistency mode for the keyspace.
Closesscylladb/scylladb#26116
* github.com:scylladb/scylladb:
schema: Allow configuring consistency setting for a keyspace
db: experimental consistent-tablets option
We want to add strongly consistent tables as an option. We will have
two kind of strongly consistent tables: globally consistent and locally
consistent. The former means that requests from all DCs will be globally
linearisable while the later - only requests to the same DCs will be
linearisable. To allow configuring all the possibilities the patch
adds new parameter to a keyspace definition "consistency" that can be
configured to be `eventual`, `global` or `local`. Non eventual setting
is supported for tablets enabled keyspaces only. Since we want to start
with implementing local consistency configuring global consistency will
result in an error for now.
It never belonged to tables and views and its placement stems
from location of _tablet_hint handling code.
In the follwing commits we'll reference it in storage_service.cc.
It prepares pending_token_metadata to handle both new and copy
of existing metadata for consistent usage in later commit.
It also adds shared_token_metatada getter so that we don't
need to get it from db.
Using the name regular as the incremental mode could be confusing, since
regular might be interpreted as the non-incremental repair. It is better
to use incremental directly.
Before:
- regular (standard incremental repair)
- full (full incremental repair)
- disabled (incremental repair disabled)
After:
- incremental (standard incremental repair)
- full (full incremental repair)
- disabled (incremental repair disabled)
Fixes#26503Closesscylladb/scylladb#26504
This change extends the CQL replication options syntax so the replication factor can be stated as a list of rack names.
For example: { 'mydatacenter': [ 'myrack1', 'myrack2', 'myrack4' ] }
Rack-list based RF can coexist with the old numerical RF, even in the same keyspace for different DCs.
Specifying the rack list also allows to add replicas on the specified racks (increasing the replication factor), or decommissioning certain racks from their replicas (by omitting them from the current datacenter rack-list). This will allow us to keep the keyspace rf-rack-valid, maintaining guarantees, while allowing adding/removing racks. In particular, this will allow us to add a new DC, which happens by incrementally increasing RF in that DC to cover existing racks.
Migration from numerical RF to rack-list is not supported yet. Migration from rack-list to numerical RF is not planned to be supported.
New feature, no backport required.
Co-authored with @bhalevy
Fixes https://github.com/scylladb/scylladb/issues/25269
Fixes https://github.com/scylladb/scylladb/issues/23525Closesscylladb/scylladb#26358
* github.com:scylladb/scylladb:
tablets: load_balancer: Recognize that tablets are confined to racks when computing desired tablet count
locator: Make hasher for endpoint_dc_rack globally accessible
test: tablets: Add test for replica allocation on rack list changes
test: lib: topology_builder: generate unique rack names
test: Add tests for rack list RF
doc: Document rack-list replication factor
topology_coordinator: Restore formatting
topology_coordinator: Cancel keyspace alter on broader set of errors
topology_coordinator: Make keyspace alter process options through as_ks_metadata_update()
cql3: ks_prop_defs: Preserve old options
cql3: ks_prop_defs: Introduce flattened()
locator: Recognize rack list RF as valid in assert_rf_rack_valid_keyspace()
tablet_allocator: Respect binding replicas to racks
locator: network_topology_strategy: Respect rack list when reallocating tablets
cql3: ks_prop_defs: Fail with more information when options are not in expected format
locator, cql3: Support rack lists in replication options
cql3: Fail early on vnode/tablet flavor alter
cql3: Extract convert_property_map() out of Cql.g
schema: Use definition from the header instead of open-coding it
locator: Abstract obtaining the number of replicas from replication_strategy_config_option
cql3, locator: Use type aliases for option maps
locator: Add debug logging
locator: Pass topology to replication strategy constructor
abstract_replication_strategy, network_topology_strategy: add replication_factor_data class
This commit extend the TABLE_LOAD_STATS RPC with data about the tablet
replica sizes and effective disk capacity.
Effective disk capacity of a node is computed as a sum of the sizes of
all tablet replicas on a node and available disk space.
This is the first change in the size based load balancing series.
Closesscylladb/scylladb#26035