Files
scylladb/test/lib
Tomasz Grabiec 962fa5a369 Merge '[Backport 2025.1] tablets: Equalize per-table balance when allocating tablets for a new table' from Scylladb[bot]
Fixes the following scenario:

1. Scale out adds new nodes to each rack
2. Table is created - all tablets are allocated to new nodes because they have low load
3. Rebalancing moves tablets from old nodes to new nodes - table balance for the new table is not fixed

We're wrong to try to equalize global load when allocating tablets,
and we should equalize per-table load instead, and let background load
balancing fix it in a fair way. It will add to the allocated storage
imbalance, but:

1. The table is initially empty, so doesn't impact actual storage imbalance.
2. It's more important to avoid overloading CPU on the nodes - imbalance hurts this aspect immediately.
3. If the table was created before imbalance was formed, we would end up in the same situation as in the problematic scenario after the patch.
4. It's the job of the load balancing to keep up with storage growing, and if it's not, scale out should kick in.

Before we have CPU-aware tablet allocation, and thus can prove we have
CPU capacity on the small nodes, we should respect per-table balance
as this is the way in which we achieve full CPU utilization.

Fixes #23631

Backport to 2025.1 because load imbalance is a serious problem in production.

- (cherry picked from commit d493a8d736)

- (cherry picked from commit 2597a7e980)

- (cherry picked from commit 1e407ab4d2)

Parent PR: #23708

Closes scylladb/scylladb#23873

* github.com:scylladb/scylladb:
  tablets: Equalize per-table balance when allocating tablets for a new table
  load_sketch: Tolerate missing tablet_map when selecting for a given table
  tests: tablets: Simplify tests by moving common code to topology_builder
2025-04-28 13:23:25 +02:00
..
2025-01-09 10:37:35 +00:00