Files
scylladb/test/lib
Avi Kivity 2dcd2b21ae Merge 'tablets: Equalize per-table balance when allocating tablets for a new table' from Tomasz Grabiec
Fixes the following scenario:

1. Scale out adds new nodes to each rack
2. Table is created - all tablets are allocated to new nodes because they have low load
3. Rebalancing moves tablets from old nodes to new nodes - table balance for the new table is not fixed

We're wrong to try to equalize global load when allocating tablets,
and we should equalize per-table load instead, and let background load
balancing fix it in a fair way. It will add to the allocated storage
imbalance, but:

1. The table is initially empty, so doesn't impact actual storage imbalance.
2. It's more important to avoid overloading CPU on the nodes - imbalance hurts this aspect immediately.
3. If the table was created before imbalance was formed, we would end up in the same situation as in the problematic scenario after the patch.
4. It's the job of the load balancing to keep up with storage growing, and if it's not, scale out should kick in.

Before we have CPU-aware tablet allocation, and thus can prove we have
CPU capacity on the small nodes, we should respect per-table balance
as this is the way in which we achieve full CPU utilization.

Fixes #23631

Backport to 2025.1 because load imbalance is a serious problem in production.

Closes scylladb/scylladb#23708

* github.com:scylladb/scylladb:
  tablets: Equalize per-table balance when allocating tablets for a new table
  load_sketch: Tolerate missing tablet_map when selecting for a given table
  tests: tablets: Simplify tests by moving common code to topology_builder
2025-04-21 17:06:30 +03:00
..
2025-02-04 12:21:31 +03:00