Files
scylladb/exceptions
Jan Ciolek e0442d7bfa mv: throttle view update generation for large queries
For every mutation applied to the base table we have to
generate the corresponding materialized view table updates.

In case of simple requests, like INSERT or UPDATE, the number
of view updates generated per base table mutation is limited
to at most a few view table updates per base table update.

The situation is different for DELETE queries, which can delete
the whole partitions or clustering ranges. Range deletions are
fast on the base table, but for the view table the situation
is different. Deleting a single partition in the base table
will generate as many singular view updates as there are rows
in the deleted partition, which could potentially be in the millions.

To prevent OOM view updates are generated in batches of at most 100 rows.
There is a loop which generates the next batch of updates, spawns tasks
to send them to remote nodes, generates another batch and so on.

The problem is that there is no concurrency control - each batch is scheduled
to be sent in the background, but the following batch is generated without
waiting for the previously generated updates to be sent. This can lead to
unbounded concurrency and OOM.

To protect against this view update generation should be limited somehow.

There is an existing mechanism for limiting view updates - throttling.
We keep track of how many pending view updates there are, in the view backlog,
and delay responses to the client based on this backlog's fullness.
For a well behaved client with limited concurrency this will slow down
the amount of incoming requests until it reaches an optimal point.

This works for simple queries (INSERT, UPDATE, ...), but it doesn't do anything
for range DELETEs. A DELETE is a single request that generates millions of view
updates, delaying client response doesn't help.

The throttling mechanism could be extend to cover this case - we could treat the
DELETE request like any other client and force it to wait before sending more updates.

This commit implements this approach - before sending the next batch of updates
the generator is forced to sleep for a bit of time, calculated using the exisiting
throttling equation.
The more full the backlog gets the more the generator will have to sleep for,
and hopefully this will prevent overloading the system with view updates.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2024-05-13 18:16:23 +02:00
..