mirror of
https://github.com/google/nomulus
synced 2026-01-09 15:43:52 +00:00
The original RDE pipeline was a direct translation of the App Engine MapReduce logic. It turned out to be too slow, taking more than a day to run, because of the way it finds the most recent history entry.

This PR overhauls the pipeline: it uses the EPP resource entities embedded in history entries (only available in SQL) and finds the most recent entries using the SQL engine. This cuts the run time down to ~2h.

Note that a project has per-region GCP quota limits on CPU cores and external IP addresses, and these limits must be high enough to accommodate the pipeline's resource requirements. More details are provided in comments.

This PR also merges the update-cursor stage and the enqueue-next-action stage in RdeIO so that they can be done within a transaction, the same way MapReduce handles them.

<!-- Reviewable:start -->
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1427)
<!-- Reviewable:end -->
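
For readers unfamiliar with the "most recent history entry per resource" selection mentioned above, here is a minimal in-memory sketch of what that selection means. It is not the code in this PR: the actual pipeline delegates this work to the SQL engine, and the `HistoryRow` class, its fields, and `latestPerResource` are hypothetical names introduced only for illustration.

```java
import java.time.Instant;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;

public class LatestHistorySketch {

  /** Hypothetical stand-in for a history row; not the actual Nomulus entity. */
  static final class HistoryRow {
    private final String repoId;
    private final long revisionId;
    private final Instant modificationTime;

    HistoryRow(String repoId, long revisionId, Instant modificationTime) {
      this.repoId = repoId;
      this.revisionId = revisionId;
      this.modificationTime = modificationTime;
    }

    String repoId() { return repoId; }
    long revisionId() { return revisionId; }
    Instant modificationTime() { return modificationTime; }
  }

  /**
   * Keeps one row per resource: the one with the latest modification time,
   * using the revision id as a tie-breaker. A SQL engine would express the
   * same idea with a grouped MAX or a window function (assumption: this is
   * an illustration of the semantics, not the query used by the pipeline).
   */
  static Map<String, HistoryRow> latestPerResource(Collection<HistoryRow> rows) {
    Comparator<HistoryRow> byRecency =
        Comparator.comparing(HistoryRow::modificationTime)
            .thenComparingLong(HistoryRow::revisionId);
    return rows.stream()
        .collect(
            Collectors.toMap(
                HistoryRow::repoId,              // group by resource
                row -> row,                      // keep the whole row
                BinaryOperator.maxBy(byRecency)  // on collision, keep the newer row
                ));
  }

  public static void main(String[] args) {
    List<HistoryRow> rows =
        List.of(
            new HistoryRow("A", 1L, Instant.parse("2021-01-01T00:00:00Z")),
            new HistoryRow("A", 3L, Instant.parse("2021-03-01T00:00:00Z")),
            new HistoryRow("B", 2L, Instant.parse("2021-02-01T00:00:00Z")));
    latestPerResource(rows)
        .forEach((repoId, row) -> System.out.println(repoId + " -> revision " + row.revisionId()));
  }
}
```

Doing this selection row by row in pipeline code requires pulling every history entry for every resource; pushing it into the database lets the engine return only the winning row per resource, which is the kind of saving the description above attributes to the SQL-based approach.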