* Fix flaky RdeStagingActionDatastoreTest
Fixed the most common cause of flakiness in one test method (a Clock/timestamp
mismatch). Added a TODO to rethink the test case.
Also added notes on tasks potentially enqueued multiple times.
* Use the built-in replicaJpaTm() in RDAP
This includes a test for the replica-simulating transaction manager and removes
the replica-specific code from the RDAP tests, since the existing tests already
cover that behavior.
* Add DS validation to match Cloud DNS
* Add checks to flows
* Add some flow tests
* Add tests for DomainCreateFlow
* Add tests for UpdateDomainCommand
* Fix docs test
* Small fixes
* Remove builder from tests
The cached methods are only used in situations where we don't really
care about being 100% synchronously up to date (e.g. whois), and they're
not used frequently anyway, so it's safe to use the replica in these
locations.
We can use it in more places later, but this can serve as a template. We
should inject the connection to the read-only replica (created only once)
into the constructor of the action, then use that instead of the regular
transaction manager.
We add a transaction manager that simulates the read-only-replica
behavior for testing purposes as well.
In addition, we set the transaction isolation level to READ COMMITTED for
this transaction manager (this is fine since we never write through it).
Postgres requires this for SQL access on a replica: it fails if we try to use
SERIALIZABLE transactions. We didn't see this with the pipelines before
because those already had transaction isolation level overrides.
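As a rough illustration of that pattern (not the actual Nomulus API; the class,
persistence-unit, and entity names below are invented), the replica connection is built
once, handed to the action's constructor, and configured for READ COMMITTED:

```java
import java.sql.Connection;
import java.util.Map;
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

/** Hypothetical action sketch; names are illustrative only. */
public final class RdapLookupAction {

  /** Built once at startup; a DI graph would normally provide this. */
  static EntityManagerFactory createReplicaFactory() {
    return Persistence.createEntityManagerFactory(
        "replica-unit",
        Map.of(
            // Postgres rejects SERIALIZABLE on a replica, so force READ COMMITTED here.
            "hibernate.connection.isolation",
            String.valueOf(Connection.TRANSACTION_READ_COMMITTED)));
  }

  private final EntityManagerFactory replicaEmf;

  RdapLookupAction(EntityManagerFactory replicaEmf) {
    this.replicaEmf = replicaEmf;
  }

  long countDomains() {
    // Slightly stale reads are acceptable for RDAP/WHOIS, so queries go to the replica
    // instead of the regular (read-write) transaction manager.
    EntityManager em = replicaEmf.createEntityManager();
    try {
      return em.createQuery("SELECT COUNT(d) FROM Domain d", Long.class).getSingleResult();
    } finally {
      em.close();
    }
  }
}
```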
* Only compare recent changes in Datastore and SQL
When comparing Datastore and SQL, ignore older History and EPP resource
objects. This cuts the run time in half compared with a full comparison.
The intention is to run a full comparison before the switch-over from
Datastore and SQL, and run this incremental comparison during the down
time.
The incremental comparison takes about 25 minutes in production.
Performance can be improved further by filtering out older billing
events (OneTime and Cancellation). However, we don't think further
optimization is worth the effort (considering that Recurring events cannot be
filtered, since they are mutable but have no lastUpdateTime).
Verified in Sandbox and prod with and without time filter.
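A minimal sketch of the time-filter idea, assuming a lastUpdateTime accessor on the
compared entities (the real tool's types and pipeline differ):

```java
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

final class RecentChangeFilter {

  /** Assumed minimal shape of a comparable entity. */
  interface Timestamped {
    Instant lastUpdateTime();
  }

  /** Keeps only entities modified at or after the cutoff, which is what halves the run time. */
  static <T extends Timestamped> List<T> onlyRecent(List<T> entities, Instant cutoff) {
    return entities.stream()
        .filter(e -> !e.lastUpdateTime().isBefore(cutoff))
        .collect(Collectors.toList());
  }
}
```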
We already have ValidateEscrowDepositCommand to check for internal
reference consistency of the two deposits, i.e., making sure that all
contacts and hosts referenced by domains exist in the same deposit.
Therefore to compare whether two deposits are equal we only need to make
sure that they contain the same domains and registrars, assuming they
both pass the validation. We don't compare their contents directly
because the MapReduce deposit contains all contacts and domains whereas
the Beam deposit only contains referenced ones, making a direct
comparison impossible.
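A sketch of that equality check with hypothetical accessors (the real command parses
the XML deposits; both inputs are assumed to have already passed validation):

```java
import java.util.Set;

final class DepositComparator {

  /**
   * Compares only the domain and registrar sets. Contacts and hosts are deliberately not
   * compared: the MapReduce deposit contains all of them, while the Beam deposit only
   * contains the referenced ones.
   */
  static boolean sameContents(
      Set<String> mapReduceDomains,
      Set<String> mapReduceRegistrars,
      Set<String> beamDomains,
      Set<String> beamRegistrars) {
    return mapReduceDomains.equals(beamDomains)
        && mapReduceRegistrars.equals(beamRegistrars);
  }
}
```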
* Speed up updating of premium lists
There are two parts to this:
1. Don't load the premium entries in the command prompt (this isn't
necessary and we didn't display that information anyway).
2. Set a proper batch size (rather than just 1) when saving all the
premium entries. This means that we generate only one INSERT statement
rather than N statements.
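A sketch of the batching in (2), using plain JPA with an assumed batch size; in practice
Hibernate's hibernate.jdbc.batch_size must also be configured for the driver to actually
batch the inserts:

```java
import java.util.List;
import javax.persistence.EntityManager;

final class BatchSaver {

  private static final int BATCH_SIZE = 1000; // illustrative value

  /** Persists all entries, flushing/clearing periodically so inserts go out in batches. */
  static <T> void saveAll(EntityManager em, List<T> entries) {
    em.getTransaction().begin();
    for (int i = 0; i < entries.size(); i++) {
      em.persist(entries.get(i));
      if ((i + 1) % BATCH_SIZE == 0) {
        em.flush(); // send the pending batch
        em.clear(); // keep the persistence context small
      }
    }
    em.getTransaction().commit();
  }
}
```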
* Resolve ResaveEntityAction related conflicts
* Replace string with existing constants
* Remove resolved TODOs related to converting ofy key strings to new VKey strings
* Add a TODO for clean up
* Fix missing annotation
* Make not logged in errors take precedence over extension errors
This is the right order to do the checks in, because if the registrar isn't
logged in (or their login failed) then they will have an empty set of declared
extensions, so any attempt to use an extension will throw a "Service
extension(s) must be declared at login" error. This is potentially misleading
because the actual error in this situation is that the registrar isn't logged
in at all.
This also fixes some flows that weren't declared final (but should be) and removes
final from methods declared on already-final classes, where it is superfluous.
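The ordering argument, as a simplified stand-in (the real flow and exception classes
differ):

```java
import java.util.Set;

final class LoginCheckOrder {

  static class NotLoggedInException extends RuntimeException {}

  static class UndeclaredServiceExtensionException extends RuntimeException {}

  static void verify(boolean loggedIn, Set<String> declared, Set<String> used) {
    // Login is checked first: a registrar that never logged in has an empty declared set,
    // so checking extensions first would raise the misleading "Service extension(s) must
    // be declared at login" error instead of the real not-logged-in error.
    if (!loggedIn) {
      throw new NotLoggedInException();
    }
    if (!declared.containsAll(used)) {
      throw new UndeclaredServiceExtensionException();
    }
  }
}
```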
* Don't throw errors when existing premium list is empty
This state is possible to get into when things go wrong and it shouldn't prevent
saving new revisions of the list. Note that it will continue to throw errors if
you attempt to save a new revision that is blank (which is usually a mistake).
See http://b/211774375
* Add pending action extension to server update poll messages
This is necessary for the poll messages to contain the necessary context
explaining what domain name the relevant statuses were being added/removed
to/from.
* Always use JPA TM on Beam
Beam does not have access to Datastore, so using ofy on Beam always results
in an error. Normally we should use the database migration state schedule to
determine which TM to use, but on Beam there's no point in doing so. By
hard-coding the TM on Beam to be SQL, we can start testing features before we
migrate to SQL mode, for example the new RDE pipeline.
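The selection logic, sketched with made-up types (the real code lives in the
transaction-manager factory and the migration-state schedule):

```java
final class TmChooser {

  enum Backend { DATASTORE, SQL }

  /**
   * On Beam workers the choice is pinned to SQL regardless of the scheduled primary,
   * because Datastore (ofy) is not reachable there.
   */
  static Backend choose(boolean runningOnBeam, Backend scheduledPrimary) {
    return runningOnBeam ? Backend.SQL : scheduledPrimary;
  }
}
```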
Also made a change to where the manual deposits are stored. It made more
sense to store them under manual/[directory]/[jobname]/ instead of
[jobname]/manual/[directory]/.
TESTED=deployed the pipeline on production and ran a job.
This version of Beam does not have an explicit dependency on log4j.
There are a couple of other things that need to change due to the
upgrade.
1) The new version pulls in a dependency that is not on Maven Central
but on packages.confluent.io, so we need to explicitly add this repo.
2) The new version has a dependency on flogger 0.6 and above, which removed
the LoggerConfig class (see google/flogger#142).
We therefore backported the class. In the long term we should do what
was suggested in the issue and use the normal JDK Logger config
directly.
3) The intSqlPipeline dependency graph also needs to be updated.
* Make ImmutableObject.toString deterministic
Remove the identity hash from the output. There is no use case
(including debugging) for it.
Removing it allows us to also remove some overriding implementations in
subclasses, and may also simplify tests.
This adds two new options:
1) An option to run RDE in lenient mode.
2) An option to run RDE with the new Beam pipeline regardless of the datastore setting.
* Filter out empty dsData objects, not just null ones
Hibernate/SQL will get mad if the digest is null or empty, and previously we
only checked for null. We should filter out empty digests as well.
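A sketch of the stricter filtering (the digest representation is an assumption; the real
dsData value class differs):

```java
import java.util.List;
import java.util.stream.Collectors;

final class DsDataFilter {

  /** Drops digests that are null *or* empty, since Hibernate rejects both. */
  static List<byte[]> usableDigests(List<byte[]> digests) {
    return digests.stream()
        .filter(digest -> digest != null && digest.length > 0)
        .collect(Collectors.toList());
  }
}
```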
* Properly handle Joda Money in JPA
Joda Money has a BigDecimal amount, which is mapped to a numeric(19,2) column
in the database. As a result, a Money amount loaded from the DB has scale 2.
This becomes a problem with currencies such as JPY, which requires a scale of
0. To properly load such a currency, we must adjust the scale post-load.
The current approach, which uses Hibernate component mapping, puts the
burden of post-load cleanup on each entity type that uses Money. It is
easy to forget this, as we just discovered.
This PR uses a CompositeUserType to map Money. It adjusts the scale properly
when loading Money instances. Although CompositeUserType appears to be
deprecated in Hibernate 6, it is currently the only proper solution for
mapping non-owned classes.
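The scale fix itself looks roughly like this (shown as a static helper for illustration;
the actual change wraps the logic in a Hibernate CompositeUserType):

```java
import java.math.BigDecimal;
import org.joda.money.CurrencyUnit;
import org.joda.money.Money;

final class MoneyLoading {

  /** Rebuilds a Money from a numeric(19,2) column value, rescaling per currency. */
  static Money fromDbValue(BigDecimal amount, String currencyCode) {
    CurrencyUnit currency = CurrencyUnit.of(currencyCode);
    // A JPY amount comes back as e.g. 500.00 (scale 2), but JPY allows 0 decimal places,
    // so the scale must be adjusted before Money.of() will accept it.
    return Money.of(currency, amount.setScale(currency.getDecimalPlaces()));
  }
}
```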
This is what's causing https://b.corp.google.com/issues/208274109, where
there are DTR rows with null foreign key values.
We should probably wait to make the columns officially non-null until we
get this in and verify that we can do so.
* Write commit logs during SQL->DS replay
Previously, we had no way to ignore read-only mode while still writing commit
log backups. This adds that ability so that we can write commit logs during
the SQL->DS replay.
Note:
- When moving to either of the DATASTORE_PRIMARY stages, one must
manually set the SqlReplayCheckpoint first. We don't write to SQL with
backup in this stage because we already wrote the transaction in
question to Datastore. The fact that we manually set the replay
checkpoint means that we'll ignore the extra commit logs that might
otherwise cause problems if we switched back and forth from
DATASTORE_PRIMARY to SQL_PRIMARY.
- The commit logs written during the SQL_PRIMARY phase will, ideally, be
unused. We write them here only so that in the event of a rollback to
Datastore, we will have them for RDE purposes.
* Add NotLoggedInException tests to flows and flow docs
This wasn't included in flows.md before because the test existed in
ResourceFlowTestCase. So even though the exception could be thrown and
even though this was tested, it wasn't reflected in the documentation, because
the docs are generated only from the corresponding concrete test classes.
* Validate SQL with Datastore being primary
Validates the data asynchronously replicated from Datastore to SQL.
This is a short term tool optimized for the current production database.
Tested in production.
We want to keep the read-only-mode-exception as an unchecked exception,
so we introduce a temporary check in the EppController that provides a
specific error message for this situation (rather than letting it fall
through to the generic "command failed" messaging).
* Replace with stringify() and VKey.create(string)
* Convert implicit cases of VKey.fromWebsafeKey(string)
* Convert from Key to VKey to use stringify()
* Modify existing code to show correct string representation of a key
* Use VKey.create(websafeKey) to get ofy key in ResaveEntitiesCommand
* Add TODO note in CommitLogMutation and determine if key string should be modified
* Revert from stringify() to getOfyKey().getString()
* Add bug ids to TODOs
We can handle it the same way that we handle UpdateAutoTimestamp, where
we simply populate it in SQL if it doesn't exist. This has the following
benefits:
1. The converter is unnecessary code
2. We get non-null column definitions for free (overridden in
EppResource to allow null creation times so that legacy *History objects
can contain null in that field)
3. More importantly, this allows for proper SQL->DS replay. If the
field is filled out using a converter (as before this PR) then the field
is only actually filled out on transaction commit (rather than when the
write occurs within the transaction). This means that when we serialize
the Transaction object during the transaction (the data that gets
replayed to Datastore), we are crucially missing the creation time.
If the creation time is written on commit, we have to start a new
transaction to write the Transaction object, and it's an absolute
necessity that the record of the transaction be included in the
transaction itself so as to avoid situations where the transaction
succeeds but the record fails.
If the field is filled out in a @PrePersist method, crucially that
occurs on the object write itself (before transaction commit).
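A minimal sketch of the @PrePersist approach on a hypothetical entity (in Nomulus the
field sits behind a CreateAutoTimestamp-style wrapper rather than directly on each
entity, and uses Joda time rather than java.time):

```java
import java.time.Instant;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.PrePersist;

@Entity
class ExampleEntity {

  @Id Long id;

  // Nullable only to accommodate legacy *History rows, mirroring the EppResource
  // override described above; other entities can declare the column non-null.
  @Column Instant creationTime;

  @PrePersist
  void fillCreationTime() {
    // Runs when the entity is written inside the transaction, not at commit time, so the
    // serialized Transaction object that gets replayed to Datastore already has the value.
    if (creationTime == null) {
      creationTime = Instant.now();
    }
  }
}
```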
The original RDE pipeline was a direct translation of the App Engine
MapReduce logic. It turned out to be too slow (taking more than a day to
run) due to the way it finds the most recent history entry.
This PR overhauled the pipeline by using the embedded EPP resource entities
inside history entries (only available in SQL) and finding the most recent
entries using the SQL engine. It cuts the run time down to ~2 hours.
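The latest-entry lookup can be pushed down to Postgres with something like the query
below (table and column names are illustrative, not the real schema or the pipeline's
exact SQL):

```java
final class RdeQueries {

  /**
   * Lets Postgres pick the most recent history row per domain at or before the watermark,
   * instead of scanning and grouping all history entries inside the pipeline.
   */
  static final String LATEST_HISTORY_PER_DOMAIN =
      "SELECT DISTINCT ON (domain_repo_id) * "
          + "FROM domain_history "
          + "WHERE modification_time <= :watermark "
          + "ORDER BY domain_repo_id, modification_time DESC";
}
```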
Note that there are quota limits on the CPU cores and external IP
addresses for a given GCP region inside a project, which will need to
accommodate the resource requirements for the pipeline. More details are
provided in comments.
Also merged the update cursor stage and enqueue next action stage in
RdeIO so that they can be done within a transaction, same as how
MapReduce handles them.
* Change TaskOptions to Task in CommitLogFanoutAction
* Add a createTask method that takes clock and jitterSeconds
* Change CreateTask parameter type and improve test cases
* Improve comments and test cases
* Improve test cases that handle jitterSeconds
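A sketch of a createTask variant taking a clock and jitterSeconds (parameter names
follow the bullets above, not the exact Nomulus signature):

```java
import com.google.cloud.tasks.v2.AppEngineHttpRequest;
import com.google.cloud.tasks.v2.HttpMethod;
import com.google.cloud.tasks.v2.Task;
import com.google.protobuf.Timestamp;
import java.time.Clock;
import java.time.Instant;
import java.util.Optional;
import java.util.Random;

final class TaskSketch {

  static Task createTask(
      String path, Clock clock, Random random, Optional<Integer> jitterSeconds) {
    Task.Builder task =
        Task.newBuilder()
            .setAppEngineHttpRequest(
                AppEngineHttpRequest.newBuilder()
                    .setHttpMethod(HttpMethod.POST) // tasks default to POST
                    .setRelativeUri(path));
    if (jitterSeconds.isPresent() && jitterSeconds.get() > 0) {
      // Spread delivery over [0, jitterSeconds) past "now" so fanned-out tasks
      // don't all fire at the same instant.
      Instant when = clock.instant().plusMillis(random.nextInt(jitterSeconds.get() * 1000));
      task.setScheduleTime(
          Timestamp.newBuilder().setSeconds(when.getEpochSecond()).setNanos(when.getNano()));
    }
    return task.build();
  }
}
```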
* Grandfather in old data for one-time billing event requirement
We have data from 2018 and earlier where we didn't consistently set periodYears
for OneTime BillingEvents with certain reasons. This grandfathers in that old
data so that we can successfully move it over to Cloud SQL for now, then we can
later run a query that will backfill it, after which we can then tighten up the
requirement again. Note that the requirement is still being enforced for all
billing events from 2019 onwards.
This also improves the handling of validation by adding a private field to the
Reason enum rather than creating a throwaway inline ImmutableSet in the
Builder.
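Roughly, the validation shape described here (the reason names, the flag, and the cutoff
are illustrative, not the exact Nomulus code):

```java
import org.joda.time.DateTime;
import org.joda.time.DateTimeZone;

final class PeriodYearsCheck {

  /** Each reason carries its own flag instead of a throwaway set in the Builder. */
  enum Reason {
    CREATE(true),
    RENEW(true),
    TRANSFER(true),
    SERVER_STATUS(false);

    final boolean requiresPeriodYears;

    Reason(boolean requiresPeriodYears) {
      this.requiresPeriodYears = requiresPeriodYears;
    }
  }

  private static final DateTime CUTOFF = new DateTime(2019, 1, 1, 0, 0, DateTimeZone.UTC);

  static void check(Reason reason, Integer periodYears, DateTime eventTime) {
    // Events before 2019 are grandfathered in; newer events must set periodYears
    // whenever the reason requires it.
    if (reason.requiresPeriodYears && periodYears == null && !eventTime.isBefore(CUTOFF)) {
      throw new IllegalArgumentException("periodYears must be set for reason " + reason);
    }
  }
}
```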
* Make TaskMatcher default to POST methods
TaskOptions.Builder.withUrl() defaults to POST methods. Therefore, it seems
reasonable to verify that task queue methods are using the POST method,
especially given that the method must now be identified explicitly when using
CloudTaskUtils. This check would have guarded against the bug fixed by #1413.
* Elaborate on comment
* Further improved the comment