* Fix entity delete replication, compare db @ replay
Replay tests currently only verify that the contents of a transaction are
can be successfully replicated to the other database. They do not verify that
the contents of both databases are equivalent. As a result, we miss any
changes omitted from the transaction (as was the case with entity deletions).
This change adds a final database comparison to ReplayExtension so we can
safely say that the databases are in the same state.
This comparison is introduced in part as a unit test for the one-line fix for
replication of an "entity delete" operation (where we delete using an entity
object instead of the object's key) which so far has only affected PollMessage
deletion. The fix is also included in this commit within
JpaTransactionManagerImpl.
* Exclude tests and entities with failing comparisons
* Get all tests to pass and fix more timestamp
Fix most of the unit tests that were broken by this change.
- Fix timestamp updates after grace period changes in DomainContent and for
TLD changes in Registrar.
- Reenable full database comparison for most DomainCreateFlowTest's.
- Make some test entities NonReplicated so they don't break when used with
jpaTm().delete()
- Diable checking of a few more entity types that are failing comparisons.
- Add some formatting fixes.
* Remove unnecessary "NoDatabaseCompare"
I turns out that after other fixes/elisions we no longer need these for
any tests in DomainCreateFlowTest.
* Changes for review
* Remove old "compare" flag.
* Reformatted.
* Make a few quality-of-life improvements in CloudTasksUtils
1. Update the method names. There are too many overloaded methods and it
is hard to figure out which one does which without checking the
javadoc.
2. Added a method in the task matcher to specify the delay time in
DateTime, so the caller does not need to convert it to Timestamp.
3. Remove the expilict dependency on a clock when enqueueing a task with
delay, the clock is now injected directly into the util instance
itself.
This is necessary because the Cloud Tasks API is not transactionally enrolled,
so it's possible that multiple tasks might end up being enqueued. We need to be
able to handle them.
* Fix update timestamps for DomainContent types
We expect update timestamps to be updated whenever a containing entity is
modified and persisted, but unfortunately Hibernate doesn't seem to do this --
instead it appears to regard such an entity as unchanged.
To work around this, we explicitly reset the update timestamp whenever a
nested collection is modified in the Builder.
Note that this change only solves the problem for DomainContent. All other
entitities containing UpdateAutoTimestamp will need to be audited and
instrumented with a similar change.
* Fix a handful of tests broken by this change
* Reformatted.
* Use CloudTaskUtils to enqueue
* Add CloudTasksUtilsModule to FrontendComponent
* Fix Uri query issue
* Remove header and check service in matcher
* Use a ThreadLocal boolean in TestServer to determine enqueueing
* Extract enqueuing and email sending from tm().transact()
* Add action to DB comparison pipeline
Add a backend Action in Nomulus server that lanuches the pipeline for
comparing datastore (secondary) with Cloud SQL (primary).
* Save progress
* Revert test changes
* Add pipeline launching
* Fix create/update timestamp replay problems
When CreateAutoTimestamp and UpdateAutoTimestamp are inserted into a
Transaction, their values are not populated in the same way as when they are
stored in the course of an SQL commit. This results in different timestamp
values between SQL and datastore during the SQL -> DS replay.
Fix this by providing these values from the JPA transaction time when we're
doing transaction serialization.
This change also removes the initialization of the Ofy clock in
ExpandRecurringBillingEventsActionTest. It's not necessary as the
ReplayExtension already takes care of this and doing it after the
ReplayExtension as we were breaks a test now that the update timestamps are
correct.
* Remove ReportingUtils and use CloudTaskUtil to enqueue
* Use schedule time helper to enqueue and update schedule time comparison
* Fix comment, indentation in gradle file and improve time comparison
* Change from TaskQueueUtils to CloudTasksUtils in LoadTestAction
* Put X_CSRF_TOKEN in task headers
* Fix schedule time and gradle issue
* Remove TaskQueue constant dependency
* Double run seconds
* Add comment for X_CSRF_TOKEN
* Fix flaky RdeStagingActionDatastoreTest
Fixed the most common cause that makes one method flaky (Clock and
timestamp problem). Added a TODO to rethink test case.
Also added notes on tasks potentially enqueued multiple times.
* Use the built-in replicaJpaTm() in RDAP
This includes a test for the replica-simulating transaction manager and
removal of any replica-specific code in RDAP tests, because it's
unnecessary due to the existing tests.
* Add DS validation to match Cloud DNS
* Add checks to flows
* Add some flow tests
* Add tests for DomainCreateFlow
* Add tests for UpdateDomainCommand
* Fix docs test
* Small fixes
* Remove builder from tests
The cached methods are only used in situations where we don't really
care about being 100% synchronously up to date (e.g. whois), and they're
not used frequently anyway, so it's safe to use the replica in these
locations.
We can use it more places later but this can serve as a template. We
should inject the connection to the read-only replica (only created
once) to the constructor of the action, then use that instead of the
regular transaction manager.
We add a transaction manager that simulates the read-only-replica
behavior for testing purposes as well.
In addition, we set the transaction isolation level to READ COMMITTED
for this transaction manager (this is fine since we're never writing to
it). Postgres requires this for replica SQL access (it fails if we try
to use SERIALIZABLE) transactions. We didn't see this with the pipelines
before since those already had transaction isolation level overrides
* Only compare recent changes in Datastore and SQL
When comparing Datastore and SQL, ignore older History and EPP resource
objects. This cuts the run time in half compared with a full comparison.
The intention is to run a full comparison before the switch-over from
Datastore and SQL, and run this incremental comparison during the down
time.
The incremental comparison takes about 25 minutes in production.
Performance can be improved further by filtering out older billing
events (OneTime and Cancellation). However, we don't think further
optimization is worth the effort (considering that Recurring events
cannot be filtered since they are mutable but without lastUpdateTime).
Verified in Sandbox and prod with and without time filter.
We already have ValidateEscrowDepositCommand to check for internal
reference consistency of two deposits, i. e. making sure that all
contacts and hosts referenced by domains exist in the same deposit.
Therefore to compare whether two deposits are equal we only need to make
sure that they contain the same domains and registrars, assuming they
both pass the validation. We don't compare their contents directly
because the MapReduce deposit contains all contacts and domains whereas
the Beam deposit only contains referenced ones, making a direct
comparison impossible.
<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1476)
<!-- Reviewable:end -->
* Speed up updating of premium lists
There are two parts to this:
1. Don't load the premium entries in the command prompt (this isn't
necessary and we didn't display that information anyway).
2. Set a proper batch size (rather than just 1) when saving all the
premium entries. This means that we generate only one INSERT statement
rather than N statements.
* Resolve ResaveEntityAction related conflicts
* Replace string with existing constants
* Remove solved TODOs related to ofy string to new vkey string
* Add a TODO for clean up
* Fix missing annotation
* Make not logged in errors take precedence over extension errors
This is the right order to do the checks in, because if the registrar isn't
logged in (or their login failed) then they will have an empty set of declared
extensions, so any attempt to use an extension will throw a "Service
extension(s) must be declared at login" error. This is potentially misleading
because the actual error in this situation is that the registrar isn't logged
in at all.
This also fixes some flows that weren't declared final (but should be), or
methods declared final on final classes, which is superfluous.
* Don't throw errors when existing premium list is empty
This state is possible to get into when things go wrong and it shouldn't prevent
saving new revisions of the list. Note that it will continue to throw errors if
you attempt to save a new revision that is blank (which is usually a mistake).
See http://b/211774375
* Add pending action extension to server update poll messages
This is necessary for the poll messages to contain the necessary context
explaining what domain name the relevant statuses were being added/removed
to/from.
* Always use JPA TM on Beam
Beam does not have access to datastore. Using ofy on Beam always results
in an error. Normally we should use database migration state schedule to
determine which TM to use, but on Beam there's no point in doing so. By
hard-coding the TM on beam to be SQL we can start testing features before
we migrate to SQL mode, for example the new RDE pipeline.
Also made a change to where the manual deposits are stored. It made more
sense to store them under manual/[direcitory]/[jobname]/ instead of
[jobname]/manual/[directory]/.
TESTED=deployed the pipeline on production and ran a job.
This version of Beam does not have an explicit dependency on log4j.
There are a couple of other things that need to change due to the
upgrade.
1) The new version pulls in a dependency that is not on Maven Central
but on packages.confluent.io, so we need to explicitly add this repo.
2) The new version has a dependency on flogger 0.6 anb above , which removed
the LoggerConfig class (see google/flogger#142).
We therefore backported the class. In the long term we should do what
was suggested in the issue and use the normal JDK Logger config
directly.
3) The intSqlPipeline dependency graph also needs to be updated.
<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1472)
<!-- Reviewable:end -->
* Make ImmutableObject.toString deterministic
Remove the identity hash from the output. There is no use case
(including debugging) for it.
Removing it allows us to also remove some overriding implementations in
subclasses, and may also simplify tests.
This adds two new options:
1) An option to run RDE in lenient mode.
2) An option to run RDE with the new Beam pipeline regardless of the datastore setting.
<!-- Reviewable:start -->
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1453)
<!-- Reviewable:end -->
* Filter out empty dsData objects, not just null ones
Hibernate/SQL will get mad if the digest is null or empty, and
previously we only check for null. We should filter out empty digests as
well.
* Properly handle Joda Money in JPA
Joda Money has BigDecimal as amount, which is mapped to a numeric(19,2)
column in the database. As a result, the Money amount load from DB has
scale 2. This becomes a problem with currencies such as JPY, which
requires scale to be 0. To properly load a currency, we must adjust the
scale post-load.
The current approach, which uses Hibernate component mapping, puts the
burden of post-load cleanup on each entity type that uses Money. It is
easy to forget this, as we just discovered.
This PR uses a CompositeUserType to map Money. It adjusts the scale
properly when loading Money instances. Although CompositeUserType appear
to be deprecated in Hibernate 6, it is the only proper solution right
now for mapping non-owned classes.
This is what's causing https://b.corp.google.com/issues/208274109, where
there are DTR rows with null foreign key values.
We should probably wait to make the columns officially non-null until we
get this in and verify that we can do so.
* Write commit logs during SQL->DS replay
Previously, we had no way to ignore read-only mode while still writing
commit log backups. Now, we added this so we can write commit logs in
the SQL->DS replay.
Note:
- When moving to either of the DATASTORE_PRIMARY stages, one must
manually set the SqlReplayCheckpoint first. We don't write to SQL with
backup in this stage because we already wrote the transaction in
question to Datastore. The fact that we manually set the replay
checkpoint means that we'll ignore the extra commit logs that might
otherwise cause problems if we switched back and forth from
DATASTORE_PRIMARY to SQL_PRIMARY.
- The commit logs written during the SQL_PRIMARY phase will, ideally, be
unused. We write them here only so that in the event of a rollback to
Datastore, we will have them for RDE purposes.
* Add NotLoggedInException tests to flows and flow docs
This wasn't included in flows.md before because the test existed in
ResourceFlowTestCase. So even though the exception could be thrown and
even though this was tested, it wasn't picked up in the documentation
because the documentation is picked up from the corresponding concrete
test class.
* Validate SQL with Datastore being primary
Validates the data asynchronously replicated from Datastore to SQL.
This is a short term tool optimized for the current production database.
Tested in production.
We want to keep the read-only-mode-exception as an unchecked exception,
so we introduce a temporary check in the EppController that provides a
specific error message for this situation (rather than letting it fall
through to the generic "command failed" messaging
* Replace with stringify() and VKey.create(string)
* Convert implicit cases of VKey.fromWebsafeKey(string)
* Convert from Key to VKey to use stringify()
* Modify existing code to show correct string representation of a key
* Use VKey.create(websafeKey) to get ofy key in ResaveEntitiesCommand
* Add TODO note in CommitLogMutation and determine if key string should be modified
* Revert from stringify() to getOfyKey().getString()
* Add bug ids to TODOs