Upgrade to using Jakarta EE 10 from Java EE 8 by mostly following the upgrade instructions. Only the servlet package is upgrade. Other Jakarta EE components (like the persistence package that Hibernate depends on) need to be upgraded separately.
TESTED=deployed and successfully communicated with the pubapi endpoint for web WHOIS.
Note that this currently requires packaing the App Engine runtime per instructions here due to GoogleCloudPlatform/appengine-java-standard#98. This PR will only be merged until the fix is deployed to production (https://rapid.corp.google.com/#/release/serverless_runtimes_run_java/java21_20240310_21_0).
Note that Dagger currently doesn't work with the Jakarta namespace and
we have to cap the jakarta inject package version below 2.0 so that it
sill provides classes in the old namespace.
This PR makes the runtime of most of our workload Java 21.
1. App Engine. Java 21 is in GA and it supports Java EE 8. I had to add
an environmental variable so that we don't get an
AppEngineCredentails by default (we have been using
ComputeEngineCredentials for a couple of years). The uprade to Java
21 runtime changed a system property that controls how jetty logging
works, which also control if AppEngineCredential is return. Tested by
deploying to alpha.
2. Proxy base image upgradedd to Java 21 (distroless still doesn't
support Java 21 and it looks like Temurin is the way to go
b/306728455). Tested by deploying to alpha.
3. Nomulus tool image upgrade to Temurin 21 as well. Tested locally.
4. Beam pipeline base image upgrade to Java 21. The JAVA21 flag is not
supported by gcloud yet, but specifying the image URL directly works
(and is supported). Tested by running in alpha.
5. Jetty base image upgraded to Java 21. Tested locally.
Make the necessary changes for the code base to compile with JDK 21.
Other changes:
1. Upgraded testcontainer version and the SQL image version (to be the
same as what we use in Cloud SQL). This led to some schema changes and
also changed the order of results in some test queries (for the
better I think, as the new order appears to be alphabetical).
2. Remove dependency on Truth8, which is deprecated.
3. Enable parallel Gradle task execution and greatly increased the
number of parallel tests in standardTest. Removed outcastTest.
* Check BSA block status in CheckApi
Checks for and reports BSA block status if the name is not registered or
reserved.
Also moves CheckApiActionTest to standardTest. Whatever problem forcing
it to another suite has apparently disappeared.
This includes removing (hopefully temporarily) the gradle-lint plugin as
it is incompatible with various Gradle versions (see
https://github.com/nebula-plugins/gradle-lint-plugin/issues/393). This
is somewhat unfortunate since the plugin is useful for removing unused
dependencies, though with the relatively small amount of Gradle code we
write hopefully it will not be missed much. If Nebula changes their
code to be compatible with Gradle 8+, we can re-add it easily.
This upgrade means we can remove the code added in 342051e1.
The Java code will be added in a followup PR.
Also fixed tests failing due to org.json upgrade: decimal whole numbers
no longer have their fractional parts removed, so currency value strings
must end with ".00" instead of ".0".
This includes renaming the billing classes to match the SQL table names,
as well as splitting them out into their own separate top-level classes.
The rest of the changes are mostly renaming variables and comments etc.
We now use `BillingBase` as the name of the common billing superclass,
because one-time events are called BillingEvents
Because we need to check if a contact history is the most recent for its
underlying contact resource, the query-wipe out-repeat loop no longer works
ideally due to the added overhead with the query.
Instead, we refactor the logic into a Beam pipeline where the query only
needs to be performed once and history entries eligible for wipe out are
handled individually in their own transforms. Because history entries
are otherwise immutable, we can run the pipeline in relatively relaxed
repeatable read isolation level. We also do not worry about batching for
performance, as we do not anticipate this operation to put a lot of
strains on the particular table.
This will replace the ExpandRecurringBillingEventsAction, which has a
couple of issues:
1) The action starts with too many Recurrings that are later filtered out
because their expanded OneTimes are not actually in scope. This is due
to the Recurrings not recording its latest expanded event time, and
therefore many Recurrings that are not yet due for renewal get included
in the initial query.
2) The action works in sequence, which exacerbated the issue in 1) and
makes it very slow to run if the window of operation is wider than
one day, which in turn makes it impossible to run any catch-up
expansions with any significant gap to fill.
3) The action only expands the recurrence when the billing times because
due, but most of its logic works on event time, which is 45 days
before billing time, making the code hard to reason about and
error-prone. This has led to b/258822640 where a premature
optimization intended to fix 1) caused some autorenwals to not be
expanded correctly when subsequent manual renews within the autorenew
grace period closed the original recurrece.
As a result, the new pipeline addresses the above issues in the
following way:
1) Update the recurrenceLastExpansion field on the Recurring when a new
expansion occurs, and narrow down the Recurrings in scope for
expansion by only looking for the ones that have not been expanded for
more than a year.
2) Make it a Beam pipeline so expansions can happen in parallel. The
Recurrings are grouped into batches in order to not overwhelm the
database with writes for each expansion.
3) Create new expansions when the event time, as opposed to billing
time, is within the operation window. This streamlines the logic and
makes it clearer and easier to reason about. This also aligns with
how other (cancelllable) operations for which there are accompanying
grace periods are handled, when the corresponding data is always
speculatively created at event time. Lastly, doing this negates the
need to check if the expansion has finished running before generating
the monthly invoices, because the billing events are now created not
just-in-time, but 45 days in advance.
Note that this PR only adds the pipeline. It does not switch the default
behavior to using the pipeline, which is still done by
ExpandRecurringBillingEventsAction. We will first use this pipeline to
generate missing billing events and domain histories caused by
b/258822640. This also allows us to test it in production, as it
backfills data that will not affect ongoing invoice generation. If
anything goes wrong, we can always delete the generated billing events
and domain histories, based on the unique "reason" in them.
This pipeline can only run after we switch to use SQL sequence based ID
allocation, introduced in #1831.
Some retriers are no longer needed because transactions are
automatically retried by the JPA transaction manager when there's a
transient exception.
<!-- Reviewable:start -->
- - -
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1874)
<!-- Reviewable:end -->
So long, farewell, adios, ciao, sayonara, 再见!
TESTED=deployed to alpha and used `nomulus list_tlds` to confirm that the web app can receive and serve requests.
<!-- Reviewable:start -->
- - -
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1863)
<!-- Reviewable:end -->
* Fix Gradle dependency version pinning
In Gradle 7, version labels require '!!' at the end to be free from
any forced upgrade.
Hibernate min version needs to be advanced past 5.6.12, which is buggy.
Upgraded most dependencies to the latest version.
* Remove Cloud KMS from Nomulus Server
Removed Cloud KMS from the Nomulus (:core) since it is no longer used.
Renamed remaining classes to reflect their use of the SecretManager.
Updated the config instructions to use a new codename for the keyring:
KMS to CSM. This PR works with both codenames. Will drop 'KMS' after
the internal repo is updated.
This runs over all domains that weren't deleted as of September 5. This
will fix most of b/248112997, which is itself caused by b/245940594 --
creating synthetic history objects means that the RDE pipeline will look
at those instead of the potentially-no-longer-valid data in the old
history objects.
* Convert to gradle 7.
* More fixes, regenerated lockfiles.
* Update lockfiles for dependency update.
* Fix show_upgrade_diff for new lockfile format
* Add property for allowInsecureProtocol
Allow us to override the restriction against use of plain HTTP for
communication to dependency repositories. We need this to be able to use a
local proxy for dependency gathering.
* Checking in missing gradle.lockfile
Fortunately this no longer requires a log-in, we can just send a GET
request and receive a CSV result in return.
This also adds the apache-commons CSV parser to the dependencies
See https://b.corp.google.com/issues/237784559 for more details
This includes:
- deletion of helper DB methods in tests
- deletion of various old Datastore-only classes and removal of any
endpoints
- removal of the dual-database test concept
- removal of 'ofy' from the AppEngineExtension
One of the more significant changes introduced in this PR is that we use
SQL as the backing database in all tests unless otherwise specified,
e.g. by using the TmOverrideExtension. We change various ofy-related
tests to use this.
This includes various changes:
- Deletion of SqlEntity/DatastoreEntity and related classes. Includes
any necessary changes because of that (e.g. getting a nice SQL key on
error in RegistryJpaIO).
- Deletion of classes that used libraries from the init-sql code
(RefreshDnsOnHostRenameAction)
- Removal of the JpaTransactionManager's backup implementation
- Modification of RegistryJpaWriteTest to not use init-sql code
- Removal of the Transaction class and related classes, however it does
not remove the TransactionEntity class as that would require DB
changes
- Removal of anything related to the actual usage of the database
migration schedule or read-only phases
- Various test changes and fixes to account for the differences in SQL
(like how foreign keys need to exist)
This deliberately doesn't do anything to alter the objects actually
stored in the DB yet, just how we use them
This includes:
- removing the actions that do the replay
- removing the tests for the replay
- removing the ReplayExtension and adjusting the various tests that used
it appropriately
- removing functionality relating to "things that happen during replay",
e.g. beforeSqlSaveOnReplay
This does not include:
- removing the InitSqlPipeline or similar tasks
- removing e.g. SqlEntity (it's used in other places)
- removing Transforms/RegistryJpaIO and other SQL-pipeline-creation code
This included removing ofy-specific code from various tests. Also, some
of the other tests (e.g. RdapDomainActionTest) had to be configured to
use only SQL -- otherwise, as it currently stands, they were trying to
use ofy.
We also delete the CreateSyntheticHistoryEntriesAction and pipeline
because they're no longer relevant, and impossible to test (the goal of
the actions were to create objects in ofy which doesn't happen any
more).
* Create a Dataflow pipeline to resave EPP resources
This has two modes.
If `fast` is false, then we will just load all EPP resources, project them to the current time, and save them.
If `fast` is true, we will attempt to intelligently load and save only resources that we expect to have changes applied when we project them to the current time. This means resources with pending transfers that have expired, domains with expired grace periods, and non-deleted domains that have expired (we expect that they autorenewed).
* Begin migration from Guava Cache to Caffeine
Caffeine is apparently strictly superior to the older Guava Cache (and is even
recommended in lieu of Guava Cache on Guava Cache's own documentation).
This adds the relevant dependencies and switch over just a single call site to
use the new Caffeine cache. It also implements a new pattern, asynchronously
refreshing the cache value starting from half of our configuration time. For
frequently accessed entities this will allow us to NEVER block on a load, as it
will be asynchronously refreshed in the background long before it ever expires
synchronously during a read operation.
* Add a tools command to launch SQL validation job
Stopping using Pipeline.run().waitUntilFinish in
ValidateDatastorePipeline. Flex-templalate does not support blocking
wait in the main thread.
This PR adds a new ValidateSqlCommand that launches the pipeline and
maintains the SQL snapshot while the pipeline is running.
This PR also added more parameters to both ValidateSqlCommand and
ValidateDatastoreCommand:
- The -c option to supply an optional incremental comparison start time
- The -r option to supply an optional release tag that is not 'live',
e.g., nomulus-DDDDYYMM-RC00
If the manual launch option (-m) is enabled, the commands will print the
gcloud command that can launch the pipeline.
Tested with sandbox, qa and the dev project.
* Remove ReportingUtils and use CloudTaskUtil to enqueue
* Use schedule time helper to enqueue and update schedule time comparison
* Fix comment, indentation in gradle file and improve time comparison
* Compare migration data with SQL as primary DB
Add a BEAM pipeline that compares the secondary Datastore against SQL.
This is a dumb pipeline to be launched by a driver (in a followup PR).
Manually tested pipeline in sandbox.
Also updated the ValidateSqlPipeline and the snapshot finder class so
that an appropriate Datastore export is found (one that ends before the
replay checkpoint value).
We already have ValidateEscrowDepositCommand to check for internal
reference consistency of two deposits, i. e. making sure that all
contacts and hosts referenced by domains exist in the same deposit.
Therefore to compare whether two deposits are equal we only need to make
sure that they contain the same domains and registrars, assuming they
both pass the validation. We don't compare their contents directly
because the MapReduce deposit contains all contacts and domains whereas
the Beam deposit only contains referenced ones, making a direct
comparison impossible.
<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1476)
<!-- Reviewable:end -->
This version of Beam does not have an explicit dependency on log4j.
There are a couple of other things that need to change due to the
upgrade.
1) The new version pulls in a dependency that is not on Maven Central
but on packages.confluent.io, so we need to explicitly add this repo.
2) The new version has a dependency on flogger 0.6 anb above , which removed
the LoggerConfig class (see google/flogger#142).
We therefore backported the class. In the long term we should do what
was suggested in the issue and use the normal JDK Logger config
directly.
3) The intSqlPipeline dependency graph also needs to be updated.
<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1472)
<!-- Reviewable:end -->
* Validate SQL with Datastore being primary
Validates the data asynchronously replicated from Datastore to SQL.
This is a short term tool optimized for the current production database.
Tested in production.
BSD sed requires a parameter to -i to indicate the backup suffix. By
adding a blank suffix the sed command works on both Linux and macOS.
<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1421)
<!-- Reviewable:end -->
* Add a beam pipeline to create synthetic history entries in SQL
The logic is mostly lifted from CreateSyntheticHistoryEntriesAction. We
do not need to test for the existence of an embedded EPP resource in the
history entry before create a synthetic one because after
InitSqlPipeline runs it is guaranteed that no embedded resource exists.
This PR adds the final step in RDE pipeline (enqueueing the next action
to Cloud Tasks) and makes some necessary changes, namely by making all
CloudTasksUtils related classes serializable, so that they can be used
on Beam.
<!-- Reviewable:start -->
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/google/nomulus/1330)
<!-- Reviewable:end -->
* Make :core:cleanTest depend on FilterTests
The "cleanTest" target doesn't work for our specialized tests derived from
FilterTest. Make them all explicit dependencies of cleanTest so we can reset
the tests from a single target.