This allows us to switch the proxy to a different client ID without
disrupting the service. This is a temporary measure and will be removed
once the switch is complete.
Also adds a placeholder getter in the Tld class, so that it can be
mocked/spied in tests. This way more BSA related code can be submitted
before the schema is deployed to prod.
This also adds a targeted QPS as a parameter in case we need to manually bump it
up (or down) for some reason without having to make code changes and re-deploy.
This PR makes a few changes to make it possible to turn on
per-transaction isolation level with minimal disruption:
1) Changed the signatures of transact() and reTransact() methods to allow
passing in lambdas that throw checked exceptions. Previously one has
always to wrap such lambdas in try-and-retrow blocks, which wasn't a
big issue when one can liberally open nested transactions around small
lambdas and keeps the "throwing" part outside the lambda. This becomes a
much bigger hassle when the goal is to eliminate nested transactions and
put as much code as possible within the top-level lambda. As a result,
the transactNoRetry() method now handles checked exceptions by re-throwing
them as runtime exceptions.
2) Changed the name and meaning of the config file field that used to
indicate if per-transaction isolation level is enabled or not. Now it
decides if transact() is called within a transaction, whether to
throw or to log, regardless whether the transaction could have
succeeded based on the isolation override level (if provided). The
flag will initially be set to false and would help us identify all
instances of nested calls and either refactor them or use reTransact()
instead. Once we are fairly certain that no nested calls to transact()
exists, we flip the flag to true and start enforcing this logic.
Eventually the flag will go away and nested calls to transact() will
always throw.
3) Per-transaction isolation level will now always be applied, if an
override is provided. Because currently there should be no actual
use of such feature (except for places where we explicitly use an
override and have ensured no nested transactions exist, like in
RefreshDnsForAllDomainsAction), we do not expect any issues with
conflicting isolation levels, which would resulted in failure.
3) transactNoRetry() is made package private and removed from the
exposed API of JpaTransactionManager. This saves a lot of redundant
methods that do not have a practical use. The only instances where this
method was called outside the package was in the reader of
RegistryJpaIO, which should have no problem with retrying.
We have been using SHA256 to hash passwords (for both EPP and registry lock),
which is now considered too weak.
This PR switches to using Scrypt, a memory-hard slow hash function, with
recommended parameters per go/crypto-password-hash.
To ease the transition, when a password is being verified, both Scrypt
and SHA256 are tried. If SHA256 verification is successful, we re-hash
the verified password with Scrypt and replace the stored SHA256 hash
with the new one. This way, as long as a user uses the password once
before the transition period ends (when Scrypt becomes the only valid
algorithm), there would be no need for manual intervention from them.
We will send out notifications to users to remind them of the transition
and urge them to use the password (which should not be a problem with
EPP, but less so with the registry lock). After the transition,
out-of-band reset for EPP password, or remove-and-add on the console for
registry lock password, would be required for the hashes that have not
been re-saved.
Note that the re-save logic is not present for console user's registry
lock password, as there is no production data for console users yet.
Only legacy GAE user's password requires re-save.
The domains (and their associated billing recurrences) were already being loaded
to check restores, but they also now need to be loaded for renews and transfers
as well, as the billing renewal behavior on the recurrence could be modifying
the relevant renew price that should be shown. (The renew price is used for
transfers as well.)
See https://buganizer.corp.google.com/issues/306212810
Both actions have not been used for a while (the wipe out action
actually caused problems when it ran unintentionally and wiped out QA).
Keeping them around is a burden when refactoring efforts have to take
them into consideration.
It is always possible to resurrect them form git history should the need
arises.
We previously needed to use URLFetch in some instances where TLS 1.3 is
required (mostly when connecting to ICANN servers),and the JDK-bundled SSL
engine that came with App Engine runtime did not support TLS 1.3.
It appears now that the Java 8 runtime on App Engine supports TLS 1.3
out of the box, which allows us to get rid of URLFetch, which depends on
App Engine APIs.
Also removed some redundant retry and logging logic, now that we know
the HTTP client behaves correctly.
TESTED=modified the CannedScriptExecutionAction and deployed to alpha, used the
new HTTP client to connect to the three URL endpoints that were
problematic before and confirmed that TLS connections can be established. HTTP
sessions were rejected in some cases when authentication failed, but
that was expected.
Also made some refactoring to various Auth related classes to clean up things a bit and make the logic less convoluted:
1. In Auth, remove AUTH_API_PUBLIC as it is only used by the WHOIS and EPP endpoints accessed by the proxy. Previously, the proxy relies on OAuth and its service account is not given admin role (in OAuth parlance), so we made them accessible by a public user, deferring authorization to the actions themselves. In practice, OAuth checks for allowlisted client IDs and only the proxy client ID was allowlisted, which effectively limited access to only the proxy anyway.
2. In AuthResult, expose the service account email if it is at APP level. RequestAuthenticator will print out the auth result and therefore log the email, making it easy to identify which account was used. This field is mutually exclusive to the user auth info field. As a result, the factory methods are refactored to explicitly create either APP or USER level auth result.
3. Completely re-wrote RequestAuthenticatorTest. Previously, the test mingled testing functionalities of the target class with testing how various authentication mechanisms work. Now they are cleanly decoupled, and each method in RequestAuthenticator is tested individually.
4. Removed nomulus-config-production-sample.yaml as it is vastly out of date.
Surround the dot in unsafe domain names with a square bracket. This
is suggested by Gmail abuse-detection and allows outgoing messages
to pass Gmail's check. This should also help with recipients' checks.
Adds a delay between emails sent in a tight loop. This helps avoid
triggering Gmail abuse detections.
Also updated the recipient address for billing alerts.
* Add custom YAML serializer for Duration
This addresses b/301119144. This changes the YAML representation of a TLD to show Duration fields as a String reperesntation using the Java Duration object's toString() format. This eliminates the previous ambiguity over the time unit that is being used for each duration.
* change standardSeconds to standardMinutes in test
* Add custom serializer to the entire mapper
Unless an --oauth flag is used, the nomulus tool will only send the OIDC
header. The server still accepts both headers and the user should use
`create_user` command to create an admin User (with the --oauth flag on), which
will then allow one to use the nomulus tool without the --oauth flag.
The --oauth flag and the server's ability to support OAuth-based
authentication will be removed soon. Users are urged to create the User
object in time to avoid service interruption.
TESTED=verified on alpha.
* Defend against deserialization-based attacks
Added the `SafeObjectInputStream` class that defends attacks using
malformed serialized data, including remote code execution and
denial-of-service attacks.
Started using the new class to handle EPP resource VKeys and
PendingDeposits, which are passed across credential-boundaries: between
TaskQueue and AppEngine server, and between AppEngine server and the RDE
pipeline on GCE. Note that the wireformat of VKeys do not change,
therefore existing tasks sitting in the TaskQueue are not affected.
Also removed an unused class: JaxbFragment.
* Add a configureTld command that uses YAML
* Add more tests and edge case handling
* Add out of order test and fix wrong inject
* small changes
* Add check for ascii
* Add check for ROID suffix
Hibernate logs certain information at the ERROR level, which for the
purpose of troubleshooting is misleading, since most affected operations
succeed after retry. ERROR-level logging should only be added by Nomulus
code.
This PR does two things:
1. Disable all logging in two Hibernate classes: we cannot disable
logging at a finer granularity, and we cannot preserve lower-level
logging while disabling ERROR.
2. Adds a DatabaseException which captures all error details that may
escape the typical loggers' attention: SQLException instances can be
chained in a different way from Throwable's `getCause()` method.
The old semantics for TransactionalFlow meant "anything that needs to mutate the
database", but then FlowRunner was not creating transactions for
non-transactional flows even though nearly every flow needs a transaction (as
nearly every flow needs to hit the database for some purpose). So now
TransactionalFlow simply means "any flow that needs the database", and
MutatingFlow means "a flow that mutates the database". In the future we will
have FlowRunner use a read-only transaction for TransactionalFlow and then a
normal writes-allowed transaction for MutatingFlow. That is a TODO.
This also fixes up some transact() calls inside caches to be reTransact(), as we
rightly can't move the transaction outside them as from some callsites we
legitimately do not know whether a transaction will be needed at all (depending
on whether said data is already in memory). And it removes the replicaTm() calls
which weren't actually doing anything as they were always nested inside of normal
tm()s, thus causing confusion.
In the future, reTransact() will be the only way to initiate a transaction that
doesn't fail when called inside an outer wrapping transaction (when wrapped,
it's a no-op). It should be used sparingly, with a preference towards
refactoring the code to move transactions outwards (which this PR also
contains).
Note that this PR includes some potential efficiency gains caused by existing
poor use of transactions. E.g. in the file RefreshDnsAction, the existing code
was using two separate transactions to refresh the DNS for domains and hosts
(one is hidden in loadAndVerifyExistence(), whereas now as of this PR it has a
single wrapping transaction to do so.
It turns out that disallowing all nested transaction is impractical. So
in this PR we make it possible to run nested transactions (which are not
really nested as far as SQL is concerned, but rather lexically nested
calls to tm().transact() which will NOT open new transactions when
called within a transaction) as long as there is no conflict between the
specified isolation levels between the enclosing and the enclosed
transactions.
Note that this will change the behavior of calling tm().transact() with
no isolation level override, or a null override INSIDE a transaction.
The lack of the override will allow the nested transaction to run at
whatever level the enclosing transaction runs at, instead of at the
default level specified in the config file.
* Fix Cloud Tasks failure to retry
Replace `SC_NOT_MODIFIED` (304) with `SC_SERVICE_UNAVAILABLE` (503) when
data is not available yet. Affected actions are invoice- and spec11-publishing.
It is confirmed that Cloud Tasks currently does not retry with code 304,
despite the public documentation stating so. We will use 503 for now,
pending the decision by Cloud Tasks whether to change behavior or
documentation.
The code `TOO_EARLY` (425) is another alternative. It is not meant for
our use case but at least sounds like it is. However, it is not in any
javax.servlet jar. We don't want to define our own constant, and we cannot upgrade
to jakarta.servlet yet.
Also revert previous mitigation.
This includes a bit of refactoring of the GSON creation. There can exist
some objects (e.g. Address) where the JSON representation is not equal to the
representation that we store in the database. For these objects, when
deserializing, we should update the objects so that they reflect the
proper DB structure (indeed, this is already what we do for the XML
parsing of Address).
* Change PackagePromotion to BulkPricingPackage
* More name changes
* Fix some test names
* Change token type "BULK" to "BULK_PRICING"
* Fix missed token_type reference
* Add todo to remove package type
A config file field is added to control if per-transaction isolation
level is actually used. If set to true, nested transactions will throw
a runtime exception as the enclosing transaction may run at a different
isolation level.
In this PR we only add the ability to specify the isolation level,
without enabling it in any environment (including unit tests), or
actually specifying one for any query. This should allow us to set up
the system without impacting anything currently working.
* Mitigate Cloud task retry problem
Increase PublishSpec11Action start delay to avoid the need to retry.
The only other use case is invoice, which typically does not retry:
delay is 10 minutes, pipeline finishes within 7 minutes.
This includes two changes:
1. Creating a base string-type adapter for use parsing to/from JSON
classes that are represented as simple strings
2. Changing the object-provider methods so that the POST bodies should
contain precisely the expected object(s) and nothing else. This way,
it's easier for the frontend and backend to agree that, for instance,
one POST endpoint might accept exactly a Registrar object, or a list
of Contact objects.
Co-Authored-By: gbrodman <gbrodman@google.com>
This includes two changes:
1. Creating a base string-type adapter for use parsing to/from JSON
classes that are represented as simple strings
2. Changing the object-provider methods so that the POST bodies should
contain precisely the expected object(s) and nothing else. This way,
it's easier for the frontend and backend to agree that, for instance,
one POST endpoint might accept exactly a Registrar object, or a list
of Contact objects.