This was added back in early 2018 long ago to enable promotions, but
since then (and for many years) we've added the ability to run
promotions on the tokens themselves, rather than relying on custom Java
classes.
This will make the changes for b/315504612 much easier, as that will
split up token validation into "is this token valid in general?" and "is
this token valid for this domain/action?"
This can potentially help even more with serializable transaction
failures (optimistic locking exceptions, which are expected to occur
somewhat frequently).
With six attempts, we will sleep at most five times, for
100+200+400+800+1600 ms each, for a total of at most 3.1 seconds (much
less than the EPP maximum which I believe (?) to be 30 seconds.
In addition, we add a 20% skew in an attempt to spread out
possibly-conflicting transaction retries.
This changes the code to only save console histories of this type. We
keep the old Java code (and, necessarily, the corresponding SQL code)
for now because there's no harm in doing so and we want to avoid hastily
deleting too much.
The SQL statement was incorrectly flooring to zero one layer too deep, which was
negating all negative transaction report rows (which occur most frequently when
a domain in the autorenew grace period is deleted). I've changed it so that it
now only floors to zero at the report level, which still solves the issue
reported in http://b/290228682 but whose original fix caused the issue
http://b/344645788
This bug was introduced in https://github.com/google/nomulus/pull/2074
I tested this by running the new query against the DB for 2024 Q4 using the
registrar that was having issues and confirmed that the total renewal numbers
for .app now match with the sum total of what we invoiced for the last three
months of 2024.
This doesn't check for correctness (we have other scripts that do that)
but just that the service is available at all (the other scripts do not
do that).
This should, and will, be configured with a scheduled trigger in GCB (for us, in
the domain-registry-dev project) and configuration to send some sort of
pub/sub notification on failure (for us, this is already set up on
domain-registry-dev and it sends messages to the "Domain Registry
Notifications" chat channel.
We've seen this issue happen more often than not recently, where GAE
canary deployment is stuck for about 10 min and the failed. The reason
is not clear, but delete the canary version prior to a deployment always
fixes the issue.
We use the $environment property to set the console config. If it is not
given, 'alpha' is used, which has the same effect as 'production'.
TESTED=ran :jetty:copyConsole with
-Penvironment=(sandbox|production|alpha) and checked the resulting js
file.
For GKE all tasks should be on backend, BSA was on its own service
because of egress IP constraint.
Also made it possible to specify a timeout for the Cloud Scheduler job,
with the default (3m) suitable for most tasks.
There are two session cookies, JSESSIONID, which is set by Jetty, and
GCLB, which is set by the Gateway.
In one session, every request other than the first one (the <hello>)
should have the same GCLB value, and every request after a successful
<login> should have the same JSESSIONID.
With these two metadata, we should be able to trace all requests that
*should* belong to the same session and debug issues with session
mismatch (if any).
There can be delays in releasing predicate locks when we have
transactions that are long-lived -- even delays in releasing predicate
locks acquired by shorter-lived transactions. Logging the transaction
duration will allow us to get a sense as to transaction durations during
busy times.
This can happen on public endpoints (in pubapi) where the service is
behind IAP but all users (including not-logged-in ones) are allowed. IAP
will add an OIDC token with no email field in the request header.
This removes logic from an inner helper method so that it becomes more clear
from callsites within each test exactly which behavior is expected from those
test conditions.
We now pin to postgreSQL v17 when running tests, which means that minor
version might increase without our intervention. This causes (at least)
the comment in the golden schema to change, and failing the test as a
result.
This PR adds the ability to strip lines that we deem as comment from the
comparison, so we don't have to do trivial upgrades to the gold schema
whenever there's minor version upgrade.
This allows us to easily tell which tag was deployed.
Also set the gateway to use named address so they are stable, and so
that we can attach an IPv6 record to it. Auto-provisioned addresses are
IPv4 only.
Failed pg_dump may not leave a file, failing the subsequent diffing and
causing the verifier to return success.
The verifier should abort in this case.