This is particularly important with the FS backend, where there isn't
necessarily native tooling capable of handling this task correctly
(since not every filesystem supports file "birth times", and since
restoring data from a backup will reset the "birth time" of audit
records to the moment of restoration).
This is mainly done to speed up histogram collection: waiting several
minutes defeats the purpose of a quick overview function.
This commit speeds up GC tracing as well, though not as much, because
audit records are still retrieved one at a time. A similar mechanism
could be added there in the future.
The filesystem logic is functionally unchanged, as it was already
correct.
Last-Modified does not have enough resolution to be fully reliable;
ETag does. This test now passes on both filesystem and MinIO:
$ go run ./test/stresspatch -count 100
...
written: 100 of 100
Other S3 implementations haven't been tested.
This is only a breaking change if you've enabled the `audit` feature.
All past audit reports should be removed once this commit is deployed,
as both the Protobuf schema and the Snowflake epoch have changed.
The list of audit events is:
- `CommitManifest`
- `DeleteManifest`
- `FreezeDomain`
- `UnfreezeDomain`
Currently these are the main abuse/moderation-relevant actions.
If collection is enabled, these events will be logged to the
`audit/...` storage hierarchy; a way to examine audit logs will be
added in the future.
The auditing interposer backend is enabled with feature `audit`.
The following script may be used to handle abusive sites:
cd $(mktemp -d)
echo "<h1>Gone</h1>" >index.html
echo "/* /index.html 410" >_redirects
tar cf site.tar index.html _redirects
git-pages -update-site $1 site.tar
git-pages -freeze-domain $1
The former metric was misnamed: it only counted NoSuchKey errors. It
was also applied *after* the cache, so it amounted to a count of every
request that was served a 404 by the S3 backend. Finally, it lumped
blob and manifest requests together.
The new metric corresponds 1-to-1 to S3 requests and distinguishes
between kinds of errors as well as kinds of requests. Example output:
git_pages_s3_get_object_responses_count{code="NoSuchKey",kind="manifest"} 1
git_pages_s3_get_object_responses_count{code="OK",kind="blob"} 1
git_pages_s3_get_object_responses_count{code="OK",kind="manifest"} 1
config.MaxAge is now a nanosecond value, and multiplying it by
time.Second (the number of nanoseconds in a second) makes it too large
for the cache expiry algorithm to have any effect.
We need to cache NotFound responses to avoid hitting the backend
whenever there's a flurry of requests for the same invalid domain.
But caching every kind of error, including context cancellation,
poisons the cache with that error, e.g. when the domain exists but the
client that requested it first gave up before the response arrived.
This is done using reflection to avoid boilerplate and potential desync
of the two configuration interfaces. The `[[wildcards]]` section did
not fit well into the "splat every config key" paradigm, so it is
unmarshalled as a whole from a JSON payload in an environment variable.
This commit also splits up the `Config` type into small per-section
struct types and removes most references to the global `config` in
favor of passing pointers to sections around.
A new option, `-print-config-env-vars`, shows the names and types of
all of the available configuration knobs.