48 Commits

Author SHA1 Message Date
miyuko
89f672beda Allow detaching audit records from their blobs for garbage collection.
Resolves: https://codeberg.org/git-pages/git-pages/issues/148
2026-04-27 17:29:16 +01:00
miyuko
a233cdfbb8 Fix S3Backend.SearchAuditLog ignoring search options. 2026-04-27 16:48:36 +01:00
Catherine
e8112c1abe Add a CLI command -audit-expire to purge old audit records.
This is particularly important with the FS backend, where there isn't
necessarily native tooling capable of handling this task correctly
(since not every filesystem supports file "birth times", and since
restoring data from a backup will reset the "birth time" of audit
records to the moment of restoration).
2026-04-26 23:10:22 +00:00
miyuko
bbdaae7280 Add a domain cache to quickly reject non-existent domains. 2026-04-13 13:45:16 +00:00
miyuko
f400f8d246 Enable all S3 features when initializing the store. 2026-04-13 13:13:14 +00:00
miyuko
ed24f08d5f Constrain the parallelism of fetching audit log records. 2026-04-11 19:43:13 +00:00
Catherine
d7651941c0 Fetch manifests from S3 in parallel for histogram and tracing.
This is mainly done to speed up histogram collection, as waiting some
minutes defeats the purpose of having a quick overview function.

This commit does speed up GC tracing as well, but not as much because
audit records are still retrieved one at a time. A similar mechanism
could be added in the future there.

Filesystem logic is functionally identical since it was fine already.
2026-04-04 21:10:05 +00:00
Catherine
6775f4aab5 Fix incorrect frozen domain check for S3 backend. 2026-04-01 22:50:40 +00:00
Catherine
43b6d92492 Split UnfreezeDomain off FreezeDomain. NFC
The code would branch on the value of `freeze` in basically all
implementations and call sites.
2025-12-06 01:40:19 +00:00
Catherine
82aebb70bf Add basic garbage tracer.
This isn't a concurrent GC and it cannot provide a reliable result;
the output is just an estimate.
2025-12-06 01:21:19 +00:00
Catherine
ed2d853cbe Add EnumerateManifests API and -list-manifests option.
The new API replaces the `ListManifests` API.

This also adds `Name` and `Size` to manifest metadata.
2025-12-06 00:10:04 +00:00
Catherine
1e3c39b7f6 Add EnumerateBlobs API and -list-blobs option.
This also adds `Name` to blob metadata.
2025-12-06 00:10:04 +00:00
Catherine
92dc8f7231 Consolidate return values into BlobMetadata. NFC 2025-12-06 00:10:04 +00:00
Catherine
5f1ce5d334 Fix a bug preventing new manifests from being committed to S3. 2025-12-04 17:50:28 +00:00
Catherine
886635ce5e Implement -audit-log option.
Also, record the principal of `git-pages -{freeze,unfreeze}-domain`
and `git-pages -update-site` as the CLI administrator.
2025-12-04 15:58:14 +00:00
Catherine
f5c48d0759 Use ETag as precondition for partial updates.
Last-Modified does not have enough resolution to be fully reliable;
ETag does. This test now passes on both filesystem and MinIO:

    $ go run ./test/stresspatch -count 100
    ...
    written: 100 of 100

Other S3 implementations haven't been tested.
2025-12-04 03:00:47 +00:00
Catherine
92d6796ad9 Return both LastModified and ETag in manifest metadata. NFCI 2025-12-04 03:00:47 +00:00
Catherine
460ff41cc9 Allow PATCH method to apply partial updates.
Gated behind the `patch` feature.
2025-12-04 03:00:47 +00:00
Catherine
21b82f8e2c [breaking-change] Implement audit record retrieval.
This is only a breaking change if you've enabled the `audit` feature.
All past audit reports should be removed once this commit is deployed,
as both the Protobuf schema and the Snowflake epoch have changed.
2025-12-03 16:43:33 +00:00
Catherine
95c4f1041d Fix S3 implementation of frozen domain check. 2025-12-03 04:52:41 +00:00
Catherine
e226f51dd4 Implement auditing of important site lifecycle actions.
The list of audit events is:
  - `CommitManifest`
  - `DeleteManifest`
  - `FreezeDomain`
  - `UnfreezeDomain`

Currently these are the main abuse/moderation-relevant actions.
If collection is enabled, these events will be logged to `audit/...`
storage hierarchy; a way to examine audit logs will be added in
the future.

The auditing interposer backend is enabled with feature `audit`.
2025-12-03 04:19:41 +00:00
Catherine
6faf3b1ee3 Reformat. NFC 2025-12-03 01:07:26 +00:00
Catherine
c250922f8d Allow domains to be administratively frozen.
The following script may be used to handle abusive sites:

    cd $(mktemp -d)
    echo "<h1>Gone</h1>" >index.html
    echo "/* /index.html 410" >_redirects
    tar cf site.tar index.html _redirects
    git-pages -update-site $1 site.tar
    git-pages -freeze-domain $1
2025-12-02 23:56:01 +00:00
Catherine
0b82dcbc25 Replace s3GetObjectErrorsCount metric with *ResponseCount.
The former metric was misnamed: it only counted NoSuchKey errors.
Also, it was applied *after* the cache, meaning it was just a count
of every request that got a successful 404 from the S3 backend.
Also, it pooled blob and manifest requests together.

The new metric is 1-to-1 correspondent to S3 requests and distinguishes
between different kinds of errors. Also, it distinguishes kinds of
requests. Example output:

    git_pages_s3_get_object_responses_count{code="NoSuchKey",kind="manifest"} 1
    git_pages_s3_get_object_responses_count{code="OK",kind="blob"} 1
    git_pages_s3_get_object_responses_count{code="OK",kind="manifest"} 1
2025-11-29 00:04:50 +00:00
miyuko
cb7802df10 Pass the context to logging functions. 2025-11-22 07:05:07 +00:00
Catherine
7e1185309b Fix a regression causing non-observance of ≠200 S3 manifest responses.
Introduced in commit dd168186.
2025-11-20 07:06:14 +00:00
Catherine
0e342b11f6 Add Last-Modified: header to /.git-pages/ metadata responses. 2025-11-19 22:37:06 +00:00
Catherine
dd16818618 Refactor S3Backend.GetManifest. NFCI
This is both to reduce the amount of loose variables in the code, as
well as to make it closer to `S3Backend.GetBlob`.
2025-11-19 22:26:40 +00:00
miyuko
cef3d785ec Add a Prometheus counter for s3:GetObject errors. 2025-11-17 12:33:00 +00:00
miyuko
cf5b98e3e5 Don't issue extraneous HEAD requests for S3 GetObject operations. 2025-11-11 17:33:24 +00:00
Catherine
26b29ec4be Add Netlify _headers support. 2025-11-11 15:36:14 +00:00
Catherine
f9e142dd51 Observe all storage errors reported by GetManifest.
Otherwise users may get jumpscares of "site not found" due to temporary
conditions (network errors to S3 backend included).
2025-11-11 06:10:01 +00:00
miyuko
aa965c5a08 Use s3:GetObject instead of s3:ListObjects for CheckDomain. 2025-10-22 13:45:15 +01:00
Catherine
d1be93919f Make installable with go install. 2025-10-22 05:24:55 +00:00
Catherine
23b516cf15 Observe timings even for 304 Not Modified responses to manifest loads. 2025-10-21 00:29:42 +00:00
miyuko
2ac2aee14a Use ETags when refreshing cached manifests. 2025-10-17 21:13:58 +01:00
miyuko
e709634906 Add classic buckets to git_pages_s3_get_object_duration_seconds. 2025-10-17 03:07:14 +01:00
miyuko
cfeb2d0dbe Observe s3:GetObject latency. 2025-10-16 03:23:38 +01:00
miyuko
eda3e8a791 Add stale-while-revalidate support to the cache. 2025-10-15 23:53:12 +01:00
Catherine
afae6e42f3 S3: log blob sizes in human readable form. 2025-10-13 02:39:17 +00:00
miyuko
a85905bd31 Fix time-based cache expiration practically never working.
config.MaxAge is now a nanosecond value, and multiplying it by
time.Second (number of nanoseconds in a second) will make it too large
for the cache expiry algorithm to have any effect.
2025-10-05 15:06:33 +01:00
Catherine
b1ef57d32a Only cache NotFound errors from S3 backend, rather than any errors.
We need to cache NotFound responses to avoid hitting the backend
whenever there's a flurry of requests for the same invalid domain.
But caching any kind of error, including context cancellation, results
in poisoning the cache with that error, e.g. if the domain exists but
the client that requested it first did not wait for the response.
2025-09-30 03:44:24 +00:00
Catherine
1a0e594624 Add span based timings measurement and Sentry integration. 2025-09-30 00:56:58 +00:00
Catherine
b1c50c10de Thread context argument through the backend interface. NFC 2025-09-29 23:10:33 +00:00
miyuko
1c7ef99359 Add manifest and blob metrics. 2025-09-27 18:36:17 +01:00
Catherine
a159dba0b8 [breaking-change] Redesign environment var configuration overrides.
This is done using reflection to avoid boilerplate and potential desync
of the two configuration interfaces. The `[[wildcards]]` section did
not fit well into the "splat every config key" paradigm, so it is
unmarshalled as a whole from a JSON payload in an environment variable.

This commit also splits up the `Config` type into small per-section
struct types and removes most references to the global `config` in
favor of passing pointers to sections around.

A new option, `-print-config-env-vars`, shows the names and types of
all of the available configuration knobs.
2025-09-22 07:02:42 +00:00
Catherine
bf2922f892 [breaking-change] Add default config values where appropriate. 2025-09-21 23:08:27 +00:00
Catherine
3acab677e0 Split up backend.go. NFC 2025-09-20 04:39:13 +00:00