373 Commits

Author SHA1 Message Date
Catherine
e9a5a901ec Improve panic messages in ApplyTarPatch. 2026-02-03 09:51:22 +00:00
Catherine
8f811147d6 Enable Sentry telemetry buffer by default.
No observed issues on Grebedoc for a month, so it should be stable now.
2026-01-19 02:41:15 +00:00
Catherine
0d33c64372 [breaking-change] Only allow a single [[wildcard]].index-repo.
The git-pages webhook security model depends on there being
a 1:1 mapping between site URLs and repositories; being able to
specify multiple of them breaks this model, as anyone could switch
the published site from one to the other if both repositories exist.
2026-01-19 02:25:01 +00:00
Catherine
1f1927d95d Log Accept: value for HEAD/GET requests.
Instead of `Content-Type:` which is essentially never relevant.
2025-12-24 14:28:16 +00:00
David Leadbeater
7334b8f637 Add a Vary header when content negotiation happens
Without this, if a cache first sees a compressed version of the response,
it will return that for potentially any future requests, even if they
don't request compression.
2025-12-24 14:36:23 +11:00
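
The caching problem described in this commit can be illustrated with a minimal sketch. It is not the git-pages handler: `serveNegotiated` and `varyFor` are hypothetical names, and the `Accept-Encoding` check is deliberately simplified. The point is only that the response names the request header it was negotiated on, whichever representation is chosen.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// serveNegotiated is a hypothetical handler (not taken from git-pages).
// It chooses a representation based on Accept-Encoding and always sets
// Vary, so that a shared cache keys its entries on that request header.
func serveNegotiated(w http.ResponseWriter, r *http.Request) {
	w.Header().Add("Vary", "Accept-Encoding")
	// Simplified check; real negotiation would parse q-values.
	if strings.Contains(r.Header.Get("Accept-Encoding"), "gzip") {
		w.Header().Set("Content-Encoding", "gzip")
		// A real handler would write gzip-compressed bytes here.
	}
	w.WriteHeader(http.StatusOK)
}

// varyFor returns the Vary header produced for a given Accept-Encoding value.
func varyFor(acceptEncoding string) string {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	if acceptEncoding != "" {
		req.Header.Set("Accept-Encoding", acceptEncoding)
	}
	rec := httptest.NewRecorder()
	serveNegotiated(rec, req)
	return rec.Result().Header.Get("Vary")
}

func main() {
	fmt.Println(varyFor("gzip")) // Accept-Encoding
	fmt.Println(varyFor(""))     // Accept-Encoding
}
```

Vary is set even when no compression is applied, because the uncompressed response is also a result of negotiation.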
Catherine
96f210d253 Clear git metadata from PATCH'd manifests. 2025-12-24 02:18:09 +00:00
David Leadbeater
04729c1f48 Ensure leading directories always exist in manifest
When extracting from an archive it is possible the leading directories
are not part of the archive. Add them to the manifest as otherwise the
behaviour of "index.html" varies depending on how the archive was created.
2025-12-23 13:40:05 +01:00
miyuko
c5df116673 Scrub the Forge-Authorization header from Sentry events. 2025-12-22 14:35:02 +00:00
Catherine
d97f5ac056 Fix manifest StoredSize field being always zero. 2025-12-16 20:05:35 +00:00
Catherine
79407ba406 Fix timeout bug introduced in commit 9c6f735d.
This bug would cause POST hooks triggered for large repositories to
silently fail.

We need the update context to have the principal (which is tied to
the HTTP request), but not the cancellation (which is also tied to
the HTTP request and is triggered once the request is done either way).
2025-12-16 14:43:36 +00:00
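
One way to keep a context's values while dropping its cancellation is `context.WithoutCancel` from the Go standard library (Go 1.21 and later). The sketch below assumes that approach; the commit message does not say which mechanism the fix actually uses, and `principalKey`, `detach`, and `demo` are hypothetical names.

```go
package main

import (
	"context"
	"fmt"
)

// principalKey is a hypothetical context key for the request principal.
type principalKey struct{}

// detach returns a context that keeps the values of the request context
// but is not canceled when the request context is.
func detach(requestCtx context.Context) context.Context {
	return context.WithoutCancel(requestCtx)
}

// demo simulates a request finishing and reports what the detached
// context still sees: the principal, and whether it was canceled.
func demo() (principal string, canceled bool) {
	withPrincipal := context.WithValue(context.Background(), principalKey{}, "alice")
	requestCtx, finish := context.WithCancel(withPrincipal)
	updateCtx := detach(requestCtx)
	finish() // the HTTP request is done; requestCtx is now canceled
	principal, _ = updateCtx.Value(principalKey{}).(string)
	return principal, updateCtx.Err() != nil
}

func main() {
	principal, canceled := demo()
	fmt.Println(principal, canceled) // alice false
}
```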
David Leadbeater
937aadc5d3 Allow setting custom Cache-Control headers via _headers
Before this change, the Cache-Control header would always be overridden.
This change allows a custom Cache-Control, provided Cache-Control is added
to the header allow list.
2025-12-15 21:02:25 +11:00
Catherine
24dbab6813 Begin paths with / in problem report.
Otherwise you get reports like:

    (archive)
    : directory shadows redirect "/ /foo 301"; remove the directory or use a 301! forced redirect instead
2025-12-14 19:47:28 +00:00
Catherine
30b6db2758 Limit amount of data fetched from git repository.
Like limiting the size of an archive, it is a supplementary check meant
to limit resource consumption prior to the final check done in
`StoreManifest()`.
2025-12-14 19:42:25 +00:00
Catherine
7655400560 Limit original size of the contents of a site manifest.
The limit is applied to the original size and not compressed size for
predictability and fairness.
2025-12-14 19:30:45 +00:00
Catherine
c88d04c71b Add a relaxed-idna feature to allow some uses of _ in hostnames.
This is added to aid migration from Codeberg Pages v2. Forgejo allows
both `_` and `-` in usernames, and it is necessary to be able to accept
host names like `user_name.codeberg.page` under a wildcard domain.
(It is not possible to get a TLS certificate for a host name like this,
so only a wildcard certificate will be able to cover it.)
2025-12-12 02:27:22 +00:00
David Leadbeater
86845f2505 Check for overflow when calculating size of zip 2025-12-12 01:24:24 +00:00
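
As an illustration of the kind of check this commit title describes, here is a sketch that sums entry sizes and reports an overflow instead of wrapping around. It is an assumption about the general technique, not the code from the commit; `totalSize` and `errSizeOverflow` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"math/bits"
)

var errSizeOverflow = errors.New("total size overflows uint64")

// totalSize adds up uncompressed entry sizes and fails if the running
// total would exceed the range of uint64.
func totalSize(sizes []uint64) (uint64, error) {
	var total uint64
	for _, size := range sizes {
		sum, carry := bits.Add64(total, size, 0)
		if carry != 0 {
			return 0, errSizeOverflow
		}
		total = sum
	}
	return total, nil
}

func main() {
	fmt.Println(totalSize([]uint64{10, 20}))            // 30 <nil>
	fmt.Println(totalSize([]uint64{math.MaxUint64, 1})) // 0 total size overflows uint64
}
```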
Catherine
7f112a761c Simplify signal handling code.
This does not require `//go:build`.
2025-12-11 10:09:50 +00:00
David Leadbeater
a9cf69c04a Ensure the branch parameter really is a branch
Currently you can specify "Branch: HEAD" or "Branch: refs/tags/v1" and
go-git will resolve it to the relevant ref. Given the HTTP header is
called Branch this is confusing.
2025-12-11 17:18:19 +11:00
Catherine
132d093021 Implement -audit-rollback.
This feature is useful if you need to restore data after an accidental
overwrite or compromise.
2025-12-11 03:12:57 +00:00
David Leadbeater
62917824fa Support zstd inside zip files.
Given this is already depending on zstd I don't see a reason not to.

Can be tested with libarchive via: `bsdtar -a --options zip:compression=zstd -cf file.zip files...`

Reviewed-on: https://codeberg.org/git-pages/git-pages/pulls/91
Co-authored-by: David Leadbeater <dgl@dgl.cx>
Co-committed-by: David Leadbeater <dgl@dgl.cx>
2025-12-09 06:16:30 +01:00
Catherine
62ef4a5366 Make project name validation more consistent and stricter.
Previously, you could issue e.g. a `GET /%2e%2e/%2e%2e` and it would
get interpreted as a parent directory path segment in the handler.
This didn't result in a path traversal vulnerability when passed to
the S3 backend because of a `path.Clean()` call indirectly done by
`makeWebRoot()`, but it's prudent to not take chances.
2025-12-07 20:24:50 +00:00
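
The scenario above can be sketched as follows. This is an assumed, simplified validator rather than the one in git-pages: `validProjectName` and `checkSegment` are hypothetical names. The last line of `main` shows how `path.Clean` collapses parent-directory segments in a rooted path.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// validProjectName is a hypothetical strict check: it rejects empty names,
// "." and "..", and anything containing a path separator.
func validProjectName(name string) bool {
	switch name {
	case "", ".", "..":
		return false
	}
	return !strings.ContainsAny(name, `/\`)
}

// checkSegment percent-decodes one URL path segment and validates the result.
func checkSegment(raw string) (string, bool) {
	decoded, err := url.PathUnescape(raw)
	if err != nil {
		return "", false
	}
	return decoded, validProjectName(decoded)
}

func main() {
	fmt.Println(checkSegment("%2e%2e"))            // .. false
	fmt.Println(checkSegment("blog"))              // blog true
	fmt.Println(path.Clean("/site/../../etc"))     // /etc
}
```

Validating after decoding matters here: the raw segment `%2e%2e` contains no dots, so a check on the undecoded form would let it through.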
Catherine
8fa986015d Process IDNA host names. 2025-12-07 19:28:05 +00:00
Catherine
8d574e5e7d Stabilize the audit feature. 2025-12-07 14:31:48 +00:00
miyuko
91f05e210e [breaking-change] Remove the log-level config option.
This reverts commit 351d0a0c85.

This option does not have any effect at the moment and may potentially
confuse users. It can be easily reintroduced later (by reverting this
commit) once we start logging at any level other than `info`.
2025-12-07 13:12:45 +00:00
miyuko
bc70cba215 Apply the log-level config option to the syslog log sink. 2025-12-07 13:03:14 +00:00
Catherine
8b049da3c7 Treat allowed-repository-url-prefixes = [] the same as unspecified.
Previously, this would disallow all git clones except for those via
wildcard domains. This is highly unintuitive. It also meant that
disabling this function via environment variable was not possible.
2025-12-07 12:55:41 +00:00
Catherine
fc9e6fcf7b [breaking-change] Listen only on localhost by default.
It is expected that in most deployments, a reverse proxy server like
Caddy or Nginx will be connecting to git-pages; listening on any address
by default is a privacy and security concern.
2025-12-07 07:17:54 +00:00
Catherine
3840ba3c98 Use TOML output for -print-config instead of JSON.
This is much easier to read, and can be used as a template for
a new configuration.
2025-12-07 05:43:00 +00:00
Catherine
b58fe54c50 Report "dead" redirects as site issues.
Using a non-forced redirect with a URL matching a manifest entry turns
out to be a common and confusing mistake.
2025-12-07 04:21:00 +00:00
Catherine
d1f55d6776 Style. NFC 2025-12-07 03:41:16 +00:00
Catherine
cf2c8f6270 Don't observe errors expected during incremental updates. 2025-12-06 23:15:25 +00:00
Catherine
43b6d92492 Split UnfreezeDomain off FreezeDomain. NFC
The code would branch on the value of `freeze` in basically all
implementations and call sites.
2025-12-06 01:40:19 +00:00
Catherine
609e5ca452 Display dead blob count after tracing. 2025-12-06 01:36:52 +00:00
Catherine
82aebb70bf Add basic garbage tracer.
This isn't a concurrent GC and it cannot provide a reliable result;
the output is just an estimate.
2025-12-06 01:21:19 +00:00
Catherine
9c6f735df0 Fix loss of context in POST handler.
This caused the principal to not be available when creating the new
audit record.
2025-12-06 00:36:46 +00:00
Catherine
ed2d853cbe Add EnumerateManifests API and -list-manifests option.
The new API replaces the `ListManifests` API.

This also adds `Name` and `Size` to manifest metadata.
2025-12-06 00:10:04 +00:00
Catherine
1e3c39b7f6 Add EnumerateBlobs API and -list-blobs option.
This also adds `Name` to blob metadata.
2025-12-06 00:10:04 +00:00
Catherine
92dc8f7231 Consolidate return values into BlobMetadata. NFC 2025-12-06 00:10:04 +00:00
miyuko
e9edfb8f5c [breaking-change] Read principal's IP address from X-Forwarded-For. 2025-12-06 00:04:42 +00:00
miyuko
2cd8b58944 Don't put blobs that only contain hashes when incrementally uploading. 2025-12-05 20:41:12 +00:00
Catherine
1283b4e0eb Set Content-Type: to negotiated content type. 2025-12-05 19:33:06 +00:00
Catherine
7313ab7d13 Fix several content type negotiation issues.
* No `Accept:` header should be the same as `Accept: */*`.
* For unresolved reference error, `text/plain` should take priority.
2025-12-05 18:56:20 +00:00
Catherine
bd44f65b51 Add handling of Accept: application/vnd.git-pages.unresolved.
This will be used for incremental archive updates.
2025-12-05 18:21:42 +00:00
Catherine
8d58793576 Provide Accept-Encoding: in 406 Not Acceptable responses. 2025-12-05 16:38:31 +00:00
Catherine
6076c17c51 Rename HTTP negotiation items. NFC 2025-12-05 16:37:49 +00:00
Catherine
959715269f Collect unresolved blob references in a dedicated error structure.
This will be used for incremental archive uploads.
2025-12-05 11:31:34 +00:00
Catherine
faa486c779 Collect statistics on blob reuse during archive upload. 2025-12-05 11:20:28 +00:00
Catherine
50d28f3c8b Resolve /git/blobs/ symlinks as blob references to the old manifest.
This will be used for incremental archive uploads.
2025-12-05 10:53:49 +00:00
Catherine
eb6418b9b6 Fill in git_hash for regular files in archive uploads.
This will be used for incremental archive uploads.
2025-12-05 10:53:44 +00:00
Catherine
32c449e380 Use path.Join where applicable. NFC 2025-12-05 05:52:07 +00:00