Commit Graph

221 Commits

Author SHA1 Message Date
Catherine
f148792bcd Accept an output argument in -get-blob, -get-manifest, -get-archive. 2025-11-20 03:03:03 +00:00
Catherine
6db850e2c4 Allow downloading entire site via CLI or HTTP.
The HTTP endpoint is `/.git-pages/archive.tar` and it is gated behind
a feature flag `archive-site`. It serially downloads every blob and
writes it to the client in a chunked response, optionally compressed
with gzip or zstd as per `Accept-Encoding:`. It is authorized the same
as `/.git-pages/manifest.json`, for the same reasons.

The CLI operation is `-get-archive <site-name>` and it writes a tar
archive to stdout. This could be useful for an administrator to review
the contents of a site in response to a report.

Both `_headers` and `_redirects` files are present in the output,
reconstituted from the manifest.
2025-11-20 02:09:49 +00:00
Catherine
aa6e495505 Fix DIV/0 when compressing a site without contents.
I think this doesn't affect anything, but prevents an embarrassing
message from appearing in the log:

    compress: saved NaN percent (0 B to 0 B)
2025-11-20 01:17:53 +00:00
Catherine
0e342b11f6 Add Last-Modified: header to /.git-pages/ metadata responses. 2025-11-19 22:37:06 +00:00
Catherine
dd16818618 Refactor S3Backend.GetManifest. NFCI
This is both to reduce the amount of loose variables in the code, as
well as to make it closer to `S3Backend.GetBlob`.
2025-11-19 22:26:40 +00:00
Catherine
0b2db170b8 Allow updating wildcard domain sites from an archive with a forge token. 2025-11-19 04:10:02 +00:00
Catherine
457dd60aa0 Factor out authentication helpers. NFC 2025-11-19 02:53:48 +00:00
Catherine
073435aa2b Redirect domain.tld/project to domain.tld/project/ when present.
This is to match the behavior of GitHub, as well as because it isn't
particularly useful to serve a file from the index repo with the same
path segment as the project name (and quite confusing too).
2025-11-18 22:27:03 +00:00
Catherine
325c283e05 Refactor redirect code. NFC 2025-11-18 22:21:51 +00:00
miyuko
cef3d785ec Add a Prometheus counter for s3:GetObject errors. 2025-11-17 12:33:00 +00:00
miyuko
fff345c695 Don't observe context cancellation errors. 2025-11-17 11:09:26 +00:00
miyuko
de17426f41 Observe blob fetch errors during GET requests. 2025-11-17 11:09:26 +00:00
David Leadbeater
3334af922f Allow external redirects for 3xx statuses
Fixes #60
2025-11-17 19:24:54 +11:00
Catherine
d88d97721a Observe whether manifest cache is bypassed. 2025-11-17 04:34:17 +00:00
Catherine
91dc7e0c54 Add original (decompressed) size to site manifest.
This size is not used by git-pages itself, and is not representative of
storage needs, but may be used for estimating how large a site would
be if downloaded in its entirety.
2025-11-16 19:27:04 +00:00
oppiliappan
779f705d5c Allow matching multiple subdomains in wildcards
Previously, this method would match only hosts of the form:

    user.host.com

This changeset allows matches on hosts of the form:

    user.org.host.com
    user.organization.com.host.com

This will potentially be the pattern that tangled.org uses for its hosted
instance of git-pages.

Signed-off-by: oppiliappan <me@oppi.li>
2025-11-16 05:56:15 +00:00
miyuko
2193fb86de Try to fix Sentry errors getting attached to wrong transactions. 2025-11-16 00:30:53 +00:00
Catherine
de40c8263a Set Update-Result for DELETE requests.
Done for uniformity and to make git-pages-cli implementation nicer.
2025-11-16 00:18:29 +00:00
Catherine
3e59fd2734 Rename X-Pages-Update header to Update-Result.
Same rationale as in 9d0a3ac6ad.
2025-11-15 23:46:20 +00:00
Catherine
9a431b8bbb Add /.git-pages/health endpoint. 2025-11-15 21:17:30 +00:00
Catherine
d604455e1f Ignore trailing . in hostnames.
This means that e.g. `https://site.tld.` will be treated the same as
`https://site.tld`. In DNS, the trailing empty label means "root domain"
and is usually ignored when present. There are some sites with links
that don't work otherwise.
2025-11-15 03:12:03 +00:00
Catherine
3431217a09 Don't respond with a completely blank 404 page.
We respond to all other errors with a simple, 1-line explanation that
you could see when using e.g. curl. The one case of "site is found and
the path is a normal path, but it doesn't exist and the 404 page does
not exist either" was unhandled by accident.
2025-11-15 01:42:55 +00:00
Catherine
b70a9ad4dd Allow only ssh, http, and https schemes for clone URLs. 2025-11-14 23:12:53 +00:00
David Leadbeater
19892ecfd1 Correctly read symlinks from zip files
This already worked for tar files, but symlinks in .zip files were
treated as regular files.
2025-11-14 12:51:15 +11:00
Catherine
ff8cf9928e Make compression always enabled.
This removes the `compress` feature.
2025-11-13 23:22:25 +00:00
Catherine
9d0a3ac6ad Use Branch: instead of X-Pages-Branch: to set custom branch name. 2025-11-12 17:05:11 +00:00
Catherine
ed77339144 Remove deprecated COOP/COEP assignment based on content type. 2025-11-11 17:56:02 +00:00
miyuko
cf5b98e3e5 Don't issue extraneous HEAD requests for S3 GetObject operations. 2025-11-11 17:33:24 +00:00
Catherine
02b5b7d2bb Ignore only the malformed _redirects/_headers rules.
Before this commit, upon encountering a malformed rule, the entire file
was ignored. This is both increasingly unviable for complex sites,
a likely source of self-DoS (or at least degradation of service),
and not the behavior Grebedoc has been promising for a few weeks.
2025-11-11 15:55:48 +00:00
Catherine
c90b453d44 Default to allowed-custom-headers = ["X-Clacks-Overhead"].
X-Clacks-Overhead: GNU Terry Pratchett
2025-11-11 15:38:11 +00:00
Catherine
26b29ec4be Add Netlify _headers support. 2025-11-11 15:36:14 +00:00
Catherine
f9e142dd51 Observe all storage errors reported by GetManifest.
Otherwise users may get jumpscares of "site not found" due to temporary
conditions (network errors to S3 backend included).
2025-11-11 06:10:01 +00:00
Catherine
c4b3671a53 Add [[wildcard]].index-repo-branch option (pages by default). 2025-11-05 23:00:32 +00:00
Catherine
9b19eeae82 Add missing [limits] keys to default configuration. 2025-11-05 22:58:12 +00:00
Catherine
47a658ac03 Avoid leaking http.Transport resources.
`http.Transport` objects cache connections and are meant to be long
lived rather than created on demand; creating them on demand leaks
sockets. Bug introduced in commit 3c07ebcc.
2025-11-05 09:48:36 +00:00
Catherine
3c07ebccbf Add [[wildcard]].fallback-insecure option to disable TLS verification.
This is intended for local deployments only.
2025-11-04 19:03:54 +00:00
Catherine
ba820e63e3 Work around slog issues handling %% in a format string. 2025-10-29 01:04:01 +00:00
Catherine
2db3de01c7 Fix a nil dereference on non-custom 404 pages. 2025-10-27 16:14:35 +00:00
Catherine
91cafac86a Apply Content-Type from the manifest to non-200 status pages. 2025-10-27 15:25:14 +00:00
Catherine
30668be4a0 If an https fallback URL is configured, try TLS for Caddy domain check.
This is added pretty much exclusively for Codeberg Pages v2 migration,
but the implementation is generic enough to be useful for other similar
setups (if anyone ever has to deal with one...)
2025-10-26 04:55:58 +00:00
Catherine
26b926293b Serve X-Content-Type-Options: nosniff.
Mozilla HTTP Observatory cares about this (5 points), and there isn't
really any reason not to send it at all times.
2025-10-24 09:28:49 +00:00
Catherine
68343a3dff Turns out a Web Worker is a type of frame (for COEP purposes). 2025-10-24 09:26:54 +00:00
miyuko
8f8521d697 Don't compress video or audio files. 2025-10-22 17:25:13 +01:00
miyuko
ffedc45a14 Don't send COEP/COOP headers for non-HTML resources. 2025-10-22 17:25:10 +01:00
miyuko
d6a7a72e09 Serve compressed content directly if client indicates support. 2025-10-22 16:59:35 +01:00
miyuko
aa965c5a08 Use s3:GetObject instead of s3:ListObjects for CheckDomain. 2025-10-22 13:45:15 +01:00
Catherine
34db13e603 Simplify observability code. NFC 2025-10-22 10:44:25 +00:00
Catherine
d1be93919f Make installable with go install. 2025-10-22 05:24:55 +00:00
miyuko
c39e57a857 Fetch manifests in parallel when handling GET requests. 2025-10-22 00:25:21 +01:00
miyuko
3863f0f134 Revert "Add a GetManifests function."
This reverts commit 0a111234f2.
2025-10-22 00:25:21 +01:00