Commit Graph

221 Commits

Author SHA1 Message Date
Catherine f148792bcd Accept an output argument in -get-blob, -get-manifest, -get-archive. 2025-11-20 03:03:03 +00:00
Catherine 6db850e2c4 Allow downloading entire site via CLI or HTTP.
The HTTP endpoint is `/.git-pages/archive.tar` and it is gated behind
a feature flag `archive-site`. It serially downloads every blob and
writes it to the client in a chunked response, optionally compressed
with gzip or zstd as per `Accept-Encoding:`. It is authorized the same
as `/.git-pages/manifest.json`, for the same reasons.

The CLI operation is `-get-archive <site-name>` and it writes a tar
archive to stdout. This could be useful for an administrator to review
the contents of a site in response to a report.

Both `_headers` and `_redirects` files are present in the output,
reconstituted from the manifest.
2025-11-20 02:09:49 +00:00
Catherine aa6e495505 Fix DIV/0 when compressing a site without contents.
I think this doesn't affect anything, but it prevents an embarrassing
message from appearing in the log:

    compress: saved NaN percent (0 B to 0 B)
2025-11-20 01:17:53 +00:00
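For illustration, a minimal Go sketch of how such a guard might look. `savedPercent` is a hypothetical name, not necessarily the function used in git-pages. In Go, dividing a floating-point zero by zero at run time yields NaN rather than panicking, which is how a message like the one above can end up in a log.

```go
package main

import "fmt"

// savedPercent reports how much smaller compressed is than original, as a
// percentage. It returns 0 for an empty input instead of computing 0/0.
func savedPercent(original, compressed int64) float64 {
	if original == 0 {
		return 0
	}
	return float64(original-compressed) / float64(original) * 100
}

func main() {
	fmt.Printf("compress: saved %.1f percent\n", savedPercent(0, 0))
	fmt.Printf("compress: saved %.1f percent\n", savedPercent(1000, 250))
}
```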
Catherine 0e342b11f6 Add Last-Modified: header to /.git-pages/ metadata responses. 2025-11-19 22:37:06 +00:00
Catherine dd16818618 Refactor S3Backend.GetManifest. NFCI
This both reduces the number of loose variables in the code and
brings it closer to `S3Backend.GetBlob`.
2025-11-19 22:26:40 +00:00
Catherine 0b2db170b8 Allow updating wildcard domain sites from an archive with a forge token. 2025-11-19 04:10:02 +00:00
Catherine 457dd60aa0 Factor out authentication helpers. NFC 2025-11-19 02:53:48 +00:00
Catherine 073435aa2b Redirect domain.tld/project to domain.tld/project/ when present.
This matches the behavior of GitHub. It also isn't particularly useful
to serve a file from the index repo with the same path segment as the
project name (and it is quite confusing too).
2025-11-18 22:27:03 +00:00
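The rule described in this commit could be sketched in Go roughly as follows. `projectRedirect` and the `projects` set are hypothetical; how git-pages actually determines that a project site is present is not shown in this log.

```go
package main

import (
	"fmt"
	"strings"
)

// projectRedirect is a hypothetical helper. Given a request path and the set
// of project names known for a domain, it returns the location to redirect
// to and whether a redirect applies.
func projectRedirect(path string, projects map[string]bool) (string, bool) {
	name := strings.TrimPrefix(path, "/")
	// Only a single path segment without a trailing slash qualifies.
	if name == "" || strings.Contains(name, "/") {
		return "", false
	}
	if !projects[name] {
		return "", false
	}
	return "/" + name + "/", true
}

func main() {
	projects := map[string]bool{"project": true}
	fmt.Println(projectRedirect("/project", projects))
	fmt.Println(projectRedirect("/project/", projects))
	fmt.Println(projectRedirect("/about.html", projects))
}
```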
Catherine 325c283e05 Refactor redirect code. NFC 2025-11-18 22:21:51 +00:00
miyuko cef3d785ec Add a Prometheus counter for s3:GetObject errors. 2025-11-17 12:33:00 +00:00
miyuko fff345c695 Don't observe context cancellation errors. 2025-11-17 11:09:26 +00:00
miyuko de17426f41 Observe blob fetch errors during GET requests. 2025-11-17 11:09:26 +00:00
David Leadbeater 3334af922f Allow external redirects for 3xx statuses
Fixes #60
2025-11-17 19:24:54 +11:00
Catherine d88d97721a Observe whether manifest cache is bypassed. 2025-11-17 04:34:17 +00:00
Catherine 91dc7e0c54 Add original (decompressed) size to site manifest.
This size is not used by git-pages itself, and is not representative of
storage needs, but may be used for estimating how large a site would
be if downloaded in its entirety.
2025-11-16 19:27:04 +00:00
oppiliappan 779f705d5c Allow matching multiple subdomains in wildcards
Previously, this method would match only hosts of the form:

    user.host.com

This changeset allows matches on hosts of the form:

    user.org.host.com
    user.organization.com.host.com

This will potentially be the pattern that tangled.org uses for its hosted
instance of git-pages.

Signed-off-by: oppiliappan <me@oppi.li>
2025-11-16 05:56:15 +00:00
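One way to picture the relaxed matching is the Go sketch below. `matchWildcard` is a hypothetical function, and the base domain is passed in directly; the real `[[wildcard]]` configuration and matching code may look quite different.

```go
package main

import (
	"fmt"
	"strings"
)

// matchWildcard is a hypothetical helper. It reports whether host lies under
// domain and, if so, returns the labels that precede it. Any number of
// labels (one or more) is accepted.
func matchWildcard(host, domain string) ([]string, bool) {
	prefix, ok := strings.CutSuffix(host, "."+domain)
	if !ok || prefix == "" {
		return nil, false
	}
	labels := strings.Split(prefix, ".")
	for _, label := range labels {
		if label == "" {
			return nil, false // reject empty labels such as "a..host.com"
		}
	}
	return labels, true
}

func main() {
	fmt.Println(matchWildcard("user.host.com", "host.com"))
	fmt.Println(matchWildcard("user.org.host.com", "host.com"))
	fmt.Println(matchWildcard("host.com", "host.com"))
}
```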
miyuko 2193fb86de Try to fix Sentry errors getting attached to wrong transactions. 2025-11-16 00:30:53 +00:00
Catherine de40c8263a Set Update-Result for DELETE requests.
Done for uniformity and to make git-pages-cli implementation nicer.
2025-11-16 00:18:29 +00:00
Catherine 3e59fd2734 Rename X-Pages-Update header to Update-Result.
Same rationale as in 9d0a3ac6ad.
2025-11-15 23:46:20 +00:00
Catherine 9a431b8bbb Add /.git-pages/health endpoint. 2025-11-15 21:17:30 +00:00
Catherine d604455e1f Ignore trailing . in hostnames.
This means that e.g. `https://site.tld.` will be treated the same as
`https://site.tld`. In DNS, the trailing empty label means "root domain"
and is usually ignored when present. There are some sites with links
that don't work otherwise.
2025-11-15 03:12:03 +00:00
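A minimal Go sketch of this normalization, assuming it happens once before the hostname is used for site lookup. `normalizeHost` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeHost is a hypothetical helper that drops a single trailing dot
// (the empty root label) so that "site.tld." and "site.tld" compare equal.
func normalizeHost(host string) string {
	return strings.TrimSuffix(host, ".")
}

func main() {
	fmt.Println(normalizeHost("site.tld."))
	fmt.Println(normalizeHost("site.tld"))
}
```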
Catherine 3431217a09 Don't respond with a completely blank 404 page.
We respond to all other errors with a simple, 1-line explanation that
you could see when using e.g. curl. The one case of "site is found and
the path is a normal path, but it doesn't exist and the 404 page does
not exist either" was unhandled by accident.
2025-11-15 01:42:55 +00:00
Catherine b70a9ad4dd Allow only ssh, http, and https schemes for clone URLs. 2025-11-14 23:12:53 +00:00
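A scheme allowlist of this kind might look like the Go sketch below. `checkCloneURL` is a hypothetical function built on the standard `net/url` package; the actual validation in git-pages may differ, for example in how it treats URLs that fail to parse.

```go
package main

import (
	"fmt"
	"net/url"
)

// checkCloneURL is a hypothetical helper that accepts only clone URLs whose
// scheme is ssh, http, or https.
func checkCloneURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("malformed clone URL: %w", err)
	}
	switch u.Scheme {
	case "ssh", "http", "https":
		return nil
	default:
		return fmt.Errorf("clone URL scheme %q is not allowed", u.Scheme)
	}
}

func main() {
	fmt.Println(checkCloneURL("https://example.org/user/repo.git"))
	fmt.Println(checkCloneURL("file:///etc/passwd"))
}
```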
David Leadbeater 19892ecfd1 Correctly read symlinks from zip files
This already worked for tar files, but symlinks in .zip files were
treated as regular files.
2025-11-14 12:51:15 +11:00
Catherine ff8cf9928e Make compression always enabled.
This removes the `compress` feature.
2025-11-13 23:22:25 +00:00
Catherine 9d0a3ac6ad Use Branch: instead of X-Pages-Branch: to set custom branch name. 2025-11-12 17:05:11 +00:00
Catherine ed77339144 Remove deprecated COOP/COEP assignment based on content type. 2025-11-11 17:56:02 +00:00
miyuko cf5b98e3e5 Don't issue extraneous HEAD requests for S3 GetObject operations. 2025-11-11 17:33:24 +00:00
Catherine 02b5b7d2bb Ignore only the malformed _redirects/_headers rules.
Before this commit, upon encountering a malformed rule, the entire file
was ignored. This is increasingly unviable for complex sites,
a likely source of self-DoS (or at least degradation of service),
and not the behavior Grebedoc has been promising for a few weeks.
2025-11-11 15:55:48 +00:00
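The per-rule behavior can be illustrated with a deliberately simplified Go parser. Everything here is an assumption made for the sketch: the `redirectRule` type, the `parseRedirects` function, the `from to [status]` grammar, and the default status of 301. The real `_redirects` syntax is richer than this.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// redirectRule is a hypothetical, simplified representation of one rule.
type redirectRule struct {
	From, To string
	Status   int
}

// parseRedirects keeps every well-formed rule and records a problem for each
// malformed line instead of discarding the whole file.
func parseRedirects(text string) (rules []redirectRule, problems []string) {
	for i, line := range strings.Split(text, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue // blank lines and comments are not rules
		}
		fields := strings.Fields(line)
		if len(fields) < 2 || len(fields) > 3 {
			problems = append(problems, fmt.Sprintf("line %d: expected 2 or 3 fields", i+1))
			continue
		}
		status := 301 // assumed default for this sketch
		if len(fields) == 3 {
			n, err := strconv.Atoi(fields[2])
			if err != nil {
				problems = append(problems, fmt.Sprintf("line %d: bad status %q", i+1, fields[2]))
				continue
			}
			status = n
		}
		rules = append(rules, redirectRule{From: fields[0], To: fields[1], Status: status})
	}
	return rules, problems
}

func main() {
	rules, problems := parseRedirects("/a /b 302\nbroken\n/c /d\n")
	fmt.Println(len(rules), len(problems))
}
```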
Catherine c90b453d44 Default to allowed-custom-headers = ["X-Clacks-Overhead"].
X-Clacks-Overhead: GNU Terry Pratchett
2025-11-11 15:38:11 +00:00
Catherine 26b29ec4be Add Netlify _headers support. 2025-11-11 15:36:14 +00:00
Catherine f9e142dd51 Observe all storage errors reported by GetManifest.
Otherwise users may get jumpscares of "site not found" due to temporary
conditions (network errors to S3 backend included).
2025-11-11 06:10:01 +00:00
Catherine c4b3671a53 Add [[wildcard]].index-repo-branch option (pages by default). 2025-11-05 23:00:32 +00:00
Catherine 9b19eeae82 Add missing [limits] keys to default configuration. 2025-11-05 22:58:12 +00:00
Catherine 47a658ac03 Avoid leaking http.Transport resources.
`http.Transport` objects cache connections and are meant to be
long-lived rather than created on demand; creating them on demand leaks
sockets. Bug introduced in commit 3c07ebcc.
2025-11-05 09:48:36 +00:00
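A common way to avoid this class of leak in Go is to build each transport once and share it across requests, sketched below. The variable and function names are hypothetical, and the insecure variant is only an assumption about how an option that disables TLS verification might be wired up.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
)

// Both transports are created once at program start and reused, so their
// connection pools are shared instead of being abandoned per request.
var (
	secureTransport   = http.DefaultTransport.(*http.Transport).Clone()
	insecureTransport = newInsecureTransport()
)

// newInsecureTransport returns a transport that skips TLS verification.
// This is only appropriate for local deployments.
func newInsecureTransport() *http.Transport {
	t := http.DefaultTransport.(*http.Transport).Clone()
	t.TLSClientConfig = &tls.Config{InsecureSkipVerify: true}
	return t
}

// clientFor is a hypothetical helper that picks one of the shared transports.
// http.Client values are cheap; the transport is what holds the sockets.
func clientFor(insecure bool) *http.Client {
	if insecure {
		return &http.Client{Transport: insecureTransport}
	}
	return &http.Client{Transport: secureTransport}
}

func main() {
	// Two clients built on demand still share one underlying transport.
	fmt.Println(clientFor(true).Transport == clientFor(true).Transport)
}
```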
Catherine 3c07ebccbf Add [[wildcard]].fallback-insecure option to disable TLS verification.
This is intended for local deployments only.
2025-11-04 19:03:54 +00:00
Catherine ba820e63e3 Work around slog issues handling %% in a format string. 2025-10-29 01:04:01 +00:00
Catherine 2db3de01c7 Fix a nil dereference on non-custom 404 pages. 2025-10-27 16:14:35 +00:00
Catherine 91cafac86a Apply Content-Type from the manifest to non-200 status pages. 2025-10-27 15:25:14 +00:00
Catherine 30668be4a0 If an https fallback URL is configured, try TLS for Caddy domain check.
This is added pretty much exclusively for Codeberg Pages v2 migration,
but the implementation is generic enough to be useful for other similar
setups (if anyone ever has to deal with one...)
2025-10-26 04:55:58 +00:00
Catherine 26b926293b Serve X-Content-Type-Options: nosniff.
Mozilla HTTP Observatory cares about this (5 points), and there isn't
really any reason not to send it at all times.
2025-10-24 09:28:49 +00:00
Catherine 68343a3dff Turns out a Web Worker is a type of frame (for COEP purposes). 2025-10-24 09:26:54 +00:00
miyuko 8f8521d697 Don't compress video or audio files. 2025-10-22 17:25:13 +01:00
miyuko ffedc45a14 Don't send COEP/COOP headers for non-HTML resources. 2025-10-22 17:25:10 +01:00
miyuko d6a7a72e09 Serve compressed content directly if client indicates support. 2025-10-22 16:59:35 +01:00
miyuko aa965c5a08 Use s3:GetObject instead of s3:ListObjects for CheckDomain. 2025-10-22 13:45:15 +01:00
Catherine 34db13e603 Simplify observability code. NFC 2025-10-22 10:44:25 +00:00
Catherine d1be93919f Make installable with go install. 2025-10-22 05:24:55 +00:00
miyuko c39e57a857 Fetch manifests in parallel when handling GET requests. 2025-10-22 00:25:21 +01:00
miyuko 3863f0f134 Revert "Add a GetManifests function."
This reverts commit 0a111234f2.
2025-10-22 00:25:21 +01:00