Commit Graph

312 Commits

Author SHA1 Message Date
Catherine
89c57cfadb Use git filters for incremental updates from a git repository.
This commit changes the git fetch algorithm to only retrieve blobs
that aren't included in the previously deployed site manifest, if
git filters are supported by the remote.

It also changes how manifest entry sizes are represented, such that
both decompressed and compressed sizes are stored. This enables
computing accurate (and repeatable) sizes even after incremental
updates.

Co-authored-by: David Leadbeater <dgl@dgl.cx>
2025-12-02 22:23:43 +00:00
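The selection step described in 89c57cfadb can be sketched as a set difference: only blobs referenced by the new tree and absent from the previously deployed manifest need fetching. This is a minimal sketch, not the git-pages implementation. `ManifestEntry`, its fields, and `blobsToFetch` are hypothetical names, and the git filter negotiation with the remote is not modelled.

```go
package main

import (
	"fmt"
	"sort"
)

// ManifestEntry is a hypothetical stand-in for a site manifest entry that
// stores both sizes, as the commit message describes.
type ManifestEntry struct {
	BlobHash         string
	CompressedSize   int64 // size of the blob as stored
	DecompressedSize int64 // size of the blob after decompression
}

// blobsToFetch returns the hashes in wanted that are absent from the
// previously deployed manifest, deduplicated and sorted for repeatability.
func blobsToFetch(previous []ManifestEntry, wanted []string) []string {
	have := make(map[string]bool, len(previous))
	for _, entry := range previous {
		have[entry.BlobHash] = true
	}
	seen := make(map[string]bool)
	var missing []string
	for _, hash := range wanted {
		if !have[hash] && !seen[hash] {
			seen[hash] = true
			missing = append(missing, hash)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	previous := []ManifestEntry{
		{BlobHash: "aaa", CompressedSize: 10, DecompressedSize: 40},
	}
	// "aaa" is already deployed, so only "bbb" and "ccc" remain.
	fmt.Println(blobsToFetch(previous, []string{"aaa", "bbb", "ccc", "bbb"}))
}
```

Keeping both sizes on each entry means totals can be recomputed from the manifest alone, even for blobs that were not fetched again.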
Catherine
af40848d9f Explicitly mention SHA-256 transition status. 2025-12-02 22:23:43 +00:00
Catherine
689030c28a Add a Prometheus metric for blob/request encoding pairs.
Forcing the server to repeatedly decompress a large blob is a potential
DoS vector, so having a metric for this is essential.
2025-12-01 11:04:50 +00:00
Catherine
30bde8c1c4 Rename blob transforms to match HTTP encoding names. 2025-12-01 11:04:50 +00:00
woodpecker-bot
e1a2143d22 fix(deps): update all dependencies 2025-11-29 00:39:09 +00:00
Catherine
0b82dcbc25 Replace s3GetObjectErrorsCount metric with *ResponseCount.
The former metric was misnamed: it only counted NoSuchKey errors.
It was also applied *after* the cache, meaning it was just a count
of every request that got a successful 404 from the S3 backend, and
it pooled blob and manifest requests together.

The new metric corresponds one-to-one to S3 requests and distinguishes
between different kinds of errors as well as different kinds of
requests. Example output:

    git_pages_s3_get_object_responses_count{code="NoSuchKey",kind="manifest"} 1
    git_pages_s3_get_object_responses_count{code="OK",kind="blob"} 1
    git_pages_s3_get_object_responses_count{code="OK",kind="manifest"} 1
2025-11-29 00:04:50 +00:00
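The shape of the new metric in 0b82dcbc25, one counter per (code, kind) pair, can be illustrated with a standard-library-only sketch. This is an assumption-laden illustration: git-pages presumably uses a Prometheus client library, while `responseCounter`, `observe`, and `render` here are hypothetical names, and the type is not safe for concurrent use.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// labels identifies one time series: the S3 response code and request kind.
type labels struct{ code, kind string }

// responseCounter is a hypothetical labeled counter (not concurrency-safe).
type responseCounter struct {
	name   string
	counts map[labels]int
}

func newResponseCounter(name string) *responseCounter {
	return &responseCounter{name: name, counts: make(map[labels]int)}
}

// observe records exactly one S3 response with the given code and kind.
func (c *responseCounter) observe(code, kind string) {
	c.counts[labels{code, kind}]++
}

// render prints the series sorted by code, then kind, one per line.
func (c *responseCounter) render() string {
	keys := make([]labels, 0, len(c.counts))
	for k := range c.counts {
		keys = append(keys, k)
	}
	sort.Slice(keys, func(i, j int) bool {
		if keys[i].code != keys[j].code {
			return keys[i].code < keys[j].code
		}
		return keys[i].kind < keys[j].kind
	})
	var b strings.Builder
	for _, k := range keys {
		fmt.Fprintf(&b, "%s{code=%q,kind=%q} %d\n", c.name, k.code, k.kind, c.counts[k])
	}
	return b.String()
}

func main() {
	c := newResponseCounter("git_pages_s3_get_object_responses_count")
	c.observe("OK", "manifest")
	c.observe("NoSuchKey", "manifest")
	c.observe("OK", "blob")
	fmt.Print(c.render())
}
```

Observing once per S3 response, before any caching layer, is what keeps the count in one-to-one correspondence with requests.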
Catherine
f9669e1c69 Update sentry-go.
Related to 4cca8abaf0.

They've fixed it in https://github.com/getsentry/sentry-go/issues/1142
2025-11-26 03:18:47 +00:00
Catherine
4cca8abaf0 Make Sentry telemetry buffer configurable.
Via `sentry-telemetry-buffer` feature.

I think this causes high CPU use on Grebedoc.
2025-11-23 03:04:25 +00:00
Catherine
d82ae69625 Simplify SIGINT handling code. NFC 2025-11-23 03:03:33 +00:00
Catherine
fa02595f8b Handle OPTIONS method. 2025-11-23 00:14:39 +00:00
Catherine
80d2a7a792 Rename license to satisfy https://pkg.go.dev 2025-11-22 23:32:18 +00:00
Catherine
988da5243e Fix nix flake. 2025-11-22 23:21:00 +00:00
miyuko
eda6d8b6f6 Update the go-slog-syslog dependency. 2025-11-22 14:43:38 +00:00
miyuko
fcc109c315 Add the ability to send logs to a syslog daemon. 2025-11-22 14:10:26 +00:00
woodpecker-bot
4d8f6d5e9d fix(deps): update module github.com/go-git/go-git/v6 to v6.0.0-20251121083746-39fcec474970 2025-11-22 09:35:57 +00:00
miyuko
cb7802df10 Pass the context to logging functions. 2025-11-22 07:05:07 +00:00
miyuko
b01e67f993 Exit gracefully (run deferred statements in main()) on SIGINT. 2025-11-21 23:34:33 +00:00
David Leadbeater
b5a1626a10 Fix content-type detection for small files
Previously a <512 byte file without an extension resulted in:

    internal server error: runtime error: slice bounds out of range [:512] with capacity 8
2025-11-21 05:55:50 +01:00
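The class of bug fixed in b5a1626a10 comes from slicing a buffer to a fixed 512 bytes when the file is shorter. A minimal sketch of the guard, assuming detection goes through `http.DetectContentType` (the commit does not say which function git-pages calls); `detectContentType` and `sniffLen` are hypothetical names:

```go
package main

import (
	"fmt"
	"net/http"
)

// sniffLen is the most data http.DetectContentType will consider.
const sniffLen = 512

// detectContentType clamps the sniffed prefix to the data actually present,
// so files shorter than 512 bytes never cause an out-of-range slice.
func detectContentType(data []byte) string {
	n := len(data)
	if n > sniffLen {
		n = sniffLen
	}
	return http.DetectContentType(data[:n])
}

func main() {
	// An 8-byte file without an extension, like the one in the report.
	fmt.Println(detectContentType([]byte("hi there")))
}
```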
Catherine
b1b8ae26e8 Restrict DNS Allowlist authorization to index site only.
Otherwise, an undesired degree of freedom permits a third party to
deny access to index site URLs by publishing projects with the same
name.

In the future, the _git-pages-repository TXT record format may be
extended to allow non-index sites to be specified without introducing
undesired degrees of freedom.
2025-11-21 03:49:38 +00:00
woodpecker-bot
eac02e5758 fix(deps): update all dependencies 2025-11-21 00:31:03 +00:00
Catherine
7e1185309b Fix a regression causing non-observance of ≠200 S3 manifest responses.
Introduced in commit dd168186.
2025-11-20 07:06:14 +00:00
David Leadbeater
351d0a0c85 Add a log level config option 2025-11-20 17:33:54 +11:00
Catherine
982c3321e0 Reword some log messages. NFC (tag: v0.1.0) 2025-11-20 04:11:57 +00:00
Catherine
eaf77565bc Improve configuration reload and clarify scope.
This commit also moves all of the globals into `main.go`.
2025-11-20 04:06:09 +00:00
Catherine
c93d3a0bb5 Reload configuration on SIGHUP (if supported by OS).
On Windows, there is no way to reload configuration at runtime.
2025-11-20 03:43:22 +00:00
Catherine
a924dd5116 Improve CLI usage text. 2025-11-20 03:15:03 +00:00
Catherine
f148792bcd Accept an output argument in -get-blob, -get-manifest, -get-archive. 2025-11-20 03:03:03 +00:00
Catherine
99904174e4 Bring documentation up to date. 2025-11-20 02:41:32 +00:00
Catherine
6db850e2c4 Allow downloading entire site via CLI or HTTP.
The HTTP endpoint is `/.git-pages/archive.tar` and it is gated behind
a feature flag `archive-site`. It serially downloads every blob and
writes it to the client in a chunked response, optionally compressed
with gzip or zstd as per `Accept-Encoding:`. It is authorized the same
as `/.git-pages/manifest.json`, for the same reasons.

The CLI operation is `-get-archive <site-name>` and it writes a tar
archive to stdout. This could be useful for an administrator to review
the contents of a site in response to a report.

Both `_headers` and `_redirects` files are present in the output,
reconstituted from the manifest.
2025-11-20 02:09:49 +00:00
Catherine
aa6e495505 Fix DIV/0 when compressing a site without contents.
I think this doesn't affect anything, but it prevents an embarrassing
message from appearing in the log:

    compress: saved NaN percent (0 B to 0 B)
2025-11-20 01:17:53 +00:00
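The NaN in aa6e495505 is what floating-point 0/0 produces in Go. A minimal sketch of the guard, assuming the percentage is computed from original and compressed byte counts; `savedPercent` and the exact log format are hypothetical:

```go
package main

import "fmt"

// savedPercent returns the percentage of bytes saved by compression.
// An empty site (original == 0) reports 0 instead of dividing by zero.
func savedPercent(original, compressed int64) float64 {
	if original == 0 {
		return 0
	}
	return float64(original-compressed) / float64(original) * 100
}

func main() {
	var original, compressed int64 // a site without contents
	fmt.Printf("compress: saved %.2f percent (%d B to %d B)\n",
		savedPercent(original, compressed), original, compressed)
}
```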
Catherine
0e342b11f6 Add Last-Modified: header to /.git-pages/ metadata responses. 2025-11-19 22:37:06 +00:00
Catherine
dd16818618 Refactor S3Backend.GetManifest. NFCI
This is both to reduce the number of loose variables in the code and
to make it closer to `S3Backend.GetBlob`.
2025-11-19 22:26:40 +00:00
Catherine
0b2db170b8 Allow updating wildcard domain sites from an archive with a forge token. 2025-11-19 04:10:02 +00:00
Catherine
457dd60aa0 Factor out authentication helpers. NFC 2025-11-19 02:53:48 +00:00
Catherine
95894bb403 Docker: clean Go cache after building executables.
This is an attempt to stop OOMing Codeberg's Forgejo Actions runners,
which count disk and RAM against the same quota.
2025-11-19 02:24:20 +00:00
Catherine
6196026312 CI: publish releases and handle tags. 2025-11-19 01:33:08 +00:00
Catherine
073435aa2b Redirect domain.tld/project to domain.tld/project/ when present.
This matches the behavior of GitHub. It also isn't particularly useful
(and is quite confusing) to serve a file from the index repo with the
same path segment as the project name.
2025-11-18 22:27:03 +00:00
Catherine
325c283e05 Refactor redirect code. NFC 2025-11-18 22:21:51 +00:00
Catherine
7773ebd0dc CI: switch package runner to medium. 2025-11-18 21:30:18 +00:00
miyuko
cef3d785ec Add a Prometheus counter for s3:GetObject errors. 2025-11-17 12:33:00 +00:00
miyuko
fff345c695 Don't observe context cancellation errors. 2025-11-17 11:09:26 +00:00
miyuko
de17426f41 Observe blob fetch errors during GET requests. 2025-11-17 11:09:26 +00:00
David Leadbeater
3334af922f Allow external redirects for 3xx statuses
Fixes #60
2025-11-17 19:24:54 +11:00
Catherine
5a09d30d3d Renovate: disable automerge so it'd stop breaking the flake. 2025-11-17 04:34:26 +00:00
Catherine
d88d97721a Observe whether manifest cache is bypassed. 2025-11-17 04:34:17 +00:00
Catherine
91dc7e0c54 Add original (decompressed) size to site manifest.
This size is not used by git-pages itself, and is not representative of
storage needs, but may be used for estimating how large a site would
be if downloaded in its entirety.
2025-11-16 19:27:04 +00:00
Catherine
770ff5c416 Remove unused go.mod entries. 2025-11-16 19:22:20 +00:00
oppiliappan
779f705d5c Allow matching multiple subdomains in wildcards
Previously, this method would match only hosts of the form:

    user.host.com

This changeset allows matches on hosts of the form:

    user.org.host.com
    user.organization.com.host.com

This will potentially be the pattern that tangled.org uses for its hosted
instance of git-pages.

Signed-off-by: oppiliappan <me@oppi.li>
2025-11-16 05:56:15 +00:00
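The matching change in 779f705d5c can be sketched as a suffix check that accepts any number of labels in front of the base domain. `matchWildcard` is a hypothetical name, and how git-pages maps the matched labels to a repository is not shown in the commit message, so that step is omitted.

```go
package main

import (
	"fmt"
	"strings"
)

// matchWildcard reports whether host falls under *.base and returns the
// labels that precede base. One or more labels are accepted.
func matchWildcard(host, base string) (prefix string, ok bool) {
	host = strings.ToLower(strings.TrimSuffix(host, "."))
	suffix := "." + strings.ToLower(base)
	if !strings.HasSuffix(host, suffix) {
		return "", false
	}
	prefix = strings.TrimSuffix(host, suffix)
	if prefix == "" {
		return "", false
	}
	return prefix, true
}

func main() {
	for _, host := range []string{
		"user.host.com",
		"user.org.host.com",
		"user.organization.com.host.com",
		"host.com",
	} {
		prefix, ok := matchWildcard(host, "host.com")
		fmt.Printf("%-32s ok=%-5v prefix=%q\n", host, ok, prefix)
	}
}
```

Requiring the leading dot in the suffix keeps look-alike hosts such as `evilhost.com` from matching `host.com`.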
Catherine
5da56a1b94 Link to git-pages-cli in README. 2025-11-16 02:06:19 +00:00
miyuko
2193fb86de Try to fix Sentry errors getting attached to wrong transactions. 2025-11-16 00:30:53 +00:00