Before this commit, a `_git-pages-repository.<host>` TXT record would
allow both forge DNS allowlist authorization, as well as normal DNS
allowlist authorization. This means that a site set up to have its
contents updated by a Forgejo Action could have its contents replaced
by the contents of the repository which contains the Forgejo Action,
which will effectively erase the site in most cases. This is a classic
confused deputy scenario.
To fix this, forge DNS allowlist authorization now uses a distinct
`_git-pages-forge-allowlist.<host>` TXT record, removing ambiguity
that allows this scenario to happen.
The issue was introduced in 27a6de792c
and existed in `main` for about a hour, so it is unlikely anybody
has been impacted by this.
The new authorization method combines DNS allowlist and existing forge
authorization methods: DNS records are used to determine the allowed
repository URL, and forge authorization is used to check for push
permissions to that URL.
This commit unifies most of the implementation of `AuthorizeDeletion`
and `AuthorizeUpdateFromArchive`, with the latter additionally checking
that the repository URL in the authorization grant follows the limits.
This is done in preparation of adding a second forge authorization
sub-mechanism that can handle non-wildcard domains.
Before:
- not authorized by forge (wildcard)
- cannot check repository permissions: GET https://codeberg.org/api/v1/repos/whitequark/whitequark.codeberg.page returned 401 Unauthorized
After:
- not authorized by forge (wildcard)
- no access to whitequark/whitequark.codeberg.page or invalid token
The actual Codeberg Pages v2 server uses the Forgejo default branch
for the index repository. The quirk previously used the `main` branch
unconditionally.
This is complex to implement, so per discussion with gusted we have
decided to change the default branch to `pages` so that it has parity
with non-Codeberg-specific behavior.
This is mainly done to speed up histogram collection, as waiting some
minutes defeats the purpose of having a quick overview function.
This commit does speed up GC tracing as well, but not as much because
audit records are still retrieved one at a time. A similar mechanism
could be added in the future there.
Filesystem logic is functionally identical since it was fine already.
This aborts the response to the client and doesn't log an error.
httputil.ReverseProxy commonly panics with this error.
This results in different behavior from simply swallowing the panic.
Panicking prevents flushing the response to the client, and in the case
of a panic from httputil.ReverseProxy it results in clients potentially
receiving an empty response instead of what was already written to
http.ResponseWriter. This behavior is the same as if the panic handler
hadn't been installed.
"Why the fuck would anybody want that", you could reasonably ask.
Well, most wouldn't want this. However, if you wanted to use git-pages
to deduplicate your backups, you might find it that some backups
include hardlinks.
"Why the fuck would anybody put their backups in git-pages", you could
even more reasonably ask. Well, almost nobody would! However, tarsnap
doesn't let you download deduplicated data (even though it deduplicates
data in storage), restic can't ingest tarballs, I didn't have
a partition I could format for btrfs, and git-pages performed much
better than alternatives like juicefs.
In the end this is correct and not expensive to do, just very niche.
The upstream added AGENTS.md and removed all unsafe code in the same
release. I've manually reviewed the entire v2.2.4..v2.3.0 diff and
found no issues except for one potential problem with
`go-toml/errors.subsliceOffset` that would only appear with a moving
GC. This seems like a strict improvement but we don't want any more
updates.