This cleans up resources that would otherwise be tied up by Caddy
endpoint requests where the originating TLS connection to Caddy has
gone away.
V12-Ref: F-77195
Use PUT to upload the following tar file (`base64 -d | unzstd`):
KLUv/QRY7QIAcoQOFLCnDQ0QaaURkYASyN1LJveuZAKkXivfoQMXZ5MhIGJAXHUWHclJufKB
PLvNDSbmD81Htf9W1f/3BgsA/QPwwAuojAHiDA8mpAEqhsJB8IUcTATEusLVn0AbU7ZnkA==
After this commit it should no longer crash the handler.
V12-Ref: F-77219
To reproduce, use PUT to upload this archive (`base64 -d | unzstd`):
KLUv/QRY7QIAxAJhL2IAMDAwMDY0NDAwMDAwMDEAADAwNzU2MAAgMAB1c3RhcgAwAGEAMzM3
YREA/UEF/EC9Y0AdDJBP8GDCTaDGBxATkAAd3gJoMPAbJANAciACGDTAsXKZngAR/m3nXA==
then issue any PATCH request to that site.
After this commit, the server returns "malformed manifest (not
a directory)" instead of "assignment to entry in nil map".
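For illustration only, the panic and the kind of guard that replaces it
can be sketched as follows. The `entry` type and `addChild` helper are
hypothetical and do not mirror the actual manifest code; the sketch only
assumes that non-directory entries carry a nil map of children.

```go
package main

import (
	"errors"
	"fmt"
)

// entry is a hypothetical stand-in for a manifest entry; only directories
// carry a non-nil children map.
type entry struct {
	children map[string]*entry
}

var errNotDirectory = errors.New("malformed manifest (not a directory)")

// addChild attaches child under parent. Writing into a nil map panics in
// Go with "assignment to entry in nil map", so non-directory parents are
// rejected with an error before the write happens.
func addChild(parent *entry, name string, child *entry) error {
	if parent.children == nil {
		return errNotDirectory
	}
	parent.children[name] = child
	return nil
}

func main() {
	dir := &entry{children: map[string]*entry{}}
	file := &entry{} // a regular file: children is nil

	fmt.Println(addChild(dir, "a", file))
	fmt.Println(addChild(file, "b", &entry{}))
}
```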
While ideally incoming manifests should be checked for consistency
regardless of how they're uploaded, in practice this is only a self-DoS
so it's probably not worth fixing.
V12-Ref: F-77244
Only the pull request number was compared; the pull request owner and
repository name were not. As a result, any preview site with a matching
PR number could be overwritten.
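A minimal sketch of the corrected comparison, assuming a hypothetical
`previewKey` type (the real data structures are not shown in this
message):

```go
package main

import "fmt"

// previewKey is a hypothetical identity for a pull request preview site.
type previewKey struct {
	Owner  string
	Repo   string
	Number int
}

// samePullRequest reports whether two keys name the same pull request.
// Comparing Number alone would let a PR from an unrelated repository match.
func samePullRequest(a, b previewKey) bool {
	return a.Number == b.Number && a.Owner == b.Owner && a.Repo == b.Repo
}

func main() {
	site := previewKey{Owner: "alice", Repo: "site", Number: 7}
	other := previewKey{Owner: "mallory", Repo: "elsewhere", Number: 7}

	fmt.Println(samePullRequest(site, site))
	fmt.Println(samePullRequest(site, other))
}
```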
This functionality is feature-gated and there are no known usable
deployments at the moment.
V12-Ref: F-77256
The old function did not even draw a histogram (it was a bar chart),
and would essentially always overcount sizes.
The new function is always accurate and just as useful at a glance.
It provides two modes, `text` (optionally colorized) and `json`.
This helps avoid incorrect behavior on typos and notifies end users
that a feature has been stabilized and removed. It also helps us avoid
reusing feature names by accident.
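One way such a check could look, as a sketch only: the feature names,
the `knownFeatures` table, and `checkFeature` below are made up for
illustration and are not the actual implementation.

```go
package main

import "fmt"

type featureState int

const (
	featureActive     featureState = iota // still behind a gate
	featureStabilized                     // always on now; the gate was removed
)

// knownFeatures uses made-up names. Stabilized names stay listed so that
// they can be reported to users and are never reused for another feature.
var knownFeatures = map[string]featureState{
	"example-gated":  featureActive,
	"example-stable": featureStabilized,
}

// checkFeature returns a notice for stabilized features and an error for
// names that were never registered (most likely typos).
func checkFeature(name string) (notice string, err error) {
	state, ok := knownFeatures[name]
	switch {
	case !ok:
		return "", fmt.Errorf("unknown feature %q", name)
	case state == featureStabilized:
		return fmt.Sprintf("feature %q has been stabilized and removed", name), nil
	default:
		return "", nil
	}
}

func main() {
	for _, name := range []string{"example-gated", "example-stable", "exmaple-gated"} {
		notice, err := checkFeature(name)
		fmt.Printf("%s: notice=%q err=%v\n", name, notice, err)
	}
}
```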
This commit includes no behavioral changes, only cosmetic ones:
* Renames the concept to "existence cache".
* Makes log messages more concise.
* Adds written rationale for the module.
* Renames feature to `existence-cache`.
In commit bbdaae7280, a domain cache was introduced to deal with
misbehaving crawlers that forge the `Host:` header and may cause
thousands of expensive S3 requests to be submitted.
This domain cache is implemented using a Bloom filter (which can
produce false positives but not false negatives) for the S3 backend,
and using a function that always returns true (which will be a false
positive in most cases) for the FS backend.
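To make the false positive behavior concrete, here is a generic Bloom
filter sketch. It is not the filter used by the domain cache; the FNV
hashing, the double-hashing probe scheme, and the sizes are assumptions
chosen for brevity.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bloom is a fixed-size Bloom filter with k probes per key.
type bloom struct {
	bits []uint64
	m    uint64 // number of bits
	k    uint64 // number of probes per key
}

func newBloom(m, k uint64) *bloom {
	return &bloom{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// probes derives two hash values from one FNV-1a sum; probe i uses
// h1 + i*h2 (double hashing). h2 is forced odd so it is never zero.
func probes(key string) (h1, h2 uint64) {
	h := fnv.New64a()
	h.Write([]byte(key))
	sum := h.Sum64()
	return sum & 0xffffffff, sum>>32 | 1
}

// Add sets every bit that key maps to.
func (b *bloom) Add(key string) {
	h1, h2 := probes(key)
	for i := uint64(0); i < b.k; i++ {
		idx := (h1 + i*h2) % b.m
		b.bits[idx/64] |= uint64(1) << (idx % 64)
	}
}

// MayContain returns false only if key was definitely never added. A true
// result may be a false positive caused by other keys setting the same bits.
func (b *bloom) MayContain(key string) bool {
	h1, h2 := probes(key)
	for i := uint64(0); i < b.k; i++ {
		idx := (h1 + i*h2) % b.m
		if b.bits[idx/64]&(uint64(1)<<(idx%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	seen := newBloom(1024, 3)
	seen.Add("example.org")
	fmt.Println(seen.MayContain("example.org")) // always true once added
	fmt.Println(seen.MayContain("forged.example"))
}
```

Because a "maybe present" answer can be wrong, a filter like this is
only suitable for skipping work, not for asserting that a domain is
actually hosted.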
Both of these behaviors are unacceptable for the Caddy endpoint, but
the FS backend case much more so. If you use git-pages with Caddy you
should upgrade to a build that includes this commit as soon as possible
or Let's Encrypt may rate-limit or restrict your account when you get
unlucky with a crawler.
I thought I was being smart by using a trie to record blob existence
and sizes. I was not. The trie approach had roughly 5 times lower
throughput or worse, and consumed entirely unreasonable amounts of RAM.
A hashmap works just fine here.
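The hashmap approach amounts to something like the following sketch; the
`blobID` key type and the method names are hypothetical.

```go
package main

import "fmt"

// blobID is a hypothetical fixed-size content hash used as the map key.
type blobID [32]byte

// blobIndex records which blobs exist and how many bytes each occupies.
type blobIndex map[blobID]int64

// Record notes a blob's size and reports whether it was seen for the
// first time.
func (idx blobIndex) Record(id blobID, size int64) bool {
	_, seen := idx[id]
	idx[id] = size
	return !seen
}

// TotalSize sums the sizes of all distinct blobs.
func (idx blobIndex) TotalSize() int64 {
	var total int64
	for _, size := range idx {
		total += size
	}
	return total
}

func main() {
	idx := blobIndex{}
	a, b := blobID{1}, blobID{2}
	idx.Record(a, 100)
	idx.Record(b, 50)
	idx.Record(a, 100) // duplicate: counted once
	fmt.Println(len(idx), idx.TotalSize())
}
```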
This is particularly important with the FS backend, where there isn't
necessarily native tooling capable of handling this task correctly
(since not every filesystem supports file "birth times", and since
restoring data from a backup will reset the "birth time" of audit
records to the moment of restoration).
Before this commit, a `_git-pages-repository.<host>` TXT record would
allow both forge DNS allowlist authorization and normal DNS allowlist
authorization. This meant that a site set up to have its
contents updated by a Forgejo Action could have its contents replaced
by the contents of the repository which contains the Forgejo Action,
which will effectively erase the site in most cases. This is a classic
confused deputy scenario.
To fix this, forge DNS allowlist authorization now uses a distinct
`_git-pages-forge-allowlist.<host>` TXT record, removing the ambiguity
that allowed this scenario to happen.
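The separation can be sketched as below. Only the two record name
prefixes come from this message; the `allowedRepos` helper, the
in-memory `fakeZone` resolver, and the assumption that record values are
repository URLs are illustrative.

```go
package main

import "fmt"

// Record name prefixes as given in the commit message.
const (
	dnsAllowlistPrefix   = "_git-pages-repository."
	forgeAllowlistPrefix = "_git-pages-forge-allowlist."
)

// lookupTXT has the same signature as net.LookupTXT, so a real resolver
// can be substituted for the in-memory one used here.
type lookupTXT func(name string) ([]string, error)

// allowedRepos returns the values published for host under the record
// name that belongs to a single authorization method.
func allowedRepos(lookup lookupTXT, prefix, host string) ([]string, error) {
	return lookup(prefix + host)
}

// fakeZone is an in-memory stand-in for DNS.
type fakeZone map[string][]string

func (z fakeZone) lookup(name string) ([]string, error) {
	return z[name], nil
}

func main() {
	zone := fakeZone{
		"_git-pages-forge-allowlist.example.org": {"https://forge.example/owner/site"},
	}
	forge, _ := allowedRepos(zone.lookup, forgeAllowlistPrefix, "example.org")
	plain, _ := allowedRepos(zone.lookup, dnsAllowlistPrefix, "example.org")

	// A record published for one method is invisible to the other.
	fmt.Println(len(forge), len(plain))
}
```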
The issue was introduced in 27a6de792c
and existed in `main` for about an hour, so it is unlikely anybody
has been impacted by this.
The new authorization method combines DNS allowlist and existing forge
authorization methods: DNS records are used to determine the allowed
repository URL, and forge authorization is used to check for push
permissions to that URL.
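The two-step check can be sketched as follows. The `authorize` function,
the `pushChecker` hook, and the error values are hypothetical; in
particular, how the forge is queried for push permissions is left to the
injected callback.

```go
package main

import (
	"errors"
	"fmt"
)

// pushChecker is a hypothetical hook that asks the forge whether the
// token grants push access to repoURL.
type pushChecker func(repoURL, token string) (bool, error)

var (
	errNotAllowlisted = errors.New("repository not in DNS allowlist")
	errNoPushAccess   = errors.New("no push access to repository")
)

// authorize applies both checks in order: DNS decides which repository
// may publish to the site, then the forge decides whether the caller may
// push to that repository.
func authorize(allowlist []string, repoURL, token string, canPush pushChecker) error {
	allowed := false
	for _, url := range allowlist {
		if url == repoURL {
			allowed = true
			break
		}
	}
	if !allowed {
		return errNotAllowlisted
	}
	ok, err := canPush(repoURL, token)
	if err != nil {
		return err
	}
	if !ok {
		return errNoPushAccess
	}
	return nil
}

func main() {
	allowlist := []string{"https://forge.example/owner/site"}
	canPush := func(repoURL, token string) (bool, error) { return token == "good", nil }

	fmt.Println(authorize(allowlist, "https://forge.example/owner/site", "good", canPush))
	fmt.Println(authorize(allowlist, "https://forge.example/other/site", "good", canPush))
	fmt.Println(authorize(allowlist, "https://forge.example/owner/site", "bad", canPush))
}
```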
This commit unifies most of the implementation of `AuthorizeDeletion`
and `AuthorizeUpdateFromArchive`, with the latter additionally checking
that the repository URL in the authorization grant follows the limits.
This is done in preparation for adding a second forge authorization
sub-mechanism that can handle non-wildcard domains.
Before:
- not authorized by forge (wildcard)
- cannot check repository permissions: GET https://codeberg.org/api/v1/repos/whitequark/whitequark.codeberg.page returned 401 Unauthorized
After:
- not authorized by forge (wildcard)
- no access to whitequark/whitequark.codeberg.page or invalid token
The actual Codeberg Pages v2 server uses the Forgejo default branch
for the index repository. The quirk previously used the `main` branch
unconditionally.
Following the Forgejo default branch is complex to implement, so per
discussion with gusted we have decided to change the default branch to
`pages` so that it has parity with non-Codeberg-specific behavior.