"Why the fuck would anybody want that", you could reasonably ask.
Well, most wouldn't want this. However, if you wanted to use git-pages
to deduplicate your backups, you might find that some backups
include hardlinks.
"Why the fuck would anybody put their backups in git-pages", you could
even more reasonably ask. Well, almost nobody would! However, tarsnap
doesn't let you download deduplicated data (even though it deduplicates
data in storage), restic can't ingest tarballs, I didn't have
a partition I could format for btrfs, and git-pages performed much
better than alternatives like juicefs.
In the end this is correct and not expensive to do, just very niche.
The git-pages webhook security model depends on there being
a 1:1 mapping between site URLs and repositories; being able to
specify multiple of them breaks this model, as anyone could switch
the published site from one to the other if both repositories exist.
Without this, if a cache first sees a compressed version of the response,
it will return that for potentially any future requests, even if they
don't request compression.
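
A minimal sketch of the idea, assuming the missing piece is a `Vary: Accept-Encoding` response header (the message above does not name it). `acceptsGzip` and `negotiatedHeaders` are hypothetical helpers, not git-pages functions, and q-values are ignored for brevity.

```go
package main

import (
	"net/http"
	"strings"
)

// acceptsGzip reports whether an Accept-Encoding value lists gzip.
// Simplified: q-values (including q=0) are ignored here.
func acceptsGzip(acceptEncoding string) bool {
	for _, part := range strings.Split(acceptEncoding, ",") {
		name, _, _ := strings.Cut(strings.TrimSpace(part), ";")
		if strings.EqualFold(strings.TrimSpace(name), "gzip") {
			return true
		}
	}
	return false
}

// negotiatedHeaders builds the response headers for a request that carried
// the given Accept-Encoding value.
func negotiatedHeaders(acceptEncoding string) http.Header {
	h := http.Header{}
	// Set Vary on every response, compressed or not, so a shared cache
	// keys its entries on the request's Accept-Encoding header.
	h.Add("Vary", "Accept-Encoding")
	if acceptsGzip(acceptEncoding) {
		h.Set("Content-Encoding", "gzip")
	}
	return h
}
```

The point of the sketch is that `Vary` is set unconditionally: a cache has to learn that the response depends on `Accept-Encoding` even when it first stores the uncompressed variant.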
When extracting from an archive it is possible the leading directories
are not part of the archive. Add them to the manifest as otherwise the
behaviour of "index.html" varies depending on how the archive was created.
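
As a rough illustration of filling in the missing leading directories, the sketch below works on entry names only. `withLeadingDirs` is a hypothetical helper; the real manifest presumably also records entry types and contents, which are left out here.

```go
package main

import (
	"path"
	"sort"
)

// withLeadingDirs returns the given archive entry names plus every parent
// directory of each entry, deduplicated and sorted.
func withLeadingDirs(names []string) []string {
	seen := map[string]bool{}
	for _, name := range names {
		name = path.Clean(name)
		// Walk upwards until the root of the archive is reached.
		for name != "." && name != "/" {
			seen[name] = true
			name = path.Dir(name)
		}
	}
	out := make([]string, 0, len(seen))
	for name := range seen {
		out = append(out, name)
	}
	sort.Strings(out)
	return out
}
```

For an archive that contains only `docs/guide/index.html`, this yields `docs`, `docs/guide`, and `docs/guide/index.html`, regardless of whether the archiver emitted directory entries.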
This bug would cause POST hooks triggered for large repositories to
silently fail.
We need the update context to have the principal (which is tied to
the HTTP request), but not the cancellation (which is also tied to
the HTTP request and is triggered once the request is done either way).
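
One way to get a context with these properties in Go 1.21 or later is `context.WithoutCancel`; whether git-pages uses exactly this call is an assumption. `principalKey`, `updateContext`, and `principalAfterRequestEnds` are hypothetical names for the sketch.

```go
package main

import "context"

// principalKey is a hypothetical context key; the real code may store the
// principal differently.
type principalKey struct{}

// updateContext keeps the values of the request context (including the
// principal) but drops its cancellation and deadline.
func updateContext(reqCtx context.Context) context.Context {
	return context.WithoutCancel(reqCtx)
}

// principalAfterRequestEnds simulates a finished HTTP request and reports
// what the detached update context still sees afterwards.
func principalAfterRequestEnds(principal string) (string, error) {
	reqCtx, cancel := context.WithCancel(
		context.WithValue(context.Background(), principalKey{}, principal))
	updateCtx := updateContext(reqCtx)
	cancel() // the request is done, successfully or not
	got, _ := updateCtx.Value(principalKey{}).(string)
	return got, updateCtx.Err()
}
```

After the request context is cancelled, the update context still returns the principal and its `Err()` stays `nil`, so a long-running update is not aborted when the response has been sent.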
Before this change, the Cache-Control header would always be overridden. This
change allows a custom Cache-Control, provided Cache-Control is added to
the header allow list.
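
The rule can be sketched as follows. `effectiveCacheControl` and its parameters are hypothetical; how git-pages represents custom headers and the allow list is not shown in the message above.

```go
package main

import "strings"

// effectiveCacheControl returns the site's custom Cache-Control value if the
// header is on the allow list, and the server's default value otherwise.
func effectiveCacheControl(custom map[string]string, allowList []string, fallback string) string {
	allowed := false
	for _, name := range allowList {
		// Header names are case-insensitive.
		if strings.EqualFold(name, "Cache-Control") {
			allowed = true
		}
	}
	if allowed {
		for name, value := range custom {
			if strings.EqualFold(name, "Cache-Control") {
				return value
			}
		}
	}
	return fallback
}
```

With an empty allow list the function behaves like the old code path: the custom value is ignored and the default wins.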
Like limiting the size of an archive, it is a supplementary check meant
to limit resource consumption prior to the final check done in
`StoreManifest()`.
This is added to aid migration from Codeberg Pages v2. Forgejo allows
both `_` and `-` in usernames, and it is necessary to be able to accept
host names like `user_name.codeberg.page` under a wildcard domain.
(It is not possible to get a TLS certificate for a host name like this,
so only a wildcard certificate will be able to cover it.)
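
A sketch of what accepting such host names could look like. `userFromHost` is a hypothetical helper, and the exact character rules (lowercase letters, digits, `-`, `_`, at most 63 bytes) are an assumption rather than the project's actual validation.

```go
package main

import "strings"

// userFromHost extracts the single label in front of a wildcard domain
// suffix such as ".codeberg.page". Unlike strict host name validation, it
// accepts `_` in the label.
func userFromHost(host, suffix string) (string, bool) {
	host = strings.ToLower(host)
	label, ok := strings.CutSuffix(host, suffix)
	if !ok || label == "" || len(label) > 63 {
		return "", false
	}
	for i := 0; i < len(label); i++ {
		c := label[i]
		isAlnum := (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9')
		// A '.' is rejected here, so only one level below the suffix matches,
		// which is also all a wildcard certificate covers.
		if !isAlnum && c != '-' && c != '_' {
			return "", false
		}
	}
	return label, true
}
```

For example, `userFromHost("user_name.codeberg.page", ".codeberg.page")` returns `user_name`, while `a.b.codeberg.page` is rejected.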
Currently you can specify "Branch: HEAD" or "Branch: refs/tags/v1" and
go-git will resolve it to the relevant ref. Given that the HTTP header is
called Branch, this is confusing.
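
One way to remove the ambiguity is to always treat the header value as a branch name and qualify it before resolution; that this is the approach taken here is an assumption. `branchRefName` is a hypothetical helper that mirrors what go-git's `plumbing.NewBranchReferenceName` produces.

```go
package main

// branchRefName maps the value of the Branch header to a fully qualified
// branch ref, so the value can never select HEAD or a tag.
func branchRefName(branch string) string {
	return "refs/heads/" + branch
}
```

With this mapping, `Branch: HEAD` looks for a branch literally named `HEAD`, and `Branch: refs/tags/v1` becomes `refs/heads/refs/tags/v1`, which does not match the tag.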
Given this is already depending on zstd I don't see a reason not to.
Can be tested with libarchive via: `bsdtar -a --options zip:compression=zstd -cf file.zip files...`
Reviewed-on: https://codeberg.org/git-pages/git-pages/pulls/91
Co-authored-by: David Leadbeater <dgl@dgl.cx>
Co-committed-by: David Leadbeater <dgl@dgl.cx>
Previously, you could issue e.g. a `GET /%2e%2e/%2e%2e` and it would
get interpreted as a parent directory path segment in the handler.
This didn't result in a path traversal vulnerability when passed to
the S3 backend because of a `path.Clean()` call indirectly done by
`makeWebRoot()`, but it's prudent to not take chances.
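
A minimal sketch of rejecting such requests up front, assuming the check runs on the raw (still percent-encoded) request path. `hasDotSegment` and `isSafeRequestPath` are hypothetical helpers, not the handler's actual code.

```go
package main

import (
	"net/url"
	"strings"
)

// hasDotSegment reports whether a decoded path contains a "." or ".."
// path segment.
func hasDotSegment(decodedPath string) bool {
	for _, seg := range strings.Split(decodedPath, "/") {
		if seg == "." || seg == ".." {
			return true
		}
	}
	return false
}

// isSafeRequestPath decodes a raw request path and rejects it if decoding
// fails or if any segment is "." or "..", however it was encoded.
func isSafeRequestPath(rawPath string) bool {
	decoded, err := url.PathUnescape(rawPath)
	if err != nil {
		return false
	}
	return !hasDotSegment(decoded)
}
```

Checking whole segments rather than substrings keeps names such as `..b` or `a..` valid while still catching `/%2e%2e/%2e%2e`.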
This reverts commit 351d0a0c85.
This option does not have any effect at the moment and may potentially
confuse users. It can be easily reintroduced later (by reverting this
commit) once we start logging at any level other than `info`.
Previously, this would disallow all git clones except for those via
wildcard domains. This is highly unintuitive. It also meant that
disabling this function via environment variable was not possible.
It is expected that in most deployments, a reverse proxy server like
Caddy or Nginx will be connecting to git-pages; listening on any address
by default is a privacy and security concern.