Chris Lu 05f0f7e1c9 fix(remote-storage/azure): fix re-cache of large remote blobs (#9174) (#9179)
* fix(remote-storage/azure): fix re-cache of large remote blobs (#9174)

ReadFile issued a single DownloadStream for the entire requested byte
range, so a large re-cache (e.g. a 2 GB blob re-fetched on S3 GET after
eviction) had to move the whole range over one HTTP connection within
the SDK's per-try TryTimeout. TryTimeout had been set to 10s "to fail
faster on auth issues", which silently broke large reads: every attempt
hit the context deadline, the filer's CacheRemoteObjectToLocalCluster
returned an error, and the S3 gateway surfaced it to clients as an ETag
mismatch on the partial response.

Switch ReadFile to the SDK's parallel block downloader (DownloadBuffer
with 4 MiB blocks) so each individual HTTP GET is small enough to
complete well inside TryTimeout. Expose the parallelism through the
RemoteStorageConcurrentReader interface so callers (FetchAndWriteNeedle)
can tune it per request, matching the S3 backend.
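The block-splitting idea described above can be sketched with the standard library alone. This is an illustration of the technique, not the SDK's DownloadBuffer and not the code in this commit: `rangeFetcher`, `parallelRead`, and `fakeBlob` are hypothetical names, and the sketch assumes the caller has already validated offset and size.

```go
package main

import (
	"fmt"
	"sync"
)

// blockSize is 4 MiB, the block size named in the commit message.
const blockSize = 4 << 20

// rangeFetcher is a hypothetical stand-in for one ranged HTTP GET:
// it fills dst with the bytes starting at the given blob offset.
type rangeFetcher func(offset int64, dst []byte) error

// parallelRead reads size bytes starting at offset, issuing one fetch per
// block with at most `concurrency` fetches in flight at a time.
// Assumption: offset and size are non-negative (validated by the caller).
func parallelRead(fetch rangeFetcher, offset, size int64, concurrency int) ([]byte, error) {
	if concurrency < 1 {
		concurrency = 1
	}
	buf := make([]byte, size)
	sem := make(chan struct{}, concurrency) // counting semaphore
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		firstErr error
	)
	for start := int64(0); start < size; start += blockSize {
		end := start + blockSize
		if end > size {
			end = size // last block may be short
		}
		wg.Add(1)
		sem <- struct{}{} // wait for a free slot
		go func(start, end int64) {
			defer wg.Done()
			defer func() { <-sem }()
			// Each goroutine writes to its own disjoint slice of buf.
			if err := fetch(offset+start, buf[start:end]); err != nil {
				mu.Lock()
				if firstErr == nil {
					firstErr = err
				}
				mu.Unlock()
			}
		}(start, end)
	}
	wg.Wait()
	if firstErr != nil {
		return nil, firstErr
	}
	return buf, nil
}

// fakeBlob simulates a remote blob whose byte at position i is byte(i % 251).
func fakeBlob(offset int64, dst []byte) error {
	for i := range dst {
		dst[i] = byte((offset + int64(i)) % 251)
	}
	return nil
}

func main() {
	// A 9 MiB read is split into three requests: 4 MiB, 4 MiB, and 1 MiB.
	data, err := parallelRead(fakeBlob, 10, 9<<20, 4)
	if err != nil {
		panic(err)
	}
	fmt.Println("bytes read:", len(data))
}
```

Because every request covers at most one block, a per-request timeout only has to accommodate 4 MiB of transfer regardless of the total range size.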

Also restore TryTimeout to 60s. With parallel block transfers it is no
longer on the critical path for large-blob bodies, but it gives metadata
operations and any non-parallel paths more headroom on slow links.

* fix(remote-storage/azure): guard ReadFileWithConcurrency inputs

Addresses review feedback on PR #9179:

- Reject negative size up front instead of panicking inside make([]byte, size).
- Clamp concurrency to math.MaxUint16 before casting to uint16 so an
  oversized caller value can't silently wrap to a small number.

* fix(remote-storage/azure): reject negative offset in ReadFileWithConcurrency

Addresses review feedback on PR #9179. Without this guard, a negative
offset combined with size == 0 would compute `size = ContentLength -
offset`, a value larger than the blob, and then attempt to allocate and
download past the end.
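The arithmetic behind this guard can be worked through with small numbers. The sketch below assumes a hypothetical helper `resolveRange` and a 100-byte blob; it mirrors the `size = ContentLength - offset` computation quoted above rather than reproducing the actual function.

```go
package main

import (
	"errors"
	"fmt"
)

// resolveRange returns the number of bytes to read. When size is 0 the
// length is derived from the blob's content length, as in the commit text.
func resolveRange(contentLength, offset, size int64) (int64, error) {
	if offset < 0 {
		return 0, errors.New("offset must be non-negative")
	}
	if size == 0 {
		size = contentLength - offset
	}
	return size, nil
}

func main() {
	contentLength := int64(100)
	offset := int64(-20)

	// Without the guard: 100 - (-20) = 120, 20 bytes more than the blob holds.
	fmt.Println(contentLength - offset) // prints: 120

	// With the guard the request is rejected before any allocation.
	_, err := resolveRange(contentLength, offset, 0)
	fmt.Println(err) // prints: offset must be non-negative
}
```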
2026-04-21 14:56:36 -07:00