61 Commits

Author SHA1 Message Date
Lars Lehtonen
935fb42e1d chore(weed/util/chunk_cache): remove unused functions (#9372)
* chore(weed/util/chunk_cache): remove unused functions

* fix(chunk_cache): bound ReadAt buffer in readNeedleSliceAt

When the caller-provided buffer is larger than the remaining needle
bytes, ReadAt would spill into the next needle and trigger the
n != wanted error. Slice to data[:wanted] so the read stops at the
needle boundary.
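
A minimal sketch of the bounded read described above, assuming an `io.ReaderAt` backend. The function name `readBounded` and its signature are hypothetical and are not the actual `readNeedleSliceAt` code; only the `data[:wanted]` slicing comes from the commit message.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// readBounded reads exactly `wanted` bytes at `offset`, even when the
// caller's buffer is larger. Slicing to data[:wanted] stops ReadAt at the
// needle boundary instead of letting it fill the buffer from the next needle.
// Hypothetical helper for illustration only.
func readBounded(r io.ReaderAt, data []byte, offset int64, wanted int) (int, error) {
	if wanted > len(data) {
		return 0, fmt.Errorf("buffer too small: have %d, want %d", len(data), wanted)
	}
	n, err := r.ReadAt(data[:wanted], offset)
	// io.ReaderAt may report io.EOF together with a complete read.
	if err != nil && !(err == io.EOF && n == wanted) {
		return n, err
	}
	if n != wanted {
		return n, fmt.Errorf("read %d bytes, wanted %d", n, wanted)
	}
	return n, nil
}

// demo stores two adjacent "needles" ("abc" then "defgh") and reads the
// first one into an oversized buffer.
func demo() (int, string, error) {
	store := bytes.NewReader([]byte("abcdefgh"))
	buf := make([]byte, 8) // larger than the 3-byte needle
	n, err := readBounded(store, buf, 0, 3)
	return n, string(buf[:n]), err
}

func main() {
	n, s, err := demo()
	fmt.Println(n, s, err)
}
```

Without the `data[:wanted]` slice, the same call would read all eight bytes and the `n != wanted` check would fail.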

---------

Co-authored-by: Chris Lu <chris.lu@gmail.com>
2026-05-08 13:12:11 -07:00
Chris Lu
45578a42e9 fix(volume): keep vacuum running past dangling .idx entries (#9115)
* fix(volume): keep vacuum running past dangling .idx entries

Vacuum compaction aborted entirely on the first .idx entry whose offset
pointed past the end of the .dat file, surfacing as `cannot hydrate
needle from file: EOF` and stalling progress on every other volume.

In both Go and Rust:

- During compaction, skip an unreadable needle and continue. The bytes
  it pointed at were already unreachable via reads, so dropping the
  index reference makes the post-vacuum volume consistent. Real EIO
  still bails out so a disk fault is not silently papered over.

- At volume load, do a single linear scan of the .idx and confirm
  every (offset + actual size) fits inside .dat. The pre-existing
  integrity check only looked at the last 10 entries, so deeper
  corruption (e.g. left over from a crashed batched write) went
  undetected and only surfaced later as a vacuum EOF. A failure now
  marks the volume read-only at load time so an operator can react.

Refs #8928

* fix(volume): only skip permanent-corruption needle reads during vacuum

Address PR review feedback (gemini-code-assist + coderabbit):

The original patch skipped any non-EIO read failure, which would silently
drop needles on transient errors — Windows hardware bad-sector errors
(ERROR_CRC etc.) never surface as syscall.EIO; tiered-storage network
timeouts and EROFS would also slip through and shrink the volume.

Switch to an explicit whitelist of permanent-corruption shapes:

- Add needle.ErrorCorrupted sentinel and wrap CRC and "index out of
  range" errors with %w so callers can match via errors.Is.
- copyDataBasedOnIndexFile now skips only when the read failure is
  io.EOF, io.ErrUnexpectedEOF, ErrorSizeMismatch, ErrorSizeInvalid,
  or ErrorCorrupted. Anything else (real disk faults, environmental
  errors, Windows hardware codes) aborts the compaction so an
  operator notices.
- Mirror the same whitelist in the Rust volume server, matching on
  io::ErrorKind::UnexpectedEof and the NeedleError corruption variants
  (SizeMismatch, CrcMismatch, IndexOutOfRange, TailTooShort).

Also add `defer v.Close()` in TestVerifyIndexFitsInDat so Windows
t.TempDir() cleanup can release the .dat/.idx handles.

Refs #8928
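
The whitelist approach above can be sketched as follows. The sentinel variables here are local stand-ins that reuse the names from the commit message, and `isSkippableSketch` is a hypothetical helper, not the project's own function; the sketch only shows how `%w` wrapping lets `errors.Is` match a sentinel.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// Stand-in sentinels for illustration; the real ones live in the needle package.
var (
	ErrorSizeMismatch = errors.New("size mismatch")
	ErrorSizeInvalid  = errors.New("size invalid")
	ErrorCorrupted    = errors.New("needle corrupted")
)

// isSkippableSketch returns true only for errors that indicate permanent
// corruption. Anything else (disk faults, timeouts, read-only filesystems)
// is treated as fatal so the compaction aborts instead of dropping data.
func isSkippableSketch(err error) bool {
	permanent := []error{
		io.EOF,
		io.ErrUnexpectedEOF,
		ErrorSizeMismatch,
		ErrorSizeInvalid,
		ErrorCorrupted,
	}
	for _, target := range permanent {
		if errors.Is(err, target) {
			return true
		}
	}
	return false
}

// classifySamples checks one wrapped corruption error and one unrelated error.
func classifySamples() (crcSkippable, diskFaultSkippable bool) {
	crcErr := fmt.Errorf("needle 42: CRC check failed: %w", ErrorCorrupted)
	diskErr := errors.New("input/output error") // not on the whitelist
	return isSkippableSketch(crcErr), isSkippableSketch(diskErr)
}

func main() {
	crc, disk := classifySamples()
	fmt.Println("wrapped CRC error skippable:", crc)
	fmt.Println("unknown disk error skippable:", disk)
}
```

An allow-list fails safe: an error type nobody anticipated aborts the compaction rather than silently shrinking the volume.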

* fix(volume): wrap entry-not-found size-mismatch with ErrorSizeMismatch

Address PR review: the fallback branch in ReadBytes returned an
unwrapped fmt.Errorf, so isSkippableNeedleReadError (and any caller
using errors.Is(..., ErrorSizeMismatch)) could not match it. Wrap
with %w so the whitelist applies, while leaving the existing direct
sentinel return for the OffsetSize==4 / offset<MaxPossibleVolumeSize
retry path unchanged so ReadData's `err == ErrorSizeMismatch` retry
still triggers.

Refs #8928

* fix(volume): integrate dangling-idx check into existing index load walk

Address PR review (gemini-code-assist, medium): the structural .idx
check used to do a second linear scan of the index file at every volume
load, doubling the disk-I/O cost on servers managing many volumes.

Track the largest (offset + actual size) seen during the existing
needle-map load walks (`LoadCompactNeedleMap`, `NewLevelDbNeedleMap`,
`NewSortedFileNeedleMap`'s `newNeedleMapMetricFromIndexFile`,
`DoOffsetLoading`) on a new `MaximumNeedleEnd` field on `mapMetric`,
exposed as `MaxNeedleEnd()` on the NeedleMapper interface.
`volume.load()` then compares `nm.MaxNeedleEnd()` to the .dat size
after the load is complete — pure numeric comparison, no extra I/O.

The standalone `verifyIndexFitsInDat` helper and its caller in
`CheckVolumeDataIntegrity` are removed; the test that used to drive
the helper directly now exercises the new path via
`LoadCompactNeedleMap`.

Mirror the same change in the Rust volume server: track
`max_needle_end` on `NeedleMapMetric`, expose via `max_needle_end()`
on `CompactNeedleMap`, `RedbNeedleMap`, and the `NeedleMap` enum.
The Rust load walk already happens in `load_from_idx` for both map
kinds, so the structural check becomes free.

Refs #8928
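
A simplified sketch of the idea: record the largest needle end while walking the index, then compare it once against the .dat size. The `indexEntry` type and the `observe` and `fitsInDat` helpers are hypothetical; only the `MaximumNeedleEnd` field name is taken from the commit message, and details such as deleted entries and offset encoding are ignored here.

```go
package main

import "fmt"

// indexEntry is a hypothetical, simplified .idx record: a byte offset into
// the .dat file and the needle's on-disk size.
type indexEntry struct {
	Offset int64
	Size   int64
}

// mapMetric mirrors the idea of tracking the largest needle end seen so far.
type mapMetric struct {
	MaximumNeedleEnd int64
}

// observe is called once per entry during the index walk that already
// happens at load time, so no extra I/O is needed.
func (m *mapMetric) observe(e indexEntry) {
	if end := e.Offset + e.Size; end > m.MaximumNeedleEnd {
		m.MaximumNeedleEnd = end
	}
}

// fitsInDat walks the entries and then does a single numeric comparison
// against the size of the .dat file.
func fitsInDat(entries []indexEntry, datSize int64) bool {
	var m mapMetric
	for _, e := range entries {
		m.observe(e)
	}
	return m.MaximumNeedleEnd <= datSize
}

func main() {
	healthy := []indexEntry{{0, 10}, {10, 20}}
	dangling := []indexEntry{{0, 10}, {100, 20}} // points past a 30-byte .dat
	fmt.Println(fitsInDat(healthy, 30), fitsInDat(dangling, 30))
}
```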
2026-04-16 22:01:34 -07:00
Chris Lu
e1fa4ec756 perf(cache): drop OS page cache after disk cache reads (#9098)
* perf(cache): drop OS page cache after disk cache reads

After reading from the on-disk chunk cache, advise the kernel via
FADV_DONTNEED to release the corresponding page cache pages. This
prevents double-caching the same data in both user-space and kernel
page caches, freeing RAM for other uses on systems with large disk
caches.

* fix(cache): guard dropReadCache against zero length and invalid fd

A zero-length fadvise is interpreted as "to end of file" on Linux,
which would inadvertently drop the page cache for the entire remainder
of the cache volume. Also check fd >= 0 to avoid unnecessary syscalls
when the backend file is closed.

* perf(cache): only apply FADV_DONTNEED for reads >= 1 MiB

For small needle reads the syscall overhead outweighs the memory
savings, and the kernel page cache is more beneficial for warm data.
Restrict fadvise to reads of at least 1 MiB where the freed page
cache is meaningful.
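
The guards from the three changes above can be combined into one decision function. This is a sketch only: `shouldDropReadCache` and `minDropSize` are hypothetical names, and the fadvise call itself is left out because it is Linux-specific.

```go
package main

import "fmt"

// minDropSize is the 1 MiB threshold named in the commit message.
const minDropSize = 1 << 20

// shouldDropReadCache decides whether to advise the kernel to release page
// cache pages for a range that was just read. Hypothetical helper.
func shouldDropReadCache(fd int, length int64) bool {
	if fd < 0 {
		return false // backend file is closed; skip the syscall
	}
	if length <= 0 {
		// On Linux a zero length means "to end of file", which would drop
		// far more than the range that was read.
		return false
	}
	return length >= minDropSize
}

func main() {
	fmt.Println(shouldDropReadCache(3, 0))       // zero-length read
	fmt.Println(shouldDropReadCache(-1, 4<<20))  // invalid descriptor
	fmt.Println(shouldDropReadCache(3, 64<<10))  // small needle read
	fmt.Println(shouldDropReadCache(3, 4<<20))   // large read
}
```

When the function returns true, a Linux build would then issue the fadvise call with `FADV_DONTNEED` for exactly the offset and length that were read.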
2026-04-16 09:38:42 -07:00
Lars Lehtonen
80db692728 fix(weed/util/chunk_cache): fix dropped errors (#9042) 2026-04-13 01:16:56 -07:00
promalert
9012069bd7 chore: execute goimports to format the code (#7983)
* chore: execute goimports to format the code

Signed-off-by: promalert <promalert@outlook.com>

* goimports -w .

---------

Signed-off-by: promalert <promalert@outlook.com>
Co-authored-by: Chris Lu <chris.lu@gmail.com>
2026-01-07 13:06:08 -08:00
Chris Lu
f1f0856e50 FUSE Mount: enhance disk cache with volume ID and cookie validation (#7269)
* enhance disk cache with volume ID and cookie validation

* address comments

* fix test

* fmt
2025-09-24 00:31:32 -07:00
chrislu
ec155022e7 "golang.org/x/exp/slices" => "slices" and go fmt 2024-12-19 19:25:06 -08:00
chrislu
c9f3448692 ReadAt may return io.EOF at end of file
related to https://github.com/seaweedfs/seaweedfs/issues/6219
2024-11-21 00:37:38 -08:00
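
The `io.ReaderAt` contract allows `ReadAt` to return either `err == nil` or `err == io.EOF` when a read fills the buffer exactly at the end of the input. A sketch of a caller that tolerates both follows; `eofAtEnd` and `readExact` are hypothetical names and this is not the project's code.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// eofAtEnd is a ReaderAt that returns io.EOF together with a complete read
// whenever the read ends exactly at the end of the data. The io.ReaderAt
// contract permits this behavior.
type eofAtEnd struct{ data []byte }

func (r eofAtEnd) ReadAt(p []byte, off int64) (int, error) {
	if off < 0 {
		return 0, errors.New("negative offset")
	}
	if off >= int64(len(r.data)) {
		return 0, io.EOF
	}
	n := copy(p, r.data[off:])
	if off+int64(n) == int64(len(r.data)) {
		return n, io.EOF
	}
	return n, nil
}

// readExact fills p completely and treats "full read plus io.EOF" as success.
func readExact(r io.ReaderAt, p []byte, off int64) error {
	n, err := r.ReadAt(p, off)
	if n == len(p) && (err == nil || err == io.EOF) {
		return nil
	}
	if err == nil {
		err = io.ErrUnexpectedEOF
	}
	return err
}

// demoTail reads the last three bytes of a six-byte source.
func demoTail() (string, error) {
	buf := make([]byte, 3)
	err := readExact(eofAtEnd{data: []byte("abcdef")}, buf, 3)
	return string(buf), err
}

func main() {
	s, err := demoTail()
	fmt.Println(s, err)
}
```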
Chris Lu
dc784bf217 merge current message queue code changes (#6201)
* listing files to convert to parquet

* write parquet files

* save logs into parquet files

* pass by value

* compact logs into parquet format

* can skip existing files

* refactor

* refactor

* fix compilation

* when no partition found

* refactor

* add untested parquet file read

* rename package

* refactor

* rename files

* remove unused

* add merged log read func

* parquet wants to know the file size

* rewind by time

* pass in stop ts

* add stop ts

* adjust log

* minor

* adjust log

* skip .parquet files when reading message logs

* skip non message files

* Update subscriber_record.go

* send messages

* skip message data with only ts

* skip non log files

* update parquet-go package

* ensure a valid record type

* add new field to a record type

* Update read_parquet_to_log.go

* fix parquet file name generation

* separating reading parquet and logs

* add key field

* add skipped logs

* use in memory cache

* refactor

* refactor

* refactor

* refactor, and change compact log

* refactor

* rename

* refactor

* fix format

* prefix v to version directory
2024-11-04 12:08:25 -08:00
Bruce
5428229347 fix file read crash (#6021) 2024-09-14 08:33:35 -07:00
Eugeniy E. Mikhailov
dab0bb8097 Feature limit caching to prescribed number of bytes per file (#6009)
* feature: we can check if a fileId is already in the cache

We use this to protect the cache from adding the same needle
over and over.

* fuse mount: Do not start downloader if needle is already in the cache

* added maxFilePartSizeInCache property to ChunkCache

If a file is very large, only the first maxFilePartSizeInCache bytes
are put into the cache (subject to the needle size
constraints).

* feature: for large files put in cache no more than prescribed number of bytes

Before this patch only the first needle of a large file was intended for
caching. This patch caches up to the prescribed maximum number of bytes
instead, which makes it possible to bypass the default 2MB maximum for a
file part stored in the cache.
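
One way to picture the per-file limit is shown below. The rule used here, caching a chunk only if it starts within the first `maxFilePartSizeInCache` bytes of the file, is an assumption made for illustration; the commit message does not spell out the exact boundary handling, and `shouldCacheChunk` is a hypothetical helper.

```go
package main

import "fmt"

// shouldCacheChunk reports whether a chunk starting at offsetInFile falls
// inside the leading part of the file that is allowed into the cache.
// Assumed rule for illustration; the real boundary handling may differ.
func shouldCacheChunk(offsetInFile, maxFilePartSizeInCache uint64) bool {
	return offsetInFile < maxFilePartSizeInCache
}

func main() {
	const limit = 8 << 20 // cache at most the first 8 MiB of each file
	for _, off := range []uint64{0, 4 << 20, 8 << 20, 64 << 20} {
		fmt.Println(off, shouldCacheChunk(off, limit))
	}
}
```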

* added dummy mock methods to satisfy interfaces of ChunkCache
2024-09-11 21:09:20 -07:00
Eugeniy E. Mikhailov
c04edeed68 bug fix in the data received from cache processing (#6002)
The patch addresses #3745.

The cache should return exactly the amount of data requested by the buffer.
By construction, the cache either returns the entire requested data range
or reports an error.

The old use of minsize miscalculated the requested data size
when a non-zero offset was requested.
2024-09-10 13:30:18 -07:00
chrislu
18afdb15b6 Revert "weed mount, weed dav add option to force cache"
This reverts commit 7367b976b0.
2024-09-04 01:38:29 -07:00
chrislu
7367b976b0 weed mount, weed dav add option to force cache 2024-09-04 01:19:14 -07:00
chrislu
645ae8c57b Revert "Revert "Merge branch 'master' of https://github.com/seaweedfs/seaweedfs""
This reverts commit 8cb42c39
2023-09-25 09:35:16 -07:00
chrislu
8cb42c39ad Revert "Merge branch 'master' of https://github.com/seaweedfs/seaweedfs"
This reverts commit 2e5aa06026, reversing
changes made to 4d414f54a2.
2023-09-18 16:12:50 -07:00
dependabot[bot]
a04bd4d26f Bump github.com/rclone/rclone from 1.63.1 to 1.64.0 (#4850)
* Bump github.com/rclone/rclone from 1.63.1 to 1.64.0

Bumps [github.com/rclone/rclone](https://github.com/rclone/rclone) from 1.63.1 to 1.64.0.
- [Release notes](https://github.com/rclone/rclone/releases)
- [Changelog](https://github.com/rclone/rclone/blob/master/RELEASE.md)
- [Commits](https://github.com/rclone/rclone/compare/v1.63.1...v1.64.0)

---
updated-dependencies:
- dependency-name: github.com/rclone/rclone
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* API changes

* go mod

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
Co-authored-by: chrislu <chris.lu@gmail.com>
2023-09-18 14:43:05 -07:00
Guo Lei
5b905fb2b7 Lazy loading (#3958)
* types package is imported more than once

* lazy-loading

* fix bugs

* fix bugs

* fix unit tests

* fix test error

* rename function

* unload ldb after initial startup

* Don't load ldb when starting volume server if ldbtimeout is set.

* remove unnecessary unloadldb

* Update weed/command/server.go

Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>

* Update weed/command/volume.go

Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>

Co-authored-by: guol-fnst <goul-fnst@fujitsu.com>
Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
2022-11-14 00:19:27 -08:00
Ryan Russell
0fc242b084 docs: panicing -> panicking (#3687)
Signed-off-by: Ryan Russell <git@ryanrussell.org>

Signed-off-by: Ryan Russell <git@ryanrussell.org>
2022-09-15 01:38:04 -07:00
chrislu
26dbc6c905 move to https://github.com/seaweedfs/seaweedfs 2022-07-29 00:17:28 -07:00
justin
3551ca2fcf enhancement: replace sort.Slice with slices.SortFunc to reduce reflection 2022-04-18 10:35:43 +08:00
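
For comparison, a sketch of the two sorting styles. It uses the standard-library `slices` and `cmp` packages (Go 1.21 or later), whose `SortFunc` takes a three-way comparison; the commit itself predates that and would have used the `golang.org/x/exp/slices` variant. The `cacheVolume` type and its fields are hypothetical.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
	"sort"
	"strings"
)

// cacheVolume is a hypothetical element type for illustration.
type cacheVolume struct {
	name    string
	modTime int64
}

// sortWithReflection uses sort.Slice, which swaps elements via reflection.
func sortWithReflection(vs []cacheVolume) {
	sort.Slice(vs, func(i, j int) bool { return vs[i].modTime < vs[j].modTime })
}

// sortGeneric uses the generic slices.SortFunc, which is type-safe and
// avoids the reflection-based swapper.
func sortGeneric(vs []cacheVolume) {
	slices.SortFunc(vs, func(a, b cacheVolume) int {
		return cmp.Compare(a.modTime, b.modTime)
	})
}

func sample() []cacheVolume {
	return []cacheVolume{{"c", 30}, {"a", 10}, {"b", 20}}
}

// names joins the volume names in their current order.
func names(vs []cacheVolume) string {
	out := make([]string, len(vs))
	for i, v := range vs {
		out[i] = v.name
	}
	return strings.Join(out, ",")
}

func main() {
	a, b := sample(), sample()
	sortWithReflection(a)
	sortGeneric(b)
	fmt.Println(names(a), names(b))
}
```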
chrislu
28b395bef4 better control for reader caching 2022-02-26 02:16:47 -08:00
chrislu
3ad5fa6f6f chunk cache adds function ReadChunkAt 2022-02-25 21:55:04 -08:00
Eng Zer Jun
b92df1654c test: use T.TempDir to create temporary test directory
The directory created by `T.TempDir` is automatically removed when the
test and all its subtests complete.

Reference: https://pkg.go.dev/testing#T.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2022-02-14 10:38:13 +08:00
Chris Lu
8965a53c4d add warning error 2021-10-16 15:57:30 -07:00
Eng Zer Jun
a23bcbb7ec refactor: move from io/ioutil to io and os package
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-10-14 12:27:58 +08:00
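
The replacements named in the Go 1.16 release notes look like this in practice. The `roundTrip` helper and the file names are hypothetical and only demonstrate the `os` functions that supersede their `io/ioutil` counterparts.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// roundTrip writes a payload to a temporary file and reads it back using
// the os package functions that replace the deprecated io/ioutil ones.
func roundTrip(payload string) (string, error) {
	dir, err := os.MkdirTemp("", "chunk-cache-example-") // was ioutil.TempDir
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "data.bin")
	if err := os.WriteFile(path, []byte(payload), 0o644); err != nil { // was ioutil.WriteFile
		return "", err
	}
	b, err := os.ReadFile(path) // was ioutil.ReadFile
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	got, err := roundTrip("hello")
	fmt.Println(got, err)
}
```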
Chris Lu
556cc3a4ca mount: avoid exception if disk cache is not initialized
related to https://github.com/chrislusf/seaweedfs/issues/2102
2021-05-31 16:42:55 -07:00
Nathan Hawkins
042de9359c make reader_at handle random reads more efficiently for FUSE 2021-04-28 19:13:37 -04:00
Chris Lu
6bc09b18c4 truncate is a bit faster to reuse the storage 2021-04-14 20:26:56 -07:00
Chris Lu
4abb511db3 make a local copy of the in memory cached data 2021-03-22 22:33:07 -07:00
bingoohuang
7256902fb0 fix typo offset.ToAcutalOffset to offset.ToActualOffset 2021-02-07 12:11:51 +08:00
Chris Lu
90b117acf1 update ccache version 2021-01-08 02:17:43 -08:00
Chris Lu
707936f482 re-enable caching larger than 16MB
revert 62ce85610e
2020-10-03 14:12:38 -07:00
Chris Lu
b9887504e8 fix test 2020-09-27 23:19:50 -07:00
Chris Lu
f46eae284e adjust for test 2020-09-27 23:08:11 -07:00
Chris Lu
c49e2bb9a3 adjust 2020-09-27 12:07:45 -07:00
Chris Lu
62ce85610e skip caching too large chunks 2020-09-27 11:58:48 -07:00
Chris Lu
9ad2dcca2b more tests 2020-09-27 11:42:51 -07:00
Chris Lu
e43d86c796 fix pre allocated volume size 2020-09-27 10:58:19 -07:00
Chris Lu
31fc7bb2e1 refactor
adjust for faster test
2020-09-27 10:41:29 -07:00
Chris Lu
a41588279a change log level 5 to 4 2020-08-30 20:12:04 -07:00
Chris Lu
99d05f758c adjust logs 2020-08-18 23:39:18 -07:00
Chris Lu
6a92f0bc7a refactoring to typed Size
Go is amazing with refactoring!
2020-08-18 17:04:28 -07:00
Chris Lu
09e126bae5 refactoring: use interface 2020-08-17 20:20:08 -07:00
Chris Lu
be4d42b8e2 rename 2020-08-17 20:15:53 -07:00
Chris Lu
97e54a80d4 rename variables 2020-08-17 16:05:13 -07:00
Chris Lu
003d48da21 adjust logs 2020-08-15 19:55:28 -07:00
Chris Lu
bef356ce4c since we already know the chunk size, no need to iterate 2020-06-27 12:51:04 -07:00
Chris Lu
a808b3b5df in case the memory data is too small 2020-06-27 11:59:15 -07:00
Chris Lu
3dbd51c3c2 a little bit more efficient 2020-06-26 10:02:37 -07:00