seaweedfs

mirror of https://github.com/seaweedfs/seaweedfs.git synced 2026-05-14 05:41:29 +00:00

Author	SHA1	Message	Date
Lars Lehtonen	935fb42e1d	chore(weed/util/chunk_cache): remove unused functions (#9372) * chore(weed/util/chunk_cache): remove unused functions * fix(chunk_cache): bound ReadAt buffer in readNeedleSliceAt When the caller-provided buffer is larger than the remaining needle bytes, ReadAt would spill into the next needle and trigger the n != wanted error. Slice to data[:wanted] so the read stops at the needle boundary. --------- Co-authored-by: Chris Lu <chris.lu@gmail.com>	2026-05-08 13:12:11 -07:00
Chris Lu	45578a42e9	fix(volume): keep vacuum running past dangling .idx entries (#9115 ) * fix(volume): keep vacuum running past dangling .idx entries Vacuum compaction aborted entirely on the first .idx entry whose offset pointed past the end of the .dat file, surfacing as `cannot hydrate needle from file: EOF` and stalling progress on every other volume. In both Go and Rust: - During compaction, skip an unreadable needle and continue. The bytes it pointed at were already unreachable via reads, so dropping the index reference makes the post-vacuum volume consistent. Real EIO still bails out so a disk fault is not silently papered over. - At volume load, do a single linear scan of the .idx and confirm every (offset + actual size) fits inside .dat. The pre-existing integrity check only looked at the last 10 entries, so deeper corruption (e.g. left over from a crashed batched write) went undetected and only surfaced later as a vacuum EOF. A failure now marks the volume read-only at load time so an operator can react. Refs #8928 * fix(volume): only skip permanent-corruption needle reads during vacuum Address PR review feedback (gemini-code-assist + coderabbit): The original patch skipped any non-EIO read failure, which would silently drop needles on transient errors — Windows hardware bad-sector errors (ERROR_CRC etc.) never surface as syscall.EIO; tiered-storage network timeouts and EROFS would also slip through and shrink the volume. Switch to an explicit whitelist of permanent-corruption shapes: - Add needle.ErrorCorrupted sentinel and wrap CRC and "index out of range" errors with %w so callers can match via errors.Is. - copyDataBasedOnIndexFile now skips only when the read failure is io.EOF, io.ErrUnexpectedEOF, ErrorSizeMismatch, ErrorSizeInvalid, or ErrorCorrupted. Anything else (real disk faults, environmental errors, Windows hardware codes) aborts the compaction so an operator notices. 
- Mirror the same whitelist in the Rust volume server, matching on io::ErrorKind::UnexpectedEof and the NeedleError corruption variants (SizeMismatch, CrcMismatch, IndexOutOfRange, TailTooShort). Also add `defer v.Close()` in TestVerifyIndexFitsInDat so Windows t.TempDir() cleanup can release the .dat/.idx handles. Refs #8928 * fix(volume): wrap entry-not-found size-mismatch with ErrorSizeMismatch Address PR review: the fallback branch in ReadBytes returned an unwrapped fmt.Errorf, so isSkippableNeedleReadError (and any caller using errors.Is(..., ErrorSizeMismatch)) could not match it. Wrap with %w so the whitelist applies, while leaving the existing direct sentinel return for the OffsetSize==4 / offset<MaxPossibleVolumeSize retry path unchanged so ReadData's `err == ErrorSizeMismatch` retry still triggers. Refs #8928 * fix(volume): integrate dangling-idx check into existing index load walk Address PR review (gemini-code-assist, medium): the structural .idx check used to do a second linear scan of the index file at every volume load, doubling the disk-I/O cost on servers managing many volumes. Track the largest (offset + actual size) seen during the existing needle-map load walks (`LoadCompactNeedleMap`, `NewLevelDbNeedleMap`, `NewSortedFileNeedleMap`'s `newNeedleMapMetricFromIndexFile`, `DoOffsetLoading`) on a new `MaximumNeedleEnd` field on `mapMetric`, exposed as `MaxNeedleEnd()` on the NeedleMapper interface. `volume.load()` then compares `nm.MaxNeedleEnd()` to the .dat size after the load is complete — pure numeric comparison, no extra I/O. The standalone `verifyIndexFitsInDat` helper and its caller in `CheckVolumeDataIntegrity` are removed; the test that used to drive the helper directly now exercises the new path via `LoadCompactNeedleMap`. Mirror the same change in the Rust volume server: track `max_needle_end` on `NeedleMapMetric`, expose via `max_needle_end()` on `CompactNeedleMap`, `RedbNeedleMap`, and the `NeedleMap` enum. 
The Rust load walk already happens in `load_from_idx` for both map kinds, so the structural check becomes free. Refs #8928	2026-04-16 22:01:34 -07:00
Chris Lu	e1fa4ec756	perf(cache): drop OS page cache after disk cache reads (#9098) * perf(cache): drop OS page cache after disk cache reads After reading from the on-disk chunk cache, advise the kernel via FADV_DONTNEED to release the corresponding page cache pages. This prevents double-caching the same data in both user-space and kernel page caches, freeing RAM for other uses on systems with large disk caches. * fix(cache): guard dropReadCache against zero length and invalid fd A zero-length fadvise is interpreted as "to end of file" on Linux, which would inadvertently drop the page cache for the entire remainder of the cache volume. Also check fd >= 0 to avoid unnecessary syscalls when the backend file is closed. * perf(cache): only apply FADV_DONTNEED for reads >= 1 MiB For small needle reads the syscall overhead outweighs the memory savings, and the kernel page cache is more beneficial for warm data. Restrict fadvise to reads of at least 1 MiB where the freed page cache is meaningful.	2026-04-16 09:38:42 -07:00
Lars Lehtonen	80db692728	fix(weed/util/chunk_cache): fix dropped errors (#9042)	2026-04-13 01:16:56 -07:00
promalert	9012069bd7	chore: execute goimports to format the code (#7983) * chore: execute goimports to format the code Signed-off-by: promalert <promalert@outlook.com> * goimports -w . --------- Signed-off-by: promalert <promalert@outlook.com> Co-authored-by: Chris Lu <chris.lu@gmail.com>	2026-01-07 13:06:08 -08:00
Chris Lu	f1f0856e50	FUSE Mount: enhance disk cache with volume ID and cookie validation (#7269) * enhance disk cache with volume ID and cookie validation * address comments * fix test * fmt	2025-09-24 00:31:32 -07:00
chrislu	ec155022e7	"golang.org/x/exp/slices" => "slices" and go fmt	2024-12-19 19:25:06 -08:00
chrislu	c9f3448692	ReadAt may return io.EOF at end of file; related to https://github.com/seaweedfs/seaweedfs/issues/6219	2024-11-21 00:37:38 -08:00
Chris Lu	dc784bf217	merge current message queue code changes (#6201) * listing files to convert to parquet * write parquet files * save logs into parquet files * pass by value * compact logs into parquet format * can skip existing files * refactor * refactor * fix compilation * when no partition found * refactor * add untested parquet file read * rename package * refactor * rename files * remove unused * add merged log read func * parquet wants to know the file size * rewind by time * pass in stop ts * add stop ts * adjust log * minor * adjust log * skip .parquet files when reading message logs * skip non message files * Update subscriber_record.go * send messages * skip message data with only ts * skip non log files * update parquet-go package * ensure a valid record type * add new field to a record type * Update read_parquet_to_log.go * fix parquet file name generation * separating reading parquet and logs * add key field * add skipped logs * use in memory cache * refactor * refactor * refactor * refactor, and change compact log * refactor * rename * refactor * fix format * prefix v to version directory	2024-11-04 12:08:25 -08:00
Bruce	5428229347	fix file read crash (#6021)	2024-09-14 08:33:35 -07:00
Eugeniy E. Mikhailov	dab0bb8097	Feature limit caching to prescribed number of bytes per file (#6009) * feature: we can check if a fileId is already in the cache We use this to protect the cache from adding the same needle over and over. * fuse mount: Do not start downloader if needle is already in the cache * added maxFilePartSizeInCache property to ChunkCache If a file is very large, only the first maxFilePartSizeInCache bytes are going to be put in the cache (subject to the needle size constraints). * feature: for large files put in cache no more than prescribed number of bytes Before this patch only the first needle of a large file was intended for caching. This patch puts up to the prescribed maximum number of bytes in the cache. This allows bypassing the default 2MB maximum for a file part stored in the cache. * added dummy mock methods to satisfy interfaces of ChunkCache	2024-09-11 21:09:20 -07:00
Eugeniy E. Mikhailov	c04edeed68	bug fix in the data received from cache processing (#6002) The patch addresses #3745. The cache should return the exact amount of data requested by the buffer. By construction, the cache either returns the entire requested data range or reports an error. The old use of minsize miscalculated the requested data size if a non-zero offset was requested.	2024-09-10 13:30:18 -07:00
chrislu	18afdb15b6	Revert "weed mount, weed dav add option to force cache" This reverts commit `7367b976b0`.	2024-09-04 01:38:29 -07:00
chrislu	7367b976b0	weed mount, weed dav add option to force cache	2024-09-04 01:19:14 -07:00
chrislu	645ae8c57b	Revert "Revert "Merge branch 'master' of https://github.com/seaweedfs/seaweedfs"" This reverts commit `8cb42c39`	2023-09-25 09:35:16 -07:00
chrislu	8cb42c39ad	Revert "Merge branch 'master' of https://github.com/seaweedfs/seaweedfs" This reverts commit `2e5aa06026`, reversing changes made to `4d414f54a2`.	2023-09-18 16:12:50 -07:00
dependabot[bot]	a04bd4d26f	Bump github.com/rclone/rclone from 1.63.1 to 1.64.0 (#4850) * Bump github.com/rclone/rclone from 1.63.1 to 1.64.0 Bumps [github.com/rclone/rclone](https://github.com/rclone/rclone) from 1.63.1 to 1.64.0. - [Release notes](https://github.com/rclone/rclone/releases) - [Changelog](https://github.com/rclone/rclone/blob/master/RELEASE.md) - [Commits](https://github.com/rclone/rclone/compare/v1.63.1...v1.64.0) --- updated-dependencies: - dependency-name: github.com/rclone/rclone dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * API changes * go mod --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com> Co-authored-by: chrislu <chris.lu@gmail.com>	2023-09-18 14:43:05 -07:00
Guo Lei	5b905fb2b7	Lazy loading (#3958) * types package is imported more than once * lazy-loading * fix bugs * fix bugs * fix unit tests * fix test error * rename function * unload ldb after initial startup * Don't load ldb when starting volume server if ldbtimeout is set. * remove unnecessary unloadldb * Update weed/command/server.go Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com> * Update weed/command/volume.go Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com> Co-authored-by: guol-fnst <goul-fnst@fujitsu.com> Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>	2022-11-14 00:19:27 -08:00
Ryan Russell	0fc242b084	docs: `panicing` -> `panicking` (#3687) Signed-off-by: Ryan Russell <git@ryanrussell.org> Signed-off-by: Ryan Russell <git@ryanrussell.org>	2022-09-15 01:38:04 -07:00
chrislu	26dbc6c905	move to https://github.com/seaweedfs/seaweedfs	2022-07-29 00:17:28 -07:00
justin	3551ca2fcf	enhancement: replace sort.Slice with slices.SortFunc to reduce reflection	2022-04-18 10:35:43 +08:00
chrislu	28b395bef4	better control for reader caching	2022-02-26 02:16:47 -08:00
chrislu	3ad5fa6f6f	chunk cache adds function ReadChunkAt	2022-02-25 21:55:04 -08:00
Eng Zer Jun	b92df1654c	test: use `T.TempDir` to create temporary test directory The directory created by `T.TempDir` is automatically removed when the test and all its subtests complete. Reference: https://pkg.go.dev/testing#T.TempDir Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>	2022-02-14 10:38:13 +08:00
Chris Lu	8965a53c4d	add warning error	2021-10-16 15:57:30 -07:00
Eng Zer Jun	a23bcbb7ec	refactor: move from io/ioutil to io and os package The io/ioutil package has been deprecated as of Go 1.16, see https://golang.org/doc/go1.16#ioutil. This commit replaces the existing io/ioutil functions with their new definitions in io and os packages. Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>	2021-10-14 12:27:58 +08:00
Chris Lu	556cc3a4ca	mount: avoid exception if disk cache is not initialized related to https://github.com/chrislusf/seaweedfs/issues/2102	2021-05-31 16:42:55 -07:00
Nathan Hawkins	042de9359c	make reader_at handle random reads more efficiently for FUSE	2021-04-28 19:13:37 -04:00
Chris Lu	6bc09b18c4	truncate is a bit faster to reuse the storage	2021-04-14 20:26:56 -07:00
Chris Lu	4abb511db3	make a local copy of the in memory cached data	2021-03-22 22:33:07 -07:00
bingoohuang	7256902fb0	fix typo offset.ToAcutalOffset to offset.ToActualOffset	2021-02-07 12:11:51 +08:00
Chris Lu	90b117acf1	update ccache version	2021-01-08 02:17:43 -08:00
Chris Lu	707936f482	re-enable caching larger than 16MB revert `62ce85610e`	2020-10-03 14:12:38 -07:00
Chris Lu	b9887504e8	fix test	2020-09-27 23:19:50 -07:00
Chris Lu	f46eae284e	adjust for test	2020-09-27 23:08:11 -07:00
Chris Lu	c49e2bb9a3	adjust	2020-09-27 12:07:45 -07:00
Chris Lu	62ce85610e	skip caching too large chunks	2020-09-27 11:58:48 -07:00
Chris Lu	9ad2dcca2b	more tests	2020-09-27 11:42:51 -07:00
Chris Lu	e43d86c796	fix pre allocated volume size	2020-09-27 10:58:19 -07:00
Chris Lu	31fc7bb2e1	refactor adjust for faster test	2020-09-27 10:41:29 -07:00
Chris Lu	a41588279a	change log level 5 to 4	2020-08-30 20:12:04 -07:00
Chris Lu	99d05f758c	adjust logs	2020-08-18 23:39:18 -07:00
Chris Lu	6a92f0bc7a	refactoring to typed Size Go is amazing with refactoring!	2020-08-18 17:04:28 -07:00
Chris Lu	09e126bae5	refactoring: use interface	2020-08-17 20:20:08 -07:00
Chris Lu	be4d42b8e2	rename	2020-08-17 20:15:53 -07:00
Chris Lu	97e54a80d4	rename variables	2020-08-17 16:05:13 -07:00
Chris Lu	003d48da21	adjust logs	2020-08-15 19:55:28 -07:00
Chris Lu	bef356ce4c	since we already know the chunk size, no need to iterate	2020-06-27 12:51:04 -07:00
Chris Lu	a808b3b5df	in case the memory data is too small	2020-06-27 11:59:15 -07:00
Chris Lu	3dbd51c3c2	a little bit more efficient	2020-06-26 10:02:37 -07:00

61 Commits