127 Commits

Author SHA1 Message Date
Poorna
9dc29d7687 Avoid ILM expiry on deleted versions that are yet to replicate (#18175)
Fixes #18167
2023-10-06 06:55:15 -06:00
Klaus Post
57f84a8b4c Add abandoned folder scanning to metrics (#18076)
Include object and versions heal scan times when checking non-empty abandoned folders.

Furthermore, don't add a delay between healing versions; instead, wait once per object.
2023-09-24 22:15:31 -07:00
Harshavardhana
91ebac0a00 fix: move abandoned parts check after healing not in ILM path (#18087) 2023-09-22 12:07:52 -07:00
Harshavardhana
2add57cfed apply healing per object at 1024 cycles (#18050)
- we already have MRF for most recent failures
- we trigger healing during HEAD/GET operation

These are enough. Also change the default max wait
from 5sec to 1sec for the default scanner speed.
2023-09-19 09:24:22 -07:00
Harshavardhana
36385010f5 use optimized pathJoin instead of path.Join (#18042)
This avoids allocations in the scanner routine; they are tiny, but
they add up over many cycles of the scanner.
2023-09-16 19:08:59 -07:00
Aditya Manthramurthy
1c99fb106c Update to minio/pkg/v2 (#17967) 2023-09-04 12:57:37 -07:00
Harshavardhana
9458485e43 avoid double logging from healing (#17950) 2023-08-30 18:46:04 -07:00
Poorna
b48bbe08b2 Add additional info for replication metrics API (#17293)
to track the replication transfer rate across different nodes,
number of active workers in use and in-queue stats to get
an idea of the current workload.

This PR also adds replication metrics to the site replication
status API. For site replication, prometheus metrics are
no longer at the bucket level - but at the cluster level.

Add prometheus metric to track credential errors since uptime
2023-08-30 01:00:59 -07:00
Krishnan Parthasarathi
87cb0081ec Retain current and up to NewerNoncurrentVersions versions (#17909)
applyNewerNoncurrentVersionLimit method should pass along versions
unaffected by NewerNoncurrentVersions rule for further ILM evaluation.
2023-08-24 09:26:29 -07:00
Harshavardhana
3a0125fa1f remove unexpected logging from peer calls (#17888)
also make sure RequestID is set for system logs
2023-08-21 14:25:24 -07:00
Anis Eleuch
4c6869cd9a ilm: Fix cleaning non current null versions (#17876) 2023-08-18 12:55:47 -07:00
Harshavardhana
6e860b6dc5 count all versions as part of DeleteAllVersionsAction (#17821) 2023-08-09 08:55:19 -07:00
Krishnan Parthasarathi
0120ff93bc admin-info: add DeleteMarkers count (#17659) 2023-07-18 10:49:40 -07:00
Harshavardhana
24e86d0c59 avoid passing around poolIdx, setIdx instead pass the relevant disks (#17660) 2023-07-17 09:52:05 -07:00
Harshavardhana
3e196fa7b3 fix: ILM newer noncurrent version limit must return correct versions (#17652)
objects/versions that are not expired via NewerNoncurrentVersions
must be properly returned to be applied under further ILM actions.

Without this fix, legitimately expired objects would be missed
from expiration.
2023-07-14 16:42:35 -07:00
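The behaviour described in this commit message can be illustrated with a simplified sketch. It is not the MinIO implementation: the `version` type and the `splitByNoncurrentLimit` function are hypothetical names, versions are assumed to be ordered newest first, and other factors a real scanner would consider (object lock, noncurrent age, and so on) are ignored. The point it shows is that versions within the limit are returned to the caller so that further ILM rules can still be evaluated on them.

```go
package main

import "fmt"

// version is a hypothetical, minimal stand-in for an object version.
type version struct {
	ID       string
	IsLatest bool
}

// splitByNoncurrentLimit assumes versions are ordered newest first.
// It returns the versions that stay (the current version plus up to
// limit noncurrent versions) and the versions that exceed the limit.
// The retained slice is what a caller would pass on to further ILM
// evaluation; dropping it would hide those versions from other rules.
func splitByNoncurrentLimit(versions []version, limit int) (retained, overflow []version) {
	kept := 0
	for _, v := range versions {
		switch {
		case v.IsLatest:
			retained = append(retained, v)
		case kept < limit:
			retained = append(retained, v)
			kept++
		default:
			overflow = append(overflow, v)
		}
	}
	return retained, overflow
}

func main() {
	versions := []version{
		{ID: "v4", IsLatest: true},
		{ID: "v3"},
		{ID: "v2"},
		{ID: "v1"},
	}
	retained, overflow := splitByNoncurrentLimit(versions, 2)
	fmt.Println(len(retained), len(overflow))
}
```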
Poorna
5e2f8d7a42 replication: Simplify mrf requeueing and add backlog handler (#17171)
Simplify MRF queueing and add backlog handler

- Limit re-tries to 3 to avoid repeated re-queueing. Fall offs
to be re-tried when the scanner revisits this object or upon access.

- Change MRF to have each node process only its MRF entries.

- Collect MRF backlog by the node to allow for current backlog visibility
2023-07-12 23:51:33 -07:00
Harshavardhana
82075e8e3a use strconv variants to improve on performance per 'op' (#17626)
```
BenchmarkItoa
BenchmarkItoa-8         	673628088	         1.946 ns/op	       0 B/op	       0 allocs/op
BenchmarkFormatInt
BenchmarkFormatInt-8    	592919769	         2.012 ns/op	       0 B/op	       0 allocs/op
BenchmarkSprint
BenchmarkSprint-8       	26149144	        49.06 ns/op	       2 B/op	       1 allocs/op
BenchmarkSprintBool
BenchmarkSprintBool-8   	26440180	        45.92 ns/op	       4 B/op	       1 allocs/op
BenchmarkFormatBool
BenchmarkFormatBool-8   	1000000000	         0.2558 ns/op	       0 B/op	       0 allocs/op
```
2023-07-11 07:46:58 -07:00
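For readers unfamiliar with the functions being compared in the benchmark above, the following sketch shows the `strconv` calls next to their `fmt.Sprint` equivalents. The helper names `formatFast` and `formatSlow` are made up for illustration; only the standard library calls are real. Both helpers return identical strings, so the difference is purely in cost per call.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatFast converts values with strconv, which works on concrete
// types and avoids the interface handling done by fmt.Sprint.
func formatFast(n int, n64 int64, b bool) (string, string, string) {
	return strconv.Itoa(n), strconv.FormatInt(n64, 10), strconv.FormatBool(b)
}

// formatSlow produces the same strings through fmt.Sprint.
func formatSlow(n int, n64 int64, b bool) (string, string, string) {
	return fmt.Sprint(n), fmt.Sprint(n64), fmt.Sprint(b)
}

func main() {
	a, b, c := formatFast(42, -7, true)
	x, y, z := formatSlow(42, -7, true)
	fmt.Println(a == x, b == y, c == z) // prints "true true true"
}
```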
Harshavardhana
f6186965c3 honor DeleteAllVersions in list(), head() calls (#17604) 2023-07-08 15:42:10 -07:00
Harshavardhana
aae6846413 feat: allow expiration of all versions via ILM Expiration action (#17521)
Following extension allows users to specify immediate purge of
all versions as soon as the latest version of this object has
expired.

```
<LifecycleConfiguration>
    <Rule>
        <ID>ClassADocRule</ID>
        <Filter>
           <Prefix>classA/</Prefix>
        </Filter>
        <Status>Enabled</Status>
        <Expiration>
             <Days>3650</Days>
             <ExpiredObjectAllVersions>true</ExpiredObjectAllVersions>
        </Expiration>
    </Rule>
    ...
```
2023-06-28 22:12:28 -07:00
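As an illustration of how a rule like the one above could be read programmatically, here is a small sketch using Go's `encoding/xml`. The struct types are hypothetical and cover only the elements shown in the commit message; they are not MinIO's lifecycle types. The sample document is a completed copy of the snippet, closed off so that it parses.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Hypothetical types covering only the elements used in the example rule.
type expiration struct {
	Days                     int  `xml:"Days"`
	ExpiredObjectAllVersions bool `xml:"ExpiredObjectAllVersions"`
}

type rule struct {
	ID         string     `xml:"ID"`
	Prefix     string     `xml:"Filter>Prefix"`
	Status     string     `xml:"Status"`
	Expiration expiration `xml:"Expiration"`
}

type lifecycleConfiguration struct {
	XMLName xml.Name `xml:"LifecycleConfiguration"`
	Rules   []rule   `xml:"Rule"`
}

// sample is the rule from the commit message with closing tags added.
const sample = `<LifecycleConfiguration>
    <Rule>
        <ID>ClassADocRule</ID>
        <Filter>
           <Prefix>classA/</Prefix>
        </Filter>
        <Status>Enabled</Status>
        <Expiration>
             <Days>3650</Days>
             <ExpiredObjectAllVersions>true</ExpiredObjectAllVersions>
        </Expiration>
    </Rule>
</LifecycleConfiguration>`

// parseLifecycle decodes a lifecycle document into the sketch types.
func parseLifecycle(data string) (lifecycleConfiguration, error) {
	var cfg lifecycleConfiguration
	err := xml.Unmarshal([]byte(data), &cfg)
	return cfg, err
}

func main() {
	cfg, err := parseLifecycle(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, r := range cfg.Rules {
		fmt.Println(r.ID, r.Prefix, r.Expiration.Days, r.Expiration.ExpiredObjectAllVersions)
	}
}
```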
Kaan Kabalak
21fbe88e1f Print certain log messages once per error (#17484) 2023-06-24 20:29:13 -07:00
Klaus Post
bf8a68879c fix: Time ILM Actions for scanner info (#17493)
ILM actions were not being timed; this fixes it.
2023-06-23 07:48:36 -07:00
Aditya Manthramurthy
5a1612fe32 Bump up madmin-go and pkg deps (#17469) 2023-06-19 17:53:08 -07:00
Klaus Post
6f2406b0b6 fix: protect ReplicationStats against concurrent map iteration and write crash (#17403) 2023-06-12 09:17:11 -07:00
Krishnan Parthasarathi
3e128c116e Add lifecycle event source to audit log tags (#17248) 2023-05-22 15:28:56 -07:00
jiuker
7d433f16c4 before return make globalScannerMetrics.incTime call (#17230) 2023-05-18 13:45:05 -07:00
Krishnan Parthasarathi
0ec722bc54 Add tags to NewerNoncurrentVersions audit event (#17110) 2023-05-02 12:56:33 -07:00
Krishnan Parthasarathi
e7cac8acef Add tags to auditLogLifecycle (#17081) 2023-04-26 17:49:00 -07:00
Poorna
cd6dec49c0 Add trace support for ilm activity (#16993) 2023-04-11 19:22:32 -07:00
Shubhendu
4c204707fd Correct to remove null version while ILM rule application (#16971)
Signed-off-by: Shubhendu Ram Tripathi <shubhendu@minio.io>
Co-authored-by: Harshavardhana <harsha@minio.io>
2023-04-06 14:10:01 -07:00
Harshavardhana
c06e0bfef9 set correct Host: value for replication event notification (#16984) 2023-04-06 10:20:53 -07:00
ferhat elmas
714283fae2 cleanup ignored static analysis (#16767) 2023-03-06 08:56:10 -08:00
Klaus Post
9acf1024e4 Remove bloom filter (#16682)
Removes the bloom filter since it has such limited usability, often gets saturated anyway, and adds a bunch of complexity to the scanner.

Also saves a tiny bit of CPU on each write operation.
2023-02-24 09:03:31 +05:30
Klaus Post
fd6622458b Add detailed scanner trace output and notifications (#16668) 2023-02-21 09:33:33 -08:00
Harshavardhana
b66d7dc708 add missing x-amz-id-2 to event notification date (#16646) 2023-02-20 15:41:47 +05:30
Krishnan Parthasarathi
2fa35def2c Fix DeleteObject when only free versions remain (#16289) 2022-12-21 16:24:07 -08:00
Harshavardhana
5d7e8f79ed fix: remove scanner healing with unnecessary logs (#16260) 2022-12-14 16:39:18 -08:00
Aditya Manthramurthy
a30cfdd88f Bump up madmin-go to v2 (#16162) 2022-12-06 13:46:50 -08:00
Klaus Post
a713aee3d5 Run staticcheck on CI (#16170) 2022-12-05 11:18:50 -08:00
Klaus Post
cc1d8f0057 Check for abandoned data when healing (#16122) 2022-11-28 10:20:55 -08:00
Krishnan Parthasarathi
6eef9b4a23 lifecycle: simplify Eval and HasActiveRules (#16036) 2022-11-10 07:17:45 -08:00
Anis Elleuch
3b1a9b9fdf Use the same lock for the scanner and site replication healing (#15985) 2022-11-08 08:55:55 -08:00
Harshavardhana
b57fbff7c1 ignore background healInfo in single drive setup (#15968) 2022-10-31 07:26:10 -07:00
Anis Elleuch
fc6c794972 Audit dangling object removal (#15933) 2022-10-24 11:35:07 -07:00
Anis Elleuch
ac85c2af76 lifecycle: refactor rules filtering and tagging support (#15914) 2022-10-21 10:46:53 -07:00
Harshavardhana
41e1654f9a remove spurious logging for object not found (#15842) 2022-10-12 04:28:21 -07:00
Harshavardhana
928feb0889 remove unused debug param from evalActionFromLifecycle (#15813) 2022-10-07 10:24:12 -07:00
Harshavardhana
ae4ee95d25 change default lock retry interval to 50ms (#15560)
Competing mutating calls on the same object in a versioned
bucket may have unexpectedly high delays.

This can be reproduced with a replicated bucket by repeatedly
overwriting and deleting the same object.

For longer locks, like the scanner's, keep the 1sec interval.
2022-08-19 16:21:05 -07:00
Poorna
21bf5b4db7 replication: heal proactively upon access (#15501)
Queue failed/pending replication for healing during listing and GET/HEAD
API calls. This includes healing of existing objects that were never
replicated or those in the middle of a resync operation.

This PR also fixes a bug in ListObjectVersions where lifecycle filtering
should be done.
2022-08-09 15:00:24 -07:00
Harshavardhana
0a8b78cb84 fix: simplify passing auditLog eventType (#15278)
Rename Trigger -> Event to be a more appropriate
name for the audit event.

Bonus: fixes a bug in AddMRFWorker() where it did not
cancel the waitgroup, leading to waitgroup leaks.
2022-07-12 10:43:32 -07:00
Klaus Post
37a6b2da67 Allow compaction at bucket top level. (#15266)
If more than 1M folders (objects or prefixes) are found at the top level in a bucket, allow it to be compacted.

While this is a very suboptimal structure, we should limit memory usage at some point.
2022-07-11 07:59:03 -07:00