Compare commits


18 Commits

Author SHA1 Message Date
copilot-swe-agent[bot]
cbc13366c6 Add comprehensive documentation and examples for AI detection system
Co-authored-by: blackpiglet <59276555+blackpiglet@users.noreply.github.com>
2026-02-02 03:25:37 +00:00
copilot-swe-agent[bot]
664b25cca1 Fix YAML syntax and validate AI detection workflow
Co-authored-by: blackpiglet <59276555+blackpiglet@users.noreply.github.com>
2026-02-02 03:23:57 +00:00
copilot-swe-agent[bot]
3504943019 Add AI-generated issue detection system with workflow and documentation
Co-authored-by: blackpiglet <59276555+blackpiglet@users.noreply.github.com>
2026-02-02 03:22:43 +00:00
copilot-swe-agent[bot]
acd4d5b183 Initial plan
2026-02-02 03:19:24 +00:00
Xun Jiang/Bruce Jiang
386599638f Merge pull request #9510 from Lyndon-Li/ignore-cache-volume-config-without-backup-repo-config
Some checks failed
Run the E2E test on kind / get-go-version (push) Failing after 1m13s
Run the E2E test on kind / build (push) Has been skipped
Run the E2E test on kind / setup-test-matrix (push) Successful in 4s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Main CI / get-go-version (push) Successful in 14s
Main CI / Build (push) Failing after 34s
Close stale issues and PRs / stale (push) Successful in 18s
Trivy Nightly Scan / Trivy nightly scan (velero, main) (push) Failing after 1m59s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-aws, main) (push) Failing after 1m21s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-gcp, main) (push) Failing after 1m21s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-microsoft-azure, main) (push) Failing after 1m36s
Log when cache volume configured but backup repo is not
2026-02-02 10:50:20 +08:00
Lyndon-Li
9796da389d ignore cache volume config when backup repo config is not provided
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2026-01-30 18:35:36 +08:00
Scott Seago
dfb1d45831 Remove backup from running list when backup fails validation (#9498)
Some checks failed
Run the E2E test on kind / get-go-version (push) Failing after 1m4s
Run the E2E test on kind / build (push) Has been skipped
Run the E2E test on kind / setup-test-matrix (push) Successful in 3s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Main CI / get-go-version (push) Successful in 12s
Main CI / Build (push) Failing after 32s
Close stale issues and PRs / stale (push) Successful in 15s
Trivy Nightly Scan / Trivy nightly scan (velero, main) (push) Failing after 1m34s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-aws, main) (push) Failing after 1m16s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-gcp, main) (push) Failing after 1m30s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-microsoft-azure, main) (push) Failing after 1m16s
Signed-off-by: Scott Seago <sseago@redhat.com>
2026-01-27 16:25:30 -05:00
Xun Jiang/Bruce Jiang
72beb35edc Maintenance Job only uses the first element of the LoadAffinity array from the ConfigMap. (#9494)
Some checks failed
Run the E2E test on kind / get-go-version (push) Failing after 1m7s
Run the E2E test on kind / build (push) Has been skipped
Run the E2E test on kind / setup-test-matrix (push) Successful in 3s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Main CI / get-go-version (push) Successful in 17s
Main CI / Build (push) Failing after 35s
Close stale issues and PRs / stale (push) Successful in 14s
Trivy Nightly Scan / Trivy nightly scan (velero, main) (push) Failing after 1m46s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-aws, main) (push) Failing after 1m8s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-gcp, main) (push) Failing after 1m25s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-microsoft-azure, main) (push) Failing after 1m13s
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2026-01-23 14:27:50 -05:00
Wenkai Yin(尹文开)
7442d20f9d Merge pull request #9481 from Lyndon-Li/issue-fix-9478
Some checks failed
Run the E2E test on kind / get-go-version (push) Failing after 1m3s
Run the E2E test on kind / build (push) Has been skipped
Run the E2E test on kind / setup-test-matrix (push) Successful in 3s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Main CI / get-go-version (push) Successful in 15s
Main CI / Build (push) Failing after 35s
Close stale issues and PRs / stale (push) Successful in 13s
Trivy Nightly Scan / Trivy nightly scan (velero, main) (push) Failing after 1m46s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-aws, main) (push) Failing after 1m34s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-gcp, main) (push) Failing after 1m22s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-microsoft-azure, main) (push) Failing after 1m18s
Issue 9478: Diagnose expose on peek error
2026-01-15 16:53:57 +08:00
lyndon-li
4dfb47dd21 Merge pull request #9487 from Lyndon-Li/issue-fix-for-cache-volume
Fix issue for cache volume
2026-01-15 11:43:36 +08:00
Lyndon-Li
e72fea8ecd fix issue for cache volume
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2026-01-14 17:45:01 +08:00
lyndon-li
f388a5ce51 update doc statement for Windows support (#9486)
Some checks failed
Run the E2E test on kind / get-go-version (push) Failing after 1m7s
Run the E2E test on kind / build (push) Has been skipped
Run the E2E test on kind / setup-test-matrix (push) Successful in 4s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Main CI / get-go-version (push) Successful in 13s
Main CI / Build (push) Failing after 1m1s
Close stale issues and PRs / stale (push) Successful in 13s
Trivy Nightly Scan / Trivy nightly scan (velero, main) (push) Failing after 1m31s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-aws, main) (push) Failing after 1m20s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-gcp, main) (push) Failing after 1m17s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-microsoft-azure, main) (push) Failing after 1m5s
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2026-01-14 00:25:58 -05:00
Lyndon-Li
e703e06eeb diagnose expose on peek error
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2026-01-13 16:33:14 +08:00
Wenkai Yin(尹文开)
1feaafc03e Merge pull request #9474 from blackpiglet/add_role_rolebinding_in_restore_sequence
Some checks failed
Run the E2E test on kind / get-go-version (push) Failing after 1m58s
Run the E2E test on kind / build (push) Has been skipped
Run the E2E test on kind / setup-test-matrix (push) Successful in 5s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Main CI / get-go-version (push) Successful in 15s
Main CI / Build (push) Failing after 42m42s
Close stale issues and PRs / stale (push) Successful in 14s
Trivy Nightly Scan / Trivy nightly scan (velero, main) (push) Failing after 1m38s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-aws, main) (push) Failing after 1m12s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-gcp, main) (push) Failing after 1m22s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-microsoft-azure, main) (push) Failing after 1m30s
Add Role, RoleBinding, ClusterRole, and ClusterRoleBinding in restore sequence
2026-01-08 11:58:06 +08:00
Xun Jiang/Bruce Jiang
e446ce54f6 Merge pull request #9473 from vmware-tanzu/fix_e2e_version_check_issue
Some checks failed
Run the E2E test on kind / get-go-version (push) Failing after 1m29s
Run the E2E test on kind / build (push) Has been skipped
Run the E2E test on kind / setup-test-matrix (push) Successful in 3s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Main CI / get-go-version (push) Successful in 14s
Main CI / Build (push) Failing after 44s
Close stale issues and PRs / stale (push) Successful in 16s
Trivy Nightly Scan / Trivy nightly scan (velero, main) (push) Failing after 1m44s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-aws, main) (push) Failing after 1m14s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-gcp, main) (push) Failing after 1m15s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-microsoft-azure, main) (push) Failing after 1m20s
Fix the version regexp to make sure releaseRe and tegRe can support string like v1.16
2026-01-07 22:41:40 +08:00
Xun Jiang
b7289b51c7 Add Role, RoleBinding, ClusterRole, and ClusterRoleBinding in restore sequence.
Ensure the RBAC resources are restored before pods.
This change helps avoid pod startup errors when a pod depends on RBAC resources.
For example, the Prometheus operator checks whether it has enough permissions before
launching its controllers. If the operator pod starts before the RBAC resources are
created, it does not launch the controllers and does not retry.
f7f07bcdfb/cmd/operator/main.go (L392-L400)

Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2026-01-07 12:40:23 +08:00
lyndon-li
6eae73f0bf Merge pull request #9466 from Lyndon-Li/collect-kopia-content-log
Some checks failed
Run the E2E test on kind / get-go-version (push) Failing after 1m9s
Run the E2E test on kind / build (push) Has been skipped
Run the E2E test on kind / setup-test-matrix (push) Successful in 3s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Main CI / get-go-version (push) Successful in 13s
Main CI / Build (push) Failing after 33s
Close stale issues and PRs / stale (push) Successful in 14s
Trivy Nightly Scan / Trivy nightly scan (velero, main) (push) Failing after 1m49s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-aws, main) (push) Failing after 1m24s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-gcp, main) (push) Failing after 1m56s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-microsoft-azure, main) (push) Failing after 1m28s
Collect kopia content log
2026-01-06 15:51:47 +08:00
Lyndon-Li
1425ebb369 collect kopia content log
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2025-12-31 15:42:14 +08:00
36 changed files with 778 additions and 88 deletions

.github/AI-DETECTION-EXAMPLES.md (new file)

@@ -0,0 +1,197 @@
# AI Issue Detection - Examples
This document provides examples to help understand what triggers AI detection.
## Example 1: High AI Score (Score: 6/8) ❌
**This would be flagged:**
```markdown
## Description
When deploying Velero on an EKS cluster with `hostNetwork: true`, the application fails to start.
## Critical Problem
```
time="2026-01-26T16:40:55Z" level=fatal msg="failed to start metrics server"
```
Status: BLOCKER
## Affected Environment
| Parameter | Value |
|----------|----------|
| Cluster | Amazon EKS |
| Velero Version | 1.8.2 |
| Kubernetes | 1.33 |
## Root Cause Analysis
The controller-runtime metrics uses port 8080 as a hardcoded default...
## Resolution Attempts
### Attempt 1: Use extraArgs
Result: Failed
### Attempt 2: Configure metricsAddress
Result: Failed
## Expected Permanent Solution
Velero should:
1. Auto-detect an available port
2. Accept configuring the controller-runtime port
## Questions for Maintainers
1. Why does controller-runtime use hardcoded 8080?
2. Is there a roadmap to support hostNetwork?
## Labels and Metadata
Severity: CRITICAL
```
**Why flagged (Patterns detected: 6/8):**
- `futureDates` - References "2026-01-26" and "Kubernetes 1.33"
- `excessiveHeaders` - 8+ section headers
- `formalPhrases` - "Root Cause Analysis", "Expected Permanent Solution", "Questions for Maintainers", "Labels and Metadata"
- `aiSectionHeaders` - "## Description", "## Critical Problem", "## Affected Environment", "## Resolution Attempts"
- `perfectFormatting` - Perfect table structure
- `genericSolutions` - Mentions "auto-detect"
---
## Example 2: Medium AI Score (Score: 2/8) ✅
**This would NOT be flagged (below threshold):**
```markdown
**What steps did you take and what happened:**
I'm trying to restore a backup but getting this error:
```
error: backup "my-backup" not found
```
**What did you expect to happen:**
The backup should restore successfully
**Environment:**
- Velero version: 1.13.0
- Kubernetes version: 1.28
- Cloud provider: AWS
**Additional context:**
I can see the backup in S3 but Velero doesn't list it. Running `velero backup get` shows no backups.
```
**Why NOT flagged (Patterns detected: 2/8):**
- `futureDates` - Uses realistic versions
- `excessiveHeaders` - Only 3 headers
- `formalPhrases` - No formal AI phrases
- `excessiveTables` - Has a table but only 1
- `perfectFormatting` - Normal formatting
- `aiSectionHeaders` - Standard issue template headers
- `excessiveFormatting` - Has code blocks
- `genericSolutions` - No generic solutions
---
## Example 3: Legitimate Detailed Issue (Score: 3/8) ⚠️
**This would be flagged but is actually legitimate:**
```markdown
## Problem Description
VolumeGroupSnapshot restore fails with Ceph RBD driver.
## Environment
- Velero: 1.14.0
- Kubernetes: 1.28.3
- ODF: 4.14.2 with Ceph RBD CSI driver
## Root Cause
Ceph RBD stores group snapshot metadata in journal as `csi.groupid` omap key. During restore, when creating pre-provisioned VSC, the RBD driver reads this and populates `status.volumeGroupSnapshotHandle`.
The CSI snapshot controller looks for a VGSC with matching handle. Since Velero deletes VGSC after backup, it's not found.
## Reproduction Steps
1. Create backup with VGS
2. Delete namespace
3. Restore backup
4. Observe VS stuck with "cannot find group snapshot"
## Workaround
Create stub VGSC with matching `volumeGroupSnapshotHandle` and patch status.
## Proposed Fix
1. Backup: Capture `volumeGroupSnapshotHandle` in CSISnapshotInfo
2. Restore: Create stub VGSC if handle exists
## Code References
- Ceph RBD: https://github.com/ceph/ceph-csi/blob/devel/internal/rbd/snapshot.go#L167
- Velero deletion: https://github.com/vmware-tanzu/velero/blob/main/pkg/backup/actions/csi/pvc_action.go#L1124
```
**Why flagged (Patterns detected: 3/8):**
- `futureDates` - Uses current versions
- `excessiveHeaders` - Has 6 section headers
- `formalPhrases` - "Root Cause", "Proposed Fix"
- `excessiveTables` - No tables
- `perfectFormatting` - Normal formatting
- `aiSectionHeaders` - Technical, not generic
- `excessiveFormatting` - Reasonable formatting
- `genericSolutions` - Structured solution with code refs
**Maintainer Action**: This is a legitimate, well-researched issue. Verify the details with the contributor and remove the `potential-ai-generated` label.
---
## Example 4: Simple Valid Issue (Score: 0/8) ✅
**This would NOT be flagged:**
```markdown
Velero backup fails with error: `rpc error: code = Unavailable desc = connection error`
Running Velero 1.13 on GKE. Backups were working yesterday but now all fail with this error.
Logs show the node-agent pod is crashing. Any ideas?
```
**Why NOT flagged (Patterns detected: 0/8):**
- All patterns: None detected
---
## Key Takeaways
### Will Trigger Detection ❌
- Future dates/versions (2026+, K8s 1.33+)
- 4+ formal AI phrases
- 8+ section headers
- Perfect table formatting across multiple tables
- Generic AI section titles
- Auto-detect/generic solution patterns
### Will NOT Trigger ✅
- Realistic version numbers
- Actual error messages from real systems
- Normal issue formatting
- Moderate level of detail
- Standard GitHub issue template
### May Trigger (But Legitimate) ⚠️
- Very detailed technical analysis
- Multiple code references
- Well-structured proposals
- Extensive testing documentation
For these cases, maintainers will verify with the contributor and remove the flag once confirmed.

.github/AI-DETECTION-README.md (new file)

@@ -0,0 +1,80 @@
# AI-Generated Content Detection
This directory contains the AI-generated content detection system for Velero issues.
## Overview
The Velero project has implemented automated detection of potentially AI-generated issues to help maintain quality and ensure that issues describe real, verified problems.
## How It Works
### Detection Workflow
The workflow (`.github/workflows/ai-issue-detector.yml`) runs automatically when:
- A new issue is opened
- An existing issue is edited
### Detection Patterns
The detector analyzes issues for several AI-generation patterns:
1. **Excessive Tables** - More than 5 lines that look like markdown table rows (the check counts rows, not whole tables)
2. **Excessive Headers** - More than 8 markdown section headers
3. **Formal Phrases** - More than 4 formal phrases typical of AI output (e.g., "Root Cause Analysis", "Operational Impact", "Expected Permanent Solution")
4. **Excessive Formatting** - More than 4 occurrences of `---` (horizontal rules)
5. **Future Dates** - The digits 2026 through 2039 appearing anywhere in the issue body
6. **Perfect Formatting** - Both the `| Parameter | Value |` and `| Aspect | Status | Impact |` table headers are present
7. **AI Section Headers** - More than 4 generic AI-style headers such as "## Critical Problem" or "## Resolution Attempts"
8. **Generic Solutions** - The strings "auto-detect" and "configuration:" together with more than 2 YAML code blocks
### Scoring System
Each detected pattern adds to the AI score. If the score is 3 or higher (out of 8), the issue is flagged as potentially AI-generated.
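The scoring rule can be sketched as follows. This is a minimal illustration, not the workflow's actual code: `scoreIssue` and `FLAG_THRESHOLD` are hypothetical names, and the pattern results are hand-written booleans rather than the output of real checks.

```javascript
// One point per detected pattern; flag the issue at 3 points or more.
// FLAG_THRESHOLD and scoreIssue are illustrative names (assumptions).
const FLAG_THRESHOLD = 3;

function scoreIssue(patternResults) {
  // Keep only the names of patterns whose check returned true.
  const detected = Object.entries(patternResults)
    .filter(([, hit]) => hit)
    .map(([name]) => name);
  return {
    score: detected.length,
    detected,
    flagged: detected.length >= FLAG_THRESHOLD,
  };
}

// Hand-written example input: three of the eight patterns fire.
const example = scoreIssue({
  excessiveTables: false,
  excessiveHeaders: true,
  formalPhrases: false,
  excessiveFormatting: true,
  futureDates: true,
  perfectFormatting: false,
  aiSectionHeaders: false,
  genericSolutions: false,
});
console.log(example.score, example.flagged); // 3 true
```

Because every pattern carries the same weight, the threshold is the only tuning knob in this sketch.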
### Actions Taken
When an issue is flagged:
1. A `potential-ai-generated` label is added
2. A `needs-triage` label is added
3. An automated comment is posted explaining:
- Why the issue was flagged
- What patterns were detected
- Guidelines for contributors to follow
- Request for verification
## For Contributors
If your issue is flagged:
1. **Don't panic** - This is not an accusation, just a request for verification
2. **Review the guidelines** in our [Code Standards](../site/content/docs/main/code-standards.md#ai-generated-content)
3. **Verify your content**:
- Ensure all version numbers are accurate
- Confirm error messages are from your actual environment
- Remove any placeholder or example content
- Simplify overly structured formatting
4. **Update the issue** with corrections if needed
5. **Comment to confirm** that the issue describes a real problem
## For Maintainers
When reviewing flagged issues:
1. Check if the technical details are realistic and verifiable
2. Look for signs of hallucinated content (fake version numbers, non-existent features)
3. Engage with the issue author to verify the problem
4. Remove the `potential-ai-generated` label once verified
5. Close issues that cannot be verified or describe non-existent problems
## Configuration
The detection patterns can be adjusted in the workflow file if needed. The threshold is currently set at 3 out of 8 patterns to balance false positives with detection accuracy.
## False Positives
The detector may occasionally flag legitimate issues, especially those that are:
- Very detailed and well-structured
- Using formal technical documentation style
- Reporting complex problems with extensive details
This is intentional - we prefer to verify detailed issues rather than miss AI-generated ones.

.github/MAINTAINER-AI-DETECTION-GUIDE.md (new file)

@@ -0,0 +1,186 @@
# Maintainer Guide: AI-Generated Issue Detection
This guide helps Velero maintainers understand and work with the AI-generated issue detection system.
## Overview
The AI detection system automatically analyzes new and edited issues to identify potential AI-generated content. This helps maintain issue quality and ensures contributors verify their submissions.
## How It Works
### Automatic Detection
When an issue is opened or edited, the workflow:
1. **Analyzes** the issue body for 8 different AI patterns
2. **Calculates** an AI confidence score (0-8)
3. **If score ≥ 3**: Adds labels and posts a comment
4. **If score < 3**: Takes no action (issue proceeds normally)
### Detection Patterns
| Pattern | Description | Weight |
|---------|-------------|--------|
| `excessiveTables` | More than 5 lines containing markdown table rows | 1 |
| `excessiveHeaders` | More than 8 markdown section headers | 1 |
| `formalPhrases` | More than 4 AI-typical phrases (e.g., "Root Cause Analysis") | 1 |
| `excessiveFormatting` | More than 4 occurrences of `---` | 1 |
| `futureDates` | The digits 2026 through 2039 anywhere in the body | 1 |
| `perfectFormatting` | Both the `Parameter / Value` and `Aspect / Status / Impact` table headers present | 1 |
| `aiSectionHeaders` | More than 4 generic AI headers (e.g., "## Critical Problem") | 1 |
| `genericSolutions` | "auto-detect" and "configuration:" plus more than 2 YAML blocks | 1 |
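In the workflow as committed, the phrase check compares the match count with `> 4`, so the pattern fires only from the fifth matching phrase onward. The sketch below copies the phrase list from the workflow; `countFormalPhrases` is a hypothetical helper name, and the test bodies are synthetic.

```javascript
// Phrase list copied from the workflow's formalPhrases pattern.
const FORMAL_PHRASES = [
  'Root Cause Analysis',
  'Operational Impact',
  'Expected Permanent Solution',
  'Questions for Maintainers',
  'Labels and Metadata',
  'Reference Files',
  'Steps to Reproduce',
];

// Hypothetical helper: counts how many listed phrases occur in the body.
// String.prototype.includes is case-sensitive, as in the workflow.
function countFormalPhrases(body) {
  return FORMAL_PHRASES.filter((phrase) => body.includes(phrase)).length;
}

// Synthetic bodies containing exactly four and exactly five phrases.
const fourPhrases = FORMAL_PHRASES.slice(0, 4).join('\n');
const fivePhrases = FORMAL_PHRASES.slice(0, 5).join('\n');

console.log(countFormalPhrases(fourPhrases) > 4); // false
console.log(countFormalPhrases(fivePhrases) > 4); // true
```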
## Working with Flagged Issues
### Step 1: Review the Issue
When you see an issue labeled `potential-ai-generated`:
1. **Read the issue carefully**
2. **Check the detected patterns** (listed in the auto-comment)
3. **Look for red flags**:
- Future version numbers (e.g., "Kubernetes 1.33")
- Future dates (e.g., "2026-01-27")
- Non-existent features or configurations
- Perfect table formatting with no actual content
- Generic solutions that don't match Velero's architecture
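When checking which red flags the detector actually caught, it helps to know what the `futureDates` regex in the committed workflow matches. It looks only for the digit sequences 2026 through 2039, so a log timestamp triggers it while a bare version string does not. A small sketch with illustrative sample strings:

```javascript
// Regex copied from the workflow's futureDates pattern.
const FUTURE_DATE_RE = /202[6-9]|203\d/;

// Illustrative inputs (assumptions, not taken from real issues).
const samples = [
  'time="2026-01-26T16:40:55Z" level=fatal msg="failed to start metrics server"',
  'Kubernetes 1.33',
  'Velero 1.13.0 on Kubernetes 1.28',
];

// Map each sample to whether the pattern would fire on it.
const results = samples.map((text) => FUTURE_DATE_RE.test(text));
console.log(results); // [ true, false, false ]
```

Version numbers therefore need a manual check during review; the automated pattern only reacts to year-like digit sequences.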
### Step 2: Engage with the Contributor
**If the issue seems legitimate but over-formatted:**
```markdown
Thanks for the detailed report! Could you confirm:
1. Are you running Velero version X.Y.Z (you mentioned version A.B.C)?
2. Is the error message exactly as shown?
3. Have you actually tried the workarounds mentioned?
Once verified, we'll remove the AI-generated flag and investigate.
```
**If the issue appears to be unverified AI content:**
```markdown
This issue appears to contain AI-generated content that hasn't been verified.
Please review our [AI contribution guidelines](https://github.com/vmware-tanzu/velero/blob/main/site/content/docs/main/code-standards.md#ai-generated-content) and:
1. Confirm this describes a real problem in your environment
2. Verify all version numbers and error messages
3. Remove any placeholder or example content
4. Test that the issue is reproducible
If you can't verify the issue, please close it. We're happy to help with real problems!
```
### Step 3: Take Action
**For verified issues:**
1. Remove the `potential-ai-generated` label
2. Keep or remove `needs-triage` as appropriate
3. Proceed with normal issue triage
**For unverified/invalid issues:**
1. Request verification (see templates above)
2. If no response after 7 days, consider closing as `stale`
3. If clearly invalid, close with explanation
## Common Patterns
### False Positives (Legitimate Issues)
These may trigger the detector but are usually valid:
- **Very detailed bug reports** with extensive logs and testing
- **Technical design proposals** with multiple sections
- **Well-organized feature requests** with tables and examples
**Action**: Engage with contributor, ask clarifying questions, remove flag if verified.
### True Positives (AI-Generated)
Red flags that indicate unverified AI content:
- **Future version numbers**: "Kubernetes 1.33" (doesn't exist yet)
- **Future dates**: "2026-01-27" (if current date is before)
- **Non-existent features**: References to Velero features that don't exist
- **Generic solutions**: "Auto-detect available port" (not how Velero works)
- **Perfect formatting, wrong content**: Beautiful tables with incorrect info
**Action**: Request verification, ask for actual environment details, consider closing if unverified.
### Edge Cases
**Contributor using AI as a writing assistant:**
- Issue content is verified and accurate
- Just used AI to help structure/format the report
- **Action**: This is acceptable! Remove flag if content is verified.
**Legitimate issue that happens to match patterns:**
- Real problem with detailed analysis
- Includes proper version numbers and logs
- **Action**: Verify with contributor, remove flag once confirmed.
## Statistics and Monitoring
You can search for flagged issues:
```
is:issue label:potential-ai-generated
```
Monitor trends:
- High detection rate → May need to adjust thresholds
- Low detection rate → Patterns working well or need refinement
## Adjusting the System
### Modifying Detection Patterns
Edit `.github/workflows/ai-issue-detector.yml`:
```javascript
// Increase the threshold to reduce false positives
if (aiScore >= 4) { // was 3
  // ... add labels and post the comment ...
}
// Adjust pattern sensitivity (inside the aiPatterns object)
excessiveTables: (issueBody.match(/\|.*\|/g) || []).length > 8, // was 5
```
### Adding New Patterns
Add to the `aiPatterns` object:
```javascript
// Example: Detect excessive use of emojis
excessiveEmojis: (issueBody.match(/[\u{1F300}-\u{1F9FF}]/gu) || []).length > 10,
```
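Before adding a pattern like this, it is worth checking what the regex really counts. The sketch below reuses the same character range; `countEmojis` is a hypothetical helper, and the sample strings are made up. Note that the range covers U+1F300 to U+1F9FF only, so symbols such as ✅ (U+2705) are not counted.

```javascript
// Same range as the example pattern; the `u` flag is required for \u{...} escapes.
const EMOJI_RE = /[\u{1F300}-\u{1F9FF}]/gu;

// Hypothetical helper, not part of the workflow.
function countEmojis(body) {
  return (body.match(EMOJI_RE) || []).length;
}

console.log(countEmojis('Backup failed 🚀🚀 please help 👋')); // 3
console.log(countEmojis('Done ✅'));                            // 0, U+2705 is outside the range
console.log(countEmojis('no emoji here'));                     // 0
```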
### Disabling the Workflow
Rename or delete `.github/workflows/ai-issue-detector.yml`
## Best Practices
1. **Be courteous**: Contributors may not realize their AI tool generated incorrect info
2. **Verify, don't assume**: Some detailed issues are legitimate
3. **Educate**: Point to the AI guidelines in code-standards.md
4. **Track patterns**: Note common AI-generated patterns for future improvements
5. **Iterate**: Adjust detection thresholds based on false positive rates
## FAQ
**Q: Should we reject all AI-assisted contributions?**
A: No! AI assistance is fine if the contributor verifies accuracy. We only flag unverified AI content.
**Q: What if a contributor is offended by the flag?**
A: Explain it's automated and not personal. We just need verification of technical details.
**Q: Can we automatically close flagged issues?**
A: No. Always engage with the contributor first. Some are legitimate.
**Q: What's an acceptable false positive rate?**
A: Aim for <10%. If higher, increase the threshold from 3 to 4 or 5.
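One way to track this number, sketched under the assumption that maintainers record how many flagged issues were later verified as legitimate. `falsePositiveRate` is a hypothetical helper and the counts are illustrative, not project data.

```javascript
// Hypothetical helper: share of flagged issues that were later verified as legitimate.
function falsePositiveRate(flaggedCount, verifiedLegitimateCount) {
  if (flaggedCount === 0) return 0; // nothing flagged, nothing to measure
  return verifiedLegitimateCount / flaggedCount;
}

// Illustrative numbers only.
const rate = falsePositiveRate(40, 6);
console.log((rate * 100).toFixed(1) + '%'); // 15.0%
console.log(rate > 0.10 ? 'consider raising the threshold' : 'threshold looks fine');
```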
## Support
Questions about the AI detection system? Tag @vmware-tanzu/velero-maintainers in issue #9501.

.github/labels.yaml

@@ -41,3 +41,4 @@ kind:
- tech-debt
- usage-error
- voting
- potential-ai-generated

.github/workflows/ai-issue-detector.yml (new file)

@@ -0,0 +1,132 @@
name: "Detect AI-Generated Issues"

on:
  issues:
    types: [opened, edited]

jobs:
  detect-ai-content:
    runs-on: ubuntu-latest
    permissions:
      issues: write
      contents: read

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Analyze issue for AI-generated content
        id: analyze
        uses: actions/github-script@v7
        with:
          script: |
            const issue = context.payload.issue;
            const issueBody = issue.body || '';
            const issueTitle = issue.title || '';

            // AI detection patterns
            const aiPatterns = {
              // Overly structured markdown with extensive tables (counts table-row lines)
              excessiveTables: (issueBody.match(/\|.*\|/g) || []).length > 5,

              // More than 8 markdown headers
              excessiveHeaders: (issueBody.match(/^#{1,6}\s+/gm) || []).length > 8,

              // Overly formal language patterns common in AI
              formalPhrases: [
                'Root Cause Analysis',
                'Operational Impact',
                'Expected Permanent Solution',
                'Questions for Maintainers',
                'Labels and Metadata',
                'Reference Files',
                'Steps to Reproduce'
              ].filter(phrase => issueBody.includes(phrase)).length > 4,

              // Repeated horizontal rules (---)
              excessiveFormatting: issueBody.includes('---\n \n---') ||
                (issueBody.match(/---/g) || []).length > 4,

              // Unrealistic version numbers or dates in the future
              futureDates: /202[6-9]|203\d/.test(issueBody),

              // Overly detailed technical specs with perfect formatting
              perfectFormatting: issueBody.includes('| Parameter | Value |') &&
                issueBody.includes('| Aspect | Status | Impact |'),

              // Generic AI-style section headers
              aiSectionHeaders: [
                '## Description',
                '## Critical Problem',
                '## Affected Environment',
                '## Full Helm Configuration',
                '## Resolution Attempts',
                '## Related Information'
              ].filter(header => issueBody.includes(header)).length > 4,

              // Unusual specificity combined with generic solutions
              genericSolutions: issueBody.includes('auto-detect') &&
                issueBody.includes('configuration:') &&
                (issueBody.match(/```yaml/g) || []).length > 2
            };

            // Calculate AI score
            let aiScore = 0;
            let detectedPatterns = [];

            for (const [pattern, detected] of Object.entries(aiPatterns)) {
              if (detected) {
                aiScore++;
                detectedPatterns.push(pattern);
              }
            }

            console.log('AI Score: ' + aiScore + '/8');
            console.log('Detected patterns: ' + detectedPatterns.join(', '));

            // If AI score is high, add label and comment
            if (aiScore >= 3) {
              // Add label
              try {
                await github.rest.issues.addLabels({
                  owner: context.repo.owner,
                  repo: context.repo.repo,
                  issue_number: issue.number,
                  labels: ['needs-triage', 'potential-ai-generated']
                });

                // Add comment
                const confidence = Math.round(aiScore/8 * 100);
                const repoPath = context.repo.owner + '/' + context.repo.repo;
                const comment = '👋 Thank you for opening this issue!\n\n' +
                  'This issue has been flagged for review as it may contain AI-generated content (confidence: ' + confidence + '%).\n\n' +
                  '**Detected patterns:** ' + detectedPatterns.join(', ') + '\n\n' +
                  'If this issue was created with AI assistance, please review our [AI contribution guidelines](https://github.com/' + repoPath + '/blob/main/site/content/docs/main/code-standards.md#ai-generated-content).\n\n' +
                  '**Important:**\n' +
                  '- Please verify all technical details are accurate\n' +
                  '- Ensure version numbers, dates, and configurations reflect your actual environment\n' +
                  '- Remove any placeholder or example content\n' +
                  '- Confirm the issue is reproducible in your environment\n\n' +
                  'A maintainer will review this issue shortly. If this was flagged in error, please let us know!';

                await github.rest.issues.createComment({
                  owner: context.repo.owner,
                  repo: context.repo.repo,
                  issue_number: issue.number,
                  body: comment
                });

                core.setOutput('ai-detected', 'true');
                core.setOutput('ai-score', aiScore);
              } catch (error) {
                console.log('Error adding label or comment:', error);
              }
            } else {
              core.setOutput('ai-detected', 'false');
              core.setOutput('ai-score', aiScore);
            }

            return {
              aiDetected: aiScore >= 3,
              score: aiScore,
              patterns: detectedPatterns
            };
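
The confidence figure in the posted comment is the score scaled to a percentage of the 8 patterns. A minimal sketch of that arithmetic, using a hypothetical helper name `confidencePercent` that does not appear in the workflow:

```javascript
// Hypothetical helper mirroring the workflow's Math.round(aiScore/8 * 100) expression.
function confidencePercent(aiScore, totalPatterns = 8) {
  return Math.round((aiScore / totalPatterns) * 100);
}

console.log(confidencePercent(3)); // 38
console.log(confidencePercent(6)); // 75
console.log(confidencePercent(8)); // 100
```

Since flagging starts at a score of 3, the lowest confidence a flagged issue can show is 38%.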


@@ -0,0 +1 @@
Add Role, RoleBinding, ClusterRole, and ClusterRoleBinding in restore sequence.


@@ -0,0 +1 @@
Fix issue #9478, add diagnose info on expose peek fails


@@ -0,0 +1 @@
Maintenance Job only uses the first element of the LoadAffinity array


@@ -0,0 +1 @@
Remove backup from running list when backup fails validation


@@ -340,20 +340,16 @@ func (s *nodeAgentServer) run() {
}
}
-var cachePVCConfig *velerotypes.CachePVC
-if s.dataPathConfigs != nil && s.dataPathConfigs.CachePVCConfig != nil {
-if err := s.validateCachePVCConfig(*s.dataPathConfigs.CachePVCConfig); err != nil {
-s.logger.WithError(err).Warnf("Ignore cache config %v", s.dataPathConfigs.CachePVCConfig)
-} else {
-cachePVCConfig = s.dataPathConfigs.CachePVCConfig
-s.logger.Infof("Using cache volume configs %v", s.dataPathConfigs.CachePVCConfig)
-}
-}
+var cachePVCConfig *velerotypes.CachePVC
+if s.dataPathConfigs != nil && s.dataPathConfigs.CachePVCConfig != nil {
+cachePVCConfig = s.dataPathConfigs.CachePVCConfig
+s.logger.Infof("Using customized cachePVC config %v", cachePVCConfig)
+}
var podLabels map[string]string
if s.dataPathConfigs != nil && len(s.dataPathConfigs.PodLabels) > 0 {
podLabels = s.dataPathConfigs.PodLabels
@@ -368,6 +364,8 @@ func (s *nodeAgentServer) run() {
if s.backupRepoConfigs != nil {
s.logger.Infof("Using backup repo config %v", s.backupRepoConfigs)
+} else if cachePVCConfig != nil {
+s.logger.Info("Backup repo config is not provided, using default values for cache volume configs")
}
pvbReconciler := controller.NewPodVolumeBackupReconciler(


@@ -115,7 +115,11 @@ var (
"datauploads.velero.io",
"persistentvolumes",
"persistentvolumeclaims",
+"clusterroles",
+"roles",
"serviceaccounts",
+"clusterrolebindings",
+"rolebindings",
"secrets",
"configmaps",
"limitranges",


@@ -307,6 +307,16 @@ func (b *backupReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctr
backupScheduleName := request.GetLabels()[velerov1api.ScheduleNameLabel]
+b.backupTracker.Add(request.Namespace, request.Name)
+defer func() {
+switch request.Status.Phase {
+case velerov1api.BackupPhaseCompleted, velerov1api.BackupPhasePartiallyFailed, velerov1api.BackupPhaseFailed, velerov1api.BackupPhaseFailedValidation:
+b.backupTracker.Delete(request.Namespace, request.Name)
+case velerov1api.BackupPhaseWaitingForPluginOperations, velerov1api.BackupPhaseWaitingForPluginOperationsPartiallyFailed, velerov1api.BackupPhaseFinalizing, velerov1api.BackupPhaseFinalizingPartiallyFailed:
+b.backupTracker.AddPostProcessing(request.Namespace, request.Name)
+}
+}()
if request.Status.Phase == velerov1api.BackupPhaseFailedValidation {
log.Debug("failed to validate backup status")
b.metrics.RegisterBackupValidationFailure(backupScheduleName)
@@ -318,16 +328,6 @@ func (b *backupReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctr
// store ref to just-updated item for creating patch
original = request.Backup.DeepCopy()
b.backupTracker.Add(request.Namespace, request.Name)
defer func() {
switch request.Status.Phase {
case velerov1api.BackupPhaseCompleted, velerov1api.BackupPhasePartiallyFailed, velerov1api.BackupPhaseFailed, velerov1api.BackupPhaseFailedValidation:
b.backupTracker.Delete(request.Namespace, request.Name)
case velerov1api.BackupPhaseWaitingForPluginOperations, velerov1api.BackupPhaseWaitingForPluginOperationsPartiallyFailed, velerov1api.BackupPhaseFinalizing, velerov1api.BackupPhaseFinalizingPartiallyFailed:
b.backupTracker.AddPostProcessing(request.Namespace, request.Name)
}
}()
log.Debug("Running backup")
b.metrics.RegisterBackupAttempt(backupScheduleName)
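The reordering in this hunk moves the tracker registration and its deferred cleanup ahead of the validation check, so the early return for a failed validation still removes the backup from the running list. A minimal sketch of that pattern follows, assuming a simplified `tracker` type, a hypothetical `reconcile` function, and plain string phases; none of these are Velero's actual code.

```go
package main

import "fmt"

// tracker is a hypothetical stand-in for the backup tracker.
type tracker struct{ running map[string]bool }

func (t *tracker) Add(name string)    { t.running[name] = true }
func (t *tracker) Delete(name string) { delete(t.running, name) }

// reconcile registers the deferred cleanup before the validation check,
// so the early return on a failed validation still untracks the backup.
func reconcile(t *tracker, name string, valid bool) string {
	phase := "InProgress"
	t.Add(name)
	defer func() {
		// Terminal phases remove the entry from the running list.
		if phase == "FailedValidation" || phase == "Completed" {
			t.Delete(name)
		}
	}()

	if !valid {
		phase = "FailedValidation"
		return phase // the deferred cleanup runs on this path too
	}

	phase = "Completed"
	return phase
}

func main() {
	tr := &tracker{running: map[string]bool{}}
	fmt.Println(reconcile(tr, "backup-1", false), len(tr.running))
}
```

If `Add` and the `defer` were placed after the validation check, as in the lines this hunk removes, the early return would skip both.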


@@ -246,6 +246,7 @@ func TestProcessBackupValidationFailures(t *testing.T) {
clock: &clock.RealClock{},
formatFlag: formatFlag,
metrics: metrics.NewServerMetrics(),
backupTracker: NewBackupTracker(),
}
require.NotNil(t, test.backup)


@@ -292,8 +292,14 @@ func (r *DataDownloadReconciler) Reconcile(ctx context.Context, req ctrl.Request
return ctrl.Result{}, nil
} else if dd.Status.Phase == velerov2alpha1api.DataDownloadPhaseAccepted {
if peekErr := r.restoreExposer.PeekExposed(ctx, getDataDownloadOwnerObject(dd)); peekErr != nil {
r.tryCancelDataDownload(ctx, dd, fmt.Sprintf("found a datadownload %s/%s with expose error: %s. mark it as cancel", dd.Namespace, dd.Name, peekErr))
log.Errorf("Cancel dd %s/%s because of expose error %s", dd.Namespace, dd.Name, peekErr)
diags := strings.Split(r.restoreExposer.DiagnoseExpose(ctx, getDataDownloadOwnerObject(dd)), "\n")
for _, diag := range diags {
log.Warnf("[Diagnose DD expose]%s", diag)
}
r.tryCancelDataDownload(ctx, dd, fmt.Sprintf("found a datadownload %s/%s with expose error: %s. mark it as cancel", dd.Namespace, dd.Name, peekErr))
} else if dd.Status.AcceptedTimestamp != nil {
if time.Since(dd.Status.AcceptedTimestamp.Time) >= r.preparingTimeout {
r.onPrepareTimeout(ctx, dd)
@@ -918,7 +924,7 @@ func (r *DataDownloadReconciler) setupExposeParam(dd *velerov2alpha1api.DataDown
cacheVolume = &exposer.CacheConfigs{
Limit: limit,
StorageClass: r.cacheVolumeConfigs.StorageClass,
ResidentThreshold: r.cacheVolumeConfigs.ResidentThreshold,
ResidentThreshold: r.cacheVolumeConfigs.ResidentThresholdInMB << 20,
}
}
}


@@ -561,6 +561,7 @@ func TestDataDownloadReconcile(t *testing.T) {
ep.On("GetExposed", mock.Anything, mock.Anything, mock.Anything, mock.Anything, mock.Anything).Return(nil, nil)
} else if test.isPeekExposeErr {
ep.On("PeekExposed", mock.Anything, mock.Anything, mock.Anything, mock.Anything, mock.Anything).Return(errors.New("fake-peek-error"))
ep.On("DiagnoseExpose", mock.Anything, mock.Anything).Return("")
}
if !test.notMockCleanUp {


@@ -298,8 +298,14 @@ func (r *DataUploadReconciler) Reconcile(ctx context.Context, req ctrl.Request)
return ctrl.Result{}, nil
} else if du.Status.Phase == velerov2alpha1api.DataUploadPhaseAccepted {
if peekErr := ep.PeekExposed(ctx, getOwnerObject(du)); peekErr != nil {
r.tryCancelDataUpload(ctx, du, fmt.Sprintf("found a du %s/%s with expose error: %s. mark it as cancel", du.Namespace, du.Name, peekErr))
log.Errorf("Cancel du %s/%s because of expose error %s", du.Namespace, du.Name, peekErr)
diags := strings.Split(ep.DiagnoseExpose(ctx, getOwnerObject(du)), "\n")
for _, diag := range diags {
log.Warnf("[Diagnose DU expose]%s", diag)
}
r.tryCancelDataUpload(ctx, du, fmt.Sprintf("found a du %s/%s with expose error: %s. mark it as cancel", du.Namespace, du.Name, peekErr))
} else if du.Status.AcceptedTimestamp != nil {
if time.Since(du.Status.AcceptedTimestamp.Time) >= r.preparingTimeout {
r.onPrepareTimeout(ctx, du)


@@ -260,6 +260,12 @@ func (r *PodVolumeBackupReconciler) Reconcile(ctx context.Context, req ctrl.Requ
} else if pvb.Status.Phase == velerov1api.PodVolumeBackupPhaseAccepted {
if peekErr := r.exposer.PeekExposed(ctx, getPVBOwnerObject(pvb)); peekErr != nil {
log.Errorf("Cancel PVB %s/%s because of expose error %s", pvb.Namespace, pvb.Name, peekErr)
diags := strings.Split(r.exposer.DiagnoseExpose(ctx, getPVBOwnerObject(pvb)), "\n")
for _, diag := range diags {
log.Warnf("[Diagnose PVB expose]%s", diag)
}
r.tryCancelPodVolumeBackup(ctx, pvb, fmt.Sprintf("found a PVB %s/%s with expose error: %s. mark it as cancel", pvb.Namespace, pvb.Name, peekErr))
} else if pvb.Status.AcceptedTimestamp != nil {
if time.Since(pvb.Status.AcceptedTimestamp.Time) >= r.preparingTimeout {


@@ -274,6 +274,12 @@ func (r *PodVolumeRestoreReconciler) Reconcile(ctx context.Context, req ctrl.Req
} else if pvr.Status.Phase == velerov1api.PodVolumeRestorePhaseAccepted {
if peekErr := r.exposer.PeekExposed(ctx, getPVROwnerObject(pvr)); peekErr != nil {
log.Errorf("Cancel PVR %s/%s because of expose error %s", pvr.Namespace, pvr.Name, peekErr)
diags := strings.Split(r.exposer.DiagnoseExpose(ctx, getPVROwnerObject(pvr)), "\n")
for _, diag := range diags {
log.Warnf("[Diagnose PVR expose]%s", diag)
}
_ = r.tryCancelPodVolumeRestore(ctx, pvr, fmt.Sprintf("found a PVR %s/%s with expose error: %s. mark it as cancel", pvr.Namespace, pvr.Name, peekErr))
} else if pvr.Status.AcceptedTimestamp != nil {
if time.Since(pvr.Status.AcceptedTimestamp.Time) >= r.preparingTimeout {
@@ -934,7 +940,7 @@ func (r *PodVolumeRestoreReconciler) setupExposeParam(pvr *velerov1api.PodVolume
cacheVolume = &exposer.CacheConfigs{
Limit: limit,
StorageClass: r.cacheVolumeConfigs.StorageClass,
ResidentThreshold: r.cacheVolumeConfigs.ResidentThreshold,
ResidentThreshold: r.cacheVolumeConfigs.ResidentThresholdInMB << 20,
}
}
}


@@ -1024,6 +1024,7 @@ func TestPodVolumeRestoreReconcile(t *testing.T) {
ep.On("GetExposed", mock.Anything, mock.Anything, mock.Anything, mock.Anything, mock.Anything).Return(nil, nil)
} else if test.isPeekExposeErr {
ep.On("PeekExposed", mock.Anything, mock.Anything, mock.Anything, mock.Anything, mock.Anything).Return(errors.New("fake-peek-error"))
ep.On("DiagnoseExpose", mock.Anything, mock.Anything).Return("")
}
if !test.notMockCleanUp {


@@ -1307,6 +1307,7 @@ func Test_csiSnapshotExposer_DiagnoseExpose(t *testing.T) {
Message: "fake-pod-message",
},
},
Message: "fake-pod-message-1",
},
}
@@ -1501,7 +1502,7 @@ end diagnose CSI exposer`,
&backupVSWithoutStatus,
},
expected: `begin diagnose CSI exposer
Pod velero/fake-backup, phase Pending, node name
Pod velero/fake-backup, phase Pending, node name , message fake-pod-message-1
Pod condition Initialized, status True, reason , message fake-pod-message
PVC velero/fake-backup, phase Pending, binding to
VS velero/fake-backup, bind to , readyToUse false, errMessage
@@ -1518,7 +1519,7 @@ end diagnose CSI exposer`,
&backupVSWithoutVSC,
},
expected: `begin diagnose CSI exposer
Pod velero/fake-backup, phase Pending, node name
Pod velero/fake-backup, phase Pending, node name , message fake-pod-message-1
Pod condition Initialized, status True, reason , message fake-pod-message
PVC velero/fake-backup, phase Pending, binding to
VS velero/fake-backup, bind to , readyToUse false, errMessage
@@ -1535,7 +1536,7 @@ end diagnose CSI exposer`,
&backupVSWithoutVSC,
},
expected: `begin diagnose CSI exposer
Pod velero/fake-backup, phase Pending, node name fake-node
Pod velero/fake-backup, phase Pending, node name fake-node, message
Pod condition Initialized, status True, reason , message fake-pod-message
node-agent is not running in node fake-node, err: daemonset pod not found in running state in node fake-node
PVC velero/fake-backup, phase Pending, binding to
@@ -1554,7 +1555,7 @@ end diagnose CSI exposer`,
&backupVSWithoutVSC,
},
expected: `begin diagnose CSI exposer
Pod velero/fake-backup, phase Pending, node name fake-node
Pod velero/fake-backup, phase Pending, node name fake-node, message
Pod condition Initialized, status True, reason , message fake-pod-message
PVC velero/fake-backup, phase Pending, binding to
VS velero/fake-backup, bind to , readyToUse false, errMessage
@@ -1572,7 +1573,7 @@ end diagnose CSI exposer`,
&backupVSWithoutVSC,
},
expected: `begin diagnose CSI exposer
Pod velero/fake-backup, phase Pending, node name fake-node
Pod velero/fake-backup, phase Pending, node name fake-node, message
Pod condition Initialized, status True, reason , message fake-pod-message
PVC velero/fake-backup, phase Pending, binding to fake-pv
error getting backup pv fake-pv, err: persistentvolumes "fake-pv" not found
@@ -1592,7 +1593,7 @@ end diagnose CSI exposer`,
&backupVSWithoutVSC,
},
expected: `begin diagnose CSI exposer
Pod velero/fake-backup, phase Pending, node name fake-node
Pod velero/fake-backup, phase Pending, node name fake-node, message
Pod condition Initialized, status True, reason , message fake-pod-message
PVC velero/fake-backup, phase Pending, binding to fake-pv
PV fake-pv, phase Pending, reason , message fake-pv-message
@@ -1612,7 +1613,7 @@ end diagnose CSI exposer`,
&backupVSWithVSC,
},
expected: `begin diagnose CSI exposer
Pod velero/fake-backup, phase Pending, node name fake-node
Pod velero/fake-backup, phase Pending, node name fake-node, message
Pod condition Initialized, status True, reason , message fake-pod-message
PVC velero/fake-backup, phase Pending, binding to fake-pv
PV fake-pv, phase Pending, reason , message fake-pv-message
@@ -1634,7 +1635,7 @@ end diagnose CSI exposer`,
&backupVSC,
},
expected: `begin diagnose CSI exposer
Pod velero/fake-backup, phase Pending, node name fake-node
Pod velero/fake-backup, phase Pending, node name fake-node, message
Pod condition Initialized, status True, reason , message fake-pod-message
PVC velero/fake-backup, phase Pending, binding to fake-pv
PV fake-pv, phase Pending, reason , message fake-pv-message
@@ -1698,7 +1699,7 @@ end diagnose CSI exposer`,
&backupVSC,
},
expected: `begin diagnose CSI exposer
Pod velero/fake-backup, phase Pending, node name fake-node
Pod velero/fake-backup, phase Pending, node name fake-node, message
Pod condition Initialized, status True, reason , message fake-pod-message
Pod event reason reason-2, message message-2
Pod event reason reason-6, message message-6


@@ -664,6 +664,7 @@ func Test_ReastoreDiagnoseExpose(t *testing.T) {
Message: "fake-pod-message",
},
},
Message: "fake-pod-message-1",
},
}
@@ -815,7 +816,7 @@ end diagnose restore exposer`,
&restorePVCWithoutVolumeName,
},
expected: `begin diagnose restore exposer
Pod velero/fake-restore, phase Pending, node name
Pod velero/fake-restore, phase Pending, node name , message fake-pod-message-1
Pod condition Initialized, status True, reason , message fake-pod-message
PVC velero/fake-restore, phase Pending, binding to
end diagnose restore exposer`,
@@ -828,7 +829,7 @@ end diagnose restore exposer`,
&restorePVCWithoutVolumeName,
},
expected: `begin diagnose restore exposer
Pod velero/fake-restore, phase Pending, node name
Pod velero/fake-restore, phase Pending, node name , message fake-pod-message-1
Pod condition Initialized, status True, reason , message fake-pod-message
PVC velero/fake-restore, phase Pending, binding to
end diagnose restore exposer`,
@@ -841,7 +842,7 @@ end diagnose restore exposer`,
&restorePVCWithoutVolumeName,
},
expected: `begin diagnose restore exposer
Pod velero/fake-restore, phase Pending, node name fake-node
Pod velero/fake-restore, phase Pending, node name fake-node, message
Pod condition Initialized, status True, reason , message fake-pod-message
node-agent is not running in node fake-node, err: daemonset pod not found in running state in node fake-node
PVC velero/fake-restore, phase Pending, binding to
@@ -856,7 +857,7 @@ end diagnose restore exposer`,
&nodeAgentPod,
},
expected: `begin diagnose restore exposer
Pod velero/fake-restore, phase Pending, node name fake-node
Pod velero/fake-restore, phase Pending, node name fake-node, message
Pod condition Initialized, status True, reason , message fake-pod-message
PVC velero/fake-restore, phase Pending, binding to
end diagnose restore exposer`,
@@ -870,7 +871,7 @@ end diagnose restore exposer`,
&nodeAgentPod,
},
expected: `begin diagnose restore exposer
Pod velero/fake-restore, phase Pending, node name fake-node
Pod velero/fake-restore, phase Pending, node name fake-node, message
Pod condition Initialized, status True, reason , message fake-pod-message
PVC velero/fake-restore, phase Pending, binding to fake-pv
error getting restore pv fake-pv, err: persistentvolumes "fake-pv" not found
@@ -886,7 +887,7 @@ end diagnose restore exposer`,
&nodeAgentPod,
},
expected: `begin diagnose restore exposer
Pod velero/fake-restore, phase Pending, node name fake-node
Pod velero/fake-restore, phase Pending, node name fake-node, message
Pod condition Initialized, status True, reason , message fake-pod-message
PVC velero/fake-restore, phase Pending, binding to fake-pv
PV fake-pv, phase Pending, reason , message fake-pv-message
@@ -902,7 +903,7 @@ end diagnose restore exposer`,
&nodeAgentPod,
},
expected: `begin diagnose restore exposer
Pod velero/fake-restore, phase Pending, node name fake-node
Pod velero/fake-restore, phase Pending, node name fake-node, message
Pod condition Initialized, status True, reason , message fake-pod-message
PVC velero/fake-restore, phase Pending, binding to fake-pv
error getting restore pv fake-pv, err: persistentvolumes "fake-pv" not found
@@ -922,7 +923,7 @@ end diagnose restore exposer`,
&nodeAgentPod,
},
expected: `begin diagnose restore exposer
Pod velero/fake-restore, phase Pending, node name fake-node
Pod velero/fake-restore, phase Pending, node name fake-node, message
Pod condition Initialized, status True, reason , message fake-pod-message
PVC velero/fake-restore, phase Pending, binding to fake-pv
PV fake-pv, phase Pending, reason , message fake-pv-message
@@ -975,7 +976,7 @@ end diagnose restore exposer`,
},
},
expected: `begin diagnose restore exposer
Pod velero/fake-restore, phase Pending, node name fake-node
Pod velero/fake-restore, phase Pending, node name fake-node, message
Pod condition Initialized, status True, reason , message fake-pod-message
Pod event reason reason-2, message message-2
Pod event reason reason-5, message message-5


@@ -592,6 +592,7 @@ func TestPodVolumeDiagnoseExpose(t *testing.T) {
Message: "fake-pod-message",
},
},
Message: "fake-pod-message-1",
},
}
@@ -691,7 +692,7 @@ end diagnose pod volume exposer`,
&backupPodWithoutNodeName,
},
expected: `begin diagnose pod volume exposer
Pod velero/fake-backup, phase Pending, node name
Pod velero/fake-backup, phase Pending, node name , message fake-pod-message-1
Pod condition Initialized, status True, reason , message fake-pod-message
end diagnose pod volume exposer`,
},
@@ -702,7 +703,7 @@ end diagnose pod volume exposer`,
&backupPodWithoutNodeName,
},
expected: `begin diagnose pod volume exposer
Pod velero/fake-backup, phase Pending, node name
Pod velero/fake-backup, phase Pending, node name , message fake-pod-message-1
Pod condition Initialized, status True, reason , message fake-pod-message
end diagnose pod volume exposer`,
},
@@ -713,7 +714,7 @@ end diagnose pod volume exposer`,
&backupPodWithNodeName,
},
expected: `begin diagnose pod volume exposer
Pod velero/fake-backup, phase Pending, node name fake-node
Pod velero/fake-backup, phase Pending, node name fake-node, message
Pod condition Initialized, status True, reason , message fake-pod-message
node-agent is not running in node fake-node, err: daemonset pod not found in running state in node fake-node
end diagnose pod volume exposer`,
@@ -726,7 +727,7 @@ end diagnose pod volume exposer`,
&nodeAgentPod,
},
expected: `begin diagnose pod volume exposer
Pod velero/fake-backup, phase Pending, node name fake-node
Pod velero/fake-backup, phase Pending, node name fake-node, message
Pod condition Initialized, status True, reason , message fake-pod-message
end diagnose pod volume exposer`,
},
@@ -739,7 +740,7 @@ end diagnose pod volume exposer`,
&nodeAgentPod,
},
expected: `begin diagnose pod volume exposer
Pod velero/fake-backup, phase Pending, node name fake-node
Pod velero/fake-backup, phase Pending, node name fake-node, message
Pod condition Initialized, status True, reason , message fake-pod-message
PVC velero/fake-backup-cache, phase Pending, binding to fake-pv-cache
error getting cache pv fake-pv-cache, err: persistentvolumes "fake-pv-cache" not found
@@ -755,7 +756,7 @@ end diagnose pod volume exposer`,
&nodeAgentPod,
},
expected: `begin diagnose pod volume exposer
Pod velero/fake-backup, phase Pending, node name fake-node
Pod velero/fake-backup, phase Pending, node name fake-node, message
Pod condition Initialized, status True, reason , message fake-pod-message
PVC velero/fake-backup-cache, phase Pending, binding to fake-pv-cache
PV fake-pv-cache, phase Pending, reason , message fake-pv-message
@@ -797,7 +798,7 @@ end diagnose pod volume exposer`,
},
},
expected: `begin diagnose pod volume exposer
Pod velero/fake-backup, phase Pending, node name fake-node
Pod velero/fake-backup, phase Pending, node name fake-node, message
Pod condition Initialized, status True, reason , message fake-pod-message
Pod event reason reason-2, message message-2
Pod event reason reason-4, message message-4


@@ -18,6 +18,7 @@ limitations under the License.
import (
"context"
"io"
"github.com/kopia/kopia/repo/logging"
"github.com/sirupsen/logrus"
@@ -30,6 +31,10 @@ type kopiaLog struct {
logger logrus.FieldLogger
}
type repoLog struct {
logger logrus.FieldLogger
}
// SetupKopiaLog sets the Kopia log handler to the specific context, Kopia modules
// call the logger in the context to write logs
func SetupKopiaLog(ctx context.Context, logger logrus.FieldLogger) context.Context {
@@ -39,6 +44,10 @@ func SetupKopiaLog(ctx context.Context, logger logrus.FieldLogger) context.Conte
})
}
func RepositoryLogger(logger logrus.FieldLogger) io.Writer {
return &repoLog{logger: logger}
}
// Enabled decides whether a given logging level is enabled when logging a message
func (kl *kopiaLog) Enabled(level zapcore.Level) bool {
entry := kl.logger.WithField("null", "null")
@@ -160,3 +169,9 @@ func (kl *kopiaLog) logrusFieldsForWrite(ent zapcore.Entry, fields []zapcore.Fie
return copied
}
func (rl *repoLog) Write(p []byte) (int, error) {
rl.logger.Debug(string(p))
return len(p), nil
}
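The `repoLog` type above adapts a logger to `io.Writer` so that the repository's content log can be routed into the regular log stream. A self-contained sketch of the same adapter idea follows. The `lineSink` type is a hypothetical stand-in for `logrus.FieldLogger`, and trimming the trailing newline is an addition made here for illustration; the change itself does not trim.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// lineSink is a hypothetical stand-in for a leveled logger.
type lineSink struct{ lines []string }

func (s *lineSink) Debug(msg string) { s.lines = append(s.lines, msg) }

// writerAdapter forwards every Write call to the sink as one debug message.
type writerAdapter struct{ sink *lineSink }

func (w *writerAdapter) Write(p []byte) (int, error) {
	// Assumption for this sketch: drop trailing newlines before logging.
	w.sink.Debug(strings.TrimRight(string(p), "\n"))
	// Report the full length so callers never see a short write.
	return len(p), nil
}

func main() {
	sink := &lineSink{}
	var w io.Writer = &writerAdapter{sink: sink}
	fmt.Fprintln(w, "content log entry")
	fmt.Println(len(sink.lines), sink.lines)
}
```

Returning `len(p)` with a nil error matters because `io.Writer` callers treat a smaller count as a failed write.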


@@ -671,7 +671,8 @@ func buildJob(
}
if config != nil && len(config.LoadAffinities) > 0 {
affinity := kube.ToSystemAffinity(config.LoadAffinities)
// Maintenance job only takes the first loadAffinity.
affinity := kube.ToSystemAffinity([]*kube.LoadAffinity{config.LoadAffinities[0]})
job.Spec.Template.Spec.Affinity = affinity
}


@@ -19,6 +19,7 @@ package kopialib
import (
"context"
"encoding/json"
"io"
"os"
"strings"
"sync/atomic"
@@ -74,7 +75,9 @@ type kopiaObjectWriter struct {
rawWriter object.Writer
}
type openOptions struct{}
type openOptions struct {
repoLogger io.Writer
}
const (
defaultLogInterval = time.Second * 10
@@ -154,7 +157,7 @@ func (ks *kopiaRepoService) Open(ctx context.Context, repoOption udmrepo.RepoOpt
repoCtx := kopia.SetupKopiaLog(ctx, ks.logger)
r, err := openKopiaRepo(repoCtx, repoConfig, repoOption.RepoPassword, nil)
r, err := openKopiaRepo(repoCtx, repoConfig, repoOption.RepoPassword, &openOptions{repoLogger: kopia.RepositoryLogger(ks.logger)})
if err != nil {
return nil, err
}
@@ -199,7 +202,7 @@ func (ks *kopiaRepoService) Maintain(ctx context.Context, repoOption udmrepo.Rep
ks.logger.Info("Start to open repo for maintenance, allow index write on load")
r, err := openKopiaRepo(repoCtx, repoConfig, repoOption.RepoPassword, nil)
r, err := openKopiaRepo(repoCtx, repoConfig, repoOption.RepoPassword, &openOptions{repoLogger: kopia.RepositoryLogger(ks.logger)})
if err != nil {
return err
}
@@ -625,8 +628,10 @@ func (lt *logThrottle) shouldLog() bool {
return false
}
func openKopiaRepo(ctx context.Context, configFile string, password string, _ *openOptions) (repo.Repository, error) {
r, err := kopiaRepoOpen(ctx, configFile, password, &repo.Options{})
func openKopiaRepo(ctx context.Context, configFile string, password string, options *openOptions) (repo.Repository, error) {
r, err := kopiaRepoOpen(ctx, configFile, password, &repo.Options{
ContentLogWriter: options.repoLogger,
})
if os.IsNotExist(err) {
return nil, errors.Wrap(err, "error to open repo, repo doesn't exist")
}


@@ -32,6 +32,7 @@ import (
"github.com/kopia/kopia/repo/maintenance"
"github.com/pkg/errors"
"github.com/vmware-tanzu/velero/pkg/kopia"
"github.com/vmware-tanzu/velero/pkg/repository/udmrepo"
"github.com/vmware-tanzu/velero/pkg/repository/udmrepo/kopialib/backend"
)
@@ -354,7 +355,7 @@ func (b *byteBufferReader) Seek(offset int64, whence int) (int64, error) {
var funcGetParam = maintenance.GetParams
func writeInitParameters(ctx context.Context, repoOption udmrepo.RepoOptions, logger logrus.FieldLogger) error {
r, err := openKopiaRepo(ctx, repoOption.ConfigFilePath, repoOption.RepoPassword, nil)
r, err := openKopiaRepo(ctx, repoOption.ConfigFilePath, repoOption.RepoPassword, &openOptions{repoLogger: kopia.RepositoryLogger(logger)})
if err != nil {
return err
}


@@ -68,10 +68,10 @@ type RestorePVC struct {
type CachePVC struct {
// StorageClass specifies the storage class for cache PVC
StorageClass string
StorageClass string `json:"storageClass,omitempty"`
// ResidentThreshold specifies the minimum size of the backup data to create cache PVC
ResidentThreshold int64
// ResidentThresholdInMB specifies the minimum size of the backup data to create cache PVC
ResidentThresholdInMB int64 `json:"residentThresholdInMB,omitempty"`
}
type NodeAgentConfigs struct {


@@ -140,7 +140,13 @@ func EnsureDeletePod(ctx context.Context, podGetter corev1client.CoreV1Interface
func IsPodUnrecoverable(pod *corev1api.Pod, log logrus.FieldLogger) (bool, string) {
// Check the Phase field
if pod.Status.Phase == corev1api.PodFailed || pod.Status.Phase == corev1api.PodUnknown {
message := GetPodTerminateMessage(pod)
message := ""
if pod.Status.Message != "" {
message += pod.Status.Message + "/"
}
message += GetPodTerminateMessage(pod)
log.Warnf("Pod is in abnormal state %s, message [%s]", pod.Status.Phase, message)
return true, fmt.Sprintf("Pod is in abnormal state [%s], message [%s]", pod.Status.Phase, message)
}
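The message composition added to `IsPodUnrecoverable` can be isolated as a small function. The sketch below is an illustration only: the helper name `abnormalMessage` and its plain string inputs are assumptions, standing in for `pod.Status.Message` and the result of `GetPodTerminateMessage(pod)`.

```go
package main

import "fmt"

// abnormalMessage prefixes the terminate message with the pod status
// message and a "/" separator, but only when the status message is set.
func abnormalMessage(statusMessage, terminateMessage string) string {
	message := ""
	if statusMessage != "" {
		message += statusMessage + "/"
	}
	return message + terminateMessage
}

func main() {
	fmt.Printf("[%s]\n", abnormalMessage("node lost", "OOMKilled"))
	fmt.Printf("[%s]\n", abnormalMessage("", "OOMKilled"))
}
```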
@@ -269,7 +275,7 @@ func ToSystemAffinity(loadAffinities []*LoadAffinity) *corev1api.Affinity {
}
func DiagnosePod(pod *corev1api.Pod, events *corev1api.EventList) string {
diag := fmt.Sprintf("Pod %s/%s, phase %s, node name %s\n", pod.Namespace, pod.Name, pod.Status.Phase, pod.Spec.NodeName)
diag := fmt.Sprintf("Pod %s/%s, phase %s, node name %s, message %s\n", pod.Namespace, pod.Name, pod.Status.Phase, pod.Spec.NodeName, pod.Status.Message)
for _, condition := range pod.Status.Conditions {
diag += fmt.Sprintf("Pod condition %s, status %s, reason %s, message %s\n", condition.Type, condition.Status, condition.Reason, condition.Message)


@@ -925,9 +925,10 @@ func TestDiagnosePod(t *testing.T) {
Message: "fake-message-2",
},
},
Message: "fake-message-3",
},
},
expected: "Pod fake-ns/fake-pod, phase Pending, node name fake-node\nPod condition Initialized, status True, reason fake-reason-1, message fake-message-1\nPod condition PodScheduled, status False, reason fake-reason-2, message fake-message-2\n",
expected: "Pod fake-ns/fake-pod, phase Pending, node name fake-node, message fake-message-3\nPod condition Initialized, status True, reason fake-reason-1, message fake-message-1\nPod condition PodScheduled, status False, reason fake-reason-2, message fake-message-2\n",
},
{
name: "pod with all info and empty event list",
@@ -955,10 +956,11 @@ func TestDiagnosePod(t *testing.T) {
Message: "fake-message-2",
},
},
Message: "fake-message-3",
},
},
events: &corev1api.EventList{},
expected: "Pod fake-ns/fake-pod, phase Pending, node name fake-node\nPod condition Initialized, status True, reason fake-reason-1, message fake-message-1\nPod condition PodScheduled, status False, reason fake-reason-2, message fake-message-2\n",
expected: "Pod fake-ns/fake-pod, phase Pending, node name fake-node, message fake-message-3\nPod condition Initialized, status True, reason fake-reason-1, message fake-message-1\nPod condition PodScheduled, status False, reason fake-reason-2, message fake-message-2\n",
},
{
name: "pod with all info and events",
@@ -987,6 +989,7 @@ func TestDiagnosePod(t *testing.T) {
Message: "fake-message-2",
},
},
Message: "fake-message-3",
},
},
events: &corev1api.EventList{Items: []corev1api.Event{
@@ -1027,7 +1030,7 @@ func TestDiagnosePod(t *testing.T) {
Message: "message-6",
},
}},
expected: "Pod fake-ns/fake-pod, phase Pending, node name fake-node\nPod condition Initialized, status True, reason fake-reason-1, message fake-message-1\nPod condition PodScheduled, status False, reason fake-reason-2, message fake-message-2\nPod event reason reason-3, message message-3\nPod event reason reason-6, message message-6\n",
expected: "Pod fake-ns/fake-pod, phase Pending, node name fake-node, message fake-message-3\nPod condition Initialized, status True, reason fake-reason-1, message fake-message-1\nPod condition PodScheduled, status False, reason fake-reason-2, message fake-message-2\nPod event reason reason-3, message message-3\nPod event reason reason-6, message message-6\n",
},
}


@@ -17,9 +17,8 @@ Velero supports storage providers for both cloud-provider environments and on-pr
### Velero on Windows
Velero does not officially support Windows. In testing, the Velero team was able to back up stateless Windows applications only. The File System Backup and backups of stateful applications or PersistentVolumes were not supported.
If you want to perform your own testing of Velero on Windows, you must deploy Velero as a Windows container. Velero does not provide official Windows images, but it's possible for you to build your own Velero Windows container image to use. Note that you must build this image on a Windows node.
Velero supports backing up and restoring Windows workloads, both stateless and stateful.
Velero node-agent and data mover pods can run on Windows nodes. To stay compatible with the existing Velero plugins, the Velero server runs on Linux nodes only, so Velero requires at least one Linux node in the cluster. Velero provides Windows images for specific Windows versions. For more information, see [Backup Restore Windows Workloads][6].
## Install the CLI
@@ -71,3 +70,4 @@ Please refer to [this part of the documentation][5].
[3]: overview-plugins.md
[4]: customize-installation.md#install-an-additional-volume-snapshot-provider
[5]: customize-installation.md#optional-velero-cli-configurations
[6]: backup-restore-windows.md


@@ -42,6 +42,46 @@ A command to do this is `make new-changelog CHANGELOG_BODY="Changes you have mad
If a PR does not warrant a changelog, the CI check for a changelog can be skipped by applying a `changelog-not-required` label on the PR. If you are making a PR on a release branch, you should still make a new file in the `changelogs/unreleased` folder on the release branch for your change.
## AI-Generated Content
We welcome contributions from all developers, including those who use AI tools to assist in their work. However, to maintain code quality and ensure contributions are accurate and appropriate, please follow these guidelines:
### Using AI Assistance
**Acceptable use:**
- Using AI tools (like GitHub Copilot, ChatGPT, Claude, etc.) to generate scaffolding or boilerplate code
- Getting AI assistance for writing unit tests
- Using AI to help understand complex code patterns
- AI-assisted debugging and problem-solving
- Using AI to help with documentation writing
**Requirements when using AI:**
1. **Always review and verify** AI-generated content before submitting
2. **Test thoroughly** - ensure the code works as expected in your environment
3. **Verify technical accuracy** - check that all version numbers, configurations, and technical details are correct
4. **Remove placeholders** - ensure there is no example or placeholder content
5. **Understand the code** - be able to explain and defend your changes during code review
6. **Disclose AI usage** - if a significant portion of your PR was AI-generated, mention it in the PR description
### What to Avoid
**Unacceptable practices:**
- Submitting entirely AI-generated PRs or issues without review or verification
- Including hallucinated information (false version numbers, non-existent APIs, etc.)
- Copying AI-generated content with placeholder or example data
- Submitting AI-generated issues describing problems you haven't actually experienced
- Using AI to generate issues about features or bugs without verifying they exist
### For Issues
When creating issues with AI assistance:
- Ensure the issue describes a **real problem** you have experienced
- Verify all version numbers, error messages, and configurations are from your actual environment
- Remove any AI-generated boilerplate or overly formal structure
- Focus on clarity and accuracy over comprehensive formatting
Issues that appear to be entirely AI-generated without proper verification may be labeled as `potential-ai-generated` and flagged for additional review.
## Copyright header
Whenever a source code file is being modified, the copyright notice should be updated to our standard copyright notice. That is, it should read “Copyright the Velero contributors.”


@@ -16,7 +16,7 @@ A sample of cache PVC configuration as part of the ConfigMap would look like:
```json
{
"cachePVC": {
"thresholdInGB": 1,
"residentThresholdInMB": 1024,
"storageClass": "sc-wffc"
}
}
@@ -29,7 +29,7 @@ kubectl create cm node-agent-config -n velero --from-file=<json file name>
A must-have field in the configuration is `storageClass`, which tells Velero which storage class to use to provision the cache PVC. Velero relies on the Kubernetes dynamic provisioning process to provision the PVC; static provisioning is not supported.
The cache PVC behavior can be further fine-tuned through `thresholdInGB`. Its value is compared to the size of the backup: if the backup is smaller than this value, no cache PVC is created when restoring from the backup. This ensures that cache PVCs are not created needlessly when the backup is small enough to fit in the data mover pods' root disk.
The cache PVC behavior can be further fine-tuned through `residentThresholdInMB`. Its value is compared to the size of the backup: if the backup is smaller than this value, no cache PVC is created when restoring from the backup. This ensures that cache PVCs are not created needlessly when the backup is small enough to fit in the data mover pods' root disk.
This configuration decides whether and how to provision cache PVCs, but it doesn't decide their size. Instead, the size is decided by the specific backup repository. Specifically, Velero asks the backup repository for a cache limit and uses this limit to calculate the cache PVC size.
The cache limit is decided by the backup repository itself. For a Kopia repository, if `cacheLimitMB` is specified in the backup repository configuration, its value is used; otherwise, a default limit (5 GB) is used.
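As a rough sketch of the threshold check described above, the value in MB can be converted to bytes with a 20-bit shift (the same `ResidentThresholdInMB << 20` expression used in this change) and compared to the backup size. The helper name `needsCachePVC` and the treatment of the exact-equality case are assumptions for illustration, not Velero's actual code.

```go
package main

import "fmt"

// needsCachePVC is a hypothetical helper: it reports whether a cache PVC
// would be created for a backup of backupSizeBytes, given a threshold in MB.
// Assumption: only backups smaller than the threshold skip the cache PVC.
func needsCachePVC(backupSizeBytes, residentThresholdInMB int64) bool {
	thresholdBytes := residentThresholdInMB << 20 // MB to bytes (1 MB = 2^20 bytes)
	return backupSizeBytes >= thresholdBytes
}

func main() {
	const threshold = 1024 // residentThresholdInMB from the sample above

	fmt.Println(needsCachePVC(512<<20, threshold)) // 512 MB backup
	fmt.Println(needsCachePVC(2<<30, threshold))   // 2 GB backup
}
```

Expressing the threshold in MB rather than GB allows finer-grained values while keeping the configuration an integer.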


@@ -77,19 +77,6 @@ data:
},
"keepLatestMaintenanceJobs": 1,
"loadAffinity": [
{
"nodeSelector": {
"matchExpressions": [
{
"key": "cloud.google.com/machine-family",
"operator": "In",
"values": [
"e2"
]
}
]
}
},
{
"nodeSelector": {
"matchExpressions": [
@@ -119,10 +106,10 @@ data:
}
EOF
```
This sample showcases two affinity configurations:
- matchLabels: maintenance job runs on nodes with label key `cloud.google.com/machine-family` and value `e2`.
Notice: although loadAffinity is an array, Velero only takes the first element of the array.
This sample showcases how to use affinity configuration:
- matchLabels: maintenance job runs on nodes located in `us-central1-a`, `us-central1-b` and `us-central1-c`.
The nodes matching one of the two conditions are selected.
To create the configMap, users need to save something like the sample above to a JSON file and then run the command below:
```


@@ -17,9 +17,8 @@ Velero supports storage providers for both cloud-provider environments and on-pr
### Velero on Windows
Velero does not officially support Windows. In testing, the Velero team was able to backup stateless Windows applications only. The File System Backup and backups of stateful applications or PersistentVolumes were not supported.
If you want to perform your own testing of Velero on Windows, you must deploy Velero as a Windows container. Velero does not provide official Windows images, but its possible for you to build your own Velero Windows container image to use. Note that you must build this image on a Windows node.
Velero supports backing up and restoring Windows workloads, both stateless and stateful.
Velero node-agent and data mover pods can run on Windows nodes. To remain compatible with existing Velero plugins, the Velero server runs on Linux nodes only, so the cluster must have at least one Linux node. Velero provides Windows images for specific Windows versions. For more information, see [Backup Restore Windows Workloads][6].
## Install the CLI
@@ -71,3 +70,4 @@ Please refer to [this part of the documentation][5].
[3]: overview-plugins.md
[4]: customize-installation.md#install-an-additional-volume-snapshot-provider
[5]: customize-installation.md#optional-velero-cli-configurations
[6]: backup-restore-windows.md


@@ -76,7 +76,7 @@ toc:
- page: Data Movement Node Selection Configuration
url: /data-movement-node-selection
- page: Data Movement Cache PVC Configuration
url: /data-movement-cache-volume.md
url: /data-movement-cache-volume
- page: Node-agent Concurrency
url: /node-agent-concurrency
- title: Plugins