DataUpload Fails on Kubernetes 1.29 due to changed VSC SourceVolumeMode #8259

Open
msfrucht opened this issue Oct 3, 2024 · 1 comment
msfrucht commented Oct 3, 2024

What steps did you take and what happened:

Performed a DataUpload using Velero 1.14.1 on Kubernetes 1.29/OpenShift 4.16.

The DataUpload fails with the following error from the node-agent:

2024-10-03T09:21:23Z ERROR Reconciler error {"controller": "dataupload", "controllerGroup": "velero.io", "controllerKind": "DataUpload", "DataUpload": {"name":"be9184c2-b547-46f6-a4c0-c6a20d96e7e0-1","namespace":"ibm-backup-restore"}, "namespace": "ibm-backup-restore", "name": "be9184c2-b547-46f6-a4c0-c6a20d96e7e0-1", "reconcileID": "ea068f8e-f12b-4e14-8e69-ee44e8d72e19", "error": "error to delete volume snapshot content: error to assure VolumeSnapshotContent is deleted, snapcontent-f3d87ab0-5db8-49d9-bd91-f86c089be222: error to get VolumeSnapshotContent snapcontent-f3d87ab0-5db8-49d9-bd91-f86c089be222: client rate limiter Wait returned an error: context deadline exceeded", "errorVerbose": "client rate limiter Wait returned an error: context deadline exceeded\nerror to get VolumeSnapshotContent snapcontent-f3d87ab0-5db8-49d9-bd91-f86c089be222\ngithub.com/vmware-tanzu/velero/pkg/util/csi.EnsureDeleteVSC.func1\n\t/go/src/github.com/vmware-tanzu/velero/pkg/util/csi/volume_snapshot.go:229

The underlying failure appears in the Ceph RBD CSI driver logs and in the snapshot-controller webhook pod:

E1003 17:01:23.202234       1 snapshot_controller.go:124] checkandUpdateContentStatus [snapcontent-f3d87ab0-5db8-49d9-bd91-f86c089be222]: error occurred failed to remove VolumeSnapshotBeingCreated annotation on the content snapcontent-f3d87ab0-5db8-49d9-bd91-f86c089be222: "snapshot controller failed to update snapcontent-f3d87ab0-5db8-49d9-bd91-f86c089be222 on API server: admission webhook \"volumesnapshotclasses.snapshot.storage.k8s.io\" denied the request: Spec.SourceVolumeMode is immutable but was changed from Filesystem to nil"
E1003 17:01:23.202271       1 snapshot_controller_base.go:265] could not sync content "snapcontent-f3d87ab0-5db8-49d9-bd91-f86c089be222": failed to remove VolumeSnapshotBeingCreated annotation on the content snapcontent-f3d87ab0-5db8-49d9-bd91-f86c089be222: "snapshot controller failed to update snapcontent-f3d87ab0-5db8-49d9-bd91-f86c089be222 on API server: admission webhook \"volumesnapshotclasses.snapshot.storage.k8s.io\" denied the request: Spec.SourceVolumeMode is immutable but was changed from Filesystem to nil"

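During creation of the backup VolumeSnapshotContent (VSC), the field Spec.SourceVolumeMode is not copied, which results in the failure. Newer versions of the snapshot-controller validate the SourceVolumeMode field against the previous version of the object and reject the update when it differs, as the webhook message above shows ("changed from Filesystem to nil").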
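
To illustrate the idea, here is a minimal Go sketch of what copying the field might look like, together with an approximation of the immutability rule implied by the webhook message. It is not Velero's or the snapshot-controller's actual code: `VSCSpec`, `buildBackupVSCSpec`, and `checkSourceVolumeModeImmutable` are hypothetical names, and the types are simplified stand-ins so the example compiles without the Kubernetes or external-snapshotter modules.

```go
package main

import "fmt"

// PersistentVolumeMode is a local stand-in for the Kubernetes core/v1 type.
type PersistentVolumeMode string

const PersistentVolumeFilesystem PersistentVolumeMode = "Filesystem"

// VSCSpec is a simplified, hypothetical stand-in for the spec of a
// VolumeSnapshotContent. Only the fields needed for the illustration exist.
type VSCSpec struct {
	Driver           string
	SnapshotHandle   *string
	SourceVolumeMode *PersistentVolumeMode
}

// buildBackupVSCSpec (hypothetical) builds the spec for the backup VSC from
// the original one and carries SourceVolumeMode over instead of leaving it nil.
func buildBackupVSCSpec(src VSCSpec) VSCSpec {
	dst := VSCSpec{Driver: src.Driver}
	if src.SnapshotHandle != nil {
		h := *src.SnapshotHandle
		dst.SnapshotHandle = &h
	}
	if src.SourceVolumeMode != nil {
		m := *src.SourceVolumeMode // copy the value, not the pointer
		dst.SourceVolumeMode = &m
	}
	return dst
}

// modeString renders a possibly-nil mode for messages.
func modeString(m *PersistentVolumeMode) string {
	if m == nil {
		return "nil"
	}
	return string(*m)
}

// checkSourceVolumeModeImmutable (hypothetical) approximates the rule implied
// by the webhook message: once set, the mode may not change or be dropped.
// Assumption: an unset old value places no constraint on the new one.
func checkSourceVolumeModeImmutable(oldSpec, newSpec VSCSpec) error {
	if oldSpec.SourceVolumeMode == nil {
		return nil
	}
	if newSpec.SourceVolumeMode == nil || *newSpec.SourceVolumeMode != *oldSpec.SourceVolumeMode {
		return fmt.Errorf("Spec.SourceVolumeMode is immutable but was changed from %s to %s",
			modeString(oldSpec.SourceVolumeMode), modeString(newSpec.SourceVolumeMode))
	}
	return nil
}

func main() {
	mode := PersistentVolumeFilesystem
	handle := "example-handle" // placeholder value, not from the logs
	original := VSCSpec{Driver: "example.csi.driver", SnapshotHandle: &handle, SourceVolumeMode: &mode}

	// Dropping the field reproduces the kind of rejection seen in the logs.
	withoutMode := VSCSpec{Driver: original.Driver, SnapshotHandle: original.SnapshotHandle}
	fmt.Println("without copy:", checkSourceVolumeModeImmutable(original, withoutMode))

	// Copying the field keeps the two specs consistent.
	fmt.Println("with copy:", checkSourceVolumeModeImmutable(original, buildBackupVSCSpec(original)))
}
```

The copy takes the value rather than sharing the pointer, so later changes to the source object cannot alter the backup spec.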

What did you expect to happen:
DataUpload to succeed.

If you are using velero v1.7.0+:
The Velero Backup object has been deleted. I only have access to the velero and node-agent logs. The node-agent logs were the only ones of value for this issue.

node-agent-logs.zip

Anything else you would like to add:

Webhook logs.
webhook-logs.zip

Environment:

  • Velero version (use velero version): Velero 1.14
  • Velero features (use velero client config get features): EnableCSI
  • Kubernetes version (use kubectl version): 1.29
  • Kubernetes installer & version: unknown
  • Cloud provider or hardware configuration: Red Hat OpenShift 4.16
  • OS (e.g. from /etc/os-release): Red Hat Core

Vote on this issue!

This is an invitation to the Velero community to vote on issues, you can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.

  • 👍 for "I would like to see this bug fixed as soon as possible"
  • 👎 for "There are more important bugs to focus on right now"