From YouTube: Velero Community Meeting/Open Discussion - March 31, 2020

Description

March 31, 2020

Status Updates

[nrb]
Testing https://github.com/vmware-tanzu/velero/pull/2323 with review feedback
Adds the --features flag to the plugin framework so plugins can use it; still bugs to chase down (informer cache issues)
Led to https://github.com/vmware-tanzu/velero-plugin-for-gcp/pull/23
I created https://github.com/nrb/velero-csi-env based on Ashish’s script for setting up his host path driver
Reviewed https://github.com/vmware-tanzu/velero/pull/2373
[steve]
Lots of reviews (CSI, cacert, Azure storage keys)
started looking at two-stage snapshot + backup process again (more discussion below)
[carlisia]
Community support this week + PR reviews + triaging new issues
Tested the new Helm 3 chart with Helm 2 and Helm 3; PR merged.
Might write a blog post about how to connect Velero with a service
Community: please review the CLI install/config redesign PR: https://github.com/vmware-tanzu/velero/pull/2202
[ashish]
CSI
Update on VolumeSnapshotRef.UID issue
Discussion w/ CSI folks on the usage of secrets in volumesnapshotclass and volumesnapshotcontents
WIP
Opened Issue 2371: Which VolumeSnapshotContents to include in the backup
Building a catalog of blueprints/workflows for backing up and restoring stateful applications/databases. Survey to go out shortly.
[brito-rafa]
Will quickly screenshare and show an example of what a backup tarball looks like with all API groups and versions (as in PR #2373)

Discussion Topics

There seems to be high interest in backing up volumes by default. Should/could we prioritize this?
Here’s a request to add an operator to our docs: https://github.com/vmware-tanzu/velero/issues/2375
Here’s a related issue: https://github.com/vmware-tanzu/velero/issues/605
Here’s another: https://github.com/vmware-tanzu/velero/issues/1871
[steve] two-stage snapshot + backup process design (https://github.com/vmware-tanzu/velero/issues/1519)
Problem Statement
Velero does not wait/check for snapshots to be made durable/restorable
A backup marked as completed may still be at risk if the snapshot data is not made durable
A restore may fail if its backup’s snapshots are not yet ready to be restored from
Things to Consider
Do we want to solve this for Velero snapshots? CSI snapshots? Both?
There’s potentially a difference between “durable” and “ready to restore” that needs to be modeled
With EBS/GCP, the snapshot can’t be restored until it’s been made durable by replicating the data to object storage, so they’re effectively the same thing
With the vSphere plugin and maybe the OpenEBS plugin, a local snapshot can be restored from, even if it hasn’t yet been made durable
Should Velero actively drive the upload process, or should it passively check for upload status as reported by an external component?
We don’t want to block the Velero backup queue while waiting for snapshot data to be made durable
We need to keep the timespan between pre- and post-hooks as short as possible, i.e. data replication should take place outside of hook execution
Since it may take a significant amount of time to make a snapshot durable, we need to be able to handle pod restarts, network interruptions, etc. gracefully.
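To make the considerations above concrete, here is a minimal sketch (not Velero code) of modeling "durable" and "ready to restore" as separate flags, with a passive, non-blocking status check. Every name in it (SnapshotStatus, backup_is_durable, backup_is_restorable, poll_once, and the check callback standing in for an external component) is hypothetical and only illustrates the discussion.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class SnapshotStatus:
    """Hypothetical per-snapshot status; not a Velero type."""
    durable: bool      # data has been replicated to durable storage
    restorable: bool   # a restore could start from this snapshot now


def backup_is_durable(statuses: Iterable[SnapshotStatus]) -> bool:
    # A backup is only safe once every snapshot is durable.
    # (A backup with no snapshots is trivially durable.)
    return all(s.durable for s in statuses)


def backup_is_restorable(statuses: Iterable[SnapshotStatus]) -> bool:
    # A restore should only start once every snapshot is ready.
    return all(s.restorable for s in statuses)


def poll_once(
    tracked: Dict[str, SnapshotStatus],
    check: Callable[[str], SnapshotStatus],
) -> Dict[str, SnapshotStatus]:
    """One passive polling pass over the tracked snapshots.

    `check` stands in for whatever external component reports upload
    status (an assumption). The function returns immediately instead of
    waiting, so the caller can re-run it later without blocking a queue,
    and the returned dict can be persisted to survive pod restarts.
    """
    updated = dict(tracked)
    for snapshot_id, status in tracked.items():
        if not status.durable:
            updated[snapshot_id] = check(snapshot_id)
    return updated


if __name__ == "__main__":
    tracked = {
        # EBS/GCP-like: not restorable until durable.
        "cloud-snap": SnapshotStatus(durable=False, restorable=False),
        # Local-snapshot-like: restorable before it is durable.
        "local-snap": SnapshotStatus(durable=False, restorable=True),
    }
    print(backup_is_durable(tracked.values()))
    print(backup_is_restorable([tracked["local-snap"]]))
    tracked = poll_once(tracked, lambda _id: SnapshotStatus(True, True))
    print(backup_is_durable(tracked.values()))
```

Keeping the two flags separate lets the EBS/GCP case (both flip together) and the local-snapshot case (restorable first, durable later) share one model. The polling pass runs outside hook execution, so the snapshot itself is the only work that has to happen between pre- and post-hooks.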
[nolan] v1.3.2 release this week
https://github.com/vmware-tanzu/velero/pull/2350 for plugin dir in object storage
Anything else?
[Mayank] can we upgrade from 1.0.0 to 1.3.1? https://kubernetes.slack.com/archives/C6VCGP4MT/p1585647294073200
[Dylan] Concurrent Backup/Restores
https://github.com/vmware-tanzu/velero/issues/487

Contributor Shoutouts

@mansam for adding support for custom cert bundles (https://github.com/vmware-tanzu/velero/pull/2353, https://github.com/vmware-tanzu/velero-plugin-for-aws/pull/34 and more)

@jaygridley - Azure: support using static storage keys (https://github.com/vmware-tanzu/velero-plugin-for-microsoft-azure/pull/32)

Shoutout to Jonas for adding search to our docs!


Helm chart

@yurinnick for making the chart Helm 3 compatible (https://github.com/vmware-tanzu/helm-charts/pull/81)
@yurinnick for fixing the chart-testing timeout parameter (https://github.com/vmware-tanzu/helm-charts/pull/85)