From YouTube: Using GitOps for Kubernetes Reliability at Scale - Uma Mukkara, ChaosNative

Description

GitOps is typically seen as a practice for keeping infrastructure changes and infrastructure configuration management in order. However, there is another interesting use case for GitOps: chaos testing large-scale deployments whenever the application changes. Complete reliability is a complex subject, and it requires carefully designed chaos engineering practices and an SRE focus. In large-scale systems where chaos engineering is scaled up, the chaos scenarios themselves grow large enough that maintaining and applying them becomes a challenge. One solution is to maintain and apply them through GitOps. In this lightning talk, I discuss a case study of how GitOps is used in a large customer environment to automate chaos engineering using LitmusChaos and FluxCD. I first present the challenge of managing chaos experiments when multiple team members and teams are involved. Then I discuss the schematic representation of how GitOps was structured between the application change and the chaos experiments. Finally, I summarise the GitOps best practices used to overcome these challenges.