From YouTube: Sponsored Session: Salesforce - Using Sloop for Monitoring Highly Available Services

Description

Speakers: Sana Jawad, Hemanth Siddulugari

A Kubernetes cluster’s state is ephemeral by nature, and workloads can run on any node or pod. There are great tools available for visualizing the “current” state of a cluster. But live site incidents are often mitigated first, with root cause analysis left for later. This is particularly challenging for incidents caused by Kubernetes events, since these events are only available on the cluster for one hour. After that, the only way to debug what happened is to correlate logs and timelines from the control plane, which makes finding the root cause harder and increases MTTR. Sloop provides a one-stop solution with a single pane of glass that shows a historic view of the cluster. In this demo, we will share the top real incidents for which we found the root cause in a matter of minutes using Sloop.