From YouTube: When Prometheus Can’t Take the Load Anymore - Liron Cohen, Riskified




Riskified started out with a pair of Prometheus servers in each of its clusters, but soon enough Prometheus couldn’t take the load anymore. Once that happened, the SRE team set out to find the best tool for running multiple Prometheus servers with high availability (HA) and long-term storage. They decided to evaluate Thanos, Cortex, and M3. In this session, Liron will share her takeaways on the different tools: which one provides the best performance and high availability, which is the most cost-effective, and which is the easiest to deploy and operate.
By the end, you’ll get a better understanding of the different tools and which one is the best solution for your use case.