From YouTube: Lightning Talk: Learning from Cortex to Improve Promscale HA - Matvey Arye, Timescale

Description



Deploying Prometheus high-availability (HA) replicas is critical for robust production systems because they protect against the crash of any one server. Promscale has supported ingesting and deduplicating data from Prometheus HA replicas since its first release, but our original method was based on database locks, which led to complex deployments, caused problems with scalability and coupling, and was less resilient to certain kinds of failures. Our new system, which takes inspiration from Cortex, solves these issues and makes Promscale both easier to use and more robust. In this talk, we will discuss how our understanding of Prometheus HA support has evolved, and use that experience as a lens for building a mental model of how Prometheus HA works and how users should think about a robust end-to-end HA solution. We will first cover the guarantees Prometheus HA aims to achieve and the correctness properties involved. Next, we’ll cover how the services in a Prometheus HA setup connect together and how each component can provide robustness. Finally, we’ll discuss some interesting edge cases that came up when designing our HA solution.
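
For readers unfamiliar with the idea, the lease-style deduplication usually associated with Cortex-inspired HA handling can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not Promscale's or Cortex's actual code: the `HATracker` class, the `accept` method, and the `failover_timeout` parameter are hypothetical names, and a real system would keep the lease in shared storage rather than in process memory. The assumed behavior is that one replica per cluster is treated as the leader, samples from other replicas are dropped, and another replica takes over only after the leader has been silent for longer than the timeout.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Lease:
    leader: str       # replica currently accepted for this cluster (hypothetical field)
    last_seen: float  # time, in seconds, of the leader's most recent accepted write


class HATracker:
    """Hypothetical in-memory tracker: one accepted replica per cluster."""

    def __init__(self, failover_timeout: float = 30.0) -> None:
        # Assumed parameter: how long the leader may stay silent before failover.
        self.failover_timeout = failover_timeout
        self.leases: Dict[str, Lease] = {}

    def accept(self, cluster: str, replica: str, now: float) -> bool:
        """Return True if a write from `replica` should be stored, False if dropped."""
        lease = self.leases.get(cluster)
        if lease is None:
            # First replica seen for this cluster becomes the leader.
            self.leases[cluster] = Lease(replica, now)
            return True
        if replica == lease.leader:
            # The leader refreshes its own lease with every write.
            lease.last_seen = now
            return True
        if now - lease.last_seen > self.failover_timeout:
            # Leader has been silent too long: fail over to this replica.
            self.leases[cluster] = Lease(replica, now)
            return True
        # A non-leader write while the leader is healthy is a duplicate; drop it.
        return False


tracker = HATracker(failover_timeout=30.0)
results = [
    tracker.accept("prod", "replica-a", 0.0),   # first writer becomes leader
    tracker.accept("prod", "replica-b", 5.0),   # duplicate from the other replica
    tracker.accept("prod", "replica-a", 10.0),  # leader refreshes its lease
    tracker.accept("prod", "replica-b", 41.0),  # leader silent for 31 s: failover
    tracker.accept("prod", "replica-a", 42.0),  # old leader is now the duplicate
]
print(results)
```

In this sketch the timeout trades failover speed against the risk of switching leaders during a brief network hiccup, which is the kind of edge case the abstract alludes to.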