From YouTube: High Available + Scalable Prometheus with Thanos in Alibaba - Guo'an Qin, Alibaba & Tao Li, Alibaba

Description

Alibaba Group uses Kubernetes to support the world's largest e-commerce business. At that scale, providing reliable, fine-grained monitoring and alerting services that stay available and scalable is a real challenge. In this talk, we'll share our experience developing a fine-grained, highly available, and scalable monitoring system based on the open source projects Prometheus and Thanos. The system mainly supports Alibaba's cluster management system, which handles 4 million TPS and 10K requests per second. We will discuss the following topics: 1) How can Prometheus support large-scale scenarios? 2) How can Thanos solve the data query problem caused by multiple Prometheus instances while keeping query latency low? 3) The lessons we learned from configuring Prometheus and Thanos, such as target discovery and the management of recording rules and alerting rules.

https://sched.co/NroK