From YouTube: Large-Scale K8s Cluster Operation and Management - Lv Jiangzhao, JD

Description

JDOS (JD Datacenter Operation System) is a very large-scale container cluster system that runs in JD's datacenters across the world. It was designed and developed based on Kubernetes. Today, almost all of JD's business is deployed and running on JDOS, and the number of containers in JD's production environment has reached the millions. Managing clusters at this scale is a challenge for JDOS developers and operators, yet JD has only 2 full-time SREs to manage the clusters. This presentation will share experience in the following areas: 1. Detection and management of node components; 2. Fault detection and failure recovery for master components, especially the etcd nodes; 3. How to significantly reduce apiserver requests in order to build a much larger k8s cluster.

To learn more, click here: https://sched.co/FuKA