youtube image
From YouTube: Supporting Long-Lived Pods Using a Simple Kubernetes Webhook - Clément Labbe, Slack

Description

Don’t miss out! Join us at our upcoming hybrid event: KubeCon + CloudNativeCon North America 2022 from October 24-28 in Detroit (and online!). Learn more at https://kubecon.io. The conference features presentations from developers and end users of Kubernetes, Prometheus, Envoy, and all of the other CNCF-hosted projects.

Supporting Long-Lived Pods Using a Simple Kubernetes Webhook - Clément Labbe, Slack

Today's applications strive to boot fast, be stateless, and handle unexpected terminations gracefully. However, some applications like distributed caches can take a while to warm up to a running state, while batch workers would rather avoid being terminated before they're done. At Slack, such applications found their home in Kubernetes thanks to a two-sided system: one one hand an admission webhook injects tolerations in pods to inform their requirement to be long-lived, and on the other hand a custom service taints nodes with their uptime. This results in pods desiring a long life to be scheduled on young nodes less likely to be terminated early. This talk will first describe how to write a simple Kubernetes admission webhook (https://github.com/slackhq/simple-kubernetes-webhook) to inject tolerations in pods, then move onto the symbiotic node tainting system, and end with gotchas and some metrics on how this long-lived pod support is used at Slack.