From YouTube: Elephant on Wheels: Petabyte-scale AI @ LinkedIn - Cong Gu & Abin Shahab, LinkedIn




Kubernetes has flourished at LinkedIn for AI workloads. It started as a proof of concept for Jupyter notebooks and has since become key infrastructure for model training and model serving. LinkedIn AI has traditionally been Hadoop/YARN-based, and its Hadoop data lake is one of the world's largest. To allow AI and non-AI workloads to securely access HDFS, Kube2Hadoop, a scalable, secure, open-source integration with HDFS Kerberos, was built. It enables AI modelers at LinkedIn to use data securely in model exploration and training with Kubeflow components such as the mpi-operator. LinkedIn’s infra teams are also prototyping a multilevel scheduler on top of Kubernetes and YARN clusters in the cloud, which can intelligently route jobs to multiple clusters and facilitate workflows that span Kubernetes and YARN clusters.