From YouTube: Building and Managing a Centralized ML Platform with Kubeflow at... Ricardo Rocha & Dejan Golubovic

Description


Building and Managing a Centralized ML Platform with Kubeflow at CERN - Ricardo Rocha & Dejan Golubovic, CERN

CERN’s main mission is to expand human knowledge by trying to understand the nature of the universe, and machine learning has been growing as a solution to challenges in different areas of development and operations. Areas where ML is being explored include particle classification using graph neural networks during reconstruction, 3D GANs for much faster generation of simulation data, and reinforcement learning for beam calibration. This session presents a recently introduced centralized service that covers most use cases, handling data preparation, model training, and serving. It also explains how the service tries to improve resource usage (especially important when handling scarce resources such as accelerators) by offering different resource types (GPU, vGPU, TPU) for each use case. The session will also describe our journey with Kubeflow, the machine learning platform running on top of Kubernetes, how we integrated on-premises resources, and the different possibilities being looked at to extend to public clouds.