youtube image
From YouTube: Accelerating High-Performance Machine Learning at Scale i... Alejandro Saucedo & Elena Neroslavskaya

Description

Don’t miss out! Join us at our upcoming hybrid event: KubeCon + CloudNativeCon North America 2022 from October 24-28 in Detroit (and online!). Learn more at https://kubecon.io. The conference features presentations from developers and end users of Kubernetes, Prometheus, Envoy, and all of the other CNCF-hosted projects.

Accelerating High-Performance Machine Learning at Scale in Kubernetes - Alejandro Saucedo, The Institute for Ethical AI & Machine Learning & Elena Neroslavskaya, Microsoft

Identifying the right tools for high-performance production machine learning may be overwhelming as the ecosystem continues to grow at break-neck speed. In this industry collaboration we aim to provide a hands-on guide on how practitioners can productionize optimized machine learning models in cloud native ecosystems using production-ready open source frameworks. We will dive into a practical use-case, deploying the renowned GPT-2 NLP machine learning model in Kubernetes leveraging the ONNX Runtime from the Seldon Core Triton server, which will provide us with a scalable production NLP microservice serving the ML model that can power intelligent text generation applications. We will present some of the key challenges currently being faced in the MLOps space, as well as how each of the tools in the stack interoperate throughout the production machine learning lifecycle.