Cloud Native Computing Foundation / Kubernetes AI Day EU 2022



These are all the meetings we have in "Kubernetes AI Day EU 2022" (part of the organization "Cloud Native Computing Foundation"). Click into individual meeting pages to watch the recording and search or read the transcript.

19 May 2022

A Component Registry for Kubeflow Pipelines - Christian Kadner, IBM

Kubeflow Pipelines are widely used to orchestrate machine learning (ML) workflows on Kubernetes. Pipelines and individual pipeline stages are often worked on collaboratively. To facilitate that process, Kubeflow Pipelines supports reusable components: self-contained sets of code that each perform one step in the ML workflow, such as data preprocessing, data transformation, model training, or model serving. There is a rich set of components from the community and from vendors. What has been missing from the ecosystem, however, is a registry for sharing reusable components with the public or among teams of data scientists. As a result, many of the common tasks required to run ML workflows on Kubernetes, such as creating secrets, persistent volume claims, and config maps, have to be implemented again and again. A component registry can provide a rich catalog of components that solve these common tasks and ease the burden of creating ML workflows on Kubernetes.
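As a rough illustration of what such a registry provides, here is a minimal pure-Python sketch of a versioned component catalog. The class, component names, and spec fields are all hypothetical, not the actual Kubeflow Pipelines registry API:

```python
# Minimal sketch of a component registry: a catalog mapping
# (name, version) to a component spec, with register and lookup.
# All names here are illustrative, not the real Kubeflow API.

class ComponentRegistry:
    def __init__(self):
        self._catalog = {}  # (name, version) -> spec dict

    def register(self, name, version, spec):
        key = (name, version)
        if key in self._catalog:
            raise ValueError(f"{name}:{version} already registered")
        self._catalog[key] = spec

    def get(self, name, version=None):
        # With no version given, return the latest registered one.
        if version is not None:
            return self._catalog[(name, version)]
        versions = sorted(v for (n, v) in self._catalog if n == name)
        if not versions:
            raise KeyError(name)
        return self._catalog[(name, versions[-1])]

registry = ComponentRegistry()
registry.register("create-secret", "1.0.0",
                  {"image": "example/create-secret:1.0.0",
                   "inputs": ["secret_name", "data"]})
registry.register("create-secret", "1.1.0",
                  {"image": "example/create-secret:1.1.0",
                   "inputs": ["secret_name", "data", "namespace"]})

latest = registry.get("create-secret")
```

A real registry would of course store component YAML specs and add authentication, semantic-version resolution, and sharing scopes on top of this lookup.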
  • 1 participant
  • 29 minutes
workflows
ai
kubeflow
lifecycle
process
preparation
kubernetes
models
stages
interface

19 May 2022

A Deep Dive into Kubeflow Pipelines - Senthil Raja Chermapandian, Ericsson

A machine learning model is only a tiny piece in a series of multiple processing steps executed as part of an ML workflow. A pipeline is a description of an ML workflow, including all the components in the workflow and how they combine in the form of a graph. Kubeflow Pipelines (KFP) is an open-source project that helps run cloud-native ML pipelines on Kubernetes. While most previous talks on KFP have focused on data scientists and data engineers, this talk will dive deep into KFP, covering its architecture and platform components, and how those components work together to execute a workflow.
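The "pipeline as a graph" idea can be sketched with a toy scheduler that runs each step once all of its upstream dependencies have finished. This is a simplified stand-in for what the KFP engine does, with made-up step names:

```python
# A pipeline is a DAG of components; a valid execution order is a
# topological sort of the dependency graph.
from collections import deque

def execution_order(deps):
    """deps: step -> list of upstream steps. Returns a valid run order."""
    indegree = {step: len(ups) for step, ups in deps.items()}
    downstream = {step: [] for step in deps}
    for step, ups in deps.items():
        for up in ups:
            downstream[up].append(step)
    ready = deque(s for s, n in indegree.items() if n == 0)
    order = []
    while ready:
        step = ready.popleft()
        order.append(step)
        for nxt in downstream[step]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(deps):
        raise ValueError("pipeline graph has a cycle")
    return order

pipeline = {
    "preprocess": [],
    "validate": ["preprocess"],
    "train": ["validate"],
    "serve": ["train"],
}
order = execution_order(pipeline)
```

In the real platform the graph comes from a compiled pipeline spec, and each step runs as a container on Kubernetes rather than as a local function.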
  • 1 participant
  • 32 minutes
kubeflow
kubernetes
pipelines
flow
pipelining
workflows
tensorflow
presentation
framework

19 May 2022

Building AutoML Pipelines With Argo Workflows and Katib - Andrey Velichkevich, Apple + Johnu George, Nutanix

The fairly recent field of Automated Machine Learning (AutoML) provides a rich set of powerful algorithms for model selection and hyperparameter (HP) tuning – one of the most important steps of the MLOps lifecycle. Katib is a popular Kubernetes-native open source project for AutoML. Katib can tune HPs for models written in any framework, such as TensorFlow, PyTorch, MXNet, and scikit-learn. To find the best HPs, metrics are evaluated after a model training step. Usually, model training is a complex process that includes data preprocessing, data validation, the actual training, and more. This whole lifecycle can be represented as a workflow dependency graph by specifying dependencies between model operations. Argo Workflows provides a great container-native workflow engine for orchestrating jobs on Kubernetes, which makes it an ideal candidate for Katib Experiments. This talk will demonstrate how Argo Workflows integrates natively into the Katib infrastructure.
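The tuning loop that Katib automates can be illustrated with a minimal random search, where a toy quadratic objective stands in for a real train-and-evaluate step. All names, ranges, and the "optimal" values are invented for illustration:

```python
import random

def train_and_evaluate(lr, batch_size):
    # Stand-in for a real training run: pretend the validation loss
    # is minimized at lr=0.01 and batch_size=64.
    return (lr - 0.01) ** 2 + ((batch_size - 64) / 64) ** 2

def random_search(n_trials, seed=0):
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        params = {"lr": rng.uniform(1e-4, 1e-1),
                  "batch_size": rng.choice([16, 32, 64, 128])}
        loss = train_and_evaluate(**params)
        if best is None or loss < best[0]:
            best = (loss, params)
    return best

best_loss, best_params = random_search(n_trials=50)
```

Katib replaces this loop with Experiment and Trial custom resources and pluggable search algorithms; the talk's point is that each trial's "training step" can itself be a full Argo workflow of preprocessing, validation, and training.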
  • 2 participants
  • 22 minutes
automated
workflow
machine
automl
algorithms
advanced
analyzing
tensorflow
cluster

19 May 2022

Closing Remarks - Alex Collins, Intuit + Jessica Andersson, Annotell
  • 2 participants
  • 5 minutes
kubernetes
thank
ai
contributions
nexus
presenters
crew
everybody
tech
hiring

19 May 2022

Computer Vision Dog Breed Classification with Convolutional Neural Networks, TensorFlow and Kubeflow - Konstantinos Andriopoulos, Dorothea Kalliora, Arrikto

Sick of strangers at the dog park constantly commenting on how good looking your pup is, but at a loss when they ask, “What breed is it?” Me too! Why not use AI to answer the question for you? For data scientists looking for an open source and scalable way to tackle these sorts of problems, Kubernetes and Kubeflow make analyzing content in images and video much easier than trying to build everything from scratch and run it on bare metal or VMs. In this talk we’ll work through the development of a Notebook that leverages the combined powers of TensorFlow, ResNet-50 models, convolutional neural networks, VGG16, and transfer learning to see how accurately these algorithms can predict the breed of my dog. Spoiler alert! I have the genealogy results, so there will be a big reveal with DNA pitted against a variety of algorithms.
  • 5 participants
  • 29 minutes
kubeflow
workflow
tensorflow
kubecon
kubernetes
lab
help

19 May 2022

Debugging Machine Learning on the Edge with MLExray - Michelle Nguyen, Stanford
  • 4 participants
  • 27 minutes
machine
monitoring
mlx
tensorflow
deploying
platform
android
observability
kubernetes
biases

19 May 2022

Efficient AutoML with Ludwig, Ray, and Nodeless Kubernetes - Anne Marie Holler, Elotl + Travis Addair, Predibase

The open-source platforms Ludwig and Ray make Deep Learning (DL) accessible to diverse users, by reducing complexity barriers to training, scaling, deploying, and serving DL models. Recently, Ludwig was extended to support AutoML, for tabular datasets (v0.4.1) and for text classification datasets (v0.5.0), using Ray Tune for hyperparameter search. In this talk, we discuss how Ludwig AutoML exploits heuristics developed using a set of training datasets to efficiently produce models for validation datasets. And we show how running Ludwig AutoML on cloud Kubernetes clusters, using Nodeless K8s to add right-sized GPU resources when they are needed and to remove them when not, reduces cost and operational overhead vs running directly on EC2.
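The "right-sized GPU resources" idea can be sketched as choosing the cheapest node shape that satisfies a trial's GPU and memory request. The shapes and prices below are invented for illustration and are not Nodeless K8s or any cloud provider's actual offerings:

```python
# Hypothetical node shapes: (name, gpus, memory_gib, hourly_cost).
SHAPES = [
    ("gpu-small", 1, 16, 0.90),
    ("gpu-medium", 2, 32, 1.70),
    ("gpu-large", 4, 64, 3.20),
]

def right_size(gpus_needed, mem_gib_needed):
    """Return the cheapest shape that fits the request, or None."""
    fitting = [s for s in SHAPES
               if s[1] >= gpus_needed and s[2] >= mem_gib_needed]
    return min(fitting, key=lambda s: s[3]) if fitting else None
```

The cost argument in the talk follows from doing this selection per workload and releasing the node when the workload finishes, instead of keeping statically provisioned GPU instances running.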
  • 3 participants
  • 28 minutes
mlflow
ml
automation
automl
kubernetes
ludwig
efficient
advanced
abstraction
resources

19 May 2022

Enhancing the Performance Testing Process for gRPC Model Inferencing at Scale - Ted Chang, Paul Van Eck, IBM

Performance testing is a critical part of software development that helps us identify bottlenecks early on and avoid costly crashes that impact operation. When it comes to thousands of machine learning models of many different formats and sizes, ensuring that users can perform inference on these models in reasonable time is paramount. In this session, we show how a Kubernetes cluster is set up with KServe's ModelMesh to enable high-density deployment of models for gRPC inference. Then, we demonstrate how we load test several thousand models, and how Prometheus and Grafana are used to illustrate and monitor key performance metrics.
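A minimal version of such a load test can be sketched in pure Python: fire concurrent stand-in "inference" calls and compute the latency percentiles one would chart in Grafana. The inference call here is simulated with a sleep rather than a real gRPC request:

```python
import concurrent.futures
import statistics
import time

def fake_infer(model_id):
    # Stand-in for a gRPC inference call to a ModelMesh-served model;
    # the sleep simulates server-side processing time.
    time.sleep(0.002)
    return model_id

def timed(model_id):
    start = time.perf_counter()
    fake_infer(model_id)
    return time.perf_counter() - start

def load_test(n_models, concurrency):
    # Run n_models requests across a pool of worker threads and
    # summarize the observed latencies.
    with concurrent.futures.ThreadPoolExecutor(concurrency) as pool:
        latencies = list(pool.map(timed, range(n_models)))
    return {
        "p50": statistics.median(latencies),
        "p95": sorted(latencies)[int(0.95 * (len(latencies) - 1))],
        "max": max(latencies),
    }

report = load_test(n_models=100, concurrency=10)
```

A real harness would use a gRPC client against the serving endpoint and export these metrics to Prometheus instead of computing them in-process.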
  • 4 participants
  • 32 minutes
microservice
microservices
deploying
kubernetes
providers
capacity
gig
models
automation
cluster

19 May 2022

Exploring ML Model Serving with KServe (with fun drawings) - Alexa Nicole Griffith, Bloomberg

KServe (formerly known as KFServing) provides an easy-to-use platform for deploying machine learning (ML) models. KServe is built on top of Kubernetes and provides performant, highly abstracted interfaces that allow data scientists to spend more time building new models and less time worrying about the underlying infrastructure. This open source project provides a simple, pluggable solution for common infrastructure issues with inference models, like GPU scaling and ModelMesh serving for high-volume/high-density use cases. From the perspective of an eager engineer new to the KServe community, we will explore the KServe features that solve common issues for engineers and data scientists who are interested in or responsible for ML model deployment. Expect to learn about KServe's fundamental offerings, like out-of-the-box model serving and monitoring, and its exciting new, advanced functionalities, such as its inference graph capabilities and ModelMesh features. We will discuss the host of new features added to the project since its publication in 2019 and outline KServe's roadmap as it moves toward its v1.0 release.
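As a small illustration of the "simple interface" point: KServe's v1 prediction protocol expects a POST to /v1/models/{name}:predict with an "instances" payload. The helper below only builds such a request and makes no network call; the host and model name are placeholders:

```python
import json

def build_predict_request(model_name, instances,
                          host="models.example.com"):
    # Build (but do not send) a KServe v1-protocol prediction request.
    url = f"http://{host}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances})
    return url, body

url, body = build_predict_request("sklearn-iris",
                                  [[6.8, 2.8, 4.8, 1.4]])
```

The response in this protocol comes back as a JSON object with a "predictions" list; KServe's newer v2 protocol uses a different path and payload shape.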
  • 5 participants
  • 28 minutes
serving
kf
kubernetes
knowledge
sklearn
hi
pod
protocol
bloomberg

19 May 2022

Kubernetes + AI Joining Forces in the Battle Against Cancer - Wojciech Małota-Wójcik, Ridge

A doctor is able to treat one patient at a time. On the other hand, engineers may create software analyzing thousands of cases every day! This presentation will focus on how geographically distributed Kubernetes clusters and AI are indispensable tools for IT professionals and computational biologists as they join forces to battle cancer. Computational biology — combining AI, medicine, mathematics, statistics, IoT and cloud computing — is increasing the precision and capacity of diagnostic processes. This evolution requires petabytes of storage, low-latency networking, efficient GPUs and ease of deployment on a massive scale. This leads us directly to the growing need for highly geographically distributed Kubernetes clusters, running as close to the hospital as possible. The presentation will review the basics of processing cancer images and then show actual examples of how AI algorithms are developed and deployed on K8s clusters and used by doctors to perform life-saving treatments.
  • 1 participant
  • 26 minutes
cancer
tomorrow
patients
taking
ai
presenting
researchers
prostate
tomography
advanced

19 May 2022

Managing Multi-Cloud Apache Spark on Kubernetes - Ilan Filonenko, Aki Sukegawa, Bloomberg

Bloomberg has built multi-cloud quant platforms on top of Kubernetes to enable its users to develop sophisticated financial applications with integrated first-class data science capabilities. In this journey, it quickly became clear that managing data science infrastructure in a multi-cloud environment is challenging, especially when it comes to Apache Spark. While Kubernetes provides an excellent abstraction for designing composable infrastructure substrates, it comes with a list of challenges when dealing with auto-scaling, scheduling, preemption, and security. Given these challenges, this talk will explore how one can effectively manage an expansive Spark infrastructure solution that spans bare-metal and multiple public cloud platforms. We will also walk through various observability strategies, primarily focusing on how to surface cluster information to a varied group of Spark end-users by leveraging a variety of native Kubernetes resources, like node autoscalers, controllers, and custom PodConditions.
  • 5 participants
  • 31 minutes
kubernetes
spark
cloud
troubleshooting
server
providers
dependencies
apache
interface
gpu

19 May 2022

Sponsored Keynote: Challenges and Opportunities in Making AI Easy and Efficient with Kubernetes - Maulin Patel, Google
  • 1 participant
  • 10 minutes
gpu
kubernetes
ai
cpus
efficient
workloads
provisioning
scalability
reasons
aiml