From YouTube: Building a Batch Processing Platform... - Rakesh Subramanian Suresh & Aroop Maliakkal Padmanabhan

Description

Building a Batch Processing Platform for Running Data Pipelines Using Argo & Kubernetes - Rakesh Subramanian Suresh & Aroop Maliakkal Padmanabhan, Intuit

Intuit has built a highly scalable batch processing platform with Kubernetes and Argo to enable data engineers to easily deploy, manage, and schedule data pipelines. With hundreds of AI and data engineering teams managing over 100,000 data pipelines, pipeline deployment poses many challenges, including scheduling, orchestration, and managing complex dependencies in order to eliminate silos and increase processing effectiveness in the data lake. While solutions exist for each of these challenges individually, none holistically addresses scheduling, pipeline dependency management, and infrastructure deployment and orchestration. In this talk, we will discuss using Argo Events, Argo Workflows, and Kubernetes to build and effectively manage an orchestration and scheduling engine for running various data processing use cases. We will also cover the lessons learned and operational challenges of managing this multi-cluster Kubernetes infrastructure, and how Argo can be integrated with Kafka for zero-downtime scheduling.