From YouTube: Pachyderm: Unlock the Power of Kubernetes for Big Data by Joey Zwicker, Pachyderm

Description

Pachyderm is an open source big data analytics platform deployed entirely on Kubernetes. Pachyderm uses the Kubernetes Jobs API to process massive data workloads and build streaming pipelines. Its hallmark feature is version-controlled data: you can view branches, commits, and diffs for petabyte-scale data sets.

In this talk we'll demonstrate how Kubernetes and Pachyderm empower data science teams to collaborate on a shared, unified data infrastructure. Everything runs on Kubernetes, from streaming data ingestion and machine learning pipelines to automatic service deployment using rolling updates.

Our talk will discuss how Pachyderm couldn't exist without a large swath of advanced Kubernetes primitives, and it includes a demo in which we stream data through the system and watch Kubernetes automatically schedule analytics containers and parallelize the data processing. This demo is directly inspired by how production users manage data in Pachyderm today.