From YouTube: HDFS CSI Plugin: Speed Up Kubernetes in On-Premises Big Data Cluster - Yi Chen & Junping Du, Tencent

Description

Kubernetes is not only becoming predominant in the public cloud these days, but is also emerging as a trend in on-premises big data clusters, as an alternative to Hadoop YARN, the resource scheduling component. In on-premises big data clusters, the majority of data is stored in HDFS. How to consume big data in HDFS from Kubernetes is a new challenge for users. In this talk, we will first introduce the design and architecture of our CSI-compatible HDFS plugin. Then we will share our best practices and knowledge about how the big data workload Spark uses the HDFS CSI plugin to access HDFS data when running on K8s. Finally, the TPC-DS benchmark suite will be used to compare the performance of Spark on K8s with HDFS against Spark on YARN with HDFS.

https://sched.co/Nrq7