From YouTube: Tech Talk | Diving into Delta-rs: kafka-delta-ingest

Description

Delta Lake committers Christian Williams and R. Tyler Croy from Scribd join Denny Lee from Databricks to discuss the technical and business requirements behind the Delta Rust API project kafka-delta-ingest.

This project aims to build a highly efficient daemon for streaming data through Apache Kafka into Delta Lake and has been in production at Scribd for the last four weeks after six months of active development.

Come learn why they built it and how it's going.

Resource links:
https://github.com/delta-io/kafka-delta-ingest
https://kafka.apache.org/
https://delta.io/

Speakers:

R. Tyler Croy leads the Platform Engineering organization at Scribd and has been an open source developer for over 14 years. His open source work has been in the FreeBSD, Python, Ruby, Puppet, Jenkins, and now Delta Lake communities. The Platform Engineering team at Scribd has invested heavily in Delta and has been building new open source projects to expand the reach of Delta Lake across the organization. Tyler is also a Databricks Beacon.

Denny Lee is a developer advocate at Databricks, where he works on Delta Lake, Apache Spark, Data Sciences, and Healthcare Life Sciences. He previously built enterprise DW/BI and big data systems at Microsoft, including Azure Cosmos DB, Project Isotope (HDInsight), and SQL Server, and served as Senior Director of Data Sciences Engineering at SAP Concur. Denny holds a Master's in Biomedical Informatics from Oregon Health Sciences University.

Christian Williams is a senior engineer on Scribd's Core Platform team. He has done application and data engineering for 15 years working with a wide range of languages and platforms, most recently working with Kafka, Delta Lake, Rust, and AWS to deliver streaming data ingestion. Before working in software Christian was also one of the fastest sandwich artists in the greater Jacksonville area.