From YouTube: Tutorial: How Delta Lake Supercharges Data Lakes

Description

Delta Lake’s transaction log brings high reliability, performance, and ACID-compliant transactions to data lakes. But exactly how does it accomplish this?

Working through concrete examples, we will take a close look at how the transaction logs are managed and leveraged by Delta to supercharge data lakes.

This tech talk covers:
- Enabling and configuring OSS Delta Lake
- Creating Delta Lake tables
- Using history() to view metadata and table versioning
- How Delta manages the log files
- What goes into the transaction logs for various DML operations
- How Delta constructs snapshots of data
- The small file problem and how to mitigate it
- How to construct time travel queries
- Configuring Delta tables for deleted files and log retention
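
As a rough illustration of the snapshot and time travel topics above, the sketch below replays a handful of commits to work out which data files are live at a given table version. It is a simplified model, not Delta's implementation: the `commits` dictionary, the file names, and the `snapshot_files` helper are all hypothetical. Real commits are JSON files under `_delta_log/` and carry further action types and fields that are ignored here.

```python
import json

# Hypothetical, simplified commit contents keyed by table version.
# Each string stands in for one JSON action line of a commit file.
commits = {
    0: ['{"add": {"path": "part-0000.parquet"}}',
        '{"add": {"path": "part-0001.parquet"}}'],
    1: ['{"remove": {"path": "part-0000.parquet"}}',
        '{"add": {"path": "part-0002.parquet"}}'],
    2: ['{"add": {"path": "part-0003.parquet"}}'],
}


def snapshot_files(commits, as_of_version=None):
    """Replay commits in version order and return the sorted live file paths.

    If as_of_version is given, replay stops after that version, which is
    the basic idea behind a time travel query.
    """
    live = set()
    for version in sorted(commits):
        if as_of_version is not None and version > as_of_version:
            break
        for line in commits[version]:
            action = json.loads(line)
            if "add" in action:
                live.add(action["add"]["path"])
            elif "remove" in action:
                live.discard(action["remove"]["path"])
    return sorted(live)


print(snapshot_files(commits))     # files in the latest snapshot
print(snapshot_files(commits, 0))  # files as of version 0
```

Note that a removed file drops out of the latest snapshot but still appears when replaying only up to version 0, which is why deleted-file retention settings affect how far back time travel can reach.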

Speaker: Louis Frolio is a Senior Technical Instructor at Databricks. Leveraging his successful career in Data and AI, Louis trains Databricks business partners on Databricks and Spark. He has two master's degrees, one in Applied Physics from the University of Massachusetts and a second in Strategic Analytics from Brandeis University. Louis lives in New England with his wife and son. As a former professional chef, Louis still considers himself a culinarian and uses his personal time to explore the world of food.

The notebooks for this video can be found at: https://github.com/databricks/tech-talks/tree/master/2020-08-27%20%7C%20How%20Delta%20Lake%20Supercharges%20Data%20Lakes