Airflow Data Quality (Day 1 Lecture)

In this lecture, Zach explains the concept of cumulative DAGs in Airflow and their unique architecture. He also discusses the importance of using partition sensors and setting concurrency limits on DAGs. Additionally, he shares best practices for backfilling data and highlights the potential challenges and resource-intensive nature of backfilling. The lecture also explains how to handle both small and large datasets in your pipelines and in Airflow. [Recorded on May 22nd, 2024]

38 mins

Week 3: Airflow and Airflow Data Quality