In this lecture, Zach explains the concept of cumulative DAGs in Airflow and their unique architecture. He also discusses the importance of using partition sensors and setting concurrency limits on DAGs. Additionally, he shares best practices for backfilling data and highlights the potential challenges and resource-intensive nature of backfilling. The lecture explains how to handle both small and large datasets in your pipelines and in Airflow. [Recorded on May 22nd, 2024]
38 mins