In this lab, Zach will provide an overview of Airflow backfilling practices and walk through the pipeline code, which is written in PySpark. He will explain the importance of including the username in the DAG name and demonstrate how to run Glue jobs and backfill data. He will also show a table that was backfilled and discuss how to compare tables using UNION ALL. [Recorded on May 22nd, 2024]
42 mins