Advanced Spark (Day 2 Lecture)

In this lecture, Zach covers the Spark APIs, focusing on Dataset, DataFrame, and Spark SQL, and explains how they differ and when to use each. He also discusses PySpark UDFs and when to use them, and offers recommendations for working with Spark. [Recorded on May 30th, 2024]

22 mins

Week 4: Batch Pipelines with Apache Spark
