Introduction to Big Data Analytics using Spark and Python

This workshop teaches how to use Apache Spark and Python to perform large-scale, in-memory data analytics. Learning outcomes include understanding the overall conceptual design of Spark and being able to demonstrate the advantages of using Spark over traditional Hadoop MapReduce.

Participants will also learn to develop Spark programs in Python and to use Spark-specific features such as SQLContext and DataFrame for data analytics.

Session Information

No live sessions are currently planned for this workshop.

Resources

For a self-guided version, you can read the Introduction to Big Data Analytics using Spark/Python guide on our workshop site.