Stream Processing

For this individual assignment, you will analyze a continuous stream of data using Spark’s new Structured Streaming API and write a blog post about your experience.

Getting started

Accept assignment A5 on GitHub Classroom.
Continue to work in the container you created for assignment 3.
Open localhost:9001 in your web browser. (Start a tunnel first when working from Huygens.)
Import notebook “A5 Sparkling Streams.zpln” from your assignment repository into Zeppelin.
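
The tunnel mentioned in the steps above could be set up along the following lines. This is a minimal sketch that assumes the container runs on a machine you can reach over SSH; the user name and host name are placeholders (they are not taken from the course material), so replace them with your own. Only port 9001 comes from the instructions above.

```shell
# Placeholder values (assumptions): replace with your own login and machine.
REMOTE_USER="your-username"
REMOTE_HOST="remote-host.example.org"
# Port on which Zeppelin is reachable, as given in the instructions.
ZEPPELIN_PORT=9001

# -N: do not run a remote command; -L: forward the local port to the remote side.
TUNNEL_CMD="ssh -N -L ${ZEPPELIN_PORT}:localhost:${ZEPPELIN_PORT} ${REMOTE_USER}@${REMOTE_HOST}"

# Print the command for inspection; run it in a separate terminal to open the tunnel.
echo "$TUNNEL_CMD"
```

While such a tunnel is open, localhost:9001 in your local browser is forwarded to port 9001 on the remote machine.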

You will find detailed instructions for the assignment in the notebook.

Blog Post

You will write a blog post that addresses at least the following points:

What did you find easy or difficult about the assignment and the Spark Structured Streaming API?
The questions found in the notebook
The code you wrote to do some analysis task at the end of the notebook and a brief explanation

Check in your code in the assignment repo. Your blog post should be posted on your main blog, as before, and entered in PeerGrade to start the peer review process (see Brightspace).

Happy Hacking!
