Stream Processing: Hands-on Session with Spark Structured Streaming
For this individual assignment, you will analyze a continuous stream of data using Spark’s new Structured Streaming API and write a blog post about your experience.
Accept the assignment on GitHub Classroom (assignment A5).
Continue to work in the container you created for assignment 3.
Open localhost:9001 in your web browser. (Start a tunnel first when working from Huygens.)
Import notebook “A5 Sparkling Streams.zpln” from your assignment repository into Zeppelin.
You will find detailed instructions for the assignment in the notebook.
You will write a blog post that addresses at least the following points:
- What you found easy or difficult about the assignment and the Spark Structured Streaming API
- Your answers to the questions found in the notebook
- The code you wrote for the analysis task at the end of the notebook, with a brief explanation
Check in your code in the assignment repo. Your blog post should be posted on your main blog, as before, and entered in PeerGrade to start the peer review process (see Brightspace).