Assignment 1C Docker
If Docker is new to you, it is highly recommended to learn more about its architecture and usage on the Docker site; the [getting started guide](https://docs.docker.com/get-started/) is a useful exercise. In the course, you use Docker with images provided by others (well, me), so it is not necessary to follow all the steps to upload your own Dockerfile.
The real practical assignments of the course aim to give you basic knowledge of Spark, at the moment the de facto Big Data platform. We will work with Spark from Scala, a functional language that is executed on the Java Virtual Machine (JVM), so usage of Java libraries can be mixed with pure Scala in a hybrid environment.
The usefulness of Docker is immediately clear when you set out to explore the language and practice your Scala. Just run a Docker container that comes with a complete Scala environment pre-installed:
docker run -it --rm williamyeh/scala
The first time, the image is downloaded from Docker Hub; a container is then initialised and starts up by automatically running the Scala interpreter (known as the REPL).
Links to explore:
- Scala tutorial for Java Programmers;
- Excellent interactive tutorial;
- Scala at light-speed “for busy programmers who want to learn Scala fast — in about 2 hours or less”;
- Main Scala site and documentation.
The course is not about functional programming, so do not get carried away - at this point, you only want to acquire a very basic understanding of the language.
If you skip the first section of the Scala tutorial (where they
compile and run a Hello World program), you can follow along in the
Scala interpreter. (Type :quit when you are done.)
To follow the Scala tutorial from the start, and also compile and
run the HelloWorld.scala program, start the Docker container
with a shell:
docker run -it williamyeh/scala /bin/bash
(Notice how I leave out the --rm option when compared to the
previous command; that is because we want to revisit the same
container later.)
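One way to make a container easy to revisit is to give it a name with --name when it is created. The sketch below is only an illustration: the name scala-dev and the helper scala_shell are hypothetical, not part of the course material. It creates the container on first use and restarts it (with docker start -ai) on later calls.

```shell
# Hypothetical container name (an assumption); pick any name you like.
SCALA_CONTAINER="${SCALA_CONTAINER:-scala-dev}"

# Hypothetical helper: open a shell in the named container,
# creating the container only if it does not exist yet.
scala_shell() {
  if docker container inspect "$SCALA_CONTAINER" >/dev/null 2>&1; then
    # The container already exists: start it again and attach to its shell.
    docker start -ai "$SCALA_CONTAINER"
  else
    # First use: create a named container (no --rm, so it is kept on exit).
    docker run -it --name "$SCALA_CONTAINER" williamyeh/scala /bin/bash
  fi
}
```

After pasting this into your own shell (not the container's), calling scala_shell should bring you back to the same container each time, including any software installed in it.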
You are now using a Linux computer as root. You can, for example,
install additional software, but before that works you need to
update the package lists:
apt-get update
Now, install the editor you prefer (a matter of personal preference,
but use nano if all of this is new to you!).
Install it in the container, e.g.:
apt-get install nano
Create the file (nano HelloWorld.scala, use copy-paste to enter the code).
Next, compile the program and run it:
scalac HelloWorld.scala
scala -classpath . HelloWorld
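If you would rather not install an editor at all, a shell heredoc can write the file directly. This is a minimal sketch: the Scala source shown is the usual Hello World and is assumed to be equivalent to the one in the tutorial, so compare it against the tutorial text before relying on it.

```shell
# Write the source file without opening an editor.
# The quoted 'EOF' keeps the shell from expanding anything inside the heredoc.
cat > HelloWorld.scala <<'EOF'
object HelloWorld {
  def main(args: Array[String]): Unit = {
    println("Hello, World!")
  }
}
EOF

# Compile and run only when the Scala tools are available
# (they are inside the container, usually not on your desktop).
if command -v scalac >/dev/null 2>&1; then
  scalac HelloWorld.scala
  scala -classpath . HelloWorld
fi
```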
Alternatively, quit the container, create the file in your normal
desktop environment, save it, and copy it into the root of the
container’s file system (directory /) using Docker’s copy command:
docker cp HelloWorld.scala HASH:/
You find the HASH value using docker ps -a (or by using
autocompletion in the shell, press TAB). Issue the following
commands to copy the file and continue using the container (assuming
the value of HASH is vibrant_liskov):
docker cp HelloWorld.scala vibrant_liskov:/
docker start vibrant_liskov
docker attach vibrant_liskov
Pro tip: the most recently created container can also be queried
using docker container ls (with the -l and -q options) and then
combined on the command line into a one-liner to copy your file into
the root user's home directory as follows:
docker cp HelloWorld.scala $(docker container ls -lq):/root
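The one-liner relies on command substitution: the shell first runs the command inside $( ) and pastes its output (a container ID) into the docker cp command. When no container exists, the substitution is empty and the copy fails with a confusing message. The following sketch wraps the same idea in a hypothetical helper, copy_to_latest (a name introduced here purely for illustration), that checks for that case first.

```shell
# Hypothetical helper: copy a file into the most recently created container.
# Usage: copy_to_latest FILE [DEST_DIR]   (DEST_DIR defaults to /root)
copy_to_latest() {
  ctl_file="$1"
  ctl_dest="${2:-/root}"

  # -l selects the latest created container, -q prints only its ID.
  ctl_id="$(docker container ls -lq)"

  if [ -z "$ctl_id" ]; then
    echo "no container found; create one with docker run first" >&2
    return 1
  fi

  docker cp "$ctl_file" "$ctl_id:$ctl_dest"
}
```

For example, copy_to_latest HelloWorld.scala copies the file to /root, and copy_to_latest HelloWorld.scala / copies it to the root of the container's file system.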
Once you are fluent in using the docker client, it is easy to forget that every container and image used takes up disk space on the local machine. Clean up regularly!
The following command lists the running containers:
docker ps -f status=running
Any running containers you do not use can be stopped using
docker stop HASH (tab autocompletion is the easiest way to find the
corresponding HASH). You can remove all inactive, exited containers
that you do not plan to restart by issuing:
docker container prune
Images that are no longer needed can also be removed: use
docker image ls, followed by
docker image rm for each image you can free up.
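Before cleaning up, it can help to see what is actually taking up space. The sketch below bundles three read-only commands into a hypothetical helper, docker_cleanup_report (the name is an assumption for illustration); docker system df summarises the disk space used by images, containers and volumes. Nothing is deleted by this helper.

```shell
# Hypothetical helper: print an overview of what could be cleaned up.
# All commands are read-only; removal remains a separate, deliberate step.
docker_cleanup_report() {
  echo "== Disk usage =="
  docker system df

  echo "== Exited containers =="
  docker ps -a -f status=exited

  echo "== Images =="
  docker image ls
}
```

After reviewing the report, the removal commands from this section (docker container prune, docker image rm) can be applied to whatever you no longer need.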
Finalize the assignment
Don’t forget to read all instructions under “finalize the assignment” in Assignment 1 part I.
Optional extra reading (not required for the course):