Skip to main content Link Search Menu Expand Document (external link)

๐Ÿ›  Methodology Assignment 3

Before attempting this assignment, make sure to read Lesson 4 and complete the tasks marked Action Item there. Submit this assignment to Gradescope by Sunday, October 23rd at 11:59PM. Post questions with the assignment here.

In this assignment, youโ€™ll:

  1. Create a Dockerfile with required specifications, building off of the base data science image maintained by UCSD ETS.
  2. Use that Dockerfile to build an image and push it to Docker Hub.

Follow the steps below. Unless otherwise specified, complete all steps on your personal computer. If youโ€™re unable to do so, see the last bullet point here.

Hereโ€™s a walkthrough video of the assignment, recorded during the lecture help session on Monday, October 17th at 4PM:

  1. Create a folder, and within it create a file with no extensions named Dockerfile. In it, copy the Dockerfile template from the tutorial from lecture.

  2. Modify the Dockerfile so that:
    • aria2, nmap, and traceroute are installed using apt-get
    • geopandas and babypandas are installed using conda or pip
      • There is nothing special about these packages; we just need you to install something!
    • The last line, CMD ["/bin/bash"], is uncommented
  3. Following the steps in the tutorial linked in lecture, use the Dockerfile you created in Step 2 to create an image (using docker build) and launch a container with that image (using docker run). Make sure that all packages are present.
    • Note: It may take >10 minutes to create the image for the first time. This is to be expected, because itโ€™s pulling ~5GB of files. Subsequent pulls will be much faster.
    • If you arenโ€™t able to build due to an error with the line RUN apt-get -y install aria2 nmap traceroute, add RUN apt update to your Dockerfile before the aforementioned line.
    • To verify that packages you installed via apt-get are installed, run which <package-name>, e.g. which nmap (for aria2, use which aria2c).
    • To verify that packages you installed via conda or pip are installed, open a Python interpreter and try to import them.
  4. Push the image to Docker Hub, again using the instructions in the tutorial from above. Make sure your repository is public, so that we can verify it was set up correctly!
    • Note: It may take up to 30 minutes for your image to appear on Docker Hub after you run docker push <image>. This is to be expected.
  5. After your image appears on Docker Hub, follow the instructions from lecture to launch a container on DSMLP using your image. Test out your environment on DSMLP. Can you start python and import babypandas?

  6. Submit your assignment to Gradescope by turning in the path to your image on Docker Hub (e.g. sjrampure/test-image:latest) and your Dockerfile. Make sure to include the tag (i.e. :latest). We will pull your image and test whether it meets the requirements stated above. You will not see an autograder score right away, unlike in Methodology Assignments 1 and 2.