This lesson is still being designed and assembled (Pre-Alpha version)

Introduction to Docker for (Data) Scientists: Glossary

Key Points

Introduction to Docker
  • Docker is a platform for developing, deploying, and running applications inside containers.

  • Containers help improve the portability, reproducibility, and scalability of software engineering and (data) science workflows.

  • A container image is a ‘blueprint’ that is used to create standardized units of software called containers.

  • Publishing your container image to a container registry allows other researchers to create containers from your image and reproduce your results or to build upon your image in order to extend your results.

Getting Started with Docker
  • The Binder Project is an open community that makes it possible to create sharable, interactive, and reproducible research projects.

  • The Binder Project uses repo2docker to build a Docker image from your source code repository, then creates containers from that image and deploys them on Google Cloud Platform for interactive use.

  • To use the Binder service, add one or more configuration files to your repository and then add a few lines of Markdown to your project's README.md to enable the ‘Launch Binder’ button.
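
The two steps in the last point can be sketched as follows. This is a minimal sketch, assuming a Python project whose dependencies are listed in requirements.txt (one of the configuration files repo2docker recognizes). The directory name binder-demo, the package list, and the USER/REPO/BRANCH placeholders are hypothetical and need to be replaced with your own values.

```shell
# Scratch project directory (hypothetical name)
mkdir -p binder-demo

# 1. A configuration file that tells Binder which packages to install
cat > binder-demo/requirements.txt <<'EOF'
numpy
pandas
EOF

# 2. Badge Markdown for the README; replace USER, REPO and BRANCH
#    with your own GitHub account, repository and branch.
#    (>> appends, so running this twice adds the badge twice.)
cat >> binder-demo/README.md <<'EOF'
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/USER/REPO/BRANCH)
EOF
```

Once both files are pushed to a public repository, clicking the badge should ask Binder to build and launch the project.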

Getting Started with `repo2docker`
  • repo2docker is a tool that takes a source repository and builds a container image based on the configuration files found in the repository.

  • repo2docker can build an image from a repository on your local machine or from one hosted in the cloud (GitHub, GitLab, etc.).

  • repo2docker is the tool used by BinderHub to build images on demand.
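
As a rough illustration of the points above, the sketch below builds an image from a remote repository. It assumes repo2docker was installed with pip and that a Docker daemon is running. The image name r2d-demo is hypothetical, and the repository is one of the public Binder example repositories, chosen here only for illustration.

```shell
# Public example repository from the Binder project (illustrative choice)
REPO_URL="https://github.com/binder-examples/requirements"

# repo2docker needs both its own CLI and a reachable Docker daemon
if command -v jupyter-repo2docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
    # Build an image from the remote repository.
    # --no-run stops after the build instead of starting a notebook server.
    jupyter-repo2docker --no-run --image-name r2d-demo "$REPO_URL"

    # A repository on the local machine works the same way:
    #   jupyter-repo2docker --no-run path/to/local/repo
else
    echo "repo2docker or Docker not available; skipping the build"
    echo "Install repo2docker with: python -m pip install jupyter-repo2docker"
fi
```

Without --no-run, repo2docker would also start a container from the freshly built image, which is close to what Binder does on demand.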

Simplifying Research Development with Docker Compose
  • Compose is a tool for defining and running Docker applications.

  • With Compose you use a single YAML file to configure your research project as a Docker application and then a single command, docker-compose up, to create the container and start your application.
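
A minimal sketch of such a YAML file, written from the shell so it can be tried in a scratch directory. The directory name compose-demo, the service name notebook, and the choice of the jupyter/scipy-notebook image are assumptions made for illustration. Omitting the top-level version key assumes a reasonably recent Compose release.

```shell
# Scratch project directory (hypothetical name)
mkdir -p compose-demo

# One service: a Jupyter server with the project directory mounted inside it
cat > compose-demo/docker-compose.yml <<'EOF'
services:
  notebook:
    image: jupyter/scipy-notebook
    ports:
      - "8888:8888"              # host port : container port
    volumes:
      - .:/home/jovyan/work      # project files appear under ~/work
EOF

# Validate the file without pulling or starting anything
if command -v docker-compose >/dev/null 2>&1; then
    docker-compose --file compose-demo/docker-compose.yml config --quiet \
        && echo "docker-compose.yml is valid"
else
    echo "docker-compose not installed; file written but not validated"
fi
```

Running docker-compose up from inside compose-demo would then create the container and start the notebook server, and docker-compose down would stop and remove it.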

Cookie-Cutter for (Data) Science
  • ???

Running Docker Containers
  • Existing Docker images can be pulled from container registries such as Docker Hub.

  • To run a container from a Docker image, use the docker container run command.

  • ???

  • Mount a directory on the host into a container by passing the --volume option to the docker container run command.

  • Pass the --rm flag when using the docker container run command to automatically remove the container when it exits.
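
The commands above can be combined as in the sketch below. The python:3.11-slim image, the run-demo directory, and the /data mount point are arbitrary choices for illustration. The Docker commands only run when a Docker daemon is reachable.

```shell
# Small official image from Docker Hub (an arbitrary choice for illustration)
IMAGE="python:3.11-slim"

# A host directory with one file in it, to demonstrate the mount
mkdir -p run-demo
echo "hello from the host" > run-demo/hello.txt

if docker info >/dev/null 2>&1; then
    # Pull the image from the registry
    docker image pull "$IMAGE"

    # Run a container from the image:
    #   --volume mounts run-demo on the host at /data in the container
    #   --rm removes the container automatically when it exits
    docker container run --rm --volume "$(pwd)/run-demo":/data "$IMAGE" cat /data/hello.txt
else
    echo "Docker is not available; skipping the container run"
fi
```

Because of --rm, docker container ls --all should show no leftover container after the command finishes.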

Building Docker Images
  • Build a new Docker image using the docker image build command.

  • Push a new Docker image to a container registry using the docker image push command.

  • Remove a Docker image that you are no longer using with the docker image rm command.
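
A sketch of the build, push, and remove cycle described above. The image name your-username/build-demo:0.1 and the build-demo directory are hypothetical. The push is left commented out because it needs a real registry account and a prior docker login.

```shell
# Hypothetical image name: <registry account>/<repository>:<tag>
IMAGE_NAME="your-username/build-demo:0.1"

# A throwaway build context containing a two-line Dockerfile
mkdir -p build-demo
cat > build-demo/Dockerfile <<'EOF'
FROM python:3.11-slim
CMD ["python", "--version"]
EOF

if docker info >/dev/null 2>&1; then
    # Build an image from the Dockerfile in build-demo and tag it
    docker image build --tag "$IMAGE_NAME" build-demo

    # Publish it to a registry (requires `docker login` and a real account)
    # docker image push "$IMAGE_NAME"

    # Remove the local image once it is no longer needed
    docker image rm "$IMAGE_NAME"
else
    echo "Docker is not available; skipping build and cleanup"
fi
```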

Writing Dockerfiles
  • A Dockerfile is a text document that contains all the commands a user could call on the command line to assemble an image.

  • ???

  • The official Dockerfile reference is the best resource for writing a Dockerfile.

  • Always use a Dockerfile linter, such as fromlatest.io, to make sure that your Dockerfile conforms to “best practices”.
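
To make the first point concrete, here is a small sketch that writes an example Dockerfile for a Python project. The dockerfile-demo directory, the numpy dependency, and the /app working directory are assumptions chosen for illustration. Pinning the base image tag rather than relying on latest is one of the practices linters commonly check for.

```shell
# Scratch project directory with one dependency listed (hypothetical)
mkdir -p dockerfile-demo
printf 'numpy\n' > dockerfile-demo/requirements.txt

cat > dockerfile-demo/Dockerfile <<'EOF'
# Start from an official base image with a pinned tag
FROM python:3.11-slim

# Work inside /app in the image
WORKDIR /app

# Copy the dependency list first so this layer is cached between builds
COPY requirements.txt .

# The same command you would type in a terminal to install dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Default command when a container starts from this image
CMD ["python", "-c", "import numpy; print(numpy.__version__)"]
EOF
```

Each instruction mirrors something that could be done by hand at the command line, which is why the file can serve as a reproducible record of how the image was assembled.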

Additional Resources
  • ???

Glossary

FIXME