Containerisation is a concept in software engineering that helps make complex applications re-usable and portable by putting apps or other processes in containers (similar, but not quite identical, to virtual machines). A container is the running object itself; the files it runs from are known as its container image.
Simply put, while developers typically distribute just the source code (often as Git repos), containerisation lets you ship the source code, all its dependencies, and everything else it needs, including (almost) the entire operating system, ensuring the app will run (near) identically across platforms. Thanks to various OS tricks, performance isn't significantly impacted, and you also get the advantage of complete separation between two running containers – neither needs to know that the other exists, so if one breaks, the rest of the machine is completely unaffected.
Docker is one of the most popular tools for managing both running containers and images. It's incredibly useful in a hackathon – the ability to take something that works on one machine and have it work just as well on another is invaluable, and you can also rely on existing container images for various parts of a project.
Installing Docker varies by OS.
On macOS and Windows, download Docker Desktop.
On Linux, install Docker Engine, which is also available as standalone binaries.
Note
The binaries are available for Windows too, but they don't include Compose, which we'll use later.
You can also install Podman, an open-source, lightweight, drop-in replacement for Docker on Windows, Linux and macOS (just replace docker with podman in commands).
Once you have a working docker command (see below), you can proceed with the rest of this guide.
Tip
Try running docker run hello-world; docker commands might need root permission if your user isn't in the docker group.
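As a sketch of that first check (the group-membership fix assumes a Linux system with sudo available):

```shell
# Run the standard smoke-test image; it prints a welcome message if Docker works
docker run hello-world

# If that fails with "permission denied" on Linux, add your user to the
# docker group, then log out and back in for the change to take effect
sudo usermod -aG docker "$USER"
```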
Creating a container and running an image is as simple as docker run <image name>. Here, we'll work with the Postgres example image, hosted on Docker Hub (similar to GitHub in the sense that it's a repository, but of images).
Running that Postgres image is as simple as docker run postgres. But if you try this straight out of the box, you'll see it complains about environment variables, and we come to the next major point of running images: configuring them.
There are many ways to customise the behaviour of an image. Many are specified by the image authors (so read the documentation!).
We can change the behaviour of an image by utilising:

- Environment variables: add a -e VARIABLE_NAME=variable-value flag to your docker run command. You can add multiple.
- Ports: to bind a port from the image to a port on the host, use the -p flag. -p 8080:80 will forward container port 80 to host port 8080. So, navigating to localhost:8080 in a web browser will show whatever is on the container's port 80.
- Volumes (data volumes): you can pass files or directories to the container with the -v flag. The flag below binds ./some_local_data on the host to /var/app/some_other_data in the container. This could be a directory or a file. Volumes are two-way, so if the container makes changes to the mounted data, the host can see this.

  -v ./some_local_data:/var/app/some_other_data
Other useful flags include:
- --rm automatically removes the container once the process exits.
- -i & -t, together, allow you to interact with the container in a shell-like way; you'll often see the -it (or -ti) flags on containers to be interacted with.
- --name <some container name> specifies the name of the container, often making it easier to refer to in other commands or from other containers.
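Putting these flags together, a hypothetical one-liner (the name, port, variable and path here are made up for illustration) might look like:

```shell
# Run an nginx container: remove it on exit, name it "web",
# forward host port 8080 to container port 80, set an environment
# variable, and mount a local directory as the served content
docker run --rm --name web \
  -p 8080:80 \
  -e SOME_VARIABLE=some-value \
  -v ./site:/usr/share/nginx/html \
  nginx
```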
Docker containers can, by default, access each other by IP address. They can't easily access services on the host, however. To let running containers access the host, the network type has to be changed.
Docker has networks that define how containers connect to each other, the host machine, and the outside world. When Docker first starts up, a network called "bridge" is created. It uses the "bridge" network driver, which (unsurprisingly) bridges the network to the outside world – containers on a bridge network can see each other (on the same network) and the outside world, but not the host.
The other commonly used network driver, "host", allows containers to see the host (as well as each other). If two containers don't share a network (even if their networks use the same driver), they can't access each other.
Containers can, by default, only access each other by IP, but on any user-created (non-default) network a container name will resolve to the container of interest, provided it's on the same network.
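As a sketch of that name resolution (the container and network names here are arbitrary):

```shell
# Create a user-defined bridge network
docker network create my_net

# Start a database container attached to it
docker run -d --name db --network my_net -e POSTGRES_PASSWORD=secret postgres

# From another container on the same network, the name "db" resolves
# to the database container; pg_isready checks it accepts connections
docker run --rm --network my_net postgres pg_isready -h db
```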
Tip
You can find the IP of a container with the docker inspect <container name or id> command. This also gives a lot of generally useful information about any container.
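For example, to pull just the IP address out of the very verbose inspect output (assuming a container named db exists):

```shell
# Full JSON dump of the container's configuration and state
docker inspect db

# Extract only the IP address(es) using a Go template
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' db
```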
Networking isn't massively useful in a hackathon context, since the isolation of containers is rarely a concern, but it can be useful to know how to allow containers to connect to host services: docker run --network host <container image and other flags> runs the container on the built-in "host" network, letting it access the host on localhost (there is only one host network; you can't create more with the host driver). To connect containers to each other by name instead, docker network create my_network creates a bridge network named "my_network", and docker run --network my_network <container image and other flags> attaches a container to it.
Other commands that might prove useful include:
- docker exec [-it] <container_name> <command> runs a command in a running container.
- docker ps shows all running containers (docker ps -a includes stopped ones).
- docker kill <container_name> or docker restart <container_name> forcibly stops, or restarts, a running container.
- docker rm <container_name> removes a non-running container.
- docker stop <container_name> stops (without removing) a running container.
docker --help will give you an exhaustive list including any that we've missed.
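As a quick tour of those commands (assuming a running Postgres container named my_postgres):

```shell
docker ps                                     # list running containers
docker exec -it my_postgres psql -U postgres  # open an interactive psql session inside it
docker stop my_postgres                       # stop it without removing it
docker rm my_postgres                         # remove the now-stopped container
```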
Let's say you want to set up a database in Postgres for your software (please also check out our HackPack!).
To get Postgres working, you need to set an environment variable for the password:
docker run -e POSTGRES_PASSWORD=some_password postgres

(Note that flags go before the image name; anything after the image name is passed to the container as its command.) But if you want to be able to access it from localhost, you need to forward the relevant port:

docker run -e POSTGRES_PASSWORD=some_password -p 5432:5432 postgres

And if you want to have access to the data yourself, you need to bind the volume:

docker run -e POSTGRES_PASSWORD=some_password -p 5432:5432 -v ./my/own/datadir:/var/lib/postgresql/data postgres

As you can imagine, this gets unwieldy rather fast, especially if you have multiple containers and need to iterate quickly on setup and configuration. That's where Docker Compose comes in!
Docker Compose is a declarative way of creating and running groups of containers and networks. Containers and networks are defined in a docker-compose.yml file, which looks something like this:
services:
  my_postgres_container:
    image: postgres:latest
    environment:
      POSTGRES_USER: some_username
      POSTGRES_PASSWORD: some_password
    ports:
      - "5432:5432"
    volumes:
      - "./my/own/datadir:/var/lib/postgresql/data"
  some_other_container:
    image: something_else:latest

We can run this configuration with docker compose up -d (where the -d detaches you from the stdin/stdout of the containers). You can specify individual containers by name:
docker compose up -d some_other_container

Similarly, you can stop and remove containers with docker compose down, or restart one with docker compose restart <container>.
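A typical Compose workflow with the file above might look like this (the service names are the ones from the example):

```shell
docker compose up -d                          # start everything in the file, detached
docker compose logs -f my_postgres_container  # follow one service's logs
docker compose restart some_other_container   # restart a single service
docker compose down                           # stop and remove all containers and networks
```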
Docker also allows you to build images yourself. This can be useful for sharing environments between team members, or potentially deploying somewhere. To do this, we're going to "dockerize" an existing application – specifically, some arbitrary Python Flask backend.
I've created an incredibly simple main.py, but running it isn't as simple – we need to have Python installed, and ideally also gunicorn.
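For reference, a minimal main.py along these lines (the route and message are my guesses, not the original file's contents) could be created like so:

```shell
# Write a hypothetical minimal Flask app to main.py
cat > main.py <<'EOF'
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    return "Hello from a container!"
EOF
```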
In order to create a Docker image, we need a Dockerfile, which is a list of instructions for Docker to follow to construct the image.
Our Dockerfile for main.py looks as follows:
FROM python:latest
WORKDIR /var/app/
COPY main.py ./
RUN pip install gunicorn flask
EXPOSE 8000
CMD ["gunicorn", "main:app", "-b", "0.0.0.0:8000"]

The commands shown are the most commonly used ones in a Dockerfile. Here's what they do:
- FROM: this is (almost) always the first command in any Dockerfile. It specifies a "base" image to build upon.
- WORKDIR: this sets the current working directory for the following commands (such as COPY or RUN, in this case).
- COPY: this copies a file (or files) from the directory on the host where the docker build command is run into the container image.
- RUN: this runs a command in the container. Here, it installs dependencies for the app that aren't included in the base image.
- EXPOSE: this exposes a port on the container. It doesn't do much internally, but acts as a sort of documentation for users of the image.
- CMD: this specifies the default command to run when the container is started with docker run.
In order to actually build this image, use the command:
docker build . -t <a name for your image>

The image can then be run with:

docker run <a name for your image>

Finally, docker run -p 8000:8000 <a name for your image> allows you to visit localhost:8000 and see your working app!
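End to end, with a made-up image name, the build-and-run loop looks like:

```shell
# Build the image from the Dockerfile in the current directory
docker build . -t my-flask-app

# Run it, forwarding the exposed port to the host
docker run --rm -p 8000:8000 my-flask-app

# In another terminal, check that it responds
curl localhost:8000
```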
Sharing images between members of your group can often be useful. If you have a Docker Hub account and have logged in (docker login), images can be published to Docker Hub with docker push <image name>.
If you do publish your image, its name (used in docker build) should follow the format of <username>/<project name>:<version>.
To store the image, however, you first need to create a repository (on Docker Hub itself). After publishing, other people can use your image with:
docker run <username>/<project name>:<version>
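Putting the publishing steps together (the username, project name and version here are placeholders):

```shell
docker login                                   # authenticate with Docker Hub
docker build . -t myusername/my-flask-app:1.0  # build with a Hub-compatible name
docker push myusername/my-flask-app:1.0        # upload to your repository
```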