When building images with Docker, we often need to pass secrets in. For example, we may need credentials to pull dependencies from a private registry. Docker provides a number of methods to pass files and variables at build time, but not all are safe for secrets.
Docker layers
Before we dive into each method, we need to understand how Docker builds images.
FROM debian
RUN apt-get update
RUN apt-get install -y vim
When building the above Dockerfile, each command (RUN apt-get update and RUN apt-get install -y vim) is executed and stored as a separate layer. Each layer is a diff of the filesystem before and after running the command.
Layers are cached and reused, which saves build time[1]. This is why the initial build may be slow but subsequent rebuilds are fast. If we appended a new command to the above Dockerfile, Docker would not rebuild the existing layers, since nothing in them has changed; it would simply append a new layer with the result of running the new command.
Layers are the reason why most methods of passing files and variables from the host are not suitable for secrets. The layers are persisted, so even if we run commands that delete files, the layer that copied or created those files will still be there for inspection.
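As a rough illustration that needs no Docker daemon, the sketch below mimics two layers as plain tar archives. The file names, the secret value, and the directory layout are made up for the example; the .wh. prefix is the whiteout marker that layer archives use to record a deletion. Real layers are produced by the builder, so treat this only as a model of the idea.

```shell
#!/bin/sh
# Model of two image layers as tar archives (illustrative only; the paths
# and the secret value are hypothetical).
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# "Layer 1": the result of COPY ./secret.txt /app/secret.txt
mkdir -p "$work/layer1/app"
printf 'hunter2' > "$work/layer1/app/secret.txt"
tar -cf "$work/layer1.tar" -C "$work/layer1" app

# "Layer 2": the result of RUN rm -rf /app/secret.txt.
# The deletion is recorded as a whiteout marker; layer 1 is not modified.
mkdir -p "$work/layer2/app"
: > "$work/layer2/app/.wh.secret.txt"
tar -cf "$work/layer2.tar" -C "$work/layer2" app

# Anyone holding the archives can still read the secret from layer 1.
recovered=$(tar -xOf "$work/layer1.tar" app/secret.txt)
echo "recovered from layer 1: $recovered"
```

The second archive only says "this file is gone from here on"; it never rewrites the first one, which is why deleting a file in a later command does not remove it from the image.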
Leaky secrets
It might be tempting to simply use the COPY[2] or ARG[3] Dockerfile instruction, but neither is suitable for secrets.
COPYing files
We have a simple Dockerfile where we are copying a file with secrets from the host then removing it.
FROM node:14-alpine
WORKDIR /app
COPY ./secret.txt /app/secret.txt
RUN rm -rf /app/secret.txt
When we build and inspect it, we can see the layers corresponding to each command:
docker build -t copy_experiment .
docker history copy_experiment
# IMAGE CREATED CREATED BY SIZE COMMENT
# bff0ec4dd48c 3 hours ago RUN /bin/sh -c rm -rf /app/secret.txt # buil… 0B buildkit.dockerfile.v0
# <missing> 3 hours ago COPY ./secret.txt /app/secret.txt # buildkit 7B buildkit.dockerfile.v0
# <missing> 3 hours ago WORKDIR /app 0B buildkit.dockerfile.v0
# ...
The process to extract the secret.txt file is as follows:
- Export the image as a .tar[4]
  docker save copy_experiment -o image.tar
- Extract the .tar to a directory
  mkdir -p image_extracted
  tar -xvf image.tar -C image_extracted
- Inspect the extracted contents
  ls -la image_extracted
  There are several directories corresponding to the layers; each has a layer.tar containing the file diff between that layer and the previous one. There is also a manifest.json and another .json file with a long random name (<random>.json) in the root. We will use these to figure out which layer we want to inspect.
- Open the manifest.json and the <random>.json
  The <random>.json file has a history field whose entries correspond to the commands in the Dockerfile. Some of these do not result in a layer and have empty_layer: true.
  The manifest.json lists the layer directories under the Layers field.
  The entries in both files are in order, so with the two files we can match each Dockerfile command to its layer directory.
- On finding the layer, we can inspect the files inside the layer's archive
  tar -vtf image_extracted/496831ca773c35092cacc3f1d9e5decf604f5d39b173911849144baa661b00dd/layer.tar
  # drwxr-xr-x 0/0 0 2022-03-07 08:51 app/
  # -rw-r--r-- 0/0 7 2022-03-07 08:45 app/secret.txt
- Print the contents of the secret.txt file in the layer
  tar -O -xf image_extracted/496831ca773c35092cacc3f1d9e5decf604f5d39b173911849144baa661b00dd/layer.tar app/secret.txt
Alternate way to browse layers
The above method of finding the layers by looking at the manifest.json and <random>.json files can become quite difficult when a Dockerfile has many commands and thus many layers.
We can leverage the dive[5] tool to view the filesystem at each layer.
ARGs
Dockerfiles support the ARG instruction, which defines a build argument. The argument is supplied at build time with the --build-arg[6] flag and can then be used like an environment variable in subsequent commands.
The following Dockerfile defines a build argument SECRET, exports it to the environment[7], and uses it in an echo command.
FROM node:14-alpine
ARG SECRET
ENV SECRET=${SECRET}
RUN echo ${SECRET}
The process to extract the SECRET is as follows:
- Build the image specifying a SECRET build argument
  docker build --build-arg SECRET=supersecretstring -t buildarg_experiment .
- Inspect the history of the image
  docker history buildarg_experiment
  # IMAGE CREATED CREATED BY SIZE COMMENT
  # fc4a2049cbdb 2 seconds ago RUN |1 SECRET=supersecretstring /bin/sh -c e… 0B buildkit.dockerfile.v0
  # <missing> 2 seconds ago ENV SECRET=supersecretstring 0B buildkit.dockerfile.v0
  # <missing> 2 seconds ago ARG SECRET 0B buildkit.dockerfile.v0
  # ...
As you can see, the SECRET is visible.
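Because docker history truncates long commands by default, a leaked value can be easy to miss by eye. A minimal sketch, assuming the history text is piped in on standard input; the function name scan_history is hypothetical, and the docker invocation in the comment reuses the image name and secret value from the steps above.

```shell
#!/bin/sh
# scan_history VALUE
# Read `docker history` output on stdin and print the lines that contain
# VALUE verbatim. The exit status is 0 when at least one line matches.
scan_history() {
    # -F treats VALUE as a literal string; -- guards values starting with "-".
    grep -F -- "$1"
}

# Intended use (requires Docker and the image built above):
#   docker history --no-trunc buildarg_experiment | scan_history supersecretstring
```

A check like this could run in a pipeline after the build and fail it when a known secret value shows up in the image history.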
Safely injecting secrets
One strategy to avoid exposing secrets is to use a multi-stage Dockerfile[8]. This involves an initial build image that is responsible for building an executable, which is then copied to the final run image. We can inject build arguments and copy secrets into the build image knowing that the layers created there will not be persisted in the final run image.
A multi-stage Dockerfile is structured like so:
FROM node:14-alpine as builder
WORKDIR /app
COPY ./secret.txt /app/secret.txt
RUN touch runnable # imaginary executable
FROM node:14-alpine
WORKDIR /app
COPY --from=builder /app/runnable .
We can see two FROM[9] commands: the first is labelled builder and represents our build image, and the other is our run image.
When we inspect the layers of the final run image, there is no information about secret.txt.
docker build -t multistage_experiment .
docker history multistage_experiment
# IMAGE CREATED CREATED BY SIZE COMMENT
# 8c960c11ac09 About an hour ago COPY /app/runnable . # buildkit 0B buildkit.dockerfile.v0
# <missing> 3 hours ago WORKDIR /app 0B buildkit.dockerfile.v0
# ...
Another advantage of using a multi-stage Dockerfile is the ability to separate the build and run environments. Typically the build environment has more dependencies, as it needs an entire toolchain to build the executable, whereas the run environment just needs the runtime. For example, to build Java .jar files we need the Java Development Kit (JDK), but to run a .jar we only need the Java Runtime Environment (JRE), which is much smaller.
Conclusion
Understanding how Docker builds images is important for security. Any file we COPY into the final image, regardless of whether we delete it with a later command, remains exposed in an intermediate layer. For this reason, we should always copy the bare minimum into a Docker image and in most cases separate the build and run images using multi-stage Dockerfiles.