When building images with Docker, we often need to pass secrets in. For example, we may need credentials to pull dependencies from a private registry. Docker provides a number of methods to pass files and variables at build time, but not all are safe for secrets.
Docker layers
Before we dive into each method, we need to understand how Docker builds images.
FROM debian
RUN apt-get update
RUN apt-get install -y vim
When building the above Dockerfile, each command (here RUN apt-get update and RUN apt-get install -y vim) is executed and stored as a separate layer. Each layer is a diff of the filesystem before and after running the command.
Layers are cached and reused, saving on build times. This is why the initial build may be slow, but subsequent rebuilds are fast. If we added a new command to the end of the above Dockerfile, Docker would not rebuild the existing layers, as nothing in them has changed; it would simply append a new layer with the result of running the new command.
Layers are the reason why most methods of passing files and variables from the host are not suitable for secrets. The layers are persisted, so even if we run commands that delete files, the layer that copied or created those files will still be there for inspection.
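To make this concrete, here is a minimal sketch that mimics two layers with plain tar archives, without involving Docker at all. The scratch directory, file names and the secret value are made up for illustration; the .wh. prefix is the whiteout marker that the image layer format uses to record a deletion.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch directory standing in for an image's layers.
work=$(mktemp -d)
mkdir -p "$work/layer1/app" "$work/layer2/app"

# "Layer 1": the diff a COPY of secret.txt into /app would produce.
printf 'hunter2' > "$work/layer1/app/secret.txt"
tar -cf "$work/layer1.tar" -C "$work/layer1" app

# "Layer 2": the diff a later rm of that file would produce.
# The deletion is recorded as a whiteout marker; layer 1 is not rewritten.
: > "$work/layer2/app/.wh.secret.txt"
tar -cf "$work/layer2.tar" -C "$work/layer2" app

# The earlier archive is untouched, so the secret can still be read from it.
recovered=$(tar -xOf "$work/layer1.tar" app/secret.txt)
echo "recovered from layer 1: $recovered"

rm -rf "$work"
```

The second archive only says "this file is gone from here on"; nothing ever removes the bytes from the first one.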
Leaky secrets
It might be tempting to simply use the COPY or ARG Dockerfile instructions, but neither is suitable for secrets.
COPYing files
We have a simple Dockerfile where we are copying a file with secrets from the host then removing it.
FROM node:14-alpine
WORKDIR /app
COPY ./secret.txt /app/secret.txt
RUN rm -rf /app/secret.txt
When we build and inspect it, we can see the layers corresponding to each command:
docker build -t copy_experiment .
docker history copy_experiment
# IMAGE CREATED CREATED BY SIZE COMMENT
# bff0ec4dd48c 3 hours ago RUN /bin/sh -c rm -rf /app/secret.txt # buil… 0B buildkit.dockerfile.v0
# <missing> 3 hours ago COPY ./secret.txt /app/secret.txt # buildkit 7B buildkit.dockerfile.v0
# <missing> 3 hours ago WORKDIR /app 0B buildkit.dockerfile.v0
# ...
The process to extract the secret.txt file is as follows:
- Export the image as a .tar
    docker save copy_experiment -o image.tar
- Extract the .tar to a directory (tar -C needs the directory to exist)
    mkdir -p image_extracted
    tar -xvf image.tar -C image_extracted
- Inspect the extracted contents
    ls -la image_extracted
  There are several directories corresponding to the layers; each has a layer.tar with the file diff between that layer and the previous one. There is also a manifest.json and another .json file with a long random name (<random>.json) in the root. We will use these to figure out which layer we want to inspect.
- Open the manifest.json and the <random>.json
  The <random>.json file has a history field which corresponds to the commands in the Dockerfile, some of which do not result in a layer and will have empty_layer: true.
  The manifest.json contains the layer directories under the Layers field.
  Both lists are in order, so with both files we can match each Dockerfile command with its layer directory.
- On finding the layer, we can inspect the files inside the layer's archive
    tar -vtf image_extracted/496831ca773c35092cacc3f1d9e5decf604f5d39b173911849144baa661b00dd/layer.tar
    # drwxr-xr-x 0/0 0 2022-03-07 08:51 app/
    # -rw-r--r-- 0/0 7 2022-03-07 08:45 app/secret.txt
- Print the contents of the secret.txt file in the layer
    tar -O -xf image_extracted/496831ca773c35092cacc3f1d9e5decf604f5d39b173911849144baa661b00dd/layer.tar app/secret.txt
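The manual matching of layer directories can be shortened with a small helper that simply lists every layer archive containing a given path. This is a sketch that assumes the extracted layout described in this article, where each layer directory holds a layer.tar; find_in_layers is a hypothetical name, not a Docker command.

```shell
#!/bin/sh
# find_in_layers DIR PATH
# Prints every layer.tar under DIR whose member list contains PATH.
# Assumes the <layer-id>/layer.tar layout of the extracted image.
find_in_layers() {
    dir=$1
    path=$2
    find "$dir" -type f -name layer.tar | sort | while IFS= read -r layer; do
        # tar -t lists archive members; grep -Fx matches PATH literally
        # against a whole line, so app/secret.txt.bak would not match.
        if tar -tf "$layer" 2>/dev/null | grep -Fxq -- "$path"; then
            printf '%s\n' "$layer"
        fi
    done
}
```

Calling find_in_layers image_extracted app/secret.txt should print the path of the layer.tar that holds the file, which can then be passed to tar -O -xf as in the last step.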
Alternate way to browse layers
The above method of finding the layers by looking at the manifest.json and <random>.json files can become quite difficult when a Dockerfile has many commands and thus layers.
We can leverage the dive tool to view the file system at each layer.
Using build ARGs
Dockerfiles support the ARG command, which defines a build argument. This argument is injected at build time with the --build-arg flag. The argument can then be used like an environment variable in any of the commands.
The following Dockerfile defines a build argument SECRET, exports it to the environment and uses it in an echo command.
FROM node:14-alpine
ARG SECRET
ENV SECRET=${SECRET}
RUN echo ${SECRET}
The process to extract the SECRET is as follows:
- Build the image specifying a SECRET build argument
    docker build --build-arg SECRET=supersecretstring -t buildarg_experiment .
- Inspect the history of the image
    docker history buildarg_experiment
    # IMAGE CREATED CREATED BY SIZE COMMENT
    # fc4a2049cbdb 2 seconds ago RUN |1 SECRET=supersecretstring /bin/sh -c e… 0B buildkit.dockerfile.v0
    # <missing> 2 seconds ago ENV SECRET=supersecretstring 0B buildkit.dockerfile.v0
    # <missing> 2 seconds ago ARG SECRET 0B buildkit.dockerfile.v0
    # ...
As you can see, the value of SECRET is visible.
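A check along these lines could be automated. The sketch below reads image history text from standard input and prints any line that still carries a value for a given variable; history_leaks is a hypothetical helper name, and the pattern is a simple heuristic rather than a complete secret scanner.

```shell
#!/bin/sh
# history_leaks NAME
# Reads image history text on stdin and prints the lines that contain
# NAME=<value>. Exit status is 0 if at least one line matched, 1 otherwise.
history_leaks() {
    name=$1
    # NAME must not be the tail of a longer identifier, and "=" must be
    # followed by at least one non-space character (an actual value).
    grep -E -- "(^|[^A-Za-z0-9_])${name}=[^[:space:]]"
}
```

It could be fed from the image built above, using the --no-trunc and --format options of docker history so that long commands are not cut off:

    docker history --no-trunc --format '{{.CreatedBy}}' buildarg_experiment | history_leaks SECRET

A line such as ARG SECRET is not reported, because it declares the argument without a value.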
Safely injecting secrets
One strategy to avoid exposing secrets is to use a multi-stage Dockerfile. This involves an initial build image that is responsible for building an executable, which is then copied to the final run image. We can inject build arguments and copy secrets into the build image knowing that the layers created will not be persisted in the final run image.
A multi-stage Dockerfile is structured like so:
FROM node:14-alpine as builder
WORKDIR /app
COPY ./secret.txt /app/secret.txt
RUN touch runnable # imaginary executable
FROM node:14-alpine
WORKDIR /app
COPY --from=builder /app/runnable .
We can see two FROM commands, where one is labelled as the builder, representing our build image, and the other is our run image.
When we inspect the resulting layers from the final run image there is no information about the secret.txt.
docker build -t multistage_experiment .
docker history multistage_experiment
# IMAGE CREATED CREATED BY SIZE COMMENT
# 8c960c11ac09 About an hour ago COPY /app/runnable . # buildkit 0B buildkit.dockerfile.v0
# <missing> 3 hours ago WORKDIR /app 0B buildkit.dockerfile.v0
# ...
Another advantage of using a multi-stage Dockerfile is the ability to separate the build and run environments. Typically the build environment has more dependencies, as it needs an entire toolchain to build the executable, whereas the run environment just needs the runtime. For example, to build Java .jars we need the Java Development Kit (JDK), but to run a .jar we only need the Java Runtime Environment (JRE), which is much smaller.
Conclusion
Understanding how Docker builds images is important for security. Any file we COPY into the final image, regardless of whether we delete it with a later command, will be exposed in an intermediate layer. For this reason, we should always copy the bare minimum into a Docker image and, in most cases, separate the build and run images using multi-stage Dockerfiles.