Immutable layers, file deletion and image size in Docker
I've long been vaguely aware of the idea of immutable layers in Docker images, but a recent presentation on Docker images prompted me to dig a little deeper into what that immutability means when you create and delete content in the course of building an image. Here's what I found out.
When describing the construction of an image in a Dockerfile, commands such as RUN, COPY and ADD cause layers to be created (see Understanding Image Layering Concept with Dockerfile). These layers are immutable, which means that to remove content later in the construction process, special whiteout files are created to mask that content's existence; the content itself remains in the image, in the layer in which it was created.
Result : file1 file2 file4
Layer 4: .wh.file3
Layer 3: file1
Layer 2: file4
Layer 1: file1 file2 file3
In this layer example, the resulting image's file system makes 3 files available:
file1 in the version from Layer 3
file2 in the version from Layer 1
file4 in the version from Layer 2
In constructing Layer 4, file3 was removed; this removal was effected by the creation of the whiteout file .wh.file3.
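To convince myself of how the masking works, here's a minimal sketch in plain shell that resolves the four layers above into the final view. It's a toy model only: the merged_view function is my own invention for illustration, it tracks file names rather than content, and it is not how Docker's storage drivers are actually implemented.

```shell
#!/bin/sh
# Toy model of layer resolution (hypothetical helper, not a Docker command).
# Each argument is one layer, lowest first, written as a space-separated
# list of entries. An entry named .wh.NAME hides NAME from the layers below.
merged_view() {
    visible=""
    for layer in "$@"; do
        for entry in $layer; do
            case "$entry" in
                .wh.*)
                    # Whiteout: drop the masked name from the visible set.
                    hidden="${entry#.wh.}"
                    kept=""
                    for name in $visible; do
                        [ "$name" = "$hidden" ] || kept="$kept $name"
                    done
                    visible="$kept"
                    ;;
                *)
                    # Regular file: a newer copy replaces the content,
                    # but the name only needs listing once.
                    case " $visible " in
                        *" $entry "*) ;;
                        *) visible="$visible $entry" ;;
                    esac
                    ;;
            esac
        done
    done
    # Unquoted on purpose, so the names print separated by single spaces.
    echo $visible
}

# Layers 1 to 4 from the example above.
merged_view "file1 file2 file3" "file4" "file1" ".wh.file3"
# prints: file1 file2 file4
```

Note that nothing is ever taken out of the first argument: file3 is still sitting in Layer 1, it's just no longer visible in the result.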
This has various implications, not least for security (content that was "deleted" is still there in an earlier layer), but also for image size. I put together a simple example to help me understand.
Here's a Dockerfile to build an image, in the construction of which there's some work that uses tools in the build-essential package (which contains various compilers and other tools). The contents of the build-essential package are not required in the final image, so are removed after the work is done:
FROM debian:12
RUN apt-get update
RUN apt-get install -y build-essential
RUN echo "do something with build-essential"
RUN apt-get autoremove --purge -y build-essential
This is a contrived example; this sort of image construction would be far better realised using a multi-stage build approach, for example. But that's a topic for another time.
Assuming we use this Dockerfile to build an image like this:
docker build -t sizetest:before .
we can then check the image with:
docker image ls sizetest
which reveals an image size of over 500MB:
REPOSITORY TAG IMAGE ID CREATED SIZE
sizetest before eed34853c568 35 seconds ago 534MB
We can also look at the resulting image's layers with:
docker image history sizetest:before
which shows us something like this, with each line representing a layer:
IMAGE CREATED CREATED BY SIZE COMMENT
eed34853c568 About a minute ago RUN /bin/sh -c apt-get autoremove --purge -y… 1.91MB buildkit.dockerfile.v0
<missing> About a minute ago RUN /bin/sh -c echo "do something with build… 0B buildkit.dockerfile.v0
<missing> About a minute ago RUN /bin/sh -c apt-get install -y build-esse… 397MB buildkit.dockerfile.v0
<missing> About a minute ago RUN /bin/sh -c apt-get update # buildkit 19.5MB buildkit.dockerfile.v0
<missing> 9 days ago /bin/sh -c #(nop) CMD ["bash"] 0B
<missing> 9 days ago /bin/sh -c #(nop) ADD file:b4987bca8c4c4c640… 117MB
The two older layers ("9 days ago") are from the base image debian:12 referred to with FROM in the Dockerfile.
Note that the layer representing the installation of the build-essential package added 397MB, and the layer representing the removal of that package, rather than taking anything away, added an extra 1.91MB: the result of recording that removal, including the "whiteout" files added to mask the build-essential package's content.
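As a quick sanity check, the per-layer figures from the history output can be added up. This is just arithmetic on the numbers shown above; the sum_mb helper is a hypothetical name of my own, and the total comes out slightly different from the 534MB that docker image ls reports, presumably because the displayed figures are rounded.

```shell
#!/bin/sh
# Add up layer sizes given in MB (hypothetical helper for illustration;
# the figures are copied from the history output above).
sum_mb() {
    printf '%s\n' "$@" | awk '{ total += $1 } END { printf "%.2f\n", total }'
}

# All six layers, base image included.
echo "whole image: $(sum_mb 117 19.5 397 1.91 0 0)MB"
# prints: whole image: 535.41MB

# Only the install layer plus the removal layer.
echo "build-essential install + removal: $(sum_mb 397 1.91)MB"
# prints: build-essential install + removal: 398.91MB
```

So roughly three quarters of the image is taken up by a package that isn't even visible in the final file system.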
The key observation here is that each and every RUN command in the Dockerfile causes a new immutable layer to be created.
Consider this alternative Dockerfile:
FROM debian:12
RUN apt-get update
RUN apt-get install -y build-essential \
&& echo "do something with build-essential" \
&& apt-get autoremove --purge -y build-essential
The difference is that the installation, use and subsequent removal of the build-essential package happens in the same RUN command, i.e. within the same layer.
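One way I picture why this helps: a layer only records the difference between the file system before and after a command. Here's a rough sketch of that idea in shell. The layer_for_step function and the gcc, make and base names are placeholders I made up for illustration, and the model ignores modified files and package manager bookkeeping, so a real combined layer would not come out completely empty.

```shell
#!/bin/sh
# Toy model (hypothetical helper): a layer holds only what changed between
# the file system before and after one command.
layer_for_step() {
    before="$1"
    after="$2"
    layer=""
    # Anything new at the end of the step is stored in the layer.
    for name in $after; do
        case " $before " in
            *" $name "*) ;;
            *) layer="$layer $name" ;;
        esac
    done
    # Anything that vanished during the step is recorded as a whiteout.
    for name in $before; do
        case " $after " in
            *" $name "*) ;;
            *) layer="$layer .wh.$name" ;;
        esac
    done
    # Unquoted on purpose, so the entries print separated by single spaces.
    echo $layer
}

# Separate RUN commands: install and removal are two steps, two layers.
layer_for_step "base" "base gcc make"    # prints: gcc make
layer_for_step "base gcc make" "base"    # prints: .wh.gcc .wh.make

# One RUN command: install, use and removal all happen inside one step,
# so the file system looks the same before and after.
layer_for_step "base" "base"             # prints an empty line
```

In the separate case the package content is stored in one layer and merely masked by the next; in the combined case it never makes it into a layer at all.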
After building an image based on this version of the Dockerfile, thus:
docker build -t sizetest:after .
we can then compare the image sizes:
docker image ls sizetest
which reveals a drastically smaller image:
REPOSITORY TAG IMAGE ID CREATED SIZE
sizetest before eed34853c568 35 seconds ago 534MB
sizetest after db43075f1e65 10 minutes ago 139MB
Checking out the layers of this image (with docker image history sizetest:after) shows a contrast in layer make-up:
IMAGE CREATED CREATED BY SIZE COMMENT
db43075f1e65 10 minutes ago RUN /bin/sh -c apt-get install -y build-esse… 3.31MB buildkit.dockerfile.v0
<missing> 19 hours ago RUN /bin/sh -c apt-get update # buildkit 19.5MB buildkit.dockerfile.v0
<missing> 9 days ago /bin/sh -c #(nop) CMD ["bash"] 0B
<missing> 9 days ago /bin/sh -c #(nop) ADD file:b4987bca8c4c4c640… 117MB
So there you have it. That's what I learned: the immutability of layers is important to know about and to consider when constructing Dockerfile instructions.