Sheogorath's Blog

Docker - Minimize your containers with alpine linux

Today I have something delicious for you. Docker is a really useful application virtualization platform. I can’t explain the whole stack here, so I’ll leave that to the docs and assume you know the Docker basics.

The base image

Following the old best practice, most containers are built on Debian or Ubuntu. There are other base images, too, which makes it pretty easy to get a first container for nearly every project.

In my example I’ll use InspIRCd to show you how you can minimize the container size.

The Dockerfile I used to inspire my own “dockerization” was from “Knut Ahlers”: https://github.com/luzifer-docker/inspircd/blob/0094e0c6667962bd18d15eab56fe818f9c768d78/Dockerfile

This image is based on Debian jessie and takes 163 MB in compressed format.

Our new image for InspIRCd now takes 17 MB compressed and does nearly the same thing.

You may ask how we got there?

The base is important

The old best practice says you should build your images on Ubuntu or Debian. The new best practice says you should try to use alpine as the base image instead.

alpine is the base image for Alpine Linux, a minimalist Linux distribution. The biggest point: instead of more than 100 MB, the alpine base image takes around 5 MB. This alone shrinks the whole image a lot and is the biggest single saving for your image.

Our new image starts now with:

FROM alpine:3.4

Another important point is that we pin the base image to a minor version. This ensures our image will still build in the future, as a minor version should only receive bug fixes while keeping its structure.

You should watch for new releases and update your image accordingly. Pinning to the minor version instead of latest ensures that nothing breaks silently.

Dependencies

With a new base image you have to get familiar with its distribution and learn how to find the dependencies of your project. With alpine this is pretty easy: they provide a useful online registry for their package manager, apk. Just make sure you select the right minor version and start searching.

RUN apk update && apk add gcc g++ make git gnutls gnutls-dev gnutls-c++ \
       pkgconfig perl perl-net-ssleay perl-io-socket-ssl perl-libwww

In most cases the names are similar to the package names on Debian or Ubuntu.
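
A possible variation, not part of the original image: apk has a --no-cache flag (available since Alpine 3.3) that fetches the package index on the fly and doesn’t store it in the layer, so the separate apk update becomes unnecessary. A minimal sketch with the same package list:

```dockerfile
FROM alpine:3.4

# --no-cache replaces "apk update" and keeps the package index out of the layer.
RUN apk add --no-cache gcc g++ make git gnutls gnutls-dev gnutls-c++ \
       pkgconfig perl perl-net-ssleay perl-io-socket-ssl perl-libwww
```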

Get your sources in

To get the InspIRCd source we use git. So we add a RUN statement which clones the repository into our image:

RUN  mkdir -p /src && \
     cd /src && \
     git clone https://github.com/inspircd/inspircd.git inspircd

You have to install git with your dependencies.

You can also use the ADD statement with a URL to the release .zip file of your latest stable version, or COPY to include the sources from your repository directly.
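
Roughly, the two alternatives could look like the sketch below. The archive URL and the local inspircd directory are assumptions made for illustration, so adjust them to your project. Keep in mind that ADD does not unpack archives it downloads from a URL, so you would still extract the file in a later RUN step.

```dockerfile
FROM alpine:3.4

# Option 1: download an archive at build time.
# Assumption: GitHub serves the insp20 branch as an archive under this URL.
ADD https://github.com/inspircd/inspircd/archive/insp20.tar.gz /src/inspircd.tar.gz

# Option 2: copy sources that sit next to the Dockerfile.
# Assumption: a hypothetical ./inspircd directory exists in the build context.
COPY inspircd /src/inspircd
```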

Build your sources

For InspIRCd it was easy. All dependencies could be found in the package repository and the build system, written in Perl, was able to do its job. Unfortunately it’s not always that simple.

RUN cd /src/inspircd && \
    ./configure --disable-interactive --prefix=/inspircd/ --uid 10000 --enable-gnutls && \
    make && \
    make install

We had to change an internal call because alpine uses musl-libc instead of glibc.

Setup your run-time-environment

When your build was successful you should take a look at your finished image. For example: which parts are persistent data and which can be rebuilt? What’s the perfect workdir and which ports do you need to expose?

In the case of InspIRCd this part stays nearly the same as in the original image:

WORKDIR /inspircd/

VOLUME ["/inspircd/conf"]

EXPOSE 6667 6697

ENTRYPOINT ["/inspircd/bin/inspircd"]  
CMD ["--nofork"]  

Improvements

Now we can go into fine tuning.

Size

The simplest way to reduce your image size is to reduce the number of layers. RUN layers in particular can be merged, in most cases into a single layer.

To keep it readable, use line breaks.

RUN apk update && apk add gcc g++ make git gnutls gnutls-dev gnutls-c++ \
       pkgconfig perl perl-net-ssleay perl-io-socket-ssl perl-libwww && \
    mkdir -p /src /conf && \
    cd /src && \
    git clone https://github.com/inspircd/inspircd.git inspircd -b insp20 && \
    cd /src/inspircd && \
    ./configure --disable-interactive --prefix=/inspircd/ --uid 10000 --enable-gnutls && \
    make && \
    make install

You can also delete all unneeded resources. InspIRCd as a program written in C++ needs things like a compiler when building. But you don’t need them to run the software. So remove them.

RUN apk del gcc g++ make git perl perl-net-ssleay perl-io-socket-ssl perl-libwww wget

You must add and remove the packages in the same layer to reduce the size.

The same applies to the sources.

RUN rm -rf /src
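
To see why the same layer matters, compare the two patterns in this small sketch. It uses gcc alone, with gcc --version standing in for a real build step; both are just placeholders for illustration.

```dockerfile
FROM alpine:3.4

# Wasteful: the first layer still contains the compiler,
# so deleting it in a second layer does not shrink the image.
RUN apk add --no-cache gcc
RUN apk del gcc

# Effective: gcc is installed, used and removed inside one RUN,
# so it never ends up in a committed layer.
RUN apk add --no-cache gcc && \
    gcc --version && \
    apk del gcc
```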

Configuration for running out of the box

I build images with the following mantra in mind:

A container has to be able to run with a simple docker run statement that fits on one or two lines, without the need to mount anything external.

That’s why we add some sample configuration while building:

COPY conf /conf

Security improvements

By default every process in the container runs as root, unless you run it manually as another user or change the default user of the container.

To do this you need to add a user first and then switch to it after the build:

RUN adduser -u 10000 -h /inspircd/ -D -S inspircd

USER inspircd

Add Healthcheck for the Container

When using Docker it’s useful to check that your containers are okay. This can be done with the new HEALTHCHECK statement.

In our case it was something quick and dirty but it works:

HEALTHCHECK CMD  /usr/bin/nc 127.0.0.1 6667 < /dev/null || exit 1

This only checks for an open port. Not very exact, but enough for a first example.
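
HEALTHCHECK also accepts options that control how often the check runs and when the container counts as unhealthy. The sketch below reuses the same command; the values are assumptions picked for illustration, not the ones used in the real image.

```dockerfile
# Assumed values for illustration; tune them to your service.
# Run the check every 30 seconds, give it 3 seconds to answer,
# and mark the container unhealthy after 3 consecutive failures.
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
    CMD /usr/bin/nc 127.0.0.1 6667 < /dev/null || exit 1
```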

Allow flexible builds

To allow your image to be used for more than just the version you are providing with it, it’s useful to use the ARG statement.

ARG VERSION=insp20
ARG CONFIGUREARGS=
ARG ADDPACKAGES=
ARG DELPACKAGES=

This allows us to use those variables in the Dockerfile the way you would use environment variables anywhere else. Variables from the ARG statement are only accessible in the Dockerfile context, that is, while the image is being built.
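
As a small sketch of how such a variable is consumed and overridden (reduced to the clone step, with the target path /src/inspircd chosen for this example):

```dockerfile
FROM alpine:3.4

# Default value, used when no --build-arg is passed.
ARG VERSION=insp20

# $VERSION is expanded at build time. Override the default with:
#   docker build --build-arg VERSION=<branch-or-tag> .
RUN apk add --no-cache git && \
    git clone https://github.com/inspircd/inspircd.git /src/inspircd -b $VERSION
```

If a value is also needed inside the running container, it has to be copied into an ENV statement, since ARG values are not set in the container’s environment.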

Result

So what was the “final” Dockerfile? Read it:

FROM alpine:3.4

MAINTAINER Adam adam@anope.org
MAINTAINER Sheogorath <sheogorath@shivering-isles.com>

ARG VERSION=insp20
ARG CONFIGUREARGS=
ARG ADDPACKAGES=
ARG DELPACKAGES=

COPY conf /conf

RUN apk update && apk add gcc g++ make git gnutls gnutls-dev gnutls-c++ \
       pkgconfig perl perl-net-ssleay perl-io-socket-ssl perl-libwww \
       wget $ADDPACKAGES && \
    adduser -u 10000 -h /inspircd/ -D -S inspircd && \
    mkdir -p /src /conf && \
    cd /src && \
    git clone https://github.com/inspircd/inspircd.git inspircd -b $VERSION && \
    cd /src/inspircd && \
    ./configure --disable-interactive --prefix=/inspircd/ --uid 10000 --enable-gnutls $CONFIGUREARGS && \
    make && \
    make install && \
    apk del gcc g++ make git perl perl-net-ssleay perl-io-socket-ssl perl-libwww wget $DELPACKAGES && \
    rm -rf /src && \
    rm -rf /inspircd/conf && ln -s /conf /inspircd/conf

VOLUME ["/inspircd/conf"]

WORKDIR /inspircd/

USER inspircd

EXPOSE 6667 6697

HEALTHCHECK CMD  /usr/bin/nc 127.0.0.1 6667 < /dev/null || exit 1

ENTRYPOINT ["/inspircd/bin/inspircd"]
CMD ["--nofork"]

Please note that we already extended it a bit. Check the real final version here on Docker Hub. And feel free to contribute!

Conclusion

It’s pretty easy to optimize your containers. Some issues still exist with this “new” container and you will almost certainly face more issues while building your own. But it’ll make your software better and easier to use. Imagine: being able to install a basic version in less than a minute is amazing for your customers and users.

The important part here: your image downloads fast because it is just a few MB, and it extracts fast, which speeds up your deployment a lot.

So as always: I hope you enjoyed it and you’ll share this post with your friends and colleagues. If you have questions or improvements you can contact me on Mastodon or use the comment section down below.

Edit: @srust from the Docker Community pointed out that you can simplify the package handling by using --virtual <name>. For detailed information see: https://github.com/gliderlabs/docker-alpine/blob/master/docs/usage.md#virtual-packages
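
A minimal sketch of that approach, assuming the same split between kept and removed packages as in the Dockerfile above. The name .build-deps is arbitrary, and the echo line is only a placeholder for the real clone, configure and make steps.

```dockerfile
FROM alpine:3.4

# Packages that stay in the image are installed normally.
# Build-only packages are grouped under the virtual package ".build-deps"
# and removed again by that single name in the same layer.
RUN apk add --no-cache gnutls gnutls-dev gnutls-c++ pkgconfig && \
    apk add --no-cache --virtual .build-deps gcc g++ make git \
        perl perl-net-ssleay perl-io-socket-ssl perl-libwww && \
    echo "clone, configure, make and make install go here" && \
    apk del .build-deps
```

This way the list of build-only packages appears once instead of being repeated in the apk del call.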