Best Practices for Refactoring Large Dockerfiles to Reduce Size and Improve Maintainability
When Dockerfiles grow too large, they become hard to understand, can hide security flaws, cause Git conflicts, and bloat image size. This article presents five refactoring techniques (using official images, extracting dependencies, multi-stage builds, sorting parameters, and proper tagging) to keep Dockerfiles small, clear, and secure.
When a Dockerfile exceeds a reasonable size, it becomes difficult to understand and maintain, may hide security issues, cause more Git conflicts, and lead to unnecessarily large images.
The recommended solution is to split the Dockerfile into multiple smaller files and apply a series of refactoring techniques.
Refactor 1: Use Official Images for Dependencies
Avoid rebuilding artifacts that an official image already provides. For example, copy the Terraform binary from the official Terraform image instead of cloning and compiling Terraform yourself.
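Before copying a binary out of an official image, it helps to confirm where that image actually keeps it, since the path depends on how the image was built. The following is a minimal sketch, not an official procedure: the function name find_binary_in_image and the DOCKER override variable are hypothetical, and it assumes the image ships a POSIX shell.

```shell
# DOCKER can be overridden (for example DOCKER=echo) to do a dry run.
DOCKER="${DOCKER:-docker}"

find_binary_in_image() {
    # $1: image reference, $2: binary name
    # Replaces the image's entrypoint with sh and asks the shell
    # where the binary lives. Fails on images that have no shell.
    "$DOCKER" run --rm --entrypoint sh "$1" -c "command -v $2"
}

# Example (needs a running Docker daemon):
#   find_binary_in_image hashicorp/terraform:0.12.9 terraform
```

The path it prints is the one to use as the source in COPY --from.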
Original Dockerfile:
FROM golang:1.12
RUN apt-get update && \
apt-get upgrade -y && \
apt-get install -y git openssh-client zip
WORKDIR $GOPATH/src/github.com/hashicorp/terraform
RUN git clone https://github.com/hashicorp/terraform.git ./ && \
git checkout v0.12.9 && \
./scripts/build.sh
WORKDIR /my-config
COPY . /my-config/
CMD ["terraform", "init"]
Refactored Dockerfile:
FROM hashicorp/terraform:0.12.9 AS terraform
FROM golang:1.12
COPY --from=terraform /go/bin/terraform /usr/bin/terraform
WORKDIR /my-config
COPY . /my-config/
CMD ["terraform", "init"]
Refactor 2: Extract Dependencies into a Separate Dockerfile
If no official image exists, build the artifact in a separate Dockerfile and copy it into the main one.
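One way to wire the two Dockerfiles together is to build and publish the dependency image first, then build the application image that copies from it. The sketch below assumes the dependency Dockerfile is saved as Dockerfile.yamldiff and the application one as Dockerfile. Both file names, the function names, and the DOCKER override variable are hypothetical.

```shell
# DOCKER can be overridden (for example DOCKER=echo) to do a dry run.
DOCKER="${DOCKER:-docker}"

build_dependency() {
    # $1: image reference for the dependency image
    # Builds the dependency image and pushes it so that other
    # machines (for example CI runners) can pull it.
    "$DOCKER" build -f Dockerfile.yamldiff -t "$1" . &&
    "$DOCKER" push "$1"
}

build_app() {
    # $1: image reference for the application image
    # The application Dockerfile pulls the artifact in with COPY --from.
    "$DOCKER" build -f Dockerfile -t "$1" .
}

# Example (needs a running Docker daemon and push access):
#   build_dependency marvalero/yamldiff:latest
#   build_app my-app:dev
```

The dependency image only needs rebuilding when the dependency itself changes, so routine application builds skip the clone and compile steps entirely.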
Original Dockerfile:
FROM golang:1.12
RUN apt-get update && \
apt-get upgrade -y && \
apt-get install -y git openssh-client
WORKDIR /go/src/gitlab.com/sahilm/
RUN git clone https://github.com/sahilm/yamldiff.git
RUN cd yamldiff && \
go get -u github.com/golang/dep/cmd/dep && \
dep ensure && \
GOOS=linux go build -o /usr/local/yamldiff
WORKDIR /my-app
COPY . /my-app/
CMD ["./run.sh"]
Refactored Dockerfile for building yamldiff:
FROM golang:1.12
RUN apt-get update && \
apt-get upgrade -y && \
apt-get install -y git openssh-client
WORKDIR /go/src/gitlab.com/sahilm/
RUN git clone https://github.com/sahilm/yamldiff.git
RUN cd yamldiff && \
go get -u github.com/golang/dep/cmd/dep && \
dep ensure && \
GOOS=linux go build -o /usr/local/yamldiff
CMD ["bash"]
Application Dockerfile that uses the built artifact:
FROM marvalero/yamldiff:latest AS yamldiff
FROM golang:1.12
COPY --from=yamldiff /usr/local/yamldiff /usr/bin/yamldiff
WORKDIR /my-app
COPY . /my-app/
CMD ["./run.sh"]
Refactor 3: Split the Image into Multiple Stages
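Docker's multi-stage build feature lets you separate the build stage from the runtime stage. The final image then contains only the artifacts copied from the build stage, not the compiler and build tools, which makes it smaller, clearer, and more secure.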
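The size difference is easy to check by building both variants and listing them side by side. This is a rough sketch: the Dockerfile file names passed in, the yamldiff:single-stage and yamldiff:multi-stage tags, the function name, and the DOCKER override variable are all hypothetical.

```shell
# DOCKER can be overridden (for example DOCKER=echo) to do a dry run.
DOCKER="${DOCKER:-docker}"

compare_image_sizes() {
    # $1: path to the single-stage Dockerfile
    # $2: path to the multi-stage Dockerfile
    "$DOCKER" build -f "$1" -t yamldiff:single-stage . &&
    "$DOCKER" build -f "$2" -t yamldiff:multi-stage . &&
    # Lists every tag of the yamldiff repository with its size.
    "$DOCKER" image ls yamldiff
}

# Example (needs a running Docker daemon):
#   compare_image_sizes Dockerfile.single Dockerfile.multi
```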
Original Dockerfile:
FROM golang:1.12
RUN apt-get update && \
apt-get upgrade -y && \
apt-get install -y git openssh-client
WORKDIR /go/src/gitlab.com/sahilm/
RUN git clone https://github.com/sahilm/yamldiff.git
RUN cd yamldiff && \
go get -u github.com/golang/dep/cmd/dep && \
dep ensure && \
GOOS=linux go build -o /usr/local/yamldiff
CMD ["bash"]
Refactored multi-stage Dockerfile:
FROM golang:1.12 AS Builder
RUN apt-get update && \
apt-get upgrade -y && \
apt-get install -y git openssh-client
WORKDIR /go/src/gitlab.com/sahilm/
RUN git clone https://github.com/sahilm/yamldiff.git
RUN cd yamldiff && \
go get -u github.com/golang/dep/cmd/dep && \
dep ensure && \
GOOS=linux go build -o /usr/local/yamldiff
FROM ubuntu:18.04
COPY --from=Builder /usr/local/yamldiff /usr/local/yamldiff
CMD ["bash"]
Refactor 4: Sort Multi-Line Parameters
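Sort the package list in multi-line RUN commands alphabetically. Duplicates then sit next to each other and are easy to spot, and adding a package produces a small, predictable diff.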
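The check can also be automated. The following sketch, with the hypothetical function name check_packages, reads one package name per line and fails if the list contains duplicates or is not in sorted order. It uses only the standard sort and uniq utilities. Extracting the package names from a Dockerfile is left out.

```shell
check_packages() {
    # Reads one package name per line from stdin.
    pkgs=$(cat)

    # uniq -d prints only lines that occur more than once in sorted input.
    dups=$(printf '%s\n' "$pkgs" | LC_ALL=C sort | uniq -d)
    if [ -n "$dups" ]; then
        echo "duplicate packages: $dups" >&2
        return 1
    fi

    # sort -c checks the order without producing output.
    if ! printf '%s\n' "$pkgs" | LC_ALL=C sort -c 2>/dev/null; then
        echo "package list is not sorted" >&2
        return 1
    fi
}

# Example:
#   printf 'bash\ncurl\nzip\n' | check_packages && echo "package list ok"
```

LC_ALL=C pins the comparison to plain byte order, so the result does not depend on the locale of the machine running the check.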
Original Dockerfile:
FROM ubuntu:18.04
RUN apt-get update && apt-get -yqq install \
ca-certificates \
bash \
jq \
wget \
curl \
openssh-client \
build-essential \
libpng-dev \
python \
zip
CMD ["bash"]
After sorting:
FROM ubuntu:18.04
RUN apt-get update && apt-get -yqq install \
bash \
build-essential \
ca-certificates \
curl \
jq \
libpng-dev \
openssh-client \
python \
wget \
zip
CMD ["bash"]
Refactor 5: Tagging Images Properly
Maintain clean and meaningful tags for Docker images. Use three types of tags: branch name, semantic version (major.minor.patch), and commit SHA.
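Branch-name tags identify the latest image for a specific branch (e.g., master, feature-new-class), avoiding the ambiguous latest tag. Docker tags cannot contain a slash, so a branch name such as feature/new-class has to be sanitized before it is used as a tag.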
Version tags distinguish patches from major changes using semantic versioning.
Commit tags record the exact Git commit the image was built from, useful when a version tag is not possible.
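The three tag types can be derived in a build script. The sketch below is one possible approach, not a standard tool: the function names sanitize_tag and image_tags and the repository myorg/myapp are hypothetical. It relies on the Docker rule that a tag may contain only letters, digits, underscores, periods, and dashes, may not start with a period or dash, and is at most 128 characters long.

```shell
sanitize_tag() {
    # Replace every character that is not valid in a Docker tag with '-',
    # cap the length at 128 characters, and replace a leading '.' or '-'.
    printf '%s' "$1" |
        LC_ALL=C tr -c 'A-Za-z0-9_.-' '-' |
        cut -c1-128 |
        sed 's/^[.-]/_/'
}

image_tags() {
    # $1: repository, $2: branch name, $3: version (may be empty), $4: commit SHA
    # Prints one full image reference per line.
    echo "$1:$(sanitize_tag "$2")"
    if [ -n "$3" ]; then
        echo "$1:$3"
    fi
    echo "$1:$4"
}

# Example (inside a Git checkout, after building myorg/myapp:build):
#   branch=$(git rev-parse --abbrev-ref HEAD)
#   sha=$(git rev-parse --short HEAD)
#   for ref in $(image_tags myorg/myapp "$branch" 1.4.2 "$sha"); do
#       docker tag myorg/myapp:build "$ref"
#   done
```

Passing an empty version skips the version tag, which covers builds from commits that have not been released.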
Applying these practices helps keep Docker images organized, secure, and easier to maintain.
DevOps Cloud Academy
Exploring industry DevOps practices and technical expertise.