Cristhian Villegas

Docker Course #3: Dockerfile — Build Your Own Images


Welcome to the Docker Course - Part 3 of 10

Docker logo (source: Wikimedia Commons)

Welcome back to the Docker Course! This is article 3 of 10. In the previous articles, we installed Docker and explored Docker images. Now it is time to learn one of the most powerful skills in Docker: building your own images with a Dockerfile.

A Dockerfile is a text file with instructions that tell Docker exactly how to assemble your image. By the end of this article, you will be able to write Dockerfiles from scratch and build custom images for Node.js and Python applications.

📌 Prerequisites: Make sure you have completed Parts 1 and 2 of this course. You should have Docker installed and understand the basics of images and containers.

What is a Dockerfile?

A Dockerfile is a plain text file (no file extension) that contains a series of instructions. Each instruction creates a layer in the final image. Docker reads the Dockerfile from top to bottom and executes each instruction sequentially.

Here is the simplest possible Dockerfile:

dockerfile
# Start from an Ubuntu base image
FROM ubuntu:22.04

# Run a command during the build
RUN echo "Hello from the build process!"

# Set the default command when the container starts
CMD ["echo", "Hello from the container!"]

Save this as a file named Dockerfile (no extension) in an empty directory, then build it:

bash
# Build the image and tag it as "my-first-image"
docker build -t my-first-image .

# Run a container from your new image
docker run my-first-image
# Output: Hello from the container!

The . at the end tells Docker to use the current directory as the build context — the set of files Docker can access during the build.
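Since each instruction becomes a layer, you can inspect the result with `docker history`. A quick sketch (it assumes the `my-first-image` build above succeeded; the `docker/Dockerfile.dev` path in the last command is just a hypothetical example of the `-f` flag):

```shell
# List the layers of the image, one row per instruction
docker history my-first-image

# The Dockerfile does not have to live in the build context root:
# -f selects the file, while the final "." is still the build context
docker build -f docker/Dockerfile.dev -t my-first-image:dev .
```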

Essential Dockerfile Instructions

Let's go through the essential instructions you will use in almost every Dockerfile:

FROM — The Base Image

Every Dockerfile must start with a FROM instruction. It sets the base image that your image is built upon.

dockerfile
# Use an official Node.js image
FROM node:20-alpine

# Use a minimal Python image
FROM python:3.12-slim

# Use a bare-bones Alpine Linux
FROM alpine:3.19

# Multi-stage: use a different name for this stage
FROM node:20 AS builder

WORKDIR — Set the Working Directory

WORKDIR sets the directory where subsequent commands will run. If the directory does not exist, Docker creates it automatically.

dockerfile
WORKDIR /app

# All subsequent COPY, RUN, CMD commands execute inside /app

COPY and ADD — Bring Files into the Image

COPY copies files from your build context into the image. ADD does the same but also supports URLs and automatic tar extraction. Prefer COPY unless you specifically need ADD's extra features.

dockerfile
# Copy package.json first (for better layer caching)
COPY package.json package-lock.json ./

# Copy the entire source directory
COPY . .

# Copy a specific file to a specific location
COPY config.json /etc/app/config.json

RUN — Execute Commands During Build

RUN executes commands in a new layer. Use it to install packages, compile code, or perform any setup needed during the build.

dockerfile
# Install production dependencies only (--omit=dev is the current
# replacement for the deprecated --production flag)
RUN npm install --omit=dev

# Install system packages
RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*

# Chain commands to keep the layer small
RUN pip install --no-cache-dir -r requirements.txt

💡 Tip: Each RUN instruction creates a new layer. Chain related commands with && to reduce the number of layers and keep your image small. Always clean up cache and temporary files in the same RUN command.

ENV and ARG — Variables

ENV sets environment variables that persist in the running container. ARG sets build-time variables that are only available during the build.

dockerfile
# ENV persists at runtime
ENV NODE_ENV=production
ENV PORT=3000

# ARG is only available during build
ARG APP_VERSION=1.0.0
RUN echo "Building version $APP_VERSION"
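At the command line, ARG values are supplied with `--build-arg`, and ENV values can be overridden per container with `-e`. A short sketch, assuming a hypothetical image named `my-app` built from a Dockerfile like the one above:

```shell
# Override the build-time ARG (the build log should echo "Building version 2.0.0")
docker build --build-arg APP_VERSION=2.0.0 -t my-app .

# Override an ENV value for a single container at runtime
docker run -e PORT=8080 my-app

# Inspect the ENV values baked into the image
docker image inspect --format '{{.Config.Env}}' my-app
```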

EXPOSE — Document the Port

EXPOSE documents which port the application listens on. It does not actually publish the port — you still need -p when running the container.

dockerfile
EXPOSE 3000
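In practice you still choose between `-p` and `-P` at run time. A quick sketch with a hypothetical image named `my-app` that EXPOSEs port 3000:

```shell
# -p maps the port explicitly: host 3000 -> container 3000
docker run -d --name web -p 3000:3000 my-app

# -P publishes every EXPOSEd port to a random high host port
docker run -d --name web-auto -P my-app
docker port web-auto   # shows which host port Docker picked
```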

CMD and ENTRYPOINT — Container Startup

CMD sets the default command that runs when the container starts. ENTRYPOINT sets the main executable. They work together:

dockerfile
# CMD alone — can be overridden by docker run arguments
CMD ["node", "server.js"]

# ENTRYPOINT + CMD — ENTRYPOINT is fixed, CMD provides default arguments
ENTRYPOINT ["python"]
CMD ["app.py"]
# docker run my-image          -> runs: python app.py
# docker run my-image test.py  -> runs: python test.py

⚠️ Exec form vs Shell form: Always use the exec form (JSON array) for CMD and ENTRYPOINT: CMD ["node", "server.js"]. The shell form (CMD node server.js) wraps the command in /bin/sh -c, which does not handle signals properly and can cause issues with graceful shutdown.
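Both halves can be overridden at run time. A sketch using the hypothetical `my-image` from the Python example above (`--entrypoint` is handy for debugging):

```shell
# Anything after the image name replaces CMD, so this runs: python test.py
docker run my-image test.py

# --entrypoint replaces the ENTRYPOINT itself, e.g. to open a shell instead
docker run -it --entrypoint sh my-image
```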

The .dockerignore File

Just like .gitignore, a .dockerignore file tells Docker which files and directories to exclude from the build context. This is critical for:

  • Keeping your image small
  • Speeding up the build (less data to send to the daemon)
  • Preventing sensitive files from ending up in the image
text
# .dockerignore
node_modules
npm-debug.log
.git
.gitignore
.env
.env.*
Dockerfile
docker-compose*.yml
README.md
.vscode
coverage
dist
*.md

🚨 Security: Always add .env files and any files containing secrets to your .dockerignore. If you COPY them into the image, anyone with access to the image can extract your secrets — even if you delete the files in a later layer (the data remains in earlier layers).
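If you are unsure what actually ends up in your build context, one common trick is a throwaway image that copies the context and lists it. This is only a sketch (the `Dockerfile.context-check` filename is arbitrary):

```shell
# Build a throwaway image whose only job is to copy the build context
cat > Dockerfile.context-check <<'EOF'
FROM busybox
COPY . /context
CMD ["find", "/context"]
EOF

# .dockerignore is respected, so excluded files will not appear in the listing
docker build -f Dockerfile.context-check -t context-check .
docker run --rm context-check
rm Dockerfile.context-check
```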

Building a Node.js Application Image

Let's build a real Docker image for a Node.js Express application. First, create a project directory with these files:

json
{
  "name": "docker-demo-node",
  "version": "1.0.0",
  "scripts": {
    "start": "node server.js"
  },
  "dependencies": {
    "express": "^4.18.2"
  }
}

Create server.js:

javascript
// server.js
const express = require('express');
const app = express();
const PORT = process.env.PORT || 3000;

app.get('/', (req, res) => {
  res.json({
    message: 'Hello from Docker!',
    environment: process.env.NODE_ENV || 'development',
    timestamp: new Date().toISOString()
  });
});

app.get('/health', (req, res) => {
  res.json({ status: 'healthy' });
});

app.listen(PORT, () => {
  console.log('Server running on port ' + PORT);
});

Now create the Dockerfile:

dockerfile
# Use Node.js 20 on Alpine for a small image
FROM node:20-alpine

# Set the working directory
WORKDIR /app

# Copy dependency files first (layer caching optimization)
COPY package.json package-lock.json ./

# Install production dependencies only
RUN npm ci --omit=dev

# Copy the application code
COPY . .

# Set environment variables
ENV NODE_ENV=production
ENV PORT=3000

# Document the port
EXPOSE 3000

# Run as non-root user for security
USER node

# Start the application
CMD ["node", "server.js"]

Build and run it:

bash
# Build the image
docker build -t my-node-app .

# Run the container
docker run -d -p 3000:3000 --name node-demo my-node-app

# Test it
curl http://localhost:3000
# {"message":"Hello from Docker!","environment":"production","timestamp":"..."}

# Check health endpoint
curl http://localhost:3000/health
# {"status":"healthy"}

# Clean up
docker rm -f node-demo

Building a Python Application Image

Now let's build an image for a Python Flask application. Create a project directory with:

text
# requirements.txt
flask==3.0.0
gunicorn==21.2.0

Create app.py:

python
# app.py
from flask import Flask, jsonify
import os
from datetime import datetime, timezone

app = Flask(__name__)

@app.route('/')
def home():
    return jsonify({
        'message': 'Hello from Docker + Python!',
        'environment': os.getenv('FLASK_ENV', 'production'),
        # datetime.utcnow() is deprecated in Python 3.12; use an aware datetime
        'timestamp': datetime.now(timezone.utc).isoformat()
    })

@app.route('/health')
def health():
    return jsonify({'status': 'healthy'})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

Create the Dockerfile:

dockerfile
# Use Python 3.12 slim variant
FROM python:3.12-slim

# Set the working directory
WORKDIR /app

# Install dependencies first (caching optimization)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Set environment variables
ENV FLASK_ENV=production
ENV PORT=5000

# Document the port
EXPOSE 5000

# Create a non-root user (--gecos "" skips the interactive prompts)
RUN adduser --disabled-password --no-create-home --gecos "" appuser
USER appuser

# Use gunicorn for production
CMD ["gunicorn", "--bind", "0.0.0.0:5000", "--workers", "2", "app:app"]

Build and run:

bash
# Build the image
docker build -t my-python-app .

# Run the container
docker run -d -p 5000:5000 --name python-demo my-python-app

# Test it
curl http://localhost:5000
# {"message":"Hello from Docker + Python!","environment":"production","timestamp":"..."}

# Clean up
docker rm -f python-demo

Layer Caching and Build Optimization

One of the most important concepts for efficient Docker builds is layer caching. Docker caches each layer and reuses it if the instruction and its inputs have not changed. If a layer changes, all subsequent layers are invalidated.

This is why we always copy dependency files before copying the full source code:

dockerfile
# GOOD: Dependencies change rarely, source code changes often
COPY package.json package-lock.json ./    # Layer 1: cached unless deps change
RUN npm ci --omit=dev                     # Layer 2: cached unless deps change
COPY . .                                  # Layer 3: rebuilt on every code change

# BAD: Copying everything first invalidates npm install cache on every code change
COPY . .                                  # Every code change invalidates this layer
RUN npm ci --omit=dev                     # Always rebuilt, even if deps didn't change
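You can watch the cache working by rebuilding after a source-only change. With BuildKit (the default builder in current Docker), reused steps are reported as CACHED. This sketch assumes the Node.js project from earlier in the article:

```shell
# First build: every step executes
docker build -t my-node-app .

# Touch only application code, then rebuild: the dependency layers
# are reported as CACHED and only COPY . . onward re-executes
touch server.js
docker build -t my-node-app .
```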

Additional optimization tips:

  • Use Alpine or slim base images to reduce image size
  • Combine RUN commands with && to reduce layers
  • Clean up in the same RUN: RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*
  • Use .dockerignore to exclude unnecessary files
  • Order instructions from least to most frequently changing
💡 Tip: Use docker build --no-cache -t my-app . to force a clean build without reusing any cached layers, and add --pull if you also want Docker to re-fetch the latest base image. This is useful when you need updated packages.

Best Practices for Dockerfiles

Follow these best practices to write production-quality Dockerfiles:

  1. Use specific base image tags: FROM node:20.11-alpine not FROM node:latest
  2. Run as non-root user: Add USER node or create a dedicated user
  3. Use multi-stage builds for compiled languages (we will cover this in a later article)
  4. Minimize the number of layers: Chain related RUN commands
  5. Never store secrets in the image: Use environment variables or secret mounts
  6. Include a health check: HEALTHCHECK CMD curl -f http://localhost:3000/health || exit 1
  7. Use COPY instead of ADD unless you need tar auto-extraction
  8. Set proper labels for image metadata
dockerfile
# Example with best practices and labels
FROM node:20-alpine

LABEL maintainer="[email protected]"
LABEL version="1.0.0"
LABEL description="Production-ready Node.js application"

WORKDIR /app

COPY package.json package-lock.json ./
RUN npm ci --omit=dev && npm cache clean --force

COPY . .

ENV NODE_ENV=production
EXPOSE 3000

HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1

USER node
CMD ["node", "server.js"]
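Once a container built from a HEALTHCHECK-enabled image is running, Docker tracks its health state, which you can query with `docker inspect`. A sketch, assuming the image above was tagged `my-app`:

```shell
# Start a container and give the health check time to run
docker run -d --name health-demo -p 3000:3000 my-app

# Status moves from "starting" to "healthy" (or "unhealthy" after 3 failed retries)
docker inspect --format '{{.State.Health.Status}}' health-demo

# The log of recent probes, including their output
docker inspect --format '{{json .State.Health.Log}}' health-demo
```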

For the complete reference of all Dockerfile instructions, see the official Dockerfile reference documentation.

Summary

In this third article of the Docker Course, we covered:

  • What a Dockerfile is and how Docker processes it
  • All essential instructions: FROM, WORKDIR, COPY, RUN, ENV, ARG, EXPOSE, CMD, ENTRYPOINT
  • The importance of .dockerignore for security and performance
  • Building a Node.js Express application image step by step
  • Building a Python Flask application image step by step
  • Layer caching and how to optimize your builds
  • Best practices for production-quality Dockerfiles
  • The difference between CMD and ENTRYPOINT

In the next article (Part 4 of 10), we will tackle Docker volumes — how to persist data beyond the container's lifecycle using named volumes, bind mounts, and backup strategies. See you there!


Cristhian Villegas

Software Engineer specializing in Java, Spring Boot, Angular & AWS. Building scalable distributed systems with clean architecture.
