
Docker Basics for Python Developers

Introduction

"It works on my machine" is perhaps one of the most frustrating phrases in software development. We've all experienced the headache of code that runs perfectly in one environment but fails mysteriously in another. This is where Docker comes in, offering a solution that allows developers to package their applications along with all dependencies and configurations into standardized units called containers.

For Python developers, Docker provides a way to escape the challenges of environment management and version conflicts. Whether you're developing a simple Flask application, a complex Django web service, or a data science project with a multitude of dependencies, Docker can simplify your workflow and ensure consistency across development, testing, and production environments.

In this guide, we'll explore Docker fundamentals from a Python developer's perspective, covering everything from creating your first Dockerfile to orchestrating multi-container applications with Docker Compose. By the end, you'll have the knowledge to integrate Docker into your Python development workflow and deploy containerized applications with confidence.

Why Use Docker for Python Development?

Before diving into the technical details, let's understand what makes Docker particularly valuable for Python developers:

Environment Consistency

Python projects often rely on specific versions of Python itself, along with numerous packages and system dependencies. Docker ensures everyone working on a project uses identical environments, eliminating the "it works on my machine" problem.

Dependency Isolation

While virtual environments help isolate Python dependencies, Docker takes isolation to the next level by containing not just Python packages, but also system libraries, configurations, and environment variables.

Simplified Onboarding

New team members can get up and running quickly with a simple docker-compose up rather than following lengthy setup documentation that might be outdated or incomplete.

Production Parity

Docker helps reduce the gap between development and production environments, making it more likely that code that works in development will also work when deployed.

Microservice Architecture

For projects adopting a microservice architecture, Docker provides a natural way to develop, test, and deploy individual services independently.

Testing and CI/CD Integration

Docker simplifies continuous integration by providing consistent environments for running tests, ensuring that pipeline failures are due to code issues rather than environment differences.
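
As a concrete sketch, a minimal CI job can build the image and run the test suite inside it. The workflow below assumes GitHub Actions and pytest installed in the image; the file path and image tag are illustrative:

# .github/workflows/test.yml (hypothetical example)
name: tests
on: [push]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: docker build -t my-python-app .
      - run: docker run --rm my-python-app python -m pytest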

Core Docker Concepts

Before we start coding, let's establish a shared understanding of key Docker terminology:

  • Container: A lightweight, standalone, executable package that includes everything needed to run a piece of software, including the code, runtime, system tools, libraries, and settings.
  • Image: A read-only template with instructions for creating a Docker container. Images are built from a Dockerfile.
  • Dockerfile: A text file containing instructions for building a Docker image.
  • Docker Hub: A registry service where you can find and share Docker images.
  • Volume: A mechanism for persisting data generated by and used by Docker containers.
  • Docker Compose: A tool for defining and running multi-container Docker applications.
  • Registry: A storage and content delivery system for Docker images.

The relationship between these components is fairly straightforward: you write a Dockerfile to define your application environment, build an image from this Dockerfile, and then run a container from that image. Multiple containers can be orchestrated using Docker Compose.

Installing Docker

Before proceeding, ensure you have Docker installed on your system. Visit the Docker website for official installation instructions for your operating system.

Once installed, verify your installation by running:

docker --version
docker-compose --version

You should see version information for both commands, indicating that Docker is properly installed. (On newer Docker installations, Compose ships as a CLI plugin and is invoked as docker compose rather than docker-compose; both forms accept the same subcommands used in this guide.)
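
You can also run Docker's official hello-world image as a quick end-to-end check that the daemon can pull and run containers:

docker run hello-world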

Creating Your First Python Dockerfile

A Dockerfile is essentially a set of instructions for building a Docker image. Let's create a basic Dockerfile for a simple Python application:

# Use an official Python runtime as a parent image
FROM python:3.9-slim

# Set working directory in the container
WORKDIR /app

# Copy the current directory contents into the container
COPY . /app/

# Install any needed packages specified in requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

# Document that the app listens on port 5000 (published with -p at run time)
EXPOSE 5000

# Define an environment variable with a default value
ENV NAME=World

# Run app.py when the container launches
CMD ["python", "app.py"]

Let's break down each instruction:

  • FROM specifies the base image. Here we use the slim variant of the official Python 3.9 image, which omits most system packages to keep the image small.
  • WORKDIR sets the working directory within the container.
  • COPY transfers files from your local machine to the container.
  • RUN executes commands during the image build process.
  • EXPOSE documents that the container listens on the specified network port at runtime. It does not publish the port by itself; you still map it with -p when running the container.
  • ENV sets environment variables within the container.
  • CMD specifies the command to run when the container starts.

For this to work, you'll need a simple app.py file:

from flask import Flask
import os

app = Flask(__name__)

@app.route('/')
def hello():
    name = os.environ.get('NAME', 'World')
    return f'Hello, {name}!'

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

And a requirements.txt file (Werkzeug is pinned as well, because Flask 2.0 sets no upper bound on it and later Werkzeug releases are incompatible with it):

flask==2.0.1
werkzeug==2.0.3

This setup creates a simple Flask web server that responds with "Hello, World!" when accessed.

Building and Running Containers

With our Dockerfile in place, let's build an image and run a container from it.

Building an Image

From the directory containing your Dockerfile, run:

docker build -t my-python-app .

This command builds a Docker image tagged as "my-python-app" based on the instructions in the Dockerfile. The . at the end specifies the build context (current directory).

Running a Container

Once the image is built, you can run a container from it:

docker run -p 5000:5000 my-python-app

This command starts a container from the "my-python-app" image and maps port 5000 in the container to port 5000 on your host machine. You should now be able to access your Flask application by navigating to http://localhost:5000 in your web browser.
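
Because the Dockerfile declared NAME as an environment variable, you can override it at run time with -e and watch the response change:

# Override the NAME environment variable
docker run -p 5000:5000 -e NAME=Docker my-python-app

# In another terminal
curl http://localhost:5000
# Hello, Docker!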

Useful Docker Commands

Here are some essential Docker commands for managing your containers and images:

# List running containers
docker ps

# List all containers (including stopped ones)
docker ps -a

# List images
docker images

# Stop a running container
docker stop [container_id]

# Remove a container
docker rm [container_id]

# Remove an image
docker rmi [image_id]

# View container logs
docker logs [container_id]

# Execute a command in a running container
docker exec -it [container_id] [command]
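
Two composite commands are handy for bulk cleanup. Use them with care, as they affect every container on the machine, not just this project's:

# Stop all running containers
docker stop $(docker ps -q)

# Remove stopped containers, unused networks, and dangling images
docker system prune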

Optimizing Python Docker Images

The basic Dockerfile we created works, but there are several optimizations we can make to improve build speed, reduce image size, and enhance security:

Use Multi-Stage Builds

Multi-stage builds allow you to use multiple FROM statements in your Dockerfile. Each FROM instruction can use a different base image, and begins a new stage of the build. You can selectively copy artifacts from one stage to another, leaving behind everything you don't need:

# Build stage
FROM python:3.9-slim AS build

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Runtime stage
FROM python:3.9-slim

WORKDIR /app

# Copy only the installed packages from the build stage
COPY --from=build /usr/local/lib/python3.9/site-packages /usr/local/lib/python3.9/site-packages
# Also copy console-script entry points that pip installed (e.g. flask)
COPY --from=build /usr/local/bin /usr/local/bin
COPY . .

EXPOSE 5000
CMD ["python", "app.py"]

Layer Caching

Docker builds images layer by layer, and each layer is cached. You can optimize your build by ordering the instructions in your Dockerfile from least to most likely to change:

FROM python:3.9-slim

WORKDIR /app

# Copy only requirements.txt first
COPY requirements.txt .

# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Now copy the rest of the application
COPY . .

EXPOSE 5000
CMD ["python", "app.py"]

This way, if you change your application code but not your dependencies, Docker will use the cached layer for the pip install step, making your builds faster.

Use Lightweight Base Images

Consider using the python:3.9-alpine image instead of python:3.9-slim for an even smaller footprint. Alpine Linux is a minimal distribution with a far smaller base image than Debian:

FROM python:3.9-alpine

WORKDIR /app

# Install build dependencies for Python packages with C extensions
RUN apk add --no-cache gcc musl-dev

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 5000
CMD ["python", "app.py"]

Note that Alpine Linux uses apk instead of apt-get for package management, and because Alpine is built on musl rather than glibc, many prebuilt wheels don't apply, so packages with C extensions may have to compile from source. You may need to install additional system libraries depending on the Python packages you use; for dependency-heavy projects, the slim image is often the more practical default.
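
For example, building psycopg2 from source on Alpine needs the PostgreSQL client headers in addition to the compiler toolchain; a sketch, assuming Alpine's standard package names:

# System libraries required to compile psycopg2 on Alpine
RUN apk add --no-cache gcc musl-dev postgresql-dev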

Run as a Non-Root User

By default, Docker containers run as root, which can pose security risks. It's a good practice to create a non-root user in your Dockerfile:

FROM python:3.9-slim

# Create non-root user
RUN groupadd -r appuser && useradd -r -g appuser appuser

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Change ownership of the working directory
RUN chown -R appuser:appuser /app

# Switch to non-root user
USER appuser

EXPOSE 5000
CMD ["python", "app.py"]

Working with Volumes for Development

During development, you often need to make changes to your code without rebuilding the Docker image each time. Docker volumes allow you to mount a directory from your host machine into the container, enabling real-time code changes.

To use a volume, modify your docker run command:

docker run -p 5000:5000 -v $(pwd):/app my-python-app

This mounts the current directory ($(pwd)) to the /app directory in the container, so any changes you make to your local files are reflected inside it immediately. (On Windows PowerShell, use ${PWD} in place of $(pwd).)

If you want the process to restart automatically when code changes, note that Nodemon is a Node.js tool and is not installable with pip; a Python equivalent is the watchmedo utility from the watchdog package:

# Add to your Dockerfile
RUN pip install 'watchdog[watchmedo]'

# Change CMD to restart the app whenever a .py file changes
CMD ["watchmedo", "auto-restart", "--patterns=*.py", "--recursive", "--", "python", "app.py"]

Alternatively, you can use Flask's built-in development server with the reloader enabled (debug mode is for development only and must not be enabled in production):

# In app.py
if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000, debug=True)

Managing Python Dependencies

Effective dependency management is crucial for creating reproducible Docker images. Here are some best practices for handling Python dependencies in Docker:

Use Explicit Versions

Always specify exact versions of packages in your requirements.txt file to ensure reproducibility:

# Good: Explicit versions
flask==2.0.1
requests==2.26.0
numpy==1.21.2

# Bad: Unspecified or loose versions
flask
requests>=2.20.0
numpy
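
If your current environment already has known-good versions installed, pip can generate the pinned file for you:

# Capture the exact versions from the current environment
pip freeze > requirements.txt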

Consider Using Poetry or Pipenv

Tools like Poetry or Pipenv provide more robust dependency management than pip alone. Here's an example Dockerfile using Poetry:

FROM python:3.9-slim

WORKDIR /app

# Install Poetry
RUN pip install poetry

# Copy Poetry configuration files
COPY pyproject.toml poetry.lock* ./

# Configure Poetry to not use a virtual environment in the container
RUN poetry config virtualenvs.create false

# Install dependencies only; the application code isn't copied yet, so skip
# installing the project itself. (--only main requires Poetry 1.2+; older
# versions used --no-dev instead.)
RUN poetry install --no-root --only main

# Copy the rest of the application
COPY . .

EXPOSE 5000
CMD ["python", "app.py"]

Use pip-tools for Requirements Management

pip-tools helps manage dependencies by generating pinned requirements files from high-level specifications:

# In requirements.in
flask
requests
numpy

# Generate requirements.txt with pip-compile
$ pip-compile requirements.in

# Install with pip
$ pip install -r requirements.txt

This approach gives you the best of both worlds: you can specify high-level dependencies in requirements.in while ensuring reproducible builds with the pinned versions in requirements.txt.
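
When a single dependency needs updating later, pip-compile can re-pin just that package while leaving every other pin untouched:

# Bump only flask; all other pinned versions stay put
pip-compile --upgrade-package flask requirements.in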

Multi-Container Applications with Docker Compose

Real-world Python applications often require multiple services, such as a web server, database, cache, and more. Docker Compose simplifies the process of defining and running multi-container applications.

Let's create a docker-compose.yml file for a Flask application with a PostgreSQL database:

version: '3'

services:
  web:
    build: .
    ports:
      - "5000:5000"
    volumes:
      - .:/app
    environment:
      - DATABASE_URL=postgresql://postgres:postgres@db/myapp
    depends_on:
      - db
  
  db:
    image: postgres:13
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_DB=myapp
    ports:
      - "5432:5432"

volumes:
  postgres_data:

This configuration defines two services:

  • web: Our Flask application, built from the local Dockerfile
  • db: A PostgreSQL database using the official postgres:13 image

The depends_on directive ensures that the database container starts before the web container; note that it only orders startup and does not wait for PostgreSQL to be ready to accept connections, so your application should retry its initial database connection. The volumes section creates a named volume for the PostgreSQL data, ensuring that data persists even if the container is removed.
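
To make the wiring concrete, here is a minimal sketch of how the web service might consume DATABASE_URL, assuming sqlalchemy and psycopg2-binary are added to requirements.txt (the file name is hypothetical):

# db_check.py (hypothetical helper)
import os
from sqlalchemy import create_engine, text

# Docker Compose injects DATABASE_URL via the environment section above
engine = create_engine(os.environ["DATABASE_URL"])

with engine.connect() as conn:
    # A trivial query to confirm the database is reachable
    print(conn.execute(text("SELECT 1")).scalar())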

To start all services defined in docker-compose.yml:

docker-compose up

To run in detached mode (in the background):

docker-compose up -d

To stop all services:

docker-compose down

To rebuild images before starting the services:

docker-compose up --build
