The fundamental challenge in software development is ensuring that an application works reliably across different environments.
- The Development Phase: A developer writes code, connects to a database (e.g., MySQL or Postgres), sets up a web server, and configures numerous settings. On their local machine, the application works perfectly.
- The Handoff Challenge: The problem begins when the project is shared.
  - To Colleagues: Team members must replicate the entire setup perfectly. This involves installing the correct versions of all software and matching every configuration detail.
  - To the Testing Team: The testing team faces the same challenge. They need to build the exact same environment to validate the application. Any mismatch can lead to failures, which are sent back to the developer.
  - To the Operations (Ops) Team: For production deployment, the ops team must recreate the setup on powerful servers, which often run a different operating system (e.g., Linux) and different hardware than the developer's laptop.
- Why It Fails: An application can fail in a new environment for many reasons:
  - Mismatched Dependencies: Different versions of libraries, databases, or web servers.
  - OS Incompatibility: Features that work on the developer's OS (e.g., Windows) might not work on the server's OS (e.g., Linux).
  - Configuration Errors: The manual process of setting up the environment is complex and prone to error.
This leads to the classic developer response: "But it works on my machine!" The core issue is that we are only shipping the application code, not the environment it needs to run. The ideal solution would be to package the application with its entire configured environment, but simply copying a developer's hard drive is not a feasible or effective solution.
Virtualization offers a solution to this problem by allowing you to run multiple operating systems on a single physical machine. To understand it, first consider the traditional stack:

- Hardware: The physical components (CPU, RAM, motherboard, etc.).
- Operating System (OS): Software that manages the hardware and provides services for applications (e.g., Windows, macOS, Linux).
- Applications (Apps): Software that users interact with.

The flow is: User -> App -> OS -> Hardware.
With virtualization, the stack looks like this:

- Hardware: Your physical machine.
- Host OS: The main operating system on your hardware (e.g., Windows).
- Hypervisor: Special software (e.g., VMware, VirtualBox) that creates and manages virtual machines.
- Virtual Hardware: The simulated hardware environment created by the hypervisor.
- Guest OS: A second, complete OS installed on the virtual hardware (e.g., Ubuntu Linux).
- Your Application: Runs inside the Guest OS.
Instead of just sharing application code, you share the entire Guest OS as a single file called an image. This ensures the application runs in the exact same environment on any machine with a hypervisor.
Virtualization also enables:

- Server Isolation: Run multiple, isolated applications on one physical server, each within its own Guest OS (virtual machine).
- Resource Utilization: Efficiently partition a single powerful server into many smaller virtual servers.
While powerful, virtualization has significant drawbacks:
- High Resource Consumption: Running a full second OS consumes a large amount of CPU and RAM. Running multiple applications means running multiple Guest OSes, which is very heavy.
- Large Size: OS images are very large (often gigabytes).
- Slow Startup: Booting an entire Guest OS takes time.
- Licensing Costs: You may need to pay for licenses for both the Host OS and the Guest OS.
Containerization solves the problems of virtualization by being far more lightweight and efficient. It packages an application with all its dependencies into a single, portable unit called a container.
Think of a shipping container. Goods are packed into a standard-sized container at the factory. This same container is then moved via truck, ship, and another truck to the final destination without ever being unpacked and repacked. The container standardizes transport.
Software containers do the same thing: the application is "packed" into a container on the developer's machine, and that exact same container is moved to testing, production, and the cloud.
Containerization eliminates the need for a Guest OS. Instead, it uses a Container Engine (like Docker).
- Hardware: Your physical machine.
- Host OS: The main operating system (e.g., Windows, Linux).
- Container Engine (e.g., Docker): Software that creates and runs containers.
- Containers: Each container includes your application and its dependencies (libraries, compilers, etc.). Critically, all containers share the Host OS's kernel.
This design gives containers several advantages:

- Lightweight & Fast: Containers don't include a Guest OS, so they are much smaller (megabytes instead of gigabytes) and start almost instantly.
- Efficient: Since there is only one OS, you can run many more containers on a single machine than virtual machines, saving CPU and RAM.
- Isolation: Containers are isolated from each other, providing security without the overhead of a full OS.
- Portability: A container created on a developer's machine runs identically on any other machine (testing, production, cloud) that has a container engine installed.
- Consistency: It completely solves the "it works on my machine" problem.
- Scalability: You can easily create multiple instances of the same container to handle increased load.
Docker is a platform and a set of tools designed to make it easy to create, deploy, and run applications using containers. It provides everything needed to achieve containerization.
- Docker Engine: The core background service that creates, manages, and runs containers. Users interact with the engine through a command-line interface (CLI).
- Image: A lightweight, standalone, executable package that includes everything needed to run a piece of software: the code, a runtime, libraries, environment variables, and config files. It's a read-only template or blueprint.
- Container: A runnable instance of an image. You can create, start, stop, move, or delete a container. It's the "running" version of the blueprint.
- Dockerfile: A text file containing the commands and instructions that Docker uses to build a custom image automatically. You specify the base OS, dependencies, application code, and configuration here.
- Docker Hub: A cloud-based registry service (like GitHub or an app store) where you can find and share container images. Many official images for popular software (like MongoDB, Tomcat, Ubuntu) are available here.
- Networking: Docker provides its own networking components so containers can communicate with each other in an isolated environment, and lets you "expose" ports to connect applications on your host machine to applications inside a container (see the short example after this list).
- Volumes: The preferred mechanism for persisting data generated by and used by Docker containers. They allow data to outlive the container that created it.
- Docker Compose: A tool for defining and running multi-container Docker applications. With a single command, you can start and connect all the services (e.g., a web server, a database, a caching service) required for your application.
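As a quick taste of the networking component, here is a minimal sketch; `nginx` is just an arbitrary official image used for illustration:

```
# List the networks Docker creates by default (bridge, host, none)
docker network ls

# Run a container in the background (-d) and publish container port 80
# on host port 8080, so http://localhost:8080 reaches the container
docker run -d -p 8080:80 nginx
```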
Getting started with Docker involves installing Docker Desktop, a graphical application that includes the Docker Engine, CLI, and other tools.
Before starting, you can check whether Docker is already installed by opening a terminal (Command Prompt, PowerShell, or Terminal) and running:

```
docker version
```

If it's not installed, you will see an error like "docker is not recognized...".
- Go to the official Docker website: Search for "Docker download" or go directly to docker.com.
- Choose your OS: The website provides download links for Docker Desktop for Windows, Mac (Intel and Apple Silicon), and Linux.
- Installation:
  - Windows/Mac: Download the installer and follow the straightforward on-screen instructions.
  - Linux: The website provides a series of commands to run in your terminal to install Docker.
- System Requirements: Docker can be resource-intensive. For a smooth experience on Windows, 16GB of RAM is recommended, though it can run on 8GB.
If you don't want to or cannot install Docker on your local machine, you can use Play with Docker.
- It's a free, browser-based environment that gives you access to a Docker instance.
- You need a Docker Hub account to log in.
- Limitation: Sessions are temporary and typically last 4 hours, making it great for quick tests but not for long-term projects.
- Restart your terminal: After installation is complete, close and reopen your command line/terminal.
- Verify again: Run `docker version` again. This time, it should successfully display the version numbers for the Docker client and engine.
- Launch Docker Desktop: Open the Docker Desktop application. It shows a dashboard where you can manage your containers, images, and volumes.
- Pull your first image: You can use the Docker Desktop search bar or the command line to download (or "pull") an image from Docker Hub. For example, to get the lightweight Ubuntu image, run:

  ```
  docker pull ubuntu
  ```

  The image will now appear in the "Images" tab of Docker Desktop, ready to be run as a container.
Once Docker is installed, you can interact with it through the Docker Desktop application or, more commonly, through the command line.
- Image: A lightweight, read-only template (blueprint) containing the instructions to create a container.
- Container: A running instance of an image; the actual, working software. Because a container is a live process, it is heavier than the image it was created from.
Docker Hub is the default public registry for Docker images. You can search for images directly on the hub.docker.com website or through the Docker Desktop UI.
Images on Docker Hub fall into three trust levels:

- Docker Official Image: The most trusted images, maintained by Docker.
- Verified Publisher: Images from trusted third-party vendors.
- Community Images: Images created and shared by the general public.

It's best to use official or verified images whenever possible.
The easiest way to start is with the `hello-world` image. This is typically done from the command line (Terminal on Mac/Linux, CMD/PowerShell on Windows).

```
docker run hello-world
```
When you execute this command, Docker performs the following steps:
- Checks locally: It looks for the `hello-world:latest` image on your local machine.
- Pulls from Docker Hub: If it can't find the image locally, it automatically downloads (pulls) it from Docker Hub.
- Creates and runs a container: It creates a new container from the image, runs it, and displays the "Hello from Docker!" message.
- Exits: The `hello-world` container's job is just to print a message, so it stops and exits immediately afterward.
- List all local images:

  ```
  docker images
  ```

  This shows all the images you have downloaded, including their repository name, tag (version), image ID, and size.

- List running containers:

  ```
  docker ps
  ```

  This shows only the containers that are currently active and running.

- List all containers (running and stopped):

  ```
  docker ps -a
  ```

  This is useful for seeing containers that have completed their task and exited, like `hello-world`. The output includes the container ID, the image it was created from, its status, and a randomly assigned name (e.g., `nice_borg`).
This section covers the essential commands for managing the full lifecycle of your Docker images and containers.
To see a list of all available Docker commands, you can always use the help command:

```
docker help
```
To keep your system clean, you'll often need to remove old containers and images.
- Remove a container:

  ```
  docker rm <container_id_or_name>
  ```

  You can use the full ID or just the first few unique characters.

- Remove an image:

  ```
  docker rmi <image_id>
  ```

- The golden rule: You cannot remove an image while a container (even a stopped one) created from it still exists. You must remove the dependent container(s) first; if you try anyway, Docker shows an error. A typical cleanup sequence is sketched below.
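A minimal cleanup sketch, assuming a stopped `hello-world` container is still present (the container ID here is hypothetical):

```
# This fails while a container still references the image;
# Docker reports that the image is in use
docker rmi hello-world

# Find the dependent container's ID, remove it, then remove the image
docker ps -a
docker rm 3f2a1b        # hypothetical short container ID
docker rmi hello-world
```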
While `docker run` is a convenient shortcut, it's helpful to understand the individual steps it combines.

- Search for an image on Docker Hub:

  ```
  docker search <image_name>   # Example: docker search hello-world
  ```

- Pull (download) the image from Docker Hub to your local machine:

  ```
  docker pull <image_name>     # Example: docker pull hello-world
  ```

  After this step, `docker images` shows the new image, but no container has been created yet.

- Create a container from the image. This prepares the container but does not start it:

  ```
  docker create <image_name>   # Example: docker create hello-world
  ```

  This command outputs a long container ID. Now `docker ps -a` shows the newly created (but stopped) container.

- Start the container to run it:

  ```
  docker start <container_id_or_name>   # Example: docker start b1d... (the first few characters of the ID suffice)
  ```

- Stop a running container:

  ```
  docker stop <container_id_or_name>
  ```

- Pause a running container (this suspends all processes within the container without stopping it; you cannot pause a container that is already stopped):

  ```
  docker pause <container_id_or_name>
  ```
Remember, the `docker run` command performs the `pull` (if necessary), `create`, and `start` steps all in one go.
To understand how Docker works, it's essential to know its client-server architecture.
- Docker Client: The primary way you interact with Docker. When you type commands like `docker run` or `docker pull` into your terminal, you are using the Docker client.
- Docker Daemon (dockerd): The Docker engine, a persistent background process that listens for API requests from the Docker client. The daemon does all the heavy lifting: building, running, and managing your containers. It also manages Docker objects such as images, networks, and volumes.
- Registry: A remote repository where Docker images are stored. Docker Hub is the default public registry, but companies can also host their own private registries.
A typical request flows like this:

- A user issues a command to the Docker Client (e.g., `docker pull ubuntu`).
- The Docker Client sends this command to the Docker Daemon via a REST API.
- The Docker Daemon receives and processes the request.
  - If the command is to `pull` an image, the daemon connects to the Registry (e.g., Docker Hub), finds the image, and downloads it to the local machine.
  - If the command is to `run` a container, the daemon uses a local image to create and start the container, managing its network and storage volumes.

The daemon is responsible for the entire lifecycle of Docker objects.
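You can see this client-server split for yourself; `docker version` reports the client and the server (daemon) as separate components:

```
# Prints a "Client" section and a "Server" (engine) section,
# each with its own version information
docker version

# Broader daemon-side details: storage driver, container/image counts, etc.
docker info
```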
To run a Java application, you need a Java Development Kit (JDK). Instead of installing it on your host machine, you can run it inside a container, bundling your application with its runtime.
- Search Docker Hub: Look for official Java images. A popular and well-maintained option is `openjdk`:

  ```
  docker search openjdk
  ```

- Check the architecture: Images are built for specific CPU architectures (e.g., `amd64` for most PCs, `arm64` for Apple Silicon). Docker Hub lists the supported architectures for each image tag, and it's crucial to pick one that matches your machine; you can also check a local image as shown below.
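Once an image is pulled, a quick sketch of how you might confirm its OS and architecture from the command line:

```
# Print the OS and CPU architecture the local image was built for,
# e.g. linux/arm64 on Apple Silicon
docker image inspect openjdk:22-jdk --format '{{.Os}}/{{.Architecture}}'
```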
Once you've found a suitable image and tag, pull it to your local machine:

```
docker pull openjdk:22-jdk
```

This command downloads the OpenJDK 22 JDK image.
To use the tools inside the JDK image, like `jshell` (an interactive Java REPL), you need to run the container in interactive mode:

```
docker run -it openjdk:22-jdk
```

- `-i` (interactive): Keeps STDIN open even if not attached.
- `-t` (tty): Allocates a pseudo-TTY, which connects your terminal to the container's input/output.

This command starts the container and drops you directly into a `jshell` prompt. You can now execute Java code inside the container:

```
jshell> int num = 9;
num ==> 9

jshell> System.out.println("Hello from a container!");
Hello from a container!
```
This demonstrates that you have a fully functional JDK environment running in an isolated container, ready for you to add your own application code.
The goal is to run our own custom application inside a container. Here, we'll create a simple Spring Boot web application and package it into an executable JAR file, making it ready for containerization.
The easiest way to start is with the Spring Initializr (start.spring.io):

- Project: Maven
- Language: Java
- Spring Boot Version: 3.1.5 (or any stable version)
- Project Metadata:
  - Group: `com.telescope`
  - Artifact: `rest-demo`
- Packaging: Jar
- Dependencies: Add `Spring Web` to build a web application.

Click "Generate" to download the project zip file.
After unzipping and opening the project in your IDE (e.g., IntelliJ), create a simple controller to handle web requests.
- Create a new Java class named `HelloController`.
- Add the following code:

  ```java
  import org.springframework.web.bind.annotation.RequestMapping;
  import org.springframework.web.bind.annotation.RestController;

  @RestController
  public class HelloController {

      @RequestMapping("/")
      public String greet() {
          return "Hello World!";
      }
  }
  ```

- `@RestController`: Marks this class as a controller where every method returns a domain object (written straight to the response body) instead of a view.
- `@RequestMapping("/")`: Maps HTTP requests for the root URL (`/`) to the `greet()` method.
Before packaging, make sure the application works:

- Run the main application class from your IDE.
- Port conflict: If the default port 8080 is in use, you'll see an error. You can change the port by adding the following line to `src/main/resources/application.properties`:

  ```
  server.port=8081
  ```

- Restart the application, then open a web browser and navigate to http://localhost:8081. You should see "Hello World!".
To run the application standalone, you need to package it into an executable JAR file.
- Since this is a Maven project, you can use the bundled Maven wrapper to create the package.
- Open a terminal in the root directory of your project and run:

  ```
  ./mvnw package      # on Windows: mvnw.cmd package
  ```

  (If you have Maven installed globally, `mvn package` works too.)

- This command compiles your code, runs the tests, and packages everything into a single JAR file in the `target/` directory (e.g., `target/rest-demo-0.0.1-SNAPSHOT.jar`).
You can now run this JAR file from your terminal without needing an IDE:

- Make sure you've stopped the application running in your IDE.
- In your terminal, run the following command (adjust the JAR file name if needed):

  ```
  java -jar target/rest-demo-0.0.1-SNAPSHOT.jar
  ```

- The application starts up again. Verify it's working by visiting http://localhost:8081 in your browser.
Now that we have a self-contained, executable JAR, the next step is to get this file into our JDK container and run it there.
Running multiple commands manually to copy files and commit changes into an image is tedious and doesn't scale. The standard, automated way to create a custom Docker image is a `Dockerfile`: a text document that contains, in order, all the commands needed to build a given image.

In the root directory of your Spring Boot project, create a new file named `Dockerfile` (no file extension) with the following content. Each instruction creates a new layer in the image.
```dockerfile
# Start with a base image that has Java installed
FROM openjdk:22-jdk

# Add the compiled JAR file from our host machine into the image
# Format: ADD <source_on_host> <destination_in_image>
ADD target/rest-demo-0.0.1-SNAPSHOT.jar rest-demo.jar

# Tell Docker what command to run when a container from this image starts
ENTRYPOINT ["java", "-jar", "rest-demo.jar"]
```
- `FROM`: Specifies the base image to build upon. We use the `openjdk` image because our application needs a Java runtime.
- `ADD`: Copies files from a source on the host machine to a destination inside the image. Here we copy our application's JAR file into the root directory of the image and rename it for simplicity.
- `ENTRYPOINT`: Configures the container to run as an executable; this is the command executed when the container starts.
Now use the `docker build` command to create your custom image. Run it from the root directory of your project:

```
docker build -t rest-demo:v3 .
```
- `docker build`: The command to build an image from a Dockerfile.
- `-t rest-demo:v3`: The `-t` flag tags the image with a name and an optional version tag. Here the name is `rest-demo` and the tag is `v3`.
- `.`: The final dot specifies the build context: the set of files at the given path. `.` means the current directory. Docker needs this context to find the `Dockerfile` and any files you `ADD` or `COPY`.
Docker executes the steps in the `Dockerfile`, using the cache for steps that haven't changed, and outputs a new image. You can verify this by running `docker images`.
Finally, run a container from your newly created image:

```
docker run -p 8081:8081 rest-demo:v3
```

- `-p 8081:8081`: Maps ports. Our Spring Boot app listens on port 8081 inside the container; this flag publishes that internal port as port 8081 on the host machine.
- `rest-demo:v3`: The name and tag of the image we want to run.

The container starts, the `ENTRYPOINT` command executes, and your Spring Boot application runs. You can now go to http://localhost:8081 in your browser and see "Hello World!" served from your containerized application.
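If you prefer the terminal, a quick sanity check from a second shell (assuming curl is installed):

```
# Should print: Hello World!
curl http://localhost:8081
```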
Real-world applications often require multiple services working together, such as a web application and a database. This section covers building a Spring Boot application that connects to a PostgreSQL database, setting the stage for managing it with Docker Compose.
Use the Spring Initializr (start.spring.io) with the following settings:

- Project: Maven
- Language: Java
- Artifact: `student-app`
- Packaging: Jar
- Dependencies:
  - `Spring Web` (for the REST controller)
  - `Spring Data JPA` (to easily interact with the database)
  - `PostgreSQL Driver` (the JDBC driver for our database)
- Entity: Define a `Student` class annotated with `@Entity`, `@Id`, and `@GeneratedValue`. This class represents a row in your database table. It should have fields like `id`, `name`, and `age`, along with constructors, getters, and setters.
- Repository: Create a `StudentRepo` interface that extends `JpaRepository<Student, Integer>` and annotate it with `@Repository`. This gives you the standard database methods for free. (A minimal sketch of both follows.)
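A minimal sketch of the entity and repository, assuming the field names above (in practice these live in separate files; they are shown together here for brevity):

```java
import jakarta.persistence.Entity;
import jakarta.persistence.GeneratedValue;
import jakarta.persistence.Id;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.stereotype.Repository;

@Entity
public class Student {

    @Id
    @GeneratedValue                      // let the database generate the key
    private int id;
    private String name;
    private int age;

    public Student() {}                  // JPA needs a no-args constructor

    public int getId() { return id; }
    public void setId(int id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }
}

@Repository
interface StudentRepo extends JpaRepository<Student, Integer> {
    // findAll(), save(), deleteById(), ... are inherited from JpaRepository
}
```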
Create a `StudentController` that uses the repository to fetch data:

```java
import java.util.List;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class StudentController {

    @Autowired
    private StudentRepo repo;

    @GetMapping("getStudents")
    public List<Student> getStudents() {
        return repo.findAll();
    }
}
```
In `src/main/resources/application.properties`, configure the database connection for local testing:

```
# Server port
server.port=8090

# Datasource properties for local testing
spring.datasource.url=jdbc:postgresql://localhost:5432/studentdb
spring.datasource.username=postgres
spring.datasource.password=your_password

# JPA properties
spring.jpa.hibernate.ddl-auto=update
spring.jpa.show-sql=true
spring.sql.init.mode=always
```
- `ddl-auto=update`: Hibernate updates the database schema based on your entities.
- `init.mode=always`: Ensures `data.sql` is run on every startup.
- Create a `data.sql` file in `src/main/resources` with some `INSERT` statements to populate the database on startup (a sample follows this list).
- Run the application. If your local PostgreSQL server is running correctly, you should be able to navigate to http://localhost:8090/getStudents and see the data.
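A sample `data.sql`, assuming the `student` table and columns generated from the entity above (the ids, names, and ages are arbitrary):

```sql
INSERT INTO student (id, name, age) VALUES (1, 'Alice', 21);
INSERT INTO student (id, name, age) VALUES (2, 'Bob', 23);
```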
With a working application that depends on a separate database, we're now ready to define and run both services together using Docker Compose.
When your application consists of multiple services (like an app and a database), you need a way to define, run, and link them together. This is the job of Docker Compose. It uses a YAML file to configure all of your application's services.
- Package: Make sure your `student-app` is packaged into a JAR file. You can configure the `pom.xml` to produce a simpler final name (e.g., `student-app.jar`) and run `mvn clean package`.
- Dockerfile: Create a `Dockerfile` in the root of the `student-app` project. This file is responsible only for building the application image:

  ```dockerfile
  FROM openjdk:22-jdk
  ADD target/student-app.jar student-app.jar
  ENTRYPOINT ["java", "-jar", "student-app.jar"]
  ```
In the root of your project, create a file named `docker-compose.yml`. This file defines our two services: `app` and `postgres`.
```yaml
version: '3.8'        # Specifies the Compose file format version

services:             # Defines all the services (containers) that make up your app

  # The definition for our Spring Boot application container
  app:
    build: .          # Build an image from the Dockerfile in the current directory
    ports:
      - "8090:8090"   # Map port 8090 on the host to port 8090 in the container

  # The definition for our PostgreSQL database container
  postgres:
    image: postgres:latest   # Use the official postgres image from Docker Hub
    ports:
      - "5433:5432"   # Map port 5433 on the host to port 5432 in the container
    environment:
      - POSTGRES_USER=naveen
      - POSTGRES_PASSWORD=1234
      - POSTGRES_DB=studentdb
```
Instead of `docker run`, you now use a single `docker-compose` command from your terminal, in the same directory as the `docker-compose.yml` file:

```
docker-compose up --build
```
- `docker-compose up`: Starts (and creates, if necessary) all the services defined in the `docker-compose.yml` file.
- `--build`: Tells Compose to build the `app` image from its `Dockerfile` before starting the service.
When you run this command, both containers start. However, the `app` container quickly fails and exits. Checking the logs (`docker-compose logs app`), you will see a `Connection refused` error.

Why? Docker Compose creates a network for your application and attaches each service to it, but the containers are still isolated from each other's loopback interfaces. Your application, configured to connect to `localhost:5432`, is looking for a database on its own `localhost`, not inside the `postgres` container.
To fix this, we need to make the application aware of the database service on the Docker network and ensure the services start in the correct order.
To fix the `Connection refused` error, we need to enable communication between the `app` container and the `postgres` container. This involves two key steps: updating the application's connection string and defining a shared network in Docker Compose.

Inside a Docker network, containers can refer to each other by their service name; `localhost` in the `app` container refers to the container itself, not to `postgres`. Modify `src/main/resources/application.properties` to use the service name (`postgres`) as the hostname:

```
# Old connection string
# spring.datasource.url=jdbc:postgresql://localhost:5432/studentdb

# New connection string for Docker networking
spring.datasource.url=jdbc:postgresql://postgres:5432/studentdb

# Also, update the credentials to match the docker-compose environment variables
spring.datasource.username=naveen
spring.datasource.password=1234
```

Docker Compose provides internal DNS resolution, so the `app` container can find the `postgres` container by its service name.
While Docker Compose creates a default network, explicitly defining one gives you more control and makes the configuration clearer.
Update your `docker-compose.yml` to create a network and attach both services to it:
```yaml
version: '3.8'

services:
  app:
    build: .
    ports:
      - "8090:8090"        # Host port : container port
    networks:              # Attach this service to 's-network'
      - s-network

  postgres:
    image: postgres:latest
    ports:
      - "5433:5432"
    environment:
      - POSTGRES_USER=naveen
      - POSTGRES_PASSWORD=1234
      - POSTGRES_DB=studentdb
    networks:              # Attach this service to the same 's-network'
      - s-network

# Top-level key to define networks
networks:
  s-network:               # Name of our custom network
    driver: bridge         # The default driver for single-host networking
```
- Stop the currently running containers:

  ```
  docker-compose down
  ```

- Rebuild and start the services with the new configuration:

  ```
  docker-compose up --build
  ```
With these changes, the `app` container can successfully connect to the `postgres` container over the shared `s-network`. You can visit http://localhost:8090/getStudents and see the data being served from the containerized database.
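Or from the terminal (assuming curl and the sample rows from `data.sql`):

```
# Should print the student rows as JSON
curl http://localhost:8090/getStudents
```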
Our application now works, but the data inside the PostgreSQL container is ephemeral. By default, when a container is removed, its entire filesystem is deleted with it. This means that if we run `docker-compose down`, all the data our application has saved to the database is lost.
To solve this, we need to store the data outside the container in a persistent location. The recommended way to do this in Docker is with Volumes. A volume is a storage mechanism managed by Docker that exists independently of a container's lifecycle.
We can create a named volume and mount it at a specific directory inside the `postgres` container. The official PostgreSQL image stores its data in `/var/lib/postgresql/data`. By mounting a volume at this path, we tell Docker to store all the database files in our named volume on the host machine instead of in the container's temporary filesystem.
First, declare a named volume at the top level of the `docker-compose.yml` file; let's call it `db-data`. Next, attach this named volume to the `postgres` service at the data directory. Update your `docker-compose.yml` with the `volumes` keys:
```yaml
version: '3.8'

services:
  app:
    build: .
    ports:
      - "8090:8090"
    networks:
      - s-network
    # Make sure app starts after the database container
    depends_on:
      - postgres

  postgres:
    image: postgres:latest
    ports:
      - "5433:5432"
    environment:
      - POSTGRES_USER=naveen
      - POSTGRES_PASSWORD=1234
      - POSTGRES_DB=studentdb
    networks:
      - s-network
    # Mount the named volume at the container's data directory
    volumes:
      - db-data:/var/lib/postgresql/data

networks:
  s-network:
    driver: bridge

# Top-level key to define a named volume
volumes:
  db-data:
```
- `volumes: db-data` (top level): Creates a named volume called `db-data`. Docker manages this volume's storage location on the host machine.
- `volumes: - db-data:/var/lib/postgresql/data` (under the `postgres` service): Mounts the `db-data` volume at the `/var/lib/postgresql/data` directory inside the container.
- `depends_on`: Ensures Docker Compose starts the `postgres` container before the `app` container. Note that it only waits for the container to start, not for PostgreSQL to be ready to accept connections; a healthcheck (sketched below) closes that gap.
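If startup races still cause connection errors, a hedged sketch of the common healthcheck pattern, shown as a fragment to merge into the file above (the interval values are arbitrary; `pg_isready` ships with the official postgres image):

```yaml
services:
  postgres:
    image: postgres:latest
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U naveen -d studentdb"]
      interval: 5s
      timeout: 3s
      retries: 10
  app:
    build: .
    depends_on:
      postgres:
        condition: service_healthy   # wait until the healthcheck passes
```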
- Run `docker-compose up --build` to start your application.
- Use the application to add, update, or delete some data.
- Bring the services down with `docker-compose down`. This stops and removes the containers.
- Bring the services back up with `docker-compose up`.
This time, when the new `postgres` container starts, Docker re-attaches the existing `db-data` volume. The database will be in the exact same state as before you brought it down, with all your changes preserved. Your data is now persistent.
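You can also confirm the volume survives from the Docker CLI:

```
# List volumes; Compose prefixes the project name, e.g. <project>_db-data
docker volume ls

# Show where Docker stores the volume's data on the host
docker volume inspect <project>_db-data
```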
container starts, Docker will re-attach the existing db-data
volume. Your database will be in the exact same state as before you brought it down, with all your changes preserved. Your data is now persistent.