Binary file added Assets/Buildah-logo.png
Binary file added Assets/Kubernetes-logo.png
Binary file added Assets/Podman-logo.jpg
Binary file added Assets/Skopeo-logo.png
Binary file added Assets/containerization.png
19 changes: 19 additions & 0 deletions Assets/docker-logo.svg
51 changes: 51 additions & 0 deletions Concepts/Containerization/Container Lifecycle.md
@@ -0,0 +1,51 @@
---
aliases:
- Concepts/Container Lifecycle
- Container Lifecycle
tags:
- seedling
publish: false
---

The container lifecycle describes the stages a container goes through: from creation and initialization, through running, pausing, and stopping, to eventual removal.

*All example code below uses Docker as the runtime environment.*

## Creation

`docker create --name my_container my-image:latest`

The container has been defined from an image and assigned resources but is not yet running. Docker (or another runtime) initializes the container metadata, filesystem, and environment.

## Running

`docker start my_container`, or `docker run` to create and start a container in one step.

The container is executing its application or command. To list running containers, use `docker ps`; to include stopped containers as well:

`docker ps -a`

## Paused

`docker pause my_container`

The container process is temporarily suspended. To unpause a container:

`docker unpause my_container`

## Exit

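To manually stop a container: `docker stop my_container` (sends `SIGTERM`, then `SIGKILL` if the container has not exited after a grace period) or `docker kill my_container` (sends `SIGKILL` immediately by default).

The container has completed execution or has been manually stopped. An exited container still exists on the host until it is removed. Alternatively, you can use:

`docker rm -f my_container` to stop a running container (if needed) and remove it in one step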

Documentation for these commands can be found at [Docker Docs - CLI commands](https://docs.docker.com/reference/cli/docker/)
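
The stages above can be read as a small state machine. The following is a minimal Python sketch of that idea, not Docker's implementation: the `State` names, the `TRANSITIONS` table, and the `apply` and `replay` helpers are all hypothetical, and the table is deliberately simplified (for example, `REMOVED` is not a status Docker reports, and this toy model only allows `rm` on a container that is not running).

```python
from enum import Enum
from typing import Iterable


class State(Enum):
    CREATED = "created"
    RUNNING = "running"
    PAUSED = "paused"
    EXITED = "exited"
    REMOVED = "removed"  # illustrative only: a removed container no longer exists


# Allowed transitions, keyed by (current state, command name).
# Simplified on purpose; the real CLI accepts more combinations.
TRANSITIONS = {
    (State.CREATED, "start"): State.RUNNING,
    (State.RUNNING, "pause"): State.PAUSED,
    (State.PAUSED, "unpause"): State.RUNNING,
    (State.RUNNING, "stop"): State.EXITED,
    (State.RUNNING, "kill"): State.EXITED,
    (State.EXITED, "start"): State.RUNNING,
    (State.CREATED, "rm"): State.REMOVED,
    (State.EXITED, "rm"): State.REMOVED,
}


def apply(state: State, command: str) -> State:
    """Return the next state, or raise if the command is not valid in this state."""
    try:
        return TRANSITIONS[(state, command)]
    except KeyError:
        raise ValueError(
            f"cannot {command!r} a container that is {state.value}"
        ) from None


def replay(commands: Iterable[str]) -> State:
    """Apply a sequence of commands to a freshly created container."""
    state = State.CREATED
    for command in commands:
        state = apply(state, command)
    return state
```

For example, `replay(["start", "pause", "unpause", "stop", "rm"])` walks through every stage described above and ends in `State.REMOVED`, while `apply(State.PAUSED, "start")` raises a `ValueError` because the model has no such transition.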

%% wiki footer: Please don't edit anything below this line %%

## This note in GitHub

<span class="git-footer">[Edit In GitHub](https://github.dev/data-engineering-community/data-engineering-wiki/blob/main/Concepts/Container%20Lifecycle.md "git-hub-edit-note") | [Copy this note](https://raw.githubusercontent.com/data-engineering-community/data-engineering-wiki/main/Concepts/Container%20Lifecycle.md "git-hub-copy-note")</span>

<span class="git-footer">Was this page helpful?
[👍](https://tally.so/r/mOaxjk?rating=Yes&url=https://dataengineering.wiki/Concepts/Container%20Lifecycle) or [👎](https://tally.so/r/mOaxjk?rating=No&url=https://dataengineering.wiki/Concepts/Container%20Lifecycle)</span>
76 changes: 76 additions & 0 deletions Concepts/Software Engineering/Containerization.md
@@ -0,0 +1,76 @@
---
aliases:
- Concepts/Containerization
- Containerization
tags:
- incubating
publish: false
---

Containerization is a form of operating-system-level **virtualization** that packages software code together with its operating system libraries and dependencies into a lightweight executable unit called a **Container**.
![[containerization.png| 550]]

## Containerization Advantages

- **Portability**: By packaging the code together with its dependencies and OS libraries, containerization addresses the "but it works on my machine" problem, allowing the containerized software to run consistently across platforms.
- **Efficiency**: Containers share the host machine's OS kernel instead of bundling a full guest operating system, so they are smaller and start faster than virtual machines.
- **Faster deployment**: A containerized application can be deployed easily and scaled rapidly thanks to its portability and efficiency.
- **Security**: Isolation between containers limits how far a compromised or misbehaving application can affect other applications on the same host.
- **Microservices architecture**: Enables the development of modular, independently deployable services by using **Containers** as the deployment unit.
- **Automation workflow**: Containerization integrates with **CI/CD** workflows and their tooling to create an automated, consistent pipeline for building, testing, and deploying applications.

## Containerization Disadvantages

- **Shared OS kernel**: Containers share the host's OS kernel, so a vulnerability in the kernel can potentially affect every container running on that host.
- **More security concerns**: Containerization requires many components to function, such as **Containers**, **Container Images**, registries, and supporting services. Each of these can be a target for exploits and attacks.
- **Increased complexity**: Creating and managing **Containers** is a difficult task that requires deep and broad system knowledge.
- **Compatibility**: Containerized applications may face compatibility issues when interacting with legacy systems.
- **Requires effort**: Adopting containerization requires a significant amount of time and effort to learn and to apply to real-world systems.

## Container

A container is a lightweight, portable unit that packages an application along with its dependencies and runs it in isolation using the host system's OS kernel. Containers ensure consistency across environments, making them ideal for scalable and reproducible deployments.

## Container Image

A container image is a read-only template used to create containers, containing all the necessary code, libraries, configurations, and dependencies. It serves as the blueprint for running containers and can be stored, shared, and versioned through container registries.
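
One way to picture the relationship between a read-only image and the containers created from it is a stack of lookups. The sketch below is a toy model, not how any runtime stores files: it assumes each container gets its own writable layer on top of shared read-only image layers, and the layer contents, `IMAGE_LAYERS`, and `create_container` are hypothetical names used only for illustration.

```python
from collections import ChainMap
from types import MappingProxyType

# Hypothetical image made of two read-only layers (base first).
# MappingProxyType gives a read-only view, standing in for an immutable layer.
base_layer = MappingProxyType({"/etc/os-release": "debian"})
app_layer = MappingProxyType({"/app/main.py": "print('v1')"})
IMAGE_LAYERS = [base_layer, app_layer]


def create_container(image_layers):
    """Return a filesystem view: a fresh writable layer over the shared image layers."""
    writable = {}
    # ChainMap searches its maps left to right and writes only to the first one,
    # so the writable layer goes first, followed by image layers from top to bottom.
    return ChainMap(writable, *reversed(image_layers))


container_a = create_container(IMAGE_LAYERS)
container_b = create_container(IMAGE_LAYERS)

# A write in one container lands in its own writable layer only.
container_a["/app/main.py"] = "print('patched')"
```

After the write, `container_a` sees the patched file, while `container_b` and the image layer itself still hold the original content. That is the sense in which one image can serve as the blueprint for many independent containers.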

## Container Orchestration

Container orchestration refers to the automated management of containerized applications, including deployment, scaling, networking, and lifecycle management. Tools like Kubernetes help coordinate these containers across clusters, ensuring reliability and high availability in production environments.
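
The "automated management" part can be illustrated with a reconciliation step: compare the desired number of containers with what is actually running, then start or stop containers to close the gap. The sketch below is a simplified illustration under that assumption, not the algorithm of any specific orchestrator; `Cluster`, `reconcile`, and the image name used in the usage note are hypothetical.

```python
from itertools import count


class Cluster:
    """Toy stand-in for a cluster that tracks which containers are running."""

    def __init__(self):
        self.running = {}  # container id -> image name
        self._ids = count()

    def start(self, image):
        container_id = f"c{next(self._ids)}"
        self.running[container_id] = image
        return container_id

    def stop(self, container_id):
        del self.running[container_id]


def reconcile(cluster, image, desired_replicas):
    """Start or stop containers until `desired_replicas` containers run `image`."""
    if desired_replicas < 0:
        raise ValueError("desired_replicas must be non-negative")
    current = [cid for cid, img in cluster.running.items() if img == image]
    # Too few running (scale-up, or a container crashed): start the difference.
    for _ in range(desired_replicas - len(current)):
        cluster.start(image)
    # Too many running (scale-down): stop the surplus.
    for container_id in current[desired_replicas:]:
        cluster.stop(container_id)
    return sum(1 for img in cluster.running.values() if img == image)
```

Calling `reconcile(cluster, "web:1.0", 3)` on an empty `Cluster` starts three containers. If one is later stopped to simulate a crash, calling it again starts a replacement, and calling it with `1` stops the surplus. Running such a step repeatedly is what keeps the actual state close to the declared one.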

## When to use Containerization

### Do's

- **Working with a microservices architecture**: If you're working with microservices, containerization is a great fit. Containers allow you to deploy, scale, and manage each microservice independently, making your application more scalable and resilient.
- **DevOps workflow**: Combined with a DevOps workflow, containerization brings consistency and speed to CI/CD and enables efficient development, testing, and deployment pipelines.
- **Complex dependencies**: Encapsulating dependencies within a **Container** allows applications to run consistently regardless of the underlying infrastructure.

### Don'ts

- **Simple application**: Using containerization for simple apps can be overkill. Direct deployment might be better in terms of management complexity and speed.
- **Legacy system adoption**: "If it works, leave it be." Refactoring a working legacy system onto newer technology carries risks and tradeoffs that need careful consideration.
- **Fear of missing out**: "Everyone is using containerization, so we need to use it too" is a misunderstanding of what containerization is for. It does provide many benefits to the development and deployment process, but when deciding whether to adopt a technology, you need to consider the bigger picture. Everything has a tradeoff, and containerization does not do everything by itself: it has to work together with the systems around it.

## Containerization Tools

- [[Docker]]: A widely used platform for building, running, and managing containerized applications with a robust ecosystem and CLI support.
- [[Podman]]: A daemonless (serviceless) container engine that offers Docker-compatible commands while supporting rootless containers for enhanced security.
- [[Kubernetes]]: An open-source orchestration system that automates the deployment, scaling, and management of containerized applications across clusters.
- [[Skopeo]]: A command-line tool for managing container images, allowing you to inspect, copy, and sign images without needing to pull them locally.
- [[Buildah]]: A tool for building Open Container Initiative (OCI) and Docker images from scratch or using **Dockerfiles**, often integrated with Podman for complete container workflows.

## Best Practices

- [Containerization best practices - Simform](https://www.simform.com/blog/containerization-best-practices)
- [Containerization best practices - Dev Communities](https://dev.to/aws-builders/the-art-of-creating-container-images-and-best-practices-3p9d)

%% wiki footer: Please don't edit anything below this line %%

## This note in GitHub

<span class="git-footer">[Edit In GitHub](https://github.dev/data-engineering-community/data-engineering-wiki/blob/main/Concepts/Software%20Engineering/Containerization.md "git-hub-edit-note") | [Copy this note](https://raw.githubusercontent.com/data-engineering-community/data-engineering-wiki/main/Concepts/Software%20Engineering/Containerization.md "git-hub-copy-note")</span>

<span class="git-footer">Was this page helpful?
[👍](https://tally.so/r/mOaxjk?rating=Yes&url=https://dataengineering.wiki/Concepts/Software%20Engineering/Containerization) or [👎](https://tally.so/r/mOaxjk?rating=No&url=https://dataengineering.wiki/Concepts/Software%20Engineering/Containerization)</span>
28 changes: 28 additions & 0 deletions Templates/Tool Template - Containerization.md
@@ -0,0 +1,28 @@
---
Aliases: []
Tags: [seedling]
publish: false
---

(optional) Logo

Brief description of the tool.

## {{title}} Official Documentation

## {{title}} Advantages

## {{title}} Disadvantages

## {{title}} Learning Resources

## {{title}} Recent Posts

%% wiki footer: Please don't edit anything below this line %%

## This note in GitHub

<span class="git-footer">[Edit In GitHub](https://github.dev/data-engineering-community/data-engineering-wiki/blob/main/Tools/Containerization/{{title}}.md "git-hub-edit-note") | [Copy this note](https://raw.githubusercontent.com/data-engineering-community/data-engineering-wiki/main/Tools/Containerization/{{title}}.md "git-hub-copy-note")</span>

<span class="git-footer">Was this page helpful?
[👍](https://tally.so/r/mOaxjk?rating=Yes&url=https://dataengineering.wiki/Tools/Containerization/{{title}}) or [👎](https://tally.so/r/mOaxjk?rating=No&url=https://dataengineering.wiki/Tools/Containerization/{{title}})</span>