## Definitions and States

### Node

A node represents a host that can run containers. It can be a physical or a virtual machine with the Docker HTTP API enabled, or even a Kubernetes cluster. In the latter case, Container Broker treats the whole Kubernetes cluster as a single node whose slots represent its pods.

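As a rough illustration, the two kinds of node could be pictured like this (a minimal sketch; the attribute names are hypothetical, not the gem's actual schema):

```ruby
# Hypothetical node attributes -- names are illustrative only.
docker_node = {
  hostname: "docker-host-01.internal",
  runner: "docker",       # talks to the host's Docker HTTP API
  status: "available"
}

kubernetes_node = {
  hostname: "k8s-api.internal",
  runner: "kubernetes",   # whole cluster acts as one node; slots map to pods
  status: "available"
}
```
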
These are the node statuses:

* `available`: Host is responding to connections and can accept new tasks.
* `unstable`: Host has not been responding to connections for a short time. No new tasks are assigned to it.
* `unavailable`: Host has not been responding to connections for a long time. All of its running tasks are moved to other nodes.

```mermaid
stateDiagram-v2
[*] --> unavailable
unavailable --> available : Becomes responsive
available --> unstable : Not responding for a short time\nNo new tasks are assigned
unstable --> unavailable : Keeps not responding for a long time\nAll tasks are moved to other nodes
unstable --> available : Becomes responsive again
```

### Slot

A slot represents a possible container that can be created on a node, limited by an execution type. This is mainly used to limit the number of tasks that can run in parallel on a single node. For example, a node can have 2 slots for the `cpu` execution type and 10 slots for the `network` execution type.

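That per-node limit could be pictured as a map from execution type to slot count, as in this minimal sketch (names are illustrative):

```ruby
# Illustrative sketch: slot counts per execution type for one node.
slots_per_execution_type = { "cpu" => 2, "network" => 10 }

# Materialize the slots: up to 2 cpu tasks and 10 network tasks in parallel.
slots = slots_per_execution_type.flat_map do |type, count|
  Array.new(count) { { execution_type: type, status: "available" } }
end
```
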
These are the slot statuses:

* `available`: No task is assigned to this slot.
* `attaching`: A task is being assigned to this slot and the container is being created.
* `running`: The container was created and started.
* `releasing`: The container finished its execution and is being removed. The logs are being fetched.

```mermaid
stateDiagram
[*] --> available
available --> attaching : Lock acquired to run a task
attaching --> available : Candidate task already picked by another slot \n or error creating the container
attaching --> running : Container created and started
running --> releasing : Container finished its execution
running --> available : Node becomes unavailable
releasing --> available : Container removed and logs fetched
```

### Task

A task represents a shell command that needs to be run in a Docker container created from a given Docker image. Its execution type is used to find a slot on a node that can run it.

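A hypothetical task, bundling the command, image, and execution type together (attribute names are illustrative, not the actual API):

```ruby
# Hypothetical task attributes -- names are illustrative only.
task = {
  name: "transcode-video-42",
  image: "ffmpeg:latest",                    # Docker image to run
  cmd: "ffmpeg -i input.mp4 output.webm",    # shell command for the container
  execution_type: "cpu",                     # matched against slots of this type
  status: "waiting"
}
```
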
These are the task statuses:

* `waiting`: Task is waiting to be assigned to a slot.
* `starting`: Task is assigned to a slot and the container is being created.
* `started`: The container has started running.
* `retry`: Container exit status was not 0 or the node stopped responding, and the retry count is below the max retries. Task is waiting to be assigned to a slot again.
* `failed`: The retry count reached the max retries and the task won't be retried automatically.
* `completed`: Container exit status was 0 and logs were fetched.
* `error`: A failed task was marked as definitive error, either manually or by a timeout (20h).

```mermaid
stateDiagram-v2
[*] --> waiting
waiting --> starting : An available slot is found
starting --> started : Container created \n and started

started --> completed : Successful \ntermination
completed --> [*]

started --> retry : Retry count \nbelow max retries
started --> failed : Retry count \nreached max retries
retry --> starting : An available slot\n is found
failed --> error : Unretryable error
failed --> starting : Manually set \nto try again
error --> [*]
```

## Jobs

These are the main jobs that run in the background and keep the system working.

### UpdateAllNodesStatusJob

Continuously updates the status of **available** nodes. It runs every 5 seconds.

For each node, the job fetches all of the node's containers and checks their statuses.
If a container has finished, its slot is marked as `releasing`: the logs are fetched and the slot is then marked as `available`. The task is marked as `completed` or `failed` depending on the exit status (or `retry`, if the retry count allows), as sketched below.

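A minimal sketch of that release step, assuming illustrative model attributes and helpers (not the gem's real internals):

```ruby
# Illustrative sketch of the release step; helpers are hypothetical.
def release_slot(slot)
  slot.status = "releasing"
  task = slot.task
  task.logs = fetch_container_logs(slot.container_id)  # hypothetical helper

  if task.exit_status.zero?
    task.status = "completed"
  elsif task.retry_count < task.max_retries
    task.retry_count += 1
    task.status = "retry"      # picked up again by the task runner job
  else
    task.status = "failed"
  end

  remove_container(slot.container_id)                  # hypothetical helper
  slot.status = "available"
end
```
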
### RunTasksForAllExecutionTypesJob

This job is triggered at startup and after each task creation or completion. It starts the execution of all unstarted tasks for each execution type.

While there are pending tasks, it finds an available slot and marks it as `attaching`. It then looks for a task in the `waiting` or `retry` state.

If no task is found, the slot is marked as `available` again. Changing the slot to `attaching` first is needed to lock it, because multiple instances of this job can run in parallel; see the sketch below.

If a task is found, the job marks it as `starting` and assigns it to the slot. It then pulls the Docker image (if it is not already present on the node), creates the container with the proper command and volume mapping, and starts it. The slot is marked as `running` and the task as `started`. From this point on, `UpdateAllNodesStatusJob` takes care of checking the task's status.

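The lock-then-look pattern could be sketched like this (illustrative Ruby with hypothetical helpers, not the actual implementation):

```ruby
# Illustrative sketch of the slot-locking pattern; helpers are hypothetical.
def run_tasks(execution_type)
  loop do
    # Claim a slot atomically first: marking it `attaching` is the lock
    # that keeps parallel instances of this job from racing over it.
    slot = atomically_claim_available_slot(execution_type)
    break unless slot                         # no free slot for this type

    task = next_pending_task(execution_type)  # in waiting or retry state
    unless task
      slot.status = "available"               # nothing to run: release the lock
      break
    end

    task.status = "starting"
    run_task_on(slot, task)  # pull image, create and start the container
  end
end
```
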
### MonitorUnresponsiveNodesJob

Continuously checks the status of **unstable** and **unavailable** nodes, which are not checked by `UpdateAllNodesStatusJob`.

If an unstable node keeps not responding for a certain amount of time (currently 2 minutes), it is marked as `unavailable`. All of its running tasks are marked as `retry` and are automatically picked up by the task runner job. If the retry count has already reached the max retries, the task is instead marked as `failed`.

If the connection succeeds, the node is marked as `available` again and goes back to being monitored by `UpdateAllNodesStatusJob`.

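A rough sketch of that check, with hypothetical helpers and the 2-minute threshold mentioned above:

```ruby
# Illustrative sketch; helpers and attributes are hypothetical.
UNSTABLE_TIMEOUT = 2 * 60  # seconds before an unstable node is given up on

def monitor_unresponsive_node(node)
  if node_responds?(node)                        # hypothetical connectivity check
    node.status = "available"                    # back to UpdateAllNodesStatusJob
  elsif node.unresponsive_for > UNSTABLE_TIMEOUT
    node.status = "unavailable"
    node.running_tasks.each do |task|
      task.status = task.retry_count < task.max_retries ? "retry" : "failed"
    end
  end
end
```
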
### Flow

```mermaid
graph TD;
Scheduler -- continuously --> UpdateAllNodesStatusJob
UpdateAllNodesStatusJob -- for each available node --> UpdateNodeStatusJob
UpdateNodeStatusJob[UpdateNodeStatusJob\n<em>fetch containers, match with slots \n and check each if finished</em>]
UpdateNodeStatusJob -- for each finished slot --> ReleaseSlotJob
ReleaseSlotJob[ReleaseSlotJob\n<em>fetch logs and update task status</em>]
ReleaseSlotJob --> RemoveRunnerJob[RemoveRunnerJob\n<em>remove container or pod</em>]
ReleaseSlotJob --> RunTasksJob

Scheduler -- only at startup --> RunTasksForAllExecutionTypesJob
RunTasksForAllExecutionTypesJob --> RunTasksJob
RunTasksJob[RunTasksJob\n<em>all from a specific execution type</em>]
RunTasksJob -- task and slot assigned --> RunTaskJob
RunTaskJob[RunTaskJob\n<em>pull image \n create and start pod or container \n change task to started</em>]

Scheduler -- continuously --> MonitorUnresponsiveNodesJob
MonitorUnresponsiveNodesJob -- for each unresponsive node --> MonitorUnresponsiveNodeJob
MonitorUnresponsiveNodeJob -- changed from \n unstable to unavailable --> MigrateTasksFromDeadNodeJob
MigrateTasksFromDeadNodeJob --> RunTasksJob
MonitorUnresponsiveNodeJob -- becomes available --> RunTasksForAllExecutionTypesJob

Requests --> TaskCreation[Task Creation]
Requests --> NodeCreateUpdate[Node Create/Update]
NodeCreateUpdate --> AdjustNodeSlotsJob
TaskCreation --> RunTasksJob
AdjustNodeSlotsJob --> RunTasksJob
AdjustNodeSlotsJob[AdjustNodeSlotsJob]
```

## Development

After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.