Commit 9483d29

[HWORKS-2175] Kueue - queues, cohorts and topologies (#479)
1 parent 32ac867 commit 9483d29

5 files changed: +170 -4 lines changed


docs/user_guides/projects/scheduling/kube_scheduler.md

Lines changed: 35 additions & 4 deletions
@@ -1,15 +1,16 @@
11
---
22
description: Documentation on how to configure Kubernetes scheduling options for Hopsworks workloads.
33
---
4+
45
# Scheduler
56

67
## Introduction
78

8-
Hopsworks allows users to configure [Affinity](https://kubernetes.io/docs/tasks/configure-pod-container/assign-pods-nodes-using-node-affinity/) and [Priority Classes](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/#priorityclass) when running workloads on Hopsworks, this includes jobs, jupyter notebooks and model deployments.
9+
Hopsworks allows users to configure some Kubernetes scheduler abstractions, such as [Affinity](https://kubernetes.io/docs/tasks/configure-pod-container/assign-pods-nodes-using-node-affinity/) and [Priority Classes](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/#priorityclass). Hopsworks also supports additional scheduling abstractions backed by Kueue: [Queues](https://kueue.sigs.k8s.io/docs/concepts/cluster_queue/), [Cohorts](https://kueue.sigs.k8s.io/docs/concepts/cohort/) and [Topologies](https://kueue.sigs.k8s.io/docs/concepts/topology_aware_scheduling/). All these scheduling abstractions are supported in jobs, Jupyter notebooks and model deployments. Kueue abstractions, however, are currently not supported for Spark jobs.
910

10-
Hopsworks Admins can control which labels and priority classes can be used the cluster (see [Cluster configuration](#cluster-configuration) section) and by which project (see [Default Project configuration](#default-project-configuration) section)
11+
Hopsworks Admins can control which labels and priority classes can be used in the cluster (see [Cluster configuration](#cluster-configuration) section) and by which project (see [Default Project configuration](#default-project-configuration) section).
1112

12-
Within a project, data owners can set defaults for jobs and Jupyter notebooks running within that project (see: [Project defaults](#project-defaults) section).
13+
Within a project, data owners can set defaults for jobs and Jupyter notebooks running within that project (see: [Project defaults](#project-defaults) section).
1314

1415
### Node Labels, Node Affinity and Node Anti-Affinity
1516

@@ -44,7 +45,31 @@ Common uses:
4445

4546
For more information on Priority Classes, you can check the Kubernetes [Priority Classes documentation](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/#priorityclass) page.
4647

47-
## Cluster Configuration
48+
## Kueue
49+
50+
Hopsworks integrates with Kueue to offer more advanced scheduling abstractions such as queues, cohorts and topologies.
51+
52+
For a more detailed view of how Hopsworks uses the Kueue abstractions, see the [Kueue details](./kueue_details.md) section.
53+
54+
### Queues, Cohorts
55+
56+
Jobs, notebooks and model deployments are submitted to queues. Hopsworks administrators can define quotas on how many resources a queue can use. Queues can be grouped into cohorts so that a queue can borrow resources from another queue in the same cohort when that queue is not using them.
57+
58+
When creating a new job, the user can select a queue for the job in the `Advance configuration -> Scheduler section`.
59+
60+
![Queue selection for a job](../../../assets/images/guides/project/scheduler/job_queue.png)
61+
62+
### Topologies
63+
64+
The integration of Hopsworks with Kueue also provides access to the topology abstraction. Topologies can be defined so that the user can decide how the pods of a job or model deployment are grouped together. For example, the user could decide that all pods of a job should run on the same host because the pods transfer a lot of data between each other, and keeping them on one host avoids network traffic and lowers latency.
65+
66+
The user can select the topology unit for jobs, notebooks and model deployments in the `Advance configuration -> Scheduler section`.
67+
68+
![Topology unit selection for a job](../../../assets/images/guides/project/scheduler/job_topology_unit.png)
69+
70+
## Admin configuration
71+
72+
### Affinity and priority classes
4873

4974
Hopsworks admins can control the affinity labels and priority classes available on the Hopsworks cluster from the `Cluster Settings -> Scheduler` page:
5075

@@ -72,6 +97,12 @@ Hopsworks Cluster can run within a shared Kubernets Cluster. The first configura
7297

7398
If the roles above are configured properly (default behaviour), admins can only select values from the drop-down menu. If the roles are missing, admins are required to enter the values as free text and should be careful about typos. Any typo here is propagated to the other configuration and usage levels, leading to errors or misbehaviour when running computation.
7499

100+
### Queues
101+
102+
Every new project gets automatic access to the default Hopsworks queue. An administrator can define the default queue for a project's user jobs and system jobs.
103+
104+
![Default queue for user and system jobs](../../../assets/images/guides/project/scheduler/default_queue.png)
105+
75106
## Project Configuration
76107

77108
Hopsworks admins can configure the labels and priority classes that can be used by default within a project. This will be a subset of the ones configured for Hopsworks.
Lines changed: 135 additions & 0 deletions
@@ -0,0 +1,135 @@
1+
---
2+
description: Kueue abstractions
3+
---
4+
5+
# Kueue
6+
7+
## Introduction
8+
9+
Hopsworks integrates with Kueue to provide additional scheduling abstractions. Hopsworks currently acts only as a "reader" of the Kueue abstractions and does not manage their lifecycle, with the exception of the default LocalQueue for each namespace. All other abstractions are expected to be managed by the Hopsworks administrators directly on the Kubernetes cluster.
10+
11+
The Hopsworks and Kueue integration currently supports only the Python and Ray frameworks for jobs, notebooks and model deployments. The same queues are also used for Hopsworks internal jobs (zipping, Git operations, Python library installation). Spark is currently not supported: Spark workloads are not managed by Kueue, bypass the queue setup and are scheduled directly by the Kubernetes scheduler. This is important to keep in mind when deciding on queue quotas.
12+
13+
### Resource flavors
14+
15+
When defining queues in Kueue, the first abstraction that needs to be defined is a [Resource Flavor](https://kueue.sigs.k8s.io/docs/concepts/resource_flavor/). The resource flavor defines the resources that a queue will later manage. The Hopsworks Helm chart installs and uses a default ResourceFlavor:
16+
17+
```
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: default-flavor
spec:
  nodeLabels:
    cloud.provider.com/region: europe
  topologyName: default
```
27+
28+
Node labels restrict the nodes available to this resource flavor and are required for [topologies](#Topologies).
29+
30+
### Cluster Queues
31+
32+
[Cluster Queues](https://kueue.sigs.k8s.io/docs/concepts/cluster_queue/) are the actual queues to which jobs and model deployments are submitted. The default Hopsworks queue looks like:
33+
34+
```
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: other
spec:
  cohort: cluster
  namespaceSelector: {}
  preemption:
    borrowWithinCohort:
      policy: Never
    reclaimWithinCohort: Never
    withinClusterQueue: Never
  queueingStrategy: BestEffortFIFO
  resourceGroups:
  - coveredResources:
    - cpu
    - memory
    - pods
    - nvidia.com/gpu
    flavors:
    - name: default-flavor
      resources:
      - name: cpu
        nominalQuota: "0"
      - name: memory
        nominalQuota: "0"
      - name: pods
        nominalQuota: "0"
      - name: nvidia.com/gpu
        nominalQuota: "0"
```
66+
67+
The [preemption](https://kueue.sigs.k8s.io/docs/concepts/cluster_queue/#preemption) policies and [nominal quotas](https://kueue.sigs.k8s.io/docs/concepts/cluster_queue/#flavors-and-resources) are set to the minimum because this queue is designed to have the lowest priority in getting resources allocated. If the cluster is underutilized and resources are available, the queue can still borrow up to the maximum resources present in the parent cohort, but by design it has no dedicated resources. The assumption is that other, more important queues defined by the cluster administrator will take precedence in getting resources.
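
As an illustration of the "more important queues" mentioned above, an administrator might define a cluster queue with dedicated quota in the same cohort. This is only a sketch and is not part of the Hopsworks Helm chart: the queue name `team-a`, the quota values and the preemption policies are assumptions chosen for the example.

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: team-a                        # hypothetical queue name
spec:
  cohort: cluster                     # same cohort as the default queue
  namespaceSelector: {}               # accept workloads from any namespace
  preemption:
    reclaimWithinCohort: Any          # reclaim this queue's quota from borrowers in the cohort
    withinClusterQueue: LowerPriority # preempt lower-priority workloads in this queue
  queueingStrategy: BestEffortFIFO
  resourceGroups:
  - coveredResources:
    - cpu
    - memory
    - pods
    - nvidia.com/gpu
    flavors:
    - name: default-flavor
      resources:
      - name: cpu
        nominalQuota: 16              # illustrative values only
      - name: memory
        nominalQuota: 64Gi
      - name: pods
        nominalQuota: 20
      - name: nvidia.com/gpu
        nominalQuota: 2
```

With a non-zero nominal quota, this queue has resources it can always claim, while the default queue can only borrow whatever is left unused in the cohort.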
68+
69+
### Local Queues
70+
71+
[Local Queues](https://kueue.sigs.k8s.io/docs/concepts/local_queue/) are the mechanism that gives a specific Hopsworks project (Kubernetes namespace) access to a cluster queue.
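
A minimal sketch of a LocalQueue that an administrator could create to give a project access to an additional cluster queue. The namespace `my-project` and the queue name `team-a` are hypothetical and only serve the example. The default local queue of each project is created by Hopsworks and does not need to be defined manually.

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  namespace: my-project   # hypothetical namespace of a Hopsworks project
  name: team-a            # hypothetical local queue name
spec:
  clusterQueue: team-a    # hypothetical cluster queue exposed to the project
```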
72+
73+
Every new project gets automatic access to the default Hopsworks queue. An administrator can define the default queue for a project's user jobs and system jobs.
74+
75+
![Default queue for user and system jobs](../../../assets/images/guides/project/scheduler/default_queue.png)
76+
77+
### Cohorts
78+
79+
[Cohorts](https://kueue.sigs.k8s.io/docs/concepts/cohort/) are groupings of cluster queues that belong together and can share resources. Hopsworks defines a default `cluster` cohort:
80+
81+
```
apiVersion: kueue.x-k8s.io/v1alpha1
kind: Cohort
metadata:
  name: cluster
spec:
  resourceGroups:
  - coveredResources:
    - cpu
    - memory
    - pods
    - nvidia.com/gpu
    flavors:
    - name: default-flavor
      resources:
      - name: cpu
        nominalQuota: 100
      - name: memory
        nominalQuota: 200Gi
      - name: pods
        nominalQuota: 100
      - name: nvidia.com/gpu
        nominalQuota: 50
```
105+
106+
Cohorts can contain other cohorts, so you can create a hierarchy of cohorts. Cohorts can also set a [fair sharing weight](https://kueue.sigs.k8s.io/docs/concepts/admission_fair_sharing/). By setting

```
fairSharing:
  weight
```

in the definition of a cohort, the user can control the cohort's priority when borrowing resources from other cohorts.
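
A hedged sketch of a child cohort nested under the default `cluster` cohort with a fair sharing weight. The cohort name `research` and the weight value are assumptions for illustration. The `parent` field name is the one used by the `v1alpha1` Cohort API shown above; other Kueue API versions may name it differently, so check the API reference for the installed version. The weight is only expected to take effect if fair sharing is enabled in the Kueue configuration.

```yaml
apiVersion: kueue.x-k8s.io/v1alpha1
kind: Cohort
metadata:
  name: research          # hypothetical child cohort
spec:
  parent: cluster         # nests this cohort under the default `cluster` cohort
  fairSharing:
    weight: 2             # illustrative; a higher weight favours this cohort when borrowing
```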
114+
115+
### Topologies
116+
117+
[Topologies](https://kueue.sigs.k8s.io/docs/concepts/topology_aware_scheduling/) define a way of grouping together pods belonging to the same job or deployment so that they are co-located within the same topology unit. Hopsworks defines a default topology:
118+
119+
```
apiVersion: kueue.x-k8s.io/v1alpha1
kind: Topology
metadata:
  name: default
spec:
  levels:
  - nodeLabel: cloud.provider.com/region
  - nodeLabel: cloud.provider.com/zone
  - nodeLabel: kubernetes.io/hostname
```
130+
131+
The topology is referenced, through `topologyName`, in the Resource Flavor used by a Cluster Queue.
132+
133+
When creating a new job, the user can select a topology unit for the job to run in and thus decide if all pods of a job should run on the same hostname, in the same zone or in the same region. The user can select the topology for jobs, notebooks and deployments in the `Advance configuration -> Scheduler section`.
134+
135+
![Topology unit selection for a job](../../../assets/images/guides/project/scheduler/job_topology_unit.png)
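
For reference, this is roughly how plain Kueue expresses a topology request on a Kubernetes Job. It is a sketch under assumptions: the source does not describe how Hopsworks translates the selection made in the UI, and the job name, namespace, queue name and container are hypothetical. The `kueue.x-k8s.io/queue-name` label and the `kueue.x-k8s.io/podset-required-topology` annotation are the standard Kueue mechanisms for selecting a local queue and requiring a topology level.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: topology-example                # hypothetical job name
  namespace: my-project                 # hypothetical project namespace
  labels:
    kueue.x-k8s.io/queue-name: team-a   # hypothetical local queue in that namespace
spec:
  parallelism: 2
  completions: 2
  suspend: true                         # Kueue unsuspends the job once it is admitted
  template:
    metadata:
      annotations:
        # require all pods of the job to land on the same host
        kueue.x-k8s.io/podset-required-topology: kubernetes.io/hostname
    spec:
      restartPolicy: Never
      containers:
      - name: main
        image: busybox
        command: ["sh", "-c", "sleep 30"]
        resources:
          requests:
            cpu: "1"
            memory: 200Mi
```

Replacing `kubernetes.io/hostname` with `cloud.provider.com/zone` or `cloud.provider.com/region` would request co-location at the zone or region level of the default topology instead.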
