Open
Labels
area/cluster-autoscaler, area/core-autoscaler, wg/device-management
Description
Which component are you using?:
/area cluster-autoscaler
/area core-autoscaler
/wg device-management
Is your feature request designed to solve a problem? If so describe the problem this feature should solve.:
KEP-4815 adds support for partitionable devices to DRA. This means that the Devices exposed in ResourceSlices might "overlap", and allocating one Device might make other Devices unallocatable. The feature is behind a separate feature gate and went to alpha in 1.33.
Describe the solution you'd like.:
Cluster Autoscaler should be able to handle most of KEP-4815 out of the box, since all the additional partition-aware logic will be added to the DRA scheduler plugin that CA delegates to. However, there are some things we'll have to tackle:
- The only part that won't work out of the box is calculating utilization for scale-down. The current logic assumes that all devices within a resource pool are identical, which isn't the case for partitioned devices. Issue #7781 ("CA DRA: review calculating Node utilization for DRA resources") tracks designing how to calculate utilization for DRA in general, which should include partitionable devices. However, solving the full problem will require a KEP and might take some time. In the meantime, we might want to adapt the current logic to be partition-aware, if that's feasible.
- We need to add integration tests for partitionable devices to `static_autoscaler_dra_test.go`.
- We need to test autoscaling partitionable devices in a real cluster.
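
To illustrate why counting devices breaks down for overlapping partitions, here is a rough sketch of one possible partition-aware calculation. It is not the existing CA logic and not a proposed design. The `Pool` and `Device` types and the `PoolUtilization` function are hypothetical simplifications invented for this example; they do not mirror the real `resource.k8s.io` types. The assumption is that each pool has some shared capacity counters and each device declares how much of each counter it consumes when allocated, so utilization can be taken as the highest consumed fraction of any counter.

```go
package main

import "fmt"

// Device is a hypothetical, simplified view of a partitionable device:
// allocating it consumes some amount of the pool's shared counters.
type Device struct {
	Name     string
	Consumes map[string]int64 // counter name -> amount consumed when allocated
}

// Pool is a hypothetical, simplified resource pool with shared capacity
// that overlapping devices draw from.
type Pool struct {
	Capacity map[string]int64 // counter name -> total available
	Devices  []Device
}

// PoolUtilization returns the highest fraction of any shared counter that
// is consumed by the allocated devices. This is one possible definition,
// chosen for illustration only.
func PoolUtilization(pool Pool, allocated map[string]bool) float64 {
	used := make(map[string]int64)
	for _, d := range pool.Devices {
		if !allocated[d.Name] {
			continue
		}
		for counter, amount := range d.Consumes {
			used[counter] += amount
		}
	}
	highest := 0.0
	for counter, total := range pool.Capacity {
		if total <= 0 {
			continue // skip counters without usable capacity
		}
		if f := float64(used[counter]) / float64(total); f > highest {
			highest = f
		}
	}
	return highest
}

func main() {
	// One physical GPU exposed as a full device and two half partitions.
	pool := Pool{
		Capacity: map[string]int64{"memory": 40, "compute": 8},
		Devices: []Device{
			{Name: "gpu-full", Consumes: map[string]int64{"memory": 40, "compute": 8}},
			{Name: "gpu-half-0", Consumes: map[string]int64{"memory": 20, "compute": 4}},
			{Name: "gpu-half-1", Consumes: map[string]int64{"memory": 20, "compute": 4}},
		},
	}
	allocated := map[string]bool{"gpu-half-0": true}

	// Counting devices gives 1 of 3, although half of the GPU is in use.
	fmt.Printf("device-count utilization: %.2f\n", 1.0/float64(len(pool.Devices)))
	fmt.Printf("counter-based utilization: %.2f\n", PoolUtilization(pool, allocated))
}
```

With one half partition allocated, a device-count approach reports one third, while the counter-based sketch reports one half. Whether a maximum over counters is the right aggregation is exactly the kind of question the general utilization design would need to settle.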