137 changes: 137 additions & 0 deletions vertical-pod-autoscaler/enhancements/8459-memory-per-cpu/README.md
@@ -0,0 +1,137 @@
# AEP-8459: MemoryPerCPU

<!-- toc -->
- [Summary](#summary)
- [Motivation](#motivation)
  - [Goals](#goals)
  - [Non-Goals](#non-goals)
- [Proposal](#proposal)
- [Design Details](#design-details)
  - [API Changes](#api-changes)
  - [Behavior](#behavior)
  - [Feature Enablement and Rollback](#feature-enablement-and-rollback)
    - [How can this feature be enabled / disabled in a live cluster?](#how-can-this-feature-be-enabled--disabled-in-a-live-cluster)
  - [Kubernetes Version Compatibility](#kubernetes-version-compatibility)
  - [Validation](#validation)
  - [Test Plan](#test-plan)
- [Implementation History](#implementation-history)
- [Future Work](#future-work)
- [Alternatives](#alternatives)
<!-- /toc -->

## Summary

This AEP proposes a new feature to allow enforcing a fixed memory-per-CPU ratio (`memoryPerCPU`) in Vertical Pod Autoscaler (VPA) recommendations.
The feature is controlled by a new alpha feature gate `MemoryPerCPURatio` (default off).

## Motivation

Many workloads need memory in proportion to CPU. Today, VPA recommends CPU and memory independently, which can produce skewed recommendations (too much memory for a small CPU request, or too little memory for a large one).

By introducing `memoryPerCPU`, users can enforce a predictable ratio between CPU and memory, reducing risk of misconfiguration and simplifying tuning for ratio-based workloads.

In addition, some environments or organizations prefer to keep a fixed CPU-to-memory ratio for reasons such as:
* **Billing models** – Many cloud providers price instances based on predefined CPU/memory bundles. Enforcing a fixed ratio makes VPA recommendations align better with billing units, avoiding unexpected cost patterns.
* **Operational simplicity** – A consistent CPU/memory ratio across workloads reduces variability and simplifies capacity planning.

### Goals

* Allow users to specify a `memoryPerCPU` ratio in `VerticalPodAutoscaler` objects.
* Ensure VPA recommendations respect the ratio across Target, LowerBound, UpperBound, and UncappedTarget.
* Provide a feature gate to enable/disable the feature cluster-wide.

### Non-Goals

* Redesign of the VPA recommender algorithm beyond enforcing the ratio.
* Supporting multiple ratio policies per container (only one `memoryPerCPU` is supported).
* Retroactive migration of existing VPAs without explicit user opt-in.

## Proposal

Extend `ContainerResourcePolicy` with a new optional field:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app
spec:
  resourcePolicy:
    containerPolicies:
      - containerName: app
        minAllowed:
          cpu: 1
          memory: 4Gi
        maxAllowed:
          cpu: 4
          memory: 16Gi
        controlledResources: ["cpu", "memory"]
        controlledValues: RequestsAndLimits
        memoryPerCPU: "4Gi"
```

When enabled, VPA will adjust CPU or memory recommendations to maintain:

```
memory_bytes = cpu_cores * memoryPerCPU
```
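As a worked example of the formula above, the following Python sketch computes the memory implied by a given CPU value. The function name `memory_for_cpu` and the choice of integer millicores and bytes as units are assumptions made for illustration; they are not taken from the VPA codebase.

```python
GI = 1024 ** 3  # bytes in one gibibyte (Gi)


def memory_for_cpu(cpu_millicores: int, memory_per_cpu_bytes: int) -> int:
    """Return memory in bytes such that memory = cpu_cores * memoryPerCPU.

    CPU is given in millicores (1000 = 1 core), so the product is divided
    by 1000. Multiplying first keeps the integer arithmetic exact whenever
    the result is a whole number of bytes.
    """
    return cpu_millicores * memory_per_cpu_bytes // 1000


# With memoryPerCPU = 4Gi, as in the manifest above:
print(memory_for_cpu(1000, 4 * GI) // GI)  # 1 core    -> 4 (Gi)
print(memory_for_cpu(2500, 4 * GI) // GI)  # 2.5 cores -> 10 (Gi)
print(memory_for_cpu(4000, 4 * GI) // GI)  # 4 cores   -> 16 (Gi)
```

Note that the `minAllowed` (1 CPU, 4Gi) and `maxAllowed` (4 CPU, 16Gi) values in the manifest both sit exactly on the 4Gi-per-core line, so the ratio can be satisfied at both ends of the allowed range.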

## Design Details

### API Changes

* New field `memoryPerCPU` (`resource.Quantity`) in `ContainerResourcePolicy`.
* Feature gate: `MemoryPerCPURatio` (alpha, default off).

### Behavior

* If both CPU and memory are controlled, VPA enforces the ratio.
Contributor

What if both (cpu and memory) are not specified? Should that be a validation error? It seems like we should enforce that if you don't specify both, you get an error. That way we ensure that you specify either all the pieces of the puzzle or none.

Author

Initially, my thinking was to simply ignore memoryPerCPU if either CPU or memory was not specified in controlledResources.

But if the philosophy is rather to fail fast and return a validation error whenever memoryPerCPU is set without both CPU and memory being present, I’m fine with that approach too. I can update the AEP accordingly.

* Applies to Target, LowerBound, UpperBound, and UncappedTarget.
* Ratio enforcement is strict:
  * If the memory recommendation would exceed `cpu * memoryPerCPU`, then **CPU is increased** to satisfy the ratio.
  * If the CPU recommendation would exceed `memory / memoryPerCPU`, then **memory is increased** to satisfy the ratio.
Contributor

I'm inclined to say we should error out if the math doesn't work out for the cpu and memory values. Adjusting seems "magical", and I'd advise against it. Explicitness is always better.

Author (@Jrmy2402, Sep 15, 2025)

I see your point, implicit adjustments can indeed feel “magical.”
In this case, though, the whole purpose of the feature is to enforce the ratio automatically: if CPU or memory drifts away from the configured ratio, VPA brings them back in line.

If we only validated and errored, users wouldn’t get the behavior they’re asking for (“always keep memory = cpu × memoryPerCPU”), they’d just see failures.
That would make the feature much less useful in practice.

Or maybe I didn’t fully understand your point?

Contributor

I think we're talking about two distinct things 😅 I was asking about the validation case, where we check that the provided memory and cpu values can actually reach the configured memoryPerCPU. In other words, if I specify cpu=1, memory=4Gi and use memoryPerCPU=5, then the math won't work and that should fail validation, but cpu=1, memory=4Gi and memoryPerCPU=3 will work, because that's achievable.

Whereas you're talking about the actual enforcement, which indeed will be "magical" 😉, and that's totally fine.

Does that make sense?

Author

Absolutely, we were talking about two different layers.

I’ve pushed a commit to clarify the validation side: 93d9437

* If the ratio cannot be applied (e.g., CPU is not a controlled resource), VPA falls back to standard recommendations.
* With the `MemoryPerCPURatio` feature gate disabled, the `memoryPerCPU` field is ignored and recommendations fall back to standard VPA behavior.
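
The behavior rules above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the recommender's implementation: the `Recommendation` type, the `enforce_memory_per_cpu` function, the millicore and byte units, and the round-up policy when CPU is raised are all hypothetical. Capping against `minAllowed` and `maxAllowed` is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class Recommendation:
    """Hypothetical container recommendation in integer units."""
    cpu_millicores: int
    memory_bytes: int


def ceil_div(a: int, b: int) -> int:
    """Integer division rounding up (for non-negative a, positive b)."""
    return -(-a // b)


def enforce_memory_per_cpu(
    rec: Recommendation,
    memory_per_cpu_bytes: Optional[int],
    controlled: Set[str],
    gate_enabled: bool,
) -> Recommendation:
    # Feature gate off or field unset: standard VPA behavior.
    if not gate_enabled or memory_per_cpu_bytes is None:
        return rec
    # Ratio cannot be applied unless both resources are controlled.
    if not {"cpu", "memory"} <= controlled:
        return rec

    target_memory = rec.cpu_millicores * memory_per_cpu_bytes // 1000
    if rec.memory_bytes > target_memory:
        # Memory exceeds cpu * memoryPerCPU: raise CPU, never lower memory.
        # Rounding up to a whole millicore is an assumption of this sketch.
        cpu = ceil_div(rec.memory_bytes * 1000, memory_per_cpu_bytes)
        return Recommendation(cpu, rec.memory_bytes)
    # CPU exceeds memory / memoryPerCPU (or already matches): raise memory.
    return Recommendation(rec.cpu_millicores, target_memory)
```

For example, with `memoryPerCPU` of 4Gi, a recommendation of 500m CPU and 4Gi memory would become 1000m CPU and 4Gi, while 2000m CPU and 1Gi memory would become 2000m CPU and 8Gi. In both cases one value is raised and neither is lowered. The same function would be applied separately to Target, LowerBound, UpperBound, and UncappedTarget.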

### Feature Enablement and Rollback

#### How can this feature be enabled / disabled in a live cluster?

* Feature gate name: `MemoryPerCPURatio`
* Default: Off (Alpha)
* Components depending on the feature gate:
  * recommender

**When enabled**:
* VPA honors `memoryPerCPU` in recommendations.

**When disabled**:
* `memoryPerCPU` is ignored.
* Recommendations behave as before.

### Kubernetes Version Compatibility

The `memoryPerCPU` feature requires VPA version 1.5.0 or higher. The feature is being introduced as alpha and will follow the standard Kubernetes feature gate graduation process:
- Alpha: v1.5.0 (default off)
- Beta: TBD (default on)
- GA: TBD (default on)

### Validation

* `memoryPerCPU` must be > 0.
* Value must be a valid `resource.Quantity` (e.g., `512Mi`, `4Gi`).
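
The two validation rules above could look roughly like the following Python sketch. The parser here is a deliberately simplified stand-in that accepts only non-negative integers with an optional binary suffix (`Ki`, `Mi`, `Gi`, `Ti`); the real Kubernetes `resource.Quantity` format accepts more forms (decimal SI suffixes, fractional values, exponents). The function name `parse_memory_per_cpu` is hypothetical.

```python
import re

# Simplified subset of quantity suffixes; an assumption of this sketch.
_BINARY_SUFFIXES = {
    "": 1,
    "Ki": 1024,
    "Mi": 1024 ** 2,
    "Gi": 1024 ** 3,
    "Ti": 1024 ** 4,
}
_QUANTITY_RE = re.compile(r"([0-9]+)(Ki|Mi|Gi|Ti)?")


def parse_memory_per_cpu(value: str) -> int:
    """Parse a memoryPerCPU string into bytes, or raise ValueError."""
    match = _QUANTITY_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"memoryPerCPU {value!r} is not a supported quantity")
    number, suffix = match.groups()
    result = int(number) * _BINARY_SUFFIXES[suffix or ""]
    if result <= 0:
        raise ValueError("memoryPerCPU must be > 0")
    return result


print(parse_memory_per_cpu("512Mi"))  # 536870912
print(parse_memory_per_cpu("4Gi"))    # 4294967296
```

Values such as `"0"`, `"-1Gi"`, or `"abc"` raise `ValueError` in this sketch, which corresponds to rejecting the object at validation time.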

### Test Plan

* Unit tests ensuring ratio enforcement logic.
* E2E tests comparing recommendations across configurations (feature gate on and off, with and without `memoryPerCPU`).

## Implementation History

* 2025-08-19: Initial proposal

## Future Work


## Alternatives
