docs(proposal): add EP-002 for MSR fallback power meter support #2271
7fd5fd5 to f64731f
Introduces enhancement proposal for adding MSR (Model Specific Register) support as a fallback mechanism when Intel RAPL powercap sysfs interface is unavailable. This improves Kepler's deployment flexibility in environments with restricted powercap access.

The proposal includes:

- Architecture design using powerReader abstraction
- Security considerations for MSR access (PLATYPUS mitigation)
- Phased implementation plan with backward compatibility
- Configuration for opt-in MSR fallback behavior

Signed-off-by: Sunil Thaha <sthaha@redhat.com>
Request: please do not use AI to generate enhancement proposals. A short, concise proposal can convey the intention better, IMHO.
- **Primary Goal**: Implement MSR-based RAPL reading as automatic fallback when
  powercap is unavailable
- **Secondary Goal**: Maintain existing CPUPowerMeter interface compatibility
- **Tertiary Goal**: Provide configurable control over fallback behavior for
  security-conscious deployments
Suggested change:

- Implement MSR-based RAPL reading as automatic fallback when
  powercap is unavailable
- Maintain existing CPUPowerMeter interface compatibility
- Provide configurable control over fallback behavior for
  security-conscious deployments
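For illustration only, the fallback behavior described in these goals could be sketched roughly as below. This is a minimal sketch under stated assumptions, not the proposal's actual design: `energyReader`, `stubReader`, `selectReader`, and `errNoReader` are hypothetical names invented here, and the real powerReader abstraction in Kepler may look quite different.

```go
package main

import (
	"errors"
	"fmt"
)

// energyReader is a hypothetical stand-in for the proposal's powerReader
// abstraction; the real interface in Kepler may differ.
type energyReader interface {
	Name() string
	Available() bool
}

// stubReader is a fake reader used only to exercise the selection logic.
type stubReader struct {
	name      string
	available bool
}

func (s stubReader) Name() string    { return s.name }
func (s stubReader) Available() bool { return s.available }

// errNoReader is returned when neither reader can be used.
var errNoReader = errors.New("no usable RAPL reader")

// selectReader prefers the primary (powercap) reader. The fallback (MSR)
// reader is considered only when the operator has allowed it, which keeps
// the fallback under configuration control.
func selectReader(primary, fallback energyReader, allowFallback bool) (energyReader, error) {
	if primary.Available() {
		return primary, nil
	}
	if allowFallback && fallback.Available() {
		return fallback, nil
	}
	return nil, errNoReader
}

func main() {
	powercap := stubReader{name: "powercap", available: false}
	msr := stubReader{name: "msr", available: true}

	r, err := selectReader(powercap, msr, true)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("using reader:", r.Name())
}
```

Because the caller only sees the returned reader, code behind the existing CPUPowerMeter interface would not need to know which source was chosen.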
style CPUPowerMeter fill:#e1f5fe
style raplPowerMeter fill:#b3e5fc
style powercapReader fill:#81d4fa
style msrReader fill:#ffccbc
style zoneAdapter fill:#c5e1a5
Make the text visible.
kepler_node_package_energy_millijoule{node="node1"} 12345
kepler_node_core_energy_millijoule{node="node1"} 6789
kepler_node_dram_energy_millijoule{node="node1"} 3456
What metrics are these?
2. **Phase 2**: Enable MSR fallback in staging environments
3. **Phase 3**: Gradual rollout to production with monitoring
What is meant by "staging environment" and "rollout to production"?
style zoneAdapter fill:#c5e1a5
```

### Key Design Choices
Reading MSRs will require higher privileges.
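To make the privilege concern concrete, here is a rough sketch of what reading the package energy counter through the Linux `msr` driver involves. It assumes an Intel CPU, the `msr` kernel module loaded, and a process that can open `/dev/cpu/0/msr` (typically root with `CAP_SYS_RAWIO`). The helper names `readMSR`, `energyUnitJoules`, and `deltaJoules` are hypothetical and are not taken from the proposal.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"os"
)

const (
	msrRAPLPowerUnit   = 0x606 // MSR_RAPL_POWER_UNIT
	msrPkgEnergyStatus = 0x611 // MSR_PKG_ENERGY_STATUS
)

// readMSR reads one 64-bit register from the msr driver's per-CPU device.
// The register address is used as the file offset. Opening the device is
// the step that needs elevated privileges.
func readMSR(cpu int, reg int64) (uint64, error) {
	f, err := os.Open(fmt.Sprintf("/dev/cpu/%d/msr", cpu))
	if err != nil {
		return 0, err
	}
	defer f.Close()

	buf := make([]byte, 8)
	if _, err := f.ReadAt(buf, reg); err != nil {
		return 0, err
	}
	return binary.LittleEndian.Uint64(buf), nil
}

// energyUnitJoules decodes the energy status unit (bits 12:8 of
// MSR_RAPL_POWER_UNIT). One counter tick equals 1/2^ESU joules.
func energyUnitJoules(powerUnit uint64) float64 {
	esu := (powerUnit >> 8) & 0x1f
	return 1.0 / float64(uint64(1)<<esu)
}

// deltaJoules converts two successive 32-bit counter samples to joules.
// Unsigned subtraction wraps modulo 2^32, so a single counter rollover
// between samples is handled without extra logic.
func deltaJoules(prev, curr uint32, unit float64) float64 {
	return float64(curr-prev) * unit
}

func main() {
	pu, err := readMSR(0, msrRAPLPowerUnit)
	if err != nil {
		fmt.Println("cannot read MSR (needs privileges and the msr module):", err)
		return
	}
	raw, err := readMSR(0, msrPkgEnergyStatus)
	if err != nil {
		fmt.Println("cannot read package energy status:", err)
		return
	}
	// The energy counter occupies the low 32 bits of the register.
	fmt.Printf("package energy counter: %.3f J\n", float64(uint32(raw))*energyUnitJoules(pu))
}
```

On a host without the required access, the program prints the open error and exits, which is the failure mode a fallback reader would need to detect and report.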
I agree; let me strip it down to the bare minimum and resubmit.