@sthaha (Collaborator) commented Aug 12, 2025

This commit implements the EP-002 enhancement proposal for an MSR fallback power meter.
It adds MSR (Model Specific Register) support as a fallback when the powercap interface
is unavailable, which improves Kepler's compatibility across
different systems and kernel configurations.

Key changes:

  • Add MSR reader implementation with Intel RAPL register support
  • Create raplReader interface abstracting powercap and MSR backends
  • Extract existing powercap logic into dedicated reader component
  • Enhance RAPL power meter with automatic fallback detection
  • Add MSR configuration with security-conscious opt-in defaults
  • Implement comprehensive test coverage with mock MSR data
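As a rough illustration of what "Intel RAPL register support" involves, the sketch below decodes a package energy reading from raw MSR values. The register addresses (`0x606` for `MSR_RAPL_POWER_UNIT`, `0x611` for `MSR_PKG_ENERGY_STATUS`) and the energy-unit bit field (bits 12:8) follow the Intel SDM; the function names are hypothetical and this is not the code from this PR. It works on raw values that are passed in, so it does not touch `/dev/cpu/*/msr`.

```go
package main

import "fmt"

// Register addresses as documented in the Intel SDM.
const (
	msrRaplPowerUnit   = 0x606 // MSR_RAPL_POWER_UNIT
	msrPkgEnergyStatus = 0x611 // MSR_PKG_ENERGY_STATUS
)

// energyUnitMicroJoules (hypothetical helper) extracts the Energy Status
// Units field (bits 12:8) from a raw MSR_RAPL_POWER_UNIT value and returns
// the number of microjoules represented by one counter increment.
func energyUnitMicroJoules(powerUnit uint64) float64 {
	esu := (powerUnit >> 8) & 0x1F
	return 1e6 / float64(uint64(1)<<esu)
}

// energyMicroJoules (hypothetical helper) converts a raw energy status
// value to microjoules. Only the low 32 bits hold the counter.
func energyMicroJoules(status uint64, unit float64) uint64 {
	raw := status & 0xFFFFFFFF
	return uint64(float64(raw) * unit)
}

// counterDelta (hypothetical helper) returns the increase between two raw
// 32-bit counter readings, allowing for a single wraparound.
func counterDelta(prev, curr uint64) uint64 {
	if curr >= prev {
		return curr - prev
	}
	return (1<<32 - prev) + curr
}

func main() {
	// Example raw values; on real hardware these would be read from the
	// MSR device at the offsets above.
	unit := energyUnitMicroJoules(0xA0E03)
	fmt.Printf("unit: %.4f uJ per count\n", unit)
	fmt.Printf("energy: %d uJ\n", energyMicroJoules(16384, unit))
	fmt.Printf("delta across wrap: %d counts\n", counterDelta(0xFFFFFFF0, 0x10))
}
```

Because the counter is only 32 bits wide, a reader that polls too rarely can miss more than one wraparound; the delta helper above assumes at most one wrap between samples.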

The MSR fallback is disabled by default due to PLATYPUS attack vectors
(CVE-2020-8694/8695) and must be explicitly enabled via configuration.
When enabled, the system automatically falls back to MSR if powercap
is unavailable, maintaining transparent operation.

Introduces an enhancement proposal for adding MSR (Model Specific Register)
support as a fallback mechanism when the Intel RAPL powercap sysfs interface
is unavailable. This improves Kepler's deployment flexibility in
environments with restricted powercap access.

The proposal includes:
- Architecture design using powerReader abstraction
- Security considerations for MSR access (PLATYPUS mitigation)
- Phased implementation plan with backward compatibility
- Configuration for opt-in MSR fallback behavior

Signed-off-by: Sunil Thaha <sthaha@redhat.com>
@github-actions bot added the feat (A new feature or enhancement) and docs (Documentation changes) labels Aug 12, 2025
@sthaha marked this pull request as draft August 12, 2025 06:30

⚠️ Config changes detected in this PR
Please make sure that the config changes are updated in the following places as part of this PR:

  • docs/configuration/configuration.md
  • compose/dev/kepler-dev/etc/kepler/config.yaml
  • hack/config.yaml
  • manifests/k8s/configmap.yaml
  • manifests/helm/kepler/values.yaml

codecov bot commented Aug 12, 2025

Codecov Report

❌ Patch coverage is 62.53041% with 154 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.78%. Comparing base (6d9c50c) to head (7f7308a).
⚠️ Report is 4 commits behind head on main.

Files with missing lines               Patch %   Lines
internal/device/rapl_power_meter.go    34.28%    66 Missing and 3 partials ⚠️
internal/device/msr_reader.go          71.91%    36 Missing and 14 partials ⚠️
internal/device/msr_zone.go            66.10%    14 Missing and 6 partials ⚠️
internal/device/powercap_reader.go     77.77%    8 Missing and 4 partials ⚠️
config/config.go                       80.00%    2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2273      +/-   ##
==========================================
- Coverage   92.27%   88.78%   -3.50%     
==========================================
  Files          39       42       +3     
  Lines        4142     4501     +359     
==========================================
+ Hits         3822     3996     +174     
- Misses        257      410     +153     
- Partials       63       95      +32     


@sthaha force-pushed the feat-rapl-msr-impl branch from 707625d to 7f7308a August 12, 2025 07:07

⚠️ Config changes detected in this PR
Please make sure that the config changes are updated in the following places as part of this PR:

  • docs/configuration/configuration.md
