MSE observer for NVFP4 #1840

shubhra · 2025-09-17T19:59:20Z

SUMMARY:
"MSE Observer implemented for NVFP4 for both weights and activations. Originally, I had added a new observer mseact for input activations global scale calculation but moved it under the same observer to keep it clean."

github-actions · 2025-09-17T19:59:29Z

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed.

gemini-code-assist

Summary of Changes

Hello @shubhra, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the quantization capabilities by introducing a specialized Mean Squared Error (MSE) observer for NVFP4 activations. The new mseact observer is tailored to accurately determine quantization parameters, including a dynamic global scale, for input activations. Concurrently, the base MSE observer has been updated to support global scale calculation and includes detailed debug logging, providing better visibility into the quantization process. These changes collectively improve the precision and debuggability of FP4 quantization for both weights and activations.

Highlights

New mseact Observer for Activations: Introduced MovingAverageMSEObserverActivations specifically designed for NVFP4 quantization of input activations, enabling dynamic global scale calculation.
Global Scale Calculation for MSE Observer: The existing MSEObserver (mse.py) now includes a calculate_gparam method to generate global scales based on MSE-optimized min/max values.
Enhanced Debugging: Added extensive debug print statements across key methods in mse.py to provide better visibility into the quantization parameter calculation process.
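As a rough illustration of the global-scale highlight above, the sketch below runs a shrinking-range grid search that minimizes a p-norm quantization error, then converts the chosen absolute maximum into a global scale. Everything in it is an assumption made for illustration, not the code in this PR: the names `fake_quantize`, `mse_clip_search`, and `global_scale_from_amax` are hypothetical, the uniform quantizer is a stand-in for real FP4 rounding, `maxshrink=0.2` and the zero fallback are assumed, and the formula `FP8_E4M3_MAX * FP4_E2M1_MAX / amax` is a commonly used NVFP4 convention. The `grid` and `norm` defaults mirror the constructor arguments quoted in the review diff further down.

```python
from typing import List, Tuple

FP8_E4M3_MAX = 448.0  # largest finite FP8 E4M3 value
FP4_E2M1_MAX = 6.0    # largest finite FP4 E2M1 value


def fake_quantize(x: float, max_val: float, levels: int = 7) -> float:
    """Quantize-dequantize x on a symmetric uniform grid (stand-in for FP4)."""
    if max_val <= 0.0:
        return 0.0
    scale = max_val / levels
    q = max(-levels, min(levels, round(x / scale)))
    return q * scale


def mse_clip_search(
    values: List[float],
    grid: float = 100.0,
    maxshrink: float = 0.2,  # assumed: search down to ~80% of the observed max
    norm: float = 2.4,
) -> Tuple[float, float]:
    """Return (best_abs_max, best_error) over progressively shrunken ranges."""
    abs_max = max(abs(v) for v in values)
    best_max, best_err = abs_max, float("inf")
    for i in range(int(maxshrink * grid)):
        candidate = (1.0 - i / grid) * abs_max
        # p-norm reconstruction error for this candidate clipping range
        err = sum(abs(fake_quantize(v, candidate) - v) ** norm for v in values)
        if err < best_err:
            best_max, best_err = candidate, err
    return best_max, best_err


def global_scale_from_amax(amax: float) -> float:
    """Map an absolute maximum to a global scale (assumed NVFP4 convention)."""
    if amax <= 0.0:
        return 1.0  # assumed fallback to avoid division by zero
    return FP8_E4M3_MAX * FP4_E2M1_MAX / amax
```

A caller would feed the first return value of `mse_clip_search` into `global_scale_from_amax`. Because the first candidate is the unshrunk range, the search can never do worse than plain min/max calibration under this error metric.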

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces an MSE observer for NVFP4, including a new mseact observer for activations. The overall direction is good, but there are several areas for improvement. The most significant issues are the presence of numerous debug print statements across the modified and new files, which should be removed, and substantial code duplication in the new mseact.py file that could be resolved through inheritance. I've also noted some minor issues regarding import locations and type hints.

brian-dellabetta

Thanks for the contribution! I've not worked on Observer logic much, so will leave approval to others, but code looks clean.

I wonder if this might be better served as a completely separate class that subclasses MovingAverageMSEObserver. It sounds like that is what you had in mind in your summary: an "mseact" observer.
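A minimal sketch of what that subclassing suggestion could look like, using toy classes rather than the real llm-compressor observers. The class names, the `observe` and `global_scale` methods, the moving-average update rule, and the `448 * 6 / amax` global-scale formula are all assumptions for illustration.

```python
from typing import List, Optional


class ToyMovingAverageObserver:
    """Toy stand-in for a moving-average observer (not the real class)."""

    def __init__(self, averaging_constant: float = 0.01):
        self.averaging_constant = averaging_constant
        self.running_max: Optional[float] = None

    def observe(self, values: List[float]) -> float:
        """Fold the absolute max of a new batch into the running estimate."""
        batch_max = max(abs(v) for v in values)
        if self.running_max is None:
            self.running_max = batch_max
        else:
            self.running_max += self.averaging_constant * (
                batch_max - self.running_max
            )
        return self.running_max


class ToyActivationObserver(ToyMovingAverageObserver):
    """Activation-only behaviour lives in the subclass, not behind a flag."""

    def global_scale(self) -> float:
        # Assumed NVFP4 convention: FP8 E4M3 max (448) * FP4 E2M1 max (6) / amax
        if not self.running_max:
            return 1.0
        return 448.0 * 6.0 / self.running_max
```

The trade-off is one extra class to register in exchange for keeping the weight path free of activation-specific branches.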

brian-dellabetta · 2025-09-26T18:25:51Z

src/llmcompressor/observers/mse.py

        self.averaging_constant = averaging_constant
        self.grid = grid
        self.norm = norm
+        self.is_activation = base_name != "weight"


If this is the only place we use base_name, it might be better to expose is_activation: bool = False in the constructor instead of base_name.
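For comparison, here is a hedged sketch of the two constructor shapes being discussed. `ObserverArgsSketch` and `from_base_name` are hypothetical names and do not exist in llm-compressor; only the default values and the `base_name != "weight"` check come from the diff above.

```python
from dataclasses import dataclass


@dataclass
class ObserverArgsSketch:
    """Hypothetical constructor arguments with an explicit boolean flag."""

    averaging_constant: float = 0.01
    grid: float = 100.0
    norm: float = 2.4
    is_activation: bool = False  # replaces the string-valued base_name


def from_base_name(base_name: str, **kwargs) -> ObserverArgsSketch:
    """Adapter for callers that still pass base_name (hypothetical helper)."""
    return ObserverArgsSketch(is_activation=(base_name != "weight"), **kwargs)
```

With the boolean, the string comparison happens once at the call site, and the observer itself never needs to know the naming convention for module parameters.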

brian-dellabetta · 2025-09-26T18:26:33Z

src/llmcompressor/observers/mse.py

        averaging_constant: float = 0.01,
        grid: float = 100.0,
        norm: float = 2.4,
+        base_name: str = "weight",


will have to confirm with @shanjiaz if this needs to be added elsewhere, since there are a couple different places Observers are instantiated

shanjiaz

Seems like the initialization step in calibration is covered, looks good to me!

kylesayrs · 2025-10-07T21:15:59Z

I think this is going to be a simpler approach

#1903

gemini-code-assist bot reviewed Sep 17, 2025

dsikka requested changes Sep 17, 2025

shubhra force-pushed the shubhra/mse_nvfp4 branch 9 times, most recently from 2fc0d27 to 4dc86f7 on September 25, 2025 18:05

shubhra requested review from brian-dellabetta and dsikka September 25, 2025 18:32

Shubhra Pandit and others added 15 commits September 26, 2025 13:40

MSE support for NVFP4

50930c3

Signed-off-by: Shubhra Pandit <shubhra.pandit@gmail.com>

Add mse support for input activations global scale via MSE

eba6f56

Signed-off-by: Shubhra Pandit <shubhra.pandit@gmail.com>

Remove prints

476564e

Signed-off-by: Shubhra Pandit <shubhra.pandit@gmail.com>

Consolidate mse for both weights and activations (global scale) under one observer

5d2eaee

Signed-off-by: Shubhra Pandit <shubhra.pandit@gmail.com>

Remove imports that aren't needed

5e51a09

Signed-off-by: Shubhra Pandit <shubhra.pandit@gmail.com>

Clean up init

133ac2e

Signed-off-by: Shubhra Pandit <shubhra.pandit@gmail.com>

Support for differentiating between activations and weights

d702c7a

Signed-off-by: Shubhra Pandit <shubhra.pandit@gmail.com>

Remove prints

4ff715f

Signed-off-by: Shubhra Pandit <shubhra.pandit@gmail.com>

Corrected mse implementation for fp4

20de7e4

Signed-off-by: Shubhra Pandit <shubhra.pandit@gmail.com>

Update check for activation and local scales

c17d292

Signed-off-by: Shubhra Pandit <shubhra.pandit@gmail.com>

Update check for activation and local scales

ef54001

Signed-off-by: Shubhra Pandit <shubhra.pandit@gmail.com>

Change the way we check for if we are doing local scale

d7b03c8

Signed-off-by: Shubhra Pandit <shubhra.pandit@gmail.com>

Clarify comment in MovingAverageMSEObserver regarding global scale handling for fp4 quantization scheme.

c6fef13

Signed-off-by: Shubhra Pandit <shubhra.pandit@gmail.com>

Change the local scale identification method

9e933ce

Signed-off-by: Shubhra Pandit <shubhra.pandit@gmail.com>

Remove unnecessary import

4aa7452

Signed-off-by: Shubhra Pandit <shubhra.pandit@gmail.com>

shubhra added 5 commits September 26, 2025 13:40

Fix a long line error

d1fcb3e

Signed-off-by: Shubhra Pandit <shubhra.pandit@gmail.com>

Fix format errors

8ebdffb

Signed-off-by: Shubhra Pandit <shubhra.pandit@gmail.com>

Fix format errors

814e9e4

Signed-off-by: Shubhra Pandit <shubhra.pandit@gmail.com>

Fix format errors

20a0ecc

Signed-off-by: Shubhra Pandit <shubhra.pandit@gmail.com>

ruff format file

d5729ce

Signed-off-by: Shubhra Pandit <shubhra.pandit@gmail.com>

shubhra force-pushed the shubhra/mse_nvfp4 branch from 4dc86f7 to d5729ce on September 26, 2025 17:40

brian-dellabetta reviewed Sep 26, 2025

brian-dellabetta requested a review from shanjiaz September 26, 2025 18:33

Update src/llmcompressor/observers/mse.py

408bdbb

Co-authored-by: Brian Dellabetta <brian-dellabetta@users.noreply.github.com>

shanjiaz approved these changes Oct 6, 2025

brian-dellabetta mentioned this pull request Oct 9, 2025

[Observers] Refactor for better FP4 support, static observers #1903

Draft
