Skip to content

Conversation

@Jacenty-And-Intel
Copy link
Contributor

@Jacenty-And-Intel Jacenty-And-Intel commented Nov 14, 2025

Summary

After the vpux type is added for float4_e2m1, quantization should be enabled for this type, as it is for FP8. This MLIR change will allow for further adjustments to enable FP4 quantization in the compiler and enable the addition of PSS tests.

JIRA ticket

Related PR in NPU Compiler and/or OpenVINO repository with sub-module update

Other related tickets

List tickets for additional work, eg, something was found during review but you agreed to address it in another Jira

  • E-xxxxx

@Jacenty-And-Intel Jacenty-And-Intel requested a review from a team as a code owner November 14, 2025 12:50
@Jacenty-And-Intel Jacenty-And-Intel changed the title Enable quantization for FP4 type [EISW-191805] Enable quantization for FP4 type Nov 14, 2025
@andrey-golubev
Copy link
Contributor

andrey-golubev commented Nov 14, 2025

Can this be first brought to upstream MLIR? Also, does it by any chance conflict with what @Roman-Pevnyi is trying to do in upstream? (see llvm/llvm-project#152966)

@Jacenty-And-Intel
Copy link
Contributor Author

Can this be first brought to upstream MLIR?

@andrey-golubev Wanted to merge it in the fork first. Right now enabling PSS tests for FP4 and further work on quantization to f4 are blocked without the change in MLIR.

Also, does it by any chance conflict with what @Roman-Pevnyi is trying to do in upstream? (see llvm/llvm-project#152966)

It does not conflict, in fact, it works very well with Roman's change. Once that change is approved, enabling FP4 will be much simpler, and enabling further types will be even easier. However, we don't know when it will be merged upstream, and it is currently a blocker on our side.

@sartil
Copy link
Contributor

sartil commented Nov 14, 2025

Do you have a testing PR in vpux-plugin that verifies CI when updating this submodule, to be linked here?
Also, could you please add the status of running check-mlir tests with this changes (see https://mlir.llvm.org/getting_started/TestingGuide/ on how to run the target)?

@andrey-golubev
Copy link
Contributor

@andrey-golubev Wanted to merge it in the fork first. Right now enabling PSS tests for FP4 and further work on quantization to f4 are blocked without the change in MLIR.

I would strongly prefer to do it the other way around since merging to upstream is usually a painful process. Quantization changes are also evidently painful: the dialect evolved quite a bit since LLVM 18 up to LLVM 21 (we're now at LLVM 20).

Also, could you please add the status of running check-mlir tests with this changes (see https://mlir.llvm.org/getting_started/TestingGuide/ on how to run the target)?

I think this is now nice to do, but we also have automated tests running in CI! (See https://github.com/intel/npu-plugin-llvm/actions/runs/19365101483/job/55416928201?pr=175)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants