-
Notifications
You must be signed in to change notification settings - Fork 252
Add block quantization e2e test #1867
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: shanjiaz <zsjwpianpian@gmail.com>
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review. Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI - this may fail in vllm as I think tyler reverted the PR to support
vllm-project/vllm#25607
Ah! That makes sense it was failing for me locally. I was planning on trying to serve the model in vllm directly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wrong. Should now work with this PR: vllm-project/vllm#25219
@shanjiaz can we get this in soon |
Signed-off-by: shanjiaz <zsjwpianpian@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool cool cool cool
quant_modifiers: | ||
QuantizationModifier: | ||
targets: "Linear" | ||
scheme: "FP8_BLOCK" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parenthesis are not necessary in yamls unless they are in a list. Have you tested if this is still parsable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like you have
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'll just remove this recipe file.
SUMMARY:
Added e2d testing for block quantization.
TEST PLAN:
Tested locally with the following command:
log: