Collaborator

@shanjiaz shanjiaz commented Sep 25, 2025

SUMMARY:
Added e2e testing for block quantization.

TEST PLAN:
Tested locally with the following command:

python -m pytest tests/e2e/vLLM/test_vllm.py -vv -s

log:

================= vLLM GENERATION =================

PROMPT:
The capital of France is
GENERATED TEXT:
 Paris, which is located in the Île-de-France region. The

PROMPT:
The president of the US is
GENERATED TEXT:
 paying for the protests against him. The White House has reportedly cut

PROMPT:
My name is
GENERATED TEXT:
 [insert name], and I am a [insert job title]. I am excited

PASSED

================= 1 passed in 130.10s (0:02:10) =================

shanjiaz and others added 2 commits September 25, 2025 13:22

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed.

Collaborator

@dsikka dsikka left a comment

FYI - this may fail in vllm as I think tyler reverted the PR to support vllm-project/vllm#25607

@shanjiaz
Collaborator Author

FYI - this may fail in vllm as I think tyler reverted the PR to support vllm-project/vllm#25607

Ah! That makes sense; it was failing for me locally. I was planning to try serving the model in vllm directly.

Collaborator

@dsikka dsikka left a comment

I was wrong. Should now work with this PR: vllm-project/vllm#25219

@dsikka
Collaborator

dsikka commented Oct 1, 2025

@shanjiaz can we get this in soon?

@shanjiaz shanjiaz added the ready When a PR is ready for review label Oct 10, 2025
@shanjiaz shanjiaz marked this pull request as ready for review October 10, 2025 18:31
Signed-off-by: shanjiaz <zsjwpianpian@gmail.com>
@shanjiaz shanjiaz requested a review from dsikka October 10, 2025 18:47
Collaborator

@brian-dellabetta brian-dellabetta left a comment

cool cool cool cool

quant_modifiers:
  QuantizationModifier:
    targets: "Linear"
    scheme: "FP8_BLOCK"
Collaborator

Parentheses are not necessary in YAML unless they are in a list. Have you tested whether this is still parsable?

Collaborator

Seems like you have

Collaborator Author

I think I'll just remove this recipe file.
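
For readers unfamiliar with the `FP8_BLOCK` scheme named in the recipe under review: block quantization assigns one scale to each rectangular tile of a weight matrix, rather than one per tensor or per channel. The sketch below is an illustrative pure-Python version of that idea and is not llm-compressor's implementation. The function names are hypothetical, the tile size is a parameter (128 x 128 is a common choice for FP8 block schemes, stated here as an assumption), and rounding to the FP8 grid is left out. The constant 448 is the largest finite value in the FP8 E4M3 (`e4m3fn`) format.

```python
FP8_E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3 (e4m3fn)


def block_scales(weight, block_rows, block_cols):
    """Return one scale per (block_rows x block_cols) tile of a 2-D list.

    Hypothetical helper for illustration only.
    """
    n_rows, n_cols = len(weight), len(weight[0])
    scales = []
    for r0 in range(0, n_rows, block_rows):
        row_scales = []
        for c0 in range(0, n_cols, block_cols):
            # Largest magnitude inside this tile (edge tiles may be smaller).
            tile_max = max(
                abs(weight[r][c])
                for r in range(r0, min(r0 + block_rows, n_rows))
                for c in range(c0, min(c0 + block_cols, n_cols))
            )
            # An all-zero tile gets scale 1.0 to avoid dividing by zero later.
            row_scales.append(tile_max / FP8_E4M3_MAX if tile_max > 0 else 1.0)
        scales.append(row_scales)
    return scales


def scale_to_fp8_range(weight, scales, block_rows, block_cols):
    """Divide each element by its tile's scale and clamp to the FP8 range.

    Rounding to actual FP8 values is intentionally omitted in this sketch.
    """
    scaled = []
    for r, row in enumerate(weight):
        out_row = []
        for c, value in enumerate(row):
            s = scales[r // block_rows][c // block_cols]
            out_row.append(max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, value / s)))
        scaled.append(out_row)
    return scaled


# Toy 4 x 4 weight matrix split into 2 x 2 tiles.
weight = [
    [1.0, -2.0, 0.5, 0.25],
    [4.0, 0.0, -0.5, 0.125],
    [0.0, 0.0, 8.0, -16.0],
    [0.0, 0.0, 2.0, 1.0],
]
scales = block_scales(weight, 2, 2)
scaled = scale_to_fp8_range(weight, scales, 2, 2)
```

Because each tile has its own scale, the large values in the bottom-right tile do not squeeze the small values in the top-right tile toward zero, which is the usual motivation for block-wise over per-tensor scaling.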

4 participants