WeiweiZhang1
Contributor

No description provided.

Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
"""
# Get the names of layers in quantization blocks
supported_types = self.supported_types
dynamic_config = {}
Contributor

wenhuach21 commented Sep 16, 2025

I suggest avoiding the term 'dynamic config'. It is not used in academic literature and was coined by Unsloth. Personally, I also find it a little confusing, because 'static' and 'dynamic' already describe activation quantization.

wenhuach21 changed the title from "enable dynamic quantization config saving" to "enable regex quantization config saving" Sep 16, 2025
wenhuach21 changed the title from "enable regex quantization config saving" to "enable regex quantization config saving for mixed bits" Sep 16, 2025
WeiweiZhang1 and others added 2 commits September 16, 2025 16:02
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
Contributor

wenhuach21 commented Sep 16, 2025

TODO:
1. Support inference for the AutoRound format in Transformers
2. Validate inference for the AutoRound format in vLLM/SGLang
3. Add UTs

return model

quantization_config = kwargs["serialization_dict"]
quantization_config.pop("regex_config")  # AWQ does not support saving a mixed-bits config
Contributor

If it's 16 bits, we could convert it to not_convert_module (I forget the exact name).
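A minimal sketch of that suggestion, assuming a hypothetical per-layer config that maps layer names to a `bits` value. The helper `split_unquantized_layers` and the dict layout are illustrative only and are not AutoRound or AWQ API; the real exclusion-list field name is left open, as in the comment above.

```python
def split_unquantized_layers(layer_config, default_bits=4):
    """Separate 16-bit layers from layers that are actually quantized.

    layer_config maps a layer name to a dict such as {"bits": 8}.
    Both this helper and the dict layout are hypothetical.
    """
    skipped = []    # layers kept in 16 bits, candidates for a "do not convert" list
    quantized = {}  # layers that keep an explicit low-bit setting
    for name, cfg in layer_config.items():
        bits = cfg.get("bits", default_bits)
        if bits >= 16:
            skipped.append(name)
        else:
            quantized[name] = {**cfg, "bits": bits}
    return quantized, sorted(skipped)


# Hypothetical layer names, loosely modeled on the test in this PR.
layer_config = {
    "model.decoder.layers.0.self_attn.k_proj": {"bits": 8},
    "model.decoder.layers.0.self_attn.q_proj": {"bits": 4},
    "lm_head": {"bits": 16},
}
quantized, skipped = split_unquantized_layers(layer_config)
```

With this split, only `skipped` would need to be written to the exporter's exclusion list, and the remaining entries could be checked for a single uniform bit width before saving.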

result = model.generate("Uncovering deep insights begins with")[0] # tokens
assert "!!!" not in model.tokenizer.decode(result) # string output
assert (model.model.model.decoder.layers[0].self_attn.k_proj.bits == 8)
assert (model.model.model.decoder.layers[0].self_attn.q_proj.bits == 4)
Contributor

Test the regex, full-name, and partial-name cases, and make sure the model can run inference correctly.
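The three cases could be exercised with something like the sketch below. It assumes a hypothetical `regex_config` dict that maps a pattern to `{"bits": N}` and a hypothetical `resolve_bits` helper where the first matching pattern wins; neither is taken from the AutoRound code base, and the actual matching rules in the PR may differ.

```python
import re


def resolve_bits(layer_name, regex_config, default_bits=4):
    """Return the bit width for layer_name (hypothetical helper).

    Patterns are tried in insertion order and the first match wins.
    re.search is used so that a partial name matches anywhere in the
    layer name; a full name also matches itself as a pattern.
    """
    for pattern, cfg in regex_config.items():
        if re.search(pattern, layer_name):
            return cfg["bits"]
    return default_bits


regex_config = {
    "model.decoder.layers.0.self_attn.k_proj": {"bits": 8},  # full name
    "q_proj": {"bits": 4},                                   # partial name
    r".*\.fc[12]$": {"bits": 2},                             # regex
}

full = resolve_bits("model.decoder.layers.0.self_attn.k_proj", regex_config)
partial = resolve_bits("model.decoder.layers.3.self_attn.q_proj", regex_config)
by_regex = resolve_bits("model.decoder.layers.3.fc1", regex_config)
fallback = resolve_bits("model.decoder.layers.3.self_attn.v_proj", regex_config, default_bits=16)
```

A unit test along these lines only checks name resolution. The generation check from the existing test is still needed to confirm that the exported model runs inference correctly.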



if __name__ == "__main__":
    unittest.main()
Contributor

wenhuach21 commented Sep 24, 2025

For the AutoRound format, please make sure inference is ready first, then support it on the export side.

2 participants