enable regex quantization config saving for mixed bits #825
base: main
Conversation
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
for more information, see https://pre-commit.ci
auto_round/autoround.py
Outdated
""" | ||
# Get the names of layers in quantization blocks | ||
supported_types = self.supported_types | ||
dynamic_config = {} |
I suggest avoiding the term 'dynamic config': it is not used in the academic literature, it was coined by Unlosh, and personally I find it a little confusing, since there are both static activation quantization and dynamic activation quantization.
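To make the terminology question concrete, a minimal sketch of what a regex-keyed per-layer config could look like, assuming keys are regex patterns matched against layer names and a default bit width applies when nothing matches (the function and config names here are hypothetical, not the project's actual API):

```python
import re

def resolve_layer_bits(layer_names, regex_config, default_bits=4):
    """Map each layer name to a bit width via regex-keyed config entries."""
    resolved = {}
    for name in layer_names:
        bits = default_bits
        for pattern, cfg in regex_config.items():
            if re.search(pattern, name):
                bits = cfg["bits"]
                break  # first matching pattern wins in this sketch
        resolved[name] = bits
    return resolved

layers = [
    "model.decoder.layers.0.self_attn.k_proj",
    "model.decoder.layers.0.self_attn.q_proj",
    "model.decoder.layers.0.fc1",
]
regex_config = {r"k_proj": {"bits": 8}, r"q_proj": {"bits": 4}}
print(resolve_layer_bits(layers, regex_config))
```

Whatever the final name ("regex config", "layer config", etc.), the saved artifact only needs the pattern-to-bits mapping plus a default.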
TODO:
…ithub.com/intel/auto-round into enable_dynamic_quantization_config_saving
return model

quantization_config = kwargs["serialization_dict"]
quantization_config.pop("regex_config")  # AWQ does not support mixed-bits config saving
If it's 16 bits, we could convert it to not_convert_module (I forget the exact name).
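The suggestion above can be sketched as a small split step: 16-bit entries are effectively unquantized, so instead of dropping the whole mixed-bits config they could be moved into a skip list (AWQ's config has a field along these lines, often called modules_to_not_convert; the function name below is illustrative, not the project's actual API):

```python
def split_16bit_layers(layer_config):
    """Split a per-layer config into quantized entries and a skip list."""
    quant_config, skip_list = {}, []
    for name, cfg in layer_config.items():
        if cfg.get("bits") == 16:
            skip_list.append(name)  # 16-bit layers stay unquantized
        else:
            quant_config[name] = cfg
    return quant_config, skip_list

cfg = {"q_proj": {"bits": 4}, "lm_head": {"bits": 16}}
quant, skip = split_16bit_layers(cfg)
```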
test/test_cuda/test_mix_bits.py
Outdated
result = model.generate("Uncovering deep insights begins with")[0]  # tokens
assert "!!!" not in model.tokenizer.decode(result)  # string output
assert model.model.model.decoder.layers[0].self_attn.k_proj.bits == 8
assert model.model.model.decoder.layers[0].self_attn.q_proj.bits == 4
Test the regex, full name, and partial name cases, and make sure the model can run inference correctly.
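A minimal sketch of the three lookup modes the reviewer asks to cover, assuming a lookup priority of exact full name, then partial (substring) name, then regex pattern (this is an illustrative function, not the project's actual matching logic):

```python
import re

def match_layer(layer_name, config):
    """Resolve a layer's config by full name, partial name, then regex."""
    if layer_name in config:                 # exact full name
        return config[layer_name]
    for key, value in config.items():
        if key in layer_name:                 # partial (substring) name
            return value
    for key, value in config.items():
        if re.search(key, layer_name):        # regex pattern
            return value
    return None

config = {"self_attn.k_proj": {"bits": 8}, r"fc\d+": {"bits": 4}}
name = "model.decoder.layers.0.self_attn.k_proj"
```

A test along these lines would exercise all three paths: `name` resolves via the partial match, `"model.decoder.layers.0.fc1"` only via the regex, and an unconfigured layer falls through to `None`.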
if __name__ == "__main__": | ||
unittest.main() |
For the AutoRound format, please make sure inference is ready first, then support it on the exporting side.
No description provided.