enable regex quantization config saving for mixed bits #825
base: main
Conversation
Signed-off-by: Zhang, Weiwei1 <weiwei1.zhang@intel.com>
for more information, see https://pre-commit.ci
auto_round/autoround.py
Outdated
""" | ||
# Get the names of layers in quantization blocks | ||
supported_types = self.supported_types | ||
dynamic_config = {} |
I suggest avoiding the term 'dynamic config': it is not used in the academic literature, it was coined by Unlosh, and personally I find it a little confusing, since there are both static activation quantization and dynamic activation quantization.
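To make the terminology question concrete, a minimal sketch of what a regex-keyed per-layer config could look like, assuming keys are regex patterns matched against layer names and a default bit width applies when nothing matches (the function and config names here are hypothetical, not the project's actual API):

```python
import re

def resolve_layer_bits(layer_names, regex_config, default_bits=4):
    """Map each layer name to a bit width via regex-keyed config entries."""
    resolved = {}
    for name in layer_names:
        bits = default_bits
        for pattern, cfg in regex_config.items():
            if re.search(pattern, name):
                bits = cfg["bits"]
                break  # first matching pattern wins in this sketch
        resolved[name] = bits
    return resolved

layers = [
    "model.decoder.layers.0.self_attn.k_proj",
    "model.decoder.layers.0.self_attn.q_proj",
    "model.decoder.layers.0.fc1",
]
regex_config = {r"k_proj": {"bits": 8}, r"q_proj": {"bits": 4}}
print(resolve_layer_bits(layers, regex_config))
```

Whatever the final name ("regex config", "layer config", etc.), the saved artifact only needs the pattern-to-bits mapping plus a default.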
TODO:
…ithub.com/intel/auto-round into enable_dynamic_quantization_config_saving
return model

quantization_config = kwargs["serialization_dict"]
quantization_config.pop("regex_config")  # AWQ does not support mixed-bits config saving
If it's 16 bits, we could convert it to not_convert_module (I forget the exact name).
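The suggestion above can be sketched as a small split step: 16-bit entries are effectively unquantized, so instead of dropping the whole mixed-bits config they could be moved into a skip list (AWQ's config has a field along these lines, often called modules_to_not_convert; the function name below is illustrative, not the project's actual API):

```python
def split_16bit_layers(layer_config):
    """Split a per-layer config into quantized entries and a skip list."""
    quant_config, skip_list = {}, []
    for name, cfg in layer_config.items():
        if cfg.get("bits") == 16:
            skip_list.append(name)  # 16-bit layers stay unquantized
        else:
            quant_config[name] = cfg
    return quant_config, skip_list

cfg = {"q_proj": {"bits": 4}, "lm_head": {"bits": 16}}
quant, skip = split_16bit_layers(cfg)
```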
test/test_cuda/test_mix_bits.py
Outdated
result = model.generate("Uncovering deep insights begins with")[0]  # tokens
assert "!!!" not in model.tokenizer.decode(result)  # string output
assert model.model.model.decoder.layers[0].self_attn.k_proj.bits == 8
assert model.model.model.decoder.layers[0].self_attn.q_proj.bits == 4
Test the regex, full name, and partial name cases, and make sure the model can run inference correctly.
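A minimal sketch of the three lookup modes the reviewer asks to cover, assuming a lookup priority of exact full name, then partial (substring) name, then regex pattern (this is an illustrative function, not the project's actual matching logic):

```python
import re

def match_layer(layer_name, config):
    """Resolve a layer's config by full name, partial name, then regex."""
    if layer_name in config:                 # exact full name
        return config[layer_name]
    for key, value in config.items():
        if key in layer_name:                 # partial (substring) name
            return value
    for key, value in config.items():
        if re.search(key, layer_name):        # regex pattern
            return value
    return None

config = {"self_attn.k_proj": {"bits": 8}, r"fc\d+": {"bits": 4}}
name = "model.decoder.layers.0.self_attn.k_proj"
```

A test along these lines would exercise all three paths: `name` resolves via the partial match, `"model.decoder.layers.0.fc1"` only via the regex, and an unconfigured layer falls through to `None`.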
if __name__ == "__main__": | ||
unittest.main() |
For the AutoRound format, please make sure inference is ready first, then support it on the exporting side.
No description provided.