[V1 Loader] Support Ernie text(moe and dense) #3110
Conversation
Thanks for your contribution!
    for expert_id in range(self.fd_config.model_config.moe_num_experts)
    for param_name, weight_name, shard_id in param_name_maping
]
expert_params_mapping.append(
Is it really necessary to add this separately? Wouldn't adding it directly to general_params_mapping be enough?
Changed to load with the default Loader. That said, it still feels more reasonable to keep this here, since this function returns the mapping used by the MoE.
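For context, a minimal sketch of what such a per-expert mapping could look like, pieced together from the fragments quoted above; the (param_name, weight_name, shard_id) triples and the weight names are assumptions for illustration, not the PR's exact code:

```python
def make_expert_params_mapping(self):
    # Hypothetical triples; the real PR defines its own (param_name, weight_name, shard_id).
    param_name_maping = [
        ("up_gate_proj_weight", "up_gate_proj", "gate_up"),
        ("down_proj_weight", "down_proj", "down"),
    ]
    # One entry per (expert, weight): map each expert's checkpoint weight name
    # onto the fused MoE parameter it is packed into, together with the expert
    # index and the shard id used when copying it in.
    return [
        (
            f"experts.{param_name}",
            f"experts.{expert_id}.{weight_name}.weight",
            expert_id,
            shard_id,
        )
        for expert_id in range(self.fd_config.model_config.moe_num_experts)
        for param_name, weight_name, shard_id in param_name_maping
    ]
```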
if isinstance(loaded_weight, np.ndarray):
    size = loaded_weight.shape[-1]
else:
    size = loaded_weight.get_shape()[-1]
Will there really be both ndarray and tensor cases?
There are two cases: ndarray and pyslice.
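A small sketch of what that branch is handling, assuming the pyslice case is a lazily loaded safetensors slice (which exposes get_shape() instead of .shape):

```python
import numpy as np

def last_dim_size(loaded_weight):
    # Hypothetical helper: an in-memory numpy array exposes .shape, while a
    # lazily loaded safetensors slice (pyslice) exposes .get_shape() instead.
    if isinstance(loaded_weight, np.ndarray):
        return loaded_weight.shape[-1]
    return loaded_weight.get_shape()[-1]
```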
param.copy_(loaded_weight, False)
if loaded_shard_id is None:
    # Loaded weight is already fused on disk.
    if self.fd_config.model_config.pretrained_config.tensor_parallel_degree != 1:
Can this check just use the class member variable, i.e. self.nranks?
done
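A minimal sketch of the suggested change, assuming the layer already caches the tensor-parallel world size as self.nranks at construction time (the surrounding split logic is elided):

```python
# Before: the load path reaches through the nested config for the TP degree.
if self.fd_config.model_config.pretrained_config.tensor_parallel_degree != 1:
    pass  # split the fused on-disk weight for this rank

# After: the layer already caches the same value as a member variable.
if self.nranks != 1:
    pass  # split the fused on-disk weight for this rank
```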
param = param[:, (self.num_heads_per_rank + self.kv_num_heads_per_rank) * self.head_dim :]
if loaded_shard_id is None:
    # Loaded weight is already fused on disk
    if self.fd_config.model_config.pretrained_config.tensor_parallel_degree != 1:
Same as above.
done
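For readers following along, a rough sketch of how a QKV weight that is already fused on disk can be carved up per tensor-parallel rank; the layout, names, and helper below are assumptions inferred from the fragment above, not the PR's exact code:

```python
import numpy as np

def split_fused_qkv(weight, num_heads, kv_num_heads, head_dim, nranks, rank):
    # Hypothetical sketch: the fused weight's last dim is laid out as
    # [Q (num_heads*head_dim) | K (kv_num_heads*head_dim) | V (kv_num_heads*head_dim)].
    # Each section is split evenly across tensor-parallel ranks, then the
    # per-rank slices are re-concatenated so the local layout stays [Q | K | V].
    q, k, v = np.split(
        weight,
        [num_heads * head_dim, (num_heads + kv_num_heads) * head_dim],
        axis=-1,
    )
    pick = lambda section: np.split(section, nranks, axis=-1)[rank]
    return np.concatenate([pick(q), pick(k), pick(v)], axis=-1)
```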
Ernie has text-only / multimodal / MoE / non-MoE variants; does this PR support all of them? If not, the PR description and title should be made more accurate.
@@ -428,6 +429,80 @@ def set_state_dict(self, state_dict: Dict[str, Union[np.ndarray, paddle.Tensor]]
        else:
            self.lm_head.load_state_dict(state_dict)

    def make_expert_params_mapping(self):
I think it makes more sense for each model to manage its own expert mapping; writing it all centrally in moe.py doesn't seem right.
Changed as suggested in the review.
    for param_name, weight_name, shard_id in param_name_maping
]
expert_params_mapping.append(
    ("experts.gate_correction_bias", "moe_statics.e_score_correction_bias", None, "gate_bias")
This gate_correction_bias could go under general_params_mapping. In the original design, fusemoe was only meant to hold the three weights gate, up and down, and gate_correction_bias was supposed to be split out.
general_params_mapping holds mappings shared by both the MoE and dense models. Would gate_correction_bias also exist in the dense model? I don't think so.
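To make the disagreement concrete, a rough sketch of the two mapping lists being discussed; the entries are illustrative assumptions, not the PR's actual contents:

```python
# general_params_mapping: checkpoint-to-parameter renames shared by both the
# dense and the MoE checkpoints (hypothetical entries for illustration).
general_params_mapping = [
    # (fused param name, checkpoint weight name, shard id)
    ("qkv_proj", "q_proj", "q"),
    ("qkv_proj", "k_proj", "k"),
    ("qkv_proj", "v_proj", "v"),
]

# expert_params_mapping: renames that only exist in MoE checkpoints, which is
# why gate_correction_bias arguably belongs here rather than in the list above.
expert_params_mapping = [
    ("experts.gate_correction_bias", "moe_statics.e_score_correction_bias", None, "gate_bias"),
]
```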
The new loader supports loading the Ernie models:
Quantization is not supported yet, so it is inconvenient to test the 300B model; the results below were measured with the 21B model on H20:
Compared with the old Loader, the new Loader saves roughly 90% of the memory footprint during loading; single-card loading performance improves by about 60%, and multi-card loading performance is roughly on par.