lshpku commented Sep 2, 2025

This PR implements two very useful small HuggingFace utilities:

  1. HuggingFace Cache: saves the HuggingFace weight files needed by the current machine in its local /dev/shm, so the next load reads them locally at high speed.

  2. Shrunken-expert loading: when the number of experts is less than 256, the expert dimension is automatically sliced.

Note 1: If the program is accidentally interrupted while loading, you must manually run rm -f /dev/shm/lshrun_*.lock before the next start; otherwise the cache will deadlock.
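The cache-plus-lock behavior described above (including why a stale lock deadlocks the next run) can be sketched roughly as follows. This is an illustrative sketch, not the PR's actual code: the function name `cached_load`, the `lshrun_` file prefix, and the polling loop are assumptions based on the description.

```python
import os
import shutil
import time

def cached_load(src_path, cache_dir="/dev/shm", prefix="lshrun_"):
    """Copy src_path into cache_dir on first use; later loads read the local copy.

    A lock file created with O_CREAT | O_EXCL guards the copy so that only one
    process populates the cache. If the program is killed mid-copy, the stale
    lock file survives and every later call spins in the wait loop below --
    this is the deadlock that note 1 says must be cleared by hand with
    `rm -f /dev/shm/lshrun_*.lock` (names here are hypothetical).
    """
    cached = os.path.join(cache_dir, prefix + os.path.basename(src_path))
    lock = cached + ".lock"
    if not os.path.exists(cached):
        while True:
            try:
                # O_EXCL makes creation atomic: exactly one process wins the lock.
                fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
                break
            except FileExistsError:
                time.sleep(0.1)  # another process is copying; a stale lock hangs here
        try:
            if not os.path.exists(cached):  # re-check after winning the lock
                shutil.copy(src_path, cached)
        finally:
            os.close(fd)
            os.remove(lock)
    return cached
```

Since /dev/shm is a tmpfs backed by RAM, reads from the cached copy avoid both network fetches and disk I/O on subsequent loads.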

Note 2: Layer shrinking also requires changing special_cases, for example:

# With 29 layers, change it to:
special_cases = {(0, 0): "model", (28, 2): "model.layers.61", (28, 3): "model", (28, 4): "lm_head"}

# With 21 layers, change it to:
special_cases = {(0, 0): "model", (21, 1): "model.layers.61", (21, 2): "model", (21, 3): "lm_head"}
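Separately, the shrunken-expert loading from item 2 can be sketched as below. This is a minimal illustration under assumed conventions: the function name `shrink_experts`, the substring match on "experts" in parameter names, and an expert-first weight layout are all hypothetical, not taken from the PR's code.

```python
def shrink_experts(state_dict, num_experts, total_experts=256):
    """Slice expert-dimension weights down to num_experts.

    Any parameter whose name mentions "experts" and whose leading dimension
    equals the full expert count (256 in this PR) is truncated to the first
    num_experts entries; all other weights pass through unchanged.
    (Name matching and layout are illustrative assumptions.)
    """
    shrunk = {}
    for name, weight in state_dict.items():
        if "experts" in name and len(weight) == total_experts:
            shrunk[name] = weight[:num_experts]  # keep only the first num_experts slices
        else:
            shrunk[name] = weight
    return shrunk
```

Slicing at load time lets a small-expert debug configuration reuse the full 256-expert checkpoint without re-exporting the weights.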


paddle-bot commented Sep 2, 2025

Thanks for your contribution!

lshpku force-pushed the hf-cache-shrink branch 4 times, most recently from 255abc5 to 23c0add on September 8, 2025
lshpku changed the title from "Implement HuggingFace Cache as well as shrunken-expert loading" to "Implement HuggingFace Cache and shrunken-expert loading" on Sep 17, 2025