The author uses the LLaMA-7B model with llm_layers=32. I therefore assumed that for gpt2 it should be set to 12, and that any value larger than 12 would be meaningless. However, I observed the following:
gpt2 with llm_layers=12 costs 5-6 GB of VRAM
gpt2 with llm_layers=32 costs 10-11 GB of VRAM
Why does llm_layers=32 increase memory usage if gpt2 only has 12 layers?
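One hedged guess at the cause, sketched in plain Python with lists standing in for transformer blocks (this is an illustration of two possible implementations, not the repository's actual code): if the code slices the pretrained layer stack, values above 12 are effectively clamped; but if it rebuilds the model from a config whose depth is set to llm_layers, the extra layers are freshly initialized and really do consume more VRAM.

```python
# Hypothetical sketch: two common ways an llm_layers knob can be
# applied to a 12-layer checkpoint such as gpt2. Plain lists stand
# in for transformer blocks; this is not the repo's real code.

PRETRAINED_DEPTH = 12  # gpt2 ships with 12 transformer blocks


def truncate_layers(llm_layers: int) -> list[str]:
    """Slice the pretrained stack: values above 12 are clamped,
    so llm_layers=32 behaves exactly like llm_layers=12."""
    blocks = [f"block_{i}" for i in range(PRETRAINED_DEPTH)]
    return blocks[:llm_layers]


def rebuild_from_config(llm_layers: int) -> list[str]:
    """Rebuild the model from a config with depth=llm_layers:
    blocks beyond 12 exist but hold fresh (untrained) weights,
    which would explain the extra VRAM."""
    return [f"block_{i}" for i in range(llm_layers)]


print(len(truncate_layers(32)))      # -> 12
print(len(rebuild_from_config(32)))  # -> 32
```

If the observed 10-11 GB figure is real, the second behavior (rebuilding from config) would be consistent with it: the model genuinely grows to 32 blocks, but blocks 13-32 never saw pretraining.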