Support for Plamo2ForCausalLM architecture #13874
Replies: 4 comments
-
+1
-
plamo-2-translate is an extremely good model for this task (way better than Gemma 3 or Granite), and it's damn small too.
-
I'm currently working on this implementation here: #13930, but so far the output is completely incorrect. I'm trying to fix it, but since I lack experience with llama.cpp development, I don't know how to examine intermediate outputs to identify where the issue might be occurring. Could someone please offer some advice?
-
To investigate the intermediate values, I implemented a callback in llm_build_plamo2 while constructing the graph, then printed some values from the decode function in llama-context.cpp (a sketch of the approach is below). This let me pinpoint exactly where the implementation diverges from the expected behavior: the conversion of token IDs to embeddings is correct, but the output already differs immediately before the first Mamba layer, right after the RMS norm is applied. I'll continue debugging from there.
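For reference, llama.cpp exposes a hook for exactly this kind of inspection: the `cb_eval` field of `llama_context_params` (the same mechanism the `examples/eval-callback` tool uses) is invoked for every graph node during evaluation, so you can dump intermediate activations without patching the decode loop. Below is a minimal sketch; the `"attn_norm"` name filter is only a placeholder, since the actual node names depend on how `llm_build_plamo2` labels its tensors.

```cpp
#include "llama.h"
#include "ggml.h"
#include "ggml-backend.h"

#include <cstdio>
#include <cstring>
#include <vector>

// The scheduler calls this twice per graph node: first with ask == true
// (return true to request the node's data), then with ask == false once
// the computed data is available for reading.
static bool debug_cb(struct ggml_tensor * t, bool ask, void * /*user_data*/) {
    if (ask) {
        // observe only the nodes of interest ("attn_norm" is a placeholder)
        return strncmp(t->name, "attn_norm", 9) == 0;
    }
    if (t->type != GGML_TYPE_F32) {
        return true; // this sketch only prints F32 tensors
    }
    // copy to host memory in case the tensor lives in a GPU buffer
    std::vector<float> data(ggml_nelements(t));
    ggml_backend_tensor_get(t, data.data(), 0, ggml_nbytes(t));

    printf("%s [%lld x %lld]:", t->name,
           (long long) t->ne[0], (long long) t->ne[1]);
    for (size_t i = 0; i < data.size() && i < 8; ++i) {
        printf(" %.6f", data[i]);
    }
    printf("\n");
    return true; // continue graph evaluation
}

// When creating the context, register the callback:
//   llama_context_params cparams = llama_context_default_params();
//   cparams.cb_eval           = debug_cb;
//   cparams.cb_eval_user_data = nullptr;
```

Filtering in the `ask` phase keeps overhead low, since tensor data is only copied back from the device for the nodes you actually request.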
-
The Plamo 2 Translate model was released recently, and it uses a new architecture (Plamo2ForCausalLM). It'd be neat to see support for this model added to llama.cpp. Not sure what more info is needed to open this as an issue and get it moving.