Fix: Add Perceiver model registry test and correct auto mappings #41434

aijadugar · 2025-10-08T07:08:54Z

Summary

This PR adds testing and registry validation for the Perceiver model within the Hugging Face Transformers codebase.

Changes Made

Added a standalone test_registry.py to validate Perceiver model registration and tokenizer mapping.
Corrected auto model and tokenizer mappings to ensure perceiver is properly recognized in:
- modeling_auto.py
- tokenization_auto.py
Verified successful forward pass for PerceiverModel with output shape torch.Size([1, 256, 1280]).
Fixed missing config attributes (e.g., input_channels) to enable correct model instantiation.
Confirmed tokenizer functionality via PerceiverTokenizer.
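
A registry check of the kind listed above could look roughly like the following minimal sketch. It deliberately does not import transformers: `find_unresolved`, the sample mapping, and the stand-in namespace are hypothetical names used only for illustration, and the real mappings in modeling_auto.py may be structured differently.

```python
from types import SimpleNamespace


def find_unresolved(mapping_names, namespace):
    """Return (model_type, class_name) pairs whose class is missing from namespace."""
    missing = []
    for model_type, class_name in mapping_names.items():
        # Treat an entry as dangling when the namespace has no such attribute.
        if getattr(namespace, class_name, None) is None:
            missing.append((model_type, class_name))
    return missing


# Hypothetical stand-in for a package that exports model classes.
fake_package = SimpleNamespace(PerceiverModel=object)

# Hypothetical mapping in the style of an auto-model registry.
sample_mapping = {
    "perceiver": "PerceiverModel",
    "perception_encoder": "PerceptionEncoder",  # no such class in fake_package
}

unresolved = find_unresolved(sample_mapping, fake_package)
print(unresolved)
```

Entries reported by a check like this point at registry names that have no backing class, which is the situation the registry validation is meant to catch.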

Verification

…ry entries (huggingface#41387)

…ences

zucchini-nlp · 2025-10-08T09:00:28Z

src/transformers/models/perception_lm/modeling_perception_lm.py

+class PerceptionEncoder(PreTrainedModel):
+    config_class = PretrainedConfig
+
+    def __init__(self, config):
+        super().__init__(config)
+        self.dummy_layer = None
+
+    def forward(self, x):
+        return x
+


I don't think this is what we want to do. If the model does not exist, we should delete it from modeling_auto, which is the case for Perception LM.

zucchini-nlp · 2025-10-08T09:00:58Z

src/transformers/models/parakeet/tokenization_parakeet_fast.py

+
+class ParakeetCTCTokenizer(PreTrainedTokenizerBase):
+    def __init__(self, vocab_file=None, **kwargs):
+        super().__init__()
+        self.vocab_file = vocab_file
+
+    def _tokenize(self, text):
+        return text.split()
+
+    def _convert_token_to_id(self, token):
+        return 0
+
+    def _convert_id_to_token(self, index):
+        return ""


Same here: we can set the slow tokenizer to None if the model has only a fast tokenizer (e.g., see Chameleon).
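
To illustrate the suggestion, here is a minimal sketch of a tokenizer registry whose entries are (slow, fast) pairs, with the slow slot left as None for a fast-only model. The registry contents, the model and class names, and the helper `pick_tokenizer_name` are all hypothetical and are not copied from tokenization_auto.py.

```python
from typing import Dict, Optional, Tuple

# Hypothetical registry: model type -> (slow tokenizer name, fast tokenizer name).
TOKENIZER_NAMES: Dict[str, Tuple[Optional[str], Optional[str]]] = {
    "both_model": ("BothTokenizer", "BothTokenizerFast"),
    "fast_only_model": (None, "FastOnlyTokenizerFast"),  # no slow class, so None
}


def pick_tokenizer_name(model_type: str, use_fast: bool = True) -> str:
    """Pick a tokenizer class name, falling back to the other slot when one is None."""
    slow, fast = TOKENIZER_NAMES[model_type]
    preferred, fallback = (fast, slow) if use_fast else (slow, fast)
    name = preferred if preferred is not None else fallback
    if name is None:
        raise ValueError(f"No tokenizer registered for {model_type!r}")
    return name


print(pick_tokenizer_name("fast_only_model", use_fast=False))
```

With the slow slot set to None, no placeholder slow tokenizer class has to exist just to satisfy the registry; lookups simply resolve to the fast class.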

aijadugar · 2025-10-09T18:29:22Z

I couldn't figure out what I need to do...

zucchini-nlp · 2025-10-10T08:48:08Z

Sorry, PerceptionLM was already fixed in another PR.

github-actions · 2025-10-15T15:50:03Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, parakeet, perception_lm

aijadugar added 2 commits October 8, 2025 11:20

Fix: remove invalid PerceptionEncoder and ParakeetCTCTokenizer regist…

bd9604f

…ry entries (huggingface#41387)

Fix: add Perceiver model registry test and correct auto mapping refer…

6c2837d

…ences

zucchini-nlp reviewed Oct 8, 2025


zucchini-nlp mentioned this pull request Oct 8, 2025

perception_lm: remove perception_encoder from auto-maps #41437

Closed

4 tasks

aijadugar requested a review from zucchini-nlp October 8, 2025 11:39

Merge branch 'huggingface:main' into fix/perceiver-registry-test

9d43873
