
Conversation

@manueldeprada (Contributor) commented Jul 29, 2025

This PR removes the reliance on Cache.from_legacy_cache(past_key_values) for initializing a cache when past_key_values is None, replacing it with explicit cache initialization. The previous approach also set return_legacy_cache=True, which unintentionally returned legacy tuples and masked other issues.

This change is necessary to support the upcoming deprecation of from_legacy_cache in v4.58.
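As a rough sketch (not the actual diff; use_cache handling and naming vary per model), the initialization change looks like this:

```python
from transformers import DynamicCache

def init_past_key_values(past_key_values, use_cache=True):
    # Before: past_key_values = DynamicCache.from_legacy_cache(past_key_values),
    # which also set return_legacy_cache=True and converted outputs back to tuples.
    # After: explicit Cache initialization when nothing was passed in.
    if use_cache and past_key_values is None:
        past_key_values = DynamicCache()
    return past_key_values
```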

Note: this update revealed an issue in pipelines, where loader_batch_item expects legacy tuples when iterating over ModelOutputs and failed when it encountered Cache objects.
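For context, here is a minimal sketch of the kind of per-item slicing that breaks, and one possible way to special-case Cache objects. The helper slice_output_field is hypothetical; this is not the actual pipelines code nor the fix in this PR:

```python
import torch
from transformers.cache_utils import Cache

def slice_output_field(element, i):
    # Legacy past_key_values were nested tuples of tensors, so tuple-style
    # recursion worked; a Cache object is neither a tuple nor a tensor.
    if isinstance(element, Cache):
        return element  # e.g. keep the cache whole instead of slicing it per item
    if isinstance(element, torch.Tensor):
        return element[i].unsqueeze(0)
    if isinstance(element, tuple):
        return tuple(slice_output_field(sub, i) for sub in element)
    return element
```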

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

gante (Member) previously approved these changes Aug 5, 2025 and left a comment

LGTM! thank you for working on it

@manueldeprada marked this pull request as ready for review on August 5, 2025 at 17:28
@manueldeprada changed the title from "[draft] No more using from_legacy_cache as initialization" to "Stop using from_legacy_cache as Cache initialization" on Aug 5, 2025
@Cyrilvallez (Member) left a comment

Hey, sorry for the delay on this! Thanks a lot, happy to start cleaning up everything to finally drop the legacy format soon! This needs a rebase to fix the conflict, though. And for EncoderDecoderCache, let's not provide default values to the init; let's instantiate it with two DynamicCache instances in the models instead (see the two PRs I linked!).
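A minimal sketch of the suggested pattern, assuming EncoderDecoderCache keeps its two-argument constructor:

```python
from transformers import DynamicCache, EncoderDecoderCache

# Instead of EncoderDecoderCache providing default values in its __init__,
# models build it explicitly from two DynamicCache instances.
past_key_values = EncoderDecoderCache(
    self_attention_cache=DynamicCache(),
    cross_attention_cache=DynamicCache(),
)
```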

Cyrilvallez (Member) previously approved these changes Aug 12, 2025 and left a comment
Perfect, thanks a lot! Looks like you just need to run make fix-copies for Flaubert (probably only the docstrings change) to make CI happy.
If the copy is no longer consistent because of a real change, you can just break it!
Feel free to merge once that's done and CI is happy!

@manueldeprada changed the title from "Stop using from_legacy_cache as Cache initialization" to "Stop using from_legacy_cache as Cache init, make pipeline use Caches, fix and test Cache.select_indices" on Aug 14, 2025
@gante (Member) commented Aug 14, 2025

@manueldeprada @Cyrilvallez sorry to potentially add one more task here :D Should we update all the DynamicCache() instantiations added in this PR to DynamicCache(config=self.config), following the pattern introduced in #40039?
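Roughly, the pattern in question would look like the sketch below; the DynamicCache(config=...) signature is assumed from #40039 and not verified here:

```python
from transformers import DynamicCache

def init_cache_with_config(config, past_key_values=None, use_cache=True):
    # Assumed pattern: pass the model config so the cache can set up its
    # per-layer structure up front instead of growing lazily.
    if use_cache and past_key_values is None:
        past_key_values = DynamicCache(config=config)
    return past_key_values
```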

@@ -1197,6 +1197,28 @@ def test_dynamic_cache(self):
"DynamicCache Scenario 2 layer 1 failed",
)

def test_dynamic_cache_batch_select_indices(self):
@manueldeprada (Contributor, author) commented on this diff

This additional test does not hurt, since batch_select_indices was never tested
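For illustration, a minimal sketch of what such a test exercises; this is not the test added in the PR, and the per-layer cache[0] access returning a (keys, values) pair is assumed:

```python
import torch
from transformers import DynamicCache

def check_batch_select_indices():
    cache = DynamicCache()
    keys = torch.randn(4, 2, 6, 8)    # (batch, num_heads, seq_len, head_dim)
    values = torch.randn(4, 2, 6, 8)
    cache.update(keys, values, layer_idx=0)

    # Keep only batch items 0 and 2, in place.
    cache.batch_select_indices(torch.tensor([0, 2]))

    selected_keys, selected_values = cache[0]
    assert selected_keys.shape[0] == 2
    assert torch.equal(selected_keys, keys[[0, 2]])
    assert torch.equal(selected_values, values[[0, 2]])
```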

@Cyrilvallez (Member) left a comment

Alright, thanks for reverting the previous changes.
We just need to remove the kwarg in all the EncoderDecoderCache initializations to simplify our lives with #40008.
Also, I just checked generate, and we actually return the Cache format all the time, even when the KV cache is provided as a tuple. So I think it makes sense to do the same here and always return a Cache, whatever the input, not only half the time. That way we can drop return_legacy_cache in all modeling code and never go back to the legacy format 🤗
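A minimal sketch of the resulting modeling-side behaviour (illustrative, not the exact code): legacy tuple inputs are still converted on the way in, but the output is always a Cache and return_legacy_cache disappears:

```python
from transformers import DynamicCache
from transformers.cache_utils import Cache

def normalize_past_key_values(past_key_values, use_cache=True):
    if not use_cache:
        return None
    if past_key_values is None:
        return DynamicCache()  # explicit init, no from_legacy_cache(None)
    if not isinstance(past_key_values, Cache):
        # a legacy tuple input is converted once; a Cache is always returned
        past_key_values = DynamicCache.from_legacy_cache(past_key_values)
    return past_key_values
```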


[For maintainers] Suggested jobs to run (before merge)

run-slow: autoformer, bark, bart, bert, bert_generation, big_bird, bigbird_pegasus, biogpt, blenderbot, blenderbot_small, blip, bridgetower, camembert, clvp, cpmant, ctrl

@manueldeprada (Contributor, author)

@Cyrilvallez all done!! thanks, now the diff is very nice: +378 −538 🧹 🧹 😄

@manueldeprada (Contributor, author)

run-slow: autoformer, bark, bart, bert, bert_generation, big_bird, bigbird_pegasus, biogpt, blenderbot, blenderbot_small, blip, bridgetower, camembert, clvp, cpmant, ctrl


This comment contains run-slow, running the specified jobs:

models: ['models/autoformer', 'models/bark', 'models/bart', 'models/bert', 'models/bert_generation', 'models/big_bird', 'models/bigbird_pegasus', 'models/biogpt', 'models/blenderbot', 'models/blenderbot_small', 'models/blip', 'models/bridgetower', 'models/camembert', 'models/clvp', 'models/cpmant', 'models/ctrl']
quantizations: [] ...

@manueldeprada requested a review from Cyrilvallez and removed the review request for Cyrilvallez on August 18, 2025 at 11:19
@manueldeprada (Contributor, author)

Checked the slow tests and all the failures seem unrelated. A bunch of them will be fixed by #40008!

@Cyrilvallez changed the title from "🚨 Return Cache objects in models when past_key_values are not provided (to align with generate)" to "🚨 Always return Cache objects in modelings (to align with generate)" on Aug 18, 2025
@Cyrilvallez (Member) left a comment

LGTM! Thanks a lot for the cleanup! Happy to get rid of it! Feel free to merge! 🤗

@gante (Member) commented Aug 18, 2025

Generally, we don't want any link Trainer <-> Cache as caching is only an inference speedup trick

@Cyrilvallez for completeness, PEFT has some fine-tuning methods that use caches, which is why we don't simply set use_cache &= not self.training. AFAIK Trainer has no use for caches except in the eval step.

@manueldeprada merged commit a36d51e into huggingface:main on Aug 18, 2025
24 of 25 checks passed
@gante (Member) left a comment

LGTM, thank you for iterating 💛

@Cyrilvallez (Member) commented Aug 18, 2025

Nice, thanks @gante for the heads-up! cc @BenjaminBossan just in case, but I believe PEFT already supports both formats anyway, as recent models have been returning Cache objects for a long time! 🤗

@BenjaminBossan (Member) commented

Thanks for the heads up, I ran the relevant tests against the latest transformers main branch and they all passed. The new warning was not triggered, so I think we're good on the PEFT side.
