🚨 Always return Cache objects in modelings (to align with generate) #39765
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
595a1a4 to 6567a14 (compare)
LGTM! Thank you for working on it.
Changed the title: `from_legacy_cache` as initialization → `from_legacy_cache` as Cache initialization
Hey, sorry for the delay on this! Thanks a lot, happy to start cleaning everything up to finally drop the legacy format soon! This needs a rebase to fix the conflict, though! And for `EncoderDecoderCache`, let's not provide default values in the init; let's instantiate with 2 `DynamicCache` objects in the models instead (see the 2 PRs I linked!)
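A minimal sketch of what that suggestion could look like inside a model's forward pass, assuming transformers' `cache_utils` API; the surrounding variables are illustrative, not the actual modeling code:

```python
from transformers.cache_utils import DynamicCache, EncoderDecoderCache

past_key_values = None  # e.g. first forward call, nothing cached yet
use_cache = True

# Build the cache explicitly from two DynamicCache objects
# (self-attention cache, cross-attention cache), rather than relying
# on default values inside EncoderDecoderCache's __init__:
if use_cache and past_key_values is None:
    past_key_values = EncoderDecoderCache(DynamicCache(), DynamicCache())
```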
Perfect, thanks a lot! Looks like you just need to run `make fix-copies` for Flaubert (probably only the docstrings change) to make CI happy. If it's no longer consistent because of a real change, then you can just break the copy! Feel free to merge once it's done and CI is happy!
Changed the title: `from_legacy_cache` as Cache initialization → `from_legacy_cache` as Cache init, make pipeline use Caches, fix and test `Cache.select_indices`
@manueldeprada @Cyrilvallez sorry to potentially add one more task here :D Should we update all …
@@ -1197,6 +1197,28 @@ def test_dynamic_cache(self):
            "DynamicCache Scenario 2 layer 1 failed",
        )

    def test_dynamic_cache_batch_select_indices(self):
This additional test does not hurt, since `batch_select_indices` was never tested.
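For context, a minimal sketch of what such a test could exercise, assuming `DynamicCache.batch_select_indices` filters the batch dimension in place; the shapes and indices here are illustrative:

```python
import torch
from transformers.cache_utils import DynamicCache

cache = DynamicCache()
# Populate layer 0 with (batch=4, heads=2, seq_len=5, head_dim=8) states
cache.update(torch.rand(4, 2, 5, 8), torch.rand(4, 2, 5, 8), layer_idx=0)

# Keep only batch items 0 and 2, in place
cache.batch_select_indices(torch.tensor([0, 2]))
assert cache.key_cache[0].shape[0] == 2
```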
Alright, thanks for reverting the previous changes. We just need to remove the kwarg in all the initializations of `EncoderDecoderCache` to simplify our lives with #40008. Also, I just checked `generate`, and we actually return the `Cache` format all the time, even when the kv are provided as a tuple. So I think it makes sense to do the same here and always return a `Cache`, whatever the input, not only half the time. That way we can drop `return_legacy_cache` in all modelings, and never go back to the legacy format 🤗
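A hedged sketch of the resulting pattern; the helper below is illustrative, assuming transformers' `cache_utils` API, and is not a quote from the actual diff:

```python
import torch
from transformers.cache_utils import Cache, DynamicCache

def normalize_cache(past_key_values, use_cache=True):
    """Normalize whatever the caller passed into a Cache object (sketch)."""
    if not use_cache:
        return None
    if past_key_values is None:
        return DynamicCache()  # explicit Cache init, no from_legacy_cache(None)
    if isinstance(past_key_values, Cache):
        return past_key_values
    # Legacy tuple input is still converted on the way in, but the model
    # now returns the Cache as-is: no return_legacy_cache round-trip.
    return DynamicCache.from_legacy_cache(past_key_values)

legacy = tuple((torch.zeros(1, 2, 4, 8), torch.zeros(1, 2, 4, 8)) for _ in range(2))
print(type(normalize_cache(legacy)).__name__)  # DynamicCache, a Cache either way
```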
[For maintainers] Suggested jobs to run (before merge): run-slow: autoformer, bark, bart, bert, bert_generation, big_bird, bigbird_pegasus, biogpt, blenderbot, blenderbot_small, blip, bridgetower, camembert, clvp, cpmant, ctrl
@Cyrilvallez all done!! Thanks, now the diff is very nice: +378 −538 🧹 🧹 😄
run-slow: autoformer, bark, bart, bert, bert_generation, big_bird, bigbird_pegasus, biogpt, blenderbot, blenderbot_small, blip, bridgetower, camembert, clvp, cpmant, ctrl
This comment contains run-slow, running the specified jobs: models: ['models/autoformer', 'models/bark', 'models/bart', 'models/bert', 'models/bert_generation', 'models/big_bird', 'models/bigbird_pegasus', 'models/biogpt', 'models/blenderbot', 'models/blenderbot_small', 'models/blip', 'models/bridgetower', 'models/camembert', 'models/clvp', 'models/cpmant', 'models/ctrl']
Checked the slow tests and all the failures seem unrelated. There are a bunch of them that will be fixed by #40008!
LGTM! Thanks a lot for the cleanup! Happy to get rid of it! Feel free to merge! 🤗
@Cyrilvallez for completeness, PEFT has some fine-tuning methods using caches, which is why we don't simply set …
LGTM, thank you for iterating 💛
Nice, thanks @gante for the heads-up! cc @BenjaminBossan then just in case, but I believe peft already supports both formats anyway, as recent models have been returning Cache objects for a long time! 🤗
Thanks for the heads up, I ran the relevant tests against the latest transformers main branch and they all passed. The new warning was not triggered, so I think we're good on the PEFT side.
This PR removes reliance on `Cache.from_legacy_cache(past_key_values)` for initializing `None` `past_key_values`, replacing it with explicit cache initialization. The previous approach also set `return_legacy_cache=True`, unintentionally returning legacy tuples and masking other issues. This change is necessary to support the upcoming deprecation of `from_legacy_cache` in v4.58.

Note: This update revealed an issue in `pipelines`, where `loader_batch_item` expects legacy tuples when iterating over `ModelOutputs`. It failed when encountering `Cache` objects.
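A hedged sketch of the kind of dual-format handling that issue calls for; `loader_batch_item` is the function named above, but this helper and its slicing logic are illustrative assumptions, not the actual fix:

```python
import torch
from transformers.cache_utils import Cache

def select_batch_item(past_key_values, i):
    """Take batch element i from either cache format (illustrative)."""
    if isinstance(past_key_values, Cache):
        # Cache objects are not nested tuples, so the old tuple indexing
        # fails on them; use the cache's own batch selection instead
        # (batch_select_indices mutates the cache in place).
        past_key_values.batch_select_indices(torch.tensor([i]))
        return past_key_values
    # Legacy format: one (key, value) tensor pair per layer
    return tuple((k[i : i + 1], v[i : i + 1]) for k, v in past_key_values)
```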