[work in progress] Support of generate #261
base: main
Conversation
Next steps are to create tests and add docs.
There are 3 open questions:
As a side note, to support downstream evals during training, we do not necessarily need
Let's keep the existing format and not use a new rule.
The
I don't really see the point of
I just added future parity with However, I meant a different thing here — Fast-LLM's That said, @tscholak is okay with using tuples as keys in
We still need to support different global batch sizes for the same For example, consider a training scenario with:
This configuration uses However, when performing evaluation during training, we cannot use the same Additionally,
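The batch-geometry arithmetic being discussed can be sketched as follows. This is a minimal illustration of why training and in-training evaluation may need different global batch sizes; all function and parameter names here are illustrative, not Fast-LLM's actual config fields:

```python
# Hypothetical batch-geometry sketch (names are illustrative, not Fast-LLM's API).
def global_batch_size(micro_batch_size: int,
                      sequential_micro_batches: int,
                      data_parallel: int) -> int:
    # During training, sequential micro-batches (gradient accumulation)
    # contribute to the effective global batch size.
    return micro_batch_size * sequential_micro_batches * data_parallel

# Training: accumulate gradients over 8 sequential micro-batches.
train_gbs = global_batch_size(micro_batch_size=4,
                              sequential_micro_batches=8,
                              data_parallel=2)  # 64

# Inference/eval: there is no gradient accumulation, so sequential
# micro-batches are just separate batches; only micro_batch_size *
# data_parallel samples are processed per forward pass.
eval_batch_per_step = 4 * 2  # 8
```

The point is that reusing the training batch configuration for evaluation would silently change the effective batch semantics, since the accumulation factor has no meaning at inference time.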
Yeah, you're right —
The tuple and dot formats are basically the same; it's just parsed vs. unparsed. I don't think there is much use for the unparsed version outside of a runnable, because dicts are better in-code.
Again, note that sequential micro-batches are irrelevant for inference; they're just separate batches for all practical purposes.
✨ Description
Adds support for `generate` and extends support for `forward`, without handling `cache`, `past_key_values`, `labels`, `attention` output, or `inputs_embeds`. `position_ids` are ignored and reconstructed from the attention mask. Currently, only data-parallel generation is supported.
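Reconstructing `position_ids` from the attention mask can be sketched as below. This is a hypothetical helper in the style of Hugging Face generation loops, not Fast-LLM's actual implementation; the key idea is that a cumulative count of attended tokens keeps positions correct under left-padding:

```python
import torch

def position_ids_from_attention_mask(attention_mask: torch.Tensor) -> torch.Tensor:
    # Cumulative count of attended tokens gives each real token its position,
    # so left-padding does not shift the positions of real tokens.
    position_ids = attention_mask.long().cumsum(-1) - 1
    # Positions at padding slots are never attended to; set them to 0.
    position_ids.masked_fill_(attention_mask == 0, 0)
    return position_ids

mask = torch.tensor([[0, 0, 1, 1, 1]])  # left-padded sequence
print(position_ids_from_attention_mask(mask))  # tensor([[0, 0, 0, 1, 2]])
```

Because the positions are derived purely from the mask, any `position_ids` passed in by the caller can safely be ignored, which matches the behavior described above.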
Closes #217
🔍 Type of change
Select all that apply:
📝 Changes
List the key changes introduced in this PR:
✅ Checklist
Make sure the following tasks are completed before submitting the PR:
General
Dependencies and Configuration
Testing
Performance Impact
📊 Performance Impact Details
If there is any impact on performance, describe it and provide benchmark results, if applicable:
🗒️ Additional Notes
Include any additional context, information, or considerations here, such as known issues, follow-up tasks, or backward compatibility concerns.