[Cache] Fix environment variable handling for offline mode #1902
base: main
Conversation
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review. Note: this is required to complete the testing suite; please only add the label once the PR is code complete and local testing has been performed.
Summary of Changes: Hello @ralphbean, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.
Code Review

This pull request improves offline mode support by ensuring Hugging Face environment variables for caching are respected. The changes correctly propagate the cache_dir from model arguments to dataset arguments, and adjust hf_hub_download to use the default caching behavior. My main feedback is on how the cache_dir is added to dataset_args, suggesting a more explicit definition in the DatasetArguments dataclass for better code clarity and maintainability.
For context, I got interested in fixing this after trying to make llm-compressor work in combination with hermeto: hermetoproject/hermeto#1141
Force-pushed from 655881e to b8c6ed6
Previously, llm-compressor ignored HF_HUB_CACHE and other environment variables when loading models and datasets, making offline mode difficult to use with unified cache directories.

This change:
- Removes the hard-coded TRANSFORMERS_CACHE in model_load/helpers.py to respect the HF_HOME and HF_HUB_CACHE environment variables
- Propagates cache_dir from model_args to dataset_args to enable a unified cache directory for both models and datasets
- Updates dataset loading to use the cache_dir parameter instead of a hardcoded None

Now users can specify the cache_dir parameter or use the HF_HOME/HF_HUB_CACHE environment variables for true offline operation.

Signed-off-by: Ralph Bean <rbean@redhat.com>
Co-Authored-By: Claude <noreply@anthropic.com>
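The propagation step described in the commit message could be sketched as follows. The names here are simplified stand-ins; the real argument classes live inside llm-compressor and carry many more fields.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ModelArguments:
    cache_dir: Optional[str] = None

@dataclass
class DatasetArguments:
    cache_dir: Optional[str] = None

def propagate_cache_dir(model_args: ModelArguments,
                        dataset_args: DatasetArguments) -> None:
    # If the user set cache_dir for the model and did not set one for the
    # dataset, reuse the model's value so a single directory serves both
    # (the "unified cache directory" the commit message describes).
    if dataset_args.cache_dir is None:
        dataset_args.cache_dir = model_args.cache_dir

model_args = ModelArguments(cache_dir="./hf-cache")
dataset_args = DatasetArguments()
propagate_cache_dir(model_args, dataset_args)
print(dataset_args.cache_dir)  # → ./hf-cache
```

An explicitly provided dataset cache_dir is left untouched, so the propagation only fills the gap rather than overriding user intent.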
Force-pushed from b8c6ed6 to 2dea601
Thanks for the contribution! One question
     repo_id=cache_path,
     filename="config.json",
-    cache_dir=TRANSFORMERS_CACHE,
+    cache_dir=None,
Why isn't this model_args.cache_dir instead of None?
I think it would be better to remove model_args.cache_dir and dataset_args.cache_dir if their removal means that the user can use HF_HUB_CACHE for both.
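For context on why passing cache_dir=None is enough: the Hugging Face libraries already resolve the cache location from the environment when no explicit directory is given. The sketch below is a simplified model of that fallback order, not the actual huggingface_hub implementation.

```python
import os
from typing import Optional

def resolve_cache_dir(explicit: Optional[str] = None) -> str:
    # Simplified sketch of the Hugging Face cache resolution order:
    # explicit argument > HF_HUB_CACHE > HF_HOME/hub > ~/.cache/huggingface/hub
    if explicit is not None:
        return explicit
    if "HF_HUB_CACHE" in os.environ:
        return os.environ["HF_HUB_CACHE"]
    if "HF_HOME" in os.environ:
        return os.path.join(os.environ["HF_HOME"], "hub")
    return os.path.expanduser("~/.cache/huggingface/hub")
```

Under this model, dropping the hard-coded cache_dir lets a single HF_HUB_CACHE (or HF_HOME) setting govern both model and dataset downloads.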
SUMMARY:
Previously, llm-compressor ignored HF_HUB_CACHE and other environment variables when loading models and datasets, making offline mode difficult to use with unified cache directories.
This change:
- Removes the hard-coded TRANSFORMERS_CACHE in model_load/helpers.py to respect the HF_HOME and HF_HUB_CACHE environment variables
- Propagates cache_dir from model_args to dataset_args to enable a unified cache directory for both models and datasets
- Updates dataset loading to use the cache_dir parameter instead of a hardcoded None
Now users can specify cache_dir parameter or use HF_HOME/HF_HUB_CACHE environment variables for true offline operation.
Offline mode is super helpful to supply-chain security use cases. It helps us generate trustworthy SBOMs for AI stuff. 🔐 🧠
TEST PLAN:
I started with the oneshot example from the README and called it example.py.

Next, remove your local Hugging Face cache to ensure your system has nothing available to it yet:
❯ rm -rf ~/.cache/huggingface
Then, run example.py with the HF_HUB_OFFLINE=1 env var. This should fail, proving that you have nothing cached.

Good. Now, run it with HF_HOME=./hf-hub, which will run it in online mode, populating the cache in a new non-standard location (just to be sure things don't get mixed up during our test):
which will run it in online mode, populating the cache in a new non-standard location (just to be sure things don't get mixed up during our test):Now, finally, you can run with both HF_HOME and HF_HUB_OFFLINE=1 and prove to yourself that llm-compressor uses that freshly-populated cache for both the model and the dataset.