Commit 4f77130

metric-space, mrm8488, zhongdongy, innovation64, and yao-matrix authored
DDPO (#1403)
* First commit * [idefics]: Fix grammar (#1402) * Add: zh/optimizing-bark.md (#1404) * update soc3-zn * Update _blog.yml Try to resolve conflicts * Update: proofreading zh/ethics-soc-3.md * add how-to-generate cn version Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * unity game in hf space translation completed * Update: punctuations of how-to-generate.md * hf-bitsandbytes-integration cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Proofread hf-bitsandbytes-integration.md * Proofread: red-teaming.md * Update: add red-teaming to zh/_blog.yml * Update _blog.yml * Update: add red-teaming to zh/_blog.yml Fix: red-teaming title in zh/_blog.yml * Fix: red-teaming PPLM translation * deep-learning-with-proteins cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Add: stackllama.md * if blog translation completed * Update unity-in-spaces.md Add a link for AI game * Update if.md Fix “普罗大众” to “普惠大众” * deep-learning-with-proteins cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * add starcoder cn Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: formatting and punctuations of starcoder.md * add starcoder cn Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Update: proofreading zh/unity-in-spaces.md * fix(annotated-diffusion.md): fix image shape desc in PIL and Tensor (#1080) modifiy the comment after ToTensor with the correct image shape CHW * Add text-to-video blog (#1058) Adds an overview of text-to-video generative models, task specific challenges, datasets, and more. 
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix broken link in text-to-video.md (#1083) * Update: proofreading zh/unity-in-spaces.md Fix: incorrect _blog.yml format * Update: proofreading zh/deep-learning-with-proteins.md * update ethics-diffusers-cn (#6) * update ethics-diffusers * update ethics-diffusers --------- Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> * Update: proofreading zh/ethics-diffusers.md * 1. introducing-csearch done (#11) 2. text-to-video done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Update: proofread zh/text-to-video.md * Update: proofreading zh/introducing-csearch.md * generative-ai-models-on-intel-cpu cn done (#13) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: proofread zh/generative-ai-models-on-intel-cpu.md Signed-off-by: Yang, Zhongdong <zhongdong_y@outlook.com> * add starchat-alpha zh translation (#10) * Preparing blogpost annoucing `safetensors` security audit + official support. (#1096) * Preparing blogpost annoucing `safetensors` security audit + official support. * Taking into account comments + Grammarly. * Update safetensors-official.md * Apply suggestions from code review Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update safetensors-official.md * Apply suggestions from code review Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> * Apply suggestions from code review * Update safetensors-official.md Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> * Apply suggestions from code review * Adding thumbnail. * Include changes from Stella. * Update safetensors-official.md * Update with Stella's comments. * Remove problematic sentence. * Rename + some rephrasing. 
* Apply suggestions from code review Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> * Update safetensors-security-audit.md Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> * Last fixes. * Apply suggestions from code review Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> --------- Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> * Hotfixing safetensors. (#1131) * Removing the checklist formatting is busted. (#1132) * Update safetensors-security-audit.md (#1134) * [time series transformers] update dataloader API (#1135) * update dataloader API * revert comment * add back Cached transform * New post: Hugging Face and IBM (#1130) * Initial version * Minor fixes * Update huggingface-and-ibm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update huggingface-and-ibm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Resize image * Update blog index --------- Co-authored-by: Julien Simon <julsimon@huggingface.co> Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Show authors of safetensors blog post (#1137) Update: proofread zh/starchat-alpha.md * add megatron-training & assisted-generation (#8) * add megatron-training * add megatron-training * add megatron-training * add megatron-training * add assisted-generation * add assisted-generation * add assisted-generation * Update: proofreading zh/assisted-generation * Update: proofread zh/megatron-training.md * rwkv model blog translation completed (#12) * rwkv model blog translation completed * add 3 additional parts in the blog tail * Update: proofread zh/rwkv.md * Fix: missing subtitle/notes for image references. 
* encoder-decoder cn done (#14) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> * Update: proofread zh/encoder-decoder.md * constrained-beam-search cn done (#15) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: proofread zh/constrained-beam-search.md * Update: zh/unity-api.md + zh/unity-asr.md * unity ai speech recognition blog translation completed * add (GameObject) to attach its Chinese translation * finish unity-api translation * add unity series entry to zh/_blog.yml * Update: proofread zh/unity-{api,asr}.md * Update zh/falcon.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: zh/falcon.md * instruction-tuning-sd cn done (#21) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Update: zh/instruction-tuning-sd.md * fine-tune-whisper cn done (#23) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Update: zh/fine-tune-whisper.md * add mms_adapters and policy (#22) Update: zh/policy-ntia-rfc.md * Update: refine zh/mms_adapters.md Update: remove incompleted file * Update: zh/llm-leaderboard.md, zh/autoformer.md * add llm-leaderboard CN translation * add CN translation for autoformer * Update: proofreading zh/autoformer.md * BridgeTower blog post (#1118) * Update BridgeTower blog post (#1277) * LLM Eval: minor typos and nits (#1263) * Fix anchor link to custom pipeline section. 
(#485) * Update: zh/llm-leaderboard.md, zh/autoformer.md * add llm-leaderboard CN translation * add CN translation for autoformer Update: proofreading zh/autoformer.md Update: proofreading zh/llm-leaderboard.md * Update: proofreading zh/ethics-soc-4.md * Update "How to deploy LLM" blog post to use `huggingface_hub` in example (#1290) * Use InferenceClient from huggingface_hub * Update inference-endpoints-llm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> --------- Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update BridgeTower blog post (#1295) * Removed duplicate numbering (#1171) * Update: zh/evaluating-mmlu-leaderboard.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> Update: proofreading zh/evaluating-mmlu-leaderboard.md * Translate train-optimize-sd-intel.md to zh (#16) * Translate "stackllama" into Chinese * Create train-optimize-sd-intel.md Add new Update: zh/train-optimize-sd-intel.md * Update: zh/dedup.md & zh/stable-diffusion-finetuning-intel.md * dedup cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * stable-diffusion-finetuning-intel cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> --------- Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: proofread zh/stable-diffusion-finetuning-intel.md * Update: proofread zh/dedup.md * Update: zh/inference-endpoints-llm.md Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> Update: proofread zh/inference-endpoints-llm.md * Update: zh/llama2.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Proofread: zh/llama2.md * Update: zh/diffusers-turns-1.md Proofread: zh/diffusers-turns-1.md * Fix: zh/diffusers-turns-1.md wrong meta data format Policy blog: Open ML Considerations in the EU AI Act (#1342) * Create .gitignore * Add files via upload * Create eu-ai-act-oss.md * Delete .gitignore * Update eu-ai-act-oss.md * Update eu-ai-act-oss.md * Update eu-ai-act-oss.md * Update _blog.yml * Update eu-ai-act-oss.md * 
Update: zh/game-jam-first-edition-results.md Update: zh/game-jam-first-edition-results.md * Add: zh/bridgetower.md, zh/getting-started-habana.md, zh/habana-gaudi-2-benchmark.md * 3 Gaudi posts cn done: - bridgetower.md - getting-started-habana.md - habana-gaudi-2-benchmark.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> --------- Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> * Update: zh/bridgetower.md, zh/getting-started-habana.md, zh/habana-gaudi-2-benchmark.md * Add: zh/transformers-design-philosophy.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: proofread zh/transformers-design-philosophy.md * Add: zh/os-llms.md * Translate os-llms.md * Update _blog.yml Update: proofread zh/os-llms.md * Add: zh/dpo-trl.md * update soc3-zn * Update _blog.yml Try to resolve conflicts * Update: proofreading zh/ethics-soc-3.md * add how-to-generate cn version Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * unity game in hf space translation completed * Update: punctuations of how-to-generate.md * hf-bitsandbytes-integration cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Proofread hf-bitsandbytes-integration.md * Proofread: red-teaming.md * Update: add red-teaming to zh/_blog.yml * Update _blog.yml * Update: add red-teaming to zh/_blog.yml Fix: red-teaming title in zh/_blog.yml * Fix: red-teaming PPLM translation * deep-learning-with-proteins cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Add: stackllama.md * if blog translation completed * Update unity-in-spaces.md Add a link for AI game * Update if.md Fix “普罗大众” to “普惠大众” * deep-learning-with-proteins cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * add starcoder cn Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: formatting and punctuations of starcoder.md * add starcoder cn Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Update: proofreading zh/unity-in-spaces.md * fix(annotated-diffusion.md): fix 
image shape desc in PIL and Tensor (#1080) modifiy the comment after ToTensor with the correct image shape CHW * Add text-to-video blog (#1058) Adds an overview of text-to-video generative models, task specific challenges, datasets, and more. Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix broken link in text-to-video.md (#1083) * Update: proofreading zh/unity-in-spaces.md Fix: incorrect _blog.yml format * Update: proofreading zh/deep-learning-with-proteins.md * update ethics-diffusers-cn (#6) * update ethics-diffusers * update ethics-diffusers --------- Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> * Update: proofreading zh/ethics-diffusers.md * 1. introducing-csearch done (#11) 2. text-to-video done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Update: proofread zh/text-to-video.md * Update: proofreading zh/introducing-csearch.md * generative-ai-models-on-intel-cpu cn done (#13) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: proofread zh/generative-ai-models-on-intel-cpu.md Signed-off-by: Yang, Zhongdong <zhongdong_y@outlook.com> * add starchat-alpha zh translation (#10) * Preparing blogpost annoucing `safetensors` security audit + official support. (#1096) * Preparing blogpost annoucing `safetensors` security audit + official support. * Taking into account comments + Grammarly. * Update safetensors-official.md * Apply suggestions from code review Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update safetensors-official.md * Apply suggestions from code review Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> * Apply suggestions from code review * Update safetensors-official.md Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> * Apply suggestions from code review * Adding thumbnail. 
* Include changes from Stella. * Update safetensors-official.md * Update with Stella's comments. * Remove problematic sentence. * Rename + some rephrasing. * Apply suggestions from code review Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> * Update safetensors-security-audit.md Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> * Last fixes. * Apply suggestions from code review Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> --------- Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> * Hotfixing safetensors. (#1131) * Removing the checklist formatting is busted. (#1132) * Update safetensors-security-audit.md (#1134) * [time series transformers] update dataloader API (#1135) * update dataloader API * revert comment * add back Cached transform * New post: Hugging Face and IBM (#1130) * Initial version * Minor fixes * Update huggingface-and-ibm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update huggingface-and-ibm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Resize image * Update blog index --------- Co-authored-by: Julien Simon <julsimon@huggingface.co> Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Show authors of safetensors blog post (#1137) Update: proofread zh/starchat-alpha.md * add megatron-training & assisted-generation (#8) * add megatron-training * add megatron-training * add megatron-training * add megatron-training * add assisted-generation * add assisted-generation * add assisted-generation * Update: proofreading zh/assisted-generation * Update: proofread zh/megatron-training.md * rwkv model blog translation completed (#12) * rwkv model blog translation completed * add 3 
additional parts in the blog tail * Update: proofread zh/rwkv.md * Fix: missing subtitle/notes for image references. * encoder-decoder cn done (#14) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> * Update: proofread zh/encoder-decoder.md * constrained-beam-search cn done (#15) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: proofread zh/constrained-beam-search.md * Update: zh/unity-api.md + zh/unity-asr.md * unity ai speech recognition blog translation completed * add (GameObject) to attach its Chinese translation * finish unity-api translation * add unity series entry to zh/_blog.yml * Update: proofread zh/unity-{api,asr}.md * Update zh/falcon.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: zh/falcon.md * instruction-tuning-sd cn done (#21) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Update: zh/instruction-tuning-sd.md * fine-tune-whisper cn done (#23) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Update: zh/fine-tune-whisper.md * add mms_adapters and policy (#22) Update: zh/policy-ntia-rfc.md * Update: refine zh/mms_adapters.md Update: remove incompleted file * Update: zh/llm-leaderboard.md, zh/autoformer.md * add llm-leaderboard CN translation * add CN translation for autoformer * Update: proofreading zh/autoformer.md * BridgeTower blog post (#1118) * Update BridgeTower blog post (#1277) * LLM Eval: minor typos and nits (#1263) * Fix anchor link to custom pipeline section. 
(#485) * Update: zh/llm-leaderboard.md, zh/autoformer.md * add llm-leaderboard CN translation * add CN translation for autoformer Update: proofreading zh/autoformer.md Update: proofreading zh/llm-leaderboard.md * Update: proofreading zh/ethics-soc-4.md * Update "How to deploy LLM" blog post to use `huggingface_hub` in example (#1290) * Use InferenceClient from huggingface_hub * Update inference-endpoints-llm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> --------- Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update BridgeTower blog post (#1295) * Removed duplicate numbering (#1171) * Update: zh/evaluating-mmlu-leaderboard.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> Update: proofreading zh/evaluating-mmlu-leaderboard.md * Translate train-optimize-sd-intel.md to zh (#16) * Translate "stackllama" into Chinese * Create train-optimize-sd-intel.md Add new Update: zh/train-optimize-sd-intel.md * Update: zh/dedup.md & zh/stable-diffusion-finetuning-intel.md * dedup cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * stable-diffusion-finetuning-intel cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> --------- Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: proofread zh/stable-diffusion-finetuning-intel.md * Update: proofread zh/dedup.md * Update: zh/inference-endpoints-llm.md Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> Update: proofread zh/inference-endpoints-llm.md * Update: zh/llama2.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Proofread: zh/llama2.md * Update: zh/diffusers-turns-1.md Proofread: zh/diffusers-turns-1.md * Fix: zh/diffusers-turns-1.md wrong meta data format Policy blog: Open ML Considerations in the EU AI Act (#1342) * Create .gitignore * Add files via upload * Create eu-ai-act-oss.md * Delete .gitignore * Update eu-ai-act-oss.md * Update eu-ai-act-oss.md * Update eu-ai-act-oss.md * Update _blog.yml * Update eu-ai-act-oss.md * 
Update: zh/game-jam-first-edition-results.md Update: zh/game-jam-first-edition-results.md * Add: zh/bridgetower.md, zh/getting-started-habana.md, zh/habana-gaudi-2-benchmark.md * 3 Gaudi posts cn done: - bridgetower.md - getting-started-habana.md - habana-gaudi-2-benchmark.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> --------- Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> * Update: zh/bridgetower.md, zh/getting-started-habana.md, zh/habana-gaudi-2-benchmark.md * Add: zh/transformers-design-philosophy.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: proofread zh/transformers-design-philosophy.md * dpo-trl cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> --------- Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Signed-off-by: Yang, Zhongdong <zhongdong_y@outlook.com> Co-authored-by: innovation64 <liyang19991126@126.com> Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> Co-authored-by: SuSung-boy <872414318@qq.com> Co-authored-by: Zhongdong Yang <yangzd1996@outlook.com> Co-authored-by: Luke Cheng <2258420+chenglu@users.noreply.github.com> Co-authored-by: yaoqih <40328311+yaoqih@users.noreply.github.com> Co-authored-by: Shiliang Chen <36809537+csl122@users.noreply.github.com> Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: 李洋 <45715979+innovation64@users.noreply.github.com> Co-authored-by: Hoi2022 <120370631+Hoi2022@users.noreply.github.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> Co-authored-by: Victor Muštar <victor.mustar@gmail.com> Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: Julien Simon 
<3436143+juliensimon@users.noreply.github.com> Co-authored-by: Julien Simon <julsimon@huggingface.co> Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: gxy-gxy <57594446+gxy-gxy@users.noreply.github.com> Co-authored-by: regisss <15324346+regisss@users.noreply.github.com> Co-authored-by: Lucain <lucainp@gmail.com> Co-authored-by: Eswar Divi <76403422+EswarDivi@users.noreply.github.com> Co-authored-by: Qi Zhang <82949744+Vermillion-de@users.noreply.github.com> * Update: proofread zh/dpo-trl.md * Add: zh/optimizing-bark.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: zh/optimizing-bark.md --------- Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Signed-off-by: Yang, Zhongdong <zhongdong_y@outlook.com> Co-authored-by: innovation64 <liyang19991126@126.com> Co-authored-by: Yao, Matrix <matrix.yao@intel.com> Co-authored-by: SuSung-boy <872414318@qq.com> Co-authored-by: Luke Cheng <2258420+chenglu@users.noreply.github.com> Co-authored-by: yaoqih <40328311+yaoqih@users.noreply.github.com> Co-authored-by: Shiliang Chen <36809537+csl122@users.noreply.github.com> Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: 李洋 <45715979+innovation64@users.noreply.github.com> Co-authored-by: Yao Matrix <yaoweifeng0301@126.com> Co-authored-by: Hoi2022 <120370631+Hoi2022@users.noreply.github.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> Co-authored-by: Victor Muštar <victor.mustar@gmail.com> Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: Julien Simon <3436143+juliensimon@users.noreply.github.com> Co-authored-by: Julien Simon <julsimon@huggingface.co> Co-authored-by: Pedro Cuenca <pedro@huggingface.co> 
Co-authored-by: gxy-gxy <57594446+gxy-gxy@users.noreply.github.com> Co-authored-by: regisss <15324346+regisss@users.noreply.github.com> Co-authored-by: Lucain <lucainp@gmail.com> Co-authored-by: Eswar Divi <76403422+EswarDivi@users.noreply.github.com> Co-authored-by: Qi Zhang <82949744+Vermillion-de@users.noreply.github.com> * Add AutoGPTQ integration blogpost (#1389) * add draft v1 * up * add colab example * Apply suggestions from code review Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * add title * Update gptq-integration.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update gptq-integration.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update gptq-integration.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update gptq-integration.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * more details * Update gptq-integration.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * added `PanEa` in the list of authors * add last sections * add `qwopqwop200` to the list of authors * add a sentence * add correct author name * change title * Apply suggestions from code review Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * add link * Update gptq-integration.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> --------- Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * [gptq] attempt to fix LaTeX (#1405) * [gptq] latex, take 2 (#1406) * [gptq] latex formatting (#1407) As tested in Codespaces * New case studies (#1361) * Add files via upload * Create writer-case-study.md * Update writer-case-study.md * Update _blog.yml * Create snorkel-case-study.md * Update _blog.yml * Create mantis-case-study.md * Create genomicsengland-case-study.md * Create databricks-case-study.md * Add files via upload * Update mantis-case-study.md * Update snorkel-case-study.md * Update _blog.yml * Update writer-case-study.md * Update 
mantis-case-study.md * Update mantis-case-study.md * Update mantis-case-study.md * Update mantis-case-study.md * Update databricks-case-study.md * Update _blog.yml * Update _blog.yml * Update _blog.yml * Update _blog.yml * Delete mantis.png * Update snorkel-case-study.md * Delete genomicsengland-case-study.md * Delete genomics.png * Fix databricks-case-study.md cc @VioletteLepercq * Fix writer-case-study.md cc @VioletteLepercq * Update writer-case-study.md (#1408) * Update snorkel-case-study.md (#1409) * Update mantis-case-study.md (#1410) * Add: zh/gptq-integration.md (#1411) * first draft of the translation, not all done yet. * second draft, done the translation to the whole blog, but still require a proofread * complete _blog.yml * improve latex formatting * fix latex format * adding back missing lines * change Chinese translation for word "democratize" * add space for digits Co-authored-by: Zhongdong Yang <yangzd1996@outlook.com> * add spaces for digits Co-authored-by: Zhongdong Yang <yangzd1996@outlook.com> * uppercase first character for the word "colab" Co-authored-by: Zhongdong Yang <yangzd1996@outlook.com> * improve chinese grammar usage Co-authored-by: Zhongdong Yang <yangzd1996@outlook.com> * add spaces for digits Co-authored-by: Zhongdong Yang <yangzd1996@outlook.com> * change the used bracket Co-authored-by: Zhongdong Yang <yangzd1996@outlook.com> * add spaces for digits Co-authored-by: Zhongdong Yang <yangzd1996@outlook.com> * add proofreader meta data Co-authored-by: Zhongdong Yang <yangzd1996@outlook.com> --------- Co-authored-by: Zhongdong Yang <yangzd1996@outlook.com> * Update falcon.md (#1415) I found a comma missing in the code, and I added the missing comma. 
* Deprecation of Git Authentication using password (#1393) * Draft: blog post about password git deprecation * Update password-git-deprecation.md * Apply suggestions from code review Co-authored-by: Julien Chaumond <julien@huggingface.co> * Update password-git-deprecation.md * Add thumb * Apply suggestions from code review Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update password-git-deprecation.md * Apply suggestions from code review Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com> * Update password-git-deprecation.md * Update password-git-deprecation.md Co-authored-by: Julien Chaumond <julien@huggingface.co> * Adding a sentence about ssh and token advantage, add in blog * Apply suggestions from code review Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com> * Update _blog.yml --------- Co-authored-by: Julien Chaumond <julien@huggingface.co> Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com> * GPTQ blogpost - Fix ToC (#1416) * Final fix ToC GPTQ blogpost (#1417) * Codellama (#1419) * init * add demo * Update codellama.md Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update codellama.md Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update codellama.md Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * 4-bit section, tweaks * adding image * Clarify FIM availability. * Move image to dataset. * update table * FIM typo Thanks ArthurZ! 
* update * Update codellama.md Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update codellama.md Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Apply suggestions from code review * Update codellama.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update codellama.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update codellama.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update codellama.md Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * Update codellama.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> --------- Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * Clonable demo (#1420) * [codellama] Minor demo description tweaks. (#1421) * Add a line about HuggingChat in Code Llama blog post (#1423) * Add a little blob about huggingchat in code llama * revert formatting * [CodeLlama]: simplify infilling with `<FILL_ME>` (#1424) * CodeLlama: simplify infilling with `<FILL_ME>`. Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Use `AutoTokenizer` It now works after [these PRs](https://huggingface.co/codellama/CodeLlama-7b-hf/discussions/11) have been merged. --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Change Name to code llama * Fix title * safecoder cn done (#40) (#1418) Co-authored-by: Yao Matrix <yaoweifeng0301@126.com> * Add some info on dtypes (#1425) * Add some info on dtypes * Apply suggestions from code review Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update codellama.md --------- Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update BridgeTower blog post with H100 benchmark (#1426) * typo fixes. 
(#1427) * [Code LLama] add section on vscode extension (#1429) * [Code LLama] add section on vscode extension * Update codellama.md Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update codellama.md --------- Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Typo (#1430) * Delete assets/160_codellama/vscode.png (#1431) @mishig25's PR was correctly referencing the image from `documentation-images` but the image had been uploaded here, and I missed it in the review 🤦 I uploaded the image to `documentation-images` so it's being correctly displayed, removing from here. * Add FlowGPT talk to events (#1428) * Add FlowGPT talk to events * Update _events.yml * [AudioLDM 2] Blog post (#1432) * up * up * add thumbnail * fix yml * fix blog * up * [AudioLDM2] Blog post fixes (#1434) * [AudioLDM2] Add diffusion tag * update blog post * [AudioLDM2] Second round of fixes (#1435) * Fix grammar error (#1433) * Add CHS translation of password-git-deprecation (#41) (#1437) * add CHS translation of password-git-deprecation.md * Add to index * CHS Typo fixes. (#1438) * Fix typo in pytorch-ddp-accelerate-transformers.md (#1436) * Add zh for deploy-deepfloydif-using-bentoml.md (#42) (#1441) * Update classification-use-cases.md (#1439) * Update classification-use-cases.md * Add files via upload * Update _blog.yml * Delete assets/78_ml_director_insights/blogthumbnail.png * Update accelerate-deepspeed.md (#1442) label error * [AudioLDM2] Fix math (#1446) * Fix typo (#1448) * Add blog (#1449) * test * add falcon * minor stuff * acknowledge * more params * more params * changes * changes * changes * change date * Move assets to dataset. 
* Additional links in the ToC * Use headings instead of bold * huggingface-cli login * Update falcon-180b.md * Update falcon-180b.md * Update falcon-180b.md * Update falcon-180b.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update falcon-180b.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> --------- Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update falcon-180b.md (#1450) * Fix demo link (#1451) * Fix falcon url (#1453) * Add precision comment (#1454) * Update falcon-180b.md * Update falcon-180b.md * Update falcon-180b.md --------- Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Add CHS translation (#1455) * Fix YAML issue. (#1457) * Update falcon-180b.md (#1458) * Add: zh/codellama.md (#1465) * update soc3-zn * Update _blog.yml Try to resolve conflicts * Update: proofreading zh/ethics-soc-3.md * add how-to-generate cn version Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * unity game in hf space translation completed * Update: punctuations of how-to-generate.md * hf-bitsandbytes-integration cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Proofread hf-bitsandbytes-integration.md * Proofread: red-teaming.md * Update: add red-teaming to zh/_blog.yml * Update _blog.yml * Update: add red-teaming to zh/_blog.yml Fix: red-teaming title in zh/_blog.yml * Fix: red-teaming PPLM translation * deep-learning-with-proteins cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Add: stackllama.md * if blog translation completed * Update unity-in-spaces.md Add a link for AI game * Update if.md Fix “普罗大众” to “普惠大众” * deep-learning-with-proteins cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * add starcoder cn Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: formatting and punctuations of starcoder.md * add starcoder cn Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Update: proofreading zh/unity-in-spaces.md * fix(annotated-diffusion.md): fix image shape desc in PIL and Tensor (#1080) modifiy the 
comment after ToTensor with the correct image shape CHW * Add text-to-video blog (#1058) Adds an overview of text-to-video generative models, task specific challenges, datasets, and more. Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix broken link in text-to-video.md (#1083) * Update: proofreading zh/unity-in-spaces.md Fix: incorrect _blog.yml format * Update: proofreading zh/deep-learning-with-proteins.md * update ethics-diffusers-cn (#6) * update ethics-diffusers * update ethics-diffusers --------- Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> * Update: proofreading zh/ethics-diffusers.md * 1. introducing-csearch done (#11) 2. text-to-video done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Update: proofread zh/text-to-video.md * Update: proofreading zh/introducing-csearch.md * generative-ai-models-on-intel-cpu cn done (#13) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: proofread zh/generative-ai-models-on-intel-cpu.md Signed-off-by: Yang, Zhongdong <zhongdong_y@outlook.com> * add starchat-alpha zh translation (#10) * Preparing blogpost annoucing `safetensors` security audit + official support. (#1096) * Preparing blogpost annoucing `safetensors` security audit + official support. * Taking into account comments + Grammarly. * Update safetensors-official.md * Apply suggestions from code review Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update safetensors-official.md * Apply suggestions from code review Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> * Apply suggestions from code review * Update safetensors-official.md Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> * Apply suggestions from code review * Adding thumbnail. * Include changes from Stella. 
* Update safetensors-official.md * Update with Stella's comments. * Remove problematic sentence. * Rename + some rephrasing. * Apply suggestions from code review Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> * Update safetensors-security-audit.md Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> * Last fixes. * Apply suggestions from code review Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> --------- Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> * Hotfixing safetensors. (#1131) * Removing the checklist formatting is busted. (#1132) * Update safetensors-security-audit.md (#1134) * [time series transformers] update dataloader API (#1135) * update dataloader API * revert comment * add back Cached transform * New post: Hugging Face and IBM (#1130) * Initial version * Minor fixes * Update huggingface-and-ibm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update huggingface-and-ibm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Resize image * Update blog index --------- Co-authored-by: Julien Simon <julsimon@huggingface.co> Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Show authors of safetensors blog post (#1137) Update: proofread zh/starchat-alpha.md * add megatron-training & assisted-generation (#8) * add megatron-training * add megatron-training * add megatron-training * add megatron-training * add assisted-generation * add assisted-generation * add assisted-generation * Update: proofreading zh/assisted-generation * Update: proofread zh/megatron-training.md * rwkv model blog translation completed (#12) * rwkv model blog translation completed * add 3 additional parts in the blog tail 
* Update: proofread zh/rwkv.md * Fix: missing subtitle/notes for image references. * encoder-decoder cn done (#14) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> * Update: proofread zh/encoder-decoder.md * constrained-beam-search cn done (#15) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: proofread zh/constrained-beam-search.md * Update: zh/unity-api.md + zh/unity-asr.md * unity ai speech recognition blog translation completed * add (GameObject) to attach its Chinese translation * finish unity-api translation * add unity series entry to zh/_blog.yml * Update: proofread zh/unity-{api,asr}.md * Update zh/falcon.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: zh/falcon.md * instruction-tuning-sd cn done (#21) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Update: zh/instruction-tuning-sd.md * fine-tune-whisper cn done (#23) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Update: zh/fine-tune-whisper.md * add mms_adapters and policy (#22) Update: zh/policy-ntia-rfc.md * Update: refine zh/mms_adapters.md Update: remove incompleted file * Update: zh/llm-leaderboard.md, zh/autoformer.md * add llm-leaderboard CN translation * add CN translation for autoformer * Update: proofreading zh/autoformer.md * BridgeTower blog post (#1118) * Update BridgeTower blog post (#1277) * LLM Eval: minor typos and nits (#1263) * Fix anchor link to custom pipeline section. 
(#485) * Update: zh/llm-leaderboard.md, zh/autoformer.md * add llm-leaderboard CN translation * add CN translation for autoformer Update: proofreading zh/autoformer.md Update: proofreading zh/llm-leaderboard.md * Update: proofreading zh/ethics-soc-4.md * Update "How to deploy LLM" blog post to use `huggingface_hub` in example (#1290) * Use InferenceClient from huggingface_hub * Update inference-endpoints-llm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> --------- Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update BridgeTower blog post (#1295) * Removed duplicate numbering (#1171) * Update: zh/evaluating-mmlu-leaderboard.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> Update: proofreading zh/evaluating-mmlu-leaderboard.md * Translate train-optimize-sd-intel.md to zh (#16) * Translate "stackllama" into Chinese * Create train-optimize-sd-intel.md Add new Update: zh/train-optimize-sd-intel.md * Update: zh/dedup.md & zh/stable-diffusion-finetuning-intel.md * dedup cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * stable-diffusion-finetuning-intel cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> --------- Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: proofread zh/stable-diffusion-finetuning-intel.md * Update: proofread zh/dedup.md * Update: zh/inference-endpoints-llm.md Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> Update: proofread zh/inference-endpoints-llm.md * Update: zh/llama2.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Proofread: zh/llama2.md * Update: zh/diffusers-turns-1.md Proofread: zh/diffusers-turns-1.md * Fix: zh/diffusers-turns-1.md wrong meta data format Policy blog: Open ML Considerations in the EU AI Act (#1342) * Create .gitignore * Add files via upload * Create eu-ai-act-oss.md * Delete .gitignore * Update eu-ai-act-oss.md * Update eu-ai-act-oss.md * Update eu-ai-act-oss.md * Update _blog.yml * Update eu-ai-act-oss.md * 
Update: zh/game-jam-first-edition-results.md Update: zh/game-jam-first-edition-results.md * Add: zh/bridgetower.md, zh/getting-started-habana.md, zh/habana-gaudi-2-benchmark.md * 3 Gaudi posts cn done: - bridgetower.md - getting-started-habana.md - habana-gaudi-2-benchmark.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> --------- Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> * Update: zh/bridgetower.md, zh/getting-started-habana.md, zh/habana-gaudi-2-benchmark.md * Add: zh/transformers-design-philosophy.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: proofread zh/transformers-design-philosophy.md * Add: zh/os-llms.md * Translate os-llms.md * Update _blog.yml Update: proofread zh/os-llms.md * Add: zh/dpo-trl.md * update soc3-zn * Update _blog.yml Try to resolve conflicts * Update: proofreading zh/ethics-soc-3.md * add how-to-generate cn version Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * unity game in hf space translation completed * Update: punctuations of how-to-generate.md * hf-bitsandbytes-integration cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Proofread hf-bitsandbytes-integration.md * Proofread: red-teaming.md * Update: add red-teaming to zh/_blog.yml * Update _blog.yml * Update: add red-teaming to zh/_blog.yml Fix: red-teaming title in zh/_blog.yml * Fix: red-teaming PPLM translation * deep-learning-with-proteins cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Add: stackllama.md * if blog translation completed * Update unity-in-spaces.md Add a link for AI game * Update if.md Fix “普罗大众” to “普惠大众” * deep-learning-with-proteins cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * add starcoder cn Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: formatting and punctuations of starcoder.md * add starcoder cn Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Update: proofreading zh/unity-in-spaces.md * fix(annotated-diffusion.md): fix 
image shape desc in PIL and Tensor (#1080) modifiy the comment after ToTensor with the correct image shape CHW * Add text-to-video blog (#1058) Adds an overview of text-to-video generative models, task specific challenges, datasets, and more. Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix broken link in text-to-video.md (#1083) * Update: proofreading zh/unity-in-spaces.md Fix: incorrect _blog.yml format * Update: proofreading zh/deep-learning-with-proteins.md * update ethics-diffusers-cn (#6) * update ethics-diffusers * update ethics-diffusers --------- Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> * Update: proofreading zh/ethics-diffusers.md * 1. introducing-csearch done (#11) 2. text-to-video done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Update: proofread zh/text-to-video.md * Update: proofreading zh/introducing-csearch.md * generative-ai-models-on-intel-cpu cn done (#13) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: proofread zh/generative-ai-models-on-intel-cpu.md Signed-off-by: Yang, Zhongdong <zhongdong_y@outlook.com> * add starchat-alpha zh translation (#10) * Preparing blogpost annoucing `safetensors` security audit + official support. (#1096) * Preparing blogpost annoucing `safetensors` security audit + official support. * Taking into account comments + Grammarly. * Update safetensors-official.md * Apply suggestions from code review Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update safetensors-official.md * Apply suggestions from code review Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> * Apply suggestions from code review * Update safetensors-official.md Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> * Apply suggestions from code review * Adding thumbnail. 
* Include changes from Stella. * Update safetensors-official.md * Update with Stella's comments. * Remove problematic sentence. * Rename + some rephrasing. * Apply suggestions from code review Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> * Update safetensors-security-audit.md Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> * Last fixes. * Apply suggestions from code review Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> --------- Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> * Hotfixing safetensors. (#1131) * Removing the checklist formatting is busted. (#1132) * Update safetensors-security-audit.md (#1134) * [time series transformers] update dataloader API (#1135) * update dataloader API * revert comment * add back Cached transform * New post: Hugging Face and IBM (#1130) * Initial version * Minor fixes * Update huggingface-and-ibm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update huggingface-and-ibm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Resize image * Update blog index --------- Co-authored-by: Julien Simon <julsimon@huggingface.co> Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Show authors of safetensors blog post (#1137) Update: proofread zh/starchat-alpha.md * add megatron-training & assisted-generation (#8) * add megatron-training * add megatron-training * add megatron-training * add megatron-training * add assisted-generation * add assisted-generation * add assisted-generation * Update: proofreading zh/assisted-generation * Update: proofread zh/megatron-training.md * rwkv model blog translation completed (#12) * rwkv model blog translation completed * add 3 
additional parts in the blog tail * Update: proofread zh/rwkv.md * Fix: missing subtitle/notes for image references. * encoder-decoder cn done (#14) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> * Update: proofread zh/encoder-decoder.md * constrained-beam-search cn done (#15) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: proofread zh/constrained-beam-search.md * Update: zh/unity-api.md + zh/unity-asr.md * unity ai speech recognition blog translation completed * add (GameObject) to attach its Chinese translation * finish unity-api translation * add unity series entry to zh/_blog.yml * Update: proofread zh/unity-{api,asr}.md * Update zh/falcon.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: zh/falcon.md * instruction-tuning-sd cn done (#21) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Update: zh/instruction-tuning-sd.md * fine-tune-whisper cn done (#23) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Update: zh/fine-tune-whisper.md * add mms_adapters and policy (#22) Update: zh/policy-ntia-rfc.md * Update: refine zh/mms_adapters.md Update: remove incompleted file * Update: zh/llm-leaderboard.md, zh/autoformer.md * add llm-leaderboard CN translation * add CN translation for autoformer * Update: proofreading zh/autoformer.md * BridgeTower blog post (#1118) * Update BridgeTower blog post (#1277) * LLM Eval: minor typos and nits (#1263) * Fix anchor link to custom pipeline section. 
(#485) * Update: zh/llm-leaderboard.md, zh/autoformer.md * add llm-leaderboard CN translation * add CN translation for autoformer Update: proofreading zh/autoformer.md Update: proofreading zh/llm-leaderboard.md * Update: proofreading zh/ethics-soc-4.md * Update "How to deploy LLM" blog post to use `huggingface_hub` in example (#1290) * Use InferenceClient from huggingface_hub * Update inference-endpoints-llm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> --------- Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update BridgeTower blog post (#1295) * Removed duplicate numbering (#1171) * Update: zh/evaluating-mmlu-leaderboard.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> Update: proofreading zh/evaluating-mmlu-leaderboard.md * Translate train-optimize-sd-intel.md to zh (#16) * Translate "stackllama" into Chinese * Create train-optimize-sd-intel.md Add new Update: zh/train-optimize-sd-intel.md * Update: zh/dedup.md & zh/stable-diffusion-finetuning-intel.md * dedup cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * stable-diffusion-finetuning-intel cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> --------- Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: proofread zh/stable-diffusion-finetuning-intel.md * Update: proofread zh/dedup.md * Update: zh/inference-endpoints-llm.md Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> Update: proofread zh/inference-endpoints-llm.md * Update: zh/llama2.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Proofread: zh/llama2.md * Update: zh/diffusers-turns-1.md Proofread: zh/diffusers-turns-1.md * Fix: zh/diffusers-turns-1.md wrong meta data format Policy blog: Open ML Considerations in the EU AI Act (#1342) * Create .gitignore * Add files via upload * Create eu-ai-act-oss.md * Delete .gitignore * Update eu-ai-act-oss.md * Update eu-ai-act-oss.md * Update eu-ai-act-oss.md * Update _blog.yml * Update eu-ai-act-oss.md * 
Update: zh/game-jam-first-edition-results.md Update: zh/game-jam-first-edition-results.md * Add: zh/bridgetower.md, zh/getting-started-habana.md, zh/habana-gaudi-2-benchmark.md * 3 Gaudi posts cn done: - bridgetower.md - getting-started-habana.md - habana-gaudi-2-benchmark.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> --------- Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> * Update: zh/bridgetower.md, zh/getting-started-habana.md, zh/habana-gaudi-2-benchmark.md * Add: zh/transformers-design-philosophy.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: proofread zh/transformers-design-philosophy.md * dpo-trl cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> --------- Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Signed-off-by: Yang, Zhongdong <zhongdong_y@outlook.com> Co-authored-by: innovation64 <liyang19991126@126.com> Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> Co-authored-by: SuSung-boy <872414318@qq.com> Co-authored-by: Zhongdong Yang <yangzd1996@outlook.com> Co-authored-by: Luke Cheng <2258420+chenglu@users.noreply.github.com> Co-authored-by: yaoqih <40328311+yaoqih@users.noreply.github.com> Co-authored-by: Shiliang Chen <36809537+csl122@users.noreply.github.com> Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com> Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: 李洋 <45715979+innovation64@users.noreply.github.com> Co-authored-by: Hoi2022 <120370631+Hoi2022@users.noreply.github.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> Co-authored-by: Victor Muštar <victor.mustar@gmail.com> Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: Julien Simon 
<3436143+juliensimon@users.noreply.github.com> Co-authored-by: Julien Simon <julsimon@huggingface.co> Co-authored-by: Pedro Cuenca <pedro@huggingface.co> Co-authored-by: gxy-gxy <57594446+gxy-gxy@users.noreply.github.com> Co-authored-by: regisss <15324346+regisss@users.noreply.github.com> Co-authored-by: Lucain <lucainp@gmail.com> Co-authored-by: Eswar Divi <76403422+EswarDivi@users.noreply.github.com> Co-authored-by: Qi Zhang <82949744+Vermillion-de@users.noreply.github.com> * Update: proofread zh/dpo-trl.md * Add: zh/optimizing-bark.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: zh/optimizing-bark.md * Add: zh/sd_distillation.md * update sd_distillation cn * update sd_distillation cn * update sd_distillation cn * update sd_distillation cn * Update zh/sd_distillation.md * Update zh/sd_distillation.md --------- Co-authored-by: Zhongdong Yang <yangzd1996@outlook.com> Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> Update: zh/sd_distillation.md * Add zh for deploy-deepfloydif-using-bentoml.md (#42) * Add: zh/codellama.md & Update: zh/llama2.md * update soc3-zn * Update _blog.yml Try to resolve conflicts * Update: proofreading zh/ethics-soc-3.md * add how-to-generate cn version Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * unity game in hf space translation completed * Update: punctuations of how-to-generate.md * hf-bitsandbytes-integration cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Proofread hf-bitsandbytes-integration.md * Proofread: red-teaming.md * Update: add red-teaming to zh/_blog.yml * Update _blog.yml * Update: add red-teaming to zh/_blog.yml Fix: red-teaming title in zh/_blog.yml * Fix: red-teaming PPLM translation * deep-learning-with-proteins cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Add: stackllama.md * if blog translation completed * Update unity-in-spaces.md Add a link for AI game * Update if.md Fix “普罗大众” to “普惠大众” * deep-learning-with-proteins cn done Signed-off-by: Yao, Matrix 
<matrix.yao@intel.com> * add starcoder cn Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: formatting and punctuations of starcoder.md * add starcoder cn Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Update: proofreading zh/unity-in-spaces.md * fix(annotated-diffusion.md): fix image shape desc in PIL and Tensor (#1080) modifiy the comment after ToTensor with the correct image shape CHW * Add text-to-video blog (#1058) Adds an overview of text-to-video generative models, task specific challenges, datasets, and more. Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix broken link in text-to-video.md (#1083) * Update: proofreading zh/unity-in-spaces.md Fix: incorrect _blog.yml format * Update: proofreading zh/deep-learning-with-proteins.md * update ethics-diffusers-cn (#6) * update ethics-diffusers * update ethics-diffusers --------- Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> * Update: proofreading zh/ethics-diffusers.md * 1. introducing-csearch done (#11) 2. text-to-video done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Update: proofread zh/text-to-video.md * Update: proofreading zh/introducing-csearch.md * generative-ai-models-on-intel-cpu cn done (#13) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: proofread zh/generative-ai-models-on-intel-cpu.md Signed-off-by: Yang, Zhongdong <zhongdong_y@outlook.com> * add starchat-alpha zh translation (#10) * Preparing blogpost annoucing `safetensors` security audit + official support. (#1096) * Preparing blogpost annoucing `safetensors` security audit + official support. * Taking into account comments + Grammarly. 
* Update safetensors-official.md * Apply suggestions from code review Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update safetensors-official.md * Apply suggestions from code review Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> * Apply suggestions from code review * Update safetensors-official.md Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> * Apply suggestions from code review * Adding thumbnail. * Include changes from Stella. * Update safetensors-official.md * Update with Stella's comments. * Remove problematic sentence. * Rename + some rephrasing. * Apply suggestions from code review Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> * Update safetensors-security-audit.md Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> * Last fixes. * Apply suggestions from code review Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> --------- Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> Co-authored-by: DeltaPenrose <128761972+DeltaPenrose@users.noreply.github.com> * Hotfixing safetensors. (#1131) * Removing the checklist formatting is busted. 
(#1132) * Update safetensors-security-audit.md (#1134) * [time series transformers] update dataloader API (#1135) * update dataloader API * revert comment * add back Cached transform * New post: Hugging Face and IBM (#1130) * Initial version * Minor fixes * Update huggingface-and-ibm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update huggingface-and-ibm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Resize image * Update blog index --------- Co-authored-by: Julien Simon <julsimon@huggingface.co> Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Show authors of safetensors blog post (#1137) Update: proofread zh/starchat-alpha.md * add megatron-training & assisted-generation (#8) * add megatron-training * add megatron-training * add megatron-training * add megatron-training * add assisted-generation * add assisted-generation * add assisted-generation * Update: proofreading zh/assisted-generation * Update: proofread zh/megatron-training.md * rwkv model blog translation completed (#12) * rwkv model blog translation completed * add 3 additional parts in the blog tail * Update: proofread zh/rwkv.md * Fix: missing subtitle/notes for image references. 
* encoder-decoder cn done (#14) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> * Update: proofread zh/encoder-decoder.md * constrained-beam-search cn done (#15) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: proofread zh/constrained-beam-search.md * Update: zh/unity-api.md + zh/unity-asr.md * unity ai speech recognition blog translation completed * add (GameObject) to attach its Chinese translation * finish unity-api translation * add unity series entry to zh/_blog.yml * Update: proofread zh/unity-{api,asr}.md * Update zh/falcon.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: zh/falcon.md * instruction-tuning-sd cn done (#21) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Update: zh/instruction-tuning-sd.md * fine-tune-whisper cn done (#23) Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * Update: zh/fine-tune-whisper.md * add mms_adapters and policy (#22) Update: zh/policy-ntia-rfc.md * Update: refine zh/mms_adapters.md Update: remove incompleted file * Update: zh/llm-leaderboard.md, zh/autoformer.md * add llm-leaderboard CN translation * add CN translation for autoformer * Update: proofreading zh/autoformer.md * BridgeTower blog post (#1118) * Update BridgeTower blog post (#1277) * LLM Eval: minor typos and nits (#1263) * Fix anchor link to custom pipeline section. 
(#485) * Update: zh/llm-leaderboard.md, zh/autoformer.md * add llm-leaderboard CN translation * add CN translation for autoformer Update: proofreading zh/autoformer.md Update: proofreading zh/llm-leaderboard.md * Update: proofreading zh/ethics-soc-4.md * Update "How to deploy LLM" blog post to use `huggingface_hub` in example (#1290) * Use InferenceClient from huggingface_hub * Update inference-endpoints-llm.md Co-authored-by: Pedro Cuenca <pedro@huggingface.co> --------- Co-authored-by: Pedro Cuenca <pedro@huggingface.co> * Update BridgeTower blog post (#1295) * Removed duplicate numbering (#1171) * Update: zh/evaluating-mmlu-leaderboard.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> Update: proofreading zh/evaluating-mmlu-leaderboard.md * Translate train-optimize-sd-intel.md to zh (#16) * Translate "stackllama" into Chinese * Create train-optimize-sd-intel.md Add new Update: zh/train-optimize-sd-intel.md * Update: zh/dedup.md & zh/stable-diffusion-finetuning-intel.md * dedup cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> * stable-diffusion-finetuning-intel cn done Signed-off-by: Yao, Matrix <matrix.yao@intel.com> --------- Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Update: proofread zh/stable-diffusion-finetuning-intel.md * Update: proofread zh/dedup.md * Update: zh/inference-endpoints-llm.md Co-authored-by: Zhongdong Yang <zhongdong_y@outlook.com> Update: proofread zh/inference-endpoints-llm.md * Update: zh/llama2.md Signed-off-by: Yao, Matrix <matrix.yao@intel.com> Proofread: zh/llama2.md * Update: zh/diffusers-turns-1.md Proofread: zh/diffusers-turns-1.md * Fix: zh/diffuser…
1 parent c1c255e commit 4f77130

3 files changed

+181
-1
lines changed

_blog.yml

Lines changed: 13 additions & 1 deletion
@@ -2902,4 +2902,16 @@
29022902
tags:
29032903
- guide
29042904
- community
2905-
- nlp
2905+
- nlp
2906+
2907+
- local: trl-ddpo
2908+
title: "Finetune Stable Diffusion Models with DDPO via TRL"
2909+
author: metric-space
2910+
guest: true
2911+
thumbnail: /blog/assets/166_trl_ddpo/thumbnail.png
2912+
date: September 29, 2023
2913+
tags:
2914+
- guide
2915+
- diffusers
2916+
- rl
2917+
- rlhf

assets/166_trl_ddpo/thumbnail.png

1.32 MB

trl-ddpo.md

Lines changed: 168 additions & 0 deletions
@@ -0,0 +1,168 @@
1+
---
2+
title: "Finetune Stable Diffusion Models with DDPO via TRL"
3+
thumbnail: /blog/assets/166_trl_ddpo/thumbnail.png
4+
authors:
5+
- user: metric-space
6+
guest: true
7+
- user: sayakpaul
8+
- user: kashif
9+
- user: leandro
10+
---
11+
12+
# Finetune Stable Diffusion Models with DDPO via TRL
13+
14+
<!-- {blog_metadata} -->
15+
<!-- {authors} -->
16+
17+
## Introduction
18+
19+
Diffusion models (e.g., DALL-E 2, Stable Diffusion) are a class of generative models that are widely successful at generating images, most notably photorealistic ones. However, the images generated by these models may not always match human preferences or intentions. Thus arises the alignment problem: how does one make sure that the outputs of a model are aligned with human preferences such as “quality”, or with an intent that is hard to express via prompts? This is where Reinforcement Learning comes into the picture.
20+
21+
In the world of Large Language Models (LLMs), Reinforcement Learning (RL) has proven to be a very effective tool for aligning models with human preferences. It’s one of the main recipes behind the superior performance of systems like ChatGPT. More precisely, RL is the critical ingredient of Reinforcement Learning from Human Feedback (RLHF), which makes ChatGPT chat like a human being.
22+
23+
In [Training Diffusion Models with Reinforcement Learning](https://arxiv.org/abs/2305.13301), Black et al. show how to augment diffusion models to leverage RL to fine-tune them with respect to an objective function via a method named Denoising Diffusion Policy Optimization (DDPO).
24+
25+
In this blog post, we discuss how DDPO came to be, briefly describe how it works, and show how DDPO can be incorporated into an RLHF workflow to achieve model outputs more aligned with human aesthetics. We then quickly switch gears to talk about how you can apply DDPO to your models with the newly integrated `DDPOTrainer` from the `trl` library and discuss our findings from running DDPO on Stable Diffusion.
26+
27+
## The Advantages of DDPO
28+
29+
DDPO is not the only working answer to the question of how to fine-tune diffusion models with RL.
30+
31+
Before diving in, there are two key points to remember when it comes to understanding the advantages of one RL solution over another:
32+
33+
1. Computational efficiency is key. The more complicated your data distribution gets, the higher your computational costs get.
34+
2. Approximations are nice, but because approximations are not the real thing, associated errors stack up.
35+
36+
Before DDPO, Reward-weighted regression (RWR) was an established way of using Reinforcement Learning to fine-tune diffusion models. RWR reuses the denoising loss function of the diffusion model along with training data sampled from the model itself and per-sample loss weighting that depends on the reward associated with the final samples. This algorithm ignores the intermediate denoising steps/samples. While this works, two things should be noted:
37+
38+
1. Optimizing by weighing the associated loss, which is a maximum likelihood objective, is an approximate optimization
39+
2. The associated loss is not an exact maximum likelihood objective but an approximation that is derived from a reweighed variational bound
40+
41+
The two orders of approximation have a significant impact on both performance and the ability to handle complex objectives.
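
To make the per-sample loss weighting of RWR concrete, here is a minimal sketch in plain Python. It is an illustration under assumptions, not a reference implementation: it assumes exponential weights `exp(reward / temperature)` normalized over the batch, the names `rwr_weights`, `rwr_loss`, and `temperature` are hypothetical, and the per-sample denoising losses are taken as given inputs.

```python
import math


def rwr_weights(rewards, temperature=1.0):
    """Turn final-sample rewards into normalized per-sample loss weights."""
    # Subtract the max reward before exponentiating for numerical stability.
    max_reward = max(rewards)
    exps = [math.exp((r - max_reward) / temperature) for r in rewards]
    total = sum(exps)
    return [e / total for e in exps]


def rwr_loss(denoising_losses, rewards, temperature=1.0):
    """Reward-weighted sum of per-sample denoising losses (hypothetical helper)."""
    weights = rwr_weights(rewards, temperature)
    return sum(w * loss for w, loss in zip(weights, denoising_losses))


# Higher-reward samples contribute more to the loss.
weights = rwr_weights([0.0, 1.0, 2.0])
```

Note that only the reward of the final sample enters the weights; nothing in this sketch looks at the intermediate denoising steps.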

DDPO uses RWR as a starting point. Rather than treating denoising as a single step by focusing only on the final sample, DDPO frames the whole denoising process as a multistep Markov Decision Process (MDP) in which the reward is received at the very end. This formulation, together with the use of a fixed sampler, allows the agent's policy to be an isotropic Gaussian rather than an arbitrarily complicated distribution. So instead of using the approximate likelihood of the final sample (which is the path RWR takes), DDPO uses the exact log-likelihood of each denoising step, which is extremely easy to compute ( \\( \ell(\mu, \sigma^2; x) = -\frac{n}{2} \log(2\pi) - \frac{n}{2} \log(\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^n (x_i - \mu)^2 \\) ).
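
As a minimal sketch, the Gaussian log-likelihood formula can be written in a few lines of plain Python. The function name `gaussian_log_likelihood` is ours, and `mu` and `sigma2` stand for the mean and variance in the formula. In an actual denoising step the mean would come from the model's prediction and vary per dimension; this sketch keeps the scalar mean of the formula.

```python
import math


def gaussian_log_likelihood(x, mu, sigma2):
    """Log-likelihood of the values in x under N(mu, sigma2)."""
    n = len(x)
    squared_error = sum((x_i - mu) ** 2 for x_i in x)
    return (
        -0.5 * n * math.log(2 * math.pi)
        - 0.5 * n * math.log(sigma2)
        - squared_error / (2 * sigma2)
    )


# Values at the mean maximize the likelihood; moving away from it lowers it.
at_mean = gaussian_log_likelihood([0.0, 0.0], mu=0.0, sigma2=1.0)
off_mean = gaussian_log_likelihood([1.0, -1.0], mu=0.0, sigma2=1.0)
```

Summing this quantity over the denoising steps of a trajectory gives the log-likelihood of the whole trajectory under the policy.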

If you’re interested in learning more details about DDPO, we encourage you to check out the [original paper](https://arxiv.org/abs/2305.13301) and the [accompanying blog post](https://bair.berkeley.edu/blog/2023/07/14/ddpo/).

## DDPO algorithm briefly

Given the MDP framework used to model the sequential nature of the denoising process, and the considerations that follow from it, the tool of choice for tackling the optimization problem is a policy gradient method, specifically Proximal Policy Optimization (PPO). The DDPO algorithm is largely the same as PPO; the portion that stands out as highly customized is the trajectory collection.

Here’s a diagram to summarize the flow:

![ddpo rl schematic](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/ddpo/dppo_rl.png)
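
The PPO-style update applied to a single denoising step can be sketched as follows. This is the standard clipped surrogate objective from PPO written in plain Python, not code taken from `trl`; the function name `clipped_surrogate_loss` and the default `clip_range` value are assumptions made for illustration. The log-probabilities would be Gaussian log-likelihoods of the denoising step under the current policy and under the policy that collected the trajectory.

```python
import math


def clipped_surrogate_loss(new_log_prob, old_log_prob, advantage, clip_range=0.2):
    """PPO clipped surrogate loss for one denoising step (to be minimized)."""
    # Probability ratio between the current policy and the one that sampled the step.
    ratio = math.exp(new_log_prob - old_log_prob)
    clipped_ratio = min(max(ratio, 1.0 - clip_range), 1.0 + clip_range)
    # PPO maximizes the smaller of the two terms, so the loss is its negative.
    return -min(ratio * advantage, clipped_ratio * advantage)


# With identical log-probabilities the ratio is 1 and the loss is -advantage.
unchanged = clipped_surrogate_loss(-1.0, -1.0, advantage=2.0)
# A large increase in probability is clipped at 1 + clip_range.
clipped = clipped_surrogate_loss(0.0, -1.0, advantage=2.0)
```

The clipping keeps a single update from moving the policy too far from the one that generated the samples, which matters here because trajectory collection (running the full denoising chain) is the expensive part.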

## DDPO and RLHF: a mix to enforce aestheticness

The general training aspect of [RLHF](https://huggingface.co/blog/rlhf) can roughly be broken down into the following steps:

1. Supervised fine-tuning, in which a “base” model learns the distribution of some new data.
2. Gathering preference data and training a reward model using it.
3. Fine-tuning the model with reinforcement learning using the reward model as a signal.

It should be noted that preference data is the primary source for capturing human feedback in the context of RLHF.

When we add DDPO to the mix, the workflow becomes the following:

1. Starting with a pretrained diffusion model.
2. Gathering preference data and training a reward model using it.
3. Fine-tuning the model with DDPO using the reward model as a signal.

Notice that the supervised fine-tuning step (step 1 of the general RLHF workflow) is missing from the latter list. This is because it has been shown empirically (as you will get to see yourself) that it is not needed.

To get a diffusion model to output images more in line with the human-perceived notion of what it means to be aesthetic, we follow these steps:

1. Starting with a pretrained Stable Diffusion (SD) model.
2. Training a frozen [CLIP](https://huggingface.co/openai/clip-vit-large-patch14) model with a trainable regression head on the [Aesthetic Visual Analysis](http://refbase.cvc.uab.es/files/MMP2012a.pdf) (AVA) dataset to predict how much people like an input image on average.
3. Fine-tuning the SD model with DDPO using the aesthetic predictor model as the reward signaller.

We keep these steps in mind as we move on to actually running them, which is described in the following sections.

## Training Stable Diffusion with DDPO

### Setup

To get started, on the hardware side, this implementation of DDPO requires access to at least an NVIDIA A100 GPU for successful training. Anything below this GPU type will soon run into out-of-memory issues.

Use pip to install the `trl` library:

```bash
pip install trl[diffusers]
```

This should get the main library installed. The following dependencies are for tracking and image logging. After installing `wandb`, be sure to log in so that the results are saved to a personal account:

```bash
pip install wandb torchvision
```

Note: you could choose to use `tensorboard` rather than `wandb`, in which case you’d want to install the `tensorboard` package via `pip`.

### A Walkthrough

The main classes within the `trl` library responsible for DDPO training are the `DDPOTrainer` and `DDPOConfig` classes. See the [docs](https://huggingface.co/docs/trl/ddpo_trainer#getting-started-with-examplesscriptsstablediffusiontuningpy) for more general info on `DDPOTrainer` and `DDPOConfig`. There is an [example training script](https://github.com/huggingface/trl/blob/main/examples/scripts/stable_diffusion_tuning.py) in the `trl` repo. It uses both of these classes in tandem with default implementations of the required inputs and default parameters to fine-tune a default pretrained Stable Diffusion model from `RunwayML`.

This example script uses `wandb` for logging and an aesthetic reward model whose weights are read from a public-facing Hugging Face repo (so gathering data and training the aesthetic reward model is already done for you). The default prompt dataset is a list of animal names.

Only one command-line flag is required to get things up and running. Additionally, the user is expected to have a [Hugging Face user access token](https://huggingface.co/docs/hub/security-tokens) that will be used to upload the model to the Hugging Face Hub after fine-tuning.

The following bash command gets things running:

```bash
python stable_diffusion_tuning.py --hf_user_access_token <token>
```

The following table contains key hyperparameters that are directly correlated with positive results:

| Parameter | Description | Recommended value for single GPU training (as of now) |
| --- | --- | --- |
| `num_epochs` | The number of epochs to train for | 200 |
| `train_batch_size` | The batch size to use for training | 3 |
| `sample_batch_size` | The batch size to use for sampling | 6 |
| `gradient_accumulation_steps` | The number of accelerator based gradient accumulation steps to use | 1 |
| `sample_num_steps` | The number of steps to sample for | 50 |
| `sample_num_batches_per_epoch` | The number of batches to sample per epoch | 4 |
| `per_prompt_stat_tracking` | Whether to track stats per prompt. If false, advantages will be calculated using the mean and std of the entire batch as opposed to tracking per prompt | `True` |
| `per_prompt_stat_tracking_buffer_size` | The size of the buffer to use for tracking stats per prompt | 32 |
| `mixed_precision` | Mixed precision training | `True` |
| `train_learning_rate` | Learning rate | 3e-4 |
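
To illustrate what `per_prompt_stat_tracking` changes, here is a minimal sketch of the two ways of turning rewards into advantages described in the table. It is not the `trl` implementation: the function names are hypothetical, the rolling buffer controlled by `per_prompt_stat_tracking_buffer_size` is left out, and a small `eps` is added to the standard deviation to avoid division by zero.

```python
from collections import defaultdict
from statistics import mean, pstdev


def batch_advantages(rewards, eps=1e-6):
    """Normalize rewards with the mean and std of the entire batch."""
    mu, sigma = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def per_prompt_advantages(prompts, rewards, eps=1e-6):
    """Normalize each reward with the mean and std of rewards for the same prompt."""
    grouped = defaultdict(list)
    for prompt, reward in zip(prompts, rewards):
        grouped[prompt].append(reward)
    stats = {p: (mean(rs), pstdev(rs)) for p, rs in grouped.items()}
    return [
        (r - stats[p][0]) / (stats[p][1] + eps)
        for p, r in zip(prompts, rewards)
    ]


prompts = ["cat", "cat", "dog", "dog"]
rewards = [1.0, 3.0, 10.0, 12.0]
global_adv = batch_advantages(rewards)
prompt_adv = per_prompt_advantages(prompts, rewards)
```

With per-prompt statistics, the better of the two `cat` samples gets a positive advantage even though its reward is below the batch mean, so prompts that are simply harder to score well on are not penalized as a whole.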

The provided script is merely a starting point. Feel free to adjust the hyperparameters or even overhaul the script to accommodate different objective functions. For instance, one could integrate a function that gauges JPEG compressibility or [one that evaluates visual-text alignment using a multi-modal model](https://github.com/kvablack/ddpo-pytorch/blob/main/ddpo_pytorch/rewards.py#L45), among other possibilities.
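
When adjusting the hyperparameters, it helps to see how the sampling and training sizes interact. The following is back-of-the-envelope arithmetic in plain Python using the recommended single-GPU values. The variable names mirror the config parameters, while the interpretation (every sampled image is used for training once per epoch, and every denoising step of every image contributes one log-likelihood term) is our assumption rather than a statement about `trl` internals.

```python
# Recommended single-GPU values for the DDPO example script.
sample_batch_size = 6
sample_num_batches_per_epoch = 4
train_batch_size = 3
gradient_accumulation_steps = 1
sample_num_steps = 50

# Images generated in one epoch of sampling.
images_per_epoch = sample_batch_size * sample_num_batches_per_epoch

# Assumption: each sampled image is used for training once per epoch.
effective_batch_size = train_batch_size * gradient_accumulation_steps
train_batches_per_epoch = images_per_epoch // effective_batch_size

# Denoising steps collected per epoch, one log-likelihood term each.
denoising_steps_per_epoch = images_per_epoch * sample_num_steps

print(images_per_epoch, train_batches_per_epoch, denoising_steps_per_epoch)
```

Under these assumptions, changing `sample_batch_size` or `sample_num_batches_per_epoch` changes how much data each epoch sees, while `train_batch_size` and `gradient_accumulation_steps` only change how that data is split into optimizer steps.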

## Lessons learned

1. The results seem to generalize over a wide variety of prompts despite the small set of training prompts. This has been thoroughly verified for the objective function that rewards aesthetics.
2. Attempts to explicitly generalize, at least for the aesthetic objective function, by increasing the number of training prompts and varying them seem to slow down convergence in exchange for barely noticeable gains in general behavior, if any.
3. While LoRA is recommended and has been tried and tested multiple times, non-LoRA training is worth considering: among other reasons, empirical evidence suggests that it produces relatively more intricate images than LoRA. However, getting the right hyperparameters for a stable non-LoRA run is significantly more challenging.
4. Recommendations for the config parameters for non-LoRA runs: set the learning rate relatively low (something around `1e-5` should do the trick) and set `mixed_precision` to `None`.

## Results

The following are pre-finetuned (left) and post-finetuned (right) outputs for the prompts `bear`, `heaven` and `dune` (each row is for the outputs of a single prompt):

| pre-finetuned | post-finetuned |
|:-------------------------:|:-------------------------:|
| ![nonfinetuned_bear.png](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/ddpo/nonfinetuned_bear.png) | ![finetuned_bear.png](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/ddpo/finetuned_bear.png) |
| ![nonfinetuned_heaven.png](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/ddpo/nonfinetuned_heaven.png) | ![finetuned_heaven.png](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/ddpo/finetuned_heaven.png) |
| ![nonfinetuned_dune.png](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/ddpo/nonfinetuned_dune.png) | ![finetuned_dune.png](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/ddpo/finetuned_dune.png) |

## Limitations

1. Right now, `trl`'s `DDPOTrainer` is limited to fine-tuning vanilla SD models.
2. In our experiments, we primarily focused on LoRA, which works very well. We did a few experiments with full training, which can lead to better quality, but finding the right hyperparameters is more challenging.

## Conclusion

Diffusion models like Stable Diffusion, when fine-tuned using DDPO, can offer significant improvements in the quality of generated images as perceived by humans, or as measured by any other metric once it is properly formulated as an objective function.

The computational efficiency of DDPO and its ability to optimize without relying on approximations, especially compared with earlier methods for fine-tuning diffusion models, make it a suitable candidate for fine-tuning diffusion models like Stable Diffusion.

The `trl` library's `DDPOTrainer` implements DDPO for fine-tuning SD models.

Our experimental findings underline the strength of DDPO in generalizing across a broad range of prompts, although attempts at explicit generalization through varying prompts had mixed results. The difficulty of finding the right hyperparameters for non-LoRA setups also emerged as an important lesson.

DDPO is a promising technique to align diffusion models with any reward function and we hope that with the release in TRL we can make it more accessible to the community!

## Acknowledgements

Thanks to Chunte Lee for the thumbnail of this blog post.
