
Commit 01240fe

[training] add Kontext i2i training (#11858)
* feat: enable i2i fine-tuning in Kontext script.
* readme
* more checks.
* Apply suggestions from code review (Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>)
* fixes
* fix
* add proj_mlp to the mix
* Update README_flux.md: add note on installing from commit `05e7a854d0a5661f5b433f6dd5954c224b104f0b`
* fix
* fix

---------

Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
1 parent ce338d4 commit 01240fe

File tree

2 files changed: +229 −44 lines changed


examples/dreambooth/README_flux.md

Lines changed: 49 additions & 3 deletions
@@ -263,9 +263,19 @@ This reduces memory requirements significantly w/o a significant quality loss. N
 ## Training Kontext
 
 [Kontext](https://bfl.ai/announcements/flux-1-kontext) lets us perform image editing as well as image generation. Even though it can accept both image and text as inputs, one can use it for text-to-image (T2I) generation, too. We
-provide a simple script for LoRA fine-tuning Kontext in [train_dreambooth_lora_flux_kontext.py](./train_dreambooth_lora_flux_kontext.py) for T2I. The optimizations discussed above apply this script, too.
+provide a simple script for LoRA fine-tuning Kontext in [train_dreambooth_lora_flux_kontext.py](./train_dreambooth_lora_flux_kontext.py) for both T2I and I2I. The optimizations discussed above apply to this script, too.
 
-Make sure to follow the [instructions to set up your environment](#running-locally-with-pytorch) before proceeding to the rest of the section.
+**Important**
+
+> [!NOTE]
+> To make sure you can successfully run the latest version of the Kontext example script, we highly recommend installing from source, specifically from the commit mentioned below.
+> To do this, execute the following steps in a new virtual environment:
+> ```
+> git clone https://github.com/huggingface/diffusers
+> cd diffusers
+> git checkout 05e7a854d0a5661f5b433f6dd5954c224b104f0b
+> pip install -e .
+> ```
 
 Below is an example training command:
@@ -294,6 +304,42 @@ accelerate launch train_dreambooth_lora_flux_kontext.py \
 Fine-tuning Kontext on the T2I task can be useful when working with specific styles/subjects where it may not
 perform as expected.
 
+Image-guided fine-tuning (I2I) is also supported. To start, you must have a dataset containing triplets:
+
+* Condition image
+* Target image
+* Instruction
+
+[kontext-community/relighting](https://huggingface.co/datasets/kontext-community/relighting) is a good example of such a dataset. If you are using such a dataset, you can use the command below to launch training:
+
+```bash
+accelerate launch train_dreambooth_lora_flux_kontext.py \
+  --pretrained_model_name_or_path=black-forest-labs/FLUX.1-Kontext-dev \
+  --output_dir="kontext-i2i" \
+  --dataset_name="kontext-community/relighting" \
+  --image_column="output" --cond_image_column="file_name" --caption_column="instruction" \
+  --mixed_precision="bf16" \
+  --resolution=1024 \
+  --train_batch_size=1 \
+  --guidance_scale=1 \
+  --gradient_accumulation_steps=4 \
+  --gradient_checkpointing \
+  --optimizer="adamw" \
+  --use_8bit_adam \
+  --cache_latents \
+  --learning_rate=1e-4 \
+  --lr_scheduler="constant" \
+  --lr_warmup_steps=200 \
+  --max_train_steps=1000 \
+  --rank=16 \
+  --seed="0"
+```
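Once training finishes, a quick way to sanity-check the learned LoRA is to load it into the Kontext pipeline and run an edit on one of your condition images. The snippet below is only a minimal sketch, not part of the example script: it assumes the `kontext-i2i` output directory from the command above, and the image path, prompt, and generation settings are placeholders to adapt to your setup.

```python
import torch
from diffusers import FluxKontextPipeline
from diffusers.utils import load_image

# Load the base Kontext pipeline and attach the freshly trained LoRA.
pipe = FluxKontextPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Kontext-dev", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights("kontext-i2i")  # the --output_dir used above

# Placeholder path: use one of your own condition images.
condition = load_image("condition.png")

# Use an instruction in the same style as the training captions.
result = pipe(
    image=condition,
    prompt="relight the scene with warm golden-hour sunlight",
    guidance_scale=2.5,
    num_inference_steps=28,
).images[0]
result.save("edited.png")
```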
+
+More generally, when performing I2I fine-tuning, we expect you to:
+
+* Have a dataset similar to `kontext-community/relighting`, i.e., one that provides a condition image, a target image, and an editing instruction for every sample
+* Supply the `image_column`, `cond_image_column`, and `caption_column` values when launching training (a quick way to check these columns is sketched below)
+
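For reference, here is a minimal sketch of such a column check. The dataset name and column names simply mirror the example command above, and the `train` split is an assumption; substitute your own values.

```python
from datasets import load_dataset

# Column names mirror the example command above; change them for your own dataset.
image_column = "output"          # target (edited) image
cond_image_column = "file_name"  # condition (input) image
caption_column = "instruction"   # editing instruction

# The split name is an assumption; adjust it if your dataset uses a different one.
ds = load_dataset("kontext-community/relighting", split="train")

missing = [c for c in (image_column, cond_image_column, caption_column) if c not in ds.column_names]
if missing:
    raise ValueError(f"Dataset is missing expected columns: {missing}")

# Peek at one triplet to confirm how the images and instruction are stored.
sample = ds[0]
print(type(sample[cond_image_column]), type(sample[image_column]))
print(sample[caption_column])
```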
 ### Misc notes
 
 * By default, we use `mode` as the value of the `--vae_encode_mode` argument. This is because Kontext uses the `mode()` of the distribution predicted by the VAE instead of sampling from it.
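For intuition, here is a standalone sketch of the difference between the two encode modes, using the VAE bundled with the Kontext checkpoint; it is illustrative only, and the exact preprocessing in the training script may differ.

```python
import torch
from diffusers import AutoencoderKL

# Load the VAE that ships with the Kontext checkpoint.
vae = AutoencoderKL.from_pretrained(
    "black-forest-labs/FLUX.1-Kontext-dev", subfolder="vae", torch_dtype=torch.bfloat16
).to("cuda")

# Stand-in for a preprocessed image batch scaled to [-1, 1].
pixel_values = torch.randn(1, 3, 1024, 1024, dtype=torch.bfloat16, device="cuda")

with torch.no_grad():
    posterior = vae.encode(pixel_values).latent_dist
    latents_mode = posterior.mode()      # deterministic; what `--vae_encode_mode` "mode" corresponds to
    latents_sample = posterior.sample()  # stochastic sampling from the same distribution

# Flux-style training additionally shifts/scales the latents using the VAE config.
latents = (latents_mode - vae.config.shift_factor) * vae.config.scaling_factor
```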
@@ -307,4 +353,4 @@ To enable aspect ratio bucketing, pass `--aspect_ratio_buckets` argument with a
 Since Flux Kontext fine-tuning is still in an experimental phase, we encourage you to explore different settings and share your insights! 🤗
 
 ## Other notes
-Thanks to `bghira` and `ostris` for their help with reviewing & insight sharing ♥️
+Thanks to `bghira` and `ostris` for their help with reviewing & insight sharing ♥️
