[LoRA](https://hf.co/docs/peft/conceptual_guides/adapter#low-rank-adaptation-lora) is a lightweight adapter that is fast and easy to train, making it especially popular for generating images in a certain way or style. These adapters are commonly stored in a safetensors file and are widely popular on model sharing platforms like [civitai](https://civitai.com/).
[LoRAs](../tutorials/using_peft_for_inference) are lightweight checkpoints fine-tuned to generate images or video in a specific style. If you are using a checkpoint trained with a Diffusers training script, the LoRA configuration is automatically saved as metadata in a safetensors file. When the safetensors file is loaded, the metadata is parsed to correctly configure the LoRA and avoid missing or incorrect LoRA configurations.
LoRAs are loaded into a base model with the [`~loaders.StableDiffusionLoraLoaderMixin.load_lora_weights`] method.
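As a minimal sketch, loading a LoRA into a pipeline can look like the following (the base checkpoint, repository, and weight name below are placeholders, not values from this guide):

```py
import torch
from diffusers import AutoPipelineForText2Image

# Load a base model; this checkpoint is only an example.
pipeline = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Load LoRA weights from the Hub or from a local safetensors file.
# The repository and file name here are placeholders.
pipeline.load_lora_weights("path/to/lora-repo", weight_name="lora.safetensors")
```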
The easiest way to inspect the metadata, if available, is by clicking on the Safetensors logo next to the weights.
For LoRAs that aren't trained with Diffusers, you can still save metadata with the `transformer_lora_adapter_metadata` and `text_encoder_lora_adapter_metadata` arguments in [`~loaders.FluxLoraLoaderMixin.save_lora_weights`] as long as it is a safetensors file.
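As a rough sketch, saving LoRA layers together with adapter metadata could look like this. The state dict below is a stand-in for whatever your training code produces, and the shape of the metadata dict (rank and alpha values mirroring the training configuration) is an assumption for illustration; a `text_encoder_lora_adapter_metadata` argument can be passed analogously for text encoder LoRA layers.

```py
import torch
from diffusers import FluxPipeline

# Stand-in LoRA state dict; in practice this comes from your training setup,
# and the module names must match the Flux transformer layers you adapted.
transformer_lora_layers = {
    "single_transformer_blocks.0.attn.to_q.lora_A.weight": torch.zeros(16, 3072),
    "single_transformer_blocks.0.attn.to_q.lora_B.weight": torch.zeros(3072, 16),
}

FluxPipeline.save_lora_weights(
    save_directory="my-flux-lora",
    transformer_lora_layers=transformer_lora_layers,
    # Illustrative metadata; it is expected to mirror the LoRA configuration
    # (for example, rank and alpha) used during training.
    transformer_lora_adapter_metadata={"r": 16, "lora_alpha": 16},
)
```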
prompt ="bl3uprint, a highly detailed blueprint of the empire state building, explaining how to build all parts, many txt, blueprint grid backdrop"
94
-
negative_prompt ="lowres, cropped, worst quality, low quality, normal quality, artifacts, signature, watermark, username, blurry, more than one bridge, bad architecture"
## Training Kontext
[Kontext](https://bfl.ai/announcements/flux-1-kontext) lets us perform image editing as well as image generation. Even though it can accept both image and text as inputs, one can use it for text-to-image (T2I) generation, too. We provide a simple script for LoRA fine-tuning Kontext in [train_dreambooth_lora_flux_kontext.py](./train_dreambooth_lora_flux_kontext.py) for both T2I and I2I. The optimizations discussed above apply to this script, too.
Make sure to follow the [instructions to set up your environment](#running-locally-with-pytorch) before proceeding to the rest of the section.

> [!IMPORTANT]
> To make sure you can successfully run the latest version of the Kontext example script, we highly recommend installing from source, specifically from the commit mentioned below.
> To do this, execute the following steps in a new virtual environment:
Fine-tuning Kontext on the T2I task can be useful when working with specific styles/subjects where it may not perform as expected.
Image-guided fine-tuning (I2I) is also supported. To start, you must have a dataset containing triplets:

* Condition image
* Target image
* Instruction
[kontext-community/relighting](https://huggingface.co/datasets/kontext-community/relighting) is a good example of such a dataset. If you are using such a dataset, you can use the command below to launch training:
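The launch command itself isn't reproduced in this excerpt. The sketch below is an assumption that combines the column arguments mentioned in this section with typical DreamBooth LoRA flags; the model name, column values, and hyperparameters are illustrative, so check the script's `--help` output for the exact options.

```bash
# Illustrative only: model name, column names, and hyperparameters are assumptions.
accelerate launch train_dreambooth_lora_flux_kontext.py \
  --pretrained_model_name_or_path="black-forest-labs/FLUX.1-Kontext-dev" \
  --dataset_name="kontext-community/relighting" \
  --image_column="image" \
  --cond_image_column="cond_image" \
  --caption_column="caption" \
  --resolution=1024 \
  --train_batch_size=1 \
  --learning_rate=1e-4 \
  --max_train_steps=1000 \
  --mixed_precision="bf16" \
  --output_dir="kontext-relight-lora"
```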
More generally, when performing I2I fine-tuning, we expect you to:

* Have a dataset similar to `kontext-community/relighting`
* Supply `image_column`, `cond_image_column`, and `caption_column` values when launching training
### Misc notes
* By default, we use `mode` as the value of the `--vae_encode_mode` argument. This is because Kontext uses the `mode()` of the distribution predicted by the VAE instead of sampling from it (see the sketch after this list).
* To enable aspect ratio bucketing, pass the `--aspect_ratio_buckets` argument with a list of aspect ratio buckets.
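For reference, here is a minimal sketch of the difference between the two encode modes, assuming a standard `AutoencoderKL` (the checkpoint below is only an example; FLUX checkpoints are gated, so any `AutoencoderKL` works for illustration):

```py
import torch
from diffusers import AutoencoderKL

# Load a VAE; this checkpoint is an example and requires access to the gated FLUX repo.
vae = AutoencoderKL.from_pretrained("black-forest-labs/FLUX.1-dev", subfolder="vae")

image = torch.randn(1, 3, 512, 512)  # stand-in for a preprocessed image in [-1, 1]
posterior = vae.encode(image).latent_dist

latents_sampled = posterior.sample()  # stochastic draw (`--vae_encode_mode=sample`)
latents_mode = posterior.mode()       # deterministic mode/mean (`--vae_encode_mode=mode`)
```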
Since Flux Kontext fine-tuning is still in an experimental phase, we encourage you to explore different settings and share your insights! 🤗
## Other notes
Thanks to `bghira` and `ostris` for their help with reviewing & insight sharing ♥️