Intro
Black Forest Labs has released FLUX.1 [dev], a 12-billion-parameter rectified flow transformer that generates images from text descriptions. Since its release, it has been branded a "Midjourney killer." Here are the steps to try it out locally.
Disclaimer
This model is large (~24 GB of weights) and requires a capable GPU and plenty of system RAM to run locally. Performance will vary depending on your hardware.
To try it online instead, visit FLUX.1 [dev] - a Hugging Face Space by Black Forest Labs.
Hugging Face
Hugging Face makes it easier to try out open-source AI models from various creators. Once you create an account, you can find the black-forest-labs/FLUX.1-dev model.
In this tutorial, we will use the diffusers library.
Click on Diffusers to get a code snippet.
Code
Create a local Python file with the following contents:
import torch
from diffusers import FluxPipeline
pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # save some VRAM by offloading the model to CPU; remove this if you have enough GPU memory
prompt = "A cat holding a sign that says hello world"
image = pipe(
    prompt,
    height=1024,
    width=1024,
    guidance_scale=3.5,
    num_inference_steps=50,
    max_sequence_length=512,
    generator=torch.Generator("cpu").manual_seed(0)
).images[0]
image.save("flux-dev.png")
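A note on the generator argument: fixing the seed makes generation reproducible, because the same generator state produces the same starting noise for the same prompt and settings. This small standalone sketch (independent of FLUX) illustrates the idea:

```python
import torch

# Two generators seeded identically produce identical random tensors,
# which is why a fixed seed reproduces the same image.
g1 = torch.Generator("cpu").manual_seed(0)
g2 = torch.Generator("cpu").manual_seed(0)

a = torch.randn(4, generator=g1)
b = torch.randn(4, generator=g2)
print(torch.equal(a, b))  # True

# A different seed gives different noise (and hence a different image).
g3 = torch.Generator("cpu").manual_seed(1)
c = torch.randn(4, generator=g3)
print(torch.equal(a, c))  # False
```

Change the seed (or drop the generator entirely) to get a fresh image on each run.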
Make sure you have the required dependencies installed. If you haven't already, log in to Hugging Face on the CLI via the huggingface-cli login command; when prompted for a token, generate a read token at https://huggingface.co/settings/tokens. Note that FLUX.1 [dev] is a gated model, so you also need to accept its license on the model page before downloading.
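A typical setup for the snippet above might look like the following. The exact package list is an assumption based on what the diffusers FLUX pipeline commonly needs (sentencepiece and protobuf for the T5 text encoder; accelerate for the CPU-offload helper), so adjust it to your environment:

```shell
# Install the libraries the Python snippet imports, plus common extras.
pip install torch diffusers transformers accelerate sentencepiece protobuf

# Authenticate so the gated model weights can be downloaded.
huggingface-cli login
```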
Once you run the file, the model will download, and you can experiment with different prompts and tweak the settings for your desired output.
Happy hacking!