21.08.2024 22:18

Training a Flux Character LoRA on Civitai

So, you want to create a LoRA for generating images of a specific person: your girlfriend, a friend, or yourself. All you need is a dataset with captions, a Civitai account, and at least 2000 buzz (equal to $2). You can even earn buzz for free, about 200 buzz per day, by doing Civitai dailies such as liking and sharing posts, subscribing, and more.
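To get a feel for how long the free route takes, here is a small arithmetic sketch. It uses only the figures quoted above (2000 buzz needed, roughly 200 buzz per day); the function name is my own, and the actual daily amount may vary.

```python
import math


def days_to_earn(target_buzz: int, buzz_per_day: int) -> int:
    """Whole days needed to collect target_buzz at a steady daily rate."""
    return math.ceil(target_buzz / buzz_per_day)


# Figures from the article: 2000 buzz for training, about 200 buzz per day.
print(days_to_earn(2000, 200))  # 10
```

So at roughly 200 buzz per day, the free route takes about ten days of dailies.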

Dataset

It’s best to have 20-30 images in your dataset: mostly close-up photos, some mid-shots, and one full-body shot. It’s also a good idea to include side shots, not only front-facing photos.

The size of the images should be 512x512 for good results and faster training.
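If your photos are not square, one way to get them to 512x512 is to crop a centered square and then resize. The sketch below is only an illustration: the function names are hypothetical, it assumes the Pillow library is installed, and a centered crop can cut off a face that sits near the edge of the frame, so check the results by eye.

```python
def center_square_box(width: int, height: int) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def resize_to_square(src_path: str, dst_path: str, size: int = 512) -> None:
    """Crop a centered square and resize it to size x size (needs Pillow)."""
    # Imported here so center_square_box stays usable without Pillow.
    from PIL import Image

    with Image.open(src_path) as img:
        box = center_square_box(*img.size)
        square = img.convert("RGB").crop(box)
        square.resize((size, size), Image.LANCZOS).save(dst_path)
```

For example, a 1024x768 photo gets the crop box (128, 0, 896, 768) before being scaled down to 512x512.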

Here’s my dataset:

Now, you need to have captions for each of your images. Captions should work like tags, with the first word being a unique tag that triggers the model to recognize your person.

You can describe all the photos manually or use tools:

Civitai auto tag (it gives me errors, though).
Caption helper (requires API tokens for Groq and OpenAI GPT-4, which I don’t have).
Generating captions for each image directly in ChatGPT. For some people (like me), this is simpler than getting an API key.

You can use my prompt for generating good captions:

Generate a description of an image with the following requirements:

Hair: Describe the color and length of the hair.
Face Orientation: The face must always be in a front-facing view.
Outfit and Setting: Provide detailed descriptions of the outfit, pose, and any objects being interacted with.
Environment: Describe the background and surroundings, including lighting, atmosphere, and any relevant objects or scenery.
Overall Scene: Capture the mood or atmosphere of the scene, whether it's peaceful, intense, magical, etc.
Format: The description should be in the form of a list of tags, separated by commas. The first tag must always be #TRIGGERWORD#.

You can tweak it for your needs.
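If you reuse the prompt for several characters, swapping in the trigger word can be scripted. This is a minimal sketch: `fill_trigger` and the trigger `mychar123` are made-up names for illustration, and the only thing taken from the prompt above is the `#TRIGGERWORD#` placeholder.

```python
def fill_trigger(prompt: str, trigger: str) -> str:
    """Replace the #TRIGGERWORD# placeholder with a concrete trigger word."""
    trigger = trigger.strip()
    # A comma would split the trigger into two tags in a comma-separated caption.
    if not trigger or "," in trigger:
        raise ValueError("trigger must be a non-empty word without commas")
    return prompt.replace("#TRIGGERWORD#", trigger)


last_line = "The first tag must always be #TRIGGERWORD#."
print(fill_trigger(last_line, "mychar123"))
# The first tag must always be mychar123.
```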

Base Rules for Captioning Photos for LoRA:

The first tag is the unique trigger word.
All other tags should describe everything you don’t want tied to the LoRA. For example, if you want to be able to change hair color and length in generated images, describe them in every photo. Anything that is the same in every photo and is not described will be linked to the trigger word.
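The first rule is easy to break when captions come from different tools, so a small helper that always puts the trigger word first can help. This is a sketch under my own assumptions (the function name is hypothetical, tags are joined with a comma and a space, and duplicates are dropped case-insensitively); it does not decide which tags to include, which is still up to you.

```python
def build_caption(trigger: str, tags: list[str]) -> str:
    """Join tags into one caption with the trigger word as the first tag."""
    seen = {trigger.lower()}
    ordered = [trigger]
    for tag in tags:
        cleaned = tag.strip()
        key = cleaned.lower()
        # Skip empty tags, repeated tags, and stray copies of the trigger.
        if cleaned and key not in seen:
            seen.add(key)
            ordered.append(cleaned)
    return ", ".join(ordered)


print(build_caption("triggerword", ["blonde hair", " long hair", "triggerword", "standing"]))
# triggerword, blonde hair, long hair, standing
```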

Here are examples for some images:

0 - triggerword, blonde hair, long hair, white cropped tank top, standing, medium shot, modern interior, light-colored wall, black shelving unit, flat-screen television, minimalist aesthetic;
1 - triggerword, blonde hair, long hair, black cropped long-sleeve top, white pants, standing, outdoor, scenic view, cityscape background, river, greenery, sunny day.

If you don’t use Civitai’s auto tag, it’s better to save images and captions in the same folder with matching names (txt files for captions).
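Before uploading, it is worth checking that every image really has a caption file with the same name. The sketch below lists images without a matching .txt file; the function name and the set of image extensions are my assumptions, so adjust them to whatever your dataset contains.

```python
from pathlib import Path

# Assumed set of image extensions; extend it if your dataset uses others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def images_missing_captions(folder: str) -> list[str]:
    """Return names of images that have no same-named .txt caption file."""
    missing = []
    for path in sorted(Path(folder).iterdir()):
        is_image = path.suffix.lower() in IMAGE_SUFFIXES
        if is_image and not path.with_suffix(".txt").is_file():
            missing.append(path.name)
    return missing
```

An empty list means every image such as 0.png has its 0.txt next to it.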

[Screenshot: folder with images and matching caption files]

To upload the folder to the dataset, just archive it into a zip file.
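Any archiver will do, but the zip can also be built with Python's standard zipfile module. This is a sketch with a hypothetical function name. It places the files at the top level of the archive, which is my assumption about what the uploader expects rather than something the article confirms.

```python
import zipfile
from pathlib import Path


def zip_dataset(folder: str, zip_path: str) -> list[str]:
    """Zip every file in folder (non-recursive) and return the stored names."""
    target = Path(zip_path).resolve()
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(Path(folder).iterdir()):
            # Skip subfolders, and the archive itself if it lives in the folder.
            if path.is_file() and path.resolve() != target:
                archive.write(path, arcname=path.name)
                names.append(path.name)
    return names
```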

Settings

The config that works for me is shown in the screenshots. Good results usually start from 2 epochs, but I went with 18-20.

You also need to set sample prompts to see progress per epoch. I set 2 prompts from the dataset and left the third one empty, and it worked fine. Don’t experiment too much with crazy prompts: they might not include the person, which makes it harder to pick the best epoch.

[Screenshots: train settings 1 and 2]

After about 3 hours, you’ll have all 20 epochs trained and can check each one.

Pick the best one and press "Next." Here, you can save some settings for the model, but most of them can be changed later.

The main ones to set if you want to generate images on-site: allow access to the model on Civitai, and uncheck the label saying the model depicts a real person (such as a celebrity). Don’t break the terms of service, though, and don’t train celebrity LoRAs with this option. Without these two settings, on-site generation won’t work.

[Screenshot: model settings]

Now, just wait a few minutes for the model to be verified, and you’ll be able to generate images directly on Civitai like any other LoRA or model.

Here are examples of generations with my LoRA:

triggerword, long dark hair, devilish appearance, red horns, black leather outfit, red cape, dark red glowing eyes, sharp claws, standing pose, fiery background, flames, dark shadows, sinister expression, hellish atmosphere, intense glow, malevolent setting

triggerword, blonde hair, Pokemon trainer outfit, red and white cap, short-sleeved jacket, fingerless gloves, standing pose, holding a Pokeball, Pikachu beside her, outdoor setting, grassy field, blue sky, bright day, adventure atmosphere, energetic expression

triggerword, long blonde wavy hair, Hogwarts uniform, Ravenclaw tie, dark gray pleated skirt, white button-up shirt, dark gray cardigan, ghostly appearance, dimly lit bathroom, cracked tiles, broken stall door, dusty mirror, dripping faucets, gloomy atmosphere, blue tint, haunted, melancholic setting
