P-Image LoRA: Training and Inference

This notebook is the full walkthrough for P-Image LoRA Training and Inference: we train a LoRA adapter for text-to-image (custom style or concept) with p-image-trainer and run inference with p-image-lora. See the P-Image and LoRA documentation for full parameter lists.

Two workflows — don’t mix them:

| Workflow | Use case | Training | Inference |
| --- | --- | --- | --- |
| P-Image LoRA | Text-to-image (prompt → image) | p-image-trainer | p-image-lora |
| P-Image-Edit LoRA | Image editing (input image → edited image) | p-image-edit-trainer | p-image-edit-lora |

This notebook covers P-Image LoRA only.

In this notebook we will:

  1. Generate comic noir stylized images (and captions) using Flux-2-Klein 9B

  2. Prepare the dataset in the correct format for p-image-trainer

  3. Upload the dataset to HuggingFace

  4. Train a LoRA adapter with p-image-trainer

  5. Download and extract the trained LoRA weights

  6. Upload the LoRA weights to HuggingFace

  7. Test inference with p-image-lora

Setup

Install required packages and set up authentication.

[ ]:
%pip install replicate huggingface-hub pillow tqdm datasets
[ ]:
import os
import random
import zipfile

import requests
from datasets import load_dataset
from huggingface_hub import HfApi, upload_file
from IPython.display import Image, display
from replicate.client import Client
from tqdm import tqdm

replicate_token = os.environ.get("REPLICATE_API_TOKEN")
if not replicate_token:
    replicate_token = input("Replicate API token (r8_...): ").strip()

hf_token = os.environ.get("HF_TOKEN")
if not hf_token:
    hf_token = input("HuggingFace API token (hf_...): ").strip()

replicate = Client(api_token=replicate_token)
hf_api = HfApi(token=hf_token)

Step 1: Generate training images (comic noir style)

We generate comic noir stylized images using Flux-2-Klein 9B only. Each prompt is suffixed with a comic noir style so that all images share that theme (high contrast, shadows, vintage pulp aesthetic). For P-Image LoRA we need one image per example plus a caption file; we use a trigger word sks_comic_noir in every caption so we can prompt the trained model with that token at inference. The resulting dataset is used to train with p-image-trainer and generate with p-image-lora.

[ ]:
COMIC_NOIR_STYLE = (
    "comic noir style, high contrast black and white with bold shadows, "
    "halftone dots, vintage pulp magazine aesthetic, dramatic lighting"
)

n_samples = 200
dataset_stream = load_dataset(
    "data-is-better-together/open-image-preferences-v1",
    split="cleaned",
    streaming=True,
)

streamed_prompts = []
for item in dataset_stream:
    if item.get("simplified_prompt"):
        streamed_prompts.append(item["simplified_prompt"])
    if len(streamed_prompts) >= n_samples:
        break

prompts = streamed_prompts

print(
    f"Loaded {len(prompts)} prompts (to be generated in comic noir style with Klein 9B)"
)

Now, let’s generate the images. Because we are fine-tuning a style LoRA, each example pairs a styled prompt (the original prompt plus the comic noir suffix) for generation with a caption that combines the trigger word and the plain prompt.

[ ]:
def _fetch_image_bytes(output):
    """
    Given output from replicate.run, fetches the actual image bytes.
    - If output is a file-like object, calls .read()
    - If output is a list of URLs, downloads the first URL
    - If output is a single URL string, downloads it directly
    """
    if hasattr(output, "read"):
        return output.read()
    if isinstance(output, list):
        # Replicate typically returns a list of URLs (even length-1).
        url = output[0]
    elif isinstance(output, str):
        url = output
    else:
        raise ValueError(f"Unexpected output type: {type(output)}")
    resp = requests.get(url)
    resp.raise_for_status()
    return resp.content


FLUX_KLEIN_9B = "black-forest-labs/flux-2-klein-9b"
TRIGGER_WORD = "sks_comic_noir"


def generate_image_9b(prompt: str, seed: int) -> bytes:
    """Generate a single image with Flux-2-Klein 9B."""
    output = replicate.run(FLUX_KLEIN_9B, input={"prompt": prompt, "seed": seed})
    return _fetch_image_bytes(output)


image_captions = []
for i, prompt in enumerate(tqdm(prompts, desc="Generating comic noir images")):
    styled_prompt = f"{prompt}, {COMIC_NOIR_STYLE}"
    caption = f"{TRIGGER_WORD}, {prompt}"
    try:
        img_bytes = generate_image_9b(styled_prompt, i)
        image_captions.append((img_bytes, caption))
    except Exception as e:
        print(f"Error generating image {i}: {e}")
        continue

print(f"Generated {len(image_captions)} comic noir images with captions")
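Generation calls can fail transiently (rate limits, timeouts), and the loop above simply skips failed images. If you want to retry before giving up, a small wrapper like the sketch below can help; the helper name and delays are illustrative, not part of the Replicate API:

```python
import time


def with_retries(fn, attempts=3, delay_s=2.0):
    """Call fn(); on exception, retry up to `attempts` times with a fixed delay.

    Re-raises the last exception if every attempt fails.
    """
    last_exc = None
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as exc:
            last_exc = exc
            if attempt < attempts - 1:
                time.sleep(delay_s)
    raise last_exc


# Demo: a flaky function that fails twice, then succeeds on the third call.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient error")
    return "ok"


print(with_retries(flaky, attempts=3, delay_s=0.0))  # → ok
```

In the generation loop you would then call `with_retries(lambda: generate_image_9b(styled_prompt, i))` in place of the bare call.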

Sample comic noir images (Klein 9B) to verify the generation:

[ ]:
import base64
from IPython.display import HTML


def b64_img(img_bytes):
    return f"<img src='data:image/png;base64,{base64.b64encode(img_bytes).decode()}' style='max-width: 280px; border:1px solid #ccc;'/>"


n_show = min(5, len(image_captions))
print(f"Displaying {n_show} sample comic noir images:")
html = "<table><tr><th style='text-align:center;'>#</th><th>Image</th><th>Caption</th></tr>"
for i in range(n_show):
    img_bytes, caption = image_captions[i]
    cap_short = (caption[:60] + "...") if len(caption) > 60 else caption
    html += f"<tr><td style='text-align:center; font-weight:bold;'>{i+1}</td><td>{b64_img(img_bytes)}</td><td style='max-width:200px;'>{cap_short}</td></tr>"
html += "</table>"
display(HTML(html))

These comic noir images (with captions using the trigger word sks_comic_noir) are used to train a P-Image LoRA with p-image-trainer; inference uses p-image-lora.

Step 2: Prepare Dataset (P-Image LoRA)

Format the data for p-image-trainer: one image per example plus one caption file with the same base name (e.g. image_001.png, image_001.txt). Each caption includes the trigger word sks_comic_noir so we can prompt the trained LoRA with that token at inference.

[10]:
def create_dataset_zip(
    image_captions: list[tuple[bytes, str]], output_path: str = "dataset.zip"
) -> str:
    """Create a ZIP for p-image-trainer: one image + one .txt caption per example."""
    with zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as zipf:
        for i, (img_bytes, caption) in enumerate(image_captions):
            base_name = f"image_{i:03d}"
            zipf.writestr(f"{base_name}.png", img_bytes)
            zipf.writestr(f"{base_name}.txt", caption.encode("utf-8"))
    return output_path


dataset_zip_path = create_dataset_zip(image_captions)
print(f"Created dataset ZIP: {dataset_zip_path}")
print(f"Dataset size: {os.path.getsize(dataset_zip_path) / 1024 / 1024:.2f} MB")
Created dataset ZIP: dataset.zip
Dataset size: 2.54 MB
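Before uploading, it is worth verifying that every image in the ZIP has a matching caption file and vice versa, since an unpaired file would break the one-image-plus-one-caption convention the trainer expects. A minimal validator for the naming scheme used above (illustrative helper, not part of p-image-trainer):

```python
import zipfile


def validate_dataset_zip(zip_path: str) -> int:
    """Check that every .png has a same-named .txt caption and vice versa.

    Returns the number of (image, caption) pairs; raises ValueError on a mismatch.
    """
    with zipfile.ZipFile(zip_path, "r") as zipf:
        names = set(zipf.namelist())
    images = {n[:-4] for n in names if n.endswith(".png")}
    captions = {n[:-4] for n in names if n.endswith(".txt")}
    if images != captions:
        raise ValueError(
            f"Unpaired files: no caption for {sorted(images - captions)}, "
            f"no image for {sorted(captions - images)}"
        )
    return len(images)
```

Running `validate_dataset_zip(dataset_zip_path)` should return the number of examples you generated.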

Step 3: Upload Dataset to HuggingFace

Upload the dataset ZIP to HuggingFace. You’ll need to create a repository first (e.g., your-username/lora-dataset).

[ ]:
hf_dataset_repo = "davidberenstein1957/comic_noir"

upload_file(
    path_or_fileobj=dataset_zip_path,
    path_in_repo="input.zip",
    repo_id=hf_dataset_repo,
    repo_type="dataset",
    token=hf_token,
)

hf_dataset_url = (
    f"https://huggingface.co/datasets/{hf_dataset_repo}/resolve/main/input.zip"
)
print(f"Dataset uploaded to: {hf_dataset_url}")

You can find the example dataset on Hugging Face.

Step 4: P-Image LoRA Training

Start P-Image LoRA training with p-image-trainer using the comic noir dataset. We use steps=1000, learning_rate=0.0001, and training_type="style". See the P-Image and LoRA documentation for full parameters.

[ ]:
training_input = {
    "image_data": hf_dataset_url,
    "steps": 1000,
    "learning_rate": 0.0001,
    "training_type": "style",
}

print("Starting P-Image LoRA training...")
prediction = replicate.predictions.create(
    model="prunaai/p-image-trainer",
    input=training_input,
)

print(f"Training started. Prediction ID: {prediction.id}")
print(f"Monitor at: https://replicate.com/p/{prediction.id}")

print(
    "Waiting for training to complete (P-Image trainer typically takes 10-20 min for ~100 images; expect longer for this 200-image dataset)..."
)
prediction.wait()

if prediction.status != "succeeded":
    raise Exception(
        f"Training {prediction.status}: {getattr(prediction, 'error', 'Unknown error')}"
    )

print("Training completed!")

output = prediction.output
if hasattr(output, "url"):
    lora_output_url = output.url
elif isinstance(output, str):
    lora_output_url = output
else:
    lora_output_url = str(output)

print(f"LoRA output URL: {lora_output_url}")
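`prediction.wait()` blocks until the prediction finishes. If you prefer a bounded wait, you can poll for a terminal status with a timeout. The sketch below is generic: it takes any status-returning callable (e.g. `lambda: replicate.predictions.get(prediction.id).status`), and the timeout values are illustrative:

```python
import time

TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def wait_with_timeout(get_status, timeout_s=1800, poll_s=10):
    """Poll get_status() until it returns a terminal state or timeout_s elapses.

    Returns the final status string; raises TimeoutError if still running.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_s)
    raise TimeoutError(f"prediction still running after {timeout_s}s")
```

A bounded wait is useful in automated pipelines, where an indefinitely hung training job should surface as an error rather than stall the run.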

Step 5: Download and Extract LoRA Weights

Download the training output ZIP, extract the LoRA weights file, and upload it to HuggingFace.

[ ]:
response = requests.get(lora_output_url)
response.raise_for_status()

lora_zip_path = "lora_output.zip"
with open(lora_zip_path, "wb") as f:
    f.write(response.content)

print(f"Downloaded LoRA output ZIP: {lora_zip_path}")

with zipfile.ZipFile(lora_zip_path, "r") as zipf:
    file_list = zipf.namelist()
    print(f"Files in ZIP: {file_list}")

    lora_file = next((f for f in file_list if f.endswith(".safetensors")), None)
    if not lora_file:
        raise ValueError("No .safetensors file found in ZIP")

    lora_weights_bytes = zipf.read(lora_file)
    print(f"Extracted LoRA file: {lora_file}")

lora_weights_path = "weights.safetensors"
with open(lora_weights_path, "wb") as f:
    f.write(lora_weights_bytes)

print(f"LoRA weights saved to: {lora_weights_path}")
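Before uploading, recording a checksum of the weights file lets you later confirm that the copy on HuggingFace matches what training produced. A stdlib sketch (the helper name is illustrative):

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()
```

Run `print(sha256_of(lora_weights_path))` and compare against the checksum HuggingFace shows for the uploaded file.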

Step 6: Upload LoRA Weights to HuggingFace

Upload the extracted LoRA weights file to HuggingFace.

[ ]:
hf_lora_repo = "davidberenstein1957/comic_noir"

upload_file(
    path_or_fileobj=lora_weights_path,
    path_in_repo="weights.safetensors",
    repo_id=hf_lora_repo,
    repo_type="model",
    token=hf_token,
)

hf_lora_url = f"https://huggingface.co/{hf_lora_repo}/resolve/main/weights.safetensors"

print(f"LoRA weights uploaded to: {hf_lora_url}")

With the weights in your HuggingFace model repo, hf_lora_url is the URL that Step 7 passes to p-image-lora for inference.

Step 7: P-Image LoRA Inference

Use p-image-lora to generate an image with the trained comic noir LoRA. Prompt with the trigger word sks_comic_noir plus your scene description.

[ ]:
inference_prompt = f"{TRIGGER_WORD}, a detective in a rainy alley at night"
output = replicate.run(
    "prunaai/p-image-lora",
    input={
        "prompt": inference_prompt,
        "lora_weights": hf_lora_url,
        "lora_scale": 1.0,
        "seed": 42,
        "hf_api_token": hf_token,
    },
)
generated_bytes = (
    output.read() if hasattr(output, "read") else _fetch_image_bytes(output)
)
print("Generated image with P-Image LoRA (comic noir):")
display(Image(data=generated_bytes))
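To see how strongly the LoRA shapes the output, it helps to sweep lora_scale over a few values with a fixed seed. The helper below only builds the input dicts (the parameter names match the call above; the scale values are illustrative), which you would then pass to replicate.run one at a time:

```python
def sweep_inputs(prompt, lora_weights_url, hf_token, scales=(0.6, 0.8, 1.0), seed=42):
    """Build one p-image-lora input dict per lora_scale value.

    The seed is held constant so differences between images come from the scale alone.
    """
    return [
        {
            "prompt": prompt,
            "lora_weights": lora_weights_url,
            "lora_scale": scale,
            "seed": seed,
            "hf_api_token": hf_token,
        }
        for scale in scales
    ]
```

Usage: `for inp in sweep_inputs(inference_prompt, hf_lora_url, hf_token): output = replicate.run("prunaai/p-image-lora", input=inp)`.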

Example: Stylized generation with Flux Klein 9B (comic noir)

This section uses Flux Klein 9B for stylized text-to-image with no LoRA: we steer the model with a style-specific prompt (comic noir — high contrast, shadows, vintage comic aesthetic). To get a reusable comic noir style you can trigger with a single token, train a P-Image LoRA (text-to-image) with p-image-trainer on a dataset of comic noir images; see the LoRA documentation.

[ ]:
FLUX_KLEIN_9B = "black-forest-labs/flux-2-klein-9b"

comic_noir_prompt = (
    "A detective in a trench coat standing in a rainy alley at night, "
    "comic noir style, high contrast black and white with bold shadows, "
    "halftone dots, vintage pulp magazine aesthetic, dramatic lighting"
)
comic_noir_seed = 123

output = replicate.run(
    FLUX_KLEIN_9B, input={"prompt": comic_noir_prompt, "seed": comic_noir_seed}
)
comic_noir_bytes = _fetch_image_bytes(output)

print("Flux Klein 9B — comic noir style:")
display(Image(data=comic_noir_bytes))
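To judge what the LoRA adds over prompt-only styling, it is convenient to view both images side by side. A small self-contained helper that renders labeled images into one HTML row (illustrative, in the same spirit as the b64_img helper earlier):

```python
import base64


def comparison_html(images: list[tuple[str, bytes]]) -> str:
    """Build an HTML table row of (label, PNG bytes) pairs for side-by-side viewing."""
    cells = "".join(
        f"<td style='text-align:center;'>{label}<br/>"
        f"<img src='data:image/png;base64,{base64.b64encode(data).decode()}' "
        f"style='max-width:280px; border:1px solid #ccc;'/></td>"
        for label, data in images
    )
    return f"<table><tr>{cells}</tr></table>"
```

For example: `display(HTML(comparison_html([("P-Image LoRA", generated_bytes), ("Prompt-only Klein 9B", comic_noir_bytes)])))`.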

Summary

You’ve completed P-Image LoRA Training and Inference (text-to-image):

  1. ✅ Generated comic noir images (and captions) using Flux-2-Klein 9B

  2. ✅ Prepared dataset for p-image-trainer (one image + one caption file per example, trigger word sks_comic_noir)

  3. ✅ Uploaded dataset to HuggingFace

  4. ✅ P-Image LoRA training with p-image-trainer

  5. ✅ Downloaded and extracted LoRA weights

  6. ✅ Uploaded LoRA weights to HuggingFace

  7. ✅ P-Image LoRA inference with p-image-lora using the trigger word

For image editing LoRA (input image → edited image), use the P-Image-Edit LoRA notebook and p-image-edit-trainer / p-image-edit-lora. See the LoRA and P-Image documentation for full parameters.