P-Image LoRA: Training and Inference
This notebook is the full walkthrough for P-Image LoRA Training and Inference: we train a LoRA adapter for text-to-image (custom style or concept) with p-image-trainer and run inference with p-image-lora. See the P-Image and LoRA documentation for full parameter lists.
Two workflows — don’t mix them:
| Workflow | Use case | Training | Inference |
|---|---|---|---|
| P-Image LoRA | Text-to-image (prompt → image) | p-image-trainer | p-image-lora |
| P-Image-Edit LoRA | Image editing (input image → edited image) | p-image-edit-trainer | p-image-edit-lora |
This notebook covers P-Image LoRA only.
In this notebook we will:
1. Generate comic noir stylized images (and captions) using Flux-2-Klein 9B
2. Prepare the dataset for p-image-trainer (one image + one caption file per example)
3. Upload the dataset to HuggingFace
4. Train a LoRA (P-Image LoRA training) with p-image-trainer
5. Download and extract the trained LoRA weights
6. Upload the LoRA weights to HuggingFace
7. Test inference (P-Image LoRA inference) with p-image-lora
Setup
Install required packages and set up authentication.
[ ]:
%pip install replicate huggingface-hub pillow tqdm datasets
[ ]:
import os
import zipfile

import requests
from datasets import load_dataset
from huggingface_hub import HfApi, upload_file
from IPython.display import Image, display
from replicate.client import Client
from tqdm import tqdm
replicate_token = os.environ.get("REPLICATE_API_TOKEN")
if not replicate_token:
replicate_token = input("Replicate API token (r8_...): ").strip()
hf_token = os.environ.get("HF_TOKEN")
if not hf_token:
hf_token = input("HuggingFace API token (hf_...): ").strip()
replicate = Client(api_token=replicate_token)
hf_api = HfApi(token=hf_token)
Step 1: Generate training images (comic noir style)
We generate comic noir stylized images using Flux-2-Klein 9B. Each prompt is suffixed with a comic noir style description so that all images share the same theme (high contrast, shadows, vintage pulp aesthetic). For P-Image LoRA we need one image per example plus a caption file; we include the trigger word sks_comic_noir in every caption so we can prompt the trained model with that token at inference. The resulting dataset is used to train with p-image-trainer and generate with p-image-lora.
[ ]:
COMIC_NOIR_STYLE = (
"comic noir style, high contrast black and white with bold shadows, "
"halftone dots, vintage pulp magazine aesthetic, dramatic lighting"
)
n_samples = 200
dataset_stream = load_dataset(
"data-is-better-together/open-image-preferences-v1",
split="cleaned",
streaming=True,
)
streamed_prompts = []
for item in dataset_stream:
if item.get("simplified_prompt"):
streamed_prompts.append(item["simplified_prompt"])
if len(streamed_prompts) >= n_samples:
break
prompts = streamed_prompts
print(
    f"Loaded {len(prompts)} prompts (will be rendered in comic noir style with Flux-2-Klein 9B)"
)
Now, let’s generate the images and their captions. Because we are fine-tuning a style LoRA, each caption is prefixed with the trigger word sks_comic_noir so the style can be invoked with that single token at inference.
[ ]:
def _fetch_image_bytes(output):
"""
Given output from replicate.run, fetches the actual image bytes.
- If output is a file-like object, calls .read()
- If output is a list of URLs, downloads the first URL
- If output is a single URL string, downloads it directly
"""
if hasattr(output, "read"):
return output.read()
if isinstance(output, list):
# Replicate typically returns a list of URLs (even length-1).
url = output[0]
elif isinstance(output, str):
url = output
else:
raise ValueError(f"Unexpected output type: {type(output)}")
resp = requests.get(url)
resp.raise_for_status()
return resp.content
FLUX_KLEIN_9B = "black-forest-labs/flux-2-klein-9b"
TRIGGER_WORD = "sks_comic_noir"
def generate_image_9b(prompt: str, seed: int) -> bytes:
"""Generate a single image with Flux-2-Klein 9B."""
output = replicate.run(FLUX_KLEIN_9B, input={"prompt": prompt, "seed": seed})
return _fetch_image_bytes(output)
image_captions = []
for i, prompt in enumerate(tqdm(prompts, desc="Generating comic noir images")):
styled_prompt = f"{prompt}, {COMIC_NOIR_STYLE}"
caption = f"{TRIGGER_WORD}, {prompt}"
try:
img_bytes = generate_image_9b(styled_prompt, i)
image_captions.append((img_bytes, caption))
except Exception as e:
print(f"Error generating image {i}: {e}")
continue
print(f"Generated {len(image_captions)} comic noir images with captions")
Sample comic noir images (Klein 9B) to verify the generation:
[ ]:
import base64
from IPython.display import HTML
def b64_img(img_bytes):
return f"<img src='data:image/png;base64,{base64.b64encode(img_bytes).decode()}' style='max-width: 280px; border:1px solid #ccc;'/>"
n_show = min(5, len(image_captions))
print(f"Displaying {n_show} sample comic noir images:")
html = "<table><tr><th style='text-align:center;'>#</th><th>Image</th><th>Caption</th></tr>"
for i in range(n_show):
img_bytes, caption = image_captions[i]
cap_short = (caption[:60] + "...") if len(caption) > 60 else caption
html += f"<tr><td style='text-align:center; font-weight:bold;'>{i+1}</td><td>{b64_img(img_bytes)}</td><td style='max-width:200px;'>{cap_short}</td></tr>"
html += "</table>"
display(HTML(html))
These comic noir images (with captions using the trigger word sks_comic_noir) are used to train a P-Image LoRA with p-image-trainer; inference uses p-image-lora.
Step 2: Prepare Dataset (P-Image LoRA)
Format the data for p-image-trainer: one image per example plus one caption file with the same base name (e.g. image_001.png, image_001.txt). Each caption includes the trigger word sks_comic_noir so we can prompt the trained LoRA with that token at inference.
[10]:
def create_dataset_zip(
image_captions: list[tuple[bytes, str]], output_path: str = "dataset.zip"
) -> str:
"""Create a ZIP for p-image-trainer: one image + one .txt caption per example."""
with zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as zipf:
for i, (img_bytes, caption) in enumerate(image_captions):
base_name = f"image_{i:03d}"
zipf.writestr(f"{base_name}.png", img_bytes)
zipf.writestr(f"{base_name}.txt", caption.encode("utf-8"))
return output_path
dataset_zip_path = create_dataset_zip(image_captions)
print(f"Created dataset ZIP: {dataset_zip_path}")
print(f"Dataset size: {os.path.getsize(dataset_zip_path) / 1024 / 1024:.2f} MB")
Created dataset ZIP: dataset.zip
Dataset size: 2.54 MB
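Optionally, sanity-check the pairing convention before uploading: every .png in the ZIP should have a .txt caption with the same base name. A minimal check, reusing dataset_zip_path from above:
[ ]:
with zipfile.ZipFile(dataset_zip_path) as zf:
    names = set(zf.namelist())
pngs = {n for n in names if n.endswith(".png")}
missing = [n for n in sorted(pngs) if n.replace(".png", ".txt") not in names]
assert not missing, f"Captions missing for: {missing}"
print(f"Verified {len(pngs)} image/caption pairs")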
Step 3: Upload Dataset to HuggingFace
Upload the dataset ZIP to HuggingFace. You’ll need to create a repository first (e.g., your-username/lora-dataset).
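If the repository doesn't exist yet, you can create it from the notebook with huggingface_hub's create_repo (a sketch; the repo id below is a placeholder, swap in your own):
[ ]:
from huggingface_hub import create_repo

# Create the dataset repo if needed; exist_ok makes this a no-op if it already exists.
create_repo(
    repo_id="your-username/lora-dataset",  # placeholder repo id
    repo_type="dataset",
    exist_ok=True,
    token=hf_token,
)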
[ ]:
hf_dataset_repo = "davidberenstein1957/comic_noir"
upload_file(
path_or_fileobj=dataset_zip_path,
path_in_repo="input.zip",
repo_id=hf_dataset_repo,
repo_type="dataset",
token=hf_token,
)
hf_dataset_url = (
f"https://huggingface.co/datasets/{hf_dataset_repo}/resolve/main/input.zip"
)
print(f"Dataset uploaded to: {hf_dataset_url}")
Step 4: P-Image LoRA Training
Start P-Image LoRA training with p-image-trainer using the comic noir dataset. We use steps=1000, learning_rate=0.0001, and training_type="style". See the P-Image and LoRA documentation for full parameters.
[ ]:
training_input = {
"image_data": hf_dataset_url,
"steps": 1000,
"learning_rate": 0.0001,
"training_type": "style",
}
print("Starting P-Image LoRA training...")
prediction = replicate.predictions.create(
model="prunaai/p-image-trainer",
input=training_input,
)
print(f"Training started. Prediction ID: {prediction.id}")
print(f"Monitor at: https://replicate.com/p/{prediction.id}")
print(
"Waiting for training to complete (P-Image trainer is typically 10-20 min for ~100 images)..."
)
prediction.wait()
if prediction.status != "succeeded":
raise Exception(
f"Training {prediction.status}: {getattr(prediction, 'error', 'Unknown error')}"
)
print("Training completed!")
output = prediction.output
if hasattr(output, "url"):
lora_output_url = output.url
elif isinstance(output, str):
lora_output_url = output
else:
lora_output_url = str(output)
print(f"LoRA output URL: {lora_output_url}")
Step 5: Download and Extract LoRA Weights
Download the training output ZIP, extract the LoRA weights file, and upload it to HuggingFace.
[ ]:
response = requests.get(lora_output_url)
response.raise_for_status()
lora_zip_path = "lora_output.zip"
with open(lora_zip_path, "wb") as f:
f.write(response.content)
print(f"Downloaded LoRA output ZIP: {lora_zip_path}")
with zipfile.ZipFile(lora_zip_path, "r") as zipf:
file_list = zipf.namelist()
print(f"Files in ZIP: {file_list}")
lora_file = next((f for f in file_list if f.endswith(".safetensors")), None)
if not lora_file:
raise ValueError("No .safetensors file found in ZIP")
lora_weights_bytes = zipf.read(lora_file)
print(f"Extracted LoRA file: {lora_file}")
lora_weights_path = "weights.safetensors"
with open(lora_weights_path, "wb") as f:
f.write(lora_weights_bytes)
print(f"LoRA weights saved to: {lora_weights_path}")
Step 6: Upload LoRA Weights to HuggingFace
Upload the extracted LoRA weights file to HuggingFace.
[ ]:
hf_lora_repo = "davidberenstein1957/comic_noir"
upload_file(
path_or_fileobj=lora_weights_path,
path_in_repo="weights.safetensors",
repo_id=hf_lora_repo,
repo_type="model",
token=hf_token,
)
hf_lora_url = f"https://huggingface.co/{hf_lora_repo}/resolve/main/weights.safetensors"
print(f"LoRA weights uploaded to: {hf_lora_url}")
The weights are now hosted on HuggingFace; Step 7 passes hf_lora_url to p-image-lora for inference.
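As a quick sanity check, you can confirm the URL resolves (for a private repo this returns 401 without an Authorization header; p-image-lora authenticates through the hf_api_token input instead):
[ ]:
# Quick reachability check for the uploaded weights file.
resp = requests.head(hf_lora_url, allow_redirects=True)
print(resp.status_code)  # 200 = publicly reachable, 401 = private repo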
Step 7: P-Image LoRA Inference
Use p-image-lora to generate an image with the trained comic noir LoRA. Prompt with the trigger word sks_comic_noir plus your scene description.
[ ]:
inference_prompt = f"{TRIGGER_WORD}, a detective in a rainy alley at night"
output = replicate.run(
"prunaai/p-image-lora",
input={
"prompt": inference_prompt,
"lora_weights": hf_lora_url,
"lora_scale": 1.0,
"seed": 42,
"hf_api_token": hf_token,
},
)
generated_bytes = (
output.read() if hasattr(output, "read") else _fetch_image_bytes(output)
)
print("Generated image with P-Image LoRA (comic noir):")
display(Image(data=generated_bytes))
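The lora_scale input controls how strongly the adapter is applied. A small sweep makes the effect visible; this reuses the exact call from above, and which scales look best will depend on your dataset:
[ ]:
# Compare adapter strengths: lower scales stay closer to the base model,
# higher scales push harder toward the comic noir style.
for scale in (0.5, 0.8, 1.0):
    out = replicate.run(
        "prunaai/p-image-lora",
        input={
            "prompt": inference_prompt,
            "lora_weights": hf_lora_url,
            "lora_scale": scale,
            "seed": 42,
            "hf_api_token": hf_token,
        },
    )
    print(f"lora_scale = {scale}")
    display(Image(data=_fetch_image_bytes(out)))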
Example: Stylized generation with Flux-2-Klein 9B (comic noir)
This section uses Flux-2-Klein 9B for stylized text-to-image with no LoRA: we steer the model with a style-specific prompt (comic noir: high contrast, shadows, vintage comic aesthetic). To get a reusable comic noir style you can trigger with a single token, train a P-Image LoRA (text-to-image) with p-image-trainer on a dataset of comic noir images; see the LoRA documentation.
[ ]:
FLUX_KLEIN_9B = "black-forest-labs/flux-2-klein-9b"
comic_noir_prompt = (
"A detective in a trench coat standing in a rainy alley at night, "
"comic noir style, high contrast black and white with bold shadows, "
"halftone dots, vintage pulp magazine aesthetic, dramatic lighting"
)
comic_noir_seed = 123
output = replicate.run(
FLUX_KLEIN_9B, input={"prompt": comic_noir_prompt, "seed": comic_noir_seed}
)
comic_noir_bytes = _fetch_image_bytes(output)
print("Flux Klein 9B — comic noir style:")
display(Image(data=comic_noir_bytes))
Summary
You’ve completed P-Image LoRA Training and Inference (text-to-image):
✅ Generated comic noir images (and captions) using Flux-2-Klein 9B
✅ Prepared the dataset for p-image-trainer (one image + one caption file per example, trigger word sks_comic_noir)
✅ Uploaded the dataset to HuggingFace
✅ Trained a P-Image LoRA with p-image-trainer
✅ Downloaded and extracted the LoRA weights
✅ Uploaded the LoRA weights to HuggingFace
✅ Ran P-Image LoRA inference with p-image-lora using the trigger word
For image editing LoRA (input image → edited image), use the P-Image-Edit LoRA notebook and p-image-edit-trainer / p-image-edit-lora. See the LoRA and P-Image documentation for full parameters.