P-Video-Replace
P-Video-Replace replaces characters in an existing video using reference images and prompt-guided mappings.
Given a source video and one or more reference stills, the model replaces characters in the footage with your reference identities. The output keeps the atmosphere of the video (background, blocking, camera, lighting) while swapping who appears on screen. Motion, acting, timing, and scene structure follow the driver clip. Use instruction_prompt to describe who replaces whom when multiple people are on screen.
It is optimized for:
- Top visual quality
- Most efficient inference
- Speed: 3.58 s of generation time per 1 s of video
- Price: $0.03 per second of 720p video and $0.06 per second of 1080p video
Not sure how this differs from P-Video-Animate, P-Video, or P-Video-Avatar? See P-Video-Animate vs. P-Video-Replace and Choosing the right video model below.
Note
When using P-Video-Replace, respect the copyright of videos and images you use as input and of the video you generate.
Pricing:
| Resolution | Price |
|---|---|
| 720p | $0.03 per second of output video |
| 1080p | $0.06 per second of output video |
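As a rough illustration of the pricing above, the following sketch estimates the cost of a single run from the output duration. The helper name `estimate_cost_usd` is hypothetical, and the sketch assumes billing scales linearly with output length; how partial seconds are rounded is not stated here.

```python
# Per-second prices taken from the pricing table (USD per second of output video).
PRICE_PER_SECOND_USD = {"720p": 0.03, "1080p": 0.06}


def estimate_cost_usd(duration_s: float, resolution: str = "720p") -> float:
    """Estimate the cost of one replace run (hypothetical helper, not an API call).

    Assumes cost is linear in output duration; partial-second rounding is unknown.
    """
    if resolution not in PRICE_PER_SECOND_USD:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if duration_s < 0:
        raise ValueError("duration_s must be non-negative")
    return round(duration_s * PRICE_PER_SECOND_USD[resolution], 4)


print(estimate_cost_usd(10, "720p"))   # 0.3
print(estimate_cost_usd(10, "1080p"))  # 0.6
```

Because output length follows the source video, the duration passed in is effectively the length of the driver clip.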
Tip
Test it in the P-Video Playground.
Prompt formula
Fast pass
Reuse a P-Video-Animate driver and its matching still, with a one-line instruction_prompt. Good for first swaps and timing checks.
Locked-in
Spell out who replaces whom, scene placement, and audio sync for repeatable runs, using an instruction_prompt with explicit lip-sync and keep lines.
- Still / reference image: The API uses the first entry in images. reference_image_prompt is not a request field; generate the still with P-Image (see Domain Use Cases there), then upload the file as images[0]. In our docs examples, each still is the matching P-Video-Animate catalog frame.
  Examples: “Photorealistic woman, 9:16 chest-up, soft window light, single subject”, “Stop-motion clay sailor, medium full-body, gray studio stool”
- video: Required driver clip. Motion, acting, timing, lip sync, and camera movement come from this file. Pair with the animate row’s {slug}_driver.mp4 when reusing catalog examples.
  Examples: a winning UGC take, avatar output, or the driver from a P-Video-Animate example row.
- instruction_prompt: Names who in the driver to replace with whom from reference image 1. Keep it to one short sentence plus a keep line (motion, audio, camera; add lip sync for dialogue drivers).
  Fast pass: “Replace the person in the source video with {character label} from reference image 1. Keep motion, audio, and camera from the source video.”
  Locked-in: “Replace the live-action man on the gray studio stool with the stop-motion clay sailor from reference image 1. Keep lip sync, motion, audio, and camera from the source video.”
| Slot | Fast pass (enough to run) | Locked-in (stronger control) |
|---|---|---|
| Still prompt (P-Image → | Reuse the animate-catalog still for that row, or one line: who, outfit, basic light. | Single subject explicit; pose, wardrobe, expression, light direction, set, lens + aspect, for stable identity across reruns. |
| video | Animate-catalog driver for the motion you want to keep. | Same clip every run when comparing stills or prompt tweaks. |
| instruction_prompt | One line: driver subject → reference image 1 + keep motion/audio/camera. | Placement + identity mapping + sync line; call out props, blocking, and match lip sync and audio from the source video. |
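The fast-pass and locked-in instruction_prompt templates share one shape: a replacement sentence followed by a keep line. A minimal sketch of that shape is below; the function `build_instruction_prompt` and its parameter names are hypothetical and only assemble text, they are not part of the API.

```python
def build_instruction_prompt(
    source_subject: str,
    character_label: str,
    reference_index: int = 1,
    lip_sync: bool = False,
) -> str:
    """Assemble a replacement sentence plus a keep line (hypothetical helper)."""
    keep = ["motion", "audio", "camera"]
    if lip_sync:
        # Dialogue drivers: ask for lip sync explicitly, listed first.
        keep.insert(0, "lip sync")
    keep_text = ", ".join(keep[:-1]) + ", and " + keep[-1]
    return (
        f"Replace {source_subject} with {character_label} "
        f"from reference image {reference_index}. "
        f"Keep {keep_text} from the source video."
    )


# Fast pass: generic subject, no lip-sync line.
print(build_instruction_prompt("the person in the source video", "the clay sailor"))

# Locked-in: placement detail in the subject, plus lip sync.
print(
    build_instruction_prompt(
        "the live-action man on the gray studio stool",
        "the stop-motion clay sailor",
        lip_sync=True,
    )
)
```

Keeping the template in one place makes it easier to change a single variable between runs when comparing stills or prompt tweaks.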
Note
Output size vs. driver size: resolution (720p / 1080p) sets the output megapixel budget; the aspect ratio follows the driver. A 1080p replace may render at a higher pixel size than a 720p driver uploaded as video, which is expected. Side-by-side compare clips in our gallery are normalized to the driver’s frame size (crop-to-fill, no letterboxing).
Tip
For comprehensive video prompting (motion, framing, atmosphere), see the Video Generation guide.
How does it differ from other models?
The current market includes general video editing/modification models that can perform broad video transformations. P-Video-Replace is designed specifically for character replacement workflows: replacing people in an existing video while preserving the original motion, camera movement, lighting, background, and scene structure.
Benchmark numbers below are directional and may vary depending on resolution, clip length, settings, provider, queue time, and test date.
P-Video-Animate vs. P-Video-Replace
Both models take a source video and reference image(s), but they are built for different workflows—not interchangeable substitutes.
| | P-Video-Animate | P-Video-Replace |
|---|---|---|
| What it does | Animates one image using motion, timing, and camera movement from a driver clip. | Replaces the character(s) in a video with the character(s) from reference stills. |
| What atmosphere you keep | The image’s atmosphere: look, lighting, wardrobe, and world of the still drive the output. | The video’s atmosphere: background, blocking, camera, and scene of the footage drive the output. |
| When to use it | You have an approved hero still and want it to perform like an existing take. | You have finished footage and want different people in the same shot. |
Rule of thumb: Choose Animate when the still defines the world; choose Replace when the clip defines the world.
Choosing the right video model
Pruna ships four performance video models. They share the same prediction API, but each solves a different production problem. P-Video-Replace uses the p-video-replace model (Model: p-video-replace header) and requires an existing source video.
| | P-Video | P-Video-Avatar | P-Video-Animate | Replace (this page) |
|---|---|---|---|---|
| One-line job | Generate new footage from prompts | Speak from one still (script or audio) | Retarget one still with clip motion | Swap characters in existing footage |
| You start with | Text prompt (+ optional image refs) | Portrait still + | Source video + one still | Source video + 1–4 identity stills |
| You keep from the source | N/A (new scene) | Aspect ratio of the still | Motion, timing, camera movement, and optionally audio | Camera, timing, blocking, background |
| Typical ask | “Make a 10 s product ad in this style.” | “This spokesperson says this line in French.” | “Animate this catalog still using our winning ad take.” | “Put our creator in this UGC b-roll.” |
Quick decision guide
- No source video yet → use P-Video to create the plate, or P-Video-Avatar if you only need a talking head from a still.
- Footage exists and the hero still should move like the driver → P-Video-Animate (only the first entry in images is used). See P-Video-Animate vs. P-Video-Replace.
- Footage exists and you need different people in the same shot → P-Video-Replace (Model: p-video-replace; use instruction_prompt when multiple people are on screen).
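The decision guide can be read as a small branching rule. The sketch below encodes it with a hypothetical function `choose_video_model`; it returns product names as written on this page, not API header values, since only the p-video-replace header value is given here.

```python
def choose_video_model(
    has_source_video: bool,
    talking_head_only: bool = False,
    swap_people: bool = False,
) -> str:
    """Return the product name suggested by the quick decision guide (hypothetical helper)."""
    if not has_source_video:
        # No footage yet: generate a plate, or a talking head from a still.
        return "P-Video-Avatar" if talking_head_only else "P-Video"
    # Footage exists: swap people in the shot, or make a hero still move like the driver.
    return "P-Video-Replace" if swap_people else "P-Video-Animate"


print(choose_video_model(has_source_video=False))                         # P-Video
print(choose_video_model(has_source_video=False, talking_head_only=True)) # P-Video-Avatar
print(choose_video_model(has_source_video=True))                          # P-Video-Animate
print(choose_video_model(has_source_video=True, swap_people=True))        # P-Video-Replace
```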
Tip
Common pipelines: Generate stills with P-Image → P-Video-Avatar for new spokesperson clips → P-Video-Replace to drop talent into b-roll → P-Video-Animate to apply a hero still to motion from an avatar or ad clip.
Speed and throughput
| Metric | P-Video-Replace (720p benchmark) |
|---|---|
| Generation time per 1 s of output | 3.58 s |
| Cost (720p) | $0.03/s of output video |
| Cost (1080p) | $0.06/s of output video |
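For planning turnaround, the benchmark figure can be turned into a rough per-clip estimate. This is a sketch only: the helper `estimate_generation_time_s` is hypothetical, it uses the 720p benchmark value, and it ignores upload time, queue time, and any difference at 1080p.

```python
# Directional 720p benchmark: seconds of generation per second of output video.
BENCHMARK_SECONDS_PER_OUTPUT_SECOND = 3.58


def estimate_generation_time_s(duration_s: float) -> float:
    """Rough generation time for a clip of the given length (hypothetical helper)."""
    if duration_s < 0:
        raise ValueError("duration_s must be non-negative")
    return round(duration_s * BENCHMARK_SECONDS_PER_OUTPUT_SECOND, 2)


print(estimate_generation_time_s(10))  # 35.8
```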
Key features
P-Video-Replace fits the same Pruna API patterns as P-Video and P-Video-Avatar:
- UGC ad variations
Scale winning creatives by swapping in new creators, customers, or personas.
- Viral meme remixes
Refresh trending clips with custom characters, avatars, or branded personas.
- Movie scene recasting
Replace actors or characters with uploaded avatars, selfies, or character images.
- Game cinematic variations
Personalize trailers or cutscenes with player avatars, skins, heroes, or custom characters.
- Educational videos
Localize or personalize training videos by replacing speakers, instructors, or role-based characters.
- Fast compared to existing replace models
Optimized for production pipelines that need turnaround without sacrificing usable quality on typical footage.
- Multi-character swap
Upload a video and 1–4 reference stills with an optional instruction_prompt (Model: p-video-replace header).
- Scene and blocking preservation
Keeps the driver clip’s camera, timing, layout, and background while swapping visible characters.
- Reliable everyday motion
Strong on normal movement and slow, controlled action: walking, talking heads, presenters, product demos.
- Audio-aware output
Control source audio with save_audio and ignore_audio (see Configuration).
Practical constraints
Output length follows the source video duration (within platform max length).
Output aspect ratio follows the source video.
1080p output can exceed the driver’s native pixel dimensions; preview at driver resolution when comparing before/after.
Very fast action, heavy occlusion, or extreme camera motion may reduce consistency.
Use a clean, well-lit reference still when possible.
Supports up to four references when multiple characters are on screen.
Examples
One tab per Hugging Face folder (ugc_ads, film_casting, gaming, meme_remixes), with side-by-side cards showing the reference image (image), driver (video), a driver ↔ output compare clip, resolution, and copy-ready instruction_prompt text. Create the reference still with P-Image (see Domain Use Cases there for tone and structure), then pass it as the first entry in images. Full assets live on prompt_guide/p-video-replace.
Integration
P-Video-Replace uses the same Pruna prediction API as P-Video. Upload video and images, set Model: p-video-replace, then poll or use sync headers as with other video models.
Tip
For more information on how to use the API, see the API Reference.
API Endpoint
Base URL: https://api.pruna.ai/v1/predictions
Authentication
-H 'apikey: YOUR_API_KEY'
Step 1: Upload source video and reference image
curl -X POST "https://api.pruna.ai/v1/files" \
  -H "apikey: YOUR_API_KEY" \
  -F "content=@/path/to/source.mp4"

curl -X POST "https://api.pruna.ai/v1/files" \
  -H "apikey: YOUR_API_KEY" \
  -F "content=@/path/to/reference.jpg"
Use the returned file URLs as video and entries in images.
Step 2: Create generation request
Replace mode (asynchronous)
curl -X POST 'https://api.pruna.ai/v1/predictions' \
  -H 'Content-Type: application/json' \
  -H 'apikey: YOUR_API_KEY' \
  -H 'Model: p-video-replace' \
  -d '{
    "input": {
      "video": "https://api.pruna.ai/v1/files/file-driver123",
      "images": [
        "https://api.pruna.ai/v1/files/file-still-a"
      ],
      "instruction_prompt": "Replace the person in the source video with the clay sailor (medium full-body) from reference image 1. Keep lip sync, motion, audio, and camera from the source video.",
      "resolution": "1080p",
      "save_audio": true
    }
  }'
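The same request can be assembled from Python. The sketch below mirrors the curl call using only the standard library and builds the request without sending it. The function `build_replace_request` is hypothetical, and the response format and polling flow are not covered in this section, so they are left out.

```python
import json
import urllib.request
from typing import Optional, Sequence

PREDICTIONS_URL = "https://api.pruna.ai/v1/predictions"


def build_replace_request(
    api_key: str,
    video_url: str,
    image_urls: Sequence[str],
    instruction_prompt: Optional[str] = None,
    resolution: Optional[str] = None,
    save_audio: Optional[bool] = None,
) -> urllib.request.Request:
    """Build (but do not send) a P-Video-Replace prediction request (hypothetical helper)."""
    payload_input = {"video": video_url, "images": list(image_urls)}
    # Optional fields are only included when set, so API defaults apply otherwise.
    if instruction_prompt is not None:
        payload_input["instruction_prompt"] = instruction_prompt
    if resolution is not None:
        payload_input["resolution"] = resolution
    if save_audio is not None:
        payload_input["save_audio"] = save_audio
    body = json.dumps({"input": payload_input}).encode("utf-8")
    return urllib.request.Request(
        PREDICTIONS_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "apikey": api_key,
            "Model": "p-video-replace",  # selects the model; no "mode" goes in "input"
        },
    )


request = build_replace_request(
    api_key="YOUR_API_KEY",
    video_url="https://api.pruna.ai/v1/files/file-driver123",
    image_urls=["https://api.pruna.ai/v1/files/file-still-a"],
    resolution="1080p",
    save_audio=True,
)
print(request.get_method(), request.full_url)
# Sending would be: urllib.request.urlopen(request)
```

Building the request separately from sending it makes the payload easy to inspect or log before any billable call is made.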
Configuration
Required parameters
| Parameter | Type | Description |
|---|---|---|
| video | file/string | Source RGB video ( |
| images | file[] / string[] | Reference image(s). Replace: 1–4 identity references. |
Optional parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| mode | string | — | Do not send in input; set the Model: p-video-replace header instead. |
| instruction_prompt | string | | Further instruction on how to place people from reference images into the scene. |
| resolution | string | | Target megapixel budget: 720p or 1080p. |
| fps | integer | 24 | Frames per second of the output video. |
| save_audio | boolean | | Save the video with audio. |
| ignore_audio | boolean | | Ignore source audio for prompt conditioning and return a silent output video. |
| disable_safety_checker | boolean | | Disable safety checker for generated videos (platform UI may still enforce checks). |
| seed | integer | random | Random seed. Leave blank for random. |
| no_op | boolean | | Health check mode: returns status without inference. |
Supported option values
resolution: 720p, 1080p.
Argument recommendations
Use these patterns for consistent quality:
- Model header: p-video-replace (required; do not send mode in input).
- video: prefer stable exposure, minimal motion blur, and clear visibility of subjects you want to replace.
- images: high-resolution, well-lit faces or full-body shots matching the intended framing; supports up to four references.
- instruction_prompt: name who in the driver maps to reference image 1; mention wardrobe or props when identity drifts.
- resolution: iterate in 720p, then rerun finals in 1080p.
- fps: match source footage when possible; default 24 is fine for most web delivery.
- save_audio / ignore_audio: keep save_audio: true for dialogue-driven clips; set ignore_audio: true when you only need motion without sound.
- seed: set for reproducible A/B tests; change one variable at a time.
- disable_safety_checker: leave default unless your workflow includes explicit moderation.
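A few of these constraints can be checked client-side before a request is sent. The following is a minimal sketch with a hypothetical `validate_replace_input` helper; it covers only the rules stated on this page (required video, 1–4 images, supported resolution values, no mode in input) and is not a substitute for server-side validation.

```python
from typing import Any, Dict, List

SUPPORTED_RESOLUTIONS = {"720p", "1080p"}
MAX_REFERENCE_IMAGES = 4


def validate_replace_input(payload_input: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in an "input" object (hypothetical helper)."""
    problems = []
    if not payload_input.get("video"):
        problems.append("video is required")
    images = payload_input.get("images") or []
    if not 1 <= len(images) <= MAX_REFERENCE_IMAGES:
        problems.append("images must contain 1 to 4 references")
    if "mode" in payload_input:
        problems.append("do not send mode in input; set the Model header instead")
    resolution = payload_input.get("resolution")
    if resolution is not None and resolution not in SUPPORTED_RESOLUTIONS:
        problems.append("resolution must be 720p or 1080p")
    return problems


ok_input = {"video": "https://api.pruna.ai/v1/files/file-driver123",
            "images": ["https://api.pruna.ai/v1/files/file-still-a"],
            "resolution": "720p"}
print(validate_replace_input(ok_input))  # []
print(validate_replace_input({"images": [], "mode": "replace", "resolution": "4k"}))
```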