Open In Colab

How to create a movie from one idea using scene chaining

One idea, a few minutes, and you’ve got a short film. This guide uses scene chaining: each video segment flows into the next by using the last frame of one clip as the starting image for the next. The result feels like one continuous story.

What we’ll do:

  1. Pitch your idea — Describe your movie in one sentence (e.g., “A detective discovers a mysterious door in an abandoned warehouse”).

  2. Break it into scenes — An LLM turns your idea into 3–4 scenes, each with an image prompt, video prompt, and optional edit for transitions.

  3. Generate scene 1 — P-Image creates the first frame; P-Video animates it. We extract the last frame for the next scene.

  4. Chain the rest — For each following scene, we use that last frame (optionally edited) as the input. Same character, same world, seamless flow.

  5. Merge and watch — We stitch all segments into one video. Run the cells and watch your movie take shape.
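The chaining idea in steps 3–5 can be sketched in a few lines of plain Python. Here `animate` and `last_frame` are hypothetical stubs standing in for the real model calls used later in this notebook; a "clip" is just a list of labels so the sketch runs without any API:

```python
def make_movie(scenes, first_image):
    """Chain scenes: each clip starts from the previous clip's last frame."""
    clips = []
    image = first_image
    for scene in scenes:
        clip = animate(image, scene)   # stub: model animates a start image into a clip
        clips.append(clip)
        image = last_frame(clip)       # stub: the last frame seeds the next scene
    return clips

# Stubs for illustration only; the real notebook uses P-Video and moviepy.
def animate(image, scene):
    return [f"{scene}:start={image}", f"{scene}:end"]

def last_frame(clip):
    return clip[-1]

clips = make_movie(["scene1", "scene2"], "frame0")
# clips[1] starts from clip[0]'s last frame: "scene2:start=scene1:end"
```

Because every clip's first frame is the previous clip's last frame, the merged video plays as one continuous shot sequence rather than a set of unrelated clips.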

Models used: p-image, p-image-edit, p-video

Setup

First, let’s install the packages we need: moviepy and pillow for extracting frames and merging videos. You’ll also need Replicate and OpenAI API keys—get them from Replicate and OpenAI.

[1]:
%pip install replicate openai requests moviepy pillow
[2]:
import io
import json
import os
import tempfile
import requests
from IPython.display import Image, Video, display
from replicate.client import Client
from openai import OpenAI
from moviepy import VideoFileClip, concatenate_videoclips
[3]:
token = os.environ.get("REPLICATE_API_TOKEN")
if not token:
    token = input("Replicate API token (r8_...): ").strip()
replicate = Client(api_token=token, timeout=300)
[4]:
openai_token = os.environ.get("PRUNA_OPENAI_API_KEY") or os.environ.get("OPENAI_API_KEY")
if not openai_token:
    openai_token = input("OpenAI API key (sk-...): ").strip()
openai_client = OpenAI(api_key=openai_token)

Step 1: Describe your movie and plan the scenes

First, let’s describe your movie and plan the scenes. Change the concept variable to your idea. The LLM plans each scene one at a time, using the full context of all previous scenes so each next scene continues sensibly. Each scene has: image_prompt, video_prompt, edit_prompt (optional).

Run the cell to plan all scenes. Prompts are kept short for better model control.
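For reference, a planned scene has this shape (the values below are made up for illustration; only the keys match what the planner returns):

```python
# Hypothetical example of one planned scene; real prompts come from the LLM.
example_scene = {
    "image_prompt": "A detective in a trench coat inside a dim warehouse",  # seeds the first frame
    "video_prompt": "The detective walks slowly toward a door, dust drifting in the light",  # drives the motion
    "edit_prompt": None,  # optional edit applied to the previous scene's last frame; None to skip
}
```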

[5]:
concept = "A detective discovers a mysterious door in an abandoned warehouse"
SEED = 42
NUM_SCENES = 3

def plan_scene(concept: str, previous_scenes: list) -> dict:
    """Generate one scene. Uses full context of all previous scenes for continuity."""
    sys_msg = "Generate one scene. Return JSON: image_prompt, video_prompt, edit_prompt (null if not needed). Keep prompts short (1-2 sentences)."
    if not previous_scenes:
        user_msg = f"Movie: {concept}. Generate scene 1."
    else:
        ctx = "; ".join(
            f"Scene {i}: img=\"{s.get('image_prompt','')}\" vid=\"{s.get('video_prompt','')}\""
            for i, s in enumerate(previous_scenes, 1)
        )
        user_msg = f"Movie: {concept}. Previous: {ctx}. Generate scene {len(previous_scenes)+1} that continues sensibly."
    r = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "system", "content": sys_msg}, {"role": "user", "content": user_msg}],
    )
    raw = r.choices[0].message.content
    if "```" in raw:
        raw = raw.split("```")[1]
        if raw.startswith("json"):
            raw = raw[4:]
    return json.loads(raw.strip())

scenes = []
for i in range(NUM_SCENES):
    scene = plan_scene(concept, scenes)
    scenes.append(scene)
    print(f"Planned scene {i+1}")
print(f"Planned {len(scenes)} scenes")
Planned scene 1
Planned scene 2
Planned scene 3
Planned 3 scenes

Step 2: Generate the first scene

With the scene plan ready, we can generate the first scene. P-Image generates the first frame from the scene 1 prompt, then P-Video animates it. The key step is extracting the last frame of that video: it becomes the starting image for scene 2, so the transition feels natural.

We also define a helper `get_last_frame_url` that downloads the video, grabs its last frame, and uploads it to Replicate for the next run. Run the cells below to generate scene 1.

[6]:
def get_last_frame_url(video_url: str) -> tuple[str, str]:
    """Download video, extract last frame, upload to Replicate. Returns (url, local_path)."""
    resp = requests.get(video_url)
    resp.raise_for_status()
    vpath = tempfile.mktemp(suffix=".mp4")
    with open(vpath, "wb") as f:
        f.write(resp.content)
    clip = VideoFileClip(vpath)
    t = max(0.0, clip.duration - 0.1)  # sample just before the very end
    frame = clip.get_frame(t)
    clip.close()
    fpath = tempfile.mktemp(suffix=".png")
    from PIL import Image

    Image.fromarray(frame).save(fpath)
    with open(fpath, "rb") as fh:
        data = fh.read()
    uploaded = replicate.files.create(file=io.BytesIO(data), filename="last_frame.png")
    urls = getattr(uploaded, "urls", None)
    url = (
        (urls.get("get") if isinstance(urls, dict) else None)
        or getattr(uploaded, "content_url", None)
        or getattr(uploaded, "url", None)
    )
    return (url if url else str(uploaded), fpath)
[7]:
scene1 = scenes[0]
img_out = replicate.run(
    "prunaai/p-image", input={"prompt": scene1["image_prompt"], "aspect_ratio": "16:9", "seed": SEED, "prompt_upsampling": False}
)
image_url = (
    img_out
    if isinstance(img_out, str)
    else img_out[0] if isinstance(img_out, list) else str(img_out)
)

vid_out = replicate.run(
    "prunaai/p-video",
    input={
        "image": image_url,
        "prompt": scene1["video_prompt"],
        "duration": 5,
        "aspect_ratio": "16:9",
        "seed": SEED,
        "prompt_upsampling": False,
    },
)
def _extract_url(obj):
    if isinstance(obj, str):
        return obj
    if hasattr(obj, "url"):
        return obj.url
    if hasattr(obj, "content_url"):
        return obj.content_url
    if isinstance(obj, list) and obj:
        return _extract_url(obj[0])
    if isinstance(obj, dict):
        return obj.get("video") or obj.get("output") or (list(obj.values())[0] if obj else None)
    return str(obj)

video_url = _extract_url(vid_out)

video_clips = []
r = requests.get(video_url)
r.raise_for_status()
vpath = tempfile.mktemp(suffix=".mp4")
with open(vpath, "wb") as f:
    f.write(r.content)
video_clips.append(VideoFileClip(vpath))
last_frame_url, last_frame_path = get_last_frame_url(video_url)
display(Image(url=image_url))
display(Video(vpath, embed=True))

Step 3: Chain the remaining scenes

With the last frame from scene 1 ready, we’ll chain the remaining scenes. For each following scene, we start from the last frame of the previous video. If the LLM suggested an edit (e.g., “add a doorway”), we run P-Image-Edit first. Then P-Video animates from that frame. We extract the last frame again and repeat.

This loop runs for scenes 2, 3, 4… Each segment gets added to our list. You’ll see “Scene 2 done”, “Scene 3 done”, etc. as it progresses.

[8]:
for i, scene in enumerate(scenes[1:], start=2):
    img_url = last_frame_url
    if scene.get("edit_prompt"):
        edit_out = replicate.run(
            "prunaai/p-image-edit",
            input={"images": [img_url], "prompt": scene["edit_prompt"], "seed": SEED},
        )
        img_url = _extract_url(edit_out)  # same URL extraction helper as for video outputs

    vid_out = replicate.run(
        "prunaai/p-video",
        input={
            "image": img_url,
            "prompt": scene["video_prompt"],
            "duration": 5,
            "aspect_ratio": "16:9",
            "seed": SEED,
            "prompt_upsampling": False,
        },
    )
    video_url = _extract_url(vid_out)

    r = requests.get(video_url)
    r.raise_for_status()
    vpath = tempfile.mktemp(suffix=".mp4")
    with open(vpath, "wb") as f:
        f.write(r.content)
    video_clips.append(VideoFileClip(vpath))
    last_frame_url, last_frame_path = get_last_frame_url(video_url)
    display(Video(vpath, embed=True))

Step 4: Merge and watch your movie

We’ve now built all our scenes, so we can merge them into one video. Each segment is about 5 seconds long, so 3–4 scenes gives you a short film of roughly 15–20 seconds. Run the cell and your movie will appear below.

[9]:
# Trim last frame from each clip (except the last) to avoid overlap with next clip's first frame
clips_to_concat = []
for i, clip in enumerate(video_clips):
    if i < len(video_clips) - 1:
        fps = clip.fps
        trimmed = clip.subclipped(0, max(0, clip.duration - 1 / fps))
        clips_to_concat.append(trimmed)
    else:
        clips_to_concat.append(clip)
final = concatenate_videoclips(clips_to_concat)
output_path = tempfile.mktemp(suffix=".mp4")
final.write_videofile(output_path, codec="libx264", audio_codec="aac")
final.close()
for c in video_clips:
    c.close()
print("Video:", output_path)
display(Video(output_path, embed=True))
MoviePy - Building video /var/folders/9t/msy700h16jz3q35qvg4z1ln40000gn/T/tmpnf3kwi8h.mp4.
MoviePy - Writing audio in tmpnf3kwi8hTEMP_MPY_wvf_snd.mp4
MoviePy - Done.
MoviePy - Writing video /var/folders/9t/msy700h16jz3q35qvg4z1ln40000gn/T/tmpnf3kwi8h.mp4
MoviePy - Done !
MoviePy - video ready /var/folders/9t/msy700h16jz3q35qvg4z1ln40000gn/T/tmpnf3kwi8h.mp4
Video: /var/folders/9t/msy700h16jz3q35qvg4z1ln40000gn/T/tmpnf3kwi8h.mp4

Conclusion

Congratulations! You’ve just created a short film using AI. By combining the reasoning power of LLMs with the speed and efficiency of Pruna’s performance models, this method lets you generate long-form content in under a minute for less than $1.

You can now use this method to create longer and more complex movies by adding more scenes.

You can check out other workflows or sign up for our API and get started at https://dashboard.pruna.ai/login