Turbocharge Stable Diffusion Video Generation
This tutorial demonstrates how to use the pruna package to optimize a Stable Diffusion video generation pipeline. We will use the stable-video-diffusion-img2vid model as an example. All execution times given below were measured on an A10G GPU; note that this tutorial requires at least 21 GB of GPU memory.
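Before you start, you can check that your GPU has enough memory; a minimal sketch using torch.cuda (this assumes a single CUDA device at index 0):
[ ]:
import torch

# Report the total memory of the first CUDA device (the pipeline needs roughly 21 GB)
props = torch.cuda.get_device_properties(0)
print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB total")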
1. Loading the Stable Diffusion Video Model
First, load your Stable Diffusion video generation model.
[ ]:
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

# Load the image-to-video pipeline in half precision and move it to the GPU
model_id = "stabilityai/stable-video-diffusion-img2vid"
pipe = StableVideoDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16, variant="fp16")
pipe = pipe.to("cuda")
2. Initializing the Smash Config
Next, initialize the smash_config.
[ ]:
from pruna import SmashConfig

# Initialize the SmashConfig and select the diffusers2 compiler
smash_config = SmashConfig()
smash_config['compilers'] = ['diffusers2']
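If you want to double-check the configuration before smashing, you can print it (this assumes your pruna version provides a readable representation for SmashConfig):
[ ]:
# Inspect the configured optimizations
print(smash_config)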
3. Smashing the Model
Now, you can smash the model, which will take around 40 seconds. Don’t forget to replace the token with the one provided by PrunaAI.
[ ]:
from pruna import smash

# Smash the model
smashed_model = smash(
    model=pipe,
    token="<your-token>",  # replace <your-token> with your actual token or set to None if you do not have one yet
    smash_config=smash_config,
)
4. Running the Model
After the model has been compiled, we run inference for a few iterations as a warm-up. This will take around 3 minutes.
[ ]:
# Load the conditioning image
image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/svd/rocket.png")
image = image.resize((1024, 576))
generator = torch.manual_seed(42)
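The conditioning image does not have to come from a URL; load_image also accepts a local file path (my_image.png below is a hypothetical placeholder for your own file):
[ ]:
# Hypothetical local file; resize it to the resolution the model expects
image = load_image("my_image.png")
image = image.resize((1024, 576))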
[ ]:
# run some warm-up iterations
for _ in range(3):
    pipe(image, decode_chunk_size=8, generator=generator).frames[0]
Finally, run the model to generate the video with accelerated inference.
[ ]:
# Generate the video and save the result
frames = pipe(image, decode_chunk_size=8, generator=generator).frames[0]
export_to_video(frames, "generated.mp4", fps=7)
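To quantify the speed-up on your own hardware, you can time a generation after the warm-up; a minimal sketch using torch.cuda synchronization so that the full GPU work is measured:
[ ]:
import time

torch.cuda.synchronize()  # make sure pending GPU work is finished before timing
start = time.perf_counter()
pipe(image, decode_chunk_size=8, generator=generator).frames[0]
torch.cuda.synchronize()  # wait for the generation to complete on the GPU
print(f"Inference took {time.perf_counter() - start:.1f} s")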
Wrap Up
Congratulations! You have successfully smashed a Stable Diffusion video generation model. You can now use the pruna package to optimize any custom Stable Diffusion video generation model. The only parts you should modify to fit your use case are steps 1 and 4.