Turbocharge Stable Diffusion Video Generation
This tutorial demonstrates how to use the pruna package to optimize a Stable Diffusion video generation pipeline. We will use the stable-video-diffusion-img2vid model as an example. All execution times given below were measured on an A10G GPU; note that this tutorial requires at least 21 GB of GPU memory.
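Before you start, you can check that your GPU has enough memory; a minimal sketch using torch.cuda (this assumes a single CUDA device at index 0):
[ ]:
import torch

# Report the total memory of the first CUDA device (the pipeline needs roughly 21 GB)
props = torch.cuda.get_device_properties(0)
print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB total")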
1. Loading the Stable Diffusion Video Model
First, load your Stable Diffusion video generation model.
[ ]:
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

# Load the image-to-video pipeline in half precision and move it to the GPU
model_id = "stabilityai/stable-video-diffusion-img2vid"
pipe = StableVideoDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16, variant="fp16")
pipe = pipe.to("cuda")
2. Initializing the Smash Config
Next, initialize the smash_config.
[ ]:
from pruna import SmashConfig

# Initialize the SmashConfig and select the diffusers2 compiler
smash_config = SmashConfig()
smash_config['compilers'] = ['diffusers2']
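If you want to double-check the configuration before smashing, you can print it (this assumes your pruna version provides a readable representation for SmashConfig):
[ ]:
# Inspect the configured optimizations
print(smash_config)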
3. Smashing the Model
Now, you can smash the model, which will take around 40 seconds. Don’t forget to replace the token with the one provided by PrunaAI.
[ ]:
from pruna import smash

# Smash the model
smashed_model = smash(
    model=pipe,
    token="<your-token>",  # replace <your-token> with your actual token or set to None if you do not have one yet
    smash_config=smash_config,
)
4. Running the Model
After the model has been compiled, we run inference for a few iterations as a warm-up. This will take around 3 minutes.
[ ]:
# Load the conditioning image
image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/svd/rocket.png")
image = image.resize((1024, 576))
generator = torch.manual_seed(42)
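The conditioning image does not have to come from a URL; load_image also accepts a local file path (my_image.png below is a hypothetical placeholder for your own file):
[ ]:
# Hypothetical local file; resize it to the resolution the model expects
image = load_image("my_image.png")
image = image.resize((1024, 576))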
[ ]:
# run some warm-up iterations
for _ in range(3):
    pipe(image, decode_chunk_size=8, generator=generator).frames[0]
Finally, run the model to generate the video with accelerated inference.
[ ]:
# Generate the video and save the result
frames = pipe(image, decode_chunk_size=8, generator=generator).frames[0]
export_to_video(frames, "generated.mp4", fps=7)
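To quantify the speed-up on your own hardware, you can time a generation after the warm-up; a minimal sketch using torch.cuda synchronization so that the full GPU work is measured:
[ ]:
import time

torch.cuda.synchronize()  # make sure pending GPU work is finished before timing
start = time.perf_counter()
pipe(image, decode_chunk_size=8, generator=generator).frames[0]
torch.cuda.synchronize()  # wait for the generation to complete on the GPU
print(f"Inference took {time.perf_counter() - start:.1f} s")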
Wrap Up
Congratulations! You have successfully smashed a Stable Diffusion video generation model. You can now use the pruna package to optimize any custom Stable Diffusion video generation model. The only parts you should modify to fit your use case are steps 1 and 4.