Tutorials Pruna Pro

These tutorials walk through common use cases of pruna_pro. Looking for pruna tutorials? Check out the Tutorials Pruna page.

Flux in a heartbeat

Optimize your Flux model for faster inference with periodic caching and torch.compile.

Flux generation in a heartbeat, literally (Pro)

Shrink and accelerate SANA

Shrink and accelerate SANA using torchao_autoquant quantization.

Shrink and accelerate Sana: x2 smaller and x2 faster (Pro)

Speed up diffusion models with caching

Speed up diffusion models with auto caching.

Make Any Diffusion Model 3x Faster with Auto Caching (Pro)

Turbocharge video generation

Speed up HunyuanVideo video generation with adaptive caching.

Turbocharge Text-to-Video Generation (Pro)

Accelerating inference in vLLM serving

Serve large language models with vLLM and pruna_pro optimization.

Accelerating inference in vLLM serving (Pro)

Caching for custom models

Apply caching algorithms to nearly any diffusion or flow matching model.

./custom_caching.ipynb

Accelerating vLLM with Higgs quantizer

Optimize any LLM with Higgs quantization and serve with vLLM.

./vllm_higgs.ipynb