Tutorials Pruna Pro
These tutorials guide you through some common use cases of pruna_pro. Looking for pruna tutorials? Check out the Tutorials Pruna page.
Flux in a heartbeat
Optimize your Flux model for faster inference with periodic caching and torch.compile.
Shrink and accelerate SANA
Shrink and accelerate SANA using torchao_autoquant quantization.
Speed up diffusion models with caching
Speed up diffusion models with auto caching.
TurboCharge video generation
Speed up HunyuanVideo video generation with adaptive caching.
Accelerating inference in vLLM serving
Serve large language models with vLLM and pruna_pro optimization.
Caching for Custom Models
Apply caching algorithms to nearly any diffusion or flow matching model.
Accelerating vLLM with Higgs quantizer
Optimize any LLM with Higgs quantization and serve with vLLM.