Tutorials Pruna Pro

These tutorials guide you through common use cases of pruna_pro. Looking for pruna tutorials? Check out the Tutorials Pruna page.

Flux in a heartbeat

Optimize your Flux model for faster inference with periodic caching and torch.compile.

Shrink and accelerate SANA

Shrink and accelerate SANA using torchao_autoquant quantization.

Speed up diffusion models with caching

Speed up diffusion models with auto caching.

Turbocharge video generation

Speed up HunyuanVideo video generation with adaptive caching.

Accelerating inference in vLLM serving

Serve large language models with vLLM and pruna_pro optimization.

Caching for Custom Models

Apply caching algorithms to nearly any diffusion or flow matching model.

Accelerating vLLM with Higgs quantizer

Optimize any LLM with Higgs quantization and serve it with vLLM.