
Pruna documentation

Tutorials Pruna

These tutorials will guide you through the process of using pruna to optimize your models. Looking for pruna_pro tutorials? Check out the Tutorials Pruna Pro page.

Compress and Evaluate Image Generation Models

Compress with an hqq_diffusers quantizer and a deepcache cacher, and evaluate with throughput, total time, and clip_score.

./image_generation.ipynb

Compress and Evaluate Video Generation Models

Compress with a torch_compile compiler and a flash_attn3 kernel, and evaluate with total time, latency, throughput, co2_emissions, and energy_consumed.

./video_generation.ipynb
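
The speed metrics named above are related to one another, and a small sketch may make that concrete. The helper below is a hypothetical illustration written with the Python standard library only; it is not pruna's implementation, and the name `measure_timing`, its parameters, and the units (seconds, samples per second) are assumptions chosen for this example.

```python
import time


def measure_timing(fn, n_iterations=10, batch_size=1):
    """Time repeated calls to fn and derive simple speed metrics.

    Hypothetical helper for illustration; not part of pruna.
    """
    start = time.perf_counter()
    for _ in range(n_iterations):
        fn()
    total_time = time.perf_counter() - start  # seconds for all iterations
    return {
        "total_time": total_time,
        # Average seconds spent per call.
        "latency": total_time / n_iterations,
        # Samples processed per second across all calls.
        "throughput": n_iterations * batch_size / total_time,
    }


# Stand-in workload; in practice this would be a model inference call.
metrics = measure_timing(lambda: sum(range(10_000)), n_iterations=5)
print(metrics)
```

Under these assumed definitions, latency and throughput are both derived from the same total time, so a compressed model that lowers one should raise the other for a fixed batch size.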

Compress and Evaluate Large Language Models

Compress with hqq quantization and torch_compile compilation and evaluate with elapsed_time and perplexity.
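
For readers unfamiliar with perplexity, here is a minimal sketch of the standard definition: the exponential of the average negative log-likelihood per token. The function name and the toy input are hypothetical, and this is not how pruna computes the metric internally.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities.

    Hypothetical helper for illustration; not part of pruna.
    """
    # Average negative log-likelihood over all tokens.
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)


# Toy case: a model that assigns probability 0.25 to every token
# is as uncertain as a uniform choice among 4 options.
uniform = [math.log(0.25)] * 8
print(perplexity(uniform))  # approximately 4
```

Lower perplexity means the model assigns higher probability to the reference text, which is why it is a useful check that quantization has not degraded a language model too much.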

Compress and Evaluate Reasoning Large Language Models

Compress with hqq quantization and torch_compile compilation and evaluate with total time, perplexity, throughput and energy_consumed.

./reasoning_llm.ipynb

Transcribe 2 hours of audio in 2 minutes with Whisper

Speed up ASR using the c_whisper compilation and whisper_s2t batching.

./asr_tutorial.ipynb

Smash your Computer Vision model on a CPU only

Compile your model with torch_compile and openvino for faster inference.

Speed Up and Quantize any Diffusion Model

Speed up diffusers with torch_compile compilation and hqq_diffusers quantization.

./diffusion_quantization_acceleration.ipynb

Evaluating with CMMD using EvaluationAgent

Evaluate image generation quality with CMMD and EvaluationAgent.

./evaluation_agent_cmmd.ipynb

2x smaller Sana diffusers in action

Optimize your diffusion model with hqq_diffusers quantization in 8 bits.

./sana_diffusers_int8.ipynb

Make Stable Diffusion 3x Faster with DeepCache

Optimize your diffusion model with deepcache caching.

./sd_deepcache.ipynb

Automate finding the best SmashConfig with the Optimization Agent (Pro)

Transcribe 2 hours of audio in less than 2 minutes with Whisper