Pruna Tutorials
These tutorials will guide you through the process of using pruna to optimize your models. Looking for pruna_pro tutorials? Check out the Pruna Pro Tutorials page.
Transcribe 2 hours of audio in 2 minutes with Whisper
Speed up ASR using c_whisper compilation and whisper_s2t batching.
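A minimal sketch of what that combination could look like with pruna's SmashConfig; the Whisper checkpoint and the exact configuration keys are assumptions based on the option names above, and the full tutorial remains the reference:

```python
from transformers import AutoModelForSpeechSeq2Seq
from pruna import SmashConfig, smash

# Example checkpoint; any Whisper-style ASR model should work here.
model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-large-v3")

# Combine whisper.cpp-style compilation with batched speech-to-text inference.
smash_config = SmashConfig()
smash_config["compiler"] = "c_whisper"
smash_config["batcher"] = "whisper_s2t"

# smash() returns a wrapped model that keeps the original inference interface.
smashed_model = smash(model=model, smash_config=smash_config)
```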
Smash your Computer Vision model with only a CPU
Compile your model with torch_compile and openvino for faster inference.
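A rough sketch of CPU-only compilation, assuming the dict-style SmashConfig keys used elsewhere in the docs; the ResNet-50 model is only a stand-in:

```python
import torchvision.models as models
from pruna import SmashConfig, smash

# Any torchvision classification model can stand in here.
model = models.resnet50(weights="DEFAULT")

# The tutorial compares two compilers on a CPU-only machine;
# swap "openvino" for "torch_compile" to try the pure-PyTorch path.
smash_config = SmashConfig()
smash_config["compiler"] = "openvino"

smashed_model = smash(model=model, smash_config=smash_config)
```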
Speed Up and Quantize any Diffusion Model
Speed up diffusers with torch_compile compilation and hqq_diffusers quantization.
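A short sketch of stacking both options in one SmashConfig; the Stable Diffusion checkpoint and prompt are placeholders rather than part of the tutorial:

```python
from diffusers import StableDiffusionPipeline
from pruna import SmashConfig, smash

# Placeholder pipeline; any diffusers pipeline can be smashed the same way.
pipe = StableDiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5")

# Stack a compiler and a quantizer in a single SmashConfig.
smash_config = SmashConfig()
smash_config["compiler"] = "torch_compile"
smash_config["quantizer"] = "hqq_diffusers"

# The smashed pipeline is called exactly like the original one.
smashed_pipe = smash(model=pipe, smash_config=smash_config)
image = smashed_pipe("a cabin in the mountains at sunrise").images[0]
```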
Evaluating with CMMD using EvaluationAgent
Evaluate image generation quality with CMMD and EvaluationAgent.
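A sketch of the evaluation flow; the module paths, dataset name, and Task signature here are assumptions inferred from the names in the tutorial title, not taken from it:

```python
from diffusers import StableDiffusionPipeline
from pruna.data.pruna_datamodule import PrunaDataModule
from pruna.evaluation.evaluation_agent import EvaluationAgent
from pruna.evaluation.task import Task

# Model under evaluation; a smashed pipeline can be passed in the same way.
pipe = StableDiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5")

# Request the CMMD metric over an image dataset (dataset name is assumed).
datamodule = PrunaDataModule.from_string("LAION256")
task = Task(["cmmd"], datamodule=datamodule)

# The agent runs the requested metrics and returns their values.
eval_agent = EvaluationAgent(task)
results = eval_agent.evaluate(pipe)
```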
Run your Flux model with half the memory
Speed up your image generation model with torch_compile compilation and hqq_diffusers quantization.
Making your LLMs 4x smaller
Speed up your LLM inference with gptq quantization.
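A minimal sketch, assuming gptq is selected the same way as the other quantizers via SmashConfig; the model id is an example and the calibration setup is left to the full tutorial:

```python
from transformers import AutoModelForCausalLM
from pruna import SmashConfig, smash

# Example model id; substitute the LLM you want to shrink.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

# 4-bit GPTQ weights take roughly a quarter of the fp16 footprint,
# which is where the "4x smaller" headline comes from.
smash_config = SmashConfig()
smash_config["quantizer"] = "gptq"

# Note: GPTQ is calibration-based, so the full tutorial also attaches a
# tokenizer and a calibration dataset to the config before smashing.
smashed_model = smash(model=model, smash_config=smash_config)
```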
x2 smaller Sana diffusers in action
Optimize your diffusion model with 8-bit hqq_diffusers quantization.
Make Stable Diffusion 3x Faster with DeepCache
Optimize your diffusion model with deepcache caching.
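A brief sketch of enabling DeepCache through the cacher slot of SmashConfig; the checkpoint and prompt are placeholders:

```python
from diffusers import StableDiffusionPipeline
from pruna import SmashConfig, smash

pipe = StableDiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5")

# DeepCache reuses intermediate UNet features across denoising steps,
# trading a little fidelity for a large speedup.
smash_config = SmashConfig()
smash_config["cacher"] = "deepcache"

smashed_pipe = smash(model=pipe, smash_config=smash_config)
image = smashed_pipe("a photo of an astronaut riding a horse").images[0]
```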