Tutorials Pruna

These tutorials guide you through using pruna to optimize your models. Looking for pruna_pro tutorials? Check out the Tutorials Pruna Pro page.

Transcribe 2 hours of audio in 2 minutes with Whisper

Speed up automatic speech recognition (ASR) with c_whisper compilation and whisper_s2t batching.

./asr_tutorial.ipynb

Smash your Computer Vision model using only a CPU

Compile your model with torch_compile and openvino for faster inference.

./cv_cpu.ipynb

Speed Up and Quantize any Diffusion Model

Speed up diffusers with torch_compile compilation and hqq_diffusers quantization.

./diffusion_quantization_acceleration.ipynb

Evaluating with CMMD using EvaluationAgent

Evaluate image generation quality with CMMD and EvaluationAgent.

./evaluation_agent_cmmd.ipynb

Run your Flux model with half the memory

Speed up your image generation model with torch_compile compilation and hqq_diffusers quantization.

./flux_small.ipynb

Making your LLMs 4x smaller

Speed up your LLM inference with gptq quantization.

./llms.ipynb

2x smaller Sana diffusers in action

Optimize your diffusion model with 8-bit hqq_diffusers quantization.

./sana_diffusers_int8.ipynb

Make Stable Diffusion 3x Faster with DeepCache

Optimize your diffusion model with deepcache caching.

./sd_deepcache.ipynb