Tutorials Pruna
These tutorials guide you through using pruna to optimize your models. Looking for pruna_pro tutorials? Check out the Tutorials Pruna Pro page.
Transcribe 2 Hours of Audio in 2 Minutes with Whisper
Speed up ASR with c_whisper compilation and whisper_s2t batching.
Smash your Computer Vision model with a CPU only
Compile your model with torch_compile and openvino for faster inference.
Speed Up and Quantize Any Diffusion Model
Speed up diffusers with torch_compile compilation and hqq_diffusers quantization.
Evaluating with CMMD using EvaluationAgent
Evaluate image generation quality with CMMD and EvaluationAgent.
Run your Flux model with half the memory
Speed up your image generation model with torch_compile compilation and hqq_diffusers quantization.
Making your LLMs 4x smaller
Speed up your LLM inference with gptq quantization.
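The "4x smaller" figure comes from storing each weight in 4 bits instead of the usual 16-bit floats. gptq itself uses a much more sophisticated, error-compensating layer-wise procedure; the dependency-free toy below (all names are illustrative, not pruna's API) only shows the storage arithmetic behind the size claim.

```python
def quantize_4bit(weights, group_size=4):
    """Toy symmetric 4-bit quantization: map each group of floats to
    integers in [-8, 7] with one shared scale per group."""
    quantized, scales = [], []
    for i in range(0, len(weights), group_size):
        group = weights[i:i + group_size]
        scale = max(abs(w) for w in group) / 7 or 1.0  # avoid div-by-zero
        scales.append(scale)
        quantized.extend(max(-8, min(7, round(w / scale))) for w in group)
    return quantized, scales

def dequantize(quantized, scales, group_size=4):
    """Recover approximate floats from the 4-bit integers and scales."""
    return [q * scales[i // group_size] for i, q in enumerate(quantized)]

weights = [0.12, -0.5, 0.33, 0.07, 1.2, -0.9, 0.0, 0.45]
q, s = quantize_4bit(weights)
restored = dequantize(q, s)

# Each weight shrinks from 16 bits (fp16) to 4 bits: 4x fewer bits,
# plus a small per-group overhead for the scales.
bits_fp16 = 16 * len(weights)
bits_4bit = 4 * len(q)
```

In the real tutorial, pruna applies gptq for you; this sketch is only meant to make the 16-bit-to-4-bit size ratio concrete.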
2x Smaller Sana Diffusers in Action
Optimize your diffusion model with hqq_diffusers quantization in 8 bits.
Make Stable Diffusion 3x Faster with DeepCache
Optimize your diffusion model with deepcache caching.
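DeepCache exploits the observation that the deep features of a diffusion denoiser change slowly between adjacent denoising steps, so they can be cached and reused instead of recomputed every step. The toy below (the function names are hypothetical stand-ins, not DeepCache's or pruna's API) sketches that idea under the assumption of a fixed cache interval.

```python
def deep_block(step):
    """Stand-in for the expensive deep layers of a denoiser; its output
    varies slowly with the step index, which is what makes caching safe."""
    return step // 10

def shallow_block(step, deep_features):
    """Stand-in for the cheap shallow layers, recomputed every step."""
    return deep_features + step * 0.01

def denoise(num_steps, cache_interval):
    """Run the toy denoising loop, refreshing the deep features only
    every `cache_interval` steps and reusing the cache in between."""
    cached, deep_calls, outputs = None, 0, []
    for step in range(num_steps):
        if cached is None or step % cache_interval == 0:
            cached = deep_block(step)
            deep_calls += 1
        outputs.append(shallow_block(step, cached))
    return outputs, deep_calls

outputs, calls = denoise(num_steps=20, cache_interval=5)
# deep_block runs only at steps 0, 5, 10, 15: 4 calls instead of 20.
```

The speedup depends on how much of the model's cost sits in the cached deep blocks and on how aggressively the interval is set; the tutorial covers the actual deepcache configuration in pruna.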