Pruna Tutorials
These tutorials will guide you through the process of using pruna to optimize your models. Looking for pruna_pro tutorials? Check out the Pruna Pro Tutorials page.
Transcribe 2 hours of audio in 2 minutes with Whisper
Speed up ASR using c_whisper compilation and whisper_s2t batching.
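A minimal sketch of what that combination could look like with pruna's SmashConfig; the Whisper checkpoint and the exact configuration keys are assumptions based on the option names above, and the full tutorial remains the reference:

```python
from transformers import AutoModelForSpeechSeq2Seq
from pruna import SmashConfig, smash

# Example checkpoint; any Whisper-style ASR model should work here.
model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-large-v3")

# Combine whisper.cpp-style compilation with batched speech-to-text inference.
smash_config = SmashConfig()
smash_config["compiler"] = "c_whisper"
smash_config["batcher"] = "whisper_s2t"

# smash() returns a wrapped model that keeps the original inference interface.
smashed_model = smash(model=model, smash_config=smash_config)
```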
Smash your Computer Vision model with only a CPU
Compile your model with torch_compile and openvino for faster inference.
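A rough sketch of CPU-only compilation, assuming the dict-style SmashConfig keys used elsewhere in the docs; the ResNet-50 model is only a stand-in:

```python
import torchvision.models as models
from pruna import SmashConfig, smash

# Any torchvision classification model can stand in here.
model = models.resnet50(weights="DEFAULT")

# The tutorial compares two compilers on a CPU-only machine;
# swap "openvino" for "torch_compile" to try the pure-PyTorch path.
smash_config = SmashConfig()
smash_config["compiler"] = "openvino"

smashed_model = smash(model=model, smash_config=smash_config)
```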
Speed Up and Quantize any Diffusion Model
Speed up diffusers with torch_compile compilation and hqq_diffusers quantization.
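A short sketch of stacking both options in one SmashConfig; the Stable Diffusion checkpoint and prompt are placeholders rather than part of the tutorial:

```python
from diffusers import StableDiffusionPipeline
from pruna import SmashConfig, smash

# Placeholder pipeline; any diffusers pipeline can be smashed the same way.
pipe = StableDiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5")

# Stack a compiler and a quantizer in a single SmashConfig.
smash_config = SmashConfig()
smash_config["compiler"] = "torch_compile"
smash_config["quantizer"] = "hqq_diffusers"

# The smashed pipeline is called exactly like the original one.
smashed_pipe = smash(model=pipe, smash_config=smash_config)
image = smashed_pipe("a cabin in the mountains at sunrise").images[0]
```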
Evaluating with CMMD using EvaluationAgent
Evaluate image generation quality with CMMD and EvaluationAgent.
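A sketch of the evaluation flow; the module paths, dataset name, and Task signature here are assumptions inferred from the names in the tutorial title, not taken from it:

```python
from diffusers import StableDiffusionPipeline
from pruna.data.pruna_datamodule import PrunaDataModule
from pruna.evaluation.evaluation_agent import EvaluationAgent
from pruna.evaluation.task import Task

# Model under evaluation; a smashed pipeline can be passed in the same way.
pipe = StableDiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5")

# Request the CMMD metric over an image dataset (dataset name is assumed).
datamodule = PrunaDataModule.from_string("LAION256")
task = Task(["cmmd"], datamodule=datamodule)

# The agent runs the requested metrics and returns their values.
eval_agent = EvaluationAgent(task)
results = eval_agent.evaluate(pipe)
```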
Run your Flux model with half the memory
Speed up your image generation model with torch_compile compilation and hqq_diffusers quantization.
Making your LLMs 4x smaller
Speed up your LLM inference with gptq quantization.
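A minimal sketch, assuming gptq is selected the same way as the other quantizers via SmashConfig; the model id is an example and the calibration setup is left to the full tutorial:

```python
from transformers import AutoModelForCausalLM
from pruna import SmashConfig, smash

# Example model id; substitute the LLM you want to shrink.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

# 4-bit GPTQ weights take roughly a quarter of the fp16 footprint,
# which is where the "4x smaller" headline comes from.
smash_config = SmashConfig()
smash_config["quantizer"] = "gptq"

# Note: GPTQ is calibration-based, so the full tutorial also attaches a
# tokenizer and a calibration dataset to the config before smashing.
smashed_model = smash(model=model, smash_config=smash_config)
```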
x2 smaller Sana diffusers in action
Optimize your diffusion model with 8-bit hqq_diffusers quantization.
Make Stable Diffusion 3x Faster with DeepCache
Optimize your diffusion model with deepcache caching.
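A brief sketch of enabling DeepCache through the cacher slot of SmashConfig; the checkpoint and prompt are placeholders:

```python
from diffusers import StableDiffusionPipeline
from pruna import SmashConfig, smash

pipe = StableDiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5")

# DeepCache reuses intermediate UNet features across denoising steps,
# trading a little fidelity for a large speedup.
smash_config = SmashConfig()
smash_config["cacher"] = "deepcache"

smashed_pipe = smash(model=pipe, smash_config=smash_config)
image = smashed_pipe("a photo of an astronaut riding a horse").images[0]
```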