Tutorials Pruna
These tutorials will guide you through the process of using pruna to optimize your models. Looking for pruna_pro tutorials? Check out the Tutorials Pruna Pro page.
Compress with an hqq_diffusers quantizer and a deepcache cacher, and evaluate with throughput, total time, and clip_score.
Compress with a torch_compile compiler and a flash_attn3 kernel, and evaluate with total time, latency, throughput, co2_emissions, and energy_consumed.
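As a rough illustration of how the timing metrics named above relate to one another, here is a minimal sketch in plain Python. It does not use pruna; measure_timing and fake_model are hypothetical names introduced only for this example, and pruna's own metrics may differ in detail (for instance in warm-up iterations or device synchronization).

```python
import time
from typing import Callable, Sequence


def measure_timing(run_batch: Callable[[Sequence], object],
                   batches: Sequence[Sequence]) -> dict:
    """Time `run_batch` over all batches and derive three timing metrics."""
    n_samples = sum(len(batch) for batch in batches)
    start = time.perf_counter()
    for batch in batches:
        run_batch(batch)
    total_time = time.perf_counter() - start  # seconds for the whole run
    return {
        "total_time": total_time,
        "latency": total_time / len(batches),  # average seconds per batch
        "throughput": n_samples / total_time,  # samples per second
    }


# Hypothetical stand-in for a model call: sleeps 1 ms per sample.
def fake_model(batch):
    time.sleep(0.001 * len(batch))


metrics = measure_timing(fake_model, [[0, 1], [2, 3], [4, 5]])
print(metrics)
```

Under these definitions, latency and throughput are both derived from the same total time, so a speed-up from compilation should show up in all three numbers at once.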
Compress with hqq quantization and torch_compile compilation, and evaluate with elapsed_time and perplexity.
Compress with hqq quantization and torch_compile compilation, and evaluate with total time, perplexity, throughput, and energy_consumed.
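For readers unfamiliar with perplexity, the sketch below shows the standard definition: the exponential of the mean negative log-likelihood the model assigns to each reference token. It is plain Python rather than pruna code; the perplexity helper and its input format are assumptions made for illustration, and pruna's metric may be computed differently (for example, from logits in batches).

```python
import math
from typing import Sequence


def perplexity(token_probs: Sequence[float]) -> float:
    """Perplexity from the probability assigned to each reference token."""
    # Negative log-likelihood of every token under the model.
    nll = [-math.log(p) for p in token_probs]
    # Exponentiate the mean NLL; lower values mean a better fit.
    return math.exp(sum(nll) / len(nll))


# A model that is uniformly unsure among four options at every step.
uniform = perplexity([0.25, 0.25, 0.25, 0.25])
# A model that is more confident about the reference tokens.
confident = perplexity([0.9, 0.8, 0.95, 0.85])
print(uniform, confident)
```

A uniform guess among four tokens gives a perplexity of 4, which is why the metric is often read as an effective branching factor. Comparing perplexity before and after quantization indicates how much quality was traded for speed.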
Speed up automatic speech recognition (ASR) using c_whisper compilation and whisper_s2t batching.
Compile your model with torch_compile and openvino for faster inference.
Speed up diffusers with torch_compile compilation and hqq_diffusers quantization.
Evaluate image generation quality with CMMD and EvaluationAgent.
Optimize your diffusion model with 8-bit hqq_diffusers quantization.
Optimize your diffusion model with deepcache caching.