Tutorials Pruna
These tutorials guide you through using pruna to optimize your models. Looking for pruna_pro tutorials? Check out the Tutorials Pruna Pro page.
Compress with the hqq_diffusers quantizer and the deepcache cacher, and evaluate with throughput, total time, and clip_score.
Compress with hqq quantization and torch_compile compilation and evaluate with elapsed_time and perplexity.
Speed up automatic speech recognition (ASR) with c_whisper compilation and whisper_s2t batching.
Compile your model with torch_compile and openvino for faster inference.
Speed up diffusers with torch_compile compilation and hqq_diffusers quantization.
Evaluate image generation quality with CMMD and EvaluationAgent.
Optimize your diffusion model with hqq_diffusers quantization in 8 bits.
Optimize your diffusion model with deepcache caching.
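
Several of the tutorials above report total time, throughput, and perplexity. As a rough orientation for what these numbers measure, here is a minimal standard-library sketch. It is not Pruna's implementation: the helper names `time_inference` and `perplexity` are hypothetical, and the sketch assumes throughput means samples per second and that perplexity is computed from natural-log token probabilities.

```python
import math
import time
from typing import Callable, Dict, Sequence


def time_inference(
    run_once: Callable[[], object],
    n_iterations: int,
    batch_size: int = 1,
) -> Dict[str, float]:
    """Call run_once n_iterations times; return total time (s) and throughput (samples/s).

    Hypothetical helper for illustration only, not part of pruna.
    """
    if n_iterations <= 0:
        raise ValueError("n_iterations must be positive")
    start = time.perf_counter()
    for _ in range(n_iterations):
        run_once()
    total_time = time.perf_counter() - start
    # Guard against a zero reading on very coarse clocks.
    throughput = (
        n_iterations * batch_size / total_time if total_time > 0 else float("inf")
    )
    return {"total_time": total_time, "throughput": throughput}


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity as exp of the mean negative log-likelihood (natural log).

    Hypothetical helper for illustration only, not part of pruna.
    """
    if not token_log_probs:
        raise ValueError("token_log_probs must not be empty")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)
```

To compare a base model with an optimized one, you would pass each model's inference call as `run_once`, using the same inputs and iteration count, and compare the resulting dictionaries. Lower perplexity and higher throughput are better.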