Welcome
Glad to have you here! At Pruna AI, we create solutions that empower developers to make their AI models smaller, faster, cheaper, and greener.
Our compression framework pruna is made by developers, for developers. It is designed to make your life easier by providing seamless access to state-of-the-art compression algorithms: in just a few lines of code, pruna lets you apply a diverse range of compression algorithms and evaluate their performance, all through a consistent, easy-to-use interface.
Pruna Open Source
pruna is a free and open-source compression framework that allows you to compress and evaluate your models.
Learn how to install pruna and use serving integrations.
Understand how to use pruna to compress and evaluate your models.
Learn how to benchmark and evaluate your optimized models with pruna.
Get familiar with end-to-end examples for various specific modalities and use cases.
How does it work? First, you need to install pruna:
pip install pruna
After installing pruna, you can start smashing your models in 4 easy steps:
Load a pretrained model
Create a SmashConfig
Apply optimizations with the smash function
Run inference with the optimized model
Let’s see how it works with an example:
import torch
from diffusers import StableDiffusionPipeline
from pruna import smash, SmashConfig
# Define the model you want to smash
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

# Initialize the SmashConfig and pick the algorithms to apply
smash_config = SmashConfig()
smash_config['cacher'] = 'deepcache'
smash_config['compiler'] = 'stable_fast'

# Smash the model
smashed_model = smash(
    model=pipe,
    smash_config=smash_config,
)
# Run the model on a prompt
prompt = "a photo of an astronaut riding a horse on mars"
image = smashed_model(prompt).images[0]
Now that you’ve seen what pruna can do, it’s your turn!
Pruna endpoints
What are performance models?
pruna hosts serverless endpoints for your models in collaboration with major inference providers in the industry, like Replicate, Prodia, Runpod, Segmind, DeepInfra, and Wiro.
We also host performance models: well-known open-source models that we optimize and serve serverlessly for speed, efficiency, and cost. Additionally, we offer performance models built in-house, such as P-Image and P-Image-Edit.
Pruna’s P-Image is a performance text-to-image model delivering AI images in under one second. It combines speed, quality, prompt adherence, and reliable text rendering.
Pruna’s P-Image-Edit is a state-of-the-art image editing model, offering fast, high-quality multi-image editing with excellent prompt following and text rendering.
API Reference for the performance models.
All models and pricing for our models.
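To give a feel for the shape of an endpoint call, here is a minimal sketch of building a text-to-image request payload. The URL, field names, and token below are hypothetical placeholders; consult the API reference above for the real schema, endpoint, and authentication details.

```python
import json
import urllib.request

# Hypothetical endpoint and field names -- check the Pruna API
# reference for the actual URL, schema, and auth headers.
ENDPOINT = "https://api.example.com/v1/p-image"  # placeholder URL

payload = {
    "prompt": "a photo of an astronaut riding a horse on mars",
    "width": 1024,   # illustrative generation parameters
    "height": 1024,
}

# Build (but do not send) the HTTP request object.
request = urllib.request.Request(
    ENDPOINT,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer <your_api_token>",  # placeholder token
    },
    method="POST",
)

print(request.full_url)
```

Sending the request (e.g. with `urllib.request.urlopen(request)`) would then return the generated image according to the provider's response format.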
Why performance models?
Pruna endpoints offer significant advantages over running your own model endpoints from scratch, thanks to our integrated optimizations and cloud infrastructure partnerships. They are:
Faster: models are hosted and optimized for speed using the latest optimization algorithms.
Cheaper: model optimizations reduce hardware requirements and, with them, costs.
Better: good optimizations can be lossless, and those are our specialty.
Tip
Check out our benchmark comparison page for a head-to-head look at latency and price compared to self-hosting and other public endpoints. See how much time and money you can save.
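If you want a rough do-it-yourself comparison, the core of such a benchmark is just repeated timing of a call. Below is a minimal sketch in plain Python, with a stand-in function in place of a real model or endpoint call (the helper and its parameters are illustrative, not part of pruna):

```python
import time
import statistics

def benchmark(fn, warmup=2, runs=10):
    """Time fn over several runs and report mean and p95 latency in ms."""
    for _ in range(warmup):  # warmup runs are excluded from the stats
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000)
    samples.sort()
    return {
        "mean_ms": statistics.mean(samples),
        "p95_ms": samples[int(0.95 * (len(samples) - 1))],
    }

# Stand-in for a model call, e.g. smashed_model(prompt) or an HTTP request.
def fake_model_call():
    time.sleep(0.01)  # pretend inference takes ~10 ms

stats = benchmark(fake_model_call)
print(f"mean: {stats['mean_ms']:.1f} ms, p95: {stats['p95_ms']:.1f} ms")
```

Swapping `fake_model_call` for your own self-hosted pipeline and for an endpoint call gives you a like-for-like latency comparison on your own workload.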
Pruna Pro
pruna_pro is our premium offering that provides advanced compression algorithms and features to help you achieve better compression results. It uses exactly the same interface as pruna, with additional algorithms and capabilities on top.
Learn how to transition to pruna_pro and access premium features.
Learn how to use the pruna_pro features with end-to-end examples.
Search for all the pro features and algorithms.
How does it work? First, you need to install pruna_pro:
pip install pruna_pro
After installing pruna_pro, you use the exact same interface as pruna but with additional features:
from pruna_pro import smash # instead of: from pruna import smash
smash(model, smash_config, token='<your_pruna_pro_token>') # add your token here
Now that you’ve seen what pruna_pro can do, it’s your turn!
Pruna Community
We love to organize events and workshops, and there are many coming up! You can find more info about our community and events in the Community section.