Smashing Computer Vision Models
===============================

This tutorial demonstrates how to use the `pruna` package to optimize any custom computer vision model. We will use the vit_b_16 model as an example.

Loading the CV Model
--------------------

First, load your computer vision model.

.. code-block:: python

    from torchvision.models import vit_b_16, ViT_B_16_Weights

    # Load a pretrained ViT-B/16 and move it to the GPU
    model = vit_b_16(weights=ViT_B_16_Weights.DEFAULT).cuda()

Initializing the Smash Config
-----------------------------

Next, initialize the smash_config.

.. code-block:: python

    from pruna_engine.SmashConfig import SmashConfig

    # Initialize the SmashConfig
    smash_config = SmashConfig()
    smash_config['task'] = 'image_classification'
    smash_config['compilers'] = 'cv-fast'
    smash_config['weight_quantization_bits'] = 16

Smashing the Model
------------------

Now, smash the model.

.. code-block:: python

    from pruna.smash import smash

    # Smash the model
    smashed_model = smash(
        model=model,
        api_key='',  # replace with your actual API key
        smash_config=smash_config,
    )

Don't forget to replace the api_key with the one provided by PrunaAI.

Preparing the Input
-------------------

.. code-block:: python

    import numpy as np
    from torchvision import transforms

    # Generate a random 224x224 RGB image as a stand-in for real data
    image = np.random.randint(0, 256, size=(224, 224, 3), dtype=np.uint8)
    # Convert to a float tensor, add a batch dimension, and move it to the GPU
    input_tensor = transforms.ToTensor()(image).unsqueeze(0).cuda()

Running the Model
-----------------

Finally, run the smashed model on the input tensor.

.. code-block:: python

    # Run inference and display the result
    smashed_model(input_tensor)

Wrap Up
-------

Congratulations! You have successfully smashed a CV model. You can now use the `pruna` package to optimize any custom CV model. The only parts you should need to modify are the first step (loading the model) and the input preparation and inference steps, to fit your use case. Additionally, you can use the compiler 'all', which explores many compression methods and also supports CPU optimization, although it supports fewer CV models.
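To check whether smashing actually helped on your hardware, it can be useful to time the original and the smashed model on the same input. The helper below is a minimal sketch and not part of `pruna`: ``benchmark`` and its ``warmup``, ``runs``, and ``sync`` parameters are hypothetical names introduced here for illustration. It uses only the Python standard library and accepts any callable.

```python
import time
from statistics import mean


def benchmark(fn, *args, warmup=3, runs=10, sync=None):
    """Return the mean wall-clock time (in seconds) of calling fn(*args).

    `sync` is an optional zero-argument callable invoked before each timer
    is stopped. On a GPU, kernels run asynchronously, so a synchronization
    call is needed for the measured time to cover the whole forward pass.
    """
    # Warm-up calls are not timed (they absorb one-off costs such as caching)
    for _ in range(warmup):
        fn(*args)
    if sync is not None:
        sync()

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(*args)
        if sync is not None:
            sync()
        timings.append(time.perf_counter() - start)
    return mean(timings)


# Assumed usage with the objects from this tutorial (requires torch and a GPU):
#   import torch
#   base = benchmark(model, input_tensor, sync=torch.cuda.synchronize)
#   fast = benchmark(smashed_model, input_tensor, sync=torch.cuda.synchronize)
#   print(f"original: {base * 1e3:.2f} ms, smashed: {fast * 1e3:.2f} ms")
```

The numbers depend heavily on the GPU, batch size, and number of runs, so treat a single measurement as indicative only. For inference timing it is also common to put the model in evaluation mode and run the calls under ``torch.no_grad()``.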