Smashing Automatic Speech Recognition Models into a Pipeline
============================================================

This tutorial demonstrates how to use the `pruna` package to optimize any custom Whisper model. The output is a smashed Whisper model wrapped in an efficient pipeline. We use the openai/whisper-large-v3 model as an example.

Loading the ASR Model
---------------------

First, load your ASR model.

.. code-block:: python

    import torch
    from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor

    # Use the GPU with half precision when available, otherwise fall back to CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

    model_id = "openai/whisper-large-v3"

    model = AutoModelForSpeechSeq2Seq.from_pretrained(
        model_id,
        torch_dtype=torch_dtype,
        use_safetensors=True,
        low_cpu_mem_usage=True,
    )
    model.to(device)

    # The processor is passed to the smash config in the next step.
    processor = AutoProcessor.from_pretrained(model_id)

Initializing the Smash Config
-----------------------------

Next, initialize the smash config.

.. code-block:: python

    from pruna_engine.SmashConfig import SmashConfig

    # Initialize the SmashConfig
    smash_config = SmashConfig()
    smash_config['compilers'] = ['ws2t', 'c_whisper']
    smash_config['processor'] = processor
    # Uncomment the following line to quantize the model to 8 bits
    # smash_config['weight_quantization_bits'] = 8

Smashing the Model
------------------

Now, smash the model.

.. code-block:: python

    from pruna.smash import smash

    # Smash the model
    smashed_model = smash(
        model=model,
        api_key='',  # replace with your actual API key
        smash_config=smash_config,
    )

Remember to replace the empty `api_key` with the key provided by PrunaAI.

Preparing the Input
-------------------

Download a sample audio file:

.. code-block:: bash

    wget https://huggingface.co/datasets/reach-vb/random-audios/resolve/main/sam_altman_lex_podcast_367.flac

Then store the path to the downloaded file:

.. code-block:: python

    audio_sample = 'sam_altman_lex_podcast_367.flac'

Running the Model
-----------------

Finally, run the model to transcribe the audio file.

.. code-block:: python

    # Transcribe the audio file and display the result
    smashed_model(audio_sample)

Wrap Up
-------

Congratulations! You have successfully smashed an ASR model. You can now use the `pruna` package to optimize any custom ASR model. The only parts you should need to modify are step 1 (loading the model) and step 5 (running the model on your own audio).
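
To check whether the smashed model is actually faster on your own hardware, you can time it against the original model. The helper below is a minimal sketch and is not part of `pruna`: the names `time_callable` and `speedup` are hypothetical, and it only assumes that the object you time can be called with the audio path, as `smashed_model` is in this tutorial.

```python
import time
from statistics import median


def time_callable(fn, *args, warmup=1, runs=5):
    """Return the median wall-clock time, in seconds, of calling fn(*args).

    Hypothetical helper for illustration; it is not provided by pruna.
    """
    # Warm-up calls are executed but not timed (first calls are often slower).
    for _ in range(warmup):
        fn(*args)

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    return median(timings)


def speedup(baseline_seconds, optimized_seconds):
    """Return baseline / optimized latency; values above 1.0 mean faster."""
    if optimized_seconds <= 0:
        raise ValueError("optimized_seconds must be positive")
    return baseline_seconds / optimized_seconds
```

For example, `time_callable(smashed_model, audio_sample)` would give a median latency for the smashed model. A baseline for the original model has to be built separately (for instance with a standard `transformers` pipeline), and the two numbers can then be compared with `speedup`. The median is used rather than the mean so that a single slow run does not dominate the result.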