Smashing Automatic Speech Recognition Models into a Pipeline
============================================================

This tutorial demonstrates how to use the ``pruna`` package to optimize any custom Whisper model. The result is a smashed Whisper model wrapped in an efficient pipeline. We use the ``openai/whisper-large-v3`` model as an example.

Loading the ASR Model
----------------------------------

First, load your ASR model and its processor.

.. code-block:: python

    import torch
    from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor

    # Use the GPU with half precision when available, otherwise fall back to CPU
    device = "cuda" if torch.cuda.is_available() else "cpu"
    torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

    model_id = "openai/whisper-large-v3"

    model = AutoModelForSpeechSeq2Seq.from_pretrained(
        model_id, torch_dtype=torch_dtype, use_safetensors=True, low_cpu_mem_usage=True,
    )
    model.to(device)

    processor = AutoProcessor.from_pretrained(model_id)

Initializing the Smash Config
-------------------------------

Next, initialize the ``SmashConfig``, selecting the compilers and attaching the processor loaded above.

.. code-block:: python

    from pruna_engine.SmashConfig import SmashConfig

    # Initialize the SmashConfig
    smash_config = SmashConfig()
    smash_config['compilers'] = ['ws2t', 'c_whisper']
    smash_config['processor'] = processor
    # Uncomment the following line to quantize the model weights to 8 bits
    # smash_config['weight_quantization_bits'] = 8
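
To get a feel for what the optional 8-bit setting can change, the sketch below estimates the memory taken by the model weights alone at different bit widths. The helper ``estimate_weight_memory_gb`` is hypothetical and not part of ``pruna``. The parameter count of roughly 1.55 billion for Whisper large is an approximation, and activations, buffers, and runtime overhead are ignored.

.. code-block:: python

    def estimate_weight_memory_gb(num_params: int, bits_per_weight: int) -> float:
        """Return the approximate weight storage in gigabytes (10**9 bytes)."""
        if num_params < 0 or bits_per_weight <= 0:
            raise ValueError("num_params must be >= 0 and bits_per_weight > 0")
        return num_params * bits_per_weight / 8 / 1e9

    # Approximate parameter count for Whisper large (assumption for illustration)
    WHISPER_LARGE_PARAMS = 1_550_000_000

    for bits in (32, 16, 8):
        size_gb = estimate_weight_memory_gb(WHISPER_LARGE_PARAMS, bits)
        print(f"{bits:>2}-bit weights: ~{size_gb:.2f} GB")

Halving the bit width halves the weight storage in this estimate. The savings observed in practice depend on how the quantization is actually applied.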

Smashing the Model
------------------

Now, smash the model.

.. code-block:: python

    from pruna.smash import smash

    # Smash the model
    smashed_model = smash(
        model=model,
        api_key='<your-api-key>',  # replace <your-api-key> with your actual API key
        smash_config=smash_config,
    )

Don't forget to replace ``<your-api-key>`` with the API key provided by PrunaAI.
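
To avoid hardcoding the key in a script or notebook, one option is to read it from an environment variable. This is a minimal sketch: the variable name ``PRUNA_API_KEY`` and the helper ``get_api_key`` are assumptions made for illustration, not names defined by ``pruna``.

.. code-block:: python

    import os

    def get_api_key(env_var: str = "PRUNA_API_KEY") -> str:
        """Read the API key from the environment and fail early if it is missing."""
        key = os.environ.get(env_var, "").strip()
        if not key:
            raise RuntimeError(
                f"Set the {env_var} environment variable to your API key."
            )
        return key

    # Usage with the call shown above (hypothetical variable name):
    # smashed_model = smash(
    #     model=model,
    #     api_key=get_api_key(),
    #     smash_config=smash_config,
    # )
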

Preparing the Input
-------------------

Download a sample audio file. The command below saves it in the current directory as ``sam_altman_lex_podcast_367.flac``.

.. code-block:: bash

    wget https://huggingface.co/datasets/reach-vb/random-audios/resolve/main/sam_altman_lex_podcast_367.flac
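
As an alternative to ``wget``, the same file can be fetched from Python with the standard library. The helper ``download_if_missing`` below is a hypothetical convenience function, not part of ``pruna``. It derives the file name from the URL and skips the download when the file already exists.

.. code-block:: python

    import urllib.request
    from pathlib import Path
    from urllib.parse import urlparse

    AUDIO_URL = (
        "https://huggingface.co/datasets/reach-vb/random-audios/"
        "resolve/main/sam_altman_lex_podcast_367.flac"
    )

    def download_if_missing(url: str, dest_dir: str = ".") -> str:
        """Download url into dest_dir unless the file is already there."""
        filename = Path(urlparse(url).path).name
        dest = Path(dest_dir) / filename
        if not dest.exists():
            urllib.request.urlretrieve(url, str(dest))
        return str(dest)

    # Usage (performs a network download on the first call):
    # audio_sample = download_if_missing(AUDIO_URL)
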

Running the Model
-----------------

Finally, run the model to transcribe the audio file.

.. code-block:: python

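    # Path to the audio file downloaded in the previous step
    audio_sample = 'sam_altman_lex_podcast_367.flac'

    # Transcribe the audio and display the result
    print(smashed_model(audio_sample))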
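
To check what smashing bought you, it can help to time the call. The sketch below is a generic wall-clock timer built on the standard library. ``time_call`` is a hypothetical helper, and it assumes only that the object being timed is callable on the audio file path, as in the step above.

.. code-block:: python

    import time
    from statistics import mean

    def time_call(fn, *args, repeats: int = 3, **kwargs):
        """Call fn `repeats` times; return (last_result, mean_seconds_per_call)."""
        if repeats < 1:
            raise ValueError("repeats must be at least 1")
        durations = []
        result = None
        for _ in range(repeats):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            durations.append(time.perf_counter() - start)
        return result, mean(durations)

    # Usage with the objects from this tutorial:
    # result, seconds = time_call(smashed_model, 'sam_altman_lex_podcast_367.flac')
    # print(f"Mean latency: {seconds:.2f} s")

The first call may include one-time warm-up costs, so consider discarding it or raising ``repeats`` when comparing against the original model.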

Wrap Up
---------

Congratulations! You have successfully smashed an ASR model. You can now use the ``pruna`` package to optimize any custom ASR model. The only parts you need to modify for your use case are the first step (loading the ASR model) and the last step (running the model).