LLM Integration in Automatic Speech Recognition

  • Combined scores from pretrained large language models (LLMs), obtained via masked language modeling, with Branchformer-based end-to-end (E2E) models to improve automatic speech recognition (ASR) performance.

Quick links:

Rescoring pipeline for improving ASR

[Figure: flowchart of the rescoring pipeline]

Hypotheses from the Branchformer-based ASR model are rescored with LLM scores and reranked to improve ASR performance.
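
The reranking step can be illustrated with a minimal Python sketch. It is not the project's actual implementation: the function names, the `lm_weight` value, the log-linear combination of the two scores, and the toy masked-LM stand-in are all assumptions made for illustration. The sketch scores each hypothesis with a pseudo-log-likelihood (each token masked in turn and scored given the rest), adds the weighted result to the ASR score, and sorts the N-best list.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: returns log P(tokens[i] | all other tokens),
# which is what a masked language model provides when position i is masked.
MaskedLogProb = Callable[[Sequence[str], int], float]


def pseudo_log_likelihood(tokens: Sequence[str], masked_logprob: MaskedLogProb) -> float:
    """Sum the log-probability of each token when it alone is masked."""
    return sum(masked_logprob(tokens, i) for i in range(len(tokens)))


def rerank(
    nbest: List[Tuple[str, float]],
    masked_logprob: MaskedLogProb,
    lm_weight: float = 0.3,  # assumed value; in practice tuned on held-out data
) -> List[Tuple[str, float]]:
    """Rescore (text, asr_log_score) pairs and return them best-first."""
    rescored = []
    for text, asr_score in nbest:
        lm_score = pseudo_log_likelihood(text.split(), masked_logprob)
        # Assumed combination rule: ASR score plus weighted LM score.
        rescored.append((text, asr_score + lm_weight * lm_score))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


def toy_masked_logprob(tokens: Sequence[str], i: int) -> float:
    """Toy stand-in for a real masked LM: favors a tiny fixed vocabulary."""
    likely_words = {"recognize", "speech"}
    return math.log(0.5) if tokens[i] in likely_words else math.log(0.01)


# Hypothetical N-best list: the ASR model slightly prefers the wrong hypothesis.
nbest = [("wreck a nice beach", -4.0), ("recognize speech", -4.5)]
reranked = rerank(nbest, toy_masked_logprob)
best_text, best_score = reranked[0]
print(best_text)
```

In a real pipeline, `toy_masked_logprob` would be replaced by a call into a pretrained masked language model, and the ASR scores would come from the Branchformer decoder's beam search. Because a summed pseudo-log-likelihood penalizes longer hypotheses, a length normalization or length bonus is often added alongside the LM weight.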

Slide Deck