Azure Speech-to-Text Pipeline for Sales & Customer Service
An automated pipeline for transcribing calls and performing sentiment analysis. Azure Speech streams real-time transcriptions, then an LLM analyzes the call and generates a report on agent performance.
I used a detailed system prompt to control the LLM output, plus the max_tokens parameter to adjust report length. Caching prevents redundant API calls when the same file gets uploaded twice.
Terminal demo:
Streamlit interface with real-time transcription streaming:
Pairing specialized Azure services with LLMs works well. Azure Speech handles transcription, the LLM adds analysis. Together they do more than either could alone.
Real-time streaming makes a big difference in user experience. Watching transcription appear word-by-word feels more responsive than waiting for batch processing. The caching was a practical addition that saved API costs on duplicate uploads.