26 June 2024

Summaries and Sentiment in real-time

Combine live transcription with other LLM-driven capabilities to provide additional insight and value before media has ended.
CCaaS and Media
Aaron Ng, Machine Learning Engineer

What is the feature and why is it useful?   

This feature:

  • Transcribes audio (in this case a phone call) in real-time.

  • Provides a written bullet point summary of the call that updates periodically.

  • Provides a color-coded sentiment indicator based on the current sentiment of the speaker in the audio.

In the example below, you can see a transcript being produced in real-time.

Using this transcript, a bullet point summary is also being populated as the audio progresses:

When the audio ends, a user therefore has a complete summary of the audio ready to go. 

We can also see a colored sentiment indicator, which reflects whether the conversation is positive (green), neutral (grey) or negative (red).

This sentiment can be used to provide an overall summary of the call, or be pushed in real-time to other systems, for example to power live dashboards.
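As a minimal sketch of the color coding described above, the mapping from a sentiment label to an indicator color could look like the following. The label strings and the function name `sentiment_color` are assumptions for illustration, as is the choice to fall back to grey for anything unrecognized; only the three label/color pairs come from the description above.

```python
# Colors as described above: positive = green, neutral = grey, negative = red.
SENTIMENT_COLORS = {
    "positive": "green",
    "neutral": "grey",
    "negative": "red",
}


def sentiment_color(label: str) -> str:
    """Return the indicator color for a sentiment label.

    Assumption: unknown or malformed labels fall back to grey (neutral),
    so an unexpected model response never breaks the indicator.
    """
    return SENTIMENT_COLORS.get(label.strip().lower(), "grey")
```

Normalizing the label before the lookup means a response such as "Negative " still resolves to red.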

For contact center agents, real-time call insights can be a huge time saver in post-call administration, and they also surface information during the call that can be shared with the wider team and management as it happens.

Agents spend a significant amount of time ‘wrapping up’ calls after they finish, which delays when they can take their next call. By reducing this, you effectively maximize the agent’s time on the phone speaking with customers and reduce the necessary but wasteful task of administration.

Beyond CCaaS, this removes the limitation of only being able to use transcriptions after the entire recording has finished, which opens up the ability to do things ‘in the moment’.

This is particularly useful for those who have to multi-task during customer calls, for example, but its applications are wider: workflows and features can be built on top of this live insight into the audio as it streams.

What technology does this use? 

  • Real-time transcription using Speechmatics’ Python SDK

  • Sentiments (powered by OpenAI)

  • Summaries (powered by OpenAI)

  • The UI/proof of concept was built using Gradio  

Are there any technical considerations? 

Summaries and sentiments both use OpenAI’s API. Beyond the API calls themselves, some configuration is needed to decide how often the summary and the sentiment are updated.
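One way to think about the update frequency is as a simple throttle: buffer transcript segments as they arrive and re-run the analysis at most once per interval. The sketch below is illustrative only and is not taken from the full write-up. `PeriodicUpdater`, `analyse`, and `min_interval_s` are hypothetical names, and `analyse` stands in for whatever function wraps the summary or sentiment request.

```python
import time
from typing import Callable, List, Optional


class PeriodicUpdater:
    """Buffers transcript segments and re-runs `analyse` at most once
    every `min_interval_s` seconds (hypothetical helper, not an SDK class)."""

    def __init__(
        self,
        analyse: Callable[[str], str],
        min_interval_s: float = 10.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.analyse = analyse            # e.g. a wrapper around an LLM call
        self.min_interval_s = min_interval_s
        self.clock = clock                # injectable so it can be tested
        self.segments: List[str] = []
        self.last_run: Optional[float] = None
        self.latest: Optional[str] = None
        self.dirty = False                # True when new text is unanalysed

    def add_segment(self, text: str) -> Optional[str]:
        """Add transcript text; return a fresh result if an update ran."""
        self.segments.append(text)
        self.dirty = True
        now = self.clock()
        if self.last_run is None or now - self.last_run >= self.min_interval_s:
            return self.flush()
        return None

    def flush(self) -> Optional[str]:
        """Force an update, e.g. when the call ends; returns the latest result."""
        if self.dirty:
            self.latest = self.analyse(" ".join(self.segments))
            self.last_run = self.clock()
            self.dirty = False
        return self.latest
```

A shorter interval keeps the summary and the sentiment indicator fresher at the cost of more API requests. Calling `flush()` when the audio ends ensures the final result covers the whole transcript.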

You can find a full write up with code examples here.
