VUX World Podcast with Kane Simms: Unlocking AI use cases with speech recognition

Enhancing Speech Recognition for the Next Generation of AI

Kane Simms' podcast, VUX World, is renowned for its captivating discussions with industry leaders in Conversational AI and Customer Experience.

He sat down with Ricardo Herreros-Symons, Speechmatics' VP of Corporate Development, to discuss voice-driven tech, conversational agents, live captioning, and the future of AI.

They talk about the impact of accurate speech recognition in transforming business operations and customer experiences.

Here is a snippet of their conversation:

Kane: How do you approach solving the generalized speech recognition problem and what is the importance of incredibly accurate speech technology?

Ricardo: Accurate speech recognition is crucial because it ensures that we can understand every single voice, regardless of accent or background noise, which significantly enhances user interactions and accessibility. By improving the quality and efficiency of speech-to-text conversion, we can open up new use cases and make interactions with technology more seamless and natural.

Watch the full podcast here

Want to skip to the best bits?

Two standout moments from the podcast include:

How Speechmatics can even understand a lightning-fast rap about legal drugs in the UK (5:00-5:30mins into the video) 💊
An incredibly impressive Rafael Nadal impression following his early departure from the French Open (6:30-7:30mins into the video) 🎾

We've also broken down the pod into digestible chapters using our own tech, so you can find the specific sections that you're interested in...

(00:00:00) Unparsed Conference Introduction The speaker greets the audience and introduces the upcoming Unparsed conference, which is three weeks away. The event is anticipated to be larger than the previous year, with a new track for developers and a focus on conversational and generative AI. The speaker encourages attendance, mentioning available promo codes and the event's aim to unite AI communities for sharing best practices and insights.

(00:02:32) Demonstrating Speechmatics' Technology Ricardo Herreros-Symons from Speechmatics is introduced and proceeds to demonstrate the company's speech recognition technology. The demonstration showcases the system's ability to transcribe speech accurately in real time, handle complex vocabulary, and understand different accents and languages. The speaker emphasizes Speechmatics' focus on accuracy and low latency in their speech recognition solutions.

(00:03:11) The Role of Speech Recognition in AI The speaker inquires about Speechmatics and its role in AI. Ricardo explains the company's focus on turning audio into text and their differentiation in the market through high accuracy and broad vocabulary. He discusses the company's approach to building efficient models that can understand diverse voices and accents, and the importance of speech recognition in enabling various AI applications.

(00:08:45) Challenges and Future of Speech Recognition The conversation explores the future challenges in speech recognition, such as end-of-speech detection and the potential of audio language modeling. The speaker and Ricardo discuss the importance of creating natural and seamless interactions with AI agents, the need for efficient models that can run on devices, and the impact of speech recognition on emerging use cases like smart glasses and robotics.

(00:52:06) Data Management and Privacy in AI The speaker addresses the critical issue of data management and privacy in AI, particularly in speech recognition. Ricardo discusses Speechmatics' approach to sourcing training data, ensuring customer data privacy, and the company's preference for on-device processing to mitigate privacy concerns. He highlights the importance of building efficient models that respect user privacy and reduce environmental impact.

Astonishingly accurate ASR is here, in real time.

What are you waiting for?

Jun 5, 2024 | Read time 3 min
