Oct 26, 2021

Speechmatics achieves AI breakthrough, beating tech giants in tackling bias and inclusion to understand all voices

Speechmatics launches pioneering Autonomous Speech Recognition software, outperforming Amazon, Apple, Google, and Microsoft.
Speechmatics Editorial team
Cambridge company’s pioneering self-supervised learning technology reduces speech recognition errors for African American voices by 45% versus Amazon, Apple, Google, and Microsoft.

Speechmatics, the leading speech recognition technology scaleup, has today launched its ‘Autonomous Speech Recognition’ software. Using the latest techniques in deep learning and with the introduction of its breakthrough self-supervised models, Speechmatics outperforms Amazon, Apple, Google, and Microsoft in the company’s latest step towards its mission to understand all voices.

Based on datasets used in Stanford’s ‘Racial Disparities in Speech Recognition’ study, Speechmatics recorded an overall accuracy of 82.8% for African American voices compared to Google (68.6%) and Amazon (68.6%). This level of accuracy equates to a 45% reduction in speech recognition errors – the equivalent of three words in an average sentence. Speechmatics’ Autonomous Speech Recognition delivers similar improvements in accuracy across accents, dialects, age, and other sociodemographic characteristics.
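
The 45% figure is consistent with the accuracy numbers above if the error rate is taken as 100% minus accuracy. That reading is an assumption here, since the announcement does not spell out the calculation. A minimal Python sketch of the arithmetic, with an illustrative function name:

```python
def relative_error_reduction(baseline_accuracy: float, new_accuracy: float) -> float:
    """Return the fractional drop in error rate when moving from the
    baseline system to the new one. Accuracies are percentages (0-100).

    Assumption: error rate = 100 - accuracy.
    """
    baseline_error = 100.0 - baseline_accuracy
    new_error = 100.0 - new_accuracy
    return (baseline_error - new_error) / baseline_error


# Accuracy figures quoted above for African American voices.
competitor_accuracy = 68.6    # quoted for both Google and Amazon
speechmatics_accuracy = 82.8

reduction = relative_error_reduction(competitor_accuracy, speechmatics_accuracy)
print(f"{reduction:.1%}")  # prints 45.2%
```

Under this reading, errors fall from 31.4% to 17.2% of words, which is the roughly 45% relative reduction quoted.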

Until now, misunderstanding in speech recognition has been commonplace because of the limited amount of labeled data available for training. Labeled data must be manually ‘tagged’ or ‘classified’ by humans, which limits not only the amount of data available for training but also the representation of all voices. With this breakthrough, Speechmatics’ technology is trained on huge amounts of unlabeled data direct from the internet, such as social media content and podcasts. By using self-supervised learning, the technology is now trained on 1.1 million hours of audio – an increase from 30,000 hours. This delivers a far more comprehensive representation of all voices and dramatically reduces AI bias and errors in speech recognition.

Speechmatics also outperforms competitors on children’s voices – which are notoriously challenging to recognize using legacy speech recognition technology. Speechmatics recorded 91.8% accuracy compared to Google (83.4%) and Deepgram (82.3%) based on the open-source project Common Voice.

Katy Wigdahl, CEO of Speechmatics, comments:

“We are on a mission to deliver the next generation of machine learning capabilities and through that offer more inclusive and accessible speech technology. This announcement today is a huge step towards achieving that mission.

“Our focus on tackling AI bias has led to this monumental leap forward in the speech recognition industry and the ripple effect will lead to changes in a multitude of different scenarios. Think of the incorrect captions we see on social media, court hearings where words are mistranscribed and eLearning platforms that have struggled with children’s voices throughout the pandemic. Errors people have had to accept until now can have a tangible impact on their daily lives.”

Allison Zhu Koenecke, Lead Author of the Stanford study on speech recognition:

“It’s critical to study and improve fairness in speech-to-text systems given the potential for disparate harm to individuals through downstream sectors ranging from healthcare to criminal justice.”
