The Road to True Equity in Speech Recognition

Witnessing high levels of ambition is one of the most inspiring things about working with technologists who lead their field. Data scientists, engineers and programmers don’t see barriers; they see challenges that, with enough time and expertise, can be overcome.

The AI bias conundrum is one such challenge, and one we’re all too familiar with. Since the dawn of AI, machines have been trained on a certain type of (limited) data. That meant that, even as we started to see impressive technical breakthroughs, the beneficiaries of these new frontiers were limited to the specific profiles represented in the datasets.

Speechmatics has always been proud of our market-leading position, on both the technical and commercial sides. We have an impressive roster of customers and partners, have always topped accuracy reports, and have held our own, often winning, in competitive tenders against the more established Big Tech providers. But still, these steps towards the future have never quite felt big enough.

In today’s society, it’s absolutely critical that everyone is understood. For us, there’s no point driving investment into technologies that are being mass adopted but are biased against so many.

We saw the challenge. In the traditional approach to building speech recognition, all speech data has to be ‘labeled’ (tagged and classified). It’s a laborious and expensive task, which means good data is hard to come by and seriously lacking in representation of the full gamut of speech variations, such as accent, dialect, age or any other sociodemographic characteristic. We were as accurate as we could be on the data available, and were making only marginal gains. So, we changed the game.

Last week, we launched our new Autonomous Speech Recognition software, which uses the latest techniques in deep learning along with breakthrough self-supervised models. That means the data limits are lifted and we can train on huge amounts of ‘unlabeled’ data from the internet, including internet radio, news, and podcasts. This has completely changed the representation of voices, and means our technology now represents a plethora of voice characteristics for the first time.

I’m incredibly proud of this team and this innovation. Everyone has worked so hard under extraordinary circumstances – from the engineers to the commercial team – making this truly market-shaking breakthrough a reality for our customers. The road is long and there’s still much to do, but this first step on the road to true equity is a moment to celebrate.

Katy Wigdahl, CEO, Speechmatics

Nov 3, 2021 | Read time 2 min
