Nov 19, 2020 | Read time 4 min

Solving the speech recognition accent gap with Global English

Speechmatics Editorial Team

Speechmatics is solving the widely criticized speech recognition accent gap when it comes to transcribing multiple English accents and dialects.

Global availability of speech recognition is a requirement

The demand for voice technology is growing fast – as businesses seek to improve efficiency and provide better services to their customers, and consumers desire the latest voice-enabled products. The pressure is on to serve more markets, geographies and people than ever before.

But it's not just a question of using machine learning to train voice technology systems to understand different languages, although our technology does that, of course – incorporating more than 30 speech recognition languages.

The real challenge lies in coping with the endless variations of a single language – everything from different regional accents to idiosyncratic use of grammar and vocabulary. In extreme cases, these variations can even lead to a breakdown in communication between speakers of the same language. So, it's not surprising that they present a significant challenge for speech recognition technology.

Since their launch, virtual personal assistants such as Siri and Alexa have faced well-documented issues with certain English language accents, particularly Scottish and Irish. This has led to many users being forced to modify their speech patterns to be understood – adapting their voices to the technology.

At Speechmatics, we believe it should be the technology that adapts to the user. That's why our any-context speech recognition engine can cope with any English speaker – no matter their accent or dialect.

The traditional approach to accent and dialect variations

Traditionally, speech recognition providers have dealt with significant variations of accents and dialects by producing different, customized language packs to ensure accuracy. This time-consuming and laborious process involves different sets of models trained on data from each particular subset of speakers.

For automatic speech-to-text vendors, this creates additional complexity: they must manage an extensive and growing number of variants for each language they support, which slows innovation and delays new releases of their language packs.

For customers, the traditional approach causes issues when it comes to accurately transcribing multiple speakers with different accents. Take an interview in English between an Australian and an American: two transcriptions would need to be run – one using the Australian-English language model and one using the American-English model. This is both costly and slow.
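To make the cost difference concrete, here is a toy sketch – the `transcribe` function is a hypothetical stand-in that only counts invocations, not a real vendor API – showing how many recognition passes each approach needs for a two-accent recording:

```python
calls = {"count": 0}

def transcribe(audio, language_pack):
    """Stand-in for a speech-to-text call; it only counts invocations."""
    calls["count"] += 1
    return f"<transcript of {audio} via {language_pack}>"

audio = "interview.wav"

# Traditional approach: one pass per accent-specific language pack.
for pack in ("en-AU", "en-US"):
    transcribe(audio, pack)
traditional_passes = calls["count"]

calls["count"] = 0

# Global English: a single pass covers every speaker on the recording.
transcribe(audio, "en")
global_english_passes = calls["count"]

print(traditional_passes, global_english_passes)  # 2 1
```

With more accents on the call, the traditional pass count grows linearly while the single-pack approach stays at one.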

The pioneering Speechmatics approach to speech recognition languages

Speechmatics is the first and only company to do away with creating multiple language packs for different accents and dialects. Our unique approach involves using machine learning to create a single, comprehensive language pack, accurately encompassing as many variations of English as possible. For most real-world applications, this gives the most reliable, accurate and efficient performance for our customers and partners.

By implementing a new accent-independent approach – harnessing recent advances in machine learning and data gathering – we have simplified the traditional workflow, dramatically improving accuracy and ROI while reducing complexity and time to market.

Our Global English language pack encompasses all major English accents and dialects. It's the result of Automatic Linguist – our unique machine learning framework that is capable of learning new languages quickly. The technology was a winner in the Innovation category of the 2019 Queen's Awards for Enterprise.

Real-world benefits of the Speechmatics Global English language pack

For businesses with staff and customers across the world, it is not always possible or effective to select a single accent-specific language pack. Customers contacting national contact centers have a broad range of accents; call monitoring of multinational workforces must decipher numerous different forms of accented English; and live TV interviews feature guests from across the world.

Our single, multi-use Global English solution means speech-to-text users do not need to identify which English variant is being spoken. It solves the problem of audio featuring multiple speakers, each with a different accent – or where speaker accents are not known in advance.

In one comprehensive language pack, it provides reliable results over a broad range of speakers – without having to run audio files through a speech recognition engine multiple times to capture all the different accents using accent-specific language packs.

With Global English, you can also control the output by specifying rules to select either American or British spellings. And by focusing resources on maintaining and updating fewer speech recognition language models, Speechmatics can increase quality, improve accuracy and ensure reliability.
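As an illustration, a job configuration along these lines would request Global English with British spellings. The field names (`language`, `output_locale`) follow our understanding of the Speechmatics batch API; check the current API reference before relying on them:

```python
import json

# Hedged sketch of a transcription job config: "en" selects the single
# Global English pack, "output_locale" picks the spelling convention.
config = {
    "type": "transcription",
    "transcription_config": {
        "language": "en",          # one pack, all English accents
        "output_locale": "en-GB",  # or "en-US" for American spellings
    },
}

print(json.dumps(config, indent=2))
```

Swapping `"en-GB"` for `"en-US"` changes only the rendered spellings, not which acoustic or language model is used.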

Global English not only delivers simplified deployment capabilities, it also leads the market in accuracy against English models designed for specific accents and dialects.

Fast, accurate, reliable and now more flexible, convenient and inclusive, Global English offers users speech recognition for the future.
