3 Influential Benefits of Language Identification

Language Identification is the next step in our everlasting quest to understand every voice. Working with our Autonomous Speech Recognition (ASR), this new feature detects and labels the primary language in an audio file and attaches a confidence score, which makes it useful across a range of use cases.

Our current Global-First language approach already reduces the need to know up front which accent or dialect is being spoken. Language Identification builds on that by removing the need to manually select the language used for transcription. This helps you get the most value from media in different languages and avoids the inaccurate transcripts that result from processing a file with the wrong language.

We’re understandably excited about what this means for speech-to-text. So much so, we decided to outline the three most influential benefits of Language Identification (in terms of natural language processing) for everyone to see.

Benefit 1: News Becomes Even More Accessible

2020 migration statistics from the House of Commons Library state that 6.2 million people living in the UK have the nationality of a different country, while 9.5 million people living in the UK were born abroad. The UK is just one example, however. As the world becomes increasingly globalized, the way we communicate must improve.

To help everyone better understand the globalized world we live in, features like automated captioning are fast becoming critical tools for making news and other media accessible to all.

Adding Language Identification brings true accessibility in media closer. Media companies and broadcasters hold vast audio archives whose languages are mostly unknown. Instead of manually sifting through hours of speech and relying on human interpretation to label it, they can use Language Identification to determine the language before transcription.

As a result, broadcasters can shed light on important issues that would otherwise be lost and that could help a lot of people.

Benefit 2: Contact Centers Become More Efficient

A business’s contact center is fast becoming one of the core cogs in the customer experience machine. After all, a 2019 report found that 72% of consumers would likely switch companies after a single unpleasant experience.

To prevent this from happening, contact centers extract large quantities of data and use it to improve the customer experience. However, in a world with thousands of languages, relying on human interpretation leaves room for error and a potential loss of business.

Identifying the predominant language in an audio file helps contact centers extract more accurate data. From there, specific issues and promising target groups can be addressed individually, narrowing problems down and making it less likely that customers are lost.

Benefit 3: Speeding Up Your Workflows

As we've touched on, adding Language Identification to our ASR further reduces the need for human intervention. For example, an unknown audio file can be dealt with almost immediately by automatically identifying its predominant language, rather than having someone review it manually.

We see high accuracy in the predictions made by Language Identification but, as with all machine learning, mistakes are possible. The confidence score provided by Language Identification makes it possible to catch files where the prediction is uncertain and, in these scenarios only, trigger a manual human review.
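To make the confidence-based review idea concrete, here is a minimal Python sketch. It is an illustration only, not the Speechmatics API: the `LanguagePrediction` shape, the `route` helper, the file names, and the 0.8 threshold are all hypothetical, and it assumes the confidence score is a number between 0 and 1.

```python
from dataclasses import dataclass

# Hypothetical cut-off: assumes confidence is reported on a 0-1 scale.
# Tune this value against your own audio before relying on it.
REVIEW_THRESHOLD = 0.8


@dataclass
class LanguagePrediction:
    """Hypothetical result shape: a language code plus a confidence score."""
    language: str
    confidence: float


def route(prediction: LanguagePrediction, threshold: float = REVIEW_THRESHOLD) -> str:
    """Return "transcribe" for a trusted prediction, otherwise "manual_review"."""
    if prediction.confidence >= threshold:
        return "transcribe"
    return "manual_review"


# Made-up example predictions for two files.
predictions = {
    "call_001.wav": LanguagePrediction("en", 0.97),
    "call_002.wav": LanguagePrediction("pt", 0.55),
}

decisions = {name: route(pred) for name, pred in predictions.items()}
print(decisions)
```

With a rule like this, only the low-confidence file would be sent to a person, while the rest go straight to transcription in the predicted language.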

Language Identification Is the Next Step for Our ASR

Our award-winning ASR is constantly being improved by our talented teams across the globe. Language Identification is the latest feature to change the way we use speech-to-text. We’ve covered just three of the benefits, so here are a few quick-fire facts you might find useful:

Names the predominant language in any media file and can be used with pre-recorded files (Batch).
Works for 12 supported languages.
Adds a confidence score to show certainty of the predominant language.

And before we go, we couldn’t leave without listing the supported languages:

English (en)
German (de)
Spanish (es)
French (fr)
Hindi (hi)
Italian (it)
Japanese (ja)
Korean (ko)
Mandarin (cmn)
Dutch (nl)
Portuguese (pt)
Russian (ru)
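
If you want to check an identified language against this list in your own tooling, the codes above can be kept in a simple lookup table. The Python sketch below is an assumption about how you might do that: the `SUPPORTED_LANGUAGES` name and the `is_supported` helper are hypothetical and not part of any Speechmatics library.

```python
# The 12 languages listed above, keyed by the codes shown in brackets.
SUPPORTED_LANGUAGES = {
    "en": "English",
    "de": "German",
    "es": "Spanish",
    "fr": "French",
    "hi": "Hindi",
    "it": "Italian",
    "ja": "Japanese",
    "ko": "Korean",
    "cmn": "Mandarin",
    "nl": "Dutch",
    "pt": "Portuguese",
    "ru": "Russian",
}


def is_supported(code: str) -> bool:
    """Hypothetical helper: True if the code is one of the listed languages."""
    return code.strip().lower() in SUPPORTED_LANGUAGES


print(is_supported("cmn"))
print(is_supported("ar"))
```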

Stuart Wood, Product Manager, Speechmatics

Sep 7, 2022 | Read time 4 min
