Aug 31, 2022 | Read time 2 min

Speechmatics launches Language Identification, allowing users to automatically determine the predominant language in a media file

Speechmatics Editorial Team

Latest addition to Speechmatics engine saves time on manually reviewing files and is applicable to a wide variety of use cases

Speechmatics, the leading autonomous speech recognition technology scaleup, has added Language Identification (Language ID) to its industry-leading speech-to-text engine. This latest addition allows customers to automatically identify the predominant language spoken in any media file. Customers will save time and effort on manually reviewing files, safe in the knowledge that they will be provided with an accurate transcription of any media file.

Language ID drives efficiency by removing the manual step of selecting which language pack to use when the language is not explicitly stated in the file. A frequently requested feature, it not only helps users identify unknown languages but also adds useful metadata about the language of the spoken audio. Media and broadcast organizations have extensive archives of audio whose content is often unknown. Instead of manually listening to hours of speech – and relying on human interpretation to label it – Speechmatics Language ID confirms the language before transcription. For contact centers, being able to identify the predominant language spoken (especially when callers switch languages) is a huge benefit to those conducting call analysis.

Speechmatics has built the most accurate and inclusive speech-to-text engine available. Historically, training data had to be manually tagged, classified or ‘labelled’. This has resulted in engines that have been trained on narrow datasets, which fail to represent the diversity of voices that use them. In contrast, Speechmatics’ speech-to-text engine is trained through exposure to hundreds of thousands of individual voices using millions of hours of unlabelled, more representative voice data. Speechmatics has applied this technique to identifying predominant spoken languages on a diverse set of audio data.

Commenting on the rollout of Language ID, CEO Katy Wigdahl said, "Up until now, identifying languages without human intervention has been costly and time-consuming for users of speech-to-text. With our new Language ID, this will be a thing of the past, allowing customers to swiftly identify and transcribe media files with less hassle and more efficiency. We can't wait for our customers to use Language ID and see it deliver accurate and valuable results."

This latest update can be used with pre-recorded media files, works with up to 12 languages and adds a confidence score to show the certainty of the predominant language. Supported languages are English, German, Spanish, French, Hindi, Italian, Japanese, Korean, Mandarin, Dutch, Portuguese, and Russian.
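To make the confidence score concrete, here is a minimal sketch of how a client application might pick the predominant language from a language-identification result. The JSON shape below (`language_identification`, `alternatives`, `language`, `confidence`) is an illustrative assumption for this example, not the documented Speechmatics response schema; consult the official API reference for the actual field names.

```python
import json

# Hypothetical language-identification result for a pre-recorded file.
# Each alternative pairs a language code with the engine's confidence
# that it is the predominant language. Field names are assumptions.
result = json.loads("""
{
  "language_identification": {
    "alternatives": [
      {"language": "en", "confidence": 0.94},
      {"language": "de", "confidence": 0.04},
      {"language": "fr", "confidence": 0.02}
    ]
  }
}
""")

def predominant_language(payload):
    """Return the (language, confidence) pair with the highest confidence."""
    alternatives = payload["language_identification"]["alternatives"]
    best = max(alternatives, key=lambda alt: alt["confidence"])
    return best["language"], best["confidence"]

lang, conf = predominant_language(result)
print(lang, conf)  # en 0.94
```

In practice the confidence value lets downstream pipelines decide automatically whether to trust the detected language or flag the file for human review, for example by routing anything below a chosen threshold to a manual queue.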
