Apr 18, 2023 | Read time 3 min

Speechmatics to launch pioneering real-time speech translation capabilities in 69 language pairs

Speechmatics, the leading speech recognition technology scaleup, unveils plans to amplify its real-time transcription capabilities by providing real-time speech translation in an all-in-one API.
Real-Time Translation
Speechmatics Editorial Team

The offering will integrate real-time translation with Speechmatics’ industry-beating real-time transcription in an all-in-one API. Breaking down language barriers enables more people to consume content regardless of industry and unlocks the ability to automatically translate live content from multiple regions. The combined offering lets customers use the world’s most accurate speech-to-text engine and translate speech for 69 language pairs*.

Real-time translation arrives a month after Speechmatics’ launch of Ursa – the world’s most accurate speech-to-text engine, which is 25% more accurate than OpenAI’s Whisper and 38% more accurate than Google. Speechmatics has built on these capabilities to develop real-time translation, offering language pairs to and from English*, including German, Spanish, and Vietnamese. The all-in-one API can also translate into multiple languages in one request – for example, a single audio stream can provide real-time English transcription and translation to Japanese, French, Hindi, Mandarin, and Korean simultaneously.

Speechmatics’ real-time transcription, and now translation, deliver the same level of accuracy as its pre-recorded (batch) service and provide a sliding scale that lets customers tailor the balance of speed (latency) and accuracy to meet their needs. By combining real-time transcription and translation, the all-in-one API streamlines processes and speeds up workflows for businesses.

Businesses can reach a wider geographical audience across industries where real-time translation has previously been a challenging and costly manual task. For the broadcast industry in particular – valued at over $300 billion in the US alone in 2022 – generating fast, highly accurate translations through one API unlocks the ability to caption live-streamed content and news for viewers around the world. Similarly, contact centres, where scale is essential, can handle multiple languages using cost-effective automation technology and offer improved customer experiences in customers’ native languages.

Damir Derd, Head of Sales Engineering at Speechmatics, said, “This is a landmark development for speech recognition technology, and we are proud to remain at the forefront of innovation, demonstrating the commitment to our mission to understand every voice. This new offering opens up a truly global market for our customers with almost instant translation from the spoken word. As demand from viewers in different regions increases for TV shows and broadcast, sports, events, podcasts, game streaming, YouTube and social media videos, the need for captioned videos in multiple languages has too. We are excited to launch this capability to our customers in the next few weeks and will be continuing to work towards adding even more languages and enabling the engine to translate between languages, so the default isn’t always English.”

Ken Frommert, President of ENCO, said, “Speechmatics provides the most accurate speech-to-text on the market for pre-recorded files and live streams. Adding real-time translation to its all-in-one API is game-changing for live broadcast captions. The ability to not only transcribe but now leverage Speechmatics to translate in real-time to provide highly accurate captions globally.”

Real-time translation will be demoed at NAB Show, Booth N2960, 16th - 19th April 2023, and will launch later this month. Sign up for free early access and the early bird offer here.

*Also includes Bokmål > Nynorsk language pair.
