Dec 16, 2025 | Read time 6 min

Speechmatics sets new standard for real-time medical transcription with German and Nordic roll-out

New German, Danish and Norwegian Medical Models deliver up to 50% lower error rates, strong real-time accuracy at under one second of latency, and full deployment flexibility.
Speechmatics Editorial Team

TL;DR — Key Takeaways:

  • Launched: Medical Models for German, Danish, Norwegian (now 7 languages total).

  • Impact: Up to 50% lower error rates on medical speech; built for real-time use.

  • Deploy anywhere: SaaS, private cloud, or on-prem for regulated healthcare.

Speechmatics today expanded its Medical Model with German, Danish and Norwegian, bringing the total language count to seven alongside its industry-leading English model.

The expansion, trained on over 2 billion words of medical data, delivers significant accuracy improvements and gives healthcare organizations deployment choice across on-premises, private cloud and SaaS infrastructure.

Each new language undergoes rigorous testing and optimization before release, ensuring the models can handle the demands that define real healthcare environments.

Trained on 2 billion words of medical data

The new language models are trained on over 2 billion words of medical conversations, clinical documentation and healthcare interactions, adding to the 14 billion words of medical data in Speechmatics' existing models. 

This training scale enables the models to understand the complexity of real clinical environments: rapid multi-speaker dialogue, medical abbreviations, drug dosages, and diverse accents.

In practice, that means handling what generic speech recognition systems miss: the difference between "hypertension" and "hypotension" in a noisy emergency room, a pharmaceutical name spoken with a regional accent, or overlapping speech between clinician and patient during a consultation.

The result is accuracy that changes clinical workflows.

Accuracy improvements for new language additions

The three new models demonstrate substantial Keyword Error Rate (KWER) reductions on our specially tailored medical keyword test set. This test set was designed to evaluate our models on challenging terminology across a broad range of scenarios.

On average, the new models reduce Word Error Rate (WER) on medical test sets by around 30–50% across German, Danish and Norwegian compared with previous Speechmatics models. Their WER is also around 5–20% lower than that of the closest evaluated competitor on medical test sets for most languages.
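To make the percentages above concrete, here is a minimal sketch of how WER and a relative error reduction are conventionally computed. It assumes the standard definition of WER (word-level edit distance divided by the number of reference words) and simple whitespace tokenization; the article does not describe Speechmatics' own evaluation pipeline, and the function names and example sentences are illustrative only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level edit distance divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(old_rate: float, new_rate: float) -> float:
    """Fraction by which new_rate is lower than old_rate."""
    if old_rate <= 0:
        raise ValueError("old_rate must be positive")
    return (old_rate - new_rate) / old_rate


# Illustrative sentences, not taken from any Speechmatics test set:
# one substitution plus one deletion over 7 reference words.
wer = word_error_rate(
    "patient has hypertension and takes ramipril daily",
    "patient has hypotension and takes ramipril",
)
```

Under this definition, a "30–50% lower" error rate is a relative figure: a hypothetical model that moves from 12% to 8% WER has reduced its error rate by about a third, not by four percentage points.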

German shows one of the most notable uplifts, with error rates reduced by roughly a third versus Speechmatics' previous German Enhanced model on internal medical tests. That improvement is critical in a language dense with compound terms and specialist vocabulary, where a single misplaced token can change clinical meaning.

These numbers position Speechmatics ahead of evaluated competitors on medical test sets, with the German Medical Model showing particularly strong performance. 

These accuracy gains translate directly to fewer corrections, cleaner EHR integration and reduced friction in patient interactions. 

Our newest medical models achieve the following medical Keyword Error Rates (KWER):

Language  | KWER
German    | 5.43
Danish    | 6.17
Norwegian | 8.02

Nordic expansion strengthens regional coverage

The addition of Danish and Norwegian builds on Speechmatics' Nordic medical coverage alongside Finnish, enabling providers across the region to standardize on a single Voice AI platform while working in their native languages.

"The Nordic healthcare market is moving fast on Voice AI adoption, and they expect technology that works without compromise. That requires the rigorous testing and optimization that made our English Medical Model the industry benchmark. We don't compromise on quality when we add new languages."

Yahia Abaza, Product Manager, Speechmatics

Deployment flexibility: on your infrastructure and terms

The new multilingual Medical Model is available across on-premises, private cloud and SaaS infrastructure, giving healthcare organizations the flexibility to choose the deployment model that fits their compliance requirements, IT infrastructure and operational priorities.

This flexibility has proven critical for Speechmatics' expanding global medical client base. Whether a healthcare provider in Germany needs on-premises deployment for data residency, a telehealth platform in Spain wants private cloud, or an AI scribe company in the Netherlands prefers cloud-native SaaS, organizations can adopt Voice AI without compromising on either performance or their specific regulatory and operational requirements.

Real-time first, built for the pace of care

Real-time performance is at the center of the release. The medical models are designed to power live ambient scribes, telehealth, clinical contact centers and in-room assistants without forcing developers to trade accuracy for latency.

At latencies above typical real-time thresholds, the models remain close to batch accuracy, and at under one second they perform strongly compared with competing systems. That allows clinicians to see transcripts and summaries emerge as they speak, while back-office workflows can use the same models for high-volume file processing.

"Ambient AI only helps if it keeps up with real clinical conversations. We built these models for fast, overlapping dialogue, non-native speakers, accents and imperfect audio, not just clean test clips. Real-time is the default use case, not an afterthought."

Stuart Wood, Senior Product Manager, Speechmatics

Built on proven industry-leading clinical accuracy

The seven language models build on the foundation of Speechmatics' English Medical Model, which set industry benchmarks in September 2025: 93% general real-time accuracy (7% WER), 96% medical keyword recall, and a keyword error rate 50% lower than the nearest competitor.
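The relationship between the figures above can be sketched in a few lines. This assumes that "accuracy" is reported as 1 minus WER, which is consistent with the 93% / 7% pairing, and that keyword recall means the share of reference keyword occurrences that also appear in the transcript. The second definition is an assumption for illustration; the article does not spell out how Speechmatics computes keyword recall, and the function names and keyword list here are hypothetical.

```python
from collections import Counter


def accuracy_from_wer(wer: float) -> float:
    """Accuracy as the complement of word error rate (assumed convention)."""
    return 1.0 - wer


def keyword_recall(reference: str, hypothesis: str, keywords: set[str]) -> float:
    """Share of reference keyword occurrences also present in the hypothesis.

    Simplified, assumed definition: matches by count, ignoring word position.
    Keywords are expected in lowercase.
    """
    ref_counts = Counter(w for w in reference.lower().split() if w in keywords)
    hyp_counts = Counter(w for w in hypothesis.lower().split() if w in keywords)
    total = sum(ref_counts.values())
    if total == 0:
        raise ValueError("reference contains none of the keywords")
    # A keyword occurrence counts as found only if the hypothesis has
    # at least as many occurrences of that same word.
    found = sum(min(count, hyp_counts[word]) for word, count in ref_counts.items())
    return found / total


# Illustrative keyword list and sentences, not from any real test set.
medical_keywords = {"hypertension", "ramipril"}
recall = keyword_recall(
    "patient has hypertension and takes ramipril",
    "patient has hypotension and takes ramipril",
    medical_keywords,
)
```

In this toy case one of the two keyword occurrences is transcribed correctly, so recall is one half even though only one word in six is wrong. That gap is why keyword-focused metrics are reported alongside general WER for clinical speech.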

All models are optimized using NVIDIA GPU infrastructure, delivering the same level of performance across languages and handling the full complexity of clinical environments. Whether processing real-time ambient scribes or high-volume batch transcription, the models maintain consistent accuracy without forcing organizations to choose between speed and precision.

Availability

The expanded Medical Model with German, Danish and Norwegian is now available for production use. Access is available through:

  • Speechmatics Portal for direct testing and evaluation

  • API integration for production deployment across real-time and batch workflows

  • On-premises and private cloud deployment for regulated healthcare environments

Healthcare technology partners can begin testing today. For more information, to provide feedback, or to schedule a technical demo, visit Speechmatics’ website or contact the team directly.
