Jun 17, 2025 | Read time 4 min

Speechmatics launches Medical Model for real-time clinical transcription

With 98% accuracy, our new model is twice as good as the nearest competitor.
Speechmatics Editorial team

The potential of AI-powered medical transcription has always been transformative: reduce screen time, streamline documentation, and free clinicians to focus on patients instead of keyboards.

This promise is driving explosive growth as new players outcompete legacy technology while improving the doctor-patient experience. The global market is projected to reach $8.41 billion by 2032, and 30% of US healthcare providers are expected to adopt ambient scribes by the end of 2025.

But most models crumble under the pressure of real clinical environments, mishearing drug names, mangling ICD codes, and struggling with the rapid-fire complexity of medical conversations.

Speechmatics' new Medical Model is built to change this. 

As the leader in the real-time streaming speech-to-text (STT) space, Speechmatics has released this domain-specific model for HealthTech partners requiring advanced accuracy, particularly across broad medical terminology and in clinical environments where live note-taking and dictation are critical.

Already outperforming the competition

Speechmatics has consistently outperformed specialist medical model providers across a broad range of languages in both file-based and streaming modes.

The new Medical Model delivers 98% accuracy – twice as good as the next best option – positioning Speechmatics further ahead of other live streaming STT providers.
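For readers who want to sanity-check figures like these, here is a minimal sketch of how transcription accuracy is commonly derived from word error rate (WER). It assumes that accuracy means 1 minus WER and that "twice as good" means half the error rate; the article does not define either term, so both readings are assumptions. The function names and the 4% baseline below are hypothetical, chosen only for illustration, and are not Speechmatics figures or APIs.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def accuracy_from_wer(wer: float) -> float:
    """Assumed convention: accuracy is the complement of WER."""
    return 1.0 - wer


def relative_error_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline's errors that the new system removes."""
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # One misheard drug name in a five-word utterance.
    wer = word_error_rate("start metoprolol 25 mg daily",
                          "start metropolis 25 mg daily")
    print(f"WER: {wer:.2%}, accuracy: {accuracy_from_wer(wer):.2%}")

    # Hypothetical comparison: a 2% WER system against a 4% WER baseline.
    print(f"Relative error reduction: {relative_error_reduction(0.04, 0.02):.0%}")
```

Under these assumptions, 98% accuracy corresponds to a 2% WER, and a competitor at a hypothetical 4% WER would make twice as many errors. A single substituted drug name counts as one word error, however clinically significant it may be.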

[Figure: medical graph of competitors]

Why this model matters now

In healthcare, conversations move fast.

They're full of shorthand, acronyms, multiple speakers and critical decisions made in seconds. 

Most transcription tools are trained for structured dictation, but real clinical dialogue is something else entirely – unstructured, rapid-fire, and packed with context that changes meaning.

This challenge is compounded by the accelerating pace of medical knowledge: what took 50 years to double in 1950 now doubles every 73 days, making AI-driven support systems critical to help clinicians process and apply this overwhelming volume of information in real-time conversations.

Speechmatics’ Medical Model is tuned for these real-world conversations, where speed, context, and precision are fundamental.

It's designed to support use cases where transcription needs to keep pace with care, such as:

  • Contact center triage

  • Ambient scribes and EHR integrations

  • In-room consultations and dictation

  • Telemedicine appointments

  • Radiology voice notes

The Medical Model pushes transcription accuracy to new levels in the moments that matter most. It is available now in English, with multilingual support across Speechmatics' 55+ supported languages coming soon.

Tuned to understand medical context

In reality, medical professionals and patients speak in a range of accents. Your model needs to be able to understand every voice. This is not the case for the majority of models, which struggle with anything beyond standard pronunciation and mainstream voices.

Speechmatics' Medical Model, however, excels where others falter. Built on our accent-independent foundation that recognizes voices across 55+ languages, the model handles the linguistic diversity of modern healthcare – from emergency calls with heavy regional accents to consultations with international specialists. 

Our system delivers clinical-grade accuracy regardless of how doctors, nurses, or patients actually speak.

This foundation enables our new model to recognize and correctly transcribe:

  • ICD-10-CM codes

  • Clinical acronyms and shorthand

  • Drug names and procedures

  • Prescriptions and dosages

  • Phone numbers, addresses, and patient names

Essential to accurate clinical documentation, these areas are often misrecognized by both specialized and general-purpose models.

The Medical Model delivers superior accuracy for live workflows compared to traditional batch processing approaches.

Clinical impact: where performance meets usability

Real-world data reveals the scale of impact. A recent JAMA study found virtual scribes cut total EHR time by 5.6 minutes per appointment, including 1.3 minutes less note-taking and 1.1 minutes less "pajama time" (after-hours documentation). 

Research has also shown that "power users" see even bigger gains: a follow-up evaluation showed the top 33% of clinicians (by usage) achieving 2.5× greater time savings per note than lower-frequency users.

This proven reliability extends to mission-critical settings like ambulance services, where Speechmatics currently powers transcription for every emergency call across the UK, operating under a 100% uptime guarantee where reliability is literally a matter of life and death.

For clinical teams and healthtech providers, this translates to faster, cleaner EHR integration, less time spent correcting errors, and reduced friction in patient interactions.

"We're seeing great results already with Speechmatics. It's performing well in live deployments and accelerating our clients' efficiency at scale." – Martin Taylor, CTO, Content Guru

Deployment, security, and availability

Healthcare data demands the highest security standards, and the Medical Model delivers comprehensive protection:

  • HIPAA-compliant infrastructure

  • Zero data retention

  • Encryption in transit and at rest

  • Audit trail support

  • Available now in preview via SaaS

  • On-prem availability for enterprise customers

The model is currently available in English, with rollout across Speechmatics' 55+ supported languages planned next.

Partners working in regulated healthcare environments have found confidence in Speechmatics' approach to compliance.

As a global ambient scribe customer shared: "Being in a regulated industry means data security, privacy, and compliance are top priorities for us. We've been reassured by how Speechmatics handles sensitive information and follows strict standards. The willingness of the team to understand and accommodate these regulations has really stood out and given us the peace of mind we need to stay compliant in all our processes."

Now accepting preview partners

If you're developing healthcare applications – from dictation tools and scribe systems to AI assistants – and need a transcription engine optimized for real-time medical speech, the Speechmatics Medical Model is now available to test.

Ready to see clinical-grade transcription in action? Head to Speechmatics' Portal to use the Medical Model directly in preview.

Experience the future of medical transcription today

With Speechmatics’ new Medical Model, you’ll streamline documentation, enhance patient care, and reduce administrative burdens.
