What does Speechmatics do?

Speechmatics provides speech technology and Voice AI for enterprises, offering accurate Speech-to-Text, Text-to-Speech, and Voice Agent solutions. Our models understand every voice and accent across 56+ languages, helping businesses unlock the full potential of voice data.

How accurate is Speechmatics Speech-to-Text?

Speechmatics delivers best-in-market accuracy, achieving up to 99% word accuracy and 96% medical keyword recall in industry benchmarks. Our models handle multiple accents, noisy environments, and multiple speakers with ease.

What makes Speechmatics Text-to-Speech different?

Our low-latency Text-to-Speech (TTS) delivers lifelike, human-sounding voices with sub-150ms latency, making it ideal for real-time conversations. Developers can stream natural speech in multiple voices and deploy it in the cloud, in a hybrid setup, or on-prem for privacy and control.

Can I build real-time voice agents with Speechmatics?

Our voice AI enables developers to build real-time voice agents that listen, understand, and respond naturally. Plug in fast with a flexible API and native integrations to power your AI voice agents.

Which industries use Speechmatics?

Speechmatics is trusted by organizations in media, healthcare, contact centers, finance, legal, education, and accessibility. Our technology powers transcription, translation, call analytics, and voice AI applications worldwide.

Speechmatics launches Medical Model for real-time clinical transcription

With 93% accuracy and 50% fewer keyword errors than the next best system, our new medical model sets the bar for real-time clinical transcription.

The potential of AI-powered medical transcription has always been transformative: reduce screen time, streamline documentation, and free clinicians to focus on patients instead of keyboards.

This promise is driving fast adoption, as new players outcompete legacy technology while improving the doctor/patient experience. The global market is projected to reach $8.41 billion by 2032, and 30% of US healthcare providers are expected to adopt ambient scribes by end-2025.

But most models crumble under the pressure of real clinical environments, mishearing drug names, mangling ICD codes, and struggling with the rapid-fire complexity of medical conversations.

Speechmatics' new Medical Model is built to change this.

As the leader in the real-time streaming speech-to-text (STT) space, Speechmatics has released this domain-specific model for HealthTech partners requiring advanced accuracy, particularly across broad medical terminology and in clinical environments where live note-taking and dictation are critical.

Already outperforming the competition

Speechmatics has consistently outperformed specialist medical model providers across a broad range of languages in both file-based and streaming modes.

The Medical Model delivers 93% accuracy, 50% fewer keyword errors, and 17% fewer word errors than the next best system, positioning Speechmatics further ahead of other live streaming STT providers.
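For readers who want to reproduce this kind of comparison on their own audio, word errors are conventionally counted with Word Error Rate (WER): the word-level edit distance between a reference transcript and the system output, divided by the number of reference words. The sketch below is a minimal illustration of that standard definition, not Speechmatics' benchmark code. The function name, the lowercase-and-split normalization, and the sample sentences are assumptions made for the example, and the source does not state how its own figures were scored.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: "warfarin" is misheard as "warfare in",
# which counts as one substitution plus one insertion over 7 reference words.
reference = "patient takes 5 mg of warfarin daily"
hypothesis = "patient takes 5 mg of warfare in daily"
wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.1%}")
```

Accuracy is often quoted as one minus WER, so a lower WER on the same test set corresponds to fewer word errors.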

Keyword Error Rate (KWER) is the metric that matters most in a clinical setting: it measures how often a system gets wrong the terms that actually change patient outcomes, such as drug names, dosages, procedures, and diagnoses. On this measure, Speechmatics leads by a wide margin.

At 4% KWER, Speechmatics has roughly half the error rate of the nearest competitor, and less than a quarter of the error rate at the bottom of the chart. In a live consultation, that's the difference between a dosage transcribed correctly the first time and a note that needs correcting before it ever reaches a patient's record.
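To make the metric concrete, here is a simplified sketch of how a keyword error rate could be computed. It assumes one plausible definition: the share of keyword occurrences in the reference transcript that do not appear in the system output. The source does not spell out the exact scoring protocol behind the published figures, so the function name, the single-word keyword list, and the sample sentences below are all illustrative assumptions.

```python
from collections import Counter


def keyword_error_rate(reference: str, hypothesis: str, keywords: list[str]) -> float:
    """Fraction of reference keyword occurrences missing from the hypothesis."""
    # Assumption: keywords are single lowercase tokens matched exactly.
    keyword_set = {k.lower() for k in keywords}
    ref_counts = Counter(w for w in reference.lower().split() if w in keyword_set)
    hyp_counts = Counter(w for w in hypothesis.lower().split() if w in keyword_set)

    total = sum(ref_counts.values())
    if total == 0:
        raise ValueError("reference contains none of the keywords")

    # A keyword occurrence is an error when the hypothesis has fewer
    # copies of that keyword than the reference does.
    missed = sum(max(0, n - hyp_counts[k]) for k, n in ref_counts.items())
    return missed / total


# Hypothetical example: three of the four clinical terms survive,
# but the drug name "warfarin" is lost.
reference = "start metformin 500 mg and stop warfarin"
hypothesis = "start metformin 500 mg and stop warfare in"
clinical_terms = ["metformin", "warfarin", "500", "mg"]
kwer = keyword_error_rate(reference, hypothesis, clinical_terms)
print(f"KWER: {kwer:.0%}")
```

Because this simplified version only scores the terms on the keyword list, a transcript can have a low overall word error rate and still score badly if the few words it misses are the clinically important ones.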

Why this model matters now

In healthcare, conversations move fast.

They're full of shorthand, acronyms, multiple speakers and critical decisions made in seconds.

Most transcription tools are trained for structured dictation, but real clinical dialogue is something else entirely: unstructured, rapid-fire, and packed with context that changes meaning.

This challenge is compounded by the accelerating pace of medical knowledge: what took 50 years to double in 1950 now doubles every 73 days, making AI-driven support systems critical to help clinicians process and apply this overwhelming volume of information in real-time conversations.

Speechmatics' Medical Model is tuned for these real-world conversations, where speed, context, and precision are fundamental.

It's designed to support use cases where transcription needs to keep pace with care, such as:

Contact center triage
Ambient scribes and EHR integrations
In-room consultations and dictation
Telemedicine appointments
Radiology voice notes

Next stop: multilingual medical

The English Medical Model was only the starting point. Since launch, we've extended the same approach (train on real clinical audio, tune for the terms that matter, benchmark against the hardest cases) to a growing list of languages:

German, Danish, and Norwegian: trained on over 2 billion words of medical conversation and clinical documentation, bringing the Medical Model to seven languages total and cutting error rates by up to 50% in each.
Swedish: achieving 3.91% KWER and 40% fewer errors than the closest competitor, rounding out a Nordic medical lineup alongside Finnish, Danish, and Norwegian.
Arabic-English bilingual: the first medical model built to handle code-switching between the two languages mid-sentence, trained on twice the clinical vocabulary of the English model to correctly transcribe drug names, dosages, and shorthand regardless of which language carries them.

Each new language model goes through the same rigorous testing before release: real clinical audio, regional accents, and the compound terminology that trips up general-purpose models. The goal isn't just translation; it's clinical-grade accuracy everywhere Speechmatics operates, with full multilingual coverage across 56+ supported languages on the roadmap.

Tuned to understand medical context

In reality, medical professionals and patients speak in a range of accents. Your model needs to be able to understand every voice. This is not the case for the majority of models, which struggle with anything beyond standard pronunciation and mainstream voices.

Speechmatics' Medical Model, however, excels where others falter. Built on our accent-independent foundation that recognizes voices across 56+ languages, the model handles the linguistic diversity of modern healthcare, from emergency calls with heavy regional accents to consultations with international specialists.

Our system delivers clinical-grade accuracy regardless of how doctors, nurses, or patients actually speak.

This foundation enables our new model to recognize and correctly transcribe:

ICD-10-CM codes
Clinical acronyms and shorthand
Drug names and procedures
Prescriptions and dosages
Phone numbers, addresses, and patient names

Essential to accurate clinical documentation, these areas are often misrecognized by both specialized and general-purpose models.

The Medical Model delivers superior accuracy for live workflows compared to traditional batch processing approaches.

Clinical impact: where performance meets usability

Real-world data reveals the scale of impact. A recent JAMA study found virtual scribes cut total EHR time by 5.6 minutes per appointment, including 1.3 minutes less note-taking and 1.1 minutes less "pajama time" (after-hours documentation).

Research has also shown that "power users" see even bigger gains: a follow-up evaluation showed the top 33% of clinicians (by usage) achieving 2.5x greater time savings per note than lower-frequency users.

This proven reliability extends to mission-critical settings like ambulance services, where Speechmatics currently powers transcription for every emergency call across the UK, operating under a 100% uptime guarantee where reliability is literally a matter of life and death.

For clinical teams and healthtech providers, this translates to faster, cleaner EHR integration, less time spent correcting errors, and reduced friction in patient interactions.

"We're seeing great results already with Speechmatics. It's performing well in live deployments and accelerating our clients' efficiency at scale." – Martin Taylor, CTO, Content Guru

Deployment, security, and availability

Healthcare data demands the highest security standards, and the Medical Model delivers comprehensive protection:

HIPAA-compliant infrastructure
Zero data retention
Encryption in transit and at rest
Audit trail support
Available now in preview via SaaS
On-prem availability for enterprise customers

Partners working in regulated healthcare environments have found confidence in Speechmatics' approach to compliance.

As a global ambient scribe customer shared:

"Being in a regulated industry means data security, privacy, and compliance are top priorities for us. We've been reassured by how Speechmatics handles sensitive information and follows strict standards. The willingness of the team to understand and accommodate these regulations has really stood out and given us the peace of mind we need to stay compliant in all our processes."

Get started now.

If you're developing healthcare applications, from dictation tools and scribe systems to AI assistants, and need a transcription engine optimized for real-time medical speech, the Speechmatics Medical Model is available now.

Ready to see clinical-grade transcription in action? Head to Speechmatics' Portal to use the Medical Model directly.

Experience the future of medical transcription today

With Speechmatics’ new Medical Model, you’ll streamline documentation, enhance patient care, and reduce administrative burdens.

Jul 29, 2026 | Read time 4 min
