Speechmatics

There's speech-to-text. Then there's Speechmatics.

An API with a comprehensive range of features, unmatched accuracy, flexible deployment and AI-powered capabilities.

Everything you need to build brilliant voice features and products.

Configuration

Our models are built to deliver for your needs

Get the very best performance and fast transcription in real-time or batch mode, deployed however suits you.

File transcription

Process thousands of hours of pre-recorded files quickly, whenever you need them.

Live transcription

Transcribe media as it happens. Get initial transcriptions in milliseconds, with context-driven accuracy improvements over time.

On-Prem

Meet architecture, security and compliance needs by hosting our API in your own environment. Combine with Cloud, or deploy using Docker containers or preconfigured virtual appliances.

Cloud

Get secure and scalable access to our API through our cloud deployment and get instant access to all our new features, languages and updates.

On-Device

Run Speechmatics directly on your devices for ultra-low latency and maximum data privacy. Ideal for use cases where connectivity is limited and data must stay local.

Transcription Features

Everything you need to hit the highest accuracy possible

Our customization options allow you to finely tune your setup to achieve high accuracy with even the most unusual words and phrases.

Custom Dictionary

Boost accuracy for proper nouns, acronyms or industry-specific terms by providing a list of custom words.

Speaker & Channel Diarization

Track who said what and when with speaker labelling for each word, available for both batch and real-time transcription.
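
As a rough illustration of what per-word speaker labels make possible, the sketch below merges consecutive words from the same speaker into turns. The word dictionaries and their keys (`speaker`, `content`, `start`, `end`) are assumptions made for this example, not the documented Speechmatics response schema.

```python
from itertools import groupby


def group_turns(words):
    """Merge consecutive words that share a speaker label into turns."""
    turns = []
    for speaker, run in groupby(words, key=lambda w: w["speaker"]):
        run = list(run)
        turns.append({
            "speaker": speaker,
            "start": run[0]["start"],   # start time of the first word in the turn
            "end": run[-1]["end"],      # end time of the last word in the turn
            "text": " ".join(w["content"] for w in run),
        })
    return turns


# Hypothetical diarized words (field names are illustrative only).
sample_words = [
    {"speaker": "S1", "content": "Hello", "start": 0.0, "end": 0.4},
    {"speaker": "S1", "content": "there", "start": 0.5, "end": 0.8},
    {"speaker": "S2", "content": "Hi", "start": 1.0, "end": 1.2},
    {"speaker": "S1", "content": "Welcome", "start": 1.5, "end": 2.0},
]

for turn in group_turns(sample_words):
    print(f'{turn["speaker"]} [{turn["start"]:.1f}-{turn["end"]:.1f}]: {turn["text"]}')
```

Grouping only consecutive words keeps the conversation order intact, so a speaker who talks twice produces two separate turns.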

Numeral Formatting

Identify and correctly format numbers, dates and currencies automatically to improve transcript readability and enable effective post-processing.

Profanity & Disfluency Detection

Aid comprehensibility and compliance by detecting and optionally removing words that are considered profanities or hesitations.

File Formats

Minimize the effort needed to prepare audio or video files with support for all major audio and video formats, along with automatic sample rate detection.

Advanced Features

Easily push a variety of media formats to the API

Send a variety of media formats to the API and get a rich set of metadata to support your post-processing needs.

Confidence Scores

Collect confidence scores for every word in the transcript to enable efficient human review and editing.
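
One way per-word confidence scores can support human review is to flag only the words that fall below a chosen threshold. The sketch below assumes each word is a dictionary with hypothetical `content` and `confidence` keys, with confidence in the range 0 to 1. The threshold of 0.8 is an arbitrary illustrative value, not a recommendation from Speechmatics.

```python
def flag_for_review(words, threshold=0.8):
    """Return (position, word) pairs whose confidence is below the threshold."""
    return [
        (i, w["content"])
        for i, w in enumerate(words)
        if w["confidence"] < threshold
    ]


# Hypothetical transcript words (field names are illustrative only).
scored_words = [
    {"content": "Please", "confidence": 0.99},
    {"content": "call", "confidence": 0.97},
    {"content": "Speechmatics", "confidence": 0.62},
    {"content": "tomorrow", "confidence": 0.91},
]

print(flag_for_review(scored_words))
```

A reviewer would then only need to check the flagged positions rather than reading the whole transcript.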

Industry Language Packs

We're developing English language packs optimized for specific industries, with sector-specific terminology. Finance is available now, with more to follow soon.

Word Timings

Get accurate timestamps for every word in the transcript to allow for post-processing and an improved end-user experience.
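
As an example of the post-processing that word-level timestamps enable, the sketch below turns a list of timed words into SubRip (SRT) caption cues. The word fields (`content`, `start`, `end`, in seconds) and the fixed cue size of seven words are assumptions made for illustration. A production captioner would normally also break cues on punctuation and pauses.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02}:{minutes:02}:{secs:02},{ms:03}"


def words_to_srt(words, max_words=7):
    """Build SRT text with one cue per chunk of at most max_words words."""
    cues = []
    for number, i in enumerate(range(0, len(words), max_words), start=1):
        chunk = words[i:i + max_words]
        start = srt_timestamp(chunk[0]["start"])   # first word opens the cue
        end = srt_timestamp(chunk[-1]["end"])      # last word closes the cue
        text = " ".join(w["content"] for w in chunk)
        cues.append(f"{number}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)


# Hypothetical timed words (field names are illustrative only).
timed_words = [
    {"content": "Hello", "start": 0.0, "end": 0.4},
    {"content": "world", "start": 0.5, "end": 0.9},
]

print(words_to_srt(timed_words))
```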

Advanced Punctuation & Casing

Improve readability with language-specific capitalization and punctuation including commas, question marks and exclamation marks.

Audio Events

Improve accessibility and fully automate tedious captioning by using AI to identify and label non-speech sounds in media.

Languages

Partner with Speechmatics to maximize your total addressable market

We deliver for multilingual, multicultural and multinational businesses, with coverage of nearly half the world’s languages across a range of dialects and accents.

Language Coverage

We support 50 languages, covering most native languages with unmatched accuracy.

Accents and dialects

Whether you need Brazilian Portuguese or Canadian French, we have you covered with a single language model that supports all associated accents and dialects.

Translation

Transcribe and translate audio to and from English for over 30 languages using a single API call.

Language Identification

Simplify integration and ensure accurate transcription with automatic detection of the language spoken.

AI-Powered Capabilities

Accurate transcription combined with breathtaking speech capabilities, offered to customers as solution bundles, makes Speechmatics truly unique.

Translation

With automatic translation in a single API call, you can translate media and provide captions for over half the world’s population.

Summaries

Instantly generate summaries for social and video platforms, so viewers know what to expect, without you having to write them manually.

Sentiment

Don’t just rely on reviews. See how customers are feeling about every aspect of your service by identifying sentiment throughout calls.

Topics

Your audience doesn’t always want to watch long media. Give them the topics discussed and the timestamps so they can engage with what they are most interested in.

Chapters

As well as being divided up and summarized, each chapter is given a heading, making it super easy to find the most engaging content.

Resources for features and deployments

Company

Adobe and Speechmatics deliver cloud-grade speech recognition on-device for Premiere

Adobe Premiere users can run the most accurate on-device transcription locally: efficient enough for a laptop, powerful enough for professional work.

Speechmatics Editorial Team
Use Cases

Best speech-to-text AI guide: APIs, platforms and services compared

Speech-to-text has moved from novelty to enterprise infrastructure. Here's how the leading platforms stack up in 2026 — and how to pick the right one.

Tom Young, Digital Specialist
Product

Speechmatics launches new Swedish medical model, cutting transcription errors by 40%

Expanding a Nordic medical lineup with a 3.91% KWER model that delivers sub-second latency across Swedish, Finnish, Danish, and Norwegian clinical workflows.

Yahia Abaza, Senior Product Manager
Product

What is Speaker Diarization and why does it matter in voice AI?

The breakthrough technology helping AI understand conversations like humans do.

Stuart Wood, Product Manager

Ready to Understand Every Voice?

Sign up to our free speech-to-text SaaS Portal and we’ll guide you through the integration of our API.

ISO/IEC 27001
Queen's Award for Enterprise 2019
G2 High Performer badge
GDPR Compliant
HIPAA Compliant
ISO 27001 Certified
SOC 2 Certified
Copyright © Speechmatics 2026