Sep 6, 2018 | Read time 2 min

Speechmatics Extends Transcription Offering With Sounds Feature

Sounds extends the Custom Dictionary feature, which lets users add context-specific words in real time, such as footballer names or breaking news locations, so that those words are transcribed consistently and reliably, ultimately increasing accuracy.

Speechmatics has launched the Sounds feature, a new addition to the company’s current speech-to-text offering. Sounds supports broadcasters by delivering highly accurate transcripts: it allows the speech engine to understand the difference between how words are pronounced and how they are written.

By refining the pronunciation within Custom Dictionary, Sounds can help with the spelling of names, products, acronyms, abbreviations, trademarks, copyrights and alternate word forms.

Ian Firth, VP Products at Speechmatics, explained: “In the broadcast industry, subtitling for names and words that don’t sound phonetically as they are written is an ongoing bugbear and can be a cause for significant embarrassment by the broadcaster. For example, knowing how to spell Condoleezza Rice’s name flawlessly every time. With Sounds, our engine simply needs pronunciation hints that describe the sounds of the words, like ‘AI’ pronounced as ‘ay eye’. It is not necessary to use phonetic definitions, but something that sounds like the word you want to define and the way that you want it to be written. It can even be used to change things completely, for example you could easily configure it so that ‘Dr’ is written every time ‘Doctor’ is said.”

With other vendors’ offerings, a pronunciation pack is required when using a hints-style feature. With the Speechmatics solution, Sounds is built to enhance the speech engine when required and supports multiple pronunciations of the same word. As the solution can be delivered in private environments, it is highly secure and accessible only to the broadcaster.
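The pronunciation hints described above can be pictured as a small list of entries, each pairing the written form with one or more "sounds like" spellings. The Python sketch below is illustrative only. The field names `additional_vocab`, `content` and `sounds_like`, the overall config shape, and the helper functions are assumptions that this announcement does not confirm and should be checked against the current Speechmatics documentation. The hint spellings for Condoleezza Rice are invented for the example.

```python
import json


def vocab_entry(content, sounds_like=None):
    """Build one Custom Dictionary entry (hypothetical helper).

    content     -- the word or phrase as it should be written
    sounds_like -- optional list of plain-text pronunciation hints
    """
    entry = {"content": content}
    if sounds_like:
        entry["sounds_like"] = list(sounds_like)
    return entry


def build_config(language, entries):
    """Wrap the entries in a transcription config (assumed shape)."""
    return {
        "type": "transcription",
        "transcription_config": {
            "language": language,
            "additional_vocab": entries,
        },
    }


entries = [
    # 'AI' pronounced as 'ay eye', as in the quote above
    vocab_entry("AI", ["ay eye"]),
    # write 'Dr' every time 'Doctor' is said
    vocab_entry("Dr", ["doctor"]),
    # several pronunciations of the same name (hints are made up)
    vocab_entry("Condoleezza Rice", ["kon doh lee zah rice", "condo leeza rice"]),
]

config = build_config("en", entries)
print(json.dumps(config, indent=2))
```

An entry without hints, such as `vocab_entry("Speechmatics")`, simply adds the word to the dictionary. Hints are only needed where the spelling and the pronunciation diverge.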

Firth continues, “Accuracy is still the key metric for speech recognition systems and remains top of mind throughout the development of our solutions at Speechmatics. At Speechmatics, we pride ourselves on unparalleled accuracy rates, and Sounds enables us to continue to improve on our speech-to-text accuracy by building technology that understands pronunciations that are personalised to the user and use case when needed.”

The Sounds feature is available now through Custom Dictionary. To try a demo of the Sounds feature at IBC in Amsterdam 14-18 September, visit Speechmatics at stand 8E17.
