Apr 10, 2018 | Read time 1 min

Custom Dictionary – the technical edit

Accurate transcription of voice data using any-context speech recognition enables enterprise businesses to understand insights automatically.

I am sure we can all remember a time when we misheard something and asked for it to be repeated, or perhaps just guessed what was said and carried on. Computers are no different, except that in many cases they cannot ask for anything to be repeated.

When you guessed what was said, your brain used all the surrounding context and information available to help you make the best attempt at interpreting what was actually said. Speechmatics’ Custom Dictionary allows the Speech Recognition engine to do the same thing, using whatever context is available.

Sounds cool, but isn't that difficult to do?

Traditionally, this might have involved complex model training, or a change that affected every transcript across your deployment. As with all Speechmatics features, we have made this as easy for you to use as possible. We do as much of the difficult part as we can, in a fast and flexible manner. All you have to do is provide the words that you think are relevant to the audio you are transcribing, and the speech engine immediately does the rest for you.

As an example, suppose you know that you are transcribing a conversation between two people. You can add their names as the context, which enables the engine to spell them correctly. Or perhaps you are transcribing a video about a company. You can add its brand or product names to make sure that they are correctly understood.

How could I do that?

To keep it really simple, you just provide the context as a set of words in plain text. Often the context is already available from existing sources, such as the attendee names for a meeting from the meeting contacts in Outlook, company names from a CRM system, or the video brief being used. This means that you can use the words directly from these locations. Each transcription session can use a different set of context words, so one deployment can be flexible enough for all your use cases without needing to 'optimise' for a common set across multiple transcriptions.
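As a rough illustration of pulling context words together from more than one source, here is a minimal Python sketch. The function name `build_context_words`, the sample names and the sample company are all hypothetical and only stand in for whatever your meeting or CRM export provides. It simply tidies whitespace and drops case-insensitive duplicates, keeping the first spelling it sees.

```python
def build_context_words(*sources):
    """Merge word lists from several sources into one de-duplicated list.

    Hypothetical helper for illustration; not part of any Speechmatics library.
    """
    seen = set()
    words = []
    for source in sources:
        for raw in source:
            # Trim the entry and collapse any internal runs of whitespace.
            word = " ".join(raw.split())
            key = word.casefold()
            # Skip empty entries and case-insensitive repeats.
            if word and key not in seen:
                seen.add(key)
                words.append(word)
    return words


# Made-up example data standing in for a meeting invite and a CRM export.
attendees = ["Priya  Raman", "  Tomas Lind "]
crm_companies = ["Acme Robotics", "acme robotics", ""]

context_words = build_context_words(attendees, crm_companies)
print("\n".join(context_words))
```

Because the list is built per call, each transcription session can get its own set of words without touching any shared configuration.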

It really is this simple:

1. Connect to the speech server via the interface
2. Start the session with the context words
3. Provide the audio
4. Get the transcript
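The four steps above can be sketched in Python as follows. This is only an outline under stated assumptions: `StubSpeechServer` is a hypothetical stand-in that does no recognition and is not the Speechmatics client, and the configuration field names (`transcription_config`, `additional_vocab`, `content`) are an assumption about the shape of the session configuration, so check them against the documentation for the interface and version you are using.

```python
import json


def build_session_config(language, context_words):
    """Build a start-of-session config that carries the context words.

    The field names used here are an assumption, not a confirmed schema.
    """
    return {
        "type": "transcription",
        "transcription_config": {
            "language": language,
            "additional_vocab": [{"content": word} for word in context_words],
        },
    }


class StubSpeechServer:
    """Hypothetical stand-in for a speech server; it performs no recognition."""

    def __init__(self):
        self.config = None
        self.audio_bytes = 0

    def start_session(self, config):
        # Step 2: the context words travel with the session config.
        self.config = config

    def send_audio(self, chunk):
        # Step 3: a real server would decode this audio; the stub only counts bytes.
        self.audio_bytes += len(chunk)

    def get_transcript(self):
        # Step 4: a real server returns words; the stub echoes what it received.
        vocab = self.config["transcription_config"]["additional_vocab"]
        return {
            "audio_bytes": self.audio_bytes,
            "context_words": [entry["content"] for entry in vocab],
        }


server = StubSpeechServer()                       # 1. connect (stubbed)
config = build_session_config("en", ["Speechmatics", "Acme Robotics"])
server.start_session(config)                      # 2. start with context words
server.send_audio(b"\x00" * 320)                  # 3. provide the audio
result = server.get_transcript()                  # 4. get the transcript
print(json.dumps(config, indent=2))
```

The point of the sketch is the ordering: the context words are supplied once, when the session starts, and nothing else about sending audio or reading the transcript changes.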

What’s the catch?

There is no catch. With the addition of a few words and practically no additional overhead, you can produce a more accurate transcript first time, faster, with less editing and less embarrassment.

One last thing…

As always, it works in all the languages we have too.

Ian Firth, Speechmatics
