Jul 12, 2022 | Read time 2 min

Speechmatics unlocks accurate understanding of financial terms with new language pack

Speechmatics releases a new financial language pack for speech-to-text transcription. The engine was trained on 200,000 hours of earnings call transcripts to reduce errors by 40%.
Speechmatics Editorial team

Engine trained on 200,000 hours of earnings calls transcripts to reduce errors by 40%

Speechmatics, the leading speech recognition technology scaleup, has launched an English language pack specific to the finance industry. This addition has been built for use cases including compliance, fraud identification, analytics, financial news and earnings calls. The world’s most accurate and inclusive speech-to-text engine can now identify finance terminology in conversation, helping to avoid confusion with abbreviations, acronyms and finance-specific terms.

The financial services sector is notoriously jargon-heavy, with terms that are either unique to the industry or easily confused with commonly used phrases. Acronyms such as VAT or SEC can often confuse standard speech-to-text engines, as can abbreviations that sound like everyday words, e.g. GAAP (Generally Accepted Accounting Principles) and the word ‘gap’. Speechmatics can now capture the speech data as intended, turning unstructured audio data into usable information. More accurate transcripts make downstream tasks more consistent and streamlined for users.

Global experts in deep learning and speech recognition, Speechmatics has built the most accurate and inclusive speech-to-text engine available. Historically, training data had to be manually tagged, classified or ‘labelled’. This has resulted in engines trained on narrow datasets, which fail to represent the diversity of voices that use them. In contrast, Speechmatics’ speech-to-text engine is trained through exposure to hundreds of thousands of individual voices using millions of hours of unlabelled, more representative voice data. This has enabled a paradigm shift in accuracy, dramatically reducing both AI bias and errors in speech recognition. Given the broad range of demographics that exist within financial services, Speechmatics’ new offering will be key to supporting and sustaining inclusivity in the sector.

Katy Wigdahl, CEO, Speechmatics, said, “Our aim is to understand every voice regardless of race, gender or accent and I’m proud that Speechmatics has overcome significant challenges that traditional speech-to-text engines have struggled with.

“However, we wanted to go even further and dive into the complexities that specific industries present. Some sectors are known for complex terms and jargon that, if added to our global models, risk making the technology less effective for other users. This led to our approach for domain-specific packs that can directly address the needs of individual sectors. Financial services was an obvious place to start but we hope our language pack will set a blueprint for every high-stakes industry where the financial, reputational and social cost of misunderstanding is high.”

Customers are already using the finance language pack to transcribe financial news and earnings calls, and to support call centre analysts and traders. It is the first industry-specific pack from Speechmatics and paves the way for packs covering industries with equally complex terminology, such as medicine and law.
