Jul 21, 2022 | Read time 4 min

Speechmatics unlocks improved understanding for Canadian French and Brazilian Portuguese speakers

The latest update removes one in five errors in Canadian French, improving the accuracy of speech-to-text tasks
Speechmatics Editorial Team

Speechmatics, the leading speech recognition technology scaleup, has enhanced two of its 34 existing languages with dedicated support for Canadian French and Brazilian Portuguese. These widely used language variations differ from European French and Portuguese in ways both subtle and significant. Speechmatics’ industry-leading engine now caters specifically to their nuances within the existing language packs, enabling the company to achieve unprecedented accuracy in speech recognition. This allows end users operating in these languages to get the most from their speech recognition technology in industries such as contact centers, where accurate communication and understanding are vital.

Canadian French is spoken by a large segment of the population, with more than one in five Canadians (22%) speaking the language. However, it differs from European French in ways that significantly reduce the accuracy of speech-to-text transcripts if they are not accounted for. For example, Canadian French includes more English words in general speech, and vocabulary is used in different ways than in European French. In addition, Canadian French has alternative pronunciations, uses pronouns differently (‘tu’ and ‘vous’ are used more interchangeably in formal speech) and includes several indigenous loanwords.

This differentiation from European French has historically resulted in lower accuracy levels in speech recognition, failing to effectively serve the large groups that use these dialects. Speechmatics has tackled this challenge by training its engine on 1,100 hours of Canadian French voice data, drawing from publicly available sources including Common Voice* and a Canadian French production house. This training has led to an average relative reduction in Word Error Rate (WER) of 18% for its Standard model and 13% for its Enhanced model, meaning these improvements have removed roughly one in five errors from Standard model transcripts. As a result, Speechmatics now produces more accurate transcripts in Canadian French than the likes of Google and Microsoft. Meanwhile, Brazilian Portuguese has been trained using Common Voice as well as Multilingual LibriSpeech. This additional training has led to overall improvements in the Portuguese language pack, with a relative WER reduction of 5% for the Standard model and 3% for the Enhanced model.
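To make the figures above concrete: WER is the word-level edit distance between a reference transcript and a hypothesis, divided by the reference length, and a "relative reduction" compares old and new WER as a fraction of the old. The sketch below is illustrative only; the example numbers are hypothetical, not Speechmatics' actual measurements.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level Levenshtein distance / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting every reference word
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting every hypothesis word
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost)  # substitution (or match)
    return d[len(ref)][len(hyp)] / len(ref)

# A relative reduction of 18% means the new WER is 18% lower than the old,
# i.e. roughly one error in five has been removed (hypothetical numbers):
old_wer, new_wer = 0.200, 0.164
relative_reduction = (old_wer - new_wer) / old_wer  # 0.18, i.e. 18% relative
```

Note that an 18% *relative* reduction is not an 18-point drop in WER: going from 20% to 16.4% WER is an 18% relative improvement, which is why it corresponds to removing about one in five errors.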

As global experts in deep learning and speech recognition, Speechmatics has built the most accurate and inclusive speech-to-text engine available. Historically, training data had to be manually tagged, classified or ‘labelled’. This has resulted in engines that have been trained on narrow datasets, which fail to represent the diversity of voices that use them. In contrast, Speechmatics’ speech-to-text engine is trained through exposure to hundreds of thousands of individual voices using millions of hours of unlabelled, more representative voice data. By improving the accuracy of transcripts for Canadian French and Brazilian Portuguese, downstream tasks can be more consistent and streamlined for users of these dialects.

John Hughes, Accuracy Team Lead at Speechmatics, said, “At Speechmatics we are striving to understand every voice, regardless of race, gender, or accent. We have been working tirelessly to improve and simplify our machine learning pipelines to ingest and process new data rapidly. Now we have the flexibility to adapt to events as they occur. For instance, the Accuracy Team built a Ukrainian language pack in only two weeks, which processed over 300 transcription jobs - 50 hours of speech data - in its first month of use. Building on this work, we next chose to improve the Canadian French and Brazilian Portuguese dialects in our Global French and Portuguese language packs respectively. We chose these due to the vast number of speakers worldwide and the lack of representative training data. Now we have a clear advantage in accuracy over all our competitors: our models, powered by self-supervised learning, continue to move us closer to making speech recognition valuable to everyone, everywhere, anytime.”

These improvements in Canadian French and Brazilian Portuguese are already in use by Speechmatics customers, and the Speechmatics team is developing further dialect-specific improvements to its existing language packs.

*A crowdsourced database for use in speech recognition software
