
May 5, 2020

Consistent transcription of numbers improves contextual understanding of conversations

The first step in sophisticated number recognition is to recognize and transcribe numbers less than 10 in English in a standardized and consistent format.

Speechmatics, a UK leader in any-context speech recognition technology, has announced that Veritone, Inc. (Nasdaq: VERI), the creator of the world’s first operating system for artificial intelligence, aiWARE™, now offers secure transcription capability as a cognitive service within Veritone’s aiWARE™ operating system for AI.

With this announcement, Speechmatics’ award-winning technology enables Veritone customers with strict security requirements to turn unstructured voice data into actionable insights which were previously inaccessible.

Prosodica develops analytics applications that are delivered through Vail Systems’ carrier integrated enhanced network services platform for call centers.

These applications integrate proprietary voice analysis technology with Speechmatics’ any-context speech recognition to generate next-level insights from recorded calls. Using Prosodica applications, call centers gain fresh perspectives into the character and content of their customer conversations. This enables them to identify opportunities to improve efficiency, reduce customer effort, and continually measure customer experience without manually listening to customer calls or soliciting agent/customer self-reporting.

Using mission-critical, accurate speech recognition technology, the application delivers an unparalleled ability to derive insight and meaning from voice data at scale, regardless of dialect or accent. The Prosodica platform uses the customer-hosted version of the Speechmatics product to ensure that security and compliance are maintained in every aspect of the solution.


Number recognition: at a glance

The first step in sophisticated number recognition is to recognize and transcribe numbers less than 10 (0-9) in English in a predictable, standardized and consistent format.


Description

Number recognition is a notoriously difficult problem in automatic speech recognition (ASR). Unlike ordinary words, which can be written only one way in a transcript, numbers can be expressed as either digits or words. The resulting inconsistencies affect both readability for human readers and compatibility with machine tools that expect a particular output format. When numbers are a crucial part of an interaction, as in credit card and phone number use cases, unpredictable output is a problem wherever those numbers need to be transcribed. Speechmatics’ ASR delivers a standardized and consistent format by transcribing numbers less than 10 as words.

Number recognition

ASR has evolved significantly in recent years. So too have the expectations of users. Battles are no longer fought over word error rates. Top providers are consistently delivering accuracy in the mid to high 90s (percent), especially for English. The battleground has shifted, with providers looking beyond word error rate in pursuit of capturing more of the intricacies of voice and speech. Last year, for example, Speechmatics rolled out the most advanced punctuation in the market to its top languages. Work is in progress to add Advanced Punctuation to even more languages.

ASR has many applications and capabilities that add value to businesses. It enables businesses to innovate with the voice data in their organization, from the voice of their employees to the voice of the customers they serve. Organizations are looking to integrate ASR alongside other third-party solutions to build out workflows using voice. Use cases range from straightforward transcription and captioning to media monitoring, call interaction capture, call routing, call center agent assist solutions, compliance monitoring and analytics.

In these situations, a consistent and accurate representation of common entities such as numbers is not only necessary but expected. ASR solutions are highly effective and accurate at transcribing speech. However, when it comes to numbers, the format in which they are transcribed can be mixed. In some cases, transcribed numbers are unpredictable because of how models are trained. For example, a transcript might contain a mix of words and digits because the transcription product cannot tell that the entity it has recognized is a number rather than a word.
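To make the mixed-format problem concrete, here is a minimal sketch of how a downstream tool might check whether a transcript mixes digits and words for single-digit numbers. The function names and the sample sentence are hypothetical and are not part of any Speechmatics API. The check is deliberately naive: it treats every occurrence of a word such as "one" as a number, which will not always be true in real speech.

```python
import re

# Spoken forms of the single digits 0-9 (assumption: English only).
DIGIT_WORDS = {
    "zero", "one", "two", "three", "four",
    "five", "six", "seven", "eight", "nine",
}

def number_styles(transcript):
    """Return the set of styles ('digit', 'word') used for single-digit numbers."""
    styles = set()
    # Split the transcript into alphabetic tokens and digit runs.
    for token in re.findall(r"[A-Za-z]+|\d+", transcript):
        if token.isdigit() and len(token) == 1:
            styles.add("digit")
        elif token.lower() in DIGIT_WORDS:
            # Naive: "one" used as a pronoun is also counted here.
            styles.add("word")
    return styles

def is_mixed(transcript):
    """True when both digits and number words appear in the same transcript."""
    return number_styles(transcript) == {"digit", "word"}

# Hypothetical sample sentence, for illustration only.
print(is_mixed("my code is 4 two 9"))
```

A check like this can flag transcripts that need manual review. A consistent output format removes the need for it.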

Speechmatics’ enhanced number recognition and consistent formatting

Accurate number recognition enhances the quality of the Speechmatics Global English language pack. It delivers accurate recognition of numbers within speech and provides a consistent output format of words for numbers less than 10.

Previous output

“Yes, please call me back. The best number to get me on is 0 7 seven 2 3 four 5 six 7 eight nine”

New output

“Yes, please call me back. The best number to get me on is zero seven seven two three four five six seven eight nine”

Numbers less than 10 are now always output as words. This standardized transcription output delivers predictability. If the customer requires a format other than words, the standard approach allows a simple mapping so that numbers can be normalized (or converted) to suit the customer’s specific needs.

The benefits

The focus on number recognition and a consistent format improves the quality of the Speechmatics output. The demand on customers to review and edit transcripts can be significantly reduced. This shortens the time to market for perfect transcripts in applications like closed captioning, especially in real time. The predictable output of numbers less than 10 means that transcripts require less triage from human editors, optimizing the workforce and their efforts.

Another example of the benefits of this feature is within the contact center. Speechmatics can significantly optimize agent tasks like interaction and call note capture, which can be done automatically and accurately through Speechmatics’ ASR. Number recognition can also improve automated customer-facing tools such as interactive voice response (IVR), and it supports privacy and compliance scenarios. These use cases rely on recognizing numbers in the speaker’s voice and also require a specific format from the ASR solution to work seamlessly with other products in the workflow.

Where and how can you use Speechmatics’ enhanced number recognition?

Great news: you don’t have to do anything to use it! It comes as standard in Speechmatics’ Global English language pack in all deployment options (SaaS, Batch and Real-Time Virtual Appliances, and Batch and Real-Time Containers). Other numbers (including 10 and larger) will be addressed in subsequent releases.
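Because numbers less than 10 always arrive as words, mapping them to another format on the customer side can be quite simple. The following is a rough sketch of one such mapping, which collapses runs of consecutive number words into a digit string. The function name `words_to_digit_string` and the `min_run` parameter are hypothetical, and the code is not part of any Speechmatics product. It assumes whitespace-separated tokens with no punctuation attached to the number words.

```python
# Mapping from the standardized word output to digits (English only).
WORD_TO_DIGIT = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}

def words_to_digit_string(transcript, min_run=2):
    """Collapse runs of at least `min_run` consecutive number words into digits.

    Shorter runs (for example a lone "one") are left as words, since they
    are less likely to be part of a phone or card number.
    """
    out = []   # tokens of the normalized transcript
    run = []   # current run of consecutive number words

    def flush():
        # Emit the pending run either as one digit string or unchanged.
        if len(run) >= min_run:
            out.append("".join(WORD_TO_DIGIT[w.lower()] for w in run))
        else:
            out.extend(run)
        run.clear()

    for token in transcript.split():
        if token.lower() in WORD_TO_DIGIT:
            run.append(token)
        else:
            flush()
            out.append(token)
    flush()
    return " ".join(out)

text = ("The best number to get me on is "
        "zero seven seven two three four five six seven eight nine")
print(words_to_digit_string(text))
# prints: The best number to get me on is 07723456789
```

The `min_run` threshold is a design choice for this sketch. A real workflow might instead use surrounding context to decide which numbers to convert.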