This April, we’re incredibly excited to bring to our customers one of our largest product releases to date. Our world-leading speech-to-text engine has added more languages, improved accuracy in others, and now boasts impressive new features, including Entity Formatting.
Over the past few years, Entity Formatting has consistently been one of our most requested features and we’re pleased to announce its release in the latest product update. The result of a process known as Inverse Text Normalization (ITN), Entity Formatting is crucial to the readability of documents, especially in the financial sector. We’re now able to offer improved formatting of entities such as numbers, currencies, percentages, addresses, dates, and times for 11 languages: English, French, German, Italian, Japanese, Hindi, Portuguese, Russian, Spanish, Cantonese, and Chinese Mandarin.
Being able to read a document clearly, with numerals formatted consistently, prevents confusion across a number of use cases. The less time spent in post-processing, picking through and changing dates and numerals, the better. The difference it makes to a reader’s understanding of a document can’t be overstated, whether in medicine, education, broadcasting, or beyond.
The two examples below show the difference Speechmatics’ Entity Formatting makes for your documents: first a transcript without Entity Formatting, then the same passage with it.
Today is the 5th of January twenty twenty one. We're very pleased with our exceptional fourth quarter performance after an unprecedented year for 2020. Total revenues were one hundred and eighty three billion dollars, up 13 percent year on year, or up 14 percent in constant currency.
Today is the 5th of January 2021. We're very pleased with our exceptional fourth quarter performance after an unprecedented year for 2020. Total revenues were $183 billion, up 13% year on year, or up 14% in constant currency.
Our new Entity Formatting comes in two modes: Written and Spoken. Written provides a clear, formatted transcript that’s easiest to read (e.g. 5th of January 2022, £10 million), meaning anyone can glean the information they need quickly and accurately. Spoken replicates the words as they are spoken (e.g. fifth of January two thousand and twenty two, ten million pounds), which is more useful for compliance or when interfacing with third-party systems.
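To make the idea concrete, here is a minimal Python sketch of the kind of conversion ITN performs for English numbers and percentages. It is a toy illustration only, not how the Speechmatics engine works: the function names `words_to_int` and `format_entities` and the `mode` argument are our own hypothetical choices, and the sketch ignores punctuation, dates, currencies, and ambiguous words (for example, "one" used as a pronoun).

```python
# Toy inverse text normalization for English cardinals and percentages.
# All names here are hypothetical; this is not the Speechmatics API.
UNITS = {w: i for i, w in enumerate(
    "zero one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split())}
TENS = {w: 10 * i for i, w in enumerate(
    "twenty thirty forty fifty sixty seventy eighty ninety".split(), start=2)}
SCALES = {"thousand": 1_000, "million": 1_000_000, "billion": 1_000_000_000}
NUMBER_WORDS = set(UNITS) | set(TENS) | set(SCALES) | {"hundred"}


def words_to_int(words):
    """Convert e.g. ['one', 'hundred', 'and', 'eighty', 'three'] to 183."""
    total, current = 0, 0
    for word in words:
        if word == "and":
            continue  # "and" carries no numeric value
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "hundred":
            current *= 100
        elif word in SCALES:
            total += current * SCALES[word]
            current = 0
        else:
            raise ValueError(f"not a number word: {word!r}")
    return total + current


def format_entities(text, mode="written"):
    """Return text unchanged in 'spoken' mode, or with digits in 'written' mode."""
    if mode not in ("written", "spoken"):
        raise ValueError("mode must be 'written' or 'spoken'")
    if mode == "spoken":
        return text
    tokens, out, i = text.split(), [], 0
    while i < len(tokens):
        if tokens[i] not in NUMBER_WORDS:
            out.append(tokens[i])
            i += 1
            continue
        # Extend over a run of number words; "and" only counts inside a run.
        j = i
        while j < len(tokens) and (
            tokens[j] in NUMBER_WORDS
            or (tokens[j] == "and" and j + 1 < len(tokens)
                and tokens[j + 1] in NUMBER_WORDS)
        ):
            j += 1
        value = str(words_to_int(tokens[i:j]))
        if j < len(tokens) and tokens[j] == "percent":
            value += "%"  # attach the percent sign to the number
            j += 1
        out.append(value)
        i = j
    return " ".join(out)
```

Calling `format_entities("revenues were up thirteen percent year on year")` gives the written form with `13%`, while passing `mode="spoken"` leaves the words exactly as they were said.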
Also in our latest product release, we’ve made further improvements to other features, such as our Speaker Diarization. Where we were seeing problems with single speakers being recognized and transcribed as multiple speakers, we’ve fixed it. Now, there’s much greater clarity of who is speaking and when.
We’ve also introduced Flexible Endpointing, meaning a marked improvement for those who use our Real-Time product. Before our latest update, our ‘max_delay’ time meant formatting could be negatively impacted when long words or sequences of numbers were spoken. Now, with Flexible Endpointing, our engine registers longer words and sequences being uttered and extends the ‘max_delay’ to compensate.
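The difference between a fixed and a flexible delay can be sketched with a small simulation. This is a loose illustration under our own assumptions, not a description of the engine's internals: the `chunk_tokens` function, the `flexible` flag, and the per-word `in_entity` marker are all hypothetical, and only the idea of a `max_delay` budget comes from the product itself.

```python
from typing import List, Tuple

# Hypothetical token shape: (word, end_time_in_seconds, part_of_an_entity)
Token = Tuple[str, float, bool]


def chunk_tokens(tokens: List[Token], max_delay: float, flexible: bool) -> List[List[str]]:
    """Group words into finalized chunks under a delay budget.

    Fixed mode cuts as soon as the budget is used up, even mid-entity.
    Flexible mode postpones the cut until the current entity has ended.
    """
    chunks: List[List[str]] = []
    current: List[str] = []
    chunk_start = 0.0
    for i, (word, end_time, in_entity) in enumerate(tokens):
        current.append(word)
        over_budget = end_time - chunk_start >= max_delay
        next_in_entity = i + 1 < len(tokens) and tokens[i + 1][2]
        mid_entity = in_entity and next_in_entity
        if over_budget and not (flexible and mid_entity):
            chunks.append(current)
            current = []
            chunk_start = end_time
    if current:
        chunks.append(current)  # flush whatever remains at end of audio
    return chunks


# Example input: a long spoken amount that outlasts a 2-second budget.
TOKENS: List[Token] = [
    ("revenue", 0.5, False), ("was", 0.8, False),
    ("one", 1.2, True), ("hundred", 1.7, True), ("and", 2.0, True),
    ("eighty", 2.4, True), ("three", 2.8, True),
    ("billion", 3.3, True), ("dollars", 3.8, True),
]

fixed_chunks = chunk_tokens(TOKENS, max_delay=2.0, flexible=False)
flexible_chunks = chunk_tokens(TOKENS, max_delay=2.0, flexible=True)
```

In this toy run the fixed budget splits the amount after "and", so neither chunk contains the whole number, whereas the flexible variant keeps the full amount in a single chunk where it can be formatted as one entity.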
We’re also pleased to announce the introduction of two new languages: Cantonese and Indonesian. Together they are spoken by nearly 300 million people worldwide, and their addition takes our total number of languages to 33. It’s our aim to understand every voice, so we won’t be stopping there.
We’ve also seen huge improvements in accuracy across our existing languages, with Danish, Dutch, Norwegian, Lithuanian, and Turkish seeing notable jumps. Try it for yourself now using our Self-Service Portal.
We hope you get as much as you need out of our latest product release. We’re always open to suggestions and feedback, so please get in touch if you have something to say. Now, we’ll get cracking on the next one.