The Accuracy Platform for Speech Recognition
At Speechmatics, we believe accuracy is everything. It’s a belief also held by the minds behind Atexto, the software company collecting, managing, labeling, and comparing speech, text, and annotation data to improve the accuracy of speech recognition technology.
By accessing labeled speech data, Atexto can personalize machine learning models for voice understanding. They also cater to individuals who want to record and store their voiceprint, label it, and be paid for it.
Self-styled speech data hunter-gatherers, Atexto utilize their Accuracy Platform to enable data teams to manage tasks, projects, and teamwork related to speech recognition data and models for machine learning training. They not only provide business-changing insights related to AI, but also realize that ethical AI is a necessity – helping remove biases with their diverse community, eliminating bad data around race, gender, age, and dialect.
With a similar ethos to Speechmatics, Atexto aim to “make every voice heard” by building fully customized speech models with precision audio collection and annotation. They also aim to decode unstructured text by helping train models to interpret complex text with annotation templates, designed to delve into the contextual nuance of written language. This improves prediction algorithms and chatbot performance among other AI systems.
In early 2022, Atexto held their annual ‘Speech Recognition Battle Royale’. This yearly event showcases their ability to compare ASR (automatic speech recognition) engines, by pulling data from leading speech-to-text companies and comparing the results. This year Speechmatics found themselves up against Amazon Transcribe, Deepgram, Google speech-to-text, Microsoft speech-to-text, Voicebase, and IBM Watson speech-to-text.
The challenge was to test German conversation. Atexto's ASR comparison software, part of the tooling they designed to support speech-to-text projects, lets the user upload a zip file containing a large number of audio files and their respective reference transcription (REF) files. Using proprietary code, normalizations, and integrations with applications, they can extract accurate reports on WER (Word Error Rate) and TER (Punctuation and Capitalization), and measure bias through the allocation of metadata in speech audio.
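For readers unfamiliar with the metric, WER is conventionally the number of word substitutions, deletions, and insertions needed to turn the engine's output into the reference transcript, divided by the number of words in the reference. The sketch below illustrates that standard definition only. Atexto's own code is proprietary, so the function names and the normalization step here (lowercasing and stripping punctuation) are assumptions made for illustration, not their method.

```python
import re


def normalize(text: str) -> list[str]:
    """Assumed normalization: lowercase, drop punctuation, split on whitespace."""
    text = text.lower()
    # \w is Unicode-aware for Python str, so German umlauts and ß are kept.
    text = re.sub(r"[^\w\s']", " ", text)
    return text.split()


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = normalize(reference)
    hyp = normalize(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("Das ist ein Test.", "das ist Test"))
```

With the reference "Das ist ein Test." and the output "das ist Test", one of four reference words is missing, so the script prints 0.25. Because the score depends heavily on how text is normalized before comparison, results from different tools are only comparable when they apply the same normalization.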
We wouldn't want to spoil the surprise of who they deemed to be the best of the seven companies tested (we’ll give you one guess), but you can watch the video. A more in-depth breakdown of the full report shows the conversational data was mined from interviews, talk shows, vlogs, and podcasts on a wide range of topics from sports to politics, and from healthcare to cinema. An additional dictation segment (included in the report but not the Royale) uses a combination of voice messages, lectures, and readings on topics as diverse as education and literature. A further breakdown and analysis of these results will be made publicly available in June 2022.
Read about our technology or sign up for free today.