At Speechmatics we use a wide range of technologies, but at the heart of everything lies our deep expertise in neural networks. Our founder, Tony Robinson, pioneered the application of recurrent neural networks to speech recognition, and many on our team have been using them for years to solve speech-related problems.
Speech Recognition & Time Alignment
Speech recognition, the conversion of speech to text, is one of the hardest challenges to solve because of the complexity of human speech. To tackle it, we have developed new ways of using recurrent neural networks and combined them with the very latest developments from academia and industry. Neural networks are a powerful tool for speech recognition: they push accuracy levels higher, and they can be scaled to fit the application. They can also give new insights into human speech.
Alongside our speech recognition system, we have built a universal time alignment system that can align text and audio with each other in any language in the world.
The traditional way to build a new language for speech recognition is to obtain large amounts of language and acoustic data, build a pronunciation dictionary, and then painstakingly build a system that is continuously refined with language experts. This approach is not only cost-prohibitive but also not scalable for a young company. We have therefore developed a number of unique machine learning approaches that now allow us to learn a new language automatically, without any prior knowledge and with minimal data (tens of hours rather than thousands), in a matter of weeks.
We are a highly innovative, research-led business. Research forms a core part of the company, and we regularly publish and speak at conferences. Our team has over 100 publications and we are regularly credited with breaking new ground.