Research and innovation are embedded in our culture

Whilst many software companies apply technology that has been invented elsewhere, we do things differently. We believe that to truly excel at something, you need world-class expertise at your fingertips.

That’s why Speechmatics has a core team of researchers embedded at the heart of the company, working at the cutting edge of artificial intelligence, neural networks, machine learning and language models. Teams are encouraged to innovate and iterate, applying their knowledge and expertise quickly and skillfully to our automatic speech recognition (ASR) technology. Continual development means that our languages are constantly evolving to provide industry-leading accuracy and performance.

With a culture that does not separate research from development, Speechmatics is constantly looking for opportunities to prototype new capabilities, evolve our language offering and rapidly develop new algorithms to remain at the forefront of ASR development.

If you’re interested in finding out more or getting involved in our research, talk to us.

Research Papers

“Hierarchical Quantized Autoencoders”, Will Williams, Sam Ringer, Tom Ash, John Hughes, David MacLeod, Jamie Dougherty. February 19, 2020.

Speechmatics’ paper was accepted at NeurIPS, one of the most prestigious ML conferences. The paper presents a lossy image compression algorithm based on discrete representation learning, yielding a system that can reconstruct images of high perceptual quality and retain semantically meaningful features despite very high compression rates.

“Discriminative training of RNNLMs with the average word error criterion”, Remi Francis, Tom Ash, Will Williams. November 8, 2020.

In this paper, Speechmatics demonstrates how recurrent neural network language models can be improved by optimizing directly for downstream speech recognition accuracy, rather than taking the usual generative approach, which models the probability of the next word in a sequence.

“A Framework for Speech Recognition Benchmarking”, Franck Dernoncourt, Trung Bui, Walter Chang. Adobe Research. Interspeech 2018.

At Interspeech 2018 in Hyderabad, Speechmatics was referred to as one of the most accurate providers of ASR following evaluations such as the one carried out by Adobe Research. This demonstrates that our continued focus on innovation and our drive for new R&D maintain our position in a growing and increasingly challenging field.

“Scaling Recurrent Neural Network Language Models”, W. Williams, N. Prasad, D. Mrva, T. Ash, A.J. Robinson. ICASSP 2015.

This was the first paper to show that recurrent neural network language models scale to give very significant gains in speech recognition. It describes the most powerful models at the time of publication and some of the special methods needed to train them.

“One billion word benchmark for measuring progress in statistical language modeling”, C. Chelba, T. Mikolov, M. Schuster, Q. Ge, T. Brants, P. Koehn, A.J. Robinson. INTERSPEECH 2014.

This paper, written with Google, presents a standard large benchmark against which progress in language modelling can be measured. Prior to this paper, there was no open, freely available corpus large enough to be representative of modern language modelling tasks.

“Time-first search for large vocabulary speech recognition”, A.J. Robinson and J. Christie. ICASSP, pages 829–832, 1998.

This paper fundamentally changes the main search mechanism in speech recognition, making it both faster and more memory-efficient (see also US patent 5983180).

“Dynamic Error Propagation Networks”, A.J. Robinson. PhD thesis, Cambridge University Engineering Department, February 1989.

This PhD thesis introduces several key concepts of recurrent networks, several novel architectures, the algorithms needed to train them, and applications to speech recognition, coding, and reinforcement learning and game playing.

Want to try our cutting-edge voice-to-text technology?

Book a Demo