Our products are built on decades of research. The main research projects and papers to date are:
- Large scale neural network language models. SMART award 710288. March 2013 – August 2014.
This research project developed the core capability for recurrent neural network training on GPUs. The resulting language models are fast and effective, typically reducing recognition errors by 20%.
- Large Scale Neural Network Acoustic Models. SMART award 710556. October 2014 – March 2016.
Following on from our success in language modelling, we applied the same deep learning techniques to acoustic modelling, achieving even greater improvements in accuracy.
- An ultra-efficient decoder for automatic speech recognition. SMART award 710513. September 2014 – May 2016.
Building on the previous research project, we realised that the traditional way of running speech recognition was no longer well suited to our deep learning approach. We rewrote the decoder from the ground up, achieving an eightfold speed-up and, we believe, one of the most efficient speech recognition systems ever built.
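To make the recurrent-network language modelling above concrete, the sketch below shows the forward pass of a toy recurrent net predicting the next word. The vocabulary, weights, and sizes are illustrative assumptions, not the trained models from the project:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary and sizes -- purely illustrative values.
vocab = ["<s>", "the", "cat", "sat", "</s>"]
V, H = len(vocab), 8  # vocabulary size, hidden-layer size

W_xh = rng.normal(0, 0.1, (H, V))  # input-to-hidden weights
W_hh = rng.normal(0, 0.1, (H, H))  # recurrent (hidden-to-hidden) weights
W_hy = rng.normal(0, 0.1, (V, H))  # hidden-to-output weights

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def step(h, word_id):
    """One recurrence step: fold the current word into the hidden
    state and return a probability distribution over the next word."""
    x = np.eye(V)[word_id]            # one-hot encoding of the input word
    h = np.tanh(W_xh @ x + W_hh @ h)  # hidden state carries the history
    p = softmax(W_hy @ h)             # distribution over the vocabulary
    return h, p

# Feed a short prefix through the network.
h = np.zeros(H)
for w in ["<s>", "the", "cat"]:
    h, p = step(h, vocab.index(w))

print(p.sum())  # a valid probability distribution sums to 1
```

The key property is the recurrent term `W_hh @ h`: unlike an n-gram model, the hidden state summarises the entire word history, which is what allows these models to scale with data and capture long-range context.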
Selected papers from our team
- “Scaling Recurrent Neural Network Language Models”, Will Williams, Niranjani Prasad, David Mrva, Tom Ash, Tony Robinson. ICASSP 2015.
This is the first paper to show that recurrent net language models scale to give very significant gains in speech recognition. It describes the most powerful models to date and some of the special methods needed to train them.
- “One billion word benchmark for measuring progress in statistical language modeling”, Ciprian Chelba, Tomas Mikolov, Mike Schuster, Qi Ge, Thorsten Brants, Phillipp Koehn, Tony Robinson. INTERSPEECH 2014.
This paper with Google presents a standard large benchmark so that progress in language modelling may be measured. Prior to this paper there was no open, freely available corpus that was large enough to be representative for modern language modelling tasks.
- “Connectionist speech recognition of broadcast news”, A. J. Robinson, G. D. Cook, D. P. W. Ellis, E. Fosler-Lussier, S. J. Renals, and D. A. G. Williams. Speech Communication, 37(1), 2002.
This paper provides an overview of the state-of-the-art methods of 2002 for performing speech recognition with neural networks.
- “Recognition, indexing and retrieval of British broadcast news with the THISL system”, Tony Robinson, Dave Abberley, David Kirby, and Steve Renals. Proceedings of the European Conference on Speech Technology, volume 3, pages 1267–1270, September 1999.
Here we show that speech recognition can be used to find information in audio in much the same way that web pages can be found with a search engine.
- “Time-first search for large vocabulary speech recognition”. Tony Robinson and James Christie. ICASSP, pages 829–832, 1998.
Here we fundamentally change the main mechanism in speech recognition to make it both faster and more memory efficient (also US patent 5983180).
- “Forward-backward retraining of recurrent neural networks”. Andrew Senior and Tony Robinson. Advances in Neural Information Processing Systems 8, 1996.
This is the first paper on “end-to-end” training for tasks such as speech recognition.
- “The use of recurrent networks in continuous speech recognition”. Tony Robinson. Automatic Speech and Speaker Recognition: Advanced Topics, chapter 10.
Recurrent nets applied to large vocabulary speech recognition for the first time.
- “An application of recurrent nets to phone probability estimation”. Tony Robinson. IEEE Transactions on Neural Networks, 5(2), March 1994.
Recurrent nets are demonstrated to give the best-performing system on a well-established phoneme recognition task.
- “A recurrent error propagation network speech recognition system”. Tony Robinson and Frank Fallside. Computer Speech and Language, 5(3):259–274, July 1991.
The first application of recurrent nets to speech recognition.
- “Dynamic Error Propagation Networks”. A. J. Robinson. PhD thesis, Cambridge University Engineering Department, February 1989.
This PhD thesis introduces several key concepts of recurrent networks: a number of novel architectures, the algorithms needed to train them, and applications to speech recognition, coding, and reinforcement learning/game playing.