Below you’ll find answers to some commonly asked questions. If you can’t see an answer to your question, get in touch.


Speechmatics supports both audio and video file types, with industry-leading coverage of formats. For a list of supported file formats, please see the Speechmatics product sheets.

A full list of our available languages can be found on the languages page. If a language you require is not listed, please get in touch with us.

Speechmatics is constantly looking for opportunities to prototype new capabilities and evolve our language offering. With a dedicated and agile speech team, we are set up to prototype rapidly, reducing users’ time to market.


Yes. All job submissions must include a supported language, as we don’t currently have a way of auto-detecting the language.

Files must have an audio sampling rate of at least 8 kHz. Beyond that, the speech engine automatically applies the best rates to your audio. This removes the complexity of submitting files and tailors the service to the submitted audio, giving the best possible levels of accuracy.
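If you want to check a file against the 8 kHz minimum before submitting it, a minimal sketch using Python's standard `wave` module might look like the following. It assumes the file is an uncompressed PCM WAV; the function names and the constant are illustrative only and are not part of any Speechmatics API.

```python
import wave

# Minimum sampling rate stated in this FAQ (8 kHz).
MIN_SAMPLE_RATE_HZ = 8_000


def wav_sample_rate(path: str) -> int:
    """Return the sampling rate in Hz of an uncompressed PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def meets_minimum_rate(path: str, minimum_hz: int = MIN_SAMPLE_RATE_HZ) -> bool:
    """Return True if the file's sampling rate is at least minimum_hz."""
    return wav_sample_rate(path) >= minimum_hz
```

Other containers and codecs (MP3, AAC, video files) are not readable with `wave` and would need a different tool to inspect.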

The better the quality of the audio, the better the speech recognition will be. If you can hear what’s being said, then it should work with Speechmatics technology. Some tips for improving the audio you record:

  • Be close to the microphone. Distant voices are often hard to hear.
  • Use the best quality microphone you can. Try a different microphone if you’re having issues.

For more information, take a look at “How can I improve transcription accuracy?”


We do; however, you should get in touch with the team at Speechmatics so we can get an understanding of your use case.

The better the quality of the audio, the better the transcription will be. Things that affect the accuracy of the transcription are:

  • Clarity of the speech
  • Background noise
  • Cross talk or multiple speakers at the same time

We try to build the best speech technology wherever possible, but we know that we cannot be the best at every use case. It is best to give us a try and see if our system works for yours.


Speech recognition is limited by the quality of the audio presented to the system. To get the best out of our system, follow the simple steps below.

You:

  • Speak clearly. However, there is no need to speak slowly
  • Ideally, your voice should flow over the microphone rather than directly into it
  • A literal transcription of people ‘thinking out loud’ is very hard to read, so, if possible, think about what you want to say before you say it
  • If you have time, record some test audio and play it back to check it all sounds okay before you try it live

Your environment:

  • Record in a quiet environment to minimise background noise
  • Try to avoid multiple people speaking at the same time
  • Minimise reverberation. Sound can sometimes bounce off flat walls and muddy the signal

Your technology:

  • Use a good microphone, such as a USB noise-cancelling (or directional) microphone. A head-mounted microphone is preferable.
  • Record at 16 kHz or greater if possible
  • There is no need to compress the audio. If you do, please don’t over-compress: use 96 kbps AAC, 128 kbps MP3, or better.
  • Use two channels if available.


Flexibility is at the heart of Speechmatics, whether you use our technology on-premises, in any public cloud platform, or through Speechmatics’ managed cloud service. Customers can deploy wherever their data is, to meet product and compliance needs.

Offline licensing means that the Speechmatics solution never needs to touch the public internet, which further enhances data security and helps meet data compliance requirements.


Yes. Speechmatics supports speaker separation, so there is no requirement for multi-channel recording.

Any words in the text that are not in the audio provided need to be removed or surrounded by <>; otherwise, accuracy will be reduced and the alignment job may take significantly longer.
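As an illustration of preparing text this way, here is a minimal sketch. It assumes, purely for the example, that your script marks unspoken notes in square brackets such as [applause]; that convention and the function name are hypothetical and are not something Speechmatics prescribes. The function rewrites each bracketed note so that it is surrounded by <> instead.

```python
import re

# Hypothetical convention: unspoken notes appear as [like this] in the script.
_BRACKETED_NOTE = re.compile(r"\[([^\[\]]*)\]")


def mark_unspoken(text: str) -> str:
    """Rewrite [note] as <note> so that unspoken text is surrounded by <>."""
    return _BRACKETED_NOTE.sub(r"<\1>", text)


print(mark_unspoken("Good evening [applause] and welcome."))
```

If your text uses a different marker for unspoken material, the regular expression would need adjusting; removing such passages entirely is the other option described above.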

Yes. Alignment is a separate service. Get in touch to find out more.

Why not try transcribing a media file or live speech using our free demo and see for yourself?
