Speech recognition — when you should use it and when you shouldn’t


Transcribing audio on your own can be a painstaking and time-consuming process. In an age where there’s an app for virtually everything, speech recognition is a fast, convenient and affordable way to convert audio files to text.

Automated Speech Recognition (ASR) has become a ubiquitous presence in our lives. It powers virtual assistants such as Alexa and Google Assistant. Our smartphones and computers have their own native speech recognition software, not to mention the wealth of ASR transcription apps. But believe it or not, ASR has been around in some form or another since the early 1950s.

Today there are more speech recognition tools than ever, providing fast and affordable ways to convert unwieldy audio files into text. Nonetheless, these digital applications come with some caveats. If users are to get the most out of them, they need to ascertain the best time to use them and when an alternative solution is required.

Journalists, bloggers and budding novelists never know when inspiration will strike, and they may not always have a pen and pad to hand or be able to make accurate notes on their smartphone. In these moments, ASR software is a useful way to commit the basic outlines of ideas, or even entire paragraphs, to text. Note that most virtual assistants will only allow you to dictate for around 30 seconds at a time, so you’ll need a dedicated ASR application for longer periods of dictation.

There are even some writers who use ASR solutions to dictate large passages of text to be proofed and reorganised later. While there is a learning curve, some writers find that they prefer to work in this way, especially when their hands aren’t free to type.

However, be wary: even the best speech recognition software struggles with accuracy. In fact, if you consistently hit 80% accuracy (roughly four words in every five transcribed correctly), you’re doing better than most. Fiction writers may struggle in particular with non-traditional character names or the esoteric language of science fiction and fantasy.
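
To put an accuracy figure in context, it is commonly estimated as one minus the word error rate (WER): the number of word substitutions, insertions and deletions needed to turn the ASR output into a correct transcript, divided by the number of words in the correct transcript. The Python sketch below is a minimal illustration of that calculation, assuming you have a hand-corrected reference to compare against. The function names and sample sentences are hypothetical, and real evaluations usually normalise punctuation and spelling first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from the output
                curr[j - 1] + 1,     # extra word in the output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Accuracy as 1 - WER, floored at zero."""
    return max(0.0, 1.0 - word_error_rate(reference, hypothesis))


# Hypothetical example: an invented fantasy name is misheard.
reference = "the wizard zarqeth rode north to the old stone bridge"
hypothesis = "the wizard sark rode north to the old stone bridge"
print(f"{word_accuracy(reference, hypothesis):.0%}")  # prints 90%
```

One wrong word in ten already drops the score to 90%, which shows how quickly unusual names and jargon can erode accuracy over a long recording.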

ASR can be very helpful in the realm of education and academia. Undergraduates no longer need to sit through lectures furiously scribbling down notes and struggling to keep up. They can simply record their lectures and let a speech recognition tool transcribe them for easier access later. Need to recall something from a specific moment in the lecture? You can quickly find it in the transcribed document with the Ctrl+F search function.

Likewise, busy teachers and lecturers can use ASR to make their workflows more efficient by cherry-picking important quotes from lengthy audio files to use in lectures and handouts. You may, however, want to double-check for accuracy before committing to this.

Whether you’re a budding production house or a business trying to improve its video marketing efforts, you should ensure that your online videos always contain closed captions. These make your videos accessible to hearing-impaired viewers and to those watching on the go, in the break room or anywhere else that playing audio isn’t appropriate.

Captions don’t just improve accessibility; they can also give videos an SEO boost. The text in caption files can be crawled and indexed by search engines just like the content of a web page or blog post. This allows video creators to increase the chances of their videos being found online through the strategic use of keywords.

Again, it’s vital that you double-check the accuracy of your caption files if they were generated using speech recognition software. 
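
Because a caption file is plain text, it is easy to open and proofread before publishing. As a rough sketch, the Python below turns reviewed transcript segments into WebVTT, one common caption format. The segment layout (start time, end time, text) and the function names are assumptions made for illustration, not the output of any particular ASR tool.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(segments) -> str:
    """Build WebVTT text from (start_seconds, end_seconds, text) tuples."""
    lines = ["WEBVTT", ""]  # required header followed by a blank line
    for start, end, text in segments:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text.strip())
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


# Hypothetical segments, already checked by a human reviewer.
segments = [
    (0.0, 2.5, "Welcome to our product tour."),
    (2.5, 6.0, "Today we look at three new features."),
]
print(to_webvtt(segments))
```

Keeping the corrected text and the timings together in one file means any fixes made during proofreading carry straight through to the published captions.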


When is the personal touch the best approach?

It’s clear that speech recognition has some exciting applications. However, there are some instances where accuracy is of paramount importance, including use cases where there simply isn’t time to go back and check lengthy transcripts for errors. In these cases, users need guaranteed accuracy and turnaround times tailored to their needs.

This is where a professional transcription service is the only viable alternative to speech recognition software. Accuracy, security and compliance are a greater concern to some industries than others, especially in the GDPR era, and ASR doesn’t necessarily account for that. One major drawback of ASR apps is that users have no real way of knowing who has access to their data, how many copies have been made or in which countries it has been handled. 

While professional transcription services may be more costly than ASR platforms, they are the only way to guarantee the accuracy that is essential for a variety of industries.

Public sector

Those in the public sector make decisions that make a difference in people’s lives on a daily basis. What’s more, robust regulation of public services means that accuracy and security are extremely high priorities for government agencies. Professional transcription services can give users in the public sector peace of mind while also providing flexible pricing to suit the budgets of departments.


Legal and medical

The legal and medical worlds are greatly reliant on accurate transcriptions, especially when their audio files contain a lot of complex jargon and industry-specific terminology. Such language can cause accuracy levels to plummet in speech recognition solutions, but it is no problem for an experienced transcriber.


Human resources

Human resources teams make decisions both large and small that affect the lives of employees on a daily basis. Moreover, much of the data they deal with is highly sensitive and needs to be handled in a fully compliant manner. ISO-accredited transcription services treat data from interviews, grievances, disciplinaries, reviews, appraisals and strategy meetings with strict confidentiality and provide complete accuracy.


Market research

Market research is the engine that drives strategic change in business. The insights gained in focus groups, in-depth interviews and short exit surveys can help to take businesses to the next level, but only if those insights are based on 100% accurate data.

Transcription services can be hugely beneficial for market researchers, providing accuracy and security to help them to generate actionable insights for their clients while also staying on the right side of compliance. 


Should you use speech recognition software?

Speech recognition is a valuable tool that can make workflows more efficient and save you the hassle and frustration that come with transcribing your own audio or video files. That said, Automated Speech Recognition lends itself better to some use cases than others. 

To get the most out of this software, you need to set aside time to proof and review the transcript it produces and confirm that it’s accurate. A professional transcription service, on the other hand, can provide greater accuracy as well as greater assurances when it comes to security and data protection. Do the research to work out which transcription method is best for you.

