Human Transcription Services vs. ASR: Which is Right for You?


If you need a transcript and you have ruled out doing the transcription yourself (smart choice, by the way), there are two main types of transcription to choose from. The first is human transcription services, where a professional transcriber (a transcriptionist in the US) turns your audio or video file into text. The second is automatic speech recognition (ASR), also called automated speech recognition or voice recognition, where a machine converts the speech to text.

Jumping straight to automatic speech recognition (ASR) can be very tempting. After all, artificial intelligence has come on in leaps and bounds, the technology is always improving, and it’s quick and cheap, which is ever so enticing. But human transcription services are still very much in demand, and for good reason. So, before you upload your file, we’ll walk you through the key considerations to make sure you get the right transcription service.

There are two key considerations when it comes to choosing between ASR and human transcription services:

  1. The quality of your audio file and the type of recording
  2. The type of output you need

Human transcription services vs automatic speech recognition

We’ve produced this handy infographic to make choosing between ASR and human transcription services as painless as possible. If you’re intrigued by the sound of ASR, you can find out more in our blog post, “What is speech to text software?”

Now, on to making a decision…

 

Infographic: Transcription Decision Tree

Step 1: Is your recording good quality?

As you might expect, a machine can only do so much: ASR accuracy can drop drastically if your audio has background noise or people speaking over each other. Unless you have crystal-clear audio, human transcription will give you a better output and save you from correcting and editing the file.

If you do have a high-quality recording, there are some other elements to take into account before deciding whether ASR or human transcription services are the way to go.

Step 2: Does your content have multiple speakers?

If you’ve just got one speaker you can jump straight to step 3. 

Usually, if you have multiple speakers, you’ll want to distinguish between them in your transcript. Otherwise, the text can feel jumbled and it’s hard to work out who said what. Imagine paying for a transcript and then having to meticulously go through it yourself to identify the speakers. No, thank you. Avoid that analysis headache by using a human transcription service, which can label the speakers for you. It’s perfect for interviews or market research focus group content.

If you feel like you don’t need the speakers identified, it’s time for step 3.

Step 3: Does the audio contain regional accents?

If you’re anything like us, you’ll love a regional accent. ASR seems a little less keen. It’s not personal, but machines can struggle with the wonderful array of dialects out there. 

Most automatic speech recognition software has been developed using a standard US accent. You know, the generic kind of American accent that you might hear on the television, one that isn’t easy to place in a specific location. This means that many other dialects and accents may see much lower accuracy rates, especially if the participants are talking quickly.

Step 4: What type of transcription do you need?

This all comes down to you: what you need the transcription for, and whether you’ve got time to make the edits needed to get your final output.

ASR will transcribe every single word. And we mean Every. Single. Word. On the surface this might not sound like an issue – after all, that’s what you’re paying for, to have the speech converted to text, right?

But, have you erm, thought like, what all the err, you know, speech, would actually, um look like when it’s it’s written out. 

Probably not exactly what you’re looking for! 

ASR will deliver the full verbatim, but if you don’t have time to go through and edit the transcript, human transcription services can remove stutters, repetitions and filler words for you. 

So, if you want a clean transcript that’s ready for you to use, human transcription services will be the best option.

Step 5: How accurate does it need to be?

Accuracy is what separates a good transcript from a bad one.

If you need 99%+ accuracy, human transcription services will deliver the most accurate transcription. Most services will provide you with a guarantee and your transcript will be proofread, so you can be confident that what you receive will be of a high quality.

If you just need a rough idea of what people are saying and you don’t need high levels of accuracy, ASR is a good budget option. But remember, the accuracy, at best, is likely to be around 85%, and it will drop significantly if the audio quality isn’t great, there are multiple speakers, or participants have strong accents.
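To put those percentages into perspective, here is a rough back-of-the-envelope sketch in Python. It assumes accuracy is counted per word, and the 10,000-word transcript is a made-up example rather than a figure from any real service. The function name is ours, purely for illustration.

```python
def words_to_correct(word_count: int, accuracy_percent: int) -> int:
    """Estimate how many words would need fixing at a given accuracy.

    Assumes accuracy is measured per word, which is a simplification.
    """
    return word_count * (100 - accuracy_percent) // 100


# Hypothetical 10,000-word transcript (roughly an hour-long interview)
transcript_words = 10_000

asr_errors = words_to_correct(transcript_words, 85)    # ASR at its likely best
human_errors = words_to_correct(transcript_words, 99)  # human service at 99%

print(f"ASR at 85%:   about {asr_errors} words to correct")
print(f"Human at 99%: about {human_errors} words to correct")
```

On those assumptions, the gap between 85% and 99% is the difference between fixing around 1,500 words and fixing around 100.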

Wait. What about timescales? Isn’t an automated transcription service quicker?

If you need your transcript in a hurry, ASR can look like the clear winner on the surface. Many services can turn content around almost instantly, a speed that human transcription services simply can’t match, no matter how many transcribers they have or how they split your file. However, as you’ve probably guessed, there is a big but.

You also need to factor in any time spent cleaning and editing your file to get it into a usable format.  And, unfortunately, that can take much longer than you think. 

If you do need real-time captions or instant transcriptions, ASR is the option for you. Professional human transcriptionists are pretty quick, but they can’t compete with ASR. If speed is your priority, ASR can deliver, as long as you’re OK with the lower accuracy levels you’ll likely get back.
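If you like to see things spelled out, the steps in this guide can be sketched as a small Python function. This is only our illustration of the decision process described above, not a tool any transcription service provides, and every name in it (`Recording`, `recommend` and the field names) is made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Recording:
    """Your answers to the questions in this guide (illustrative names)."""
    clear_audio: bool               # Step 1: no background noise or cross-talk
    needs_speaker_labels: bool      # Step 2: multiple speakers to identify
    has_regional_accents: bool      # Step 3
    needs_clean_transcript: bool    # Step 4: fillers and stutters removed
    needs_high_accuracy: bool       # Step 5: roughly 99%+
    needs_real_time: bool = False   # Live captions or instant turnaround


def recommend(rec: Recording) -> str:
    """Return "ASR" or "human", following the steps in order."""
    # Real-time output is something only ASR can deliver
    if rec.needs_real_time:
        return "ASR"
    # Any "human" answer along the way ends the walk-through
    if not rec.clear_audio:
        return "human"
    if rec.needs_speaker_labels:
        return "human"
    if rec.has_regional_accents:
        return "human"
    if rec.needs_clean_transcript:
        return "human"
    if rec.needs_high_accuracy:
        return "human"
    # Clear audio, no special requirements: ASR is the budget option
    return "ASR"


# A clear single-speaker recording where a rough draft is enough
rough_notes = Recording(
    clear_audio=True,
    needs_speaker_labels=False,
    has_regional_accents=False,
    needs_clean_transcript=False,
    needs_high_accuracy=False,
)
print(recommend(rough_notes))
```

In this sketch the real-time check comes first, because no amount of human effort can match instant output. After that, a single answer pointing towards human transcription is enough to settle it.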

Kat Hounsell
