The Ultimate Guide for Businesses

What is Transcription?

Transcription is the process of turning speech into text. Typically this involves taking an audio or video file and producing a written or typed output of the spoken word.

This can be a word-for-word transcript, otherwise known as ‘verbatim’, or a condensed set of notes capturing the key points of what is said.

For video, the transcript can be time-coded to the content to provide closed captions (CC), or translated to produce subtitles.
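
To make the idea of a time-coded transcript concrete, here is a minimal sketch that writes transcript segments in the SubRip (.srt) caption layout: a counter, a start and end time, then the caption text. The function names and the sample segments are made up for illustration, and a real captioning workflow would also need to handle line length and reading speed.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS,mmm, the layout used in .srt files."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments) -> str:
    """Turn (start_seconds, end_seconds, text) tuples into SRT caption blocks."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    # Caption blocks are separated by a blank line.
    return "\n".join(blocks)


# Illustrative segments only; real timings come from the transcript.
example_segments = [
    (0.0, 2.5, "Welcome to the meeting."),
    (2.5, 6.0, "Let's start with last month's figures."),
]
print(to_srt(example_segments))
```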

What are the Benefits of Transcription?

  • Improved accessibility - video and audio content can be very engaging for audiences and is extremely popular. According to HubSpot, video is the most commonly used format in content marketing, having surpassed blogs. However, video and audio content needs to be accessible for the 1 million people in the UK who are deaf or hard of hearing. Adding closed captions to video content makes it more accessible, as does including transcripts of audio content, such as podcasts. In addition, many users view content on their mobile or with the sound off. It’s vital to consider how they will consume your content.
  • Faster turnaround times - video and audio content can be very time-consuming to work with, whether that’s in the editing process, for market research purposes or just reviewing a meeting. Having a transcript (especially one that’s time-stamped) can help you to find specific points in the content much more easily. A transcript transforms your content into something that can be searched.
  • Boosts SEO - video and audio content makes it hard for Google (and other search engines) to know what’s being talked about. Including a transcript along with the content isn’t just necessary for accessibility; it can also help with SEO, ensuring your content can be indexed and found.
  • Helps focus and retention - it’s been demonstrated that including captions on content can help people to focus on the content, improve their comprehension and, in turn, retain more of what’s being said. So, if you’re looking to get a message out there using video or audio, help it to stick with captions.
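
As a rough illustration of how a time-stamped transcript becomes searchable, the sketch below looks up a keyword and returns the time codes where it is mentioned. The transcript lines, function names and the simple case-insensitive match are assumptions made for this example, not a description of any particular tool.

```python
def format_timestamp(seconds: int) -> str:
    """Format whole seconds as HH:MM:SS."""
    minutes, secs = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def find_mentions(transcript, keyword):
    """Return (timestamp, text) for every line that mentions the keyword."""
    needle = keyword.lower()
    return [
        (format_timestamp(start), text)
        for start, text in transcript
        if needle in text.lower()
    ]


# Illustrative transcript: (start time in seconds, what was said).
transcript = [
    (0, "Welcome everyone, thanks for joining."),
    (95, "Let's start with the budget for next quarter."),
    (1810, "Going back to the budget, we need sign-off by Friday."),
]

mentions = find_mentions(transcript, "budget")
for timestamp, text in mentions:
    print(timestamp, text)
```

Instead of scrubbing through half an hour of recording, you can jump straight to the two points where the budget comes up.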

When is transcription needed?

Legal recording - transcripts of audio and video recordings such as interviews and witness statements are used as evidence in investigations and provide a practical way to review what was said. Transcripts are also used to document tribunal, court and disciplinary hearings.

Market Research - analysing video and audio content for themes, keywords and insights can be time-consuming. A transcript provides a quick and easy way to analyse what’s being said. The transcript can be easily scanned and searched as well as being run through additional tools such as text analytics.

HR - HR meetings are often important and sensitive. For meetings such as grievance and disciplinary hearings, interviews, reviews and appraisals, a written record of what was said is extremely beneficial. It helps to keep everyone on the same page, provides a factual record, and can serve as a useful resource in a tribunal hearing.

Education - lectures, lessons and seminars all contribute to learning (both online and offline). Scribbling notes can be a helpful way to absorb information, but it can be hard to keep up and focus on what’s being said. Transcriptions of the spoken word can provide an additional resource for learning.

TV & Film - accessibility is vital, and changing habits such as mobile viewing mean that transcriptions are required to deliver subtitles and closed captions on content.

What’s the difference between transcription and translation?

A transcript is the written form of the spoken word in audio or video content. The transcript will be in the same language that’s being spoken in the content. If you require the text in another language you will need to get the transcript translated.

There can often be some confusion over transcription and translation. It can be helpful to think about translation as a two-step process: transcription first, then translation. The content is transcribed in its original language first, before being translated into the required language. As languages are complex, a translation typically requires someone who is familiar with both languages to ensure the correct meaning is conveyed.

How long does it take to transcribe one hour of audio?

It takes a professional transcriber around 3-4 hours to transcribe one hour’s worth of audio. They will be able to type around 130 words per minute (wpm) to achieve this.

However, if you’re looking to transcribe yourself and you’re not quite at those speeds, it’s likely to take you considerably longer. The average person types at around 41 wpm, which is more than three times slower than a professional! That means in 3 hours you might only be able to transcribe around 15 minutes’ worth of audio, before carrying out any additional checks.
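
The figures above can be checked with some quick arithmetic. The sketch below assumes, as a simplification, that transcription time scales in direct proportion to typing speed; in practice listening, rewinding and checking also take time, so treat the output as a rough estimate only.

```python
PRO_WPM = 130      # professional typing speed quoted above
AVERAGE_WPM = 41   # average typing speed quoted above


def hours_per_audio_hour(typist_wpm: float, pro_hours: float) -> float:
    """Estimate hours needed per hour of audio, scaling the professional's time by typing speed."""
    return pro_hours * PRO_WPM / typist_wpm


def audio_minutes_transcribed(work_hours: float, typist_wpm: float, pro_hours: float) -> float:
    """Estimate how many minutes of audio get transcribed in the given working time."""
    return 60.0 * work_hours / hours_per_audio_hour(typist_wpm, pro_hours)


speed_ratio = PRO_WPM / AVERAGE_WPM
# A professional takes 3-4 hours per audio hour, so work out both ends of the range.
slow_estimate = audio_minutes_transcribed(3, AVERAGE_WPM, pro_hours=4.0)
fast_estimate = audio_minutes_transcribed(3, AVERAGE_WPM, pro_hours=3.0)

print(f"A professional is about {speed_ratio:.1f} times faster")
print(f"In 3 hours: roughly {slow_estimate:.0f} to {fast_estimate:.0f} minutes of audio")
```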

If you use a transcription service you’ll often be given guaranteed delivery times so you know when you can expect your transcript back. You’ll find that faster turnaround times usually come at a premium.

How difficult is it to transcribe audio/video content?

Transcribing content accurately and quickly is a skill. Although many people are capable of transcribing content themselves, they may not be able to complete it in a reasonable timeframe or to a high enough standard for their requirements. It’s a time-consuming process, even if you’re planning to start by editing an automated transcript.

Professional transcribers not only type a high word count per minute, but they are also fantastic listeners and have excellent spelling and grammar. This allows them to produce high-quality transcripts in a fraction of the time it would take the average individual.

The quality and content of the audio can also pose additional challenges in creating a good transcript. 

  • Background noise - can make it difficult to pick out speech and identify words
  • Multiple speakers - people can talk over each other or at different volumes 
  • Accents & dialects - can be challenging to understand if you are unfamiliar with them and may slow down the transcription process
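  • Industry-specific / specialist language - technical terms, for example in medical research, can be hard to recognise and spell if you are unfamiliar with them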

This is why people often turn to the professionals to help them out. Transcription services will typically provide an accuracy guarantee and deliver within an agreed timeframe, taking the stress out of getting a transcript and ensuring you receive a high-quality output.

How does transcription work? How is it done?

When you outsource your transcription to a third party, there may be some differences between services; however, the typical process is:

  • You select the service that’s right for you, the turnaround time required and any additional requirements such as speaker identification.
  • Once you’re happy with a quote you’ll upload the relevant file. High-quality suppliers will provide you with a secure method for uploading your content.
  • Once uploaded, the company will start working on your files. Some services may make your content available to a large number of transcribers so it’s best to check their specific process, particularly if you’re working with confidential, personal or sensitive information.
  • Once the transcription is complete, it will move to the proofreading stage.
  • You’ll be alerted that your document is ready. Depending on the service you may be emailed the document or be sent a link to download it.

Are there different types of transcription?


Human transcription providers allow you to select the options to meet your specific project requirements including transcription output, turnaround time and a range of additional services.

Verbatim transcripts

This is when you receive a full account of what’s being said in your content. This will include a huge amount of detail which you may actually find surplus to requirements. In the transcription world, a verbatim transcript will typically have filler words (such as ‘umm’ and ‘err’), false starts and repetitions removed. This makes the output far easier to digest, as well as being easier to produce. However, if you require this information along with word-for-word detail, human transcription services will provide you with the option to have this included.

Detailed Notes

Detailed Notes is a unique service provided by Take Note which has been developed with you, the end-user, in mind. Detailed Notes captures all the key details word-for-word but summarises the elements you don’t need in full. For example, pleasantries, off-topic chat and your questions can be summarised, providing you with a cleaner transcript that’s easier to use.

Additional Services

Depending on your requirements and content type there are additional services you can select to ensure your final transcript is as useful as possible.

  • Speaker ID - to make content with multiple speakers easier to process
  • Anonymisation - the redaction of personal information, ideal for GDPR compliance
  • Time Stamps - helping you to find specific sections of content quickly
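
To give a feel for what anonymisation involves, here is a minimal sketch that redacts email addresses and UK-style phone numbers from a transcript using pattern matching. The patterns and placeholder labels are simplified assumptions for illustration; they will miss plenty of real-world formats, and details such as names (like ‘Sam’ in the example) still need a human reviewer.

```python
import re

# Simplified, illustrative patterns; not a complete definition of personal data.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
UK_PHONE_PATTERN = re.compile(r"\b0\d{2,4} ?\d{3,4} ?\d{3,4}\b")


def redact(text: str) -> str:
    """Replace email addresses and phone numbers with placeholder labels."""
    text = EMAIL_PATTERN.sub("[EMAIL]", text)
    text = UK_PHONE_PATTERN.sub("[PHONE]", text)
    return text


sample = "Call Sam on 020 7946 0958 or email sam@example.com."
print(redact(sample))
```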

What’s the difference between transcription and automated speech recognition (ASR)?

Turning speech into text falls into two key approaches - human transcription and automatic speech recognition (ASR).

There are some key differences between human transcription and ASR, and deciding which is right for you will depend on the quality of the audio and your requirements for the output.

When choosing between the services you are usually trading off speed, accuracy and cost. As you might expect, ASR solutions are cheaper than human transcription and can be turned around quickly; however, this is at the expense of accuracy. The time saved can often be lost in editing.

Human transcriptions typically come with an accuracy guarantee and a delivery time, allowing you to plan your project and reduce the need for editing.

How do you tell if you’re a good transcriber?

As we’ve mentioned, being able to produce a high-quality transcript in a reasonable time is a skill. Below are some key attributes that good transcribers and note takers possess.

Typing speed and accuracy

Not only do professional transcribers need to be fast typists, but they also need to be accurate. The typical rule of thumb for a transcriber is that you want your typing accuracy to be 92% or higher, which allows for no more than eight mistakes in every 100 words typed.
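
If you want to check your own typing against the 92% rule of thumb, the arithmetic is straightforward. The sketch below is illustrative only: the function names are made up, and it assumes you have already counted your mistakes, for example from a typing test.

```python
def typing_accuracy(words_typed: int, mistakes: int) -> float:
    """Return accuracy as a percentage of words typed without a mistake."""
    if words_typed <= 0:
        raise ValueError("words_typed must be positive")
    return 100.0 * (words_typed - mistakes) / words_typed


def max_mistakes(words_typed: int, target_accuracy_pct: int = 92) -> int:
    """Return the most mistakes you can make and still reach the target accuracy."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return words_typed * (100 - target_accuracy_pct) // 100


print(typing_accuracy(100, 8))  # accuracy with 8 mistakes in 100 words
print(max_mistakes(1500))       # mistake allowance over a 1,500-word passage
```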

Good listener

Transcription projects cover a broad range of requirements, from legal proceedings through to market research focus groups.

Good listening skills are essential in creating an accurate transcript at speed. Content may contain a range of accents and dialects which can be challenging to decipher, particularly if there are multiple speakers or poor audio quality.

Content may also include technical or industry-specific language, for example in medical research. A reference document may be provided in these cases or certain projects may use transcribers with specialist knowledge or experience.

A good grasp of the language you’re transcribing

Accuracy is vital when it comes to high-quality transcripts and therefore excellent spelling and grammar are key.

Languages can be tricky, and many words sound similar but have a different meaning and spelling! For example, in English, ‘their’, ‘there’ and ‘they’re’ all sound the same when spoken but are used in different ways. A standard spell checker won’t pick these up either.

And lastly, to help with the listening aspect, you may also want to invest in a good set of headphones!
