A quick guide to captioning your videos on YouTube and Vimeo

Hosting video online is becoming an important part of doing business. YouTube and Vimeo are two of the best platforms to host on. But the big issue with these platforms is the sheer amount of competition you have for space. It’s crucial you find ways to optimise your videos’ visibility.

To that end, captioning is becoming more and more important to video content. It has the potential to improve your SEO and open you up to a wider audience. So, here is a quick guide on why, and how, to caption your videos for Vimeo and YouTube.

Vimeo vs YouTube – What’s the difference?

YouTube is the world’s second-largest search engine, owned and operated by the largest, Google. With a community of over 1 billion users who watch hundreds of millions of hours of video every day, YouTube is where you go to reach the widest audience. YouTube also tracks something called ‘viewer retention’, which is a fancy way of saying that if your videos keep viewers on YouTube for longer, YouTube is more likely to promote them.

The issue with YouTube is the algorithm. Essentially, YouTube polices itself for copyrighted content using AI programs. These tend to be less than effective: they can flag original works as copyrighted, and the system can be abused by people trying to take down videos they disagree with. This can lead to lots of issues with content on the platform, so it is worth bearing in mind if you are looking to use it.

The other player is Vimeo. Whilst it doesn’t have as many viewers as YouTube, it compensates with good communities, the ability to replace videos without losing tracking data, and an ad-free experience. Vimeo also has a less overbearing algorithm to detect copyrighted music, meaning that you’re less likely to have your video removed than on YouTube. Plus, your video library is stored on your account, with a generous storage limit.

However, the biggest issue with Vimeo is viewer count. Whilst Vimeo is less restrictive with copyright than YouTube, it has a smaller user base. It can also have a slight image problem, as less oversight can mean less savoury content being hosted on the site.

For YouTube, the number of videos uploaded every hour is staggering, so SEO is incredibly important for visibility. For Vimeo, SEO is less important as there is less competing content. However, the trade-off is a smaller audience. You need to consider this when choosing your platform.

Why bother with captioning? 

When it comes to watching content on the internet, especially video content, captioning is surprisingly important if you want to reach the broadest possible audience. YouTube and Vimeo have global audiences, so simply adding captions in a foreign language increases the scope of your video’s appeal.

More importantly, captioning improves your SEO and makes your content more accessible and flexible. A verbatim transcript of your video gives search engines a readable copy of the content. This allows them to pick out its keywords, which improves the video’s visibility in search results.
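If your captions already exist as a file, the readable transcript can be pulled straight out of it. The sketch below assumes the captions are in SRT, a common plain-text caption format in which each caption is a numbered block with a timing line followed by the caption text. The function name `srt_to_transcript` is our own, chosen for illustration, and the code is a minimal example rather than a complete SRT parser.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,200".
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def srt_to_transcript(srt_text: str) -> str:
    """Return only the caption text of an SRT file, joined into one string."""
    spoken = []
    # Caption blocks in an SRT file are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        # Drop the caption number, if present, then the timing line.
        if lines and lines[0].isdigit():
            lines = lines[1:]
        if lines and TIMING.match(lines[0]):
            lines = lines[1:]
        spoken.extend(lines)
    return " ".join(spoken)
```

The resulting text could then be published alongside the video, for example in the description or on the page where the video is embedded.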

As for accessibility, there are some cases when having captioning may be a regulatory requirement, especially in corporate or government contexts. Having a video that caters to the deaf or hard of hearing is vital in ensuring that your video is accessible and friendly to all viewers.

As for flexibility, 85% of Facebook videos are watched without sound. When you’re at work or in public, you may have forgotten your headphones or be in a noisy area, but that shouldn’t stop you from being able to watch a video. Captions are such a useful feature that YouTube builds the ability to auto-generate them into its video player. However, it’s important to note that these auto-generated captions aren’t always accurate.

How to get captions for your video 

So, the next question is: how do you add this vital part of video marketing, so that your video is seen by the widest variety of viewers? And how can you ensure you get an accurate, high-quality transcript?

Well, there are three different ways of transcribing a video. The first is adding the captions yourself. However, this will take the longest and you won’t be able to produce anything else whilst you’re doing it. The second is having a computer do it automatically for you. The third is hiring a transcription service to do it for you.

Automatic Speech Recognition

Automatic Speech Recognition, or ASR, is an AI process in which a computer analyses the audio in your video and generates a transcript. YouTube uses this for its auto-generated captions service, building the functionality into its video player. ASR is useful if you want captions delivered quickly and cheaply. However, as mentioned, its drawback is a lack of accuracy.

ASR still struggles with speaker recognition and background noise, making it poorly suited to videos with more than one speaker. It also can’t generate accurate closed captions: it transcribes speech only, so any background noises or music would be lost. You would also need to go back through the ASR transcript and edit it yourself, so the time you saved using ASR can be cancelled out by the editing needed to make everything accurate to your video.

Human transcription services

With human transcription services, you don’t have this problem. Transcribers excel at picking out dialogue from background noise to create clear and accurate subtitles. What’s more, with this kind of transcription, you can create accurate closed captions that include relevant background noises or music. This makes your captioning both more accurate and more flexible.

Human transcription can also add time codes, speaker identification and notes on laughter or emotion to the captioning, all whilst being more accurate than ASR. Certain transcription services can also offer subtitles in foreign languages and a host of other features. While ASR can also time code audio, it struggles to do so with the accuracy that video content really needs.
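To see how time codes, speaker labels and sound notes fit together in practice, here is a minimal sketch that writes captions in SRT, a plain-text caption format that YouTube and Vimeo both accept for upload at the time of writing. The `Cue` class and the helper names are our own, invented for illustration. Putting the speaker's name before a colon and sound notes in square brackets is a common captioning convention rather than part of the SRT format itself.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float       # seconds from the start of the video
    end: float         # seconds from the start of the video
    text: str          # dialogue, or a sound note such as "[laughter]"
    speaker: str = ""  # optional speaker label


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT time code, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: list[Cue]) -> str:
    """Render cues as SRT: a number, a timing line, then the caption text."""
    blocks = []
    for number, cue in enumerate(cues, start=1):
        line = f"{cue.speaker}: {cue.text}" if cue.speaker else cue.text
        timing = f"{srt_timestamp(cue.start)} --> {srt_timestamp(cue.end)}"
        blocks.append(f"{number}\n{timing}\n{line}")
    # A blank line separates one caption block from the next.
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(to_srt([
        Cue(0.0, 2.5, "Welcome back.", speaker="ANNA"),
        Cue(2.5, 4.0, "[laughter]"),
    ]))
```

Saved with a `.srt` extension, the output is the kind of file you would upload alongside your video. The accuracy of the time codes and text still depends entirely on whoever, or whatever, produced the transcript.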

Human transcription provides greater accuracy and greater flexibility in your closed captioning, whilst ASR provides fast, but less accurate, subtitles. It is then up to you to decide which is better for your video needs.

No matter which you choose, be it YouTube or Vimeo, ASR or human transcription, having captions on your video is important for ensuring that it is seen by the largest audience possible. This means choosing the right platform to host your video for your audience, improving your SEO and ensuring your video is accessible. To that end, transcription services should be an important part of any online video content.

Take Note

Take Note is a UK-based transcription service with world-class customer support alongside the highest standards of security and ethics. We deliver a comprehensive range of transcription services including Audio and Video Transcription, Video Captions and On-Site Note Taking.