June 18, 2023

Why WebVTT is better than SRT



WebVTT (often just called VTT) and SRT are two very similar subtitle formats.

However, WebVTT offers more options than SRT. With WebVTT you can style your subtitles with CSS, for example to set the right font. Furthermore, you can position your subtitles on the screen and add metadata.

Let's first look at SRT:

SRT stands for SubRip Subtitle format. It looks like this:

1  
00:05:00,400 --> 00:05:15,300  
This is an example of a subtitle.  

2  
00:05:16,400 --> 00:05:25,300  
This is an example of a subtitle - 2nd subtitle.  

WebVTT

WebVTT (Web Video Text Tracks) or VTT looks very similar.

WEBVTT  

00:01.000 --> 00:04.500  
- Winters come after Autumn.  

00:05.000 --> 00:10.000  
- Often the weather goes too cold in winter.  
- You should cover yourself with warm clothes.  

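Both WebVTT and SRT put a timing line above the caption text. It holds the start and end time that determine when the subtitle should be displayed. In SRT the timestamp follows the format hour:minute:second,millisecond, with a comma before the milliseconds. WebVTT uses a dot instead of the comma, and the hour part can be left out, as in the example above.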
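To make the timestamp format concrete, here is a minimal Python sketch that turns a timestamp from either format into seconds. The function name `timestamp_to_seconds` is our own invention for illustration, and the pattern only covers the shapes shown in the examples above, so treat it as a starting point rather than a complete parser.

```python
import re

# Accepts "HH:MM:SS,mmm" (SRT) as well as "HH:MM:SS.mmm" and "MM:SS.mmm" (WebVTT).
_TIMESTAMP = re.compile(r"^(?:(\d{2,}):)?(\d{2}):(\d{2})[.,](\d{3})$")


def timestamp_to_seconds(stamp: str) -> float:
    """Convert an SRT or WebVTT timestamp to a number of seconds."""
    match = _TIMESTAMP.match(stamp.strip())
    if match is None:
        raise ValueError(f"not a recognised timestamp: {stamp!r}")
    hours, minutes, seconds, millis = match.groups()
    return (
        int(hours or 0) * 3600  # the hour part is optional in WebVTT
        + int(minutes) * 60
        + int(seconds)
        + int(millis) / 1000
    )


print(timestamp_to_seconds("00:05:00,400"))  # start of the first SRT subtitle
print(timestamp_to_seconds("00:04.500"))     # end of the first WebVTT subtitle
```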

In the WebVTT example above, the subtitle "Winters come after Autumn." is shown from second 1 to second 4.5.

WebVTT allows for more customisation of your caption text. You can choose the font and also the position of your subtitles when overlaid on top of a video.

The downside of these extra features is that they make the format harder for a player to support. Because SRT has very few features, most players support it.

WEBVTT

00:01.000 --> 00:04.500 position:10%,line-left align:left size:35%
- Winters come after Autumn. 

00:05.000 --> 00:10.000 position:10%,line-left align:left size:35%
- Often the weather goes too cold in winter.
- <b>You should cover yourself with warm clothes.</b>

In the above example we used HTML-like tags to make "You should cover yourself with warm clothes." bold. We also left-aligned the caption and set its position and size.
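If you want to read these settings in your own tooling, the timing line can be split into its start time, end time and settings. The Python sketch below assumes the settings are space-separated `name:value` pairs, as in the example above. The function name `parse_cue_timing_line` is hypothetical, and nothing here checks that the values are valid.

```python
from __future__ import annotations


def parse_cue_timing_line(line: str) -> tuple[str, str, dict[str, str]]:
    """Split a WebVTT timing line into start, end and a dict of cue settings."""
    start, _, rest = line.partition("-->")
    parts = rest.split()
    if not parts:
        raise ValueError(f"not a timing line: {line!r}")
    end, raw_settings = parts[0], parts[1:]

    settings: dict[str, str] = {}
    for item in raw_settings:
        # Each setting looks like "name:value"; anything else is skipped.
        name, sep, value = item.partition(":")
        if sep:
            settings[name] = value
    return start.strip(), end, settings


start, end, settings = parse_cue_timing_line(
    "00:01.000 --> 00:04.500 position:10%,line-left align:left size:35%"
)
print(start, end, settings)
```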

It is even possible to break a subtitle down further and time individual words within a cue for more emphasis, e.g.:

Often <00:05.000>the <00:06.000>weather <00:07.000>goes <00:08.000>too <00:09.000>cold <00:10.000>in winter.

That way a new word becomes active every second. How the active words are highlighted depends on the player and its styling.
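Writing these inline timestamps by hand gets tedious, so here is a small Python sketch that spreads the words of a caption evenly over the duration of a cue. Both function names are made up for this post. We also assume that every inline timestamp should fall strictly between the start and end time of the cue, which is how we read the WebVTT specification.

```python
def format_vtt_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp, leaving out the hours when zero."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    if hours:
        return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"
    return f"{minutes:02d}:{secs:02d}.{millis:03d}"


def karaoke_payload(words: list, start: float, end: float) -> str:
    """Give every word after the first an evenly spaced inline timestamp."""
    if not words:
        raise ValueError("need at least one word")
    step = (end - start) / len(words)
    # The first word needs no timestamp: it is active from the cue start.
    pieces = [words[0]]
    for index, word in enumerate(words[1:], start=1):
        pieces.append(f"<{format_vtt_timestamp(start + index * step)}>{word}")
    return " ".join(pieces)


print(karaoke_payload(["Often", "the", "weather", "goes", "too", "cold"], 5, 11))
# Often <00:06.000>the <00:07.000>weather <00:08.000>goes <00:09.000>too <00:10.000>cold
```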

Luckily, it's fairly easy to convert SRT to WebVTT and vice versa.
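To give an idea of how little is involved, here is a rough Python sketch of the SRT to WebVTT direction. It adds the WEBVTT header, drops the numeric counters and swaps the comma in each timestamp for a dot. The function name `srt_to_vtt` is our own, and the sketch assumes a well-formed SRT file. It is not the code behind our conversion tool, and it leaves any formatting tags inside the caption text untouched.

```python
import re

# "00:05:00,400" -> "00:05:00.400"
_SRT_TIME = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert the text of an SRT file to WebVTT (minimal sketch)."""
    text = srt_text.replace("\r\n", "\n").lstrip("\ufeff").strip()
    cues = []
    # Subtitles in an SRT file are separated by blank lines.
    for block in re.split(r"\n\s*\n", text):
        lines = block.split("\n")
        # Drop the numeric counter that precedes each SRT subtitle.
        if lines and lines[0].strip().isdigit():
            lines = lines[1:]
        # Skip anything that does not start with a timing line.
        if not lines or "-->" not in lines[0]:
            continue
        lines[0] = _SRT_TIME.sub(r"\1.\2", lines[0]).strip()
        cues.append("\n".join(line.rstrip() for line in lines))
    return "WEBVTT\n\n" + "\n\n".join(cues) + "\n"


example = """1
00:05:00,400 --> 00:05:15,300
This is an example of a subtitle.

2
00:05:16,400 --> 00:05:25,300
This is an example of a subtitle - 2nd subtitle.
"""
print(srt_to_vtt(example))
```

Going the other way is mostly the reverse, but WebVTT-only features such as cue settings and inline timestamps have no SRT equivalent and would have to be stripped.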

Where SRT was developed by a relatively obscure group, WebVTT is a W3C standard. It is more or less the official caption/subtitle format for HTML5. See Stack Overflow.

If you still want to convert from VTT to SRT, check out our conversion tool. Drop Tom an email if you need help.