KLRU Control Room

What’s the Impact of AI on Subtitling?

Captioning and subtitling are similar but distinct. Rather than dive into the differences, we'll use the word "subtitling" generically in this article to refer to both subtitles and captions.

A wave of change in television subtitling is beginning. You may not have felt the tremors yet, but you will.

Subtitles on television began in the early 1970s. Live subtitling, however, did not begin until 1982, when the National Captioning Institute developed it using court reporters trained to write 225 words per minute on a Stenograph machine. This put subtitles on screen within two to three seconds of a word being spoken, and it's been that way ever since. But…

Stenography, as a profession, has been in decline since 2013. As senior professionals retire, they are not being replaced; younger candidates are passing on the field in favor of other careers. One reason is the job's demanding requirements: candidates must type at least 225–250 words per minute with few mistakes, which requires discipline and focus for many hours at a time. Schools that taught stenography typically had graduation rates around 4% because of those high standards, and a gradual decline in enrollment ultimately forced many of them to close. In the US alone, there are over 5,000 open stenography positions with few candidates to fill them.

At the same time the stenography profession began trending downward, demand for live subtitling was rising, driven by new government mandates in multiple countries, the rapid growth of live and breaking news, 24-hour cable news cycles, and more live sports broadcasts. Live captioners also face competition for their time from corporate events, government briefings, meetings, and increased usage by the legal system for depositions and trials, all of which is creating resource shortages and rising prices for human captioning.

So how do we fill the gap between a declining supply of human subtitlers and rising market demand? Technology; specifically, artificial intelligence (AI). The underlying technology has been around for many years: early voice-recognition experiments date back to the late 1800s and early 1900s. The technology began to improve significantly in 1971 and continued to evolve until it became commercially available in 2014, though early efforts suffered from accuracy problems and other limitations.

With the acquisition of Screen Subtitling Systems, BroadStream began incorporating AI speech engines into an automated solution we call VoCaption. VoCaption delivers live subtitling for your live broadcasts that is more accurate than previous AI implementations and, in many cases, equal to or better than human captioners.

VoCaption can be used in multiple applications including:

  1. Emergency Subtitling – for those occasions when subtitles are expected in a program but, for some technical reason, are missing. VoCaption can help reduce those angry viewer calls because it can be activated in a matter of seconds.
  2. Supplemental Subtitling – your news may provide subtitles from a script and teleprompter, but for weather, sports, traffic, and unscripted field reports, VoCaption can be turned on and off as needed, rather than keeping a human captioner available and "on the clock" for the entire broadcast.

The two biggest benefits you can expect are:

  1. Improvements in accuracy – A frequent comment we hear is "we tried AI several years ago and it wasn't very accurate." That was true then, but the technology has made excellent progress, and accuracy is up significantly, depending on program genre and audio quality. It's also easy to regionalize or localize the technology by using custom dictionaries to import regional or local names, geography, schools, sports teams, and more. These dictionaries substantially increase accuracy and improve pronunciations, and we can help you with that process during commissioning.
  2. Substantial savings – Human subtitlers are expensive, and as the supply of qualified subtitlers continues to decline, you will likely see rates increase. Current estimates range from a low of $60 per hour to a high of $900 per hour, depending on your needs. VoCaption is available when you need it and sleeps when you don't; you only pay for the time you use it, so you will see significant cost savings versus human captioners.
VoCaption 1RU Hardware


VoCaption is available in both hardware and software versions for third-party caption inserters, or as part of OASYS Integrated Playout and Polistream, our leading subtitle inserter. For more information, contact us here and a representative will be happy to answer questions, arrange a demonstration, or provide a quotation.


Broadcast Control Room

OASYS Integrated Playout with Live Captioning and Subtitling

With the acquisition of Screen Subtitling Systems, BroadStream is capitalizing on Screen’s captioning and subtitling expertise with new, value-added integrations that combine OASYS Integrated Playout with Polistream and VoCaption.


VoCaption is BroadStream’s new AI-based, speech-to-text solution that gives broadcasters an accurate, lower-cost live captioning alternative to human captioning. VoCaption is ideal for captioning live news, breaking news, remote field reports, sports, weather, and other events such as church services or local special-interest programs, as well as occasional file-based events.

We include an easy-to-use Custom Dictionary so users can pre-load local and regional words, phrases, and geographical terms, plus unique names, so there’s no waiting for the system to learn; you can teach it and benefit from increased accuracy right from the start, at a fraction of the cost. There is also an Alternate Words Dictionary that can swap out words you don’t want to go to air or words that need to be capitalized, for example. Much of this work can be done during commissioning, so the system is far more accurate from the first air date.
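Conceptually, the alternate-words substitution described above is a lookup pass over the recognizer's transcript before it goes to air. Here is a minimal sketch of that idea in Python; the dictionary entries and function name are illustrative assumptions, not VoCaption's actual implementation:

```python
import re

# Hypothetical alternate-words dictionary: raw recognizer output -> on-air form.
# Entries can fix capitalization, expand local names, or soften words.
ALTERNATE_WORDS = {
    "klru": "KLRU",
    "austin city limits": "Austin City Limits",
    "damn": "darn",
}

def apply_alternate_words(transcript: str) -> str:
    """Replace each dictionary entry with its on-air alternative (case-insensitive)."""
    result = transcript
    for raw, replacement in ALTERNATE_WORDS.items():
        # Word boundaries keep "klru" from matching inside a longer word.
        pattern = r"\b" + re.escape(raw) + r"\b"
        result = re.sub(pattern, replacement, result, flags=re.IGNORECASE)
    return result

print(apply_alternate_words("Tonight on klru, austin city limits returns."))
# Prints: Tonight on KLRU, Austin City Limits returns.
```

In a production system this pass would run on each caption line with only milliseconds of budget, which is why such dictionaries are loaded and compiled during commissioning rather than at air time.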

What about accuracy? Five to ten years ago, AI-based systems were in their infancy and accuracy was poor. But over the last five-plus years we’ve seen accuracy improve dramatically, along with significant increases in machine speed and power that together deliver much greater accuracy with reduced latency, and this trend will continue.


Polistream is a closed captioning encoder available as a 1RU solution or as software integrated into OASYS. Combining Polistream and VoCaption provides live closed captioning and caption encoding within OASYS for a streamlined, all-encompassing playout solution that runs on cost-effective COTS hardware with no special-purpose devices.


OASYS Integrated Playout is a modular playout solution designed to streamline media ingest, preparation, and management, along with playlist creation, switching, channel playout, advanced dynamic graphics, secondary recording, time delay, archiving, and logging, in a solution that is agile, flexible, and fully scalable for broadcast facilities of any size.

VoCaption Diagram

With the addition of Polistream and VoCaption, OASYS can be triggered manually, by pre-scheduled triggers, or in emergency mode to caption on the fly when it detects missing captions in the live signal, protecting you from unwanted fines or angry viewer complaints.
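The emergency-mode behavior described above amounts to a watchdog: if no caption data has been seen on the output for some timeout, switch automatic captioning on; when scheduled captions return, switch it back off. The sketch below illustrates that logic only; the class name, method names, and 5-second timeout are assumptions for illustration, not the OASYS API:

```python
import time

class CaptionWatchdog:
    """Flag emergency captioning when no captions arrive for `timeout_s` seconds."""

    def __init__(self, timeout_s: float = 5.0):
        self.timeout_s = timeout_s
        self.last_caption_time = time.monotonic()
        self.emergency_active = False

    def on_caption_packet(self) -> None:
        # Call whenever caption data is detected in the live signal.
        self.last_caption_time = time.monotonic()
        self.emergency_active = False  # scheduled captions are back

    def poll(self) -> bool:
        # Call periodically; returns True while emergency captioning should run.
        if time.monotonic() - self.last_caption_time > self.timeout_s:
            self.emergency_active = True
        return self.emergency_active

watchdog = CaptionWatchdog(timeout_s=5.0)
print(watchdog.poll())  # False: captions were seen at startup
```

The key design point is that the watchdog only needs to observe the caption stream, not decode the program audio, so it can run continuously at negligible cost and activate the AI captioner within seconds of an outage.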

The combination of OASYS, Polistream, and VoCaption replaces several pieces of hardware with software designed to work together, providing a more efficient system with fewer potential points of failure and much improved automation that can prevent programs from airing without captions or subtitles, saving you significant money versus human captioners.

For more details, click the links above for each solution. If you need specifics, including a quote or demonstration, please visit our Contact Us page and our team will respond with whatever you need.