Automatically Convert Program Audio into Real-Time Live Closed Captioning!
Help Your Videos Reach a Wider Audience.
BroadStream’s VoCaption software uses Automatic Speech Recognition technology to create reliable, accurate video captions that deliver real-time cost savings for broadcasters along with an improved experience for your viewers.
VoCaption is live automated closed captioning software that supports broadcasters in various industries:
- Commercial broadcasting
- Public Access
- Religious Entities
- Educational Centers
- Corporate Broadcasting
VoCaption makes live captioning a “one mouse click” operation that takes the frustration out of hiring human captioners and streamlines your workflow. No matter your industry, VoCaption is real-time captioning software that provides peace of mind.
Live Closed Captioning For Any Emergency Broadcast
VoCaption live captioning software reduces the worry typically associated with producing live closed captioning without reducing accuracy or reliability. Here are a few of VoCaption’s standard features:
- Automatic conversion of audio to text in real time.
- Custom vocabulary tools to improve accuracy.
- Alternate word list / profanity filter.
- Available 24/7 for emergencies or breaking news.
- No scheduling issues.
- No need to pay for a full hour of labor if only a few minutes are needed.
Why Do Live Broadcasts Need To Be Captioned?
All live programming should be captioned, and here’s why.
Live programming is subject to local regulations, laws or guidelines governing accessibility, which can vary by country or community. As an example, in the US, government regulations require cable operators, broadcasters, satellite distributors and other multi-channel video programming distributors to provide closed captioning on most live programming. Failure to follow these regulations can result in government fines or other sanctions.
Beyond these legal requirements, broadcasters also have a moral obligation to make their programming accessible to everyone. Closed captioning:
- Makes programming more accessible to individuals who are deaf or hard of hearing
- Improves the viewing experience and comprehension for those in noisy environments
- Helps viewers whose primary language is different than the language being broadcast
- Clarifies speech from speakers whose accents may be harder for some viewers to understand
- Reinforces critical information during breaking news or emergencies involving weather, public safety or traffic
Next Generation Technology – ASR
Speech-to-text technology or Automatic Speech Recognition (ASR) has improved dramatically over the last few years and delivers accuracy for live closed captioning that is comparable to a live human captioner in many situations and genres.
You can now leverage this technology and reap significant savings and benefits using VoCaption as your primary or back-up captioning solution.
Live Closed Captioning Improves Comprehension and Retention
The modern world is active and filled with noise and distraction. It’s common to see people doing multiple tasks at the same time: commuting, cooking, exercising. People also watch while using a second screen. Often, viewers are in places that are loud and crowded, or in areas where they must respect a level of quiet.
What does this mean for live programming?
- People can’t always listen to the audio while they watch the video.
- Live closed captions allow viewers to follow along with the information being shared.
- Live closed captioning also makes the information being shared easier to retain. Studies show that people retain information better when they hear and read it at the same time.
- It also helps individuals to pick up words that they didn’t understand when spoken, and reinforces the information being shared.
How VoCaption Creates Automated Live Closed Captioning
- VoCaption uses Automatic Speech Recognition to convert the spoken word from program audio into the text you see on the screen.
- Accuracy is enhanced thanks to VoCaption’s Supplemental Dictionary and Alternative Words List. These tools are used to fully localize a system by preloading names, geographical references, or other details appropriate to your region.
- The Alternate Words List also enables VoCaption to check for profanity and provide suitable substitutes where appropriate.
- Format options include EIA 608/708, teletext/OP47, DVB bitmap or teletext subtitles, SCTE-27 and open (burnt-in) subtitles working in conjunction with a suitable caption encoder.
- Output can also be saved to file as SRT or EBU-TT for rebroadcast or archiving purposes. These files can be used to harvest metadata to improve search within a media library and for social media and SEO improvements.
- These files can be edited as needed and corrections made to ensure perfect captions for repeat broadcasts.
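The SRT files mentioned above follow a simple, standardized text layout: a numbered cue, a timecode range, and the caption text. As a rough illustration (this sketch shows the generic SRT format, not VoCaption’s internal code, and the sample captions are invented), a cue list can be serialized like this:

```python
def format_srt_time(seconds: float) -> str:
    """Render a timestamp in SRT's HH:MM:SS,mmm form."""
    total_ms = int(round(seconds * 1000))
    h, rem = divmod(total_ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def make_srt(cues):
    """Serialize cues -- (start_sec, end_sec, text) tuples -- as SRT text."""
    blocks = []
    for i, (start, end, text) in enumerate(cues, 1):
        blocks.append(
            f"{i}\n{format_srt_time(start)} --> {format_srt_time(end)}\n{text}\n"
        )
    return "\n".join(blocks)

print(make_srt([
    (0.0, 2.5, "Good evening, and welcome."),
    (2.5, 5.0, "Tonight's top story:"),
]))
```

Because the format is plain text, the saved files are easy to correct by hand or mine for metadata, which is what makes them useful for repeat broadcasts and search.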
Currently available in 31 languages.
What is VoCaption?
VoCaption creates live, automated captions using Automatic Speech Recognition, which converts the spoken word from program audio into text for on-screen display.
How does it work?
Live studio or program audio is fed into VoCaption. It uses Automatic Speech Recognition to convert the spoken word into text that can be inserted into the video and displayed on screen.
What about accuracy?
Artificial Intelligence has improved dramatically thanks to advancements in computer software and hardware. As a result, accuracy is also much improved. However, accuracy also depends on a number of external factors related to audio quality:
- Background Noise – The presence of music, noise from traffic, crowds, barking dogs and other intrusive sounds can have a direct impact on accuracy.
- Speech Clarity – Some speakers are whisper quiet and others mumble. Some speak very clearly and others have accents that can be heavy and difficult to understand. These are issues that human captioners also face, and they impact accuracy.
- Speaker Interactions – Multiple speakers often talk over one another, making the dialogue hard to understand. Interviewers don’t always allow an interviewee to finish answering one question before asking the next. Sporting events often feature play-by-play and color commentators speaking at the same time, which makes it hard to separate the dialogue into cohesive text for each speaker.
- Vocabulary – Vocabulary plays an important role in accuracy. Jargon, buzzwords, acronyms, foreign words and mixed languages, along with localized names of people, towns and geography, all impact accuracy. To prepare for these situations, VoCaption incorporates a Supplemental Dictionary and an Alternate Words List. These improve accuracy by enabling localization and substituting alternative words for profanity or similar pronunciations.
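To make the Alternate Words List idea concrete, here is a minimal sketch in Python. The table entries and the whole-word, case-insensitive matching rule are illustrative assumptions, not VoCaption’s actual implementation:

```python
import re

# Hypothetical alternate-words table: maps words the speech engine might
# emit to the form that should go to air (substitution, capitalization).
ALTERNATE_WORDS = {
    "damn": "darn",                # profanity substitute
    "main street": "Main Street",  # force proper capitalization
}

def apply_alternate_words(text: str, table: dict) -> str:
    """Replace whole-word matches (case-insensitive) with approved alternates."""
    for pattern, replacement in table.items():
        text = re.sub(rf"\b{re.escape(pattern)}\b", replacement, text,
                      flags=re.IGNORECASE)
    return text

print(apply_alternate_words("Crash on main street, damn traffic.",
                            ALTERNATE_WORDS))
# → "Crash on Main Street, darn traffic."
```

In practice a station would preload such a table with local place names, on-air talent, and any words it never wants broadcast.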
Can VoCaption meet my legal obligations for caption quality?
Achieving 99% accuracy depends on a number of factors, including the quality of the audio, the number of speakers, cross-talk between those speakers, background noise, music, and the use of localized or foreign words. Given those variables, neither machine nor human can maintain 99% accuracy for a full live broadcast.
The FCC rules for US broadcasters do not set a numeric accuracy benchmark. Instead, the FCC’s rules for TV closed captioning ensure that viewers who are deaf and hard of hearing have full access to programming, address captioning quality, and provide guidance to video programming distributors and programmers. The rules apply to all television programming with captions, requiring that captions be:
- Accurate: Captions must match the spoken words in the dialogue and convey background noises and other sounds to the fullest extent possible.
- Synchronous: Captions must coincide with their corresponding spoken words and sounds to the greatest extent possible and must be displayed on the screen at a speed that can be read by viewers.
- Complete: Captions must run from the beginning to the end of the program to the fullest extent possible.
- Properly placed: Captions should not block other important visual content on the screen, overlap one another or run off the edge of the video screen.
Obviously, our goal is to provide the most accurate speech-to-text conversions possible, given the quality of the audio. To that end, we provide two additional ways to improve accuracy:
- Custom Dictionary: A custom dictionary is available to include local, regional and hard to pronounce words, names and geographical places and references, including foreign names and words so you can pre-load these items into the system.
- Alternate Words Dictionary: In addition, we provide an alternate words dictionary that you can populate to swap out words you don’t want going to air, including profanity, to ensure certain words are replaced, capitalized or referenced properly to match the needs of your audience.
- Additionally, the entire vocabulary of the speech engine is updated regularly to ensure continued improvement.
We run lower thirds on most programs.
Can we shift the subtitles to a different location?
That depends on the caption inserter used. Polistream enables captions to be relocated, but some caption encoders may not provide this feature. Call us to discuss.
How is the system controlled?
The system responds to GPI, UDP and HTTP triggers from a traditional button box, keyboard hotkeys, or Secondary Events within OASYS Integrated Playout. When integrated with both Polistream and OASYS, the system can sense the absence of captions and turn on automatically for emergency captioning, then off again when captions return.
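As a rough sketch of what a UDP trigger could look like from an automation script, the snippet below fires a one-shot command. The host address, port number and command strings are assumptions for illustration only, not BroadStream’s documented control protocol:

```python
import socket

# Hypothetical trigger details: the host, port and command strings below
# are illustrative assumptions, not VoCaption's documented protocol.
CAPTION_HOST = "192.168.1.50"
CAPTION_PORT = 9000

def send_caption_trigger(command: str, host: str = CAPTION_HOST,
                         port: int = CAPTION_PORT) -> None:
    """Fire a one-shot UDP trigger, e.g. from a playout automation system."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(command.encode("utf-8"), (host, port))

# Example: turn captioning on for an emergency cut-in, then off again.
# send_caption_trigger("CAPTION_ON")
# send_caption_trigger("CAPTION_OFF")
```

A GPI contact closure or HTTP call would serve the same role; UDP is shown here only because it is the simplest to sketch.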
How many languages are available?
Currently the system provides 31 total languages including Global English. The number of languages will grow as the system matures and expands.
Is the system sold as a product or service?
VoCaption’s software is available as follows:
- In a 1RU rackmount unit to work with 3rd-party subtitle encoders.
- As software only integrated with Polistream and/or OASYS.
- Once the software is installed, usage minutes are purchased in blocks of 25 hours.
What’s the lead time and commissioning process?
Lead times depend on which installation option you choose and the size of the project. Sales Engineering can advise during the pre-sales process.
VoCaption is available as follows:
- In a 1RU rackmount server for standalone operation
- In a 1RU rackmount server with Polistream
- As software only integrated with OASYS Integrated Playout
- As software only for non-broadcast applications like meetings, church services and classroom applications.