Automatically Convert Program Audio into Real-Time Live Closed Captioning!
Help Your Videos Reach a Wider Audience.
BroadStream’s VoCaption software uses Automatic Speech Recognition technology to create reliable, accurate video captions that deliver real-time cost savings for broadcasters along with an improved experience for your viewers.
VoCaption is live automated closed captioning software that supports broadcasters in various industries:
- Commercial broadcasting
- Public Access
- Religious Entities
- Educational Centers
- Corporate Broadcasting
VoCaption makes live captioning a “one mouse click” operation that takes the frustration out of hiring human captioners and streamlines your workflow. No matter your industry, VoCaption is real-time captioning software that provides peace of mind.
Live Closed Captioning For Any Emergency Broadcast
VoCaption live captioning software reduces the worry typically associated with producing live closed captioning without reducing accuracy or reliability. Here are a few of VoCaption’s standard features:
- Automatic conversion of audio to text in real time.
- Custom vocabulary tools to improve accuracy.
- Alternate word list / profanity filter.
- Available 24/7 for emergencies or breaking news.
- No scheduling issues.
- No need to pay for a full hour of labor if only a few minutes are needed.
Why Do Live Broadcasts Need To Be Captioned?
All live programming should be captioned, and here’s why.
Live programming is subject to local regulations, laws or guidelines governing accessibility, which can vary by country or community. As an example, in the US, government regulations require cable operators, broadcasters, satellite distributors and other multi-channel video programming distributors to provide closed captioning on most live programming. Failure to follow these regulations can result in government fines or other sanctions.
Beyond these legal requirements, broadcasters also have a moral obligation to make their programming accessible to everyone. Closed captioning:
- Makes programming more accessible to individuals who are deaf or hard of hearing
- Improves the viewing experience and comprehension for those in noisy environments
- Helps viewers whose primary language is different than the language being broadcast
- Clarifies speech from speakers whose accents may be harder for some viewers to understand
- Reinforces critical information during breaking news or emergencies involving weather, public safety or traffic
Next Generation Technology – ASR
Speech-to-text technology or Automatic Speech Recognition (ASR) has improved dramatically over the last few years and delivers accuracy for live closed captioning that is comparable to a live human captioner in many situations and genres.
You can now leverage this technology and reap significant savings and benefits using VoCaption as your primary or back-up captioning solution.
Live Closed Captioning Improves Comprehension and Retention
The modern world is active and filled with noise and distraction. It’s common to see people doing multiple tasks at the same time: commuting, cooking, exercising. People also watch while using a second screen. Often, viewers are in places that are loud and crowded, or in areas where they must respect a level of quiet.
What does this mean for live programming?
- People can’t always listen to the audio while they watch the video.
- Live closed captions allow viewers to follow along with the information being shared.
- Live closed captioning also makes the information being shared easier to retain. Studies show that people retain information better when they hear and read it at the same time.
- It also helps individuals to pick up words that they didn’t understand when spoken, and reinforces the information being shared.
How VoCaption Creates Automated Live Closed Captioning
- VoCaption uses Automatic Speech Recognition to convert the spoken word from program audio into the text you see on the screen.
- Accuracy is enhanced thanks to VoCaption’s Supplemental Dictionary and Alternative Words List. These tools are used to fully localize a system by preloading names, geographical references, or other details appropriate to your region.
- The Alternate Words List also enables VoCaption to check for profanity and provide suitable substitutes where appropriate.
- Format options include EIA 608/708, teletext/OP47, DVB bitmap or teletext subtitles, SCTE-27 and open (burnt-in) subtitles working in conjunction with a suitable caption encoder.
- Output can also be saved to file as SRT or EBU-TT for rebroadcast or archiving purposes. These files can be used to harvest metadata to improve search within a media library and for social media and SEO improvements.
- These files can be edited as needed and corrections made to ensure perfect captions for repeat broadcasts.
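The SRT files mentioned above follow a simple, standardized text layout: a numbered cue, a timecode range, and the caption text. As a rough illustration (this sketch shows the generic SRT format, not VoCaption’s internal code, and the sample captions are invented), a cue list can be serialized like this:

```python
def format_srt_time(seconds: float) -> str:
    """Render a timestamp in SRT's HH:MM:SS,mmm form."""
    total_ms = int(round(seconds * 1000))
    h, rem = divmod(total_ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def make_srt(cues):
    """Serialize cues -- (start_sec, end_sec, text) tuples -- as SRT text."""
    blocks = []
    for i, (start, end, text) in enumerate(cues, 1):
        blocks.append(
            f"{i}\n{format_srt_time(start)} --> {format_srt_time(end)}\n{text}\n"
        )
    return "\n".join(blocks)

print(make_srt([
    (0.0, 2.5, "Good evening, and welcome."),
    (2.5, 5.0, "Tonight's top story:"),
]))
```

Because the format is plain text, the saved files are easy to correct by hand or mine for metadata, which is what makes them useful for repeat broadcasts and search.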
Currently available in 31 languages.
What is VoCaption?
VoCaption creates live, automated captions using Automatic Speech Recognition, which converts the spoken word from program audio into text for on-screen display.
How does it work?
Live studio or program audio is fed into VoCaption. It uses Automatic Speech Recognition to convert the spoken word into text that can be inserted into the video and displayed on screen.
What about accuracy?
Artificial Intelligence has improved dramatically thanks to advancements in computer software and hardware. As a result, accuracy is also much improved. However, accuracy also depends on a number of external factors related to audio quality:
- Background Noise – The presence of music, noise from traffic, crowds, barking dogs and other intrusive sounds can have a direct impact on accuracy.
- Speech Clarity – Some speakers are whisper quiet and others mumble. Some speak very clearly and others have accents that can be heavy and difficult to understand. These are issues that human captioners also face, and they impact accuracy.
- Speaker Interactions – Multiple speakers often talk over one another, making the dialogue hard to understand. Interviewers don’t always allow an interviewee to finish answering one question before asking the next. Sporting events often feature play-by-play and color commentators speaking at the same time, which makes it hard to separate the dialogue into cohesive text for each speaker.
- Vocabulary – Vocabulary plays an important role in accuracy. Jargon, buzzwords, acronyms, foreign words and mixed languages, along with localized names of people, towns and geography, all impact accuracy. To prepare for these situations, VoCaption incorporates a Supplemental Dictionary and an Alternate Words List. These improve accuracy by enabling localization and substituting alternative words for profanity or similar pronunciations.
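To make the Alternate Words List idea concrete, here is a minimal sketch in Python. The table entries and the whole-word, case-insensitive matching rule are illustrative assumptions, not VoCaption’s actual implementation:

```python
import re

# Hypothetical alternate-words table: maps words the speech engine might
# emit to the form that should go to air (substitution, capitalization).
ALTERNATE_WORDS = {
    "damn": "darn",                # profanity substitute
    "main street": "Main Street",  # force proper capitalization
}

def apply_alternate_words(text: str, table: dict) -> str:
    """Replace whole-word matches (case-insensitive) with approved alternates."""
    for pattern, replacement in table.items():
        text = re.sub(rf"\b{re.escape(pattern)}\b", replacement, text,
                      flags=re.IGNORECASE)
    return text

print(apply_alternate_words("Crash on main street, damn traffic.",
                            ALTERNATE_WORDS))
# → "Crash on Main Street, darn traffic."
```

In practice a station would preload such a table with local place names, on-air talent, and any words it never wants broadcast.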
Can VoCaption meet my legal obligations for caption quality?
Achieving 99% accuracy depends on a number of factors, including the quality of the audio, the number of speakers, cross-talk between those speakers, background noise, music, and the use of localized or foreign words. Given those variables, neither machine nor human can maintain 99% accuracy for a full live broadcast.
The FCC rules for US broadcasters do not set a numeric accuracy benchmark. Instead, the FCC’s rules for TV closed captioning ensure that viewers who are deaf and hard of hearing have full access to programming, address captioning quality, and provide guidance to video programming distributors and programmers. The rules apply to all television programming with captions, requiring that captions be:
- Accurate: Captions must match the spoken words in the dialogue and convey background noises and other sounds to the fullest extent possible.
- Synchronous: Captions must coincide with their corresponding spoken words and sounds to the greatest extent possible and must be displayed on the screen at a speed that can be read by viewers.
- Complete: Captions must run from the beginning to the end of the program to the fullest extent possible.
- Properly placed: Captions should not block other important visual content on the screen, overlap one another or run off the edge of the video screen.
Obviously, our goal is to provide the most accurate speech-to-text conversions possible, given the quality of the audio. To that end, we provide two additional ways to improve accuracy:
- Custom Dictionary: A custom dictionary is available to include local, regional and hard to pronounce words, names and geographical places and references, including foreign names and words so you can pre-load these items into the system.
- Alternate Words Dictionary: In addition, we provide an alternate words dictionary that you can populate to swap out words you don’t want going to air, including profanity, to ensure certain words are replaced, capitalized or referenced properly to match the needs of your audience.
- Additionally, the entire vocabulary of the speech engine is updated regularly to ensure continued improvement.
We run lower thirds on most programs.
Can we shift the subtitles to a different location?
That depends on the caption inserter used. Polistream enables captions to be relocated, but some caption encoders may not provide this feature. Call us to discuss.
How is the system controlled?
The system responds to GPI, UDP and HTTP triggers from a traditional button box, keyboard hotkeys, or Secondary Events within OASYS Integrated Playout. When integrated with both Polistream and OASYS, the system can sense the absence of captions and turn on automatically for emergency captioning, then off again when captions return.
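As a rough sketch of what a UDP trigger could look like from an automation script, the snippet below fires a one-shot command. The host address, port number and command strings are assumptions for illustration only, not BroadStream’s documented control protocol:

```python
import socket

# Hypothetical trigger details: the host, port and command strings below
# are illustrative assumptions, not VoCaption's documented protocol.
CAPTION_HOST = "192.168.1.50"
CAPTION_PORT = 9000

def send_caption_trigger(command: str, host: str = CAPTION_HOST,
                         port: int = CAPTION_PORT) -> None:
    """Fire a one-shot UDP trigger, e.g. from a playout automation system."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(command.encode("utf-8"), (host, port))

# Example: turn captioning on for an emergency cut-in, then off again.
# send_caption_trigger("CAPTION_ON")
# send_caption_trigger("CAPTION_OFF")
```

A GPI contact closure or HTTP call would serve the same role; UDP is shown here only because it is the simplest to sketch.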
How many languages are available?
Currently the system provides 31 total languages including Global English. The number of languages will grow as the system matures and expands.
Is the system sold as a product or service?
VoCaption’s software is available as follows:
- In a 1RU rackmount unit to work with 3rd-party subtitle encoders.
- As software only integrated with Polistream and/or OASYS.
- Once the software is installed, usage minutes are purchased in blocks of 25 hours.
What’s the lead time and commissioning process?
Lead times depend on which installation option you choose and the size of the project. Sales Engineering can advise during the pre-sales process.
VoCaption is available as follows:
- In a 1RU rackmount server for standalone operation
- In a 1RU rackmount server with Polistream
- As software only integrated with OASYS Integrated Playout
- As software only for non-broadcast applications like meetings, church services and classroom applications.