A Brief History of ASR Technology

Did you know that the first ASR technology was developed in 1952?

ASR stands for Automatic Speech Recognition (also called automated speech recognition). This technology uses computers instead of humans to convert speech to text for captions, subtitles, transcripts, and other documentation.

One of the earliest projects that can be considered ASR technology was developed in 1952 by researchers at Bell Laboratories. They called it “Audrey,” and it could recognize only spoken numerical digits. About a decade later, in the early 1960s, IBM engineered a new technology called Shoebox which, unlike Audrey, could recognize arithmetic commands as well as digits.

Later, in the 1970s, researchers began applying a statistical approach called the Hidden Markov Model (HMM) to speech recognition. In brief, an HMM-based recognizer uses probabilities to choose the words most likely to have produced the sounds it hears. Although the early systems were neither very efficient nor very accurate, about 80% of the ASR technology in use today derives from this original model.
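For readers curious what "using probabilities" looks like in practice, here is a minimal Python sketch of Viterbi decoding, a standard way to find the most likely hidden sequence in a Hidden Markov Model. Everything in the toy model is an assumption made up for illustration: the two words, the rough sound labels "wun" and "nyn", and every probability. It is not how Audrey, Shoebox, or any commercial product works.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (best_path, log_prob) for the most likely hidden state sequence."""
    # best[t][s]: log-probability of the best path that ends in state s at step t
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in states}]
    back = [{}]  # back[t][s]: the previous state on that best path
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + math.log(emit_p[s][observations[t]])
            back[t][s] = prev
    # Trace back from the most likely final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]

# Hypothetical toy model: hidden states are spoken digits, observations are
# rough sound labels. All numbers below are invented for illustration.
states = ("one", "nine")
start_p = {"one": 0.5, "nine": 0.5}
trans_p = {"one": {"one": 0.6, "nine": 0.4},
           "nine": {"one": 0.4, "nine": 0.6}}
emit_p = {"one": {"wun": 0.8, "nyn": 0.2},
          "nine": {"wun": 0.3, "nyn": 0.7}}

path, log_prob = viterbi(["wun", "nyn", "nyn"], states, start_p, trans_p, emit_p)
print(path, math.exp(log_prob))
```

With these made-up numbers, the decoder picks "one, nine, nine" as the most likely word sequence for the three sounds. Real recognizers apply the same idea to far larger models built from recorded speech.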

So how did these technologies evolve into the ASR software that we know today?

In the 1970s, various groups began to take speech recognition technology more seriously. The U.S. Department of Defense’s ARPA, for example, began the Speech Understanding Research program, which funded various research projects and led to the creation of new ASR systems. In the 1980s, engineers began taking the Hidden Markov Model seriously, which led to a huge leap forward in the commercial production of more accurate ASR technologies. Instead of trying to get computers to copy the way humans digest language, researchers began using statistical models to allow computers to interpret speech.

This led to very expensive ASR technologies being sold during the 1990s, which thankfully became more accessible and affordable during the technology boom of the 2000s.

Today, ASR technologies continue to improve in accuracy, speed, and affordability. The need for humans to check the accuracy of these technologies is decreasing, and ASR is becoming available across more and more industries. ASR is no longer considered useful only for broadcast TV. Universities, school systems, businesses, houses of worship, and many others are exploring what this technology can do for them.

What began as a technology to recognize numerical digits has developed into a highly advanced system that recognizes hundreds of languages and accents in real time. BroadStream continues to innovate and improve upon ASR products to create systems that are accurate, easy to install and run, and affordable across various industries.

Our VoCaption and SubCaptioner solutions provide real-time live captioning and on-premises, file-based captioning that save time and money compared with using human captioners and increase video accessibility and engagement. To learn more about these solutions, please visit our Captioning & Subtitling page!