- Duration: 1 hr 22 mins
- Publication date: 30 May 2025
Abstract
Forty years ago, when the task of getting machines to transcribe human speech was first investigated, even the largest mainframe computers had a fraction of the power of today's smartphones. As a result, techniques for Automatic Speech Recognition (ASR) were developed that did not rely on massive computing power. Now that computing power is cheap and fast, computers are better than humans at recognising human speech. What computers are still not so good at, however, is understanding speech, because humans unconsciously apply context that is not available to a computer. For this reason, some of the techniques developed in the early days of ASR have become relevant again.
Good as cloud-based ASR services are for searching phone calls for keywords, they fall short of the requirements for full operational call transcription. However, by applying AI and some of the techniques developed at NPL, it is possible to add context to ASR, transforming speech recognition into speech understanding.