Speech recognition systems
Thanks to Paul Tomlinson for supplying the information on this page. Paul is currently Treasurer of the National Cochlear Implant Users Association.
My professional training was as an electrical engineer so when, in the late 1980s, my moderate deafness became progressively more severe, my dream was always of the development of some form of ‘listening box’.
This would have a microphone on one side into which friends and visitors would speak, and a display on my side on which their words would pop up for me to read.
In practice, technology came to my aid in a different way – through my wonderful cochlear implant. But now, all these years later, my dream of a ‘listening box’ is actually very close to reality in the form of speech recognition software that runs on a fairly basic laptop PC or portable device.
Below is a screenshot of me talking into my Google Nexus 7.
Tablets and smartphones
The key change over recent years has been the quite incredible pace of technical development, which has resulted in Android based Tablet computers and Smartphones becoming almost commodity items.
Outfits like Tesco and Amazon are now offering Tablets for around the £120 mark which match those you’d have paid £200+ for not much over a year ago, and whilst the Android models from brands like Google and Samsung are a bit more expensive, they still offer a stunning amount of technology for the price. Similarly, very effective Android based Smartphones can now be bought for about the price of an evening out for two!
The emphasis on Android in the above paragraph is deliberate, because recent versions of the Android operating system include a quite brilliant speech recognition package as a standard feature. Perhaps because typing more than a few words via the on-screen keyboards displayed on Tablets and Smartphones can be a rather tedious business, Android’s standard on-screen keyboard includes a microphone symbol.
If you tap on it the machine goes into speech recognition mode and throws up on the screen its interpretation of what you – or your friends and visitors – are saying, second by second. It really is as simple as that! For example, I recently spoke “How does Android speech recognition work” into the Google search box on my Nexus tablet; the machine recognised what I said absolutely perfectly, and then went straight into search mode on my behalf.
Hearing loss and speech recognition technology
However, from the point of view of the hearing impaired, the obvious use is as a way of listening to what other people are saying to you. There are plenty of basic word processing and note-taking Apps available for free download which will accept input from the Android on-screen keyboard, and thence from its speech recognition option.
In this context the key feature of Android’s speech recognition system is that it makes no attempt at all to model individual voices; it genuinely is a pure speech recognition system.
Hitherto most of the available systems have relied on a degree of training to get the best results for an individual speaker’s voice, but Android doesn’t do this. It records short pieces of the incoming speech and then sends them off to a powerful Cloud computer, which compares the sound pattern with a database of thousands of people speaking thousands of words, and sends back the text which produces the best statistical fit.
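For readers who like to see an idea spelled out, the round trip described above can be sketched in a few lines of Python. This is purely an illustrative toy and not Android's or Google's actual code: the function names `split_into_chunks` and `best_fit`, the chunk size, the candidate phrases and their scores are all assumptions invented for the example, and the Cloud computer's part is stood in for by a ready-made table of candidate texts.

```python
def split_into_chunks(samples, chunk_size):
    """Cut a list of audio samples into short consecutive pieces."""
    return [samples[i:i + chunk_size]
            for i in range(0, len(samples), chunk_size)]


def best_fit(candidates):
    """Return the candidate text with the highest score.

    `candidates` maps each possible transcription to a score that
    says how well it matches the sound (higher means a better fit).
    """
    return max(candidates, key=candidates.get)


# Made-up "audio": ten numbers standing in for microphone samples.
samples = list(range(10))
chunks = split_into_chunks(samples, 4)

# Made-up scores of the kind a remote service might send back
# for one chunk of speech (hypothetical values).
candidates = {
    "recognise speech": 0.91,
    "wreck a nice beach": 0.07,
    "recognised peach": 0.02,
}
text = best_fit(candidates)
print(len(chunks), "chunks; best fit:", text)
```

The point of the sketch is simply that no individual voice is modelled anywhere: each short piece of sound is scored against possible texts, and the best-scoring text wins.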
One consequence of this arrangement is that the device needs to have an internet connection to work to its full potential, but as so many homes these days have an always-on broadband connection which includes a wi-fi facility, this isn’t usually a problem. Having said that, I recently tried speaking into my Google Nexus tablet with wi-fi switched off; it followed my dictation quite accurately, but perceptibly more slowly than it does with wi-fi switched on.
Google Chromebook, iPads and iPhones …
Although Android based systems have the largest share of the market, they don’t have the market to themselves. The Google Chromebook is a notebook size computer which runs Google’s Chrome OS rather than Android itself. Apple products such as the iPad and iPhone include a system called Siri which is mainly designed to allow the user to control the device by voice, but can be used as a form of dictating machine.
There are also various tablets and Smartphones on the market which use the Windows 8 operating system. This includes the latest version of the speech recognition engine that has been available within Windows for several years.
As always the motto has to be “try before you buy”, but if you can get access to a friend’s reasonably modern tablet or Smartphone, do ask them to put it into speech recognition mode and see how things go. Like any phonetically based system it will always make a few mistakes, but the error rate is far better than most of us can ever hope to achieve with lip-reading, and it knocks scribbling notes on pieces of paper into a cocked hat.