Computers that understand you – Voice Recognition
Technology with Integrity
By Tim Torian, Torian Group, Inc.
Speech recognition has been part of the lore of computing since before there were computers. Science fiction has always included the idea that you would just talk to your robot, your house, or your car, and the device would understand you and do what you wanted. As computing power has grown and speech recognition technology has advanced, the dream has come much closer to reality.
If you are not already dictating to your computer, consider the possibility. Many professionals can save time and money by getting their notes down quickly and easily. Doctors, therapists, and many others are required to keep extensive notes on their work. In the past, they dictated into a tape recorder and hired someone to transcribe the dictation. Many now talk directly to their computer instead.
The leading speech recognition products for the PC market are Dragon Naturally Speaking (now owned by ScanSoft) and IBM’s ViaVoice. Mac OS X comes with basic speech recognition that lets you give your computer voice commands, and many cell phones offer voice commands as well. Office XP and Office 2003 have speech recognition built in. Dragon Naturally Speaking is the clear market leader, and has products with special vocabularies for the medical and legal professions. There are also full solution packages that work with Dragon, including portable digital recorders and integration with a paperless office.
Larger companies purchase dedicated speech recognition engines and servers and integrate them with other software to build speech-enabled phone systems, workstations, and websites. These systems are usually powerful enough to support many simultaneous users. Cell phone companies use them to offer voice commands as part of their service.
Speech recognition takes a lot of computing power. As PCs have evolved, the quality of speech recognition has kept pace. Recent versions not only recognize words but also look at the context of the sentence to choose among several similar-sounding words. You should have a Pentium 4 running at 1.5 GHz or better with 512 MB of RAM and Windows XP. For a desktop, get a SoundBlaster Live! or SoundBlaster Audigy sound card. For a laptop, plan on using a USB adapter if the on-board sound system does not meet your needs. For heavy professional dictation, a 2 GHz processor and 512 MB of RAM are the minimum.
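To make the idea of using context concrete, here is a small Python sketch of how a program might choose among similar-sounding words by looking at the neighboring words. It is only an illustration: the word groups and the pair counts are invented for the example, the function names are hypothetical, and commercial products rely on far larger statistical models than this.

```python
# Toy illustration only. The sound-alike groups and pair counts below are
# invented for this example; they do not come from any real product.
SOUND_ALIKE = [
    ("to", "too", "two"),
    ("there", "their"),
]

# Hypothetical counts of how often two words appear side by side.
PAIR_COUNTS = {
    ("want", "to"): 50,
    ("to", "go"): 40,
    ("have", "two"): 25,
    ("two", "dogs"): 30,
    ("over", "there"): 20,
    ("their", "house"): 20,
}


def candidates_for(word):
    """Return the heard word first, followed by any words that sound like it."""
    for group in SOUND_ALIKE:
        if word in group:
            return [word] + [other for other in group if other != word]
    return [word]


def context_score(prev_word, word, next_word):
    """Score a candidate by how often it appears next to its neighbors."""
    before = PAIR_COUNTS.get((prev_word, word), 0)
    after = PAIR_COUNTS.get((word, next_word), 0)
    return before + after


def choose_words(heard):
    """Pick the best-fitting spelling for each word in a list of heard words."""
    chosen = []
    for i, word in enumerate(heard):
        prev_word = chosen[i - 1] if i > 0 else None
        next_word = heard[i + 1] if i + 1 < len(heard) else None
        # max() keeps the first candidate on a tie, so the heard word
        # is left alone when the context gives no evidence either way.
        best = max(
            candidates_for(word),
            key=lambda c: context_score(prev_word, c, next_word),
        )
        chosen.append(best)
    return chosen


if __name__ == "__main__":
    print(" ".join(choose_words("i have to dogs".split())))
    print(" ".join(choose_words("i want two go".split())))
```

With these made-up counts, "have to dogs" is corrected to "have two dogs" because "two" fits its neighbors better than "to" or "too". A word with no useful context is left as it was heard.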
To get accurate results, you need to ‘train’ the software, usually by reading predefined text to the computer for about 30 minutes, so that it can associate the sounds you make with the words. The program learns: it gets better at understanding you as you use it. When you correct the text it generates, it remembers the corrections and makes better choices next time.
The computer can only interpret what it “hears”. The better the microphone and sound card you use, the better the results will be. It is well worth investing in a good microphone if you plan to use this technology. Experiment with the position of the microphone as well, and keep it consistent. You can connect the microphone to a tape recorder or record your voice on the computer to check the clarity of the recorded speech. Of course, speaking clearly also helps. You can also use a personal recorder and transfer the recording to the computer for transcription later.
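If you record a short test on the computer, listening to it is the simplest check. For readers who want a rough number as well, the Python sketch below reads a 16-bit WAV recording and reports how loud it is and whether it is clipping (distorting because the input level is too high). The function names and the thresholds in `advice` are assumptions chosen for illustration, not values recommended by any speech recognition vendor.

```python
import array
import math
import sys
import wave

FULL_SCALE = 32768.0  # size of the 16-bit sample range, used to normalize


def read_samples(path):
    """Read a 16-bit PCM WAV file and return its samples as a list of ints."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit recordings")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is stored little-endian
    return list(samples)


def level_report(samples):
    """Return peak level, average (RMS) level, and the share of clipped samples."""
    if not samples:
        return {"peak": 0.0, "rms": 0.0, "clipped": 0.0}
    peak = max(abs(s) for s in samples) / FULL_SCALE
    rms = math.sqrt(sum(s * s for s in samples) / len(samples)) / FULL_SCALE
    clipped = sum(1 for s in samples if abs(s) >= 32767) / len(samples)
    return {"peak": peak, "rms": rms, "clipped": clipped}


def advice(report):
    """Turn a report into a rough verdict. The thresholds are arbitrary assumptions."""
    if report["clipped"] > 0.001:
        return "clipping"   # lower the input level or move the microphone away
    if report["peak"] < 0.1:
        return "too quiet"  # raise the input level or move the microphone closer
    return "ok"


if __name__ == "__main__" and len(sys.argv) > 1:
    report = level_report(read_samples(sys.argv[1]))
    print(report, advice(report))
```

Saved under a name of your choosing and run with the path to a test recording, it prints the three measurements and a one-word verdict. Treat the verdict as a hint for adjusting the microphone position, not as a measure of recognition accuracy.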
You will always need to proofread the resulting text, although the software is surprisingly accurate if it is set up and trained correctly. Dragon Dictate integrates with most popular word processors.
There is much more to this topic than can fit in a short article. If you are interested in learning more, here are some starting points:
http://www.toriangroup.com/speech - Expanded version of this article, with web links to resources on speech recognition software.
http://www.speechtechmag.com - Speech Technology magazine.
http://www.scansoft.com/naturallyspeaking/ - Dragon Naturally Speaking.
Tim Torian teaches computer networking at the College of the Sequoias, and has owned and managed several businesses. He is president of Torian Group, Inc., which provides a full range of technology consulting services to local businesses, including computer services, networking, and web and custom software development. Torian Group can be reached at (559) 733-1940 or on the web at http://www.toriangroup.com