Speaking to Software
Volume Number: 16
Issue Number: 9
Column Tag: Speech Recognition
Speaking to Your Software
by Erik Sea
Making your application work well with IBM ViaVoice
Enhanced Edition 2.0
What Can I Say?
It's here! Talking to machines and having them respond and react has been the stuff of
science fiction for decades. The promise has been so long in coming that the release of
ViaVoice Millennium for Mac last year seemed to take some people by surprise - many
a passerby at MacWorld San Francisco was astonished by the speed and accuracy of the
system, even in noisy showfloor conditions. Nonetheless, the combination of
computational power and algorithm design has finally produced speech recognition
software for the Mac that permits routine and productive use, especially as fast, new
copper IBM PowerPC chips find their way into more and more Macs.
ViaVoice Millennium, the first release, was a low-end product, providing dictation
into a single application, SpeakPad, and non-customizable transfer scripts. It was good for
basic dictation, with a large, extensible vocabulary, dictation macros, and AppleScript
support. ViaVoice Enhanced builds on this capability, adding new features such as
direct dictation into selected applications and allowing customization of "built-in"
functions through AppleScript.
"Aha!" you say - "Direct dictation into selected applications, but what if I'm not among
the 'selected' few?" Fair enough - IBM can only test and support a few high-profile
programs (although the development team is always interested in testing new software
for compatibility, particularly games). However, the ViaVoice software doesn't
prevent dictation into any application and, in many cases, the Mac OS and ViaVoice
extensions that ship with our software are all you need - your application may already
support dictation and correction without you writing a single line of code!
Probably, though, you should write a line or two of code. This is essential for
maintaining the awe and admiration of your employer, and I know that you really do
want to anyway.
ViaVoice Speech Technology
But, before we write code, let's talk about speech. Or speak about talk, and how the
ViaVoice engine decides what words it thinks you uttered.
Unlike earlier "discrete" speech recognition systems, which ... required ... distinct ...
pauses ... between ... words, ViaVoice works with "continuous" speech, with no
unnatural breaks between words. In consumer products, we're not quite at the stage
where you can have conversations with your computer, or even record or transcribe a
speech or a meeting, but for one person, sitting at a computer, speaking clearly and
providing cues such as punctuation and formatting, recognition is really quite good. In