From Speech Recognition Update #119 (May 2003)
Microsoft releases sample applications built with MS SALT development tools
Vertigo Software created voice-only applications for the retail and financial sectors
Two sample applications based on the Speech
Application Language Tags (SALT) specification were released by Microsoft
in April. The applications are voice-only; that is, they work over
standard telephones without requiring any “multimodal” capabilities. The sample
applications were developed by Vertigo Software Inc., a provider of
software development and consulting services, using Microsoft’s SALT-based
Speech Software Development Kit (SDK) beta 2. Vertigo has developed Web-based
ASP.NET sample applications for Microsoft in the past, according to Scott
Stanfield, Vertigo CEO. The Speech SDK integrates into the Visual Studio .NET
development environment familiar to Web developers.
The sample applications are important in at
least four ways:
1. They provide developers with an example of a complete application that
they can use to understand SALT and speech-application development.
2. The specific applications can serve as templates for rapid development of
similar applications.
3. The sample applications illustrate how pre-existing graphical Web resources
can be re-used,
in particular, the business layer and data layer code. Both sample applications
extend an existing Web sample application, and a voice transaction will
automatically be reflected in the Web application. James Mastan, director of
marketing for Microsoft speech technologies, noted that those specific sample
Web applications have been studied by many developers, and provide a familiar
base to build upon.
4. The applications will give developers insight into the differences between
Web-site Graphical User Interface (GUI) design and Voice User Interface (VUI)
design.
Stanfield said that Vertigo designers were challenged in creating a quality VUI.
“It’s still mostly an art,” he noted. “We did lots of testing, and refined the
design through trial and error.” He said that Vertigo is developing VUI
specialists, reflecting the differences between GUI and VUI designs.
Retail and financial examples
The two sample applications are examples of
retail- and financial-sector applications:
• The speech-enabled ASP.NET Commerce Starter Kit is based on the
existing Web sample application “IBuySpy Store,” a tongue-in-cheek commerce site
for spies (selling items such as water escape vehicles). Users can use speech
recognition to order an item by product number, browse the store catalog, hear
product descriptions, and add products to their shopping cart by voice.
• The speech-enabled Fitch & Mather Stocks (FMStocks) Web
application is an online stock brokerage that allows customers to manage a stock
portfolio by telephone. Users of FMStocks can obtain quotes on stock prices, buy
and sell stock, and review their portfolios.
The applications use a directed-dialog style, but with some flexibility and
sophistication. For example, confidence scores are used to decide whether to
verify a choice.
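The idea of using a confidence score to decide whether to verify a choice can be sketched roughly as follows. This is a minimal illustration in Python, not code from the sample applications: the function name, the two threshold values, and the prompt wording are all assumptions made for the example.

```python
# Hypothetical thresholds; the values used by the sample applications
# are not given in the article.
ACCEPT_THRESHOLD = 0.80  # at or above this, accept without confirming
REJECT_THRESHOLD = 0.40  # below this, ask the caller to repeat


def next_dialog_step(recognized_text: str, confidence: float) -> str:
    """Choose the next dialog step from the recognizer's confidence score."""
    if confidence >= ACCEPT_THRESHOLD:
        # High confidence: take the answer and move on.
        return f"accept: {recognized_text}"
    if confidence >= REJECT_THRESHOLD:
        # Middling confidence: verify the choice with the caller.
        return f"confirm: Did you say {recognized_text}?"
    # Low confidence: do not guess; ask again.
    return "reprompt: Sorry, I didn't catch that."


if __name__ == "__main__":
    for text, score in [("buy", 0.93), ("sell", 0.55), ("quote", 0.20)]:
        print(next_dialog_step(text, score))
```

Verifying only mid-confidence results keeps a directed dialog short when recognition is clearly good, while still guarding against acting on a likely misrecognition.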
Developers can download the sample
applications from Microsoft’s Web site at
www.microsoft.com/speech/techinfo/sampleapplications. The Web site also
includes detailed white papers that contain specifics on the voice interface
design and some of the principles involved; these papers may interest those who
want to look at an early SALT application without having to set up the
development tools.
Xuedong Huang, general manager of Speech
Technologies at Microsoft, summarized: “Providing developers with
speech-enabling best practices gives them immediate access to this
groundbreaking technology, allowing them to produce speech applications that can
help reduce costs, enhance revenues, and improve business agility for their
businesses or their customers’ businesses.”
Voice platform
The sample applications run on a PC with an
attached microphone, using Microsoft’s speech recognition engine and recorded
speech. Microsoft is planning a beta release of its Microsoft SALT-based speech
platform this summer, and that platform will allow calling the sample
applications by phone, Mastan said.
A number of companies are creating SALT-based
solutions (SRU #118, April 2003, p. 1). Microsoft is actively promoting
telephone and multimodal solutions with a series of Webcasts and other
initiatives. It is also offering prizes for the best “speech controls”
(speech components for building applications) in a contest at
www.componentsource.com/SpeechContest.
System using BeVocal hosting at Venetian Hotel, Las Vegas, and elsewhere
On April 7, OnCall Systems, Inc.
announced its speech-based telephone recruiting application and the signing of
The Venetian Hotel in Las Vegas as a customer. The bilingual (English and
Spanish) system is targeted at hourly workers, rather than the salaried
employees targeted by most Web recruiting sites. The system is hosted by
BeVocal for OnCall using Nuance speech recognition, but Udhe
Abluwalia, OnCall president and CEO, said the system is designed for easy
portability.
Abluwalia said that the telephone-based,
hourly-worker recruiting solution is OnCall’s focus. It was conceived to meet a
need, and speech recognition became a requirement only when the company realized
that tasks such as choosing a job type (with over 600 job titles at the
Venetian, for example) could not be handled satisfactorily with a touch-tone
system. Reena Jadhav, OnCall Chief Marketing Officer, said, “Hourly workers,
which make up over 70% of U.S. workers, are much less likely to have access to
the Internet, but they do have telephones.” (See the article on Internet access
in the May issue, p. 17.)
The system is designed to be a continuing
resource for workers and employers. When potential employees call, they are
given a voice mailbox that they can check for messages or job offers or
appointment requests. The system also has outgoing alerts for job offers and
interview appointments. The outgoing calls are interactive: for example, they
verify that the person answering the phone is the intended recipient (using a
user passcode) and let the potential employee choose a time for an
appointment and confirm it on the spot. In addition to providing a tailored
response for specific employers, OnCall plans to establish 1-866-JOBFLASH as a
multiple-opportunity job line, where workers can be offered a choice of
companies and job categories. As workers enter the system, employers can often
call up and find a qualified individual immediately, rather than having to post
a job requirement. Workers can also call in at any time to update their
“resume.” Abluwalia indicated that OnCall’s layered architecture makes the
expansion to a job marketplace an easy technical evolution.
When individuals first call in, they are
interviewed by the speech recognition system. Answers to some questions, such as
availability, job preference, and years of experience, are interpreted by the
speech recognition system and entered into a database for screening. Answers to
other questions are recorded, allowing more lengthy answers, as well as letting
employers hear the communication style of the applicant. Some questions are
job-specific or employer-specific. An employer can set up screening criteria,
and, if a caller meets those criteria, the system can set up an appointment
during the initial call. Some applicants are so surprised by the immediate
response, Jadhav said, that they call to confirm that the appointment is real.
The best applicants go quickly, Jadhav said, so the quick response often gives
employers using OnCall an advantage in getting the best employees. Employers can
review the database of resumes by phone or through a Web interface.
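The screening step described above (structured answers checked against employer-defined criteria, with an appointment offered during the same call) might look roughly like the following. This is a hedged Python sketch, not OnCall's implementation: the class names, field names, and the simple first-open-slot scheduling rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Applicant:
    # Hypothetical fields mirroring the structured interview answers.
    job_preference: str
    years_experience: int
    available_shifts: Set[str]


@dataclass
class ScreeningCriteria:
    # Hypothetical employer-defined criteria for one open job.
    job_title: str
    min_years_experience: int
    required_shift: str


def meets_criteria(applicant: Applicant, criteria: ScreeningCriteria) -> bool:
    """Return True when every employer criterion is satisfied."""
    return (
        applicant.job_preference == criteria.job_title
        and applicant.years_experience >= criteria.min_years_experience
        and criteria.required_shift in applicant.available_shifts
    )


def handle_call(
    applicant: Applicant, criteria: ScreeningCriteria, open_slots: List[str]
) -> Optional[str]:
    """Book the first open interview slot if the caller qualifies."""
    if meets_criteria(applicant, criteria) and open_slots:
        return open_slots.pop(0)  # slot is removed so it cannot be double-booked
    return None  # caller stays in the database for later review
```

For instance, with criteria of `ScreeningCriteria("housekeeper", 2, "night")` and slots `["Tue 10:00", "Wed 14:00"]`, a caller with three years of experience who is available nights would be offered `"Tue 10:00"` before the call ends.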
Dave Newton, vice president of human resources
for the Venetian, said, “Our goal is to have our employment people spend their
time interviewing the most qualified candidates. OnCall enables us to do this by
automatically prescreening and scheduling interviews with qualified candidates
for open jobs. It also allows us to collect resumes for future openings.”
The system has already handled over 25,000
calls from about 4,000 individual applicants in three months, according to
Jadhav. In addition to the Venetian, OnCall hotel customers include Sheraton
Palo Alto, Westin Palo Alto, Marriott San Mateo, and the Hyatt San Jose. With
hotel occupancy rates very low in Silicon Valley lately, Jadhav said, OnCall
turned to the Las Vegas market to get more traffic on its system. (The Venetian
is unveiling a new hotel tower and expanded convention space in June 2003, which
increases its existing requirement for new hires.) Notably, however, even though
some Silicon Valley hotels are not currently hiring, they are still paying a
monthly fee to remain on the system; they save money by not having to manually
process and reply to the resumes. The company is also testing the system with
security companies, including First Alarm Securities.
OnCall offers the system on a service bureau
model with a startup fee and monthly subscription fees. Jadhav said that the
monthly fees were chosen to result in an overall cost of about $100 per hired
employee. She estimates that this compares to other alternatives as follows:
recruiters, $12,500; newspaper, $5,000; job fair, $3,000; campus recruiting,
$2,000; and the Internet, $1,000. Some of the efficiencies of the OnCall service
include avoiding advertising costs, fewer employees handling resume screening
and interviews, a more objective applicant-screening process that reduces job
discrimination claims, decreased hiring time, 24/7 operation (particularly
important for applicants who are currently working), and an employee database
that is more current (because applicants can update it by phone). Jadhav said
that there was also the opportunity to reduce employee turnover by correlating
the screening criteria with the resulting length of employment of the worker.
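To put Jadhav's per-hire estimates side by side, the comparison can be worked out as simple ratios. The short Python sketch below uses only the dollar figures quoted above; the dictionary and function names are made up for the illustration.

```python
# Per-hire cost estimates attributed to Jadhav in the article (US dollars).
COST_PER_HIRE = {
    "OnCall": 100,
    "Internet": 1_000,
    "campus recruiting": 2_000,
    "job fair": 3_000,
    "newspaper": 5_000,
    "recruiters": 12_500,
}


def cost_multiples(costs, baseline="OnCall"):
    """Express each alternative as a multiple of the baseline cost per hire."""
    base = costs[baseline]
    return {name: cost / base for name, cost in costs.items() if name != baseline}


if __name__ == "__main__":
    for name, multiple in cost_multiples(COST_PER_HIRE).items():
        print(f"{name}: {multiple:.0f}x the OnCall estimate")
```

On these estimates, even the cheapest alternative (the Internet) costs ten times as much per hire as the OnCall service.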
Jadhav estimates the available market that the
OnCall service addresses at $11 billion, based on the job turnover of about 110
million hourly workers. Competition in the telephone-automated segment is
limited to a few companies with narrow touch-tone solutions, she said. (For
comparison, the temporary-placement companies have about $60-$120 million in
revenues and online recruiting generates about $1-2 billion, she estimates.)