From Speech Strategy News, May 2008

Yahoo! announces voice search for mobile phones in CTIA keynote

Voice-enabled oneSearch for idle screen gives fewer, but more relevant, results

Marco Boerries, Executive Vice President of Yahoo! Connected Life, gave a keynote address at CTIA Wireless 2008 (see conference notes, p. 5 of the May issue). Yahoo! is treating the mobile phone as a separate opportunity from PC web browsing, and speech recognition is an important part of that vision. During the keynote, the company demonstrated spoken requests for multiple types of search through a single search box, using technology developed by Vlingo (VUI Visions, SSN, March 2008, p. 26, and SSN, September 2007, p. 1), with exclusivity (subject to some exceptions). The speech recognition is network-based and uses the data channel rather than the voice channel, with some client software from Yahoo! on the phone. Vlingo recently announced that the core speech recognition technology it uses was licensed from IBM (p. 7 of the May issue).

Available immediately for select BlackBerry users in the United States, the new voice-enabled Yahoo! oneSearch can be downloaded from http://m.yahoo.com/voice. Over the coming months, the product is expected to support additional devices and become available internationally.

Boerries predicted that by 2010 there will be more than four times as many mobile phones as PCs. With Yahoo! oneSearch 2.0, a new version of its mobile search service, Yahoo! is addressing a number of aspects required to make search a good experience on mobile phones. First, the company is working with service providers to put the search box on the idle screen, so that no navigation is necessary to begin a search and one can simply push a button to speak a request. This idle screen solution is expected to roll out in Q2 2008.
A text request can, of course, also be typed into the search box, and a new feature, Search Assist, provides a form of predictive text: it suggests completions based on the most popular search queries that contain the letters typed, from which the subscriber may select or continue typing. For example, as you type in “Apple,” Search Assist may recommend links such as Apple iPhone, Apple iPod, or Apple stock price. At launch, the text-based Search Assist is available for the iPhone and is expected to become available on additional AJAX-compatible devices over the coming months.

Second, as “oneSearch” suggests, the single search box provides access to multiple types of searches without specifically selecting a context. Yahoo! intends for oneSearch to be “open,” Boerries said, and will work with content providers to make single-point access possible.

Boerries emphasized that giving a long list of web links in response to a search request is not optimal for a small device, perhaps taking an unspoken swing at Google. He said that oneSearch won’t return web links, but will attempt to return “answers.” He said that oneSearch will use the Semantic Web (an evolving concept suggested by Tim Berners-Lee, the Director of the World Wide Web Consortium) to integrate more helpful content. For example, where today's search results for “barbeque restaurants” would include information such as the address and phone number of a local option, open results could also add information from restaurant booking companies displaying available reservations.

The final innovation is the addition of flexible voice search. If one speaks “N-C-Double-A,” for example, oneSearch instantly returns a rich set of results highlighting the latest tournament scores, upcoming game times, and breaking news. Users may switch between speaking and typing at any time, giving consumers an easy way to refine queries.

Boerries summarized the company’s view: “With the launch of Yahoo!
oneSearch in 2007, we revolutionized mobile search by re-creating search specifically for the mobile phone, focusing on answers, not just Web links. In just over a year, we signed 29 partnerships with carriers across the globe, covering more than 600 million consumers under contract. With Yahoo! oneSearch 2.0, we are fundamentally changing the way consumers use the Internet on their mobile phones.”

Voice search

The demonstration of speech recognition for search, in front of a crowded hall, was flawless and even generated scattered applause. Without any prior context, Boerries said, “21”; “21” appeared in the text box, and, within a few seconds, theatres in Las Vegas showing the new movie were displayed. He then said, “British Airways 287”; that text appeared in the text box, and the current status of the next flight was displayed. He then said, “3600 Las Vegas Boulevard”; that text was displayed accurately, and a map showing that location appeared within a few seconds. He then said, “Where is the best place to play craps in Las Vegas?” and advertiser results were displayed. Boerries said that the system becomes more personalized as one uses it. In a later press conference, Boerries said that “the big news of the day was voice input.”

Vlingo

Yahoo!’s deal with Vlingo includes an investment in the company, part of a $20 million round (p. 51), with the specific amount provided by Yahoo! not disclosed. The exclusivity runs both ways, making the companies partners in voice search on mobile phones. There are some exceptions to the exclusivity, however; a service provider may demand, for example, the use of another speech technology provider or web search option.

Dave Grannan, Vlingo’s president and CEO, said, “We have been thrilled by the market interest in Vlingo since our beta launch last year, providing true confirmation of the revolutionary technology we have developed.
We have aggressive expansion plans over the next year that will take Vlingo overseas and across a broad range of mobile phones here in the U.S. We’re eager to leverage Yahoo!’s expertise and reach as we execute against this strategy.”

Vlingo was co-founded by Mike Phillips, the company’s CTO. Phillips started his career as a researcher, first at Carnegie Mellon University and then at the Spoken Language Systems group at MIT, working on speech recognition technology. In 1994, he founded SpeechWorks, which was later acquired by ScanSoft (now renamed Nuance). Nuance has a competitive offering, called Nuance Voice Control, currently offered through Sprint and Rogers Wireless (SSN, September 2007, p. 6).

Vlingo has supplemented the core speech technology with a new approach it calls adaptive Hierarchical Language Models (HLMs), a form of Statistical Language Models (SLMs). One key to HLMs is that they adapt to the user’s habits and to the specific text box in a particular application, increasing accuracy over time. The hurdle to making such an approach work, Phillips said, is using context in such a way that the data isn’t segmented into separate bins. It would not be very effective or robust, for example, to create separate language models for each possible context, even if one could decide in advance what those contexts should be. Statistical methods can create conditional language models without making too many assumptions ahead of time.
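
The point about not segmenting data into separate bins can be illustrated with a toy example. The sketch below is not Vlingo's HLM implementation, which is not described in detail here; it is a minimal unigram model in Python, under the assumption that a per-context estimate is smoothed toward a model shared across all contexts. The class name, the `prior_strength` parameter, and the context labels ("sms", "search", "new_box") are all hypothetical.

```python
from collections import Counter, defaultdict


class SharedBackoffModel:
    """Toy context-conditioned unigram model (illustrative assumption only).

    Each context (e.g. a particular text box) keeps its own word counts,
    but its probabilities are pulled toward a model trained on all data,
    so a context with little or no data still gives usable estimates.
    """

    def __init__(self, prior_strength=5.0):
        # Hypothetical knob: how many "pseudo-observations" the shared
        # model contributes to every context-specific estimate.
        self.prior_strength = prior_strength
        self.global_counts = Counter()
        self.context_counts = defaultdict(Counter)

    def observe(self, context, words):
        """Record words seen in a given context; every word also updates the shared model."""
        for word in words:
            self.global_counts[word] += 1
            self.context_counts[context][word] += 1

    def global_prob(self, word):
        """Add-one smoothed probability under the shared model (one slot reserved for unseen words)."""
        total = sum(self.global_counts.values())
        vocab = len(self.global_counts) + 1
        return (self.global_counts[word] + 1) / (total + vocab)

    def prob(self, context, word):
        """Context-conditioned probability, smoothed toward the shared model."""
        counts = self.context_counts.get(context, Counter())
        n = sum(counts.values())
        prior = self.prior_strength * self.global_prob(word)
        return (counts[word] + prior) / (n + self.prior_strength)


# Usage: two contexts with different habits, plus one never seen before.
model = SharedBackoffModel()
model.observe("sms", ["see", "you", "soon"])
model.observe("search", ["pizza", "near", "me"])
model.observe("search", ["pizza", "delivery"])

p_search = model.prob("search", "pizza")   # boosted by the search-box data
p_sms = model.prob("sms", "pizza")         # lower, but not zero
p_new = model.prob("new_box", "pizza")     # falls back to the shared model
```

In this sketch a context with plenty of data is dominated by its own counts, while a new or sparse context inherits the shared estimate, which is one simple way to condition on context without training an isolated model per bin.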