Internet Speech



May 5, 2000

Digitized Talking Heads Lend Voices To The Web
By Donna Howell

She has green hair and a thin smile, and she heralds a different way to get the Net - via voice. Ananova, the first virtual newscaster, debuted on the Web last month, attracting a lot of media and user interest.

"Many millions of hits" crashed servers around the world, claimed Jonathan Jowitt, senior project director for London-based Ananova Ltd. The 3-D character is fictional Max Headroom's more serious cousin. She was built by the British Press Association, which named its Internet media unit Ananova Ltd. and has put it up for sale. Ananova "reads" news, generating human-sounding speech from text. Visitors to Ananova.com click on her picture to start a video newscast. Eventually, Ananova will be able to listen to and interact with her audience, for example by responding to a spoken request for sports scores, Jowitt says.

Voice is coming to the Net, and not just Ananova's. This summer, some drivers will be able to surf the Web by voice from their dashboards. But already, anyone with a telephone can check their e-mail simply by "asking" for it. This is largely the work of speech synthesis and voice-recognition software, along with the acceptance of a new standard way to build Web pages to be heard aloud.

Within five years, 45 million wireless phone users in North America will access the Web regularly by voice, says The Kelsey Group Inc. of Princeton, N.J., a market analysis firm. "There are a ton of start-ups promising to push Web-based content by voice," said analyst John Dalton of Forrester Research Inc.

Overseas Users
One voice-Net product of Motorola Inc. provides a way phone companies can offer voice and text browsing. With the product, users of Web-enabled cell phones can view or hear Internet content, and any phone system with the product can let people surf the Net by voice. Julie Roth, a division marketing director at Motorola, likes testing it. "I often get in my car and listen to Net content, or ask it to dial a number for me," she said. Motorola customers are starting to use the system overseas. It's not yet available in the U.S.

The company is part of the VoiceXML Forum, an industry group developing VoiceXML, a markup language for voice-enabling Internet sites. It picks up where HTML, the language used in creating almost all Web pages, stops. Users can call and hear Internet sites that are VoiceXML-adapted. AT&T Corp., IBM Corp. and Lucent Technologies Inc. are co-developing VoiceXML, which 130 firms now support.

Some also are working on services that will help users get certain Web content by phone. Motorola is humanizing its voice service with a name. "Mya" is billed as a cyber-generated personal assistant who can read e-mail or transport users around the Web. "She" is represented in Motorola's marketing as a 3-D ultrablond answer to Ananova. Automatonically correct in a silver jumpsuit, she debuted in a TV commercial during the Oscars telecast, saying warmly, "Hiya. I'm Mya." But the Mya service will be just voice, not the 3-D character used to market her. Ananova, though, could start popping up on Web-capable cell phones next year. Her makers are in talks with European mobile phone service providers. Britain auctioned off spectrum licenses for such next-generation video cell phones last week.

"The future is not that far around the corner," said Jowitt of Ananova Ltd. "The problem is fitting her into that tiny (cell-phone screen) display."

The nuts and bolts of a talking, listening Web are voice-recognition and speech-synthesis programs. In development by many companies for many years, the quality finally is getting good enough for commercial use. Voice-recognition programs are so good they're replacing human voices on some call desks.

"Online brokerages and the airlines are saving tens of millions of dollars by moving to automated voice support," said Forrester's Dalton. But turning a page of text into understandable talk has proved more difficult. "Text-to-speech technology is still lousy," said Dalton. Said Jowitt, "It's a funny thing, this text-to-speech. Humans put in little breaks and emphasis to employ more meaning." His team is working on a new "emphasis algorithm" to help Ananova make her points. Ananova is no "10" in voice quality. "She's probably about a 6 or 7," said Jowitt. But like Eliza in "My Fair Lady," Ananova's getting voice lessons.

Other Services
Handlers are tweaking the voice-synthesis program she runs on. It's called RealSpeak, from long-time voice company Lernout & Hauspie Speech Products NV of Ieper, Belgium. Beyond the big splash of Ananova, other voice-browsing services are quietly going online.

"We have a browser that can browse any Web site using any phone," said Emdad Khan, chief executive of privately held InternetSpeech.com in San Jose, Calif. The firm plans to release its NetECHO voice-browsing service in a few weeks. Voice browsing "is going to grow exponentially because more people have telephones than computers," Khan said. He expects mobile users, unwired seniors and the visually impaired to find the service particularly useful. He says many disabled people must have voice access to use the Web.

The visually impaired have used voice recognition and speech synthesis products for years. Also, PC users can buy off-the-shelf software. But the quality of products designed for consumer use has been widely criticized.

Potentially easier voice-interactivity with the Web is coming through these new services that offer the Net by phone. Lucent Technologies unveiled its PhoneBrowser product in March. Like Motorola's system and InternetSpeech.com's service, it lets users have Web pages read back aloud. PhoneBrowser is being tested through users of motor club DriveThere.com, a project of St. Louis-based Influence LLC and Ultradata Systems Inc.

Surfing By Voice
The telephone may be the "ultimate information appliance," says analyst William Meisel, chief executive of TMA Associates of Tarzana, Calif. "If you want to surf the Web the conventional way, you at least have to download a browser and sign up with an Internet service provider." With these voice services, he notes, "you just dial an 800 number."

Behind the Web efforts of companies like InternetSpeech.com, Tellme Networks Inc., BeVocal Inc. and others is voice-command browsing technology from Nuance Communications Inc. The Menlo Park, Calif., firm went public April 13 at 17, and hit 50 last week after it said first-quarter revenue rose 60% from the year-earlier quarter. It's now trading near 40. Another voice-browsing provider is Vocal Point Inc. of San Francisco.

Reading E-Mail
Some other services, like TelSurf Networks Inc., let a passable computer-synthesized voice read back a user's e-mail, but provide a more appealing prerecorded human voice for news, movie listings and other information. Users dial up to check a calendar, send and read e-mail, and get driving directions, news or other items. But they can't access the whole Internet by any stretch - just bits the company has customized for voice delivery.

"We're signing up about a thousand people per week," said Richard O'Dea, product marketing director for the Westlake Village, Calif., firm. Similar projects exclude e-mail because it requires text-to-speech. "We don't believe the technology is at a point that works with the mass consumer market," said Mike McCue, chief executive of Tellme in Mountain View, Calif. "We're trying desperately to use human voice," affirmed Nick Unger, chief executive of Fairfax, Va.-based Audiopoint Inc. Quack.com Inc. of Sunnyvale, Calif., is another information-clipping voice portal, as is BeVocal of Santa Clara, Calif.

Other large and small firms also are introducing ways to use voice to access limited parts of the Internet, as they explore what consumers want. Human voice services are the ones to watch, says Forrester's Dalton, because they're easier to listen to.

"Tellme has a very good interface. The service works. BeVocal is also a strong contender, with deep pockets to do some development," he said. But Lernout & Hauspie Chief Executive Gaston Bastiaens believes there's plenty of room for computer-generated voice.

"Text-to-speech on Internet is going to be big," he said. And it will be personal - one's own voice can be captured and emulated. "The interesting thing is you can create a real virtual personality through speech that can read news or your messages to you," Bastiaens said.

Ananova, for the record, doesn't have a British accent. It's more U.S. mid-Atlantic.