Thursday, March 14

The Infoscope: reality augmented with translation

In a New York Times article today (‘Point, Shoot and Translate Into English’, in the Technology section, 14 March 2002), Anne Eisenberg describes an ingenious new invention. It solves a common problem experienced by travellers: coping with foreign languages.

Dr R. Ismail Haritaoglu, a computer scientist at an IBM research centre in San Jose, California, has devised a cellphone or palmtop containing a colour digital camera that records text in the foreign language and transmits it to a server on the Web. There, software identifies and translates the words, sends English text back and superimposes it on the cellphone or palmtop screen. Cellphones with embedded cameras are already available in some parts of the world.

The device is intended not for translating sizeable chunks of language, but for odd words, phrases and sentences: up to three or four lines of text. Because it presents the real world (a Chinese shop sign, for instance) with virtual information (a translation of the sign) added as an overlay, Dr Haritaoglu’s invention is an example of ‘augmented reality’. The user selects the part of the image that contains the words, just as a PC user would frame a section of a photo to be enlarged in a standard photo-editing program. The image is then compressed and sent to the server via the cellular network. The server carries out image processing, optical character recognition and translation from the source language (Chinese, French, Italian, Spanish or German, so far) to the target language (English).
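For readers who like to see the steps laid out, here is a toy sketch in Python of the pipeline just described: select, compress, send, recognise, translate. It is not IBM's software, and nothing in it comes from the article. The 'image' is simply a grid of characters, so the recognition step is a stand-in rather than real optical character recognition, and the names `Box`, `PHRASES` and `infoscope` are invented for illustration.

```python
import zlib
from dataclasses import dataclass

# Hypothetical phrase table standing in for the server's translation engine.
PHRASES = {"sortie": "exit", "boulangerie": "bakery"}


@dataclass
class Box:
    """The rectangle the user frames on the screen (end indices exclusive)."""
    top: int
    left: int
    bottom: int
    right: int


def crop(image, box):
    """Keep only the rows and columns inside the user's selection."""
    return [row[box.left:box.right] for row in image[box.top:box.bottom]]


def compress(region):
    """Client side: shrink the selection before it goes over the network."""
    return zlib.compress("\n".join(region).encode("utf-8"))


def recognise(payload):
    """Server side: stand-in for image processing and character recognition.

    In this toy the 'pixels' are already characters, so we only decompress
    and tidy up the whitespace.
    """
    text = zlib.decompress(payload).decode("utf-8")
    return " ".join(line.strip() for line in text.splitlines() if line.strip())


def translate(text):
    """Server side: word-by-word lookup; unknown words pass through unchanged."""
    return " ".join(PHRASES.get(word.lower(), word) for word in text.split())


def infoscope(image, box):
    payload = compress(crop(image, box))   # what the handset would send
    return translate(recognise(payload))   # what the server would send back


if __name__ == "__main__":
    sign = [
        "..........",
        "..SORTIE..",
        "..........",
    ]
    print(infoscope(sign, Box(top=1, left=2, bottom=2, right=8)))
```

The point of the sketch is the division of labour: the handset only crops and compresses, while the heavy work happens on the server, which is why a modest cellphone or palmtop can do the job.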

Augmented reality is not limited to translations. If a camera fitted with a Global Positioning System device is pointed at a building, the overlay can consist of a street map of the area where the user is standing. The potential of the technology is vast.

IBM is seeking to develop Dr Haritaoglu’s prototype, called the ‘Infoscope’, with several companies that might want to provide image analysis, including sign translation. Dr Henry Fuchs, a computer science professor and expert on augmented reality at the University of North Carolina, says that such translation devices will become even more useful in five to ten years’ time. He envisions glasses with a miniaturised camera in the side-piece, and the display in the top edge.

Meanwhile, Dr Haritaoglu keeps testing his prototype, Anne Eisenberg writes. Recently, he has been trying it out in Chinatown, San Francisco, deciphering the Chinese characters for shark fin, ginseng and so forth. ‘I was surprised when I pointed the camera at one box labelled in Chinese,’ he is reported to have said. Back came the translation of the words on the price tag: ‘Buy one, get one free.’