Statistical Machine Translation

NYTimes writes:

Statistical machine translation – in which computers essentially learn new languages on their own instead of being “taught” the languages by bilingual human programmers – has taken off. The new technology allows scientists to develop machine translation systems for a wide number of obscure languages at a pace that experts once thought impossible.

Dr. Knight and others said the progress and accuracy of statistical machine translation had recently surpassed that of the traditional machine translation programs used by Web sites like Yahoo and BabelFish. In the past, such programs were able to compile extensive databanks of foreign languages that allowed them to outperform statistics-based systems.

Traditional machine translation relies on painstaking efforts by bilingual programmers to enter the vast wealth of information on vocabulary and syntax that the computer needs to translate one language into another. But in the early 1990’s, a team of researchers at I.B.M. devised another way to do things: feeding a computer an English text and its translation in a different language. The computer then uses statistical analysis to “learn” the second language.
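To make the idea of "learning" from a text and its translation concrete, here is a minimal sketch in Python. It is in the spirit of the simplest word-translation model to come out of that I.B.M. work (commonly called IBM Model 1), but it is an illustration under stated assumptions, not the researchers' actual system: the three-sentence German-English corpus, the function names `train_ibm_model1` and `best_translation`, and the iteration count are all made up for this example. The program is given only paired sentences, never a dictionary, and uses expectation-maximization to estimate how likely each English word is to be the translation of each foreign word.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=20):
    """Estimate word-translation probabilities t[(e, f)] ~ P(e | f) with EM.

    pairs: list of (english_tokens, foreign_tokens) sentence pairs.
    """
    english_vocab = {e for eng, _ in pairs for e in eng}
    # Start with no knowledge: every co-occurring word pair is equally likely.
    t = {
        (e, f): 1.0 / len(english_vocab)
        for eng, foreign in pairs
        for e in eng
        for f in foreign
    }
    for _ in range(iterations):
        count = defaultdict(float)  # expected number of times e pairs with f
        total = defaultdict(float)  # expected number of times f pairs with anything
        # E-step: split each English word's "credit" across the foreign
        # words in the same sentence pair, in proportion to the current t.
        for eng, foreign in pairs:
            for e in eng:
                norm = sum(t[(e, f)] for f in foreign)
                for f in foreign:
                    share = t[(e, f)] / norm
                    count[(e, f)] += share
                    total[f] += share
        # M-step: re-estimate P(e | f) from the expected counts.
        for (e, f), c in count.items():
            t[(e, f)] = c / total[f]
    return t


def best_translation(t, foreign_word):
    """Return the English word with the highest P(e | foreign_word)."""
    candidates = {e: p for (e, f), p in t.items() if f == foreign_word}
    return max(candidates, key=candidates.get)


# Hypothetical toy corpus: (English sentence, German sentence) pairs.
toy_corpus = [
    ("the house".split(), "das haus".split()),
    ("the book".split(), "das buch".split()),
    ("a book".split(), "ein buch".split()),
]

table = train_ibm_model1(toy_corpus)
for word in ("das", "haus", "buch", "ein"):
    print(word, "->", best_translation(table, word))
```

The point of the sketch is that nothing in it encodes vocabulary or grammar by hand. The word pairings emerge only because "das" keeps showing up alongside "the" and "buch" alongside "book". Real systems add word order, phrases, and vastly more text, none of which is modeled here.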

India v China Discussion

HBS Working Knowledge has an interview with the authors of a recent India-China article that created a lot of buzz with its contention that entrepreneurs could help India overtake China. Authors Yasheng Huang of M.I.T. and Tarun Khanna of HBS say:

The different composition of the Chinese and Indian diasporas has to do with the different time periods during which each diaspora settled overseas and the different circumstances under which it did so. The Indian diaspora consists more of professionals; the Chinese consists more of entrepreneurs outside China. The implications of the differential structure of the diasporas is only now being appreciated, at least in the commercial arena. India has been particularly unreceptive, except until very recently, to embracing the diaspora.

We think [economists, policy specialists, and business people] see it as a new and intriguing way to look at a centuries-old comparison. The best endorsement for our article is the way in which it has been disseminated. In India, it has spread by word of mouth and been reprinted in numerous newspapers and magazines. In China, one is hard pressed to find public discussion of the article; though the message is being discussed, we’ve been told, in other, less transparent forums. This is, in some sense, part of the very point of the article!