Last Updated: 07/27/2016

End May Be in Sight to Terrible Translating

BOSTON -- Take this sentence, let translation tools on an Internet search engine work their magic to translate it into Korean and back again, and this is what you get:

"It has this elder brother and boil, if magic they in order to translate it again at a Korean and after one, it makes the translation tool in the internet search engine, this is what you get."

Clearly, computers still can't translate as accurately and artfully as people do.

Many experts doubt they ever will. But recently some researchers have stumbled on what could be a powerful new tool for translators: the World Wide Web.

The web is flooded with translations of everything from novels to corporate documents to personal pages. Some have been translated by people, some by translation software, some by a combination.

For now, translation software programmers generally assemble dictionaries of words and phrases likely to occur in the documents to be translated, along with rules to help figure out an unfamiliar phrase.
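The dictionary-plus-rules approach can be illustrated with a minimal sketch. The phrase table, the longest-match lookup and the bracketing fallback below are all assumptions made for illustration; they are not taken from any real translation product.

```python
# Hypothetical toy English-to-Spanish phrase table; entries are illustrative only.
PHRASE_TABLE = {
    ("heavy", "rain"): "lluvia intensa",
    ("rain",): "lluvia",
    ("expected",): "se espera",
    ("tomorrow",): "mañana",
}
MAX_PHRASE_LEN = max(len(key) for key in PHRASE_TABLE)


def translate(sentence):
    """Translate by greedy longest-match lookup in the phrase table."""
    words = sentence.lower().split()
    output = []
    i = 0
    while i < len(words):
        # Try the longest phrase starting at position i first.
        for n in range(min(MAX_PHRASE_LEN, len(words) - i), 0, -1):
            key = tuple(words[i:i + n])
            if key in PHRASE_TABLE:
                output.append(PHRASE_TABLE[key])
                i += n
                break
        else:
            # Assumed fallback rule: flag an unknown word instead of guessing.
            output.append("[" + words[i] + "]")
            i += 1
    return " ".join(output)


print(translate("Heavy rain expected tomorrow"))
print(translate("rain and snow"))
```

The sketch handles a weather-forecast phrase it was built for, but it leaves word order untouched and simply brackets anything outside its dictionary, which hints at why such systems struggle with less predictable text.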

That approach works well enough for texts with recurring vocabulary and style, such as weather forecasts or owner's manuals. But it's not a tool to be used on marketing literature or contracts, where dimensions, dates, local currencies, laws and proper nouns are too complex for it to handle.

In short, computers have little common sense. Even a child can tell from the context of a sentence whether the "bank" is a place to borrow money or to fish, but that still largely baffles machines.

By surveying millions of translated pages, however, a computer could deduce that "bank" usually means financial institution when the word "account" also is used.

To deduce such rules, a computer needs millions of examples, laid out in perfectly aligned, translated text.
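A toy sketch of how such a rule might be deduced from aligned text follows. The four sentence pairs, the two Spanish senses and the simple voting scheme are assumptions chosen for illustration; a real system would need the millions of examples described above.

```python
from collections import Counter, defaultdict

# Hypothetical aligned English/Spanish sentence pairs (illustrative only).
ALIGNED_PAIRS = [
    ("i opened an account at the bank", "abrí una cuenta en el banco"),
    ("the bank closed my account", "el banco cerró mi cuenta"),
    ("we fished from the river bank", "pescamos desde la orilla del río"),
    ("the bank of the river flooded", "la orilla del río se inundó"),
]
# Assumed candidate translations of "bank": financial institution vs. riverside.
SENSES = ("banco", "orilla")


def learn_context_counts(pairs, target="bank"):
    """Count which translation of `target` appears alongside each context word."""
    counts = defaultdict(Counter)
    for english, spanish in pairs:
        en_words = set(english.split())
        es_words = set(spanish.split())
        if target not in en_words:
            continue
        sense = next((s for s in SENSES if s in es_words), None)
        if sense is None:
            continue
        for word in en_words - {target}:
            counts[word][sense] += 1
    return counts


def guess_sense(sentence, counts):
    """Pick the sense that the sentence's context words vote for most often."""
    votes = Counter()
    for word in sentence.lower().split():
        votes.update(counts.get(word, {}))
    return votes.most_common(1)[0][0] if votes else None


counts = learn_context_counts(ALIGNED_PAIRS)
print(guess_sense("my account at the bank", counts))
print(guess_sense("fishing on the river bank", counts))
```

The counts only make sense if each English sentence is paired with its true translation, which is why the alignment of the text matters so much.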

The emerging idea faces considerable obstacles, said Philip Resnik, a University of Maryland professor doing research in the field.

How can a computer identify two documents that are translations of each other? How does it ensure the texts line up perfectly, so the computer is comparing the correct sentences?

And what about incorrect translations? Mistakes on the web could be reinforced and perpetuated.

Resnik, however, insists that over the breadth of the web the important patterns will emerge. He said he has developed a program that searches the web and is more than 90 percent effective at spotting documents that are translations of each other.
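The article does not describe how Resnik's program works, so the following is not his method. It is only a minimal sketch of one simple heuristic for proposing candidate translation pairs: treating pages on the same site whose addresses differ only in a language code as likely translations of each other. The language-code list, the example.org URLs and the function names are all hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Assumed set of language codes to look for in URL paths.
LANG_CODES = {"en", "fr", "es", "de", "ko"}


def language_neutral_key(url):
    """Return (site, path with the language code masked) and the code found."""
    parts = urlsplit(url)
    segments = parts.path.split("/")
    lang = None
    for i, segment in enumerate(segments):
        if segment in LANG_CODES:
            lang = segment
            segments[i] = "*"
            break
    return (parts.netloc, "/".join(segments)), lang


def candidate_pairs(urls):
    """Group URLs that match once the language code is masked, then pair them."""
    groups = defaultdict(dict)
    for url in urls:
        key, lang = language_neutral_key(url)
        if lang is not None:
            groups[key][lang] = url
    pairs = []
    for by_lang in groups.values():
        langs = sorted(by_lang)
        for i, first in enumerate(langs):
            for second in langs[i + 1:]:
                pairs.append((by_lang[first], by_lang[second]))
    return pairs


urls = [
    "http://example.org/en/about.html",
    "http://example.org/fr/about.html",
    "http://example.org/en/news.html",
    "http://example.org/contact.html",
]
for pair in candidate_pairs(urls):
    print(pair)
```

A heuristic like this only proposes candidates. It says nothing about whether the two pages really are translations or whether their sentences line up, which are the harder questions raised above.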

Resnik acknowledges that rules-based programs, such as SYSTRAN's, are the best for now. But mining translations from the web has enormous advantages: less work, more languages, and the ability to keep up with changes in usage.

"Whatever the flaws of these systems, the quality may not be very good. But by God they will give you some translation of any sentence you give them," Resnik said.