Here are some links I’ve found in a morning of web-wide wanderings.
Machine Translation is beginning to develop phrase-based systems, supposedly to help eliminate ambiguity.
What’s the connection? Both seem to me to open the doors for some pretty scary stuff. The CIA monitoring blog content is one thing, but with the rising right wing’s neglect of things like, say, civil rights and freedoms, I think the possibility of Chinese-style censorship is not necessarily a thing only of SF… not in the mid-range future, anyway. Of course, it needn’t be full-on suppression, just creative approaches to reducing the visibility and searchability of privately produced content that doesn’t agree with the oligarchic agenda.
And what about this translation stuff? For years I’ve heard a lot of chatter about how instantaneous Machine Translation is the holy grail of translation technology. I don’t buy it! The problem with instantaneous translation by machine is that it cannot clarify. Watching the translators at the film festival this week, translating questions by audience members into languages that directors speak, and translating answers into Korean (and sometimes into English), I was struck by the amount of clarification that the translators needed. They checked all kinds of things, from specific words to the intended meanings of words to the scope of those meanings. Sometimes they needed something rephrased in order to be sure of how to translate it, and sometimes they seemed unsure of the connection between two parts of an answer and needed to make sure they understood before conveying the intended meaning of a statement.
This shows us something absolutely crucial to the issue of translation: translation involves the transmission of meanings. The translator takes a set of concepts that have been rendered transmissible within one set of arbitrary signs (such as English words) and melts them down to their signified meanings; thereafter, the translation is a re-casting of the same concepts in another set of arbitrary signs (such as Korean words). The fact that the translator’s understanding of the ideas conveyed by the content is so crucial suggests to me that a machine that cannot understand cannot truly translate.
Secondly, the fact that a phrase-based translation system is envisaged seems dangerous to me. There are, after all, only a limited number of phrases one can punch into the database. Sometimes idioms have direct translations, such as “Yuyu Sangjong” in Korean and “Birds of a feather flock together” in English. But whole phrases? It seems to me this will profoundly limit the kinds of things that will be readable across a language barrier, and it’s not too big a leap of the imagination to think that the selection of which phrases get translative priority will probably be, at least in part, determined by someone’s political agenda(s). The nightmare of Orwell’s Newspeak from the novel 1984 is really unlikely as a spoken language, but if a “translatability standard” is set up for the Web (as Ray Kurzweil suggests is possible in some of his writings), then the kinds of things we can say will be limited not only for foreign readers in other languages but also in the original document itself.
And no, this is not better than the document being inaccessible, as it would have been before translation tech came along, because the combination of limited documents and the appearance of powerful translation tech that itself necessitates limited content is a really dangerous thing.
So I think, for the time being, we’d better stick to multilinguality and to using human translators: they may also have political agendas, but those are individual, and not implicit in the whole process of translation itself. It’s a lot safer to let humans do this very human work.