25 June 2018
Below is an open e-mail that I sent to Bradley Hauer and Grzegorz Kondrak of the Department of Computing Science of the University of Alberta in Edmonton, Canada.
In the Usenet group sci.lang, someone referred to your research on the Voynich manuscript, as described in https://www.folio.ca/using-ai-to-uncover-the-mystery-of-an-ancient-manuscript/ ("Using AI to uncover the mystery of an ancient manuscript", January 24, 2018) and https://transacl.org/ojs/index.php/tacl/article/download/821/174 ("Decoding Anagrammed Texts Written in an Unknown Language and Script"). The Usenet discussion (including my comments) can be found in Google Groups here: https://groups.google.com/d/msg/sci.lang/jlWEsHSdsRw/UTI4hmBEBAAJ.
Your research is certainly interesting and important, but with all due respect, I think that regarding the use of Google Translate (page 10, section 5.4), you are not on the right track. I say this because, for completely different and non-scientific reasons (fascination with scripts; boredom with what I should actually have been doing at the time; nostalgia for my C programming days), I happen to have done something similar to what you did:
I devised some schemes for writing Interlingua (a language that Google Translate does not support) in Greek script (in ways that make it immediately clear to human observers that the result cannot possibly be Greek), and fed that ‘Greek’ to Google Translate. Surprisingly, the result is English that is often grammatical (or nearly so) and, on superficial inspection, even seems to contain many interesting philosophical thoughts. If you look closer, though, you will see that it is complete nonsense.
If we replace Interlingua with Hebrew, and my Greek encoding schemes with the script and possible encodings of the Voynich manuscript, we see the parallels. And just as the English I obtained from Google Translate does not reveal what my Interlingua originals were about, I think your results do not give us more insight into the meaning of the Voynich manuscript, and do not prove that it was originally in Hebrew.
What the results do show is a shortcoming of Deep Learning techniques (as now employed by Google Translate and DeepL), which can produce seemingly sensible output from invalid input if the invalidity of that input is not detected in a preliminary step of the translation algorithm.
The point is that my ‘Greek’ Interlingua wasn’t Greek, but Google Translate (GT) assumed that it was. And my original Interlingua texts weren’t philosophical at all: one is about a car journey in France, during which I discovered a nice programme on a French classical music station and drank coffee with my wife; the other is about the muscle and bone problems people may develop when living in a spaceship without gravity, and my amazement that the spacecraft is not made to rotate like a big bicycle wheel, so that artificial gravity would result.
GT’s ‘translations’ give a completely different and therefore false impression of the contents.