
Specialty words

    I haven't had a chance to catch up on 8 years of e-mail, so my ideas
may be old hat but here goes.  I'm a Chinese linguist with some basic
experience in cryptography.  I don't have the software or know-how to do
this stuff, but I would (from my Chinese experience) approach the
manuscript through its words instead of its symbols.  To do this we
would have to
assume (on shaky ground) that the spaces correspond to word breaks.
Once we do this and have a baseline word frequency (already done I
understand) an effort should be made to break the text as well as
possible into subjects.  Perhaps the pictures don't delineate changes in
subjects---perhaps they do.  If they do, it should be possible to
compare each section's word frequencies to the whole and thus produce a
list of 'specialty words'.  In an astronomy text one can reasonably
expect to see words like 'star' and 'planet' far more often than in a
biology book.  Once a list of words with a high probability of being
subject-specialized has been determined, then consider the cipher
amongst those words only.  Look for things like 'star' being in the
pattern ABCD, which makes 'planet' EFCGHB, because of the sharing of the
letters 'a' and 't' between the two.  Comparing these similarities in
specialty words to lists of specialty words generated in languages
considered possible 'hits' should help identify the language.  Words in
which a letter occurs twice, or which start and end with the same
letter, are a gold mine in this deciphering technique.

    Some sections would be more fruitful
than others.  Some subjects lend themselves to flowery descriptions and
metaphysical allusions, but stuff that's very hands-on should be written
in ordinary language as a matter of habit.  Recipes, for instance, are
likely to contain a very high frequency of 'measure words' that you
won't find anywhere else.

    This method also has a high probability of correctly identifying
the language even if Voynich is written in an obscure regional
dialect---or even written by someone improperly schooled in the
language he was writing in.  This is because of
principles set forth in Grimm's law.  Jacob Grimm (one of the Grimm
brothers) studied Germanic languages and found that their sounds shift
and change in regular patterns.  Rules of this kind turn translation of
one Germanic language to another into a substitution cipher (to
oversimplify things).  For instance English is much softer than German:
the German 'Tag' corresponds to the English 'day', with the 't'
matching a 'd' and the hard 'g' ending matching the less voiced 'y'.
I saw a demonstration of this in my
German class years ago: given a long German passage that none of us had
a clue about, and then a set of rules, we were able to translate it
easily into something akin to Old English, which we then had no trouble
understanding.

    This method could very possibly produce what seems to be a positive
'hit' on one language for the specialty words and then seem to fail on
the rest of the manuscript.  My personal assumption is that
Voynich statistically looks funny because it is written in two or more
languages.  I'm not talking about the differences in Voynich A and B,
but about the idea that a lot of 'scholarly words' in the text might be
something like Latin or Greek while the rest could be in a common
language, similar to any medical or legal textbook you find at modern
universities.  The differences in Voynich A and B may be due to a
difference in classical education.  Again, very practical sections of
the manuscript are the gold mine: they are likely to have far fewer
words that aren't in the common language.  'Old habits die hard' in this
case would be a saving grace.

    I think if this sort of analysis were to be attempted, dropping the
endings from the words in the lists made from possible languages should
also be done for comparison.  The words in Voynich seem too short, and
the little I've seen shows way too many common letter sequences at the
beginning of words; these combinations look like verb or noun endings to
me.  They may have been chopped off and added to nonsense letters or to
nulls.  Any thoughts?
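
    To make the two steps above concrete, here is a rough Python sketch
(an illustration only, not something that has been run on the
manuscript).  It assumes the text is already split into words at the
spaces and grouped by section.  The function names, the thresholds and
the toy tokens in the usage note are all made up for the example.  The
first function ranks words by how over-represented they are in one
section compared with the whole text; the others encode words with a
shared letter pattern (so 'star' and 'planet' come out as ABCD and
EFCGHB) and search a candidate word list for combinations that fit.

```python
from collections import Counter
from itertools import permutations


def specialty_words(section_words, all_words, min_count=3, top=10):
    """Rank words by how much more frequent they are in one section
    than in the text as a whole (a simple frequency ratio)."""
    section = Counter(section_words)
    whole = Counter(all_words)
    n_section = sum(section.values())
    n_whole = sum(whole.values())
    scores = {}
    for word, count in section.items():
        # Skip words too rare to judge, or missing from the baseline.
        if count < min_count or whole[word] == 0:
            continue
        section_rate = count / n_section
        whole_rate = whole[word] / n_whole
        scores[word] = section_rate / whole_rate
    return sorted(scores, key=scores.get, reverse=True)[:top]


def joint_pattern(*words):
    """Encode several words with one shared symbol table, assigning
    A, B, C, ... to letters in order of first appearance."""
    symbols = {}
    encoded = []
    for word in words:
        out = []
        for ch in word:
            if ch not in symbols:
                symbols[ch] = chr(ord("A") + len(symbols))
            out.append(symbols[ch])
        encoded.append("".join(out))
    return encoded


def candidate_matches(cipher_words, vocabulary):
    """Return every ordered combination of vocabulary words whose joint
    letter pattern equals that of the cipher words.  Brute force, so
    only sensible for short specialty-word lists."""
    target = joint_pattern(*cipher_words)
    return [
        combo
        for combo in permutations(vocabulary, len(cipher_words))
        if joint_pattern(*combo) == target
    ]
```

    For example, joint_pattern("star", "planet") gives
["ABCD", "EFCGHB"], and two invented cipher tokens with the same shape,
say "wxyz" and "qryphx", would be matched against a small astronomy
word list with candidate_matches(["wxyz", "qryphx"], ["star", "planet",
"moon", "comet"]).  A plain frequency ratio is only the simplest
possible score; a real attempt would probably want something sturdier
for rare words.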

Brian Farnell