    One of the problems with trying to match word patterns is that the VMS
does not use doubled characters as frequently as I would expect in any
natural language. Whether the script is syllabic or alphabetic, you would
expect a certain amount of 'doubling' (unless there is a character that
signifies "double the preceding letter"). We also have a problem with the
consistent 'end forms', specifically the an, ain, aiin, aiiin types. This
may indicate that a character's shape depends on where it is in the word
(as in Arabic), which once again makes the comparison you suggest
difficult: if STAR is ABCD, but the T is written differently in PLANET,
then you don't really see it as the same character.
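
    To make the two points above concrete, here is a rough sketch in Python.
It assumes a transcription in which every character of a word is a single
symbol, and the function names (joint_patterns, doubled_rate) are my own
invention for illustration, not part of any existing tool. The first function
produces the ABCD-style patterns with labels shared between words; the second
measures how often a symbol is immediately repeated inside a word.

```python
def joint_patterns(words):
    """Label symbols A, B, C... in order of first appearance across all words.

    Labels are shared between the words, so a symbol that recurs in a
    later word gets the label it was given earlier.
    """
    labels = {}
    patterns = []
    for word in words:
        pattern = []
        for symbol in word:
            if symbol not in labels:
                # Assumes fewer than 27 distinct symbols in total.
                labels[symbol] = chr(ord("A") + len(labels))
            pattern.append(labels[symbol])
        patterns.append("".join(pattern))
    return patterns


def doubled_rate(words):
    """Fraction of adjacent symbol pairs within words that are identical."""
    pairs = 0
    doubles = 0
    for word in words:
        for first, second in zip(word, word[1:]):
            pairs += 1
            if first == second:
                doubles += 1
    return doubles / pairs if pairs else 0.0


print(joint_patterns(["star", "planet"]))   # ['ABCD', 'EFCGHB']
print(doubled_rate(["letter", "star"]))     # 0.125
```

    Note that the pattern match only works if the same character is always
transcribed as the same symbol. If the T of STAR takes a different shape in
PLANET and is transcribed differently, it gets a fresh label and the shared
B disappears.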

    John Grove

>     I haven't had a chance to catch up on 8 years of e-mail, so my ideas
> may be old hat but here goes.  I'm a Chinese linguist with some basic
> experience in cryptography.  I don't have the software or know-how to do
> this stuff, but I would (from my Chinese experience) approach the
> manuscript's words instead of symbols.  To do this we would have to
> assume (on shaky ground) that the spaces correspond to word breaks.
> Once we do this and have a baseline word frequency (already done I
> understand) an effort should be made to break the text as well as
> possible into subjects.  Perhaps the pictures don't delineate changes in
> subjects---perhaps they do.  If they do, it should be possible to
> compare those sections' word frequencies to the whole and thus produce a
> list of 'specialty words'.  In an astronomy text one can reasonably
> expect to see words like 'star' and 'planet' far more often than in a
> biology book.  Once a list of words with a high probability of being
> subject specialized has been determined, then consider the cipher
> amongst those words only.  Look for things like 'star' being in the
> pattern ABCD, which makes 'planet' EFCGHB, because of the sharing of the
> letters 'a' and 't' amongst the two.  Comparing these similarities in
> specialty words to lists of specialty words generated in languages
> considered possible 'hits' should help identify the language. Words in which
> a letter occurs twice, or which start and end with the same letter, are a
> gold mine in this deciphering technique. Some sections would be more fruitful
> than others.  Some subjects lend themselves to flowery descriptions and
> metaphysical allusions, but stuff that's very hands-on should be written
> in ordinary language as a matter of habit.  Recipes for instance are
> likely to  contain a very high frequency of 'measure words' that you
> won't find anywhere else.  This method also has a high probability of
> correctly identifying the language even if Voynich is written in an
> obscure regional dialect---or even written by someone improperly
> schooled in the language he was writing in.  This is because of
> principles set forth in Grimm's law.  Jacob Grimm (one of the Grimm
> brothers) studied Germanic languages and described how their sounds shift
> and change in regular patterns.  Such rules turn translation of one
> Germanic language to another into a substitution cipher (to oversimplify
> things).  For instance, English is much softer than German: the German
> 'tag' becomes the English 'day' as the less harsh tongue takes the 't'
> into a 'd' and gets lazy on the endings, dropping hard 'g's in favor of
> the less voiced 'y' modification.  I saw a demonstration of this in my
> German class years ago: given a long German passage that none of us had
> a clue about, and then a set of rules, we were able to translate it
> easily into something akin to Old English and then had no trouble
> understanding it.  This method could very possibly produce what seems to
> be a positive 'hit' on one language for the specialty words and then
> seem to fail on the rest of the manuscript.  My personal assumption is that
> Voynich statistically looks funny because it is written in two or more
> languages.  I'm not talking about the differences in Voynich A and B,
> but the idea that a lot of 'scholarly words' in the text might be
> something like Latin or Greek while the rest could be in a common
> language, similar to any medical or legal textbook you find at modern
> universities.  The differences in Voynich A and B may be due to a
> difference in classical education.  Again, very practical sections of
> the manuscript are the gold mine; they are likely to have far fewer
> words that aren't in the common language.  'Old habits die hard' in this
> case would be a saving grace.  I think if this sort of analysis were to
> be attempted, dropping the endings from the words in the  lists made
> from possible languages should also be done for comparison.  The words
> in Voynich seem too short, and the little I've seen shows way too many
> common letter sequences at the beginning of words.  These combinations
> look like verb or noun endings to me.  They may have been chopped off
> and added to nonsense letters or to nulls.  Any thoughts?
> Respectfully,
> Brian Farnell
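
    The 'specialty words' comparison described in the quoted message can be
sketched roughly as follows in Python. This is only one plausible reading of
the idea: it ranks the words of a section by how much more frequent they are
there than in the text as a whole. The function name specialty_words, the
min_count cutoff, and the toy English word lists are all assumptions made for
illustration, and it takes for granted that spaces really are word breaks.

```python
from collections import Counter


def specialty_words(section_words, all_words, min_count=2, top=10):
    """Rank a section's words by relative frequency versus the whole text."""
    section = Counter(section_words)
    whole = Counter(all_words)
    n_section = len(section_words)
    n_whole = len(all_words)
    scored = []
    for word, count in section.items():
        if count < min_count:
            continue  # too rare in the section to say anything about
        # If the section is not actually part of all_words, fall back to
        # the section count so we never divide by zero.
        whole_count = max(whole[word], count)
        ratio = (count / n_section) / (whole_count / n_whole)
        scored.append((ratio, word))
    # Highest ratio first; ties broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for ratio, word in scored[:top]]


astro = "the star is near the planet and the star is bright".split()
herbal = "the root is in the pot the leaf is on the root".split()
print(specialty_words(astro, astro + herbal, top=2))   # ['star', 'is']
```

    Common words like 'the' score below 1 because they are spread evenly
through the whole text, while 'star' stands out in its own section. A real
run would need much larger sections for the ratios to mean anything.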