
Re: Specialty words

    One of the problems with trying to match word patterns is that the VMS
does not use doubled characters as frequently as would be expected in any
natural language that I can see. Whether the script is syllabic or
alphabetic, you would expect a certain amount of 'doubling' (unless there is
a character that signifies 'double the preceding letter'). We also have a
problem with the consistent 'end forms', specifically the an, ain, aiin,
aiiin types. This may indicate that a character's shape depends on where it
is in the word (as in Arabic), which once again makes it difficult to make
the comparison you suggest: if STAR is ABCD, but the T is written
differently in PLANET, then you don't really see it as the same character.
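
    To make both points concrete, here is a minimal Python sketch. The
function names (joint_pattern, doubled_rate) and the toy words are purely
illustrative assumptions, not anything taken from an existing transcription
tool. The capital 'T' in "sTar" stands in for a hypothetical positional
variant of 't' that a transcriber would record as a separate glyph.

```python
def joint_pattern(*words):
    """Label each distinct symbol A, B, C, ... in order of first appearance
    across all the words, so shared symbols get the same label.
    (Illustrative only: labels run past 'Z' after 26 distinct symbols.)"""
    labels = {}
    patterns = []
    for word in words:
        coded = []
        for ch in word:
            if ch not in labels:
                labels[ch] = chr(ord("A") + len(labels))
            coded.append(labels[ch])
        patterns.append("".join(coded))
    return patterns


def doubled_rate(words):
    """Fraction of adjacent symbol pairs inside words that are the same
    symbol repeated (e.g. 'tt' in 'letter')."""
    pairs = doubled = 0
    for word in words:
        for a, b in zip(word, word[1:]):
            pairs += 1
            doubled += (a == b)
    return doubled / pairs if pairs else 0.0


# Same glyph for 't' in both words: the shared letters line up.
print(joint_pattern("star", "planet"))
# Hypothetical variant glyph 'T' in the first word: the 't' link is lost.
print(joint_pattern("sTar", "planet"))
print(doubled_rate(["letter", "moon"]))
```

    In the second call only the shared 'a' still links the two patterns,
which is exactly the difficulty with position-dependent character shapes.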

    John Grove

----- Original Message -----
From: Brian Eric Farnell <bfarnell@xxxxxxx>
To: Voynich List <voynich@xxxxxxxx>
Sent: Thursday, May 11, 2000 10:32 PM
Subject: Specialty words

>     I haven't had a chance to catch up on 8 years of e-mail, so my ideas
> may be old hat but here goes.  I'm a Chinese linguist with some basic
> experience in cryptography.  I don't have the software or know-how to do
> this stuff, but I would (from my Chinese experience) approach the
> manuscript's words instead of symbols.  To do this we would have to
> assume (on shaky ground) that the spaces correspond to word breaks.
> Once we do this and have a baseline word frequency (already done I
> understand) an effort should be made to break the text as well as
> possible into subjects.  Perhaps the pictures don't delineate changes in
> subjects---perhaps they do.  If they do, it should be possible to
> compare those sections' word frequencies to the whole and thus produce a
> list of 'specialty words'.  In an astronomy text one can reasonably
> expect to see words like 'star' and 'planet' far more often than in a
> biology book.  Once a list of words with a high probability of being
> subject specialized has been determined, then consider the cipher
> amongst those words only.  Look for things like 'star' being in the
> pattern ABCD, which makes 'planet' EFCGHB, because of the sharing of the
> letters 'a' and 't' amongst the two.  Comparing these similarities in
> specialty words to lists of specialty words generated in languages
> considered possible 'hits' should help identify the language. Words in
> which a letter occurs twice, or which start and end with the same
> letter, are a gold mine in this deciphering technique. Some sections
> would be more fruitful
> than others.  Some subjects lend themselves to flowery descriptions and
> metaphysical allusions, but stuff that's very hands-on should be written
> in ordinary language as a matter of habit.  Recipes for instance are
> likely to contain a very high frequency of 'measure words' that you
> won't find anywhere else.  This method also has a high probability of
> correctly identifying the language even if Voynich is written in an
> obscure regional dialect---or even written by someone improperly
> schooled in the language he was writing in.  This is because of
> principles set forth in Grimm's law.  Jacob Grimm (one of the Grimm
> brothers) studied Germanic languages and described how their sounds
> shift and change in regular patterns.  Such rules turn translation of one
> Germanic language to another into a substitution cipher (to oversimplify
> things).  For instance English is much softer than German, the German
> 'tag' becomes the English 'day' as the less harsh tongue takes the 't'
> into a 'd' and gets lazy on the endings, dropping hard 'g's in favor of
> the less voiced 'y' modification.  I saw a demonstration of this in my
> German class years ago: we were given a long German passage that none of
> us had a clue about, and then, given a set of rules, we were able to
> translate it easily into something akin to Old English, which we had no
> trouble understanding.  This method could very possibly produce what
> seems to be a positive 'hit' on one language for the specialty words and
> then seem to fail on the rest of the manuscript.  My personal assumption is that
> Voynich statistically looks funny because it is written in two or more
> languages.  I'm not talking about the differences in Voynich A and B,
> but the idea that a lot of 'scholarly words' in the text might be
> something like Latin or Greek while the rest could be in a common
> language, similar to any medical or legal textbook you find at modern
> universities.  The differences in Voynich A and B may be due to a
> difference in classical education.  Again, very practical sections of
> the manuscript are the gold mine; they are likely to have far fewer
> words that aren't in the common language.  'Old habits die hard' in this
> case would be a saving grace.  I think if this sort of analysis were to
> be attempted, dropping the endings from the words in the lists made
> from possible languages should also be done for comparison.  The words
> in Voynich seem too short, and the little I've seen shows way too many
> common letter sequences at the beginning of words; these combinations
> look like verb or noun endings to me.  They may have been chopped off
> and added to nonsense letters or to nulls.  Any thoughts?
> Respectfully,
> Brian Farnell
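
The 'specialty words' comparison proposed in the quoted message can be
sketched roughly as follows. This is only an illustration under stated
assumptions: words are taken to be space-delimited tokens (the same shaky
assumption the message makes), the section is assumed to be part of the
whole text, and the function name, the min_count cutoff and the toy tokens
are all hypothetical. A word scores above 1 when it is relatively more
frequent in the section than in the text as a whole.

```python
from collections import Counter


def specialty_words(section_words, all_words, min_count=2, top=10):
    """Rank words by (relative frequency in section) / (relative frequency
    in the whole text). all_words is assumed to include the section."""
    section = Counter(section_words)
    whole = Counter(all_words)
    n_section = sum(section.values())
    n_whole = sum(whole.values())
    scored = []
    for word, count in section.items():
        # Skip rare words (ratios on tiny counts are unreliable) and any
        # word missing from the whole text, which would divide by zero.
        if count < min_count or whole[word] == 0:
            continue
        ratio = (count / n_section) / (whole[word] / n_whole)
        scored.append((ratio, word))
    # Highest ratio first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(word, round(ratio, 2)) for ratio, word in scored[:top]]


# Toy English stand-ins for transcribed tokens (not real Voynich data).
astro = "star star planet the".split()
rest = "the leaf the root the star".split()
print(specialty_words(astro, astro + rest))
print(specialty_words(astro, astro + rest, min_count=1))
```

The min_count cutoff is a design choice to keep one-off words from
dominating the list; a real run would probably want a proper significance
measure instead of a raw ratio.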