
Rules in the Voynich Manuscript



Antoine Casanova posted an article in sci.crypt today with the above
title, message ID <8a0lih$tqs$1@xxxxxxxxxxxxxx>.  It looked interesting,
but I didn't understand all of it.  In particular, I don't know what it
means to have an "accounting of the Hamming distances equal to the unit", 
which seems to be central to the analysis.  I *think* that means that for
each length of VMS "word", he counts how many pairs there are of two
different words of that length that differ in exactly one character
position, ignoring the frequencies of the words.
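
To make that reading concrete, here is a minimal Python sketch of the
count as I understand it.  It is only my guess at what the paper means,
not necessarily Casanova's actual procedure; the function name and the
sample tokens are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def hamming_one_pairs_by_length(tokens):
    """For each word length, count unordered pairs of distinct words
    that differ in exactly one character position."""
    by_length = defaultdict(set)
    for tok in tokens:
        # A set keeps each distinct word once, so frequencies are ignored.
        by_length[len(tok)].add(tok)

    counts = {}
    for length, words in sorted(by_length.items()):
        counts[length] = sum(
            1
            for a, b in combinations(sorted(words), 2)
            if sum(x != y for x, y in zip(a, b)) == 1
        )
    return counts


# Invented sample tokens, not real VMS text.
sample = "abc abd abc xbc ab ac zz".split()
print(hamming_one_pairs_by_length(sample))  # {2: 1, 3: 2}
```

The repeated "abc" is counted once, and "abd"/"xbc" do not pair up
because they differ in two positions.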

He also draws a distinction between "words" and "terms", which isn't clear
to me.  I agree that it's not clear that the groups of characters
separated by spaces are necessarily words in the underlying language;
maybe it's a good thing to have a different name for them than "words".  
Is that what a "term" is, then, an unambiguous name for the things that
look like words?  I didn't see a definition in the paper.

My biggest concern with this analysis is that it looks like it wouldn't
remain valid if we changed our assumptions about how big a chunk of script
constitutes a character.  For instance, what if it were repeated with the
Frogguy transcription alphabet, which turns many Currier transcription
characters into multi-character sequences?  That would change the lengths
of the words, and so also break the grouping by length.  So I'm not sure
the analysis would meet one of the criteria I articulated some time ago
for statistical tests, that of stability under changes to the arbitrary
assumptions.
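
A toy Python sketch of the worry, under loud assumptions: the mapping
below is HYPOTHETICAL and invented for illustration (it is not the real
Currier-to-Frogguy correspondence), and the words are not real VMS text.
It only shows how expanding single characters into multi-character
sequences can move words into different length groups and change the
counts of pairs at Hamming distance one.

```python
from collections import defaultdict
from itertools import combinations

# HYPOTHETICAL mapping: one-character symbols expanded to sequences.
# Not taken from any real transcription table.
TOY_MAP = {"X": "ab", "Y": "abc"}


def retranscribe(word, mapping):
    """Rewrite a word symbol by symbol; unmapped symbols are kept."""
    return "".join(mapping.get(ch, ch) for ch in word)


def pairs_at_distance_one(words):
    """Per word length, count pairs of distinct words differing in
    exactly one position (frequencies ignored)."""
    by_length = defaultdict(set)
    for w in words:
        by_length[len(w)].add(w)
    return {
        n: sum(
            1
            for a, b in combinations(sorted(ws), 2)
            if sum(x != y for x, y in zip(a, b)) == 1
        )
        for n, ws in sorted(by_length.items())
    }


words = ["XO", "YO", "QO"]  # invented tokens
before = pairs_at_distance_one(words)
after = pairs_at_distance_one(retranscribe(w, TOY_MAP) for w in words)
print(before)  # {2: 3}
print(after)   # {2: 0, 3: 0, 4: 0}
```

In the first alphabet all three words have length two and every pair
differs in one position; after the change each word lands in its own
length group and no pairs remain.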

Still, it seems like an interesting line for research, and the pattern of
><>< in Table 5 looks quite impressive.

Matthew Skala                       "Ha!" said God, "I've got Jon Postel!"
mskala@xxxxxxxxxxxxxxxxx            "Yes," said the Devil, "but *I've* got
http://www.islandnet.com/~mskala/    all the sysadmins!"