
Re: VMs: Another method different from Cardano Grilles



27/01/2005 6:16:20 PM, Elmar Vogt <elvogt@xxxxxxxxxxx> wrote:

>Jacques Guy wrote:

>> For how many words which DO NOT occur will those five wheels 
>> reconstruct? 

>Brilliant!

It just came to me like that, without thinking about it.

>I'm not one to discard the hoax theories so quickly, but Jacques, this 
>appears to be an excellent test for any hoax hypothesis: Are there words 
>which do not appear in the VM which the discs (or grilles or whatever we 
>use) ought to be producing in sufficient quantities?

There is a much simpler and statistically valid test.
Take the Voynich manuscript. Make a list of all the different
words in it. If you are not sure about the validity of the
definition of "word" (because you are not sure of what 
makes a boundary, for instance), make a list of all the trigraphs
in it (or n-graphs, doesn't matter). Now calculate the probability
of occurrence of each word (or trigraph or whatever) using the
Cardano grilles, or the wheels, or whatever. For instance:

<qok> We know that there are, say, 100 "letters" on the outer
wheel, and <q> occurs twice. So, the probability of
spinning a <q> is 2/100

Next, look at next wheel inside. Say, 60 "letters" and <o>
occurs 10 times. Probability of spinning an <o> = 10/60

Finally, <k> on the inner wheel. Using the same method again, 
say we get 8/40

So, the probability of spinning <qok> is:

 2/100 * 10/60 * 8/40 = 1/(50*6*5) = 1/1500

We have counted, say, 80,000 trigraphs in the VMS, so we should
expect 80,000/1500 = 53 occurrences of <qok>. But we see, let's say,
73 (completely off the top of my head, I have no idea at all of the
real figure)

Let's record that:

         <qok>
expected   53
observed   73
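
The expected-count step can be sketched in Python. This is only an
illustration: the WHEELS layout holds the made-up figures from the
example (2/100, 10/60, 8/40), the function names are hypothetical,
and it assumes each wheel is spun independently with every position
equally likely.

```python
from fractions import Fraction

# Made-up wheel layout from the example above: for each wheel, the number
# of positions on it and how many of those positions carry each glyph.
WHEELS = [
    {"size": 100, "counts": {"q": 2}},   # outer wheel
    {"size": 60, "counts": {"o": 10}},   # middle wheel
    {"size": 40, "counts": {"k": 8}},    # inner wheel
]

def ngraph_probability(ngraph, wheels):
    """Probability of spinning `ngraph`, one glyph per wheel, outer to inner."""
    if len(ngraph) != len(wheels):
        raise ValueError("need exactly one glyph per wheel")
    p = Fraction(1)
    for glyph, wheel in zip(ngraph, wheels):
        # A glyph that is absent from a wheel has probability 0.
        p *= Fraction(wheel["counts"].get(glyph, 0), wheel["size"])
    return p

def expected_count(ngraph, wheels, total_ngraphs):
    """Expected number of occurrences among `total_ngraphs` spins."""
    return ngraph_probability(ngraph, wheels) * total_ngraphs

p_qok = ngraph_probability("qok", WHEELS)
exp_qok = expected_count("qok", WHEELS, 80_000)
print(p_qok, float(exp_qok))
```

Fractions keep the arithmetic exact until the very end, which makes
it easy to check the product against a hand calculation.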

And we continue, doing the same for every trigraph (or word, if
you prefer words)

Next you calculate chi-squared, summing (observed - expected)^2 /
expected over all the trigraphs. Looked up against the chi-squared
distribution, that gives you the probability of seeing a deviation
at least that large if the text of the VMS really had been
generated by your wheel (or Cardano grilles, or whatever method).
If that probability is tiny, you can reject the wheel. (You can
also calculate phi if you want to know how far it deviates from
your wheel, phi = sqrt(chi2/N), N being the total number of
trigraphs counted.)
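
As a rough sketch, the statistic and phi can be computed like this.
The expected and observed figures below are invented purely for
illustration, and the function names are hypothetical. Turning the
statistic into a probability also needs the degrees of freedom and
the chi-squared distribution (scipy.stats.chi2.sf would do, if SciPy
is available), which is left out here.

```python
import math

def chi_squared(expected, observed):
    """Pearson's statistic: sum of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for e, o in zip(expected, observed))

def phi(chi2_stat, n):
    """Effect size phi = sqrt(chi2 / N), with N the total number of items counted."""
    return math.sqrt(chi2_stat / n)

# Invented figures for three trigraphs, for illustration only.
expected = [50.0, 30.0, 20.0]
observed = [60, 25, 15]

chi2_stat = chi_squared(expected, observed)
phi_value = phi(chi2_stat, sum(observed))
print(chi2_stat, phi_value)
```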

Now, chi-squared is valid only when all expected
absolute frequencies are >=5. In most statistics textbooks
they advise you to merge some rows and columns (here, it's
columns) until they are all >=5. You can do that here but 
I think that just IGNORING the trigraphs or words with those 
expected frequencies is much better.

Thus if, say, <qoteedy> has an expected frequency of 3
(as calculated from the layout of the wheel) you just
do as if it was not there.
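
A minimal sketch of that filtering step, assuming the expected and
observed counts sit in a dictionary keyed by word. All the counts
except the expected 3 for <qoteedy> are invented, and the name
drop_sparse_cells is hypothetical.

```python
MIN_EXPECTED = 5  # the usual rule-of-thumb threshold for chi-squared

def drop_sparse_cells(table, min_expected=MIN_EXPECTED):
    """Keep only the entries whose expected frequency is >= min_expected."""
    return {
        key: (exp, obs)
        for key, (exp, obs) in table.items()
        if exp >= min_expected
    }

# word -> (expected, observed); invented figures apart from qoteedy's 3.
table = {
    "qoteedy": (3.0, 7),
    "daiin": (40.0, 50),
    "chedy": (20.0, 15),
}

kept = drop_sparse_cells(table)
chi2_stat = sum((obs - exp) ** 2 / exp for exp, obs in kept.values())
print(sorted(kept), chi2_stat)
```

Here <qoteedy> is simply left out of the sum, whatever its observed
count was.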






