Re: VMs: More questions.
12/08/2003 12:51:03 AM, Barbara Barrett <barbarabarrett@xxxxxxxxxxxxxxx> wrote:
>So my first question (as a raw ignorant beginner) is what exactly is the
>purpose of statistical analysis, once the "standard" cryptographic stats
>(such as frequency analysis) have failed?
Cribs. Let me qualify that. If you have dug even shallowly into
the archives, you will have noticed several long discussions on the
entropy of texts. Entropy is a measure of the unpredictability of "what
follows". Hawaiian, for instance, has a very low entropy because whenever
you hit a consonant you know that the next letter/sound is going to
be a vowel, and there are only five of them: a, e, i, o, and u.
Very predictable.
English, on the other hand... if you see a 'g', almost anything can
follow: an 'h', an 'l', an 'r', another 'g', and so on. Much less
predictable: higher entropy.
So computing the entropy helps narrow down the range of
possible languages (assuming the cipher is a simple substitution).
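
To make the idea concrete, here is a minimal sketch of how such
figures can be computed. It assumes that "first-order entropy" means
the entropy of single characters and "second-order entropy" means the
conditional entropy of a character given the one before it; the
function names and the two sample strings are made up for
illustration and are not taken from any tool used on the list.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy, in bits, of a table of counts."""
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def h1(text):
    """First-order entropy: unpredictability of a single character."""
    return entropy(Counter(text))

def h2(text):
    """Second-order (conditional) entropy: unpredictability of a
    character once the character before it is known."""
    pairs = Counter(zip(text, text[1:]))   # counts of adjacent pairs
    firsts = Counter(text[:-1])            # counts of the leading character
    return entropy(pairs) - entropy(firsts)

# Made-up sample strings, for illustration only (not real corpora).
samples = {
    "strict consonant-vowel": "kalamanuhekilopuna",
    "english-like": "thestrengthofthelightbrightens",
}
for name, text in samples.items():
    print(f"{name}: h1={h1(text):.2f} h2={h2(text):.2f}")
```

Strings this short give very rough estimates; meaningful figures need
texts of many thousands of characters.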
What if the code is not a simple substitution? The effect of
good encipherment schemes is to raise the entropy. A text
enciphered with a very secure algorithm will look completely
random: no way of predicting what the next letter/symbol is.
And now to pairs. Imagine a very secure cipher. The cipher
text is random, therefore its entropy is maximal. Now replace
each letter with a pair, 'a' with (say) 'ba', 'b' with 'to', etc.
Suddenly, the second-order entropy drops drastically (the cipher
text has, superficially, become much like Hawaiian: consonant, vowel,
consonant, vowel...). _But_ its third and fourth-order entropies
remain unchanged, those of a completely random text.
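
The drop in second-order entropy can be checked with a small
experiment. This is only a sketch under stated assumptions: the
"secure cipher text" is simulated with seeded pseudo-random letters,
the pair table (13 consonants times 2 vowels) is invented for the
purpose, and "second-order entropy" is again taken to mean the
conditional entropy of a character given the previous one. It does not
attempt to measure the third- and fourth-order figures.

```python
import math
import random
import string
from collections import Counter

def entropy(counts):
    """Shannon entropy, in bits, of a table of counts."""
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def h2(text):
    """Conditional entropy of a character given the one before it."""
    pairs = Counter(zip(text, text[1:]))
    firsts = Counter(text[:-1])
    return entropy(pairs) - entropy(firsts)

# Stand-in for a "very secure" cipher text: uniformly random letters.
random.seed(0)
source = "".join(random.choices(string.ascii_lowercase, k=20000))

# Hypothetical pair table: every letter becomes consonant + vowel.
consonants = "bcdfghjklmnpq"   # 13 consonants
vowels = "ae"                  # 2 vowels, giving 13 * 2 = 26 pairs
pair_table = [c + v for c in consonants for v in vowels]
mapping = dict(zip(string.ascii_lowercase, pair_table))

expanded = "".join(mapping[ch] for ch in source)

print(f"h2 of random text:   {h2(source):.2f} bits")
print(f"h2 after pair step:  {h2(expanded):.2f} bits")
```

After the pair step, every consonant is followed by one of only two
vowels, so half of the positions in the text become highly
predictable, and the measured second-order entropy falls accordingly.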
That was one example of the purpose and use of the statistical
analyses we have been indulging in for the past 13 years (yes,
thirteen years).
>The other thing is that my very limited experience of "secret alphabets"
>and "ciphers" is that they encipher not a spoken language but a
>language's *written* form. What if the Voynich author(s) made the
>conceptual leap of enciphering a language's *phonemes* (rather than its
>standard alphabet's values and spellings) and then added to that
>contractions, abbreviations, truncations etc and then "spelling out"
>things like punctuation comma run hyphen ins comma & nulls question mark
Has it occurred to you that you have just described shorthand writing?
>In any case enciphering phonemes rather than letters would radically
>alter a language's visual appearance, and its unique statistics would no
>longer be valid (EG "e" may be the most common letter in English but the
>most common phoneme is /t/).
I have recently, perhaps a month ago, posted again here the English
adaptations of two or three articles by Boris Viktorovich Sukhotin,
which address all these questions in general terms.
>The question is has anyone ever produced even something as simple as
>frequency tables for the various suspected VMS languages' phonemes?
No. The "Chinese hypothesis" alone would require producing frequency
tables of several hundred Chinese dialects, most mutually unintelligible.
I have a comparative dictionary of (modern) Chinese dialects, but no texts.
Even if there were texts, those should be in the dialects as they
were 500 years ago, when the VMs was likely written. We have nothing.
Chinese was always written in wen2yan2 (Classical Chinese), never in the
dialectal forms, and we do not know with any degree of certitude how these
were pronounced, nor how many there were (many must have become extinct,
many arisen since).
The situation is the same for most other possible candidates. At any
rate, it does not take any statistical analysis to realise that the
VMs, if in a simple substitution cipher, cannot possibly be in Gaelic,
nor in Nahuatl, but just might, just might, be in Malay (or some other
Austronesian language), and very possibly in a Chinese-type sort of
language.
>And now my final question <sighs of relief all round ;-)>. There are
>apparently diacritics (accent marks) used in the Voynich script. Do the
>transcription systems reflect these or are they ignored?
Nothing is ignored in our current transcription system.
Frogguy (do a google search), a.k.a. Jacques Guy