VMs: RE: question
2/27/03 8:35:13 PM, "GC" <glenclaston@xxxxxxxxx> wrote:
>I personally can't speak to the 'entropy' problem, as I believe
>the VMS to be a cipher.
Entropy is the unpredictability of "what comes next" and
as such it is quite independent from whether a text is a cipher,
and whether it makes sense. A musical score has entropy too,
and so has a picture.
>However, variations of spelling, phonetic
>spellings, etc., were a common feature of medieval manuscripts,
It takes no mathematical knowledge to see that spelling
variations increase the unpredictability, e.g. "I am
pretty sure that the next word is going to be 'night',
but how is it going to be spelt? Night or nite?"
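To put a number on it, here is a minimal sketch in Python. The word
probabilities are invented for illustration only (they are not measured
from any manuscript), and the helper name entropy() is my own. It shows
that splitting one word between two spellings raises the Shannon entropy
of "what comes next" while the underlying word stays just as likely.

```python
import math

def entropy(probs):
    """Shannon entropy, in bits, of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical next-word distribution with one fixed spelling.
fixed = {"night": 0.9, "day": 0.1}

# Same word probabilities, but 'night' is spelt two ways equally often
# (an assumption made purely for the example).
variable = {"night": 0.45, "nite": 0.45, "day": 0.1}

print("fixed spelling:   ", entropy(fixed.values()))
print("variable spelling:", entropy(variable.values()))
```

The increase is the probability of the word (0.9) times the entropy of
the choice between its spellings (1 bit for a 50/50 split).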
>If the VMS is not encoded, I don't see how entropy itself can
>offer an explanation for the oddities it contains.
It does not offer an explanation for those oddities.
It only offers a _measure_ of unpredictability of
"what comes next", and that unpredictability is very
low. In other words, "what comes next" is highly
predictable. Among other things, if the VMS is a cipher,
this rules out a vast number of enciphering schemes--
a polyalphabetic cipher for instance.
>Multiple
>repetitions of groups like 8oe, 8am, etc., and 'words' that differ
>only by one glyph strung along next to each other in many places.
>No one has yet offered a language that can answer these questions,
>and any 'pattern' search (something I do for unknown ciphers)
>would yield no spoken language that even closely matches these
>anomalies.
I have time and again brought here examples drawn from
real languages, from Indonesian to Gaulish.
>One manuscript I'm currently investigating uses only three letters
>to communicate its message. It does it by using capital letters
>only, but I wonder what the entropy of such a manuscript would be
>against any known language if it were written entirely in only
>three letters?
It depends on the encipherment scheme. Assume a single-substitution
cipher. If the original letter A is enciphered as gogogogogogogo,
B as chchch and the last letter, C, as blblblblblblblbl, the low-order
entropies of the message will be extremely low, since every time
you come across a 'g' you know that an 'o' will be next. It is only
when you come across an 'o', an 'h', or an 'l' that the next letter
can be any of 'g', 'c' or 'b'. However, the 17th-order entropy will
be close to the 2nd-order entropy of the original message (the
one with only three letters).
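The low-order part of this can be checked with a rough sketch in Python.
Assumptions, all mine: the three-letter plaintext is random and drawn
with a fixed seed, "order-n entropy" is taken to mean the entropy of the
next character given the previous n-1 characters, estimated from raw
n-gram counts, and the name conditional_entropy() is just a label for
the example.

```python
import math
import random
from collections import Counter

# The substitution table from the example above.
CODES = {"A": "gogogogogogogo", "B": "chchch", "C": "blblblblblblblbl"}

def conditional_entropy(text, order):
    """Entropy in bits of a character given the previous order-1 characters."""
    ctx_len = order - 1
    grams = Counter(text[i:i + order] for i in range(len(text) - order + 1))
    contexts = Counter()
    for gram, n in grams.items():
        contexts[gram[:ctx_len]] += n
    total = sum(grams.values())
    return -sum(n / total * math.log2(n / contexts[gram[:ctx_len]])
                for gram, n in grams.items())

# Hypothetical plaintext: 5000 letters drawn uniformly from A, B, C.
rng = random.Random(0)
plain = "".join(rng.choice("ABC") for _ in range(5000))
cipher = "".join(CODES[ch] for ch in plain)

for name, text in (("plain", plain), ("cipher", cipher)):
    print(name, "order 1:", conditional_entropy(text, 1),
          "order 2:", conditional_entropy(text, 2))
```

The second-order figure for the enciphered text comes out far below that
of the plaintext, because 'g', 'c' and 'b' are each followed by one
letter only. Estimates at much higher orders need a long text to be
trustworthy, so take them with a grain of salt on short samples.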
>And when calculating entropy, what does one use as the recognized
>glyphset? Is <sh> actually one glyph, or two?
I have just discussed that.
>If the thousands of
>examples of <sh> are seen by the vast majority of us as a single
>glyph, and we never write or separate the 's' from the 'h' when
>referring to this glyph, then <sh> is an extremely poor
>representation of what we're all seeing.
On the contrary. Replacing <sh> by a single symbol destroys
information. The decipherment of the Easter Island tablets
provides a good example. Thomas Barthel, who was the first to
publish the corpus of the hieroglyphic texts, elaborated
a transcription system where each sign was represented by
a number of up to three digits. What counts as a sign is obvious to
anyone who has looked at them: most of them are anthropomorphic.
They remind you of the "dancing men" of Conan Doyle's short story.
More recently, though, Konstantin Pozdniakov has brought
compelling evidence that those anthropomorphic signs are
actually composed of up to five phonetic elements (head +
four limbs). As a consequence, Barthel's system is utterly
inadequate. It was inspired, BTW, by Eric Thompson's system
for Maya.
As for determining whether <sh> is a single entity or not,
that is a problem of segmentation. Perhaps ten years ago already,
I drew this group's attention to Boris Viktorovich Sukhotin's
works on the subject, and I seem to remember having posted
here partial translations of some of his methods. If you
represent <sh> by, say, <@>, or any other _single_ symbol,
you introduce a bias. That is why highly analytical transcription
systems are better: Frogguy is marginally better than EVA (because
more analytical), and EVA much better than Currier. Likewise, iiiic is
better than 300 when trying to represent that particular Easter
Island hieroglyph.