RE: Curious coincidence
Penetrating, Jorge. Congratulations!!! It is almost certainly not a coincidence, whatever the cause.
Don
On Friday, June 09, 2000 6:38 PM, Jorge Stolfi [SMTP:stolfi@xxxxxxxxxxxxx] wrote:
>
> Hi,
>
> Over the past few weeks I have been counting VMS beans of various
> shapes and colors, extracted from the almost complete, not-so-bad,
> majority-vote transcription in EVA.
>
> I just noticed a curious coincidence:
>
> total *occurrences* of words (tokens) with
>
> 0 gallows .... 17363 (49.4%)
> 1 gallows .... 17443 (49.6%)
> 2 gallows .... 323 (0.9%)
> 3 gallows .... 3
>
> These numbers look more suspicious than the elections in Peru. 8-)
>
> Many (if not all) of the 2- and 3-gallows words are probably due to
> omission of word spaces by the transcribers. Other data errors may
> have injected a few percent of noise in these figures.
>
> Still, the coincidence is intriguing. It seems safe to assume that a
> "correct" Voynichese word can have at most one gallows; so we have
> an almost exact 50-50 split between 0-g and 1-g words.
>
> Maybe this is merely an amazing linguistic coincidence. Perhaps the
> presence of gallows indicates an independent binary phonetic
> attribute (say, voiced vs. unvoiced, high/low register, front/back); and
> Voynichese happens to be an extremely efficient language that makes
> full use of that available bit.
>
> Or could this be something else? Three possibilities that I can
> think of:
>
> * Voynichese "words" are actually keys into a codebook-style cipher,
> encoded in a notation resembling Roman numerals (only more complicated);
>
> * Voynichese is a complex "randomizing" code a la Vigenere,
> where the encrypted numeric text is further scrambled
> with a second, complicated encoding responsible for the peculiar
> word structure;
>
> * Voynichese "words" are generated, at least in part, by throwing
> dice; and the gallows belong to the random part.
>
> In all these scenarios, the presence/absence of gallows would be a
> low-order bit in the encoding. That would explain the precise 50-50
> split, in spite of the fact that the VMS word frequencies are as
> irregular as those of any natural language.
>
> Comments, anyone?
>
> All the best,
>
> --stolfi
>
> PS. I hope to post a summary of my bean-counting over the
> weekend.
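
The tally in the quoted message can be sketched roughly as follows. This is
only an illustration, not the script Jorge used: it assumes the EVA gallows
are the letters t, k, p and f (which also catches the platform forms cth,
ckh, cph, cfh), the helper names are made up for this example, and the
sample tokens are hypothetical. The last function is a rough fair-coin
check on the 0-gallows vs. 1-gallows counts from the message; it treats
tokens as independent draws, which running text certainly is not, so the
z-score is only a ballpark figure.

```python
from collections import Counter
from math import sqrt

# Assumption: EVA gallows glyphs are t, k, p, f. The platform gallows
# (cth, ckh, cph, cfh) contain these letters, so they are counted too.
GALLOWS = set("tkpf")


def gallows_count(word):
    """Number of gallows letters in one EVA-transcribed word."""
    return sum(1 for ch in word if ch in GALLOWS)


def tally(tokens):
    """Map gallows-per-word -> number of tokens with that many gallows."""
    return Counter(gallows_count(w) for w in tokens)


def percentages(counts):
    """Convert a {gallows: tokens} mapping into percentages of the total."""
    total = sum(counts.values())
    return {k: 100.0 * v / total for k, v in counts.items()}


def fair_coin_z(n0, n1):
    """z-score of n1 against a fair 50-50 split of n0 + n1 tokens.

    Assumes independent tokens (a simplification for real text).
    """
    n = n0 + n1
    return (n1 - n / 2) / sqrt(n / 4)


# Hypothetical sample tokens, just to exercise the counting.
sample = ["daiin", "qokeedy", "chol", "otedy", "shedy", "qotchy"]
sample_tally = tally(sample)

# Token counts exactly as posted in the quoted message.
posted = {0: 17363, 1: 17443, 2: 323, 3: 3}
pct = percentages(posted)
z = fair_coin_z(posted[0], posted[1])

print(sample_tally)
for k in sorted(posted):
    print(f"{k} gallows .... {posted[k]} ({pct[k]:.1f}%)")
print(f"z-score of the 0-g / 1-g split against 50-50: {z:.2f}")
```

On a real transcription file, the tokens would come from splitting each
line on the transcription's word-separator character; that parsing step is
left out here because it depends on the file format.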