
RE: Curious coincidence



Penetrating, Jorge. Congratulations! It is almost certainly not a coincidence, whatever the cause.
Don

On Friday, June 09, 2000 6:38 PM, Jorge Stolfi [SMTP:stolfi@xxxxxxxxxxxxx] wrote:
> 
> Hi,
> 
> Over the past few weeks I have been counting VMS beans of various
> shapes and colors, extracted from the almost complete, not-so-bad,
> majority-vote transcription in EVA.
> 
> I just noticed a curious coincidence:
> 
>   total *occurrences* of words (tokens) with
>     
>      0 gallows .... 17363  (49.4%)
>      1 gallows .... 17443  (49.6%)
>      2 gallows ....   323   (0.9%)
>      3 gallows ....     3
> 
> These numbers look more suspicious than the elections in Peru. 8-)
> 
> Many (if not all) of the 2- and 3-gallows words are probably due to
> omission of word spaces by the transcribers. Other data errors may
> have injected a few percent of noise in these figures.
> 
> Still, the coincidence is intriguing. It seems safe to assume that a
> "correct" Voynichese word can have at most one gallows; so we have
> an almost exact 50-50 split between 0-g and 1-g words.
> 
> Maybe this is merely an amazing linguistic coincidence. Perhaps the
> presence of gallows indicates an independent binary phonetic 
> attribute (say, voiced vs. unvoiced, high/low register, front/back); and
> Voynichese happens to be an extremely efficient language that makes
> full use of that available bit.
> 
> Or could this be something else?  Three possibilities that I can 
> think of:
> 
>   * Voynichese "words" are actually keys into a codebook-style cipher,
>     encoded in a notation resembling Roman numerals (only more complicated);
> 
>   * Voynichese is a complex "randomizing" code a la Vigenere, 
>     where the encrypted numeric text is further scrambled
>     with a second, complicated encoding responsible for the peculiar
>     word structure;
>     
>   * Voynichese "words" are generated, at least in part, by throwing
>     dice; and the gallows belong to the random part.
> 
> In all these scenarios, the presence/absence of gallows would be a
> low-order bit in the encoding. That would explain the precise 50-50
> split, in spite of the fact that the VMS word frequencies are as
> irregular as those of any natural language.
> 
> Comments, anyone?
> 
> All the best,
> 
> --stolfi
> 
> PS. I hope to post a summary of my bean-counting over the 
> weekend.
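
The tally in the quoted message can be sketched roughly as follows. This is only a minimal illustration, not the procedure actually used: it assumes the gallows are the EVA letters t, k, p and f (so platform gallows such as cth or ckh count once each), that the transcription has already been split into word tokens, and the names gallows_histogram and split_z_score are hypothetical. The second function applies a normal approximation to ask how far the posted 0-gallows and 1-gallows counts sit from an exact 50-50 split.

```python
from collections import Counter
from math import sqrt

# Assumption: the EVA gallows letters are t, k, p, f.
GALLOWS = set("tkpf")

def gallows_histogram(tokens):
    """Count tokens by the number of gallows letters each contains."""
    hist = Counter()
    for token in tokens:
        hist[sum(ch in GALLOWS for ch in token)] += 1
    return hist

def split_z_score(n0, n1):
    """z-score of n1 against a fair 50-50 split of n0 + n1 tokens
    (normal approximation to the binomial)."""
    n = n0 + n1
    return (n1 - n / 2) / (sqrt(n) / 2)

# Illustrative tokens only, not taken from the transcription file.
sample = "daiin okedy qokeedy chol shedy otaiin".split()
print(dict(gallows_histogram(sample)))

# Figures from the quoted message: 17363 tokens with 0 gallows, 17443 with 1.
print(round(split_z_score(17363, 17443), 2))
```

On the posted figures the z-score comes out at about 0.43, so the two counts are well within what a fair coin would produce over 34806 tosses. That only restates how close the split is; it says nothing about which of the explanations above is right.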