Curious coincidence

To: voynich@xxxxxxxx
Subject: Curious coincidence
From: Jorge Stolfi <stolfi@xxxxxxxxxxxxx>
Date: Fri, 9 Jun 2000 21:37:48 -0300 (EST)
Delivered-to: reeds@research.att.com
Reply-to: stolfi@xxxxxxxxxxxxx
Sender: jim@xxxxxxxxxxxxx

Hi,

Over the past few weeks I have been counting VMS beans of various
shapes and colors, extracted from the almost complete, not-so-bad,
majority-vote transcription in EVA.

I just noticed a curious coincidence:

  total *occurrences* of words (tokens) with
    
     0 gallows .... 17363  (49.4%)
     1 gallows .... 17443  (49.6%)
     2 gallows ....   323   (0.9%)
     3 gallows ....     3

These numbers look more suspicious than the elections in Peru. 8-)

Many (if not all) of the 2- and 3-gallows words are probably due to
omission of word spaces by the transcribers. Other data errors may
have injected a few percent of noise in these figures.
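
For anyone who wants to reproduce a tally like the one above, here is a
minimal sketch in Python. It assumes the four gallows are the EVA letters
t, k, p and f (so platform gallows such as "cth" count once), and that the
transcription has already been split into word tokens. The function name
and the sample word list are made up for illustration; they are not the
actual script or data behind the figures above.

```python
from collections import Counter

# Assumption: in EVA the gallows glyphs are written t, k, p, f.
GALLOWS = set("tkpf")

def gallows_histogram(tokens):
    """Map 'number of gallows in a word' to 'number of tokens with that count'."""
    hist = Counter()
    for word in tokens:
        hist[sum(ch in GALLOWS for ch in word)] += 1
    return hist

# Tiny illustrative sample, NOT the real transcription.
sample = ["daiin", "qokeedy", "chol", "okaiin", "shedy", "qotchy"]
hist = gallows_histogram(sample)
total = sum(hist.values())
for n in sorted(hist):
    print(f"{n} gallows .... {hist[n]:6d}  ({100 * hist[n] / total:.1f}%)")
```

Counting letters per token means that a missing word space shows up
directly as a spurious 2- or 3-gallows word.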

Still, the coincidence is intriguing. It seems safe to assume that a
"correct" Voynichese word can have at most one gallows; so we have
an almost exact 50-50 split between 0-g and 1-g words.
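
As a rough check of how close to 50-50 the split is, one can compare the
two counts from the table above with a fair-coin model. The sketch below
assumes each token independently has a gallows with probability 1/2,
which is a strong and probably unrealistic assumption for running text;
it only gives a sense of scale.

```python
import math

n0, n1 = 17363, 17443      # 0-gallows and 1-gallows token counts from the table
n = n0 + n1                # ignore the few 2- and 3-gallows tokens

# Under a Binomial(n, 1/2) model the count has mean n/2 and std dev sqrt(n)/2.
sigma = math.sqrt(n) / 2
z = (n1 - n / 2) / sigma   # deviation of the 1-gallows count, in std devs

print(f"n = {n}, sigma = {sigma:.1f}, z = {z:.2f}")
```

Under that model the observed imbalance is less than half a standard
deviation, so the counts are statistically indistinguishable from an
even split.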

Maybe this is merely an amazing linguistic coincidence. Perhaps the
presence of gallows indicates an independent binary phonetic
attribute (say, voiced vs. unvoiced, high vs. low register, front vs.
back); and Voynichese happens to be an extremely efficient language
that makes full use of that available bit.

Or could this be something else?  Three possibilities that I can 
think of:

  * Voynichese "words" are actually keys into a codebook-style cipher,
    encoded in a notation resembling Roman numerals (only more complicated);

  * Voynichese is a complex "randomizing" code à la Vigenère, 
    where the encrypted numeric text is further scrambled
    with a second, complicated encoding responsible for the peculiar
    word structure;
    
  * Voynichese "words" are generated, at least in part, by throwing
    dice; and the gallows belong to the random part.

In all these scenarios, the presence/absence of gallows would be a
low-order bit in the encoding. That would explain the precise 50-50
split, in spite of the fact that the VMS word frequencies are as
irregular as those of any natural language.

Comments, anyone?

All the best,

--stolfi

PS. I hope to post a summary of my bean-counting over the 
weekend.

Follow-Ups:
- Re: Curious coincidence
  - From: Bruce Grant
- Re: Curious coincidence
  - From: Rene Zandbergen
