
*To*: voynich@xxxxxxxx
*Subject*: Re: Curious coincidence
*From*: Jorge Stolfi <stolfi@xxxxxxxxxxxxx>
*Date*: Sun, 11 Jun 2000 20:07:30 -0300 (EST)
*Delivered-to*: reeds@research.att.com
*In-reply-to*: <39440A46.BF2E8E2C@voynich.nu>
*References*: <200006100037.VAA22543@coruja.dcc.unicamp.br> <39440A46.BF2E8E2C@voynich.nu>
*Reply-to*: stolfi@xxxxxxxxxxxxx
*Sender*: jim@xxxxxxxxxxxxx

> [Rene:] If there really is a 50% chance of having a gallows or
> not, how close are the numbers allowed to be? A difference of 80
> seems almost too small.

Well, the variance of a 0-1 coin toss is 1/4, right? So the standard
deviation of the sum of N = 34806 independent coin tosses should be
sqrt(N/4) = 93. Thus 40 ( = 80/2) is a bit better than what we would
expect, but still not suspiciously too good, I would say.

(Beware that there *is* noise in my data, at the level of 100-200
tokens if not more. So even if the original text had a perfect 50-50
split, my counts would be only approximately equal.)

> By the way, I presume that 'gallows' also include the pedestalled
> ones....

It doesn't matter for this particular statistic, since in either case
the tabulated variable is the presence or absence of [ktpf].

> Does your count include the labels (and other non-flowing text)?

It includes circular and "radial" text from the diagrams, but not
labels proper (such as the zodiac star labels), nor the key-like
sequences.

> The three options above don't really explain why it should be
> 50/50 and not, say, 40/60, unless you go to some kind of binary
> encoding, as you suggest also.

It is not necessary to assume a full binary encoding. For instance,
suppose the units-place decimal digits are encoded as

    0=nothing  1=k    2=e   3=ke   4=ch
    5=kch      6=sh   7=tch 8=ee   9=tee

Encoding a string of largish numbers (e.g. entries from a codebook)
with this encoding would result in an even split between words with
gallows and words without. Again, it is *not* necessary that the
codebook be "random", as long as it is independent of the plaintext.

By the way, I recall a couple of letters in Kircher's correspondence
about his "universal language". (I believe one of them was from Don
Caramuel y Lobkowicz, Czech-born bishop/cardinal of Naples (?), who
was of course a close friend of our close friend Marci. 8-).
I got the impression that Kircher's language was some sort of codebook
scheme, where the word codes were written in roman numerals. Do you
happen to know something more about it?

> It would have to mean also, that each word is 'constructed'. Assuming
> for the moment a word-by-word (or by syllable) translation of some
> source text, then whether or not a gallows appears depends on some
> property of the original word.
> A 50% chance could appear in many circumstances, e.g. depending on the
> number of characters in the original word (odd/even)

Ah yes, I didn't think of that. More generally, a "pseudo-random"
encoding that is applied to each word individually (as opposed to the
whole text as a single string) could also explain the 50-50 split,
without messing up the Zipfian word frequencies and the peculiar word
structure.

> stress on odd or even syllable, etc, etc. (This will not always
> lead to 50% chance either).

Indeed. So, if it's not a coincidence, we seem to be left with a
codebook scheme, word-by-word encryption, or random noise...

All the best,

--stolfi
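
The two numerical points in the message can be checked with a short Python sketch. It assumes that a fair 0/1 toss has per-toss variance p(1-p) = 1/4, so the count of gallows words among N tokens has standard deviation sqrt(N)/2, and it takes the units-digit table exactly as given in the message. The names `UNITS_CODE`, `has_gallows`, `gallows_fraction` and `count_std_dev` are illustrative only and do not come from the original post.

```python
import math
import re

# Hypothetical units-digit table from the message (0 = empty string).
UNITS_CODE = ["", "k", "e", "ke", "ch", "kch", "sh", "tch", "ee", "tee"]

# A "gallows" is any of the EVA letters k, t, p, f.
GALLOWS = re.compile(r"[ktpf]")


def has_gallows(word: str) -> bool:
    """Return True if the word contains at least one gallows letter."""
    return GALLOWS.search(word) is not None


def gallows_fraction(numbers) -> float:
    """Fraction of numbers whose units digit encodes to a gallows word."""
    numbers = list(numbers)
    if not numbers:
        raise ValueError("need at least one number")
    hits = sum(has_gallows(UNITS_CODE[n % 10]) for n in numbers)
    return hits / len(numbers)


def count_std_dev(n_tokens: int) -> float:
    """Std. deviation of the gallows count under a fair 50-50 model."""
    return math.sqrt(n_tokens * 0.5 * 0.5)


if __name__ == "__main__":
    # Every block of ten consecutive code numbers splits evenly,
    # because exactly the odd units digits carry a gallows.
    print(gallows_fraction(range(1000, 2000)))
    # Expected scatter of the count for the N quoted in the message.
    print(count_std_dev(34806))
```

In this table the gallows digits are exactly the odd ones (1, 3, 5, 7, 9), so any codebook whose units digits are evenly spread gives a 50-50 split, whether or not the codebook itself is "random".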

**Follow-Ups**:
- **Re: Curious coincidence** *From:* Gabriel Landini
- **Some first impressions** *From:* Woody Brison

**References**:
- **Curious coincidence** *From:* Jorge Stolfi
- **Re: Curious coincidence** *From:* Rene Zandbergen
