
- *To*: voynich@xxxxxxxx
- *Subject*: Re: VMS words and Roman numerals
- *From*: Zandbergen@xxxxxxxxxxx (Rene Zandbergen)
- *Date*: Fri, 29 Dec 2000 17:14:31 +0100
- *Delivered-to*: reeds@research.att.com
- *References*: <3A4A0DBA.BC28F9CE@voynich.nu> <200012271855.eBRIt0J00982@coruja.dcc.unicamp.br> <3A4A698E.4A60F050@voynich.nu> <200012280142.eBS1gHU01674@coruja.dcc.unicamp.br> <3A4BC47E.EC075BF@voynich.nu> <200012290324.eBT3Orc04721@coruja.dcc.unicamp.br>
- *Reply-to*: rene@xxxxxxxxxx
- *Sender*: jim@xxxxxxxxxxxxx

Jorge Stolfi wrote:

> > [Rene:] Agreed, but since we know that not all possible words do
> > exist
>
> Do we? We don't know what are the "possible words". Perhaps we *do*
> have 90% of them. If the "cipher" is indeed based on a codebook that was
> built on the fly, then that is just what we expect.
>
> > even in the list of words (as opposed to the list of
> > tokens) the probabilities 'per slot' could be unequal, i.e. for
> > example 0.5 that it's empty and the other 0.5 divided over
> > various options.
>
> I don't follow.

Let me explain what I had in mind, while not making any statement about the likelihood that this is what actually happened in 1448 or thereabouts.

Each slot could either have nothing or a single distinctive character. This way a dictionary of 511 words could be built up (omitting the empty word).

When building the dictionary, the author could, for each slot, use not one single character, but two or three different ones, which he would pick from, whenever the slot should not be empty. Thus the probability that the slot is empty is 0.5, that it has char-1 is 0.25 (for example) and that it has char-2 is also 0.25. In this way not all possible combinations will be generated. At the same time, the vocabulary size is still 511.

Alternatively, there could be 511 word patterns, and the dictionary of 6000 words could be built up by allowing the multiple choices as a scale factor independent of the word length. This is not a very realistic scenario IMHO.

Other ways of obtaining the 12-fold vocabulary size while still maintaining a symmetric near-binomial length distribution:

- The use of nulls
- Using the 'alternative choice per slot' only at the stage of writing the text. I.e. the dictionary has 'okal' but the writer could write 'okal', 'otal', 'okar' as he desired.
- A third one which is more interesting.

Both of the first two options have the major problem that they reduce the size of the actual vocabulary of the underlying text.
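For illustration only, the first scheme (empty-or-filled slots, with a choice of characters per filled slot) can be sketched in a few lines of Python. The sketch assumes nine slots, inferred solely from 511 = 2^9 - 1, and two alternative characters per slot; the slot letters (`aA`, `bB`, ...) and the names `slot_patterns` and `build_dictionary` are hypothetical placeholders, not a proposed Voynichese word grammar.

```python
import random
from itertools import product
from math import comb

N_SLOTS = 9  # assumption: inferred only from 2**9 - 1 == 511
# Hypothetical alternatives per slot (two characters each); placeholders only.
SLOT_CHOICES = ["aA", "bB", "cC", "dD", "eE", "fF", "gG", "hH", "iI"]


def slot_patterns(n=N_SLOTS):
    """All fill patterns except the empty one: each slot is empty (0) or filled (1)."""
    return [p for p in product((0, 1), repeat=n) if any(p)]


def build_dictionary(patterns, rng):
    """One word per pattern; every filled slot gets one of its alternatives at random."""
    return [
        "".join(rng.choice(SLOT_CHOICES[i]) for i, filled in enumerate(p) if filled)
        for p in patterns
    ]


patterns = slot_patterns()
words = build_dictionary(patterns, random.Random(0))

# Number of dictionary words of each length 0..9.
length_counts = [sum(1 for w in words if len(w) == k) for k in range(N_SLOTS + 1)]
# Every word that *could* be written with these alternatives (empty word excluded).
n_possible = 3 ** N_SLOTS - 1

print(len(words), n_possible)
print(length_counts)
```

Under these assumptions the dictionary still holds 511 distinct words, far fewer than the 19682 combinations the alternatives would allow, and the word lengths follow the binomial coefficients C(9, k) apart from the omitted empty word.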
In the third option, one could imagine having a system with fewer variable-length slots, where the individual distributions are skewed towards short fragments, but the 'multiple-choice' option balances this with a tendency towards fewer empty slots. (In the end each slot would still have a symmetric distribution, but the factor 12 would be explained.)

I would like to think that the binomial distribution should be an explainable result of a relatively straightforward 'encoding' by the author. All this rather theoretical reasoning should be seen as leading to clues as to what this encoding could or could not be. By 'encoding' I mean nothing more (or less) than the translation of the source text into the Voynichese alphabet.

And, yes, I also prefer a 'rule' as opposed to a code book, but the Dalgarno precedent (postcedent??) given by Stolfi stands.

Cheers for now, Rene

