
Re: VMS words and Roman numerals



Jorge Stolfi wrote:

>     > [rene:] A base-60 system as used by the Babylonians and
>     > understood (to the best of my knowledge) by later cultures is
>     > one interesting possibility [to generate the factor `12'] that
>     > comes to mind immediately. The base numbers could be:
>     > 1,5,10,20,60,300,600,1200,3600 Hmmm, only 9, not 12.
> 
> Hm, these could be the nine yes/no slots in the binomial part of
> the code.
> Indeed the Babylonians (and the Greek, Roman, Chinese...) used a
> digit-position code: with different sets of symbols for each position,
> omitting the zeros.  Unfortunately, all but the Romans had several
> choices per slot; so the word length distribution for those
> numerals is not symmetrical.

Agreed, but since we know that not all possible words exist, even
in the list of words (as opposed to the list of tokens), the
probabilities 'per slot' could be unequal: for example, 0.5
that the slot is empty and the other 0.5 divided over various options.
Thus, the Roman number system may not be the only choice after all.
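
To make the 'per slot' argument concrete, here is a rough sketch in Python.
It assumes nine independent slots, that a filled slot always contributes
exactly one symbol, and that the non-empty 0.5 is split 1/4, 1/6, 1/12 over
three options. Those particular numbers and the function name are made up
for illustration only; nothing here is derived from the VMS text.

```python
from fractions import Fraction
from math import comb

def length_distribution(slots):
    """Exact word-length distribution for independent slots.

    slots: one list per slot, holding the probabilities of its non-empty
    options. The remaining probability mass means 'slot is empty'.
    Assumption: every filled slot adds exactly one symbol to the word.
    """
    dist = [Fraction(1)]  # before any slot: length 0 with probability 1
    for option_probs in slots:
        p_filled = sum(option_probs, Fraction(0))
        p_empty = 1 - p_filled
        new = [Fraction(0)] * (len(dist) + 1)
        for k, p in enumerate(dist):
            new[k] += p * p_empty        # slot left empty
            new[k + 1] += p * p_filled   # slot filled by any option
        dist = new
    return dist

# Hypothetical split: empty with 1/2, three options sharing the other 1/2.
slots = [[Fraction(1, 4), Fraction(1, 6), Fraction(1, 12)]] * 9
dist = length_distribution(slots)

# Reference: the plain yes/no binomial with nine slots.
binom = [Fraction(comb(9, k), 2**9) for k in range(10)]

print(dist == binom)       # same length distribution as the binomial
print(dist == dist[::-1])  # and therefore symmetric
```

Under these assumptions only the total probability of 'filled' matters for
the length distribution, not how it is divided over the options.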

Of course, the fact that the list of words does not display a complete
binomial tree is a problem for this whole theory. At the same time,
the observed factor 12 in the word count for each given word length
could suggest various ways out of this. Like I said, more thought
needed.

The idea that the binomial word length distribution could be due to
the combination of a smaller number of (largely) independent
sub-groups with individual symmetric length distributions does of
course take us back to the prefix - stem - suffix construction.
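
As a quick check of this idea, the following Python sketch convolves the
length distributions of three independent parts. The split into 2 + 5 + 2
slots for prefix, stem and suffix is purely hypothetical, as are the helper
names; it only illustrates that symmetric parts give a symmetric total, and
that binomial parts give back the binomial.

```python
from fractions import Fraction
from math import comb

def convolve(a, b):
    """Length distribution of the concatenation of two independent parts."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def binomial(n):
    """Binomial(n, 1/2) distribution over lengths 0..n."""
    return [Fraction(comb(n, k), 2**n) for k in range(n + 1)]

# Hypothetical split of nine yes/no slots over prefix, stem and suffix.
prefix, stem, suffix = binomial(2), binomial(5), binomial(2)
word = convolve(convolve(prefix, stem), suffix)
print(word == binomial(9))   # binomial parts recombine into the binomial

# Symmetric but non-binomial parts: prefix and suffix uniform over 0..2.
uniform3 = [Fraction(1, 3)] * 3
word2 = convolve(convolve(uniform3, stem), uniform3)
print(word2 == word2[::-1])  # still symmetric
print(word2 == binomial(9))  # but no longer exactly binomial
```

The second case is the 'not-quite-but-almost' situation: the total stays
symmetric, while the tails deviate from the exact binomial.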
Here, again, the Roman number system (but also the Greek and Arabic
ones) has the interesting property that numbers over 1000 are built
using the character set for the smallest numbers (using a special
indicator for the multiplication by 1000).

Lastly, the various not-quite-but-almost binomial distributions
shown in Jorge's posts mostly differ in the areas of extreme
word lengths. When plotted on a linear scale, as in the figure on
Jorge's web page, the difference might not be noticeable at all...
 
Cheers, Rene