VMS words and Roman numerals
> [rene:] A base-60 system as used by the Babylonians and
> understood (to the best of my knowledge) by later cultures is
> one interesting possibility [to generate the factor `12'] that
> comes to mind immediately. The base numbers could be:
> 1,5,10,20,60,300,600,1200,3600 Hmmm, only 9, not 12.
Hm, these could be the nine yes/no slots in the binomial part of the code.
Indeed, the Babylonians (and the Greeks, Romans, Chinese...) used a
digit-position code, with a different set of symbols for each position
and with the zeros omitted. Unfortunately, all but the Romans had several
choices per slot, so the word length distribution for those
numerals is not symmetrical.
But your remark made me realize that the *Roman* system, unlike the
others, is actually quite similar to the binary bit-position code,
except that it allows multiple I/X/C letters. Here is the length
distribution d_k (the number of distinct "digits" with k letters) for
the Roman "digits" from 0 to 9, without the subtractive notation:
  k  d_k  words
  -  ---  -----------
  0    1  ()
  1    2  I V
  2    2  II VI
  3    2  III VII
  4    2  IIII VIII
  5    1  VIIII
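The table above can be reproduced with a few lines of Python. This is only a sketch, not the method used for the original table; the helper name `additive_digit` is made up for the illustration.

```python
from collections import Counter

def additive_digit(n, one="I", five="V"):
    """Roman 'digit' for 0 <= n <= 9 without subtractive notation (hypothetical helper)."""
    return five * (n // 5) + one * (n % 5)

# The ten "digits" of the units position: "", I, II, ..., VIIII.
digits = [additive_digit(n) for n in range(10)]

# d[k] = number of distinct digits that are k letters long.
d = Counter(len(w) for w in digits)

for k in sorted(d):
    words = " ".join(w for w in digits if len(w) == k) or "()"
    print(f"{k:3d} {d[k]:4d}  {words}")
```

The same counts apply to the tens position (X, L) and the hundreds position (C, D), since only the letters change.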
The length of a Roman numeral between 0 and 999 will be the sum of
three variables, each with this distribution (one for each decimal
position). With a couple of unix hacks, I computed the number R_k of
distinct Roman numerals in 0-999 with each given length k:
  k  R_k
---  ---
  0    1  (empty)
  1    6  (I, V, X, L, C, D)
  2   18  (II, VI, XI, XV, XX, LI,... DC)
  3   38
  4   66
  5   99
  6  128
  7  144
  8  144
  9  128
 10   99
 11   66
 12   38
 13   18
 14    6
 15    1  (DCCCCLXXXXVIIII)
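For anyone who wants to check these counts, here is a minimal Python sketch (not the unix hacks mentioned above). It counts the lengths by brute-force enumeration and then cross-checks them by convolving the per-digit distribution with itself three times. The names `additive_roman` and `convolve` are invented for this illustration.

```python
from collections import Counter

def additive_roman(n):
    """Additive (non-subtractive) Roman numeral for 0 <= n <= 999; 0 gives ''."""
    out = ""
    for one, five, scale in (("C", "D", 100), ("X", "L", 10), ("I", "V", 1)):
        digit = (n // scale) % 10
        out += five * (digit // 5) + one * (digit % 5)
    return out

# Direct count: R[k] = number of numerals in 0-999 with exactly k letters.
lengths = Counter(len(additive_roman(n)) for n in range(1000))
R = [lengths[k] for k in range(16)]

def convolve(a, b):
    """Distribution of the sum of two independent lengths with counts a and b."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

# Cross-check: sum of three independent decimal positions, each with d_k = 1,2,2,2,2,1.
d = [1, 2, 2, 2, 2, 1]
R_conv = convolve(convolve(d, d), d)

for k, count in enumerate(R):
    print(f"{k:3d} {count:4d}")
```

Both routes give the same list, and the counts add up to 1000 as they should.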
This distribution is not quite a binomial distribution but, thanks to
the central limit theorem, it is not very far from one: specifically,
it is close to binomial(15,k), except for a constant factor:
  k  R_k  binom  ratio
---  ---  -----  -----
  0    1      1  1.000
  1    6     15  0.400
  2   18    105  0.171
  3   38    455  0.084
  4   66   1365  0.048
  5   99   3003  0.033
  6  128   5005  0.026
  7  144   6435  0.022
  8  144   6435  0.022
  9  128   5005  0.026
 10   99   3003  0.033
 11   66   1365  0.048
 12   38    455  0.084
 13   18    105  0.171
 14    6     15  0.400
 15    1      1  1.000
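The ratio column is just R_k divided by the binomial coefficient C(15,k). A short Python sketch that recomputes it from the R_k values listed above (it needs `math.comb`, so Python 3.8 or later):

```python
from math import comb

# R_k copied from the table of Roman numeral lengths for 0-999.
R = [1, 6, 18, 38, 66, 99, 128, 144, 144, 128, 99, 66, 38, 18, 6, 1]

binom = [comb(15, k) for k in range(16)]
ratios = [R[k] / binom[k] for k in range(16)]

for k in range(16):
    print(f"{k:3d} {R[k]:4d} {binom[k]:6d} {ratios[k]:6.3f}")
```

If R_k were exactly proportional to binomial(15,k), every ratio would equal 1000/32768, or about 0.031.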
The match between R_k and the binomial(15,k) distribution is not as good
as in the case of the VMS words (the ratio varies from 0.02 to 0.05
over the significant range), but it is close enough to be
suggestive.
So perhaps we do not need to assume nine independent X/empty slots in
the VMS words. Perhaps there are only (say) three slots, each of which
may be filled with a "digit" string of length between 0 and 3. Let d_k
be the number of distinct "digits" of each length k, in a given slot.
It is not necessary that these counts be in the ratio 1:3:3:1. As long
as they are symmetrical (i.e. d_0=d_3 and d_1=d_2), the word
length distribution will be symmetrical and approximately binomial.
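To illustrate the symmetry claim, here is a small Python sketch. The d_k values in it are made-up examples, not counts measured from the VMS, and `word_lengths` is a hypothetical helper name. With counts 1:3:3:1 in each of three slots the result is exactly binomial(9,k); with 1:2:2:1 it is still symmetrical but only roughly binomial; with an asymmetrical choice such as 1:3:2:1 the symmetry is lost.

```python
from math import comb

def word_lengths(d, slots):
    """Count words of each total length when each of `slots` independent
    slots holds one "digit", with d[k] distinct digits of length k."""
    dist = [1]  # zero slots: only the empty word
    for _ in range(slots):
        new = [0] * (len(dist) + len(d) - 1)
        for i, x in enumerate(dist):
            for j, y in enumerate(d):
                new[i + j] += x * y
        dist = new
    return dist

exact = word_lengths([1, 3, 3, 1], 3)      # ratio 1:3:3:1
symmetric = word_lengths([1, 2, 2, 1], 3)  # symmetrical, different ratio
skewed = word_lengths([1, 3, 2, 1], 3)     # d_1 != d_2

print("1:3:3:1 ", exact)
print("1:2:2:1 ", symmetric)
print("1:3:2:1 ", skewed)
print("binom(9)", [comb(9, k) for k in range(10)])
```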
All the best,
--stolfi