VMs: Re: VMs as Numbers

```
The phrases "bench bits" and "almost exactly half" make me think of something.

The gallows characters really have several dichotomous features, don't they?
That is,
-    one loop vs. two loops
-    two straight legs vs. a leg and a hook
-    leg straddled by a bench or not

In addition, you have "bench + no hook + no gallows" and "bench + hook + no
gallows" characters. In other words, almost all the combinations of:

(no loops, one loop, two loops) x (no hook, hook) x (no bench, bench)

That is 3 x 2 x 2 = 12 possible combinations, so they could encode decimal
digits, though I would expect a more even distribution if so.

Perhaps it would be a good idea to look at the joint distribution of these
features.

Bruce

Jorge Stolfi wrote:

>     > [Bruce Grant:] Speaking of encoded Roman numerals, a dead
>     > giveaway ought to be the presence of seven different symbols,
>     > four of which appear in multiples (I, X, C, M) and three of
>     > which do not (V, L, D), and with certain forbidden digraph
>     > patterns:
>     >
>     > IV, VI, IX, XI, XV =>  OK   VX => not OK
>     > XL, LX, XC, CX, CL =>  OK   LC => not OK  and so on
>
> There are indeed rules of this sort that apply to the sequence
> of letters in the VMS words.  However, I haven't been able to see
> any obvious match to the patterns of standard Roman numerals.
>
> One intriguing fact is that almost exactly half of the VMS tokens have
> exactly one gallows, while the other half has none. Also, almost
> exactly half of the tokens have "bench" letters (EVA ch, sh, ee); and
> this "bench bit" seems to be independent of the "gallows bit". It is
> therefore tempting to identify those letters with the 5's of Roman
> numerals, e.g. {gallows = V, benches = L}. But then what? And why are
> there 4 different gallows, and several different benches?
>
> Perhaps the 4 gallows represent the Roman "digits" V, VI, VII, VIII,
> while the benches stand for L, LX, etc. But then what are the EVA
> letters "a"/"o", and "e", which seem to be pre- and postfix modifiers
> for other letters?
>
>     > [Robert:] A quick thought. If the VMs is mostly encoded numbers,
>     > then there is a fairly powerful test of this hypothesis.
>     >
>     > Just as Zipf's Law predicts word frequency, so Benford's
>     > Law predicts the frequencies of the initial digits of a
>     > sequence of numbers.  In a nutshell, for leading digit n,
>     > P(n) = log10(n+1) - log10(n)
>
> That law may hold for "open" number sets, where the frequency
> of a number decreases with its magnitude in the appropriate way.
> It is unlikely to hold for "closed" number sets, such as
> telephone numbers or train times.
>
> Would it hold for a numerical code? I guess that it depends on how the
> numbers are assigned to the words.
>
> All the best,
>
> --stolfi

```
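
For anyone who wants to try the Benford's Law test mentioned in the thread, here is a minimal Python sketch. It assumes the logarithm in the quoted formula is base 10 and that some decoding of words into positive integers is already available; no such decoding is proposed here. The function names and the two sample sets are made up for illustration and are not VMS data. The "open" set (powers of two) and the "closed" set (all four-digit numbers) are only meant to show the contrast stolfi describes.

```python
import math
from collections import Counter


def benford_expected():
    """Benford probabilities for leading digits 1..9 (base-10 logs)."""
    return {d: math.log10(d + 1) - math.log10(d) for d in range(1, 10)}


def leading_digit_freqs(numbers):
    """Relative frequency of each leading digit among the positive integers given."""
    digits = [int(str(n)[0]) for n in numbers if n > 0]
    if not digits:
        raise ValueError("need at least one positive integer")
    counts = Counter(digits)
    return {d: counts[d] / len(digits) for d in range(1, 10)}


def benford_distance(numbers):
    """Sum over digits of (observed - expected)^2 / expected; 0 is a perfect fit."""
    expected = benford_expected()
    observed = leading_digit_freqs(numbers)
    return sum((observed[d] - expected[d]) ** 2 / expected[d] for d in range(1, 10))


# Hypothetical illustration data, not taken from the manuscript.
open_set = [2 ** k for k in range(200)]   # spans many orders of magnitude
closed_set = list(range(1000, 10000))     # every four-digit number exactly once

print("open set distance:  ", benford_distance(open_set))
print("closed set distance:", benford_distance(closed_set))
```

The distance is a chi-square-style statistic computed on relative frequencies, so it can be compared across samples of different sizes, but it is not a significance test by itself. A real check would also need to decide which glyph groups count as the "leading digit", which the thread leaves open.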