VMs: Random Text Generation

Here, I am assuming that the numbering system used
is an additive one like the Greek or Arabic (abjad)
alphabetic numerals, where units, tens, and hundreds
each have their own distinct symbols.

Note that:
(1) this probably requires 27 or more active symbols in the underlying language (nine each for units, tens, and hundreds)
(2) this doesn't by itself produce a binomial distribution of word lengths
(3) this would probably produce a flatter symbol adjacency distribution than the one observed
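
As a rough check on points (1) and (2), here is a minimal sketch that encodes 1 to 999 with the Greek alphabetic numerals. The function name encode and the choice to draw every number in that range exactly once are my own assumptions for illustration; nothing here is claimed about how the VMs text was actually produced.

```python
from collections import Counter

# Greek alphabetic numerals: nine symbols each for units, tens, hundreds.
UNITS = "αβγδεϛζηθ"      # 1..9
TENS = "ικλμνξοπϟ"       # 10..90
HUNDREDS = "ρστυφχψωϡ"   # 100..900


def encode(n):
    """Encode 1 <= n <= 999 additively; zero digits contribute no symbol."""
    if not 1 <= n <= 999:
        raise ValueError("n must be in 1..999")
    h, rem = divmod(n, 100)
    t, u = divmod(rem, 10)
    word = ""
    if h:
        word += HUNDREDS[h - 1]
    if t:
        word += TENS[t - 1]
    if u:
        word += UNITS[u - 1]
    return word


# Assumption: every number from 1 to 999 is used exactly once.
words = [encode(n) for n in range(1, 1000)]
symbol_count = len(set("".join(words)))
length_counts = Counter(len(w) for w in words)

print(symbol_count)                   # 27
print(sorted(length_counts.items()))  # [(1, 27), (2, 243), (3, 729)]
```

Under that uniform assumption the word lengths pile up at the maximum of three symbols, and no word can be longer than three. A different choice of which numbers to encode would shift these counts, so this only illustrates the shape of the problem.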

Roman numerals (or some steganographic relative of them) would seem to fit these three points more closely.
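
For comparison, the same rough check can be run on standard subtractive Roman numerals. This is again only a sketch: to_roman is a hypothetical helper, the range 1 to 999 is an assumption, and a steganographic variant could behave quite differently.

```python
from collections import Counter

# Standard subtractive Roman numeral table, largest value first.
ROMAN = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(n):
    """Convert 1 <= n <= 3999 to a Roman numeral string."""
    if not 1 <= n <= 3999:
        raise ValueError("n must be in 1..3999")
    parts = []
    for value, symbol in ROMAN:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)


# Assumption: every number from 1 to 999 is used exactly once.
roman_words = [to_roman(n) for n in range(1, 1000)]
roman_symbols = sorted(set("".join(roman_words)))
roman_lengths = Counter(len(w) for w in roman_words)
# Count adjacent symbol pairs within each word.
roman_bigrams = Counter(
    w[i:i + 2] for w in roman_words for i in range(len(w) - 1)
)

print(len(roman_symbols))                        # 7
print(min(roman_lengths), max(roman_lengths))    # 1 12
print(roman_bigrams.most_common(5))
```

Only seven symbols are needed, word lengths spread from 1 to 12, and the pair counts are very uneven: "II" is common, while a pair such as "VX" never occurs. Whether the resulting length distribution is close to the observed one would still need to be checked against real VMs word counts.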

