
Re: low entropy text



Claus Anders wrote:

> today I tried to compute the entropy of a Rhaeto-Romance example:
> I got (with monkey):
> h0: 4.32
> h1: 3.93
> h2: 2.69
> Nearly as low as VMS, and even h1-h2 is in the same range.
> Maybe the numbers are due to the low char count of my example.

Actually, h1 is quite normal for a 20-character alphabet (which is what
h0 = 4.32 implies, since log2(20) is about 4.32). h2 lies right in
between Latin and Voynichese, and its relatively low value could indeed
be due to the shortness of the text: the higher the order, the more a
short sample pulls the estimated entropy down. You can see some of that
happening in the graphs of the web article I mentioned yesterday.
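
For anyone wanting to check such figures by hand, here is a minimal
Python sketch of the three numbers. It assumes the usual definitions:
h0 is log2 of the alphabet size, h1 is the single-character entropy,
and h2 is the entropy of a character given the one before it. Whether
Monkey computes them in exactly this way is an assumption, and the
names shannon and entropies are made up for the illustration.

```python
import math
from collections import Counter


def shannon(counts):
    """Shannon entropy in bits of the distribution given by a Counter."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def entropies(text):
    """Return (h0, h1, h2) for a string of at least two characters.

    h0: log2 of the number of distinct characters
    h1: entropy of single characters
    h2: conditional entropy of a character given its predecessor,
        computed as H(pair) - H(first character of the pair)
        (assumed definition, see the note above)
    """
    h0 = math.log2(len(set(text)))
    h1 = shannon(Counter(text))
    pairs = Counter(zip(text, text[1:]))
    firsts = Counter(text[:-1])
    h2 = shannon(pairs) - shannon(firsts)
    return h0, h1, h2


if __name__ == "__main__":
    sample = "abababab"  # toy input: each character fully determines the next
    h0, h1, h2 = entropies(sample)
    print(f"h0={h0:.2f} h1={h1:.2f} h2={h2:.2f}")
```

On a short sample many character pairs occur only once, so the pair
statistics look more predictable than the language really is, and h2
comes out too low.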

Which brings me to the other thread: what we need is a word game
which reduces both entropy and word length while still keeping the
vocabulary size reasonable. That last feat may of course be
helped along by introducing spelling variations.

Cheers, Rene