[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: VMs: learning



Hi Rafal (again),

Some things that I didn't properly cover in my last email:

(1) Here's a link to an accessible presentation of Shannon's idea of statistical entropy (which is strictly what we're talking about, as opposed to thermodynamic entropy etc). You should be able to see from this how increasing the context length progressively increases the negentropy ("information") measure H(N) [Shannon's notation]:-
http://pespmc1.vub.ac.be/ENTRINFO.html


(2) I described the 0th-order entropy of a set of 1000 completely random numbers in the range (0..255) as being 1.0, but this would only be true if all instance counts of the 256 symbols were equal. The flattest 8-bit distribution you can get with 1000 (as opposed to 1024) elements would be (232 symbols x 4 each) + (24 symbols x 3 each): but while this is close to flat, it isn't actually flat. As an exercise:-

(a) effective bit-length of a 4-instance symbol here = log2(1000/4) = log2(250) = 7.96578
(b) effective bit-length of a 3-instance symbol here = log2(1000/3) = log2(333.3) = 8.38082
(c) effective bit-length of entire stream = 232 * 4 * 7.96578 + 24 * 3 * 8.38082 = 7995.662 bits
(d) h1 = weighted average bitlength = 7995.662 / 1000 = 7.995662 bits/instance.
(e) h0 = variety of choice (independent of actual distribution) = log2(256) = 8.0
(f) 0th-order entropy = h1/ h0 = 7.995662 / 8.0 = 0.9994577. Nearly 1.0, but not quite. :-)


(Tech aside: this kind of entropic quantization is why (for example) Huffman coders are usually less optimal than arithmetic coders.)

(3) If you're thinking of using associative arrays to store contexts, the cool trick is to use the longest context you have at any point - and then to proceed from the longest context entropy calculation to shortest context entropy calculation, shortening the context as you go.

Cheers, .....Nick Pelling.....


______________________________________________________________________ To unsubscribe, send mail to majordomo@xxxxxxxxxxx with a body saying: unsubscribe vms-list