[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: VMs: learning
Hi Rafal (again),
Some things that I didn't properly cover in my last email:
(1) Here's a link to an accessible presentation of Shannon's idea of
statistical entropy (which is strictly what we're talking about, as opposed
to thermodynamic entropy etc). You should be able to see from this how
increasing the context length progressively increases the negentropy
("information") measure H(N) [Shannon's notation]:-
http://pespmc1.vub.ac.be/ENTRINFO.html
(2) I described the 0th-order entropy of a set of 1000 completely random
numbers in the range (0..255) as being 1.0, but this would only be true if
all instance counts of the 256 symbols were equal. The flattest 8-bit
distribution you can get with 1000 (as opposed to 1024) elements would be
(232 symbols x 4 each) + (24 symbols x 3 each): but while this is close to
flat, it isn't actually flat. As an exercise:-
(a) effective bit-length of a 4-instance symbol here = log2(1000/4) =
log2(250) = 7.96578
(b) effective bit-length of a 3-instance symbol here = log2(1000/3) =
log2(333.3) = 8.38082
(c) effective bit-length of entire stream = 232 * 4 * 7.96578 + 24 * 3 *
8.38082 = 7995.662 bits
(d) h1 = weighted average bitlength = 7995.662 / 1000 = 7.995662 bits/instance.
(e) h0 = variety of choice (independent of actual distribution) = log2(256)
= 8.0
(f) 0th-order entropy = h1/ h0 = 7.995662 / 8.0 = 0.9994577. Nearly 1.0,
but not quite. :-)
(Tech aside: this kind of entropic quantization is why (for example)
Huffman coders are usually less optimal than arithmetic coders.)
(3) If you're thinking of using associative arrays to store contexts, the
cool trick is to use the longest context you have at any point - and then
to proceed from the longest context entropy calculation to shortest context
entropy calculation, shortening the context as you go.
Cheers, .....Nick Pelling.....
______________________________________________________________________
To unsubscribe, send mail to majordomo@xxxxxxxxxxx with a body saying:
unsubscribe vms-list