RE: VMs: algorithm to generate VMS like text
Hi all,
I have emailed Jeff the code and have some stats from MONKEY on an output text of 1000 lines, 9947 words, 53860 chars. Text in FSG notation.
(I have also taken the various bits of advice and downloaded copies of monkey, bitrans, etc. That was the easy bit; the hard bit is getting a good understanding of entropy, but the recent posts should help.)
I later saw the post about 32k file sizes, so I have 3 sets of figures: the first is for the original 53k file, and the other two come from splitting that file into 2 files of approx 26k each and re-running Monkey on them.
1. 53k file 9947 words 53860 chars
h0 = 4.45943
h1 = 3.76551
h2 = 2.23392
h3 = 2.20849
2. 26k file approx 5000 words
h0 = 4.45943
h1 = 3.76424
h2 = 2.23032
h3 = 2.20558
3. another 26k file approx 5000 words
h0 = 4.45943
h1 = 3.77193
h2 = 2.22793
h3 = 2.24149
I have generated the stats for Currier and will plug them in and produce a list of values for Currier text generated by my program. All I have to do now is understand what the figures mean; I'll re-read your recent posts, Rene.
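For anyone wanting to cross-check figures like these, here is a rough Python sketch. It assumes that h0 is log2 of the number of distinct characters, h1 is the single-character entropy, and h2 and h3 are the conditional entropies of a character given the previous one and two characters. That is only a reading of the numbers, not MONKEY's documented definition, though it is consistent with h0 above: log2(22) = 4.45943, which would mean 22 distinct characters in each file. The function names are made up for illustration.

```python
import math
from collections import Counter


def conditional_entropy(text, order):
    """Entropy (bits) of a character given the previous order-1 characters.

    order=1 gives the plain single-character entropy.
    Assumption: this matches what MONKEY reports as h1, h2, h3.
    """
    # Count every run of `order` consecutive characters.
    grams = Counter(text[i:i + order] for i in range(len(text) - order + 1))
    # Count the contexts (each run minus its last character).
    contexts = Counter()
    for gram, count in grams.items():
        contexts[gram[:-1]] += count
    total = sum(grams.values())
    return -sum(
        count / total * math.log2(count / contexts[gram[:-1]])
        for gram, count in grams.items()
    )


def entropy_figures(text):
    """Return (h0, h1, h2, h3) for a non-empty text, under the assumptions above."""
    h0 = math.log2(len(set(text)))  # log2 of the alphabet size
    return (h0,) + tuple(conditional_entropy(text, n) for n in (1, 2, 3))
```

Whether spaces and line breaks count as characters is a choice left to the caller here (strip them from the text first if they should not), and it will change the results.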
As Gabriel notes, there seems to be an inconsistency in the output: some long words are generated, whereas with the spaces being inserted in the (statistically) correct places I would expect the average word length to be shorter than it currently is.
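As a rough check from the totals above, 53860 chars / 9947 words is about 5.41 chars per word, or about 4.4 if each word is assumed to carry one separator that was included in the character count. A small Python sketch for looking at the word-length distribution directly, assuming words are separated by whitespace and lengths are counted in transcription characters (the function name is made up for illustration):

```python
from collections import Counter


def word_length_stats(text):
    """Return (mean word length, Counter mapping length -> number of words).

    Assumption: words are whitespace-separated and each transcription
    character counts as one unit of length.
    """
    words = text.split()
    lengths = Counter(len(word) for word in words)
    mean = sum(len(word) for word in words) / len(words) if words else 0.0
    return mean, lengths
```

Running this on both the generated text and the source transcription, and comparing the two distributions, should show whether the excess comes from a few very long words or from a general shift.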
More later,
Brett