Re: LSC sums for monkey texts
Rene, I have looked at your curves and noticed the following features:
1) The Se curves (calculated) are exactly as we obtained, so in regard to Se
your program seems to work the same way. 2) If I understand it correctly
(and if not, please correct me), what you call the 1st-order monkey is actually a
random permutation of letters of the original text. Indeed, the Sm LSC
sum looks like those we obtained for such permutations. 3) The higher
order monkeys are (if I understood it correctly) results of random
permutations of n-tuples of letters. The fourth order monkey is then
somehow similar to our texts obtained by random permutations of words.
Indeed, the Sm curves for the 4th-order monkey look rather similar to our
word-shuffled texts. 4) What is puzzling is the Sm curve for your
original Latin text. It is like our typical Sm curves for meaningful
texts (including Genesis in Latin) at small n, but is rather different at
large n. For all meaningful texts we obtained a well expressed growth of
Sm at n exceeding that for well formed PMP. In your example PMP seems to
be not well formed and there is no typical rise of Sm toward large n. In
order to find the reason for that, I'll email you tomorrow some texts
we used (including VMS-A and VMS-B). If you run the LSC test on them
using your program we'll be able to see if you obtain the same curves we
did or your program works differently. I would like to say that our
program was tested and retested very meticulously and we are confident it
measures OK. So, either you encountered a Latin text which is peculiar
in regard to LSC, or something is wrong with the program. Yes, our
program ignored the spaces, commas, etc. Cheers, Mark
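
For reference, the three kinds of control text mentioned above (letter permutation, n-tuple permutation, word permutation) could be produced roughly as follows. This is only a sketch of the permutation reading described in this message; it is not Jacques' monkey program, which may work differently, and the function names (`letters_only`, `shuffle_ntuples`, `shuffle_words`) are made up for illustration.

```python
import random


def letters_only(text):
    """Drop spaces, commas, digits, etc., keeping alphabetic characters only."""
    return "".join(ch for ch in text if ch.isalpha())


def shuffle_ntuples(text, n, seed=None):
    """Cut text into consecutive n-letter chunks and randomly permute them.

    With n=1 this is a plain random permutation of the letters.
    A trailing chunk shorter than n is kept and shuffled with the rest.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    rng = random.Random(seed)
    chunks = [text[i:i + n] for i in range(0, len(text), n)]
    rng.shuffle(chunks)
    return "".join(chunks)


def shuffle_words(text, seed=None):
    """Randomly permute whitespace-separated words, rejoined with single spaces."""
    rng = random.Random(seed)
    words = text.split()
    rng.shuffle(words)
    return " ".join(words)


if __name__ == "__main__":
    sample = "In principio creavit Deus caelum et terram"
    print(shuffle_ntuples(letters_only(sample), 1, seed=1))  # letter shuffle
    print(shuffle_ntuples(letters_only(sample), 4, seed=1))  # 4-tuple shuffle
    print(letters_only(shuffle_words(sample, seed=1)))       # word shuffle
```

All three shuffles keep the letter frequencies of the original text unchanged, so any difference in the curves comes from letter order alone. The word shuffle is applied before the spaces are stripped, since the word boundaries are lost afterwards.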
> Dear all,
> I have indeed done a few initial tests with the LSC technique on
> texts generated by Jacques' monkey program.
> Tentatively, I would say that the technique still sees a difference
> between real meaningful text and a 3rd or 4th order character monkey.
> Not very conclusive but quite promising.
> I used text length of only 20,000 characters, which is not enough
> to be absolutely sure of the conclusion.
> See a quick summary of what I did at:
> There are some plots in which the X-scale has numbers from 1 to 19.
> These are 'codes' and represent the following actual values:
> 1, 2, 3, 5, 7, 10, 15, 20, 30, 50, 70, 100, 150, 200, 300, etc.
> I should rewrite the code in C so that I can run it at home.
> And I would also appreciate it if Mark could send me one of his sample
> texts so that I can validate the results of my program.
> One more comment: spaces were removed from all source texts. With the
> spaces included as an additional 'character', the sums change
> Comments are welcome.
> Cheers, Rene
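
The X-scale 'codes' in the plots Rene describes can be read back with a simple lookup table. The sketch below covers only the fifteen values actually listed in his message; the values behind codes 16 to 19 are hidden in the "etc." and are deliberately not guessed here. The names `X_SCALE` and `code_to_value` are hypothetical.

```python
# Values listed in the message for plot codes 1..15.
# Codes 16..19 are not given in the message and are left out on purpose.
X_SCALE = [1, 2, 3, 5, 7, 10, 15, 20, 30, 50, 70, 100, 150, 200, 300]


def code_to_value(code):
    """Translate a 1-based plot code into the actual value it stands for."""
    if not 1 <= code <= len(X_SCALE):
        raise ValueError(f"no value listed for code {code}")
    return X_SCALE[code - 1]


if __name__ == "__main__":
    for code in range(1, len(X_SCALE) + 1):
        print(code, code_to_value(code))
```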