
Re: LSC sums for monkey texts



On 15 Jan 00, at 6:03, Jacques Guy wrote:
> I vaguely suspect that
> LSC sums would distinguish between real Rotokas and
> second-order Monkey Rotokas. Third-order and beyond,
> I am not so sure.
> What do you think?

I think that the LSC depends heavily on the construction of words, 
but I also think that word construction (because of Zipf's law) 
depends heavily on a sub-set of the word pool.

Long-range correlations in codes were discussed for DNA a couple 
of years ago in very prestigious journals like Nature and Science, 
but to date I do not think that anybody has produced a convincing 
theory or explanation of the meaning and validity of the results.

If you think about it, what really is the relation (in any terms) 
between a piece of text and another piece many characters away? 
What is the large-scale structure of a text? Long-range correlation 
would mean that there are events at small scales and also at larger 
scales. I can imagine that up to the sentence level or so there may 
be patterns or correlations (what we call grammar?), but beyond 
that, I am not sure.
Think of a dictionary: there may not be any structure beyond one 
sentence or definition (still, Roget's Thesaurus conforms to Zipf's 
law for the more frequent words). Consequently I see no reason why 
there should be any large-scale structure in texts. (I may be very 
wrong.)
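The Zipf's-law observation above can be checked directly: rank the words of a text by frequency and see whether rank times frequency stays roughly constant. A minimal sketch (the sample sentence is just an illustration, not list data):

```python
from collections import Counter

def zipf_table(text):
    """Rank words by frequency and report rank * frequency,
    which Zipf's law predicts to be roughly constant."""
    counts = Counter(text.lower().split())
    return [(rank, word, freq, rank * freq)
            for rank, (word, freq) in enumerate(counts.most_common(), start=1)]

sample = "the cat sat on the mat and the dog sat on the log"
for rank, word, freq, product in zipf_table(sample)[:3]:
    print(rank, word, freq, product)
```

On a text this short the products fluctuate wildly; the law only emerges statistically on large corpora such as the thesaurus mentioned above.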

I suggested the other day that higher-order Monkeys generate LSC 
curves which come closer and closer to that of the language the 
Monkeys are based on. If I understand correctly, Rene's analysis 
seems to confirm that?

I guess that the LSC could not differentiate between, let's say, an 
"order-3 word-Monkey" and a real text. (Word Monkeys generate 
language based on the probabilities of words, rather than 
characters.) Note that third-order word-Monkeys usually generate 
readable (meaningless, and most of the time hilarious) text.
Perhaps this is worth looking into.
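For anyone who wants to try it, a word-Monkey of this kind is easy to build: record which word follows each (order-1)-word context in a training text, then sample successors. A minimal sketch (function names and the dead-end restart policy are my own choices, not a standard implementation):

```python
import random
from collections import defaultdict

def train_word_monkey(text, order=3):
    """Map each (order-1)-word context to the list of words
    that follow it in the training text."""
    words = text.split()
    ctx = order - 1
    table = defaultdict(list)
    for i in range(len(words) - ctx):
        table[tuple(words[i:i + ctx])].append(words[i + ctx])
    return table

def babble(table, length=20, seed=0):
    """Generate Monkey text by repeatedly sampling a successor
    for the most recent context."""
    rng = random.Random(seed)
    context = rng.choice(list(table))
    out = list(context)
    while len(out) < length:
        successors = table.get(tuple(out[-len(context):]))
        if not successors:  # dead end: restart from a random context
            successors = table[rng.choice(list(table))]
        out.append(rng.choice(successors))
    return " ".join(out)
```

Trained on a real corpus, the output is locally plausible but globally meaningless, which is exactly what makes it a good test case for the LSC.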

Cheers,

Gabriel