Re: LSC sums for monkey texts
On 15 Jan 00, at 6:03, Jacques Guy wrote:
> I vaguely suspect that
> LSC sums would distinguish between real Rotokas and
> second-order Monkey Rotokas. Third-order and beyond,
> I am not so sure.
> What do you think?
I think that the LSC depends heavily on the construction of words,
but I also think that word construction (because of Zipf's law)
depends heavily on a sub-set of the word pool.
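For concreteness, Zipf's law says that the count of the r-th most
frequent word falls off roughly as 1/r, so a small sub-set of the word
pool carries most of the text. A minimal sketch of the rank-frequency
tabulation behind that claim (assuming plain whitespace tokenization;
the function name is mine):

```python
from collections import Counter

def rank_frequencies(text):
    """Tabulate word counts in descending order, paired with rank.

    Under Zipf's law the count at rank r is roughly C / r for
    some constant C, so a handful of top-ranked words dominate.
    """
    counts = Counter(text.lower().split())
    ranked = sorted(counts.values(), reverse=True)
    return list(enumerate(ranked, start=1))  # [(rank, count), ...]
```

Plotting rank against count on log-log axes for a real text should then
give an approximately straight line of slope near -1, at least for the
more frequent words.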
Long-range correlations in DNA codes were discussed a couple of
years ago in very prestigious journals like Nature and Science,
but to date I do not think that anybody has a convincing theory or
explanation of the meaning and validity of the results.
If you think about it, what really is the relation (in any terms)
between a piece of text and another piece many characters away? What
is the large-scale structure of a text? Long-range correlation would
mean that there are events at small scales and also at larger scales.
I can imagine that up to the sentence level or so there may be
patterns or correlations (what we call grammar?), but beyond that, I
am not sure.
Think of a dictionary: there may not be any structure beyond one
sentence or definition (still, Roget's Thesaurus conforms to Zipf's law
for the more frequent words). Consequently I see no reason why there
should be any large-scale structures in texts. (I may be very wrong.)
I suggested the other day that higher-order Monkeys generate LSC
sums which are closer and closer to those of the language the
Monkeys are based on. If I understand correctly, Rene's analysis
seems to confirm that?
I guess that the LSC could not differentiate between, let's say, an
"order-3 word-Monkey" and a real text. (Word Monkeys generate
language based on the probabilities of words rather than of
characters.) Note that 3rd-order word-Monkeys usually generate
readable (meaningless and most of the time hilarious) text.
Perhaps this is worth looking into.
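For anyone who wants to try it, here is a minimal sketch of one way to
build such a word-Monkey. The n-gram table and the restart-on-dead-end
policy are my own assumptions, not a fixed recipe, and "order 3" is
taken here to mean that each word is drawn conditional on the previous
two:

```python
import random
from collections import defaultdict

def train_word_monkey(words, order=3):
    """Collect n-gram statistics from a word list: map each
    (order-1)-word context to the words that follow it."""
    table = defaultdict(list)
    ctx_len = order - 1
    for i in range(len(words) - ctx_len):
        ctx = tuple(words[i:i + ctx_len])
        table[ctx].append(words[i + ctx_len])
    return table

def generate(table, length=30, seed=None):
    """Emit Monkey text by repeatedly sampling a follower of the
    current context; on a dead end, restart from a random context."""
    rng = random.Random(seed)
    ctx = rng.choice(list(table))
    out = list(ctx)
    while len(out) < length:
        followers = table.get(ctx)
        if not followers:                    # dead end: restart
            ctx = rng.choice(list(table))
            out.extend(ctx)
            continue
        nxt = rng.choice(followers)
        out.append(nxt)
        ctx = ctx[1:] + (nxt,)
    return " ".join(out)
```

Trained on a large enough source text, the output reuses only word
sequences that occur in the source, which is why it reads as plausible
(if meaningless) prose; one could then compute LSC sums for the output
and for the source and compare.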
Cheers,
Gabriel