Re: AW: VMs: Character repetition

On Thursday 16 September 2004 19:22, Koontz John E wrote:
> Would we, perhaps, be safer here to say that the modal token length has to
> do with the token construction rules?  In case, for example, token weren't
> words, but were some other unit? 

I think that this is due to *word* construction rules and their relative 
frequencies (i.e. the lexicon rules and the frequency of words in the text). 
After all, the token length distribution is biased by the relative frequencies 
(lots of THE, AND, FOR, etc.).
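
To make the distinction concrete, here is a minimal sketch (not the analysis 
discussed in this thread) that compares the length distribution of tokens 
(every occurrence counted) with that of types (each distinct word counted 
once). The sample sentence and the function names are my own assumptions, 
chosen only for illustration.

```python
from collections import Counter

# Hypothetical sample text, chosen so that short words repeat often.
SAMPLE = "the window and the garden and the market of the school"


def length_distributions(text):
    """Return (token_lengths, type_lengths) as Counters keyed by word length."""
    words = text.lower().split()
    # Tokens: every occurrence counts, so frequent words dominate.
    token_lengths = Counter(len(w) for w in words)
    # Types: each distinct word counts once (the "lexicon" view).
    type_lengths = Counter(len(w) for w in set(words))
    return token_lengths, type_lengths


def modal_length(counter):
    """Most common length; ties are resolved in favour of the shorter length."""
    return max(sorted(counter), key=lambda length: counter[length])


if __name__ == "__main__":
    tokens, types = length_distributions(SAMPLE)
    print("modal token length:", modal_length(tokens))
    print("modal type length:", modal_length(types))
```

On this toy sample the modal token length is 3 (pulled down by the repeated 
THE and AND) while the modal type length is 6, which is the bias I mean.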

> Although, for example, I wouldn't expect 
> a list of numbers to behave this way under scrambling.  But a list of
> "letter encodings" or "numbers + grammatical endings" or "syllable
> encodings" might.

Remarkably, DNA has coding and non-coding parts and no spaces; nevertheless, 
spectral analysis shows a peak at length=3... we know that DNA is written in 
3-letter words (codons), each of which codes for a specific amino acid.
I've seen the analysis of several million digits of pi that do not show such 


