
RE: VMs: excessive frequency of doubles...



Quoting Marke Fincher <markefincher@xxxxxxxxxxxxxxxxxxxxx>:

> 
> Let me put it another way...
> 
> If you randomly scrambled the positions of the 37000 words in the
> sample, given the frequency of the 50+ words in my table, you would
> expect to find (on average) only 2 or 3 doubles of any of them.
> There are actually 73. This is why it is significant.
> 
> Hence (IMHO) the process that generated each word was not
> independent from the creation of neighbouring words.
> 
> Marke
> 

Hm -- you could be right. Was I too quick to jump to my conclusions...?

What was the text length you used, what was the vocabulary size, and how many 
words of the vocabulary occurred only once or twice?

And wouldn't it be more illuminating to check the behaviour of frequent 
words...? Just a question...
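
For what it's worth, the "random scrambling" expectation can be sketched in a
few lines. This is only a rough illustration, not Marke's actual calculation:
the function names and the toy counts are made up, and it assumes a "double"
means the same word occurring twice in immediate succession. Under a uniform
random shuffle of N tokens, a word occurring n times is expected to produce
n*(n-1)/N such doubles, and the expectations simply add up over the words.

```python
from typing import Iterable, Mapping


def expected_doubles(counts: Mapping[str, int], total_tokens: int) -> float:
    """Expected number of adjacent identical pairs under a random shuffle.

    For a word with n occurrences among N tokens, each of the N-1 adjacent
    slots holds that word twice with probability n*(n-1) / (N*(N-1)),
    so the expectation for that word is n*(n-1) / N.
    """
    return sum(n * (n - 1) for n in counts.values()) / total_tokens


def count_doubles(tokens: Iterable[str]) -> int:
    """Count positions where a token is immediately followed by itself."""
    tokens = list(tokens)
    return sum(1 for a, b in zip(tokens, tokens[1:]) if a == b)


if __name__ == "__main__":
    # Hypothetical toy counts, NOT the frequencies from the table under discussion.
    toy_counts = {"wordA": 40, "wordB": 25, "wordC": 10}
    toy_total = 1000  # hypothetical sample length
    print(expected_doubles(toy_counts, toy_total))
```

Feeding in the real word frequencies and the real sample length would give the
figure to compare against the observed number of doubles.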

Cheers,

   Elmar



