Re: On the word length distribution
On 23 Dec 2000, at 18:05, Jorge Stolfi wrote:
> While reviewing my "word grammar" write-up, I noticed another amazing
> statistical coincidence in the distribution of words
The fit is indeed quite remarkable. Have you tried fitting a binomial
model to the word-length distributions of other texts (Latin, English)?
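For anyone who wants to try that comparison on another text, here is a
minimal sketch in Python. It is not Jorge's procedure, only one plausible
way to do it. It assumes the lengths of the distinct words are modelled
as (length - 1) ~ Binomial(n, p), with n set to the longest observed
length minus one and p chosen to match the mean. All function names are
made up for illustration.

```python
from collections import Counter
from math import comb


def length_distribution(words):
    """Return {length: relative frequency} over the distinct words (types)."""
    counts = Counter(len(w) for w in set(words))
    total = sum(counts.values())
    return {k: c / total for k, c in sorted(counts.items())}


def fit_shifted_binomial(dist):
    """Fit (length - 1) ~ Binomial(n, p) by matching the mean.

    Assumption: n is the longest observed length minus one.
    """
    n = max(dist) - 1
    mean = sum(k * f for k, f in dist.items())
    p = (mean - 1) / n if n > 0 else 0.0
    model = {
        k: comb(n, k - 1) * p ** (k - 1) * (1 - p) ** (n - k + 1)
        for k in range(1, n + 2)
    }
    return n, p, model


def total_variation(dist, model):
    """Half the summed absolute difference between two distributions."""
    keys = set(dist) | set(model)
    return 0.5 * sum(abs(dist.get(k, 0.0) - model.get(k, 0.0)) for k in keys)
```

Feed it the word list of a Latin or English text, then compare the
total_variation value with the one obtained for the VMS transcription.
A value near zero means the binomial shape fits well.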
Let's suppose that the VMS is based on a nomenclator. In that case
word structure could (or should?) be considered arbitrary.
Zipf's law of word frequencies would be maintained, but the
length-frequency law (common words tend to be shorter than
uncommon ones) probably would not (unless the author designed a
special nomenclator so s/he could save some ink).
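One rough way to check whether a text obeys the length-frequency law is
to compare the average word length over distinct words (types) with the
average over running words (tokens). If common words tend to be shorter,
the token average should come out below the type average. With arbitrary
code words the two should be close. The Python sketch below is only an
illustration of that idea, and the function name is made up.

```python
from collections import Counter


def mean_lengths(words):
    """Return (mean length over types, mean length over tokens)."""
    freq = Counter(words)
    type_mean = sum(len(w) for w in freq) / len(freq)
    token_mean = sum(len(w) * c for w, c in freq.items()) / sum(freq.values())
    return type_mean, token_mean
```

A token mean clearly below the type mean suggests that frequent words
are the shorter ones. This is a crude diagnostic, not a significance
test.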
So, if the VMS is written with a nomenclator, would going back to
the concordances be the only viable way to crack it?
Merry Christmas to all,
Gabriel