
VMs: Word Length Distribution

Maybe this has been exhausted on the list in the past, but I will
bring it up again anyway. What could account for the binomial
distribution of vocabulary word lengths, as shown by Jorge Stolfi?


I think this is an acid test for any scheme that someone might
devise. Does the result show that the Currier transcription is pretty
close to the mark, or is it an artifact of the transcription? I ran
a check on a long section, almost 8000 tokens, of an unmodified EVA
transcription that I have been working with, and it showed almost
identical results. It will be interesting to see whether this holds
with shorter sections and, if not, where and how it breaks. Does it
vary from one section to another? Is it consistent with a real
vocabulary in the writing about certain subjects or with any known 
specific cipher? Who would have such a vocabulary? Could (the) 
elimination of certain (sometimes) unessential parts of speech or 
letters explain it? (Shorthand, Nick?) Pidgin? Maybe there was a 
European Pidgin that became obsolete. 
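
For anyone who wants to repeat the check on their own section, here is
a rough Python sketch of one way to do it. It is not Stolfi's actual
procedure. The function names are made up for illustration, the input
is assumed to be a list of already-split word tokens, and the "fit" is
just a binomial on (length - 1) with n taken from the longest word and
p chosen to match the observed mean. Word length is counted in
transcription characters, so the result depends on the alphabet used
(EVA, Currier, etc.).

```python
from collections import Counter
from math import comb


def length_histogram(tokens, by_type=True):
    """Relative frequency of each word length.

    by_type=True counts each distinct word once (the vocabulary);
    by_type=False counts every occurrence in the running text.
    """
    words = set(tokens) if by_type else list(tokens)
    counts = Counter(len(w) for w in words)
    total = sum(counts.values())
    return {k: counts[k] / total for k in sorted(counts)}


def binomial_fit(hist):
    """Binomial(n, p) probabilities for lengths 1..n+1.

    Assumption for this sketch: (length - 1) is modelled as binomial,
    n is the longest observed length minus 1, and p matches the mean.
    """
    n = max(hist) - 1
    mean = sum((k - 1) * f for k, f in hist.items())
    p = mean / n if n > 0 else 0.0
    return {
        k: comb(n, k - 1) * p ** (k - 1) * (1 - p) ** (n - k + 1)
        for k in range(1, n + 2)
    }


def compare(tokens, by_type=True):
    """Print observed and fitted frequencies side by side."""
    hist = length_histogram(tokens, by_type)
    fit = binomial_fit(hist)
    for k in sorted(fit):
        print(f"{k:2d}  observed {hist.get(k, 0.0):.3f}  binomial {fit[k]:.3f}")
```

Running compare() on the whole text and then on each section separately
would show whether the shape holds for shorter samples and where it
starts to break.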

Ciao ......... Knox
