
RE: VMs: VMS Word context similarities



<grin>  Theories and observations come and go....


Larry Roux
Syracuse University
lroux@xxxxxxx

>>> MarkeFincher@xxxxxxxxxxxxxxxxxxxxxxx 09/09/05 5:30 AM >>>

Follow up on word context matching:

Having now produced some graphs of the "word context 
match scores" for the Bible, the VMS, and 
word-randomised versions of both, I see that
a slightly different picture emerges.

The best context matches within the Bible are far
stronger than those in the VMS, extending up to 60%
for the Bible and topping out at just 31% for the 
VMS.

But, depressingly, I notice that randomising the
Bible file increases the number of context matches
below 32%, and only decreases the ones above that
level.  This indicates to me that a context match
below roughly 30% can easily come about solely 
through the natural distribution and adjacency of 
the most frequent words, with no significant 
relationship involved.

On the VMS, I can now see that only a very 
small number of the highest matches (>28%) 
are reduced by randomisation, and all the others
are increased.  Given that the strongest is
only 31%, this means that the strength of the
suggested VMS context matches is only
slightly above what you would expect from the
word frequencies alone.  :-(
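
A minimal sketch of the comparison described above, for anyone who
wants to repeat it. The post does not say how a "context match score"
is calculated, so the definition here is an assumption: each word gets
a profile of its immediate left and right neighbours, and the score
for a pair of words is the shared neighbour count as a percentage of
the combined count. All of the function names, the min_count cutoff
and the seed are hypothetical.

```python
import random
from collections import Counter, defaultdict

def context_profiles(words):
    """Map each word to a Counter of its left ("L") and right ("R") neighbours."""
    profiles = defaultdict(Counter)
    for prev, word, nxt in zip(words, words[1:], words[2:]):
        profiles[word][("L", prev)] += 1
        profiles[word][("R", nxt)] += 1
    return profiles

def match_score(a, b):
    """Assumed score: shared neighbour counts as a percentage of combined counts."""
    keys = set(a) | set(b)
    shared = sum(min(a[k], b[k]) for k in keys)
    total = sum(max(a[k], b[k]) for k in keys)
    return 100.0 * shared / total if total else 0.0

def all_scores(words, min_count=5):
    """Score every pair of words that occurs at least min_count times."""
    profiles = context_profiles(words)
    counts = Counter(words)
    vocab = sorted(w for w in profiles if counts[w] >= min_count)
    return [match_score(profiles[x], profiles[y])
            for i, x in enumerate(vocab) for y in vocab[i + 1:]]

def count_above(scores, threshold):
    """Number of word pairs whose score exceeds threshold (in percent)."""
    return sum(1 for s in scores if s > threshold)

def compare_with_shuffle(words, threshold, seed=0):
    """Count pairs above threshold in the original and in a word-shuffled copy."""
    shuffled = list(words)
    random.Random(seed).shuffle(shuffled)  # keeps word frequencies, destroys order
    return (count_above(all_scores(words), threshold),
            count_above(all_scores(shuffled), threshold))
```

Calling compare_with_shuffle(words, 28) on a tokenised text gives the
two counts to set side by side. If the shuffled count is close to the
original one, matches at that level are explained by word frequencies
alone.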

I would say, based upon these findings, that we 
should be very wary about the groupings I 
suggested in my initial post on this thread!
In fact, they are probably rubbish!
 
Marke




______________________________________________________________________
To unsubscribe, send mail to majordomo@xxxxxxxxxxx with a body saying:
unsubscribe vms-list
