
Re: VMs: VMS Word context similarities



Where similar but not identical strings repeat, as in "kor, sor, okor",
the variation may be there to hide the fact that these symbols are the same.
Where exactly equal strings repeat, as in "qar qar qar",
it is a sign that their values are not equal.


Marke Fincher wrote:
I wrote a crude program which, given a large input text, tries to identify groups of words that occur in similar contexts.
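
The program itself is not shown in this message, so the following is only a minimal sketch of one common way to group words by shared context, not Marke's actual method. It assumes a context is just the immediately preceding and following word, compares words by cosine similarity of their context counts, and merges any pair above a threshold. The names context_vectors, cosine and group_words, and the threshold and min_count parameters, are all hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(tokens, min_count=2):
    """Map each word seen at least min_count times to counts of its neighbours."""
    freq = Counter(tokens)
    vectors = defaultdict(Counter)
    for i, word in enumerate(tokens):
        if freq[word] < min_count:
            continue
        if i > 0:
            vectors[word][("L", tokens[i - 1])] += 1  # word to the left
        if i + 1 < len(tokens):
            vectors[word][("R", tokens[i + 1])] += 1  # word to the right
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def group_words(tokens, threshold=0.8, min_count=2):
    """Return sorted groups of words whose contexts are pairwise-linked by similarity."""
    vectors = context_vectors(tokens, min_count)
    words = sorted(vectors)
    parent = {w: w for w in words}  # union-find forest, one tree per group

    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]  # path halving
            w = parent[w]
        return w

    for i, a in enumerate(words):
        for b in words[i + 1:]:
            if cosine(vectors[a], vectors[b]) >= threshold:
                parent[find(a)] = find(b)  # merge the two groups

    groups = defaultdict(list)
    for w in words:
        groups[find(w)].append(w)
    # Singletons are dropped: only words that share a context with another word are reported.
    return sorted(g for g in groups.values() if len(g) > 1)


if __name__ == "__main__":
    sample = "the cat sat on the mat the dog sat on the rug the cat ran the dog ran".split()
    print(group_words(sample))
```

Because the merge step is transitive (single-link), a low threshold can chain unrelated words into one large group, so the threshold and the minimum word count both matter a great deal for what comes out.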

When I ran it on a Bible, it suggested the following word groupings:

(a,the,and,of,to,in)
(were,be,was,is,are)
(had,have,has)
(his,their,my,your,the)
(he,i,they,who,you,him,them)
(shall,will)
(yahweh,god)

...along with some very large groups, i.e. one group for nouns, one for verbs, another for adjectives, etc.

I found this quite encouraging, so then using the same
parameters I ran it on the VMS and here is what I got:


(ar,or)
(kor,sor,okor)
(otchol,tchey)
(ol,chol,chedy,shedy,qokeey,qokeedy,qokedy)
(dar,qokaiin,okaiin,qokai!n,okal,qokar,saiin,otar)
(qokain,otai!n)
(r,l,sol)
(tar,ykar)
(shecthy,olchedy)
(ched,lkar)


...but no large groups.

Marke

______________________________________________________________________
To unsubscribe, send mail to majordomo@xxxxxxxxxxx with a body saying:
unsubscribe vms-list

