
Re: VMs: VMS Word context similarities



Hi Marke,

At 17:06 07/09/2005 +0100, Marke Fincher wrote:
I wrote a crude program which given a large input
text tries to identify groups of words which occur in
a similar context.

I found this quite encouraging, so then using the same
parameters I ran it on the VMS and here is what I got:

(ar,or)
(kor,sor,okor)
(otchol,tchey)
(ol,chol,chedy,shedy,qokeey,qokeedy,qokedy)
(dar,qokaiin,okaiin,qokai!n,okal,qokar,saiin,otar)
(qokain,otai!n)
(r,l,sol)
(tar,ykar)
(shecthy,olchedy)
(ched,lkar)

...but no large groups.

Interesting! Could I please ask you to run the same code over a few different groups of pages within the VMs - for example, Herbal A pages, Herbal B pages, the balneology quire, or the starred paragraph "recipe" quire? I'd be fascinated to see how such groups do (and don't) differ from one another under such a metric...


Thanks, .....Nick Pelling.....

PS: as a further refinement, you might consider seeing the effect of double-sided context as a predictor, i.e. when two different words in a sample appear flanked by the same pair of words. (I presume you're using single-sided context only ATM?)
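
To make the double-sided idea concrete, here is a minimal sketch in Python. It is not Marke's program (which I haven't seen); it assumes the text has already been split into a flat list of word tokens, and the function names and the min_shared threshold are hypothetical choices of mine. Two words are paired when both appear flanked by the same (left, right) pair of words, and the count records how many distinct flanking pairs they share.

```python
from collections import defaultdict
from itertools import combinations


def flanking_contexts(tokens):
    """Map each (left, right) word pair to the set of words seen between them."""
    contexts = defaultdict(set)
    # Slide a three-word window over the token list.
    for left, word, right in zip(tokens, tokens[1:], tokens[2:]):
        contexts[(left, right)].add(word)
    return contexts


def shared_context_pairs(tokens, min_shared=1):
    """Count, for each pair of words, how many distinct (left, right)
    flanking pairs they have in common; keep pairs at or above min_shared.
    """
    counts = defaultdict(int)
    for words in flanking_contexts(tokens).values():
        # Every two distinct words sharing this flanking pair get one vote.
        for a, b in combinations(sorted(words), 2):
            counts[(a, b)] += 1
    return {pair: n for pair, n in counts.items() if n >= min_shared}


if __name__ == "__main__":
    # Invented toy tokens, not VMs data: x and y both occur between
    # (a, b) and between (c, d).
    sample = "a x b a y b c x d c y d a z e".split()
    print(shared_context_pairs(sample, min_shared=2))
```

For single-sided context, the same structure would work with the key reduced to just the left (or just the right) neighbour. Note that word boundaries across line and paragraph breaks are ignored here, which may or may not be appropriate for a transcription.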

