
Re: VMs: VMS Word context similarities

To: vms-list@xxxxxxxxxxx
Subject: Re: VMs: VMS Word context similarities
From: Nick Pelling <nickpelling@xxxxxxxxxxxxxxx>
Date: Wed, 07 Sep 2005 21:06:02 +0100
In-reply-to: <JAEKJOMCOCMKCPMKKHGMOEBBDGAA.MarkeFincher@travelinfosystems.co.uk>
References: <5.2.1.1.0.20050904011424.03c45ec0@pop3.blueyonder.co.uk>
Reply-to: vms-list@xxxxxxxxxxx
Sender: owner-vms-list@xxxxxxxxxxx

Hi Marke,

At 17:06 07/09/2005 +0100, Marke Fincher wrote:

I wrote a crude program which, given a large input
text, tries to identify groups of words that occur in
similar contexts.

I found this quite encouraging, so then using the same
parameters I ran it on the VMS and here is what I got:

(ar,or)
(kor,sor,okor)
(otchol,tchey)
(ol,chol,chedy,shedy,qokeey,qokeedy,qokedy)
(dar,qokaiin,okaiin,qokai!n,okal,qokar,saiin,otar)
(qokain,otai!n)
(r,l,sol)
(tar,ykar)
(shecthy,olchedy)
(ched,lkar)

...but no large groups.
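
The program referred to above is not shown in this thread, so what follows is only a minimal sketch of one way such grouping might be done, not Marke's method. It assumes single-sided context (the word immediately following each word), cosine similarity between context counts, and grouping by linking any pair of words above a threshold. The names `context_vectors`, `similar_groups`, `min_count` and `threshold` are all hypothetical, as is the toy English input.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import sqrt


def context_vectors(tokens, side="right"):
    """Map each word to a Counter of its immediate neighbours on one side."""
    vectors = defaultdict(Counter)
    for left, right in zip(tokens, tokens[1:]):
        if side == "right":
            vectors[left][right] += 1   # context = the word that follows
        else:
            vectors[right][left] += 1   # context = the word that precedes
    return vectors


def cosine(a, b):
    """Cosine similarity between two Counters of context counts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similar_groups(tokens, side="right", min_count=2, threshold=0.8):
    """Link words whose context vectors are similar; return groups of 2+ words.

    min_count and threshold are illustrative assumptions, not values
    taken from the thread.
    """
    vectors = context_vectors(tokens, side)
    counts = Counter(tokens)
    words = sorted(w for w in vectors if counts[w] >= min_count)
    parent = {w: w for w in words}

    def find(w):
        # Union-find root lookup with path halving.
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    for a, b in combinations(words, 2):
        if cosine(vectors[a], vectors[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for w in words:
        groups[find(w)].append(w)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Toy English input, purely for illustration (not VMs data).
toy = "the cat sat on the mat the dog sat on the rug a cat ran a dog ran".split()
toy_groups = similar_groups(toy)
```

On a real transcription the tokens would presumably need line and page boundaries handled explicitly, since this sketch treats the text as one flat word sequence.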

Interesting! Could I please ask you to run the same code over a few different groups of pages within the VMs - for example, Herbal A pages, Herbal B pages, the balneology quire, or the starred paragraph "recipe" quire? I'd be fascinated to see how such groups do (and don't) differ from one another under such a metric...

Thanks, .....Nick Pelling.....

PS: as a further refinement, you might consider trying double-sided context as a predictor, i.e. looking for cases where two different words in a sample appear flanked by the same pair of words. (I presume you're using single-sided context only ATM?)
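
A rough sketch of the double-sided idea, under the assumption that "flanked by the same pair of words" means sharing both the immediately preceding and the immediately following word. The function name `flanked_groups`, the `min_words` parameter and the toy input are hypothetical and only illustrate the bookkeeping:

```python
from collections import defaultdict


def flanked_groups(tokens, min_words=2):
    """Group words that occur between the same (left, right) pair of words."""
    frames = defaultdict(set)
    # Slide a three-word window over the text; the outer two words form the frame.
    for left, middle, right in zip(tokens, tokens[1:], tokens[2:]):
        frames[(left, right)].add(middle)
    # Keep only frames shared by at least min_words different middle words.
    return {
        frame: sorted(words)
        for frame, words in frames.items()
        if len(words) >= min_words
    }


# Toy input, purely for illustration (not VMs data).
toy = "a cat ran b a dog ran b".split()
toy_frames = flanked_groups(toy)
```

Because a full frame match is a stricter condition than a one-sided match, a scheme like this would be expected to produce fewer and smaller groups on the same text, so any frequency cut-offs might need loosening.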



