Re: VMs: Detecting "hands" automatically
Hello Bruce,
you wrote:
> I have been reading about algorithms for grouping
> points into clusters with similar characteristics.
> I was curious whether an algorithm like
> this could detect the difference between A and B
> hands in the VMS based on the relative letter
> frequencies.
We tended to call A and B 'languages' and '1'
and '2' the 'hands', but that's just nitpicking ...
> Using a version of the interlinear VMS
> transcription,
Can you tell me which file you used? This is
very important.
> the algorithm I used (called "K-means") classified
> 145 pages identified as hand A or B as follows:
Did you have only 145 pages, or did you select
only 145 pages?
> For the K-means algorithm, you start by choosing the
> number of clusters you are looking for (2 in this
> test) and choosing that many points as first guesses
> for the centers of the clusters. (Typically you just
> use the first N points in the list, as I did.)
>
> Then, you repeatedly do the following steps, until
> cluster assignments don't change anymore:
> 1. Assign each point (page) to the cluster whose
>    center it is nearest to.
> 2. For each cluster, re-estimate its center point
>    as the average of all the points in the cluster.
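
For reference, the two quoted steps can be sketched in Python roughly as
below. This is only a minimal sketch and not the code that was actually
run: the function name kmeans, the max_iter cap, the use of squared
Euclidean distance, and the handling of an empty cluster are all
assumptions of mine. Each page is assumed to be a list of numbers of
equal length (for example, relative letter frequencies).

```python
def kmeans(points, k, max_iter=100):
    """Cluster equal-length numeric vectors into k groups.

    Hypothetical helper: follows the two quoted steps, with squared
    Euclidean distance assumed as the measure of 'nearest'.
    """
    # First guesses: the first k points in the list, as described.
    centers = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):  # max_iter is an assumed safety cap
        # Step 1: assign each point to the cluster with the nearest center.
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # cluster assignments no longer change
        labels = new_labels
        # Step 2: re-estimate each center as the average of its points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # assumption: an empty cluster keeps its old center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Toy usage with made-up two-dimensional points (not VMS data).
toy_pages = [[0.0, 0.0], [10.0, 10.0], [0.0, 1.0], [10.0, 11.0]]
toy_labels, toy_centers = kmeans(toy_pages, 2)
```

Note that the number of clusters is an input here; the routine itself
returns only the assignments and the centers.
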
Does the algorithm tell you in the end whether the
number (2) was a good choice or not?
Cheers, Rene