
VMs: And another one ...



It will take some hard study to grasp this, I fear.
That's why I'm asking first whether you think it's worth the effort:

http://arxiv.org/ftp/cs/papers/0212/0212033.pdf

Abstract. This paper presents a simple unsupervised learning algorithm for
recognizing synonyms, based on statistical data acquired by querying a Web
search engine. The algorithm, called PMI-IR, uses Pointwise Mutual
Information (PMI) and Information Retrieval (IR) to measure the similarity
of pairs of words. PMI-IR is empirically evaluated using 80 synonym test
questions from the Test of English as a Foreign Language (TOEFL) and 50
synonym test questions from a collection of tests for students of English
as a Second Language (ESL). On both tests, the algorithm obtains a score of
74%. PMI-IR is contrasted with Latent Semantic Analysis (LSA), which
achieves a score of 64% on the same 80 TOEFL questions. The paper discusses
potential applications of the new unsupervised learning algorithm and some
implications of the results for LSA and LSI (Latent Semantic Indexing).
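
For anyone wondering what the PMI part amounts to in practice, here is a
minimal Python sketch. It is not the paper's actual procedure: it only
applies the textbook definition PMI(a, b) = log2( p(a, b) / (p(a) p(b)) )
to document counts and picks the candidate with the highest value. The
function names, the count tables and every number in them are made up for
illustration; the real algorithm gets its counts by querying a Web search
engine.

```python
import math


def pmi(hits_both, hits_a, hits_b, total):
    """Pointwise mutual information in bits, estimated from document counts.

    hits_both: documents containing both words
    hits_a, hits_b: documents containing each word on its own
    total: size of the document collection
    """
    if hits_both == 0 or hits_a == 0 or hits_b == 0:
        return float("-inf")  # no evidence that the words occur together
    p_both = hits_both / total
    p_a = hits_a / total
    p_b = hits_b / total
    return math.log2(p_both / (p_a * p_b))


def best_synonym(problem, choices, single_hits, pair_hits, total):
    """Return the choice whose PMI with the problem word is highest."""
    def score(choice):
        return pmi(pair_hits.get((problem, choice), 0),
                   single_hits[problem], single_hits[choice], total)
    return max(choices, key=score)


# Hypothetical counts, invented for this example (not real search results).
TOTAL = 1_000_000
SINGLE = {"big": 10_000, "large": 8_000, "small": 9_000, "green": 5_000}
PAIR = {("big", "large"): 800, ("big", "small"): 300, ("big", "green"): 50}

if __name__ == "__main__":
    for word in ("large", "small", "green"):
        value = pmi(PAIR[("big", word)], SINGLE["big"], SINGLE[word], TOTAL)
        print(word, round(value, 2))
    print("best:", best_synonym("big", ["large", "small", "green"],
                                SINGLE, PAIR, TOTAL))
```

With these toy counts "large" comes out ahead of "small" and "green",
simply because it co-occurs with "big" more often than chance would
predict.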
