VMs: f85r2 "four ages" diagram (word boundaries)
Re: collocation analysis. It appears to me that the text contains
fewer repeated phrases (recurring strings of words) than a random,
non-linguistic shuffle of the same words, with the same frequencies,
would produce. (I do not have a word for such a condition.) Can you
test for that with the spaces intact? And again using letter strings
only, ignoring the spaces?
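
One way such a test could be set up is a simple shuffle (permutation)
test. This is only a rough sketch, not anyone's actual method: the
function names, the choice of word pairs (n=2) and the number of
trials are all assumptions made for illustration. It counts how many
word n-grams in the text are repeats, then compares that count with
the counts from random reorderings of the same words, which keep the
word frequencies exactly as they are.

```python
import random
from collections import Counter

def repeated_ngrams(tokens, n=2):
    """Count n-gram occurrences whose n-gram appears at least twice."""
    grams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return sum(count for count in grams.values() if count > 1)

def shuffle_test(tokens, n=2, trials=1000, seed=0):
    """Compare observed repeats against frequency-preserving shuffles.

    Returns (observed, mean of shuffled counts, fraction of shuffles
    with as few or fewer repeats than observed).
    """
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    observed = repeated_ngrams(tokens, n)
    pool = list(tokens)                # same words, same frequencies
    baseline = []
    for _ in range(trials):
        rng.shuffle(pool)              # destroys order, keeps frequencies
        baseline.append(repeated_ngrams(pool, n))
    mean = sum(baseline) / trials
    p_low = sum(b <= observed for b in baseline) / trials
    return observed, mean, p_low

if __name__ == "__main__":
    # Toy input only; a real run would load a transcription instead.
    text = "the cat sat on the mat and the dog sat on the rug"
    words = text.split()                   # spaces intact: tokens are words
    letters = list(text.replace(" ", ""))  # letter strings only: tokens are letters
    print(shuffle_test(words, n=2))
    print(shuffle_test(letters, n=4))
```

If the observed count sits well below the shuffled mean and the last
number is small, the text has fewer repeated phrases than chance
ordering would give. The same function handles the "letter strings
only" case by passing single letters as tokens and using a longer n.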
What is the definition of "minimum description length"?
On 10 Jul 2004 at 10:18, Eric wrote:
> I just received and finished reading through
> D'Imperio's book (a two-month wait - so much for the
> supposed convenience of Amazon :). Anyway, one of
> the many things she mentions is the "four ages"
> diagram on f85r2 (1006229) and similarities with
> existing manuscripts and possible tie-ins with Galenic
> medicine. I've done a quick search of the archives but
> don't see any discussion. Is there any work out there
> which would help in researching this page?
> ...For the past few months I've been trying to work
> out a non-lexical, language-independent word
> segmentation program to see if we could come to some
> more definitive conclusion on the open-ended debate of
> whether the spaces are word boundaries or not. Haven't
> been able to get it to work to a decent degree of
> performance (though apparently, nobody in
> Computational Linguistics can either :). I've tried
> collocation analysis and minimum description length in
> all sorts of algorithms. I'll keep picking at it
> whenever a new idea pops up and if anyone else has
> some insights, that would be great. Quite a learning
> experience in any case. I'm looking for a new main
> point of focus, though, and folio f85r2 jumped out at
> me while reading D'Imperio.