This sounds interesting ... could you expand on it a little? What sort of algorithms have you tried?

For the past few months I've been trying to work out a non-lexical, language-independent word segmentation program, to see if we could come to a more definitive conclusion on the open-ended debate over whether the spaces are word boundaries or not. I haven't been able to get it to work to a decent level of performance (though apparently nobody in Computational Linguistics can either :). I've tried collocation analysis and minimum description length in all sorts of algorithms.
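
To make "collocation analysis" a bit more concrete, here is a minimal sketch of one common flavour of it. This is not the program described above, just an illustration under stated assumptions. It scores each adjacent pair of symbols by pointwise mutual information (PMI) and guesses a boundary wherever the score drops below a threshold. The function names `adjacent_pmi` and `segment`, and the default threshold of 0.0, are made up for this example.

```python
import math
from collections import Counter


def adjacent_pmi(text):
    """Return one PMI score (in bits) per adjacent symbol pair in text."""
    unigrams = Counter(text)
    bigrams = Counter(zip(text, text[1:]))
    n_uni = len(text)
    n_bi = max(len(text) - 1, 1)  # avoid dividing by zero on 1-symbol input
    scores = []
    for a, b in zip(text, text[1:]):
        p_ab = bigrams[(a, b)] / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        # High PMI: the pair co-occurs more often than chance would predict.
        scores.append(math.log2(p_ab / (p_a * p_b)))
    return scores


def segment(text, threshold=0.0):
    """Split text wherever adjacent-pair PMI falls below threshold.

    The threshold is an assumption of this sketch, not a tuned value.
    """
    if not text:
        return []
    pieces, start = [], 0
    for i, score in enumerate(adjacent_pmi(text)):
        if score < threshold:
            pieces.append(text[start:i + 1])
            start = i + 1
    pieces.append(text[start:])
    return pieces
```

To try it on a transcription, strip the spaces first and compare the proposed boundaries with the original ones, e.g. `segment(transcription.replace(" ", ""))`, where `transcription` is whatever string you are working with. The statistics are taken from the input itself, so no lexicon is needed, but a purely local measure like this is known to be noisy on short texts.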