This is interesting. One has to keep in mind, though, that this agglomeration should leave Zipf's law more or less unchanged. I am not sure that it would, because each merge creates a new word (one would expect too many new words, but there aren't).
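A rough way to check this on a transcription, offered only as a sketch: fit the slope of log frequency against log rank before and after merging a chosen pair of adjacent words, and compare the number of distinct words. The names `zipf_slope` and `agglomerate`, and the assumption that the text is already split into a list of word tokens, are mine for illustration and do not come from any existing tool.

```python
import math
from collections import Counter


def zipf_slope(tokens):
    """Least-squares slope of log(frequency) against log(rank).

    Assumes at least two distinct word types; a text that follows
    Zipf's law closely gives a slope near -1.
    """
    freqs = sorted(Counter(tokens).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def agglomerate(tokens, pair):
    """Join each adjacent occurrence of `pair` into a single new token."""
    merged = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2  # skip both halves of the merged pair
        else:
            merged.append(tokens[i])
            i += 1
    return merged
```

Comparing `zipf_slope(tokens)` with `zipf_slope(agglomerate(tokens, pair))`, together with `len(set(...))` for each, would show how far a given merge moves the distribution and how many new words it really introduces.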
We shouldn't forget about the labels. They are likely to be more accurate (at least as far as where a word starts and ends) than other pieces of text.