
Words, terms, tokens, etc.



The recent exchange about the meaning of "word" and "term" in Antoine's
messages highlights a common problem.  

Anyone who deals with text needs to distinguish between two concepts:
an individual occurrence in the running text (the instance) and a
distinct item in the vocabulary (the dictionary entry). Unfortunately
each community (if not each author) picks different names for them.
For instance, I recall that Gabriel's Zipf law paper uses "word" for
the instance, and "token" for the dictionary entry; whereas the
compiler-writer community makes precisely the opposite choice.
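
To make the distinction concrete, here is a minimal Python sketch that
counts both quantities for a short sample. It is only an illustration:
the function name, the sample sentence, and the rule of splitting on
whitespace and ignoring case are my own assumptions, not anything taken
from the papers or messages mentioned above.

```python
from collections import Counter

def count_instances_and_entries(text):
    """Return (number of instances, number of distinct dictionary entries)."""
    # Assumption for illustration: items are separated by whitespace
    # and upper/lower case is not significant.
    items = text.lower().split()
    counts = Counter(items)          # one key per distinct dictionary entry
    return len(items), len(counts)

sample = "the cat saw the dog and the dog saw the cat"
instances, entries = count_instances_and_entries(sample)
print(instances, "instances,", entries, "dictionary entries")
```

For this sample the sketch reports 11 instances but only 5 dictionary
entries, so a statistic quoted "per word" differs by more than a factor
of two depending on which meaning is intended.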

So, a plea to all list members: when you post statistics to the list,
please take the time to define your terms (or words, tokens, whatever 8-).

All the best,

--stolfi