
Re: VMs: RE: Character Frequency Analysis

> One has to keep in mind that the text in many pages is extremely short. The
> comparisons may not be completely reliable. Perhaps you could add to the
> (nice) table how many characters the frequencies are drawn from, because the
> % value does not reflect the size of the data set.
Yes, that is very true.  A limited data set may skew the results, and to know which lines may be skewed it would help to have the data set size to look at.  Great idea!
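
As a rough illustration of the idea, here is a minimal Python sketch that reports each page's character count next to its percentages. The function names, the page label, and the sample text are hypothetical and are not taken from the table under discussion; whitespace is ignored, which is also an assumption.

```python
from collections import Counter

def char_frequencies(text):
    """Return (total, {char: (count, percent)}), ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    total = len(chars)
    counts = Counter(chars)
    # Guard against empty pages so we never divide by zero.
    table = {c: (n, 100.0 * n / total) for c, n in counts.items()} if total else {}
    return total, table

def format_row(page, text):
    """Format one table row with the sample size n shown next to the page label."""
    total, table = char_frequencies(text)
    cells = " ".join(f"{c}={pct:.1f}%" for c, (n, pct) in sorted(table.items()))
    return f"{page} (n={total}): {cells}"

# Hypothetical page label and text, for illustration only.
print(format_row("page-1", "aab"))
```

With n printed beside each row, a page whose percentages come from only a handful of characters is easy to spot and to treat with caution.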

Larry Roux
Syracuse University