
VMs: VMs Frequency Counts



I wrote a quick C++ program to show letter frequency counts, word counts, first letters of words, last letters of words, and a contact chart.
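
A minimal C++ sketch of how counts like these could be computed; it is not the program described above. It assumes that words are separated by whitespace, that each transcription character is one letter, and that a "contact chart" means a tally of adjacent letter pairs within a word. The names PageCounts, countPage and topLetters are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Tallies for one page of transcribed text (hypothetical layout).
struct PageCounts {
    std::map<char, int> letters;          // every letter occurrence
    std::map<char, int> firstLetters;     // first letter of each word
    std::map<char, int> lastLetters;      // last letter of each word
    std::map<std::string, int> contacts;  // adjacent letter pairs within a word
    int words = 0;                        // number of words on the page
};

// Assumption: words are whitespace-separated and one char is one letter.
PageCounts countPage(const std::string& text) {
    PageCounts pc;
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        ++pc.words;
        ++pc.firstLetters[word.front()];
        ++pc.lastLetters[word.back()];
        for (std::size_t i = 0; i < word.size(); ++i) {
            ++pc.letters[word[i]];
            if (i + 1 < word.size()) {
                ++pc.contacts[word.substr(i, 2)];
            }
        }
    }
    return pc;
}

// Returns up to n letters, most frequent first; ties stay in alphabetical order.
std::vector<char> topLetters(const PageCounts& pc, std::size_t n) {
    std::vector<std::pair<char, int>> v(pc.letters.begin(), pc.letters.end());
    std::stable_sort(v.begin(), v.end(),
                     [](const std::pair<char, int>& a,
                        const std::pair<char, int>& b) {
                         return a.second > b.second;
                     });
    std::vector<char> out;
    for (std::size_t i = 0; i < v.size() && i < n; ++i) {
        out.push_back(v[i].first);
    }
    return out;
}
```

Calling countPage on the text of one page and then topLetters(pc, 5) would give a top-5 list for that page. If the transcription marks word breaks with something other than whitespace, the splitting step would need to change.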
 
I have been running each page of text from the voy through the program and recording the results.
 
Having completed 74 pages, I am finding a very interesting trend...
 
I realize that it is hard to get definitive results from samples as small as the 1 or 2 paragraphs on each page, but in the letter frequency counts I am seeing a pattern: consecutive pages are very similar, especially as one moves further into the manuscript.
 
For example, here are the top 5 characters (descending order) for some pages:
 
f24r    o   h   c   a   e
f24v    o   h   c   d   a
 
f25r    h   i   c   o   a
f25v    o   i   h   c   a
 
f26r    e   y   d   h   o
f26v    e   y   d   o   h
 
f27r    h   c   y   o   e
f27v    h   c   o   y   d
 
f28r    o   h   c   t   y
f28v    o   h   i   c   y
 
f29r    h   o   c   y   s
f29v    h   o   c   y   i
 
f30r    h   c   o   e   y
f30v    h   c   o   i   y
 
f31r    e   y   o   d   h
f31v    e   o   h   y   a
 
...
 
f76r    e   y   h   o   d
f76v    e   y   d   h   o
 
I have this all in a spreadsheet with the top 7 characters colorized and the effect is quite striking.
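
One rough way to put a number on "very similar" is to count how many letters two top-5 lists share, ignoring rank order. This is only an illustrative measure, not something the program above is known to compute, and the function name sharedLetters is hypothetical.

```cpp
#include <string>

// Number of letters that appear in both top-N lists, ignoring rank order.
// Assumption: each list is written as a string with no repeated letters,
// e.g. "ohcae" for the f24r row in the table.
int sharedLetters(const std::string& a, const std::string& b) {
    int shared = 0;
    for (char c : a) {
        if (b.find(c) != std::string::npos) {
            ++shared;
        }
    }
    return shared;
}
```

Using the rows above, sharedLetters("ohcae", "ohcda") for f24r and f24v gives 4, and sharedLetters("eydho", "eydoh") for f26r and f26v gives 5. A measure that also takes rank order into account would be stricter.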
 
Once I have run another batch of pages through, I will post the results to the web so others can check whether they see what I see.
 

******************************
Larry Roux
Syracuse University
lroux@xxxxxxx
*******************************