
Re: VMs: Word Length Distribution



First four columns as before.


Fifth column shows word length distribution after changing "or", 
"ol", "al" and "qo" to single glyphs in addition to those already 
changed. 


QUOTE
"A word is an abstract sequence of symbols; a token is an occurrence 
of a word in the VMS text (delimited by blanks, line breaks, etc.) 
The length of a word or token is the number of symbols it contains. 
For this page, we will define symbol as Currier did; i.e. EVA ch and 
sh will be counted as single symbols, and so are EVA cth, ckh, etc."
UNQUOTE   ---- J. Stolfi


http://www.dcc.unicamp.br/~stolfi/voynich/00-12-21-word-length-distr/


... and here are a couple of the words and their factorization into 
the "alphabet" he used to define the word length: 

chcthdy  {Ch}{CTh}{d}{y}  4

cphey {CPh}{e}{y} 3
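
The factorization above can be sketched in a few lines of Python. This is only a rough illustration, not Stolfi's actual procedure: the glyph lists, the greedy left-to-right matching and the function names are my assumptions. In particular, an overlapping case such as "qol" comes out here as {qo}{l}, which may not be what is wanted.

```python
from collections import Counter

# Multi-letter EVA groups counted as one symbol (assumed list).
BASE_GLYPHS = ["cth", "ckh", "cph", "cfh", "ch", "sh"]
# Extra groups merged for the fifth column of the table.
EXTRA_GLYPHS = ["or", "ol", "al", "qo"]


def factor(word, glyphs=BASE_GLYPHS):
    """Greedy left-to-right split of an EVA word into symbols."""
    ordered = sorted(glyphs, key=len, reverse=True)  # try longest groups first
    symbols = []
    i = 0
    while i < len(word):
        for g in ordered:
            if word.startswith(g, i):
                symbols.append(g)
                i += len(g)
                break
        else:
            # No multi-letter group starts here: take a single letter.
            symbols.append(word[i])
            i += 1
    return symbols


def length_distribution(tokens, glyphs=BASE_GLYPHS):
    """Count how many tokens have each symbol length."""
    return Counter(len(factor(t, glyphs)) for t in tokens)


print(factor("chcthdy"), len(factor("chcthdy")))
print(factor("cphey"), len(factor("cphey")))
print(factor("qokol", BASE_GLYPHS + EXTRA_GLYPHS))
```

Passing BASE_GLYPHS + EXTRA_GLYPHS as the second argument gives the kind of counting used for the fifth column, under the same assumptions.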

When I previously said "almost identical" to his graph it was by 
eyeball. My old version of Quattro-Pro does not work on this machine. 
I have not looked at Excel functions yet and really did not 
understand a tenth of what QP had to offer. Hello Christoph, are you 
there? Someone? I thought this would be definitive by sight but it is 
not -- not to me.

I have not been able to change font with the settings for plain text 
only in Pegasus. If copied, pasted and changed to Courier New the 
columns should align unless the mail scrambles them.


I will push this a little more with consolidations. Not sure how to 
handle "eee". 


1	15	16	17	21
2	70	85	110	193
3	237	297	440	778
4	641	736	1138	1569
5	1276	1388	1798	1906
6	1701	1739	1847	1628
7	1645	1565	1267	916
8	1096	993	670	471
9	626	554	294	215
10	261	231	142	90
11	122	111	77	56
12	91	84	50	32
13	50	48	22	9
14	27	23	13	10
15	18	12	9	6
16	16	13	9	4
17	4	4	2	4
18	3	2	2	0
19	6	4	1	0
20	2	2	0	0
21	2	2	0	0
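
Since the columns are hard to compare by eye, here is a small Python sketch that summarizes each count column by its total, its peak length and its mean length. The table is copied from above; the function names are mine, and no interpretation of the columns beyond "length, then four count columns" is assumed.

```python
# Table from this message: word length, then four count columns.
TABLE = """\
1 15 16 17 21
2 70 85 110 193
3 237 297 440 778
4 641 736 1138 1569
5 1276 1388 1798 1906
6 1701 1739 1847 1628
7 1645 1565 1267 916
8 1096 993 670 471
9 626 554 294 215
10 261 231 142 90
11 122 111 77 56
12 91 84 50 32
13 50 48 22 9
14 27 23 13 10
15 18 12 9 6
16 16 13 9 4
17 4 4 2 4
18 3 2 2 0
19 6 4 1 0
20 2 2 0 0
21 2 2 0 0
"""


def parse(table):
    """Return the length column and the count columns as separate sequences."""
    rows = [[int(x) for x in line.split()] for line in table.strip().splitlines()]
    lengths = [r[0] for r in rows]
    columns = list(zip(*[r[1:] for r in rows]))
    return lengths, columns


def summarize(lengths, counts):
    """Total count, length with the highest count, and count-weighted mean length."""
    total = sum(counts)
    peak = lengths[counts.index(max(counts))]
    mean = sum(n * c for n, c in zip(lengths, counts)) / total
    return total, peak, mean


LENGTHS, COLUMNS = parse(TABLE)
SUMMARY = [summarize(LENGTHS, c) for c in COLUMNS]

for number, (total, peak, mean) in enumerate(SUMMARY, start=2):
    print(f"column {number}: total={total} peak={peak} mean={mean:.2f}")
```

The peak stays at length 6 for the first three count columns and moves to length 5 in the last one, which is also visible in the table itself.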

Regards to all, 

Knox