determining the word-break character in VMS
Dear all,
just for fun I wrote a small awk script to determine the word-break
character, and got the following results:
    #  min     avg.  max  char
------------------------------
31529    1  5.73269   13  .
20720    1  8.57968   59  o
16974    1  8.71916   76  e
16036    2  10.6487   81  h
13974    2   10.825   72  y
12108    2  12.5189   82  c
11760    1  11.7747  113  a
10990    1  9.73012   78  i
10774    1   12.645  113  d
 9550    2  14.0578   76  k
 8423    1  14.2798   79  l
 6082    1  20.0636  114  s
 5906    1  16.4927  115  r
 5842    1  16.9921  109  t
 5180    3  19.9332  114  q
 5076    3  18.6377   77  n
 1458    3  24.3395   87  p
  388    3   25.616   90  f
  313    1       18   78  *
  309    1  17.5049   68  m
   19    3  35.3158   74  x
    2   43     54.5   66  j
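
For anyone who wants to try something similar, here is a rough sketch in Python (not the awk script itself) of one way such a table could be produced. It rests on an assumption the post does not spell out: that min, avg. and max are the distances between consecutive occurrences of each character in the transcription, and that "#" counts those distances. The function names `gap_stats` and `print_table` are made up for this illustration, and line breaks are simply ignored.

```python
from collections import defaultdict


def gap_stats(text):
    """Return {char: (count, min_gap, avg_gap, max_gap)}.

    Assumption: a "gap" is the distance between two consecutive
    occurrences of the same character, so two adjacent occurrences
    give a gap of 1. Characters seen only once have no gap and are
    left out of the result.
    """
    # Assumption: treat the transcription as one continuous stream,
    # dropping spaces and line breaks.
    stream = "".join(text.split())

    last_seen = {}             # char -> position of its previous occurrence
    gaps = defaultdict(list)   # char -> list of gaps observed so far
    for pos, ch in enumerate(stream):
        if ch in last_seen:
            gaps[ch].append(pos - last_seen[ch])
        last_seen[ch] = pos

    return {
        ch: (len(g), min(g), sum(g) / len(g), max(g))
        for ch, g in gaps.items()
    }


def print_table(stats):
    """Print the statistics, most frequent character first."""
    print(f"{'#':>5}  {'min':>3}  {'avg.':>8}  {'max':>3}  char")
    rows = sorted(stats.items(), key=lambda item: -item[1][0])
    for ch, (count, lo, avg, hi) in rows:
        print(f"{count:>5}  {lo:>3}  {avg:>8.4f}  {hi:>3}  {ch}")


if __name__ == "__main__":
    # Made-up sample string, not real transcription data.
    print_table(gap_stats("ab.cd.e."))
```

With the made-up sample "ab.cd.e." the "." occurs at positions 2, 5 and 7, so it gets two gaps (3 and 2) and `gap_stats` returns `(2, 2, 2.5, 3)` for it. A real word separator should show up the way "." does in the table above: a high count combined with a small average and a small maximum.
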
IMHO "." really is the word-break character, because
- it is the most common character overall
- the average word length it gives is statistically consistent with
  its min and max
Comments?
Claus
===================================
Claus Anders
debis Systemhaus GEI
Pascalstr. 8
52076 Aachen, Germany
phone:(+49) 2408/943-781 Fax: -430
mailto:CAnders@xxxxxxxxx
===================================