
determining the word-break character in VMS



Dear all,
just for fun I wrote a small awk script to determine the word-break
character and got the following results:
    #  min     avg.  max  char
-------------------------------
31529    1  5.73269   13  .
20720    1  8.57968   59  o
16974    1  8.71916   76  e
16036    2  10.6487   81  h
13974    2   10.825   72  y
12108    2  12.5189   82  c
11760    1  11.7747  113  a
10990    1  9.73012   78  i
10774    1   12.645  113  d
 9550    2  14.0578   76  k
 8423    1  14.2798   79  l
 6082    1  20.0636  114  s
 5906    1  16.4927  115  r
 5842    1  16.9921  109  t
 5180    3  19.9332  114  q
 5076    3  18.6377   77  n
 1458    3  24.3395   87  p
  388    3   25.616   90  f
  313    1       18   78  *
  309    1  17.5049   68  m
   19    3  35.3158   74  x
    2   43     54.5   66  j
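
For anyone who wants to try this on their own transcription, here is a minimal sketch of how statistics like those in the table could be computed. It is not the awk script itself, which is not shown here, and it is written in Python rather than awk. It assumes that each candidate character is treated in turn as the separator, that empty pieces are ignored, and that "#" counts the resulting pieces; the names split_stats and rank_candidates and the sample string are made up for illustration.

```python
from collections import namedtuple

# One row of the table: number of pieces, then min/avg/max piece length.
Stats = namedtuple("Stats", "count min avg max")


def split_stats(text, sep):
    """Treat sep as the word break and summarise the resulting word lengths.

    Assumption: empty pieces (two separators in a row, or a separator at
    the very start or end) are ignored.
    """
    lengths = [len(piece) for piece in text.split(sep) if piece]
    if not lengths:
        return None
    return Stats(len(lengths), min(lengths),
                 sum(lengths) / len(lengths), max(lengths))


def rank_candidates(text):
    """Return (char, Stats) pairs, most frequent candidate first."""
    text = text.replace("\n", "")  # assumption: line ends carry no meaning
    rows = []
    for ch in sorted(set(text)):
        stats = split_stats(text, ch)
        if stats is not None:
            rows.append((ch, stats))
    rows.sort(key=lambda row: row[1].count, reverse=True)
    return rows


if __name__ == "__main__":
    sample = "ab.cde.f"  # made-up input, not real transcription data
    for ch, s in rank_candidates(sample):
        print(ch, s.count, s.min, round(s.avg, 4), s.max)
```

A real word-break character should then stand out by a high count combined with a small, narrow range of piece lengths.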

IMHO "." really is the word-break character, because
	- it is the most common character overall
	- the average word length is statistically consistent with the min and max
Comments ?
Claus
===================================
Claus Anders

debis Systemhaus GEI
Pascalstr. 8
52076 Aachen, Germany

phone:(+49) 2408/943-781          Fax: -430  
mailto:CAnders@xxxxxxxxx
===================================