Here are my results from a quick test on 6988 tokens:
j 1
x 6
i 6
q 444
m 114
f 66
p 188
e 67
t 214
h 5
n 31
d 429
s 383
o 737
k 180
r 305
a 173
l 532
y 842
I.e. "j" changed a token only once, and "y" 842 times. How would you interpret these figures?
Claus
-----Original Message-----
From: Robert Antony Hicks [mailto:rob_hicks_vms@xxxxxxxxxxx]
Sent: Friday, 21 February 2003 12:07
To: vms-list@xxxxxxxxxxx
Subject: VMs: Trying to create a test for letter 'value'
I wish to create a test for the 'value' of letters within the VMS. I have
the basic programming skills to make a utility, but I need a bit of help
establishing what algorithm to use. I'd better explain what I mean by
value:
I define the value, V, of a particular letter as a measure of how frequently
the letter is the difference between two otherwise identical words.
For example,
Consider a text containing just four words : okol okoy okcy kcy
There are 5 different letters in the text : o, k, l, y and c.
Letter o has V=2/4=0.5 because it distinguishes between okoy and okcy and
between okcy and kcy, out of the four words.
Letter k has V=0 because it does not distinguish between any of the words.
Letter l has V=1/4=0.25 because it distinguishes between okol and okoy, out
of the four words.
Letter y has V=1/4=0.25 because it distinguishes between okol and okoy, out
of the four words.
Letter c has V=1/4=0.25 because it distinguishes between okoy and okcy, out
of the four words.
Thus, a value 'ranking' for the five letters is:
o        V=0.5
l, y, c  V=0.25
k        V=0
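The definition above could be sketched in Python roughly as follows. This is only one possible reading of it, not an established test: the function name letter_values is made up for illustration, two words are taken to be "otherwise identical" when they differ by a single substituted letter (both letters get credit) or by a single extra letter (the extra letter gets credit), and the count is divided by the number of distinct words, as in the four-word example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def letter_values(words):
    """Return {letter: V} for a list of word tokens (hypothetical helper).

    Assumption: V = (number of word pairs the letter distinguishes)
                    / (number of distinct words).
    """
    vocab = sorted(set(words))
    if not vocab:
        return {}
    vocab_set = set(vocab)

    # Start every letter at zero so letters like 'k' still appear.
    counts = Counter({ch: 0 for w in vocab for ch in w})

    # Substitutions: group words by "everything except position i".
    # Words sharing a key differ only in the letter at that position.
    buckets = defaultdict(list)
    for w in vocab:
        for i in range(len(w)):
            buckets[(w[:i], w[i + 1:])].append(w[i])
    for letters in buckets.values():
        for a, b in combinations(letters, 2):
            counts[a] += 1
            counts[b] += 1

    # Insertions/deletions: dropping one letter yields another word.
    # The dict collapses repeats such as deleting either 'o' in "oo".
    for w in vocab:
        shorter = {w[:i] + w[i + 1:]: w[i] for i in range(len(w))}
        for candidate, letter in shorter.items():
            if candidate in vocab_set:
                counts[letter] += 1

    return {ch: n / len(vocab) for ch, n in counts.items()}


if __name__ == "__main__":
    values = letter_values("okol okoy okcy kcy".split())
    for letter, v in sorted(values.items(), key=lambda kv: -kv[1]):
        print(letter, v)
```

On the four-word example this gives o the value 0.5, l, y and c the value 0.25, and k the value 0. Grouping words by "everything except one position" avoids comparing every word with every other word, which should matter on a text of several thousand tokens.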
I hope this makes some semblance of sense. I need help with creating an
algorithm for this - can anybody assist with pseudo-code?
Even better - does such a test already exist, and if so, what is it called
and where can I find it?
Rob