Hi all,
After modifying my algorithm (discarding single-letter tokens, counting each pair only once, and computing a relative count), I'd like to show my results:
o 20%
l 18%
d 17%
y 15%
s 14.6%
k 14%
t 13.6%
r 12.9%
e 10.3%
p 8.8%
a 8.1%
q 5.4%
f 4.6%
m 4.0%
c 2.98%
n 1.8%
i 0.8%
x 0.47%
h 0.45%
j 0.004%
So the char 'o' now comes out on top: it is the differing character in 20% of all pairs of similar tokens (similar = differing in exactly one char).
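
For reference, here is a minimal Python sketch of this kind of count. It is an illustration under assumptions, not necessarily the exact procedure behind the numbers above: "similar" is taken to mean two tokens of equal length that differ at exactly one position (insertions and deletions are ignored), each distinct token is used once regardless of how often it occurs, both differing chars of a pair are credited, and the percentage is relative to the number of similar pairs. The function names char_shares and differing_chars and the sample tokens are made up for the example.

```python
from collections import Counter
from itertools import combinations


def differing_chars(a, b):
    """Return the pair of differing characters if a and b have the same
    length and differ at exactly one position, otherwise None."""
    if len(a) != len(b):
        return None
    diffs = [(x, y) for x, y in zip(a, b) if x != y]
    return diffs[0] if len(diffs) == 1 else None


def char_shares(tokens):
    """Percentage of similar token pairs in which each char is one of the
    two differing characters (assumption: both chars are credited)."""
    # Discard single-letter tokens; the set also removes repeated tokens.
    vocab = sorted({t for t in tokens if len(t) > 1})
    counts = Counter()
    total_pairs = 0
    # combinations() yields every unordered pair exactly once.
    for a, b in combinations(vocab, 2):
        diff = differing_chars(a, b)
        if diff is None:
            continue
        total_pairs += 1
        counts.update(diff)  # credit both differing chars
    if total_pairs == 0:
        return {}
    return {c: 100.0 * n / total_pairs for c, n in counts.items()}


if __name__ == "__main__":
    sample = ["dol", "dal", "dar", "o", "dol", "qol"]  # made-up tokens
    for char, share in sorted(char_shares(sample).items(),
                              key=lambda kv: -kv[1]):
        print(f"{char} {share:.1f}%")
```

Because every similar pair credits two chars under these assumptions, the shares add up to 200% rather than 100%. The pairwise loop is quadratic in the number of distinct tokens, which is fine for a small vocabulary but slow for a large one.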
Cheers
Claus