Re: VMs: Strange pair statistics



--- Jeff <jeff@xxxxxxxxxxxxxxxxxxxxxx> wrote:

> Take a look at this table of the top 21 VMS EVA
> pairs. 
> 
> ch - 1075
> ho - 820
 <snipped>

This is certainly an interesting way to look at
the VMs. It has been done before, of course,
without leading to any breakthroughs, but the
earlier results aren't really available in any
detail.

The pair frequency table could be compared to
a single-character frequency table, in order to
decide which pairs are really 'suspiciously'
frequent, i.e. more frequent than the frequencies
of their two characters alone would suggest. One
should also do this for a text in a known language,
to have a yardstick. English is good, since it has
the very frequent 'th', which the method should be
able to detect.
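
A rough sketch of that comparison, in Python. It is only
an illustration under some assumptions: words are taken to
be separated by whitespace (an EVA transcription that uses
'.' as the word separator would need converting first),
only pairs inside words are counted, and the 'expected'
count simply treats the two characters as independent.
The name pair_scores is made up for this example.

```python
from collections import Counter

def pair_scores(text):
    """Return {pair: (observed, expected, ratio)} for in-word character pairs.

    Assumption: words are whitespace-separated; 'expected' is the count
    the pair would have if its two characters occurred independently.
    """
    words = text.split()
    chars = Counter(c for w in words for c in w)
    pairs = Counter(w[i:i + 2] for w in words for i in range(len(w) - 1))
    n_chars = sum(chars.values())
    n_pairs = sum(pairs.values())

    scores = {}
    for pair, observed in pairs.items():
        p_first = chars[pair[0]] / n_chars
        p_second = chars[pair[1]] / n_chars
        expected = n_pairs * p_first * p_second
        scores[pair] = (observed, expected, observed / expected)
    return scores

def top_pairs(text, n=21):
    """List the n pairs with the highest observed/expected ratio."""
    scores = pair_scores(text)
    return sorted(scores.items(), key=lambda kv: kv[1][2], reverse=True)[:n]
```

A ratio well above 1 marks a pair as more frequent than its
characters alone predict. Running the same function on an
English text should put 'th' near the top if the method is
doing its job.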

> Note also the pairs yc, yd and rc. These can occur
> within words and
> also appear as y.c, y.d and r.c respectively. 

A table with two columns: inside word and across
word spaces (plus the total) would be very 
interesting.
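
Such a table could be built along these lines. Again just
a sketch: it assumes whitespace-separated words, treats
line breaks as ordinary word spaces, and the name
pair_table is invented for the example.

```python
from collections import Counter

def pair_table(text):
    """Return {pair: (inside_word, across_space, total)}.

    Assumption: words are whitespace-separated, and a line break
    counts as an ordinary word space.
    """
    words = text.split()
    # Pairs of adjacent characters within one word, e.g. 'yd'
    inside = Counter(w[i:i + 2] for w in words for i in range(len(w) - 1))
    # Last character of a word plus first character of the next, e.g. 'y.d'
    across = Counter(a[-1] + b[0] for a, b in zip(words, words[1:]))

    table = {}
    for pair in set(inside) | set(across):
        table[pair] = (inside[pair], across[pair], inside[pair] + across[pair])
    return table

def print_table(text, n=21):
    """Print the n most frequent pairs, sorted by total count."""
    rows = sorted(pair_table(text).items(), key=lambda kv: kv[1][2], reverse=True)
    print("pair  inside  across  total")
    for pair, (ins, acr, tot) in rows[:n]:
        print(f"{pair:<4}  {ins:>6}  {acr:>6}  {tot:>5}")
```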

> Also why is their position so high in the table?

They combine frequent word enders with frequent
word starters. Nothing more mysterious, IMHO.
If word spaces are arbitrarily inserted, then the
table should show an essentially equal ratio
of in-word vs. cross-word statistics for each
pair, provided the sample is big enough.
Again, one should try this for an English (or
Italian...) text as a reference for comparison.
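
One possible way to check the 'equal ratio' idea, sketched
in Python. For each pair it computes the fraction of
occurrences that straddle a word space and returns it next
to the overall fraction for the whole text. The function
name and the min_total cut-off are assumptions made for
this example, as is the use of whitespace as the word
separator.

```python
from collections import Counter

def cross_word_fractions(text, min_total=1):
    """Return (overall_fraction, {pair: fraction}) of cross-word occurrences.

    Assumption: words are whitespace-separated. Pairs seen fewer than
    min_total times are skipped, since their fractions are mostly noise.
    """
    words = text.split()
    inside = Counter(w[i:i + 2] for w in words for i in range(len(w) - 1))
    across = Counter(a[-1] + b[0] for a, b in zip(words, words[1:]))

    n_all = sum(inside.values()) + sum(across.values())
    if n_all == 0:
        raise ValueError("text contains no character pairs")
    overall = sum(across.values()) / n_all

    per_pair = {}
    for pair in set(inside) | set(across):
        total = inside[pair] + across[pair]
        if total >= min_total:
            per_pair[pair] = across[pair] / total
    return overall, per_pair
```

If the spaces were inserted arbitrarily, the per-pair
fractions should all lie close to the overall one, given
a big enough sample. In an English text they will be spread
widely between 0 and 1.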

Cheers, Rene
