VMs: split words

I've done a little experiment around the idea of split words,
i.e. that some of the spaces in the VMs are there to misdirect.

For each unique five-letter string that occurs within words 
(there are 8400), I've looked to see if that same sequence 
appears elsewhere but with a space inserted.  It turns out 
that for 3000 of them (36%) a split version does occur.  In
many cases the same 5-char sequence is split in 3 or even 4
different ways (a few examples are shown at the end).
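
For anyone who wants to repeat the count, here is a rough
sketch of how it could be done.  It is not the script used
for the figures above.  It assumes the transcription is a
list of lines with '.' as the word separator, and it only
looks at sequences that span exactly one space.  The names
split_word_counts and summarize, and the toy input lines,
are made up for illustration and are not manuscript text.

```python
from collections import Counter

def split_word_counts(lines, n=5, sep="."):
    """Count n-char sequences inside words, and the same
    sequences when they straddle a single word break."""
    whole = Counter()  # sequence -> count fully inside one word
    split = Counter()  # (sequence, chars before the break) -> count
    for line in lines:
        words = [w for w in line.split(sep) if w]
        # Sequences that sit entirely within a word.
        for w in words:
            for i in range(len(w) - n + 1):
                whole[w[i:i + n]] += 1
        # Sequences made of the end of one word plus the
        # start of the next (assumption: one break only).
        for left, right in zip(words, words[1:]):
            for k in range(1, n):
                if len(left) >= k and len(right) >= n - k:
                    split[(left[-k:] + right[:n - k], k)] += 1
    return whole, split

def summarize(whole, split):
    """Return (unique in-word sequences, how many also occur split)."""
    split_seqs = {seq for seq, _ in split}
    also_split = sum(1 for seq in whole if seq in split_seqs)
    return len(whole), also_split

# Toy input (hypothetical, not real transcription lines).
toy_lines = ["dolchedy.qokal", "dol.chedy"]
whole, split = split_word_counts(toy_lines)
total, also_split = summarize(whole, split)
```

On the toy input, 'dolch' occurs once inside a word and once
as 'dol.ch', so it is counted among the sequences that also
occur split.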

This is interesting, and I feel it gives more weight to the
hoax argument than to the natural language argument.

Take a five-char sequence from above, 'natur'.  You wouldn't 
expect to also find all of 'n atur', 'na tur', 'nat ur' and
'natu r' occurring in the same text.  I understand there may 
be more opportunity for this to happen in other languages, 
but 3000 out of 8400 still seems high.

I'd be interested in the language experts' take on this.

Examples (the count follows the colon; '.' marks a space):

chokc:34 ch.okc:1 cho.kc:8  chok.c:2 
heolk:2  he.olk:1 heo.lk:10 heol.k:39 
eodar:36 eo.dar:5 eod.ar:1  eoda.r:1 

dolch:7  d.olch:2  do.lch:2  dol.ch:29 
holch:3  h.olch:1  ho.lch:2  hol.ch:215 
hopch:18 h.opch:1  ho.pch:3  hop.ch:1 
hotch:26 h.otch:1  ho.tch:10 hot.ch:3 
lolch:2  l.olch:29 lo.lch:2  lol.ch:20 

dolda:2  d.olda:1 do.lda:1 dol.da:14 
tshol:15 t.shol:2 tsh.ol:1 tsho.l:1 
olche:156 o.lche:13 ol.che:344 
hodai:100 ho.dai:17 hod.ai:2 
eeoda:29  eeo.da:18 eeod.a:2 
araii:30  a.raii:1  ar.aii:116 
alych:1   al.ych:12 aly.ch:37 
chold:19  cho.ld:1  chol.d:121 
choro:13  cho.ro:3  chor.o:83 
dalch:13  da.lch:1  dal.ch:90 