VMs: split words
I've done a little experiment around the idea of split words,
i.e. that some of the spaces in the VMs are misleading.
For each unique five-letter string that occurs within words
(there are 8400), I've looked to see if that same sequence
appears elsewhere but with a space inserted. It turns out
that for 3000 of them (36%) a split sequence does occur. In
many cases the same five-character sequence is split in 3 or
even 4 different ways (a few examples are shown at the end).
This is interesting, and I feel it gives more weight to the
hoax argument than to the natural language argument.
Take a five-character sequence from above, 'natur': you
wouldn't expect to also find all of 'n atur', 'na tur',
'nat ur' and 'natu r' occurring in the same text. I understand
there may be more opportunity for this to happen in other
languages, but 3000 out of 8400 still seems high.
I'd be interested to hear what the language experts' take on
this is...
Marke
---examples (sequence:count, '.' marks the space)---
chokc:34 ch.okc:1 cho.kc:8 chok.c:2
heolk:2 he.olk:1 heo.lk:10 heol.k:39
eodar:36 eo.dar:5 eod.ar:1 eoda.r:1
dolch:7 d.olch:2 do.lch:2 dol.ch:29
holch:3 h.olch:1 ho.lch:2 hol.ch:215
hopch:18 h.opch:1 ho.pch:3 hop.ch:1
hotch:26 h.otch:1 ho.tch:10 hot.ch:3
lolch:2 l.olch:29 lo.lch:2 lol.ch:20
dolda:2 d.olda:1 do.lda:1 dol.da:14
tshol:15 t.shol:2 tsh.ol:1 tsho.l:1
olche:156 o.lche:13 ol.che:344
hodai:100 ho.dai:17 hod.ai:2
eeoda:29 eeo.da:18 eeod.a:2
araii:30 a.raii:1 ar.aii:116
alych:1 al.ych:12 aly.ch:37
chold:19 cho.ld:1 chol.d:121
choro:13 cho.ro:3 chor.o:83
dalch:13 da.lch:1 dal.ch:90