
VMs: word boundaries



Hi Eric,

At 20:35 12/07/2004 -0700, Eric wrote:
However, I haven't tried collocation likelihood
measurements on gibberish or heavily encrypted text
(by that I mean, a simple substitution cipher would
behave the same as plain text for collocations). I
would guess in these cases the number of significant
collocations would drop, but by how much I
don't know. I did run collocation likelihoods once on
the VMS text (see my long message about concentrating
on known languages) and didn't see any anomalies (the
character combinations we always see together - 4o -
show up).

If Voynichese is based in part on a verbose cipher (where certain digraphs ('diglyphs'?) like "qo" ('4o'), 'dy', 'or', 'ol', etc. code for special tokens), then you might also be able to tease out interesting results from it using these kinds of analyses.


However, the analytical problem is that the basic concept of "letter" becomes rather amorphous: for example, it seems that in many (perhaps most, or even all) cases, Voynichese "o" has no independent meaning - so any analysis that relies on a concept of a "state" associated with that "letter" will be misleading. Take a basic pair of words like "otedy" and "qotedy": I personally have little doubt that the latter should be parsed as "qo-t-e-dy" (or perhaps "qo-te-dy") - but what about the former?

If all "o"s are misleading (that is, if free-standing "o" has no meaning), then we should expect to parse it as "ot-e-dy" - but IIRC other analyses suggest that "tedy" is some kind of word base here, with "qo-" and "o-" as prefixes.
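To make the two competing parses concrete, here is a minimal sketch of a greedy longest-match splitter. The function name and the two unit sets are my own assumptions for illustration: the first set holds only the digraphs listed above, the second adds "ot" to stand in for the "o never stands alone" reading. It is not a claim about how the text was actually composed.

```python
def tokenize(word, units):
    """Split word into multi-glyph units where one matches, else single glyphs.

    Greedy longest-match: at each position the longest unit that fits wins.
    """
    ordered = sorted(units, key=len, reverse=True)
    tokens = []
    i = 0
    while i < len(word):
        for unit in ordered:
            if word.startswith(unit, i):
                tokens.append(unit)
                i += len(unit)
                break
        else:
            # No unit matches here, so fall back to a single glyph.
            tokens.append(word[i])
            i += 1
    return tokens


# Assumption: only the digraphs named in the discussion are treated as units.
VERBOSE_UNITS = {"qo", "dy", "or", "ol"}
# Assumption: "ot" is added as a unit so that "o" is never left on its own.
O_NEVER_ALONE = VERBOSE_UNITS | {"ot"}

print("-".join(tokenize("qotedy", VERBOSE_UNITS)))  # qo-t-e-dy
print("-".join(tokenize("otedy", VERBOSE_UNITS)))   # o-t-e-dy
print("-".join(tokenize("otedy", O_NEVER_ALONE)))   # ot-e-dy
```

Running any collocation measure over the token streams from each unit set, rather than over raw glyphs, would show how sensitive the results are to the choice of parse.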

Perhaps you might consider how to apply your box of tricks to test this "o is not a real letter" hypothesis?
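One crude probe along these lines, offered only as a sketch: for a candidate prefix, count how many distinct word types starting with it leave a remainder that also occurs as a word in its own right. The function name is hypothetical and the word list is a made-up toy sample, not counts from any transcription. A high ratio would fit the "prefix plus word base" reading, though it would not settle the question by itself.

```python
def prefix_support(words, prefix):
    """Return (n_prefixed, n_supported) over distinct word types.

    n_prefixed:  types that start with prefix and have a non-empty remainder.
    n_supported: those whose remainder also occurs as a word on its own.
    """
    vocab = set(words)
    prefixed = {w for w in vocab if w.startswith(prefix) and len(w) > len(prefix)}
    supported = {w for w in prefixed if w[len(prefix):] in vocab}
    return len(prefixed), len(supported)


# Toy sample for illustration only (an assumption, not real corpus data).
sample = ["qotedy", "otedy", "tedy", "qokedy", "okedy", "kedy",
          "daiin", "ol", "chedy"]

for prefix in ("qo", "o"):
    n_prefixed, n_supported = prefix_support(sample, prefix)
    print(prefix, n_prefixed, n_supported)
```

On a real transcription the same counts could be compared against those for an arbitrary control glyph, to see whether "qo-" and "o-" stand out.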

Cheers, .....Nick Pelling.....

