
Re: VMs: RE: RE: Best-fit 2-state PFSMs



Nick Pelling wrote:

Not necessarily regarding (P)FSMs:

Really, I think that Voynichese is only superficially language-like, and that it has an innate artificiality at its core. The idea behind examining its structure by incrementing the number of target states is to try to find a kind of transition point between the superficial language-like structure and whatever lies beneath.

Real languages apparently can't be realistically modeled without very, very large FSMs, so if a drop like this occurred, it might not happen until the FSM has hundreds of states. The largest I've ever tried to generate was 120 states.
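
For concreteness, here is a minimal sketch of that kind of state-count sweep in Python. It assumes the hmmlearn library (its CategoricalHMM, i.e. an HMM rather than a general PFSM, so it only approximates the idea), a plain-text transcription file whose name is a placeholder, and an arbitrary list of state counts. A sharp change in the per-character log-likelihood as the state count grows would be one way to spot the kind of transition point discussed above.

    import numpy as np
    from hmmlearn.hmm import CategoricalHMM

    # Load the transcription and encode each character as an integer symbol.
    text = open("voynich_transcription.txt").read().replace("\n", ".")
    alphabet = sorted(set(text))
    symbol = {c: i for i, c in enumerate(alphabet)}
    X = np.array([[symbol[c] for c in text]]).T      # shape (n_chars, 1)

    # Fit models with more and more hidden states and record the fit quality
    # as per-character log-likelihood; look for where the curve changes shape.
    for n_states in (2, 5, 10, 20, 40, 80):
        hmm = CategoricalHMM(n_components=n_states, n_iter=100, random_state=0)
        hmm.fit(X)
        print(n_states, hmm.score(X) / len(text))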

What effect does the size of the text corpus have? The HMM needed a text corpus of ~6 Mbyte for a clear result. Would it tell us something to create an enormous synthetic Voynichese corpus by Gabriel's or Jeff's method, or from Stolfi's Voynichese grammar, and then analyze it?
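
If anyone wants to try that, here is a toy sketch in Python of grinding out such a corpus from a slot-based word grammar. The slot table is NOT Stolfi's grammar (nor Gabriel's or Jeff's method), just a made-up prefix/midfix/suffix table with a vaguely Voynichese flavour, and the ~6 Mbyte target and the output filename are placeholders.

    import random

    # Toy word grammar: each slot is a list of (piece, weight) choices,
    # with "" meaning the slot is skipped.  Purely illustrative.
    SLOTS = [
        [("qo", 3), ("o", 2), ("", 5)],
        [("ch", 3), ("sh", 2), ("", 5)],
        [("ol", 2), ("ar", 2), ("al", 1), ("ee", 2), ("", 3)],
        [("dy", 3), ("y", 2), ("aiin", 3), ("", 2)],
    ]

    def pick(choices):
        # Weighted random choice over (piece, weight) pairs.
        total = sum(w for _, w in choices)
        r = random.uniform(0, total)
        for piece, w in choices:
            r -= w
            if r <= 0:
                return piece
        return choices[-1][0]

    def word():
        return "".join(pick(slot) for slot in SLOTS) or "daiin"

    # Write roughly 6 Mbyte of dot-separated synthetic "Voynichese".
    with open("synthetic_corpus.txt", "w") as out:
        size = 0
        while size < 6_000_000:
            w = word()
            out.write(w + ".")
            size += len(w) + 1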


FWIW, I suspect the transition point will fall between 10 and 20 states (for a pre-paired transcription set, such as one where qo / ol / or / al / ar / iiii / iii / ii / eeee / eee / ee / dy / cfh / ckh / cph / cth / ch / sh / eo / od all map to single tokens). For EVA, the PFSM finder would also have to unpick all the tricky pairing rules correctly (lots of local minima to avoid), so I guess 60+ states would be closer there.

What about defining a transcription alphabet that treats these digraphs/verbose cipher elements as single glyphemes and then analyzing the VMs in that transcription? That seems like a useful exercise.
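
As a concrete example of what that re-tokenisation could look like, here is a short Python sketch that maps Nick's pair list onto single tokens by greedy longest match. The function name and the sample word are just illustrative.

    # Re-tokenise an EVA word so that the verbose-cipher pairs listed above
    # become single tokens, matching the longest pair first.
    PAIRS = ["iiii", "iii", "ii", "eeee", "eee", "ee",
             "cfh", "ckh", "cph", "cth", "ch", "sh",
             "qo", "ol", "or", "al", "ar", "dy", "eo", "od"]
    PAIRS.sort(key=len, reverse=True)

    def retokenise(word):
        tokens, i = [], 0
        while i < len(word):
            for p in PAIRS:
                if word.startswith(p, i):
                    tokens.append(p)
                    i += len(p)
                    break
            else:
                # No pair starts here; keep the single character as a token.
                tokens.append(word[i])
                i += 1
        return tokens

    print(retokenise("qokeedy"))   # ['qo', 'k', 'ee', 'dy']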


Dennis



Cheers, .....Nick Pelling.....


______________________________________________________________________
To unsubscribe, send mail to majordomo@xxxxxxxxxxx with a body saying:
unsubscribe vms-list


