Re: VMs: RE: RE: Best-fit 2-state PFSMs
Hi everyone,
At 00:10 11/03/2005 -0600, Dennis wrote:
What effect does the size of the text corpus have? The HMM
needed a text corpus of ~6 Mbyte for a clear result. Would it tell us
something to create an enormous synthetic Voynichese corpus using Gabriel's
or Jeff's method, or Stolfi's Voynichese grammar, and then analyze it?
Note that a grammar is a kind of generative probabilistic state machine,
but not necessarily the same kind as described by Ben Preece.
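To make the distinction concrete, here is a minimal sketch of what "generative probabilistic state machine" means: states emit glyphs, and weighted transitions decide what comes next. The states, weights, and glyph sets below are invented purely for illustration, not fitted to the VMs or to Ben's machines.

```python
import random

# Hypothetical 2-state generative PFSM. All numbers and glyph sets
# here are made up for illustration only.
TRANSITIONS = {
    "A": [("A", 0.3), ("B", 0.6), ("END", 0.1)],
    "B": [("A", 0.5), ("B", 0.2), ("END", 0.3)],
}
EMISSIONS = {
    "A": ["q", "o", "d"],  # placeholder glyph sets, not real EVA columns
    "B": ["l", "y", "n"],
}

def generate_word(rng, max_len=10):
    """Walk the machine from state A, emitting one glyph per state visit."""
    state, out = "A", []
    while state != "END" and len(out) < max_len:
        out.append(rng.choice(EMISSIONS[state]))
        states, weights = zip(*TRANSITIONS[state])
        state = rng.choices(states, weights=weights)[0]
    return "".join(out)

rng = random.Random(1)
print([generate_word(rng) for _ in range(5)])
```

A grammar in Stolfi's column style corresponds to a machine like this, but with the crucial difference that the same letter can be emitted from several different states.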
For example, there's good reason to think that EVA <o> functions
differently in "qo" and "ol": while "qol" does occur in the VMs, it
normally occurs only on pages where you typically find free-standing (i.e.
non-"ol"/"al"-paired) "l" characters. So, a state machine where <o> maps to
a single state would not be able to capture this behaviour satisfactorily.
In fact, this is basically true of any generative grammar where an
individual letter (like <o>) appears in multiple "columns". The only easy
way around it would be to pre-tokenise groups of letters (as Ben is already
doing, though only for the usual suspects at the moment), and compare those
implicit transcriptions' n-state PFSMs.
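The pre-tokenising step might be sketched like this: a greedy left-to-right scan that folds candidate verbose-cipher digraphs into single tokens before any state-machine fitting. The digraph list here is just an assumed stand-in for "the usual suspects"; a real experiment would vary it.

```python
# Hypothetical digraph list for illustration; not a claim about
# which EVA pairs are actually verbose-cipher elements.
DIGRAPHS = ["qo", "ol", "al"]

def tokenise(word):
    """Greedy left-to-right scan: prefer a known digraph, else one glyph."""
    tokens, i = [], 0
    while i < len(word):
        for d in DIGRAPHS:
            if word.startswith(d, i):
                tokens.append(d)
                i += len(d)
                break
        else:
            tokens.append(word[i])
            i += 1
    return tokens

print(tokenise("qokedy"))  # ['qo', 'k', 'e', 'd', 'y']
print(tokenise("qol"))     # ['qo', 'l'] -- the free-standing 'l' survives
```

Note how "qol" comes out as "qo" plus a free-standing "l", which is exactly the pairing behaviour a single-state <o> cannot express: in the tokenised stream, the <o> of "qo" and the <o> of "ol" are simply different symbols.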
What about defining a transcription alphabet that treats these
digraphs/verbose cipher elements as single glyphemes and then analyzing
the VMs in that transcription? That seems like a useful exercise.
That's what I'm suggesting. :-)
Cheers, .....Nick Pelling.....
______________________________________________________________________
To unsubscribe, send mail to majordomo@xxxxxxxxxxx with a body saying:
unsubscribe vms-list