
RE: VMs: Do 2-state PFSMs distinguish vowels/consonants in VMs?



The information content of a symbol which appears in the wrong state is
essentially infinite, since the information content of a symbol is
log2(1/p), where p is the probability the model assigns to that symbol
in the current state, and an illegal symbol is given probability zero.
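
A minimal Python sketch of the log2(1/p) formula may make this concrete.
The function name information_content is only an illustrative choice;
the zero case is handled explicitly because dividing by zero would
otherwise raise an error rather than give infinity.

```python
import math

def information_content(p):
    """Return log2(1/p) in bits for a symbol of probability p."""
    if p == 0:
        # An illegal (zero-probability) symbol: infinite information.
        return math.inf
    return math.log2(1.0 / p)

print(information_content(0.5))   # 1.0
print(information_content(0.25))  # 2.0
print(information_content(0.0))   # inf
```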

In a PFSM, the probabilities are all associated with the state of the
FSM.  I haven't actually included the probabilities for any of the PFSMs
I've shown you yet, so I'll put an example together and send it out.

The basic idea is that a good model ought to be able to predict when a
symbol might appear.  The model assigns a high probability to likely
symbols and a low probability to unlikely ones.  Likely symbols carry a
low information content, and unlikely symbols carry a high information
content.  Impossible symbols carry an infinite information content.  If
a model matches the text well, then for the most part the symbols will
have low information content.
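
As a rough sketch of how a text could be scored this way, the Python
below walks a text through a toy 2-state PFSM and averages log2(1/p)
over the symbols.  Everything here is hypothetical: the three-letter
alphabet, the name TOY_PFSM, and the probabilities are invented for
illustration and are not taken from any of the PFSMs discussed in this
thread.

```python
import math

# Hypothetical toy 2-state PFSM over the alphabet {a, b, x}.
# TOY_PFSM[state][symbol] = (probability, next_state); numbers are invented.
TOY_PFSM = {
    0: {"a": (0.25, 0), "b": (0.5, 1), "x": (0.25, 1)},
    1: {"a": (0.5, 0), "b": (0.5, 1)},  # no 'x' transition: probability zero
}

def bits_per_symbol(text, pfsm, start_state=0):
    """Average information content (bits per symbol) of text under pfsm."""
    if not text:
        return 0.0
    state, total = start_state, 0.0
    for symbol in text:
        if symbol not in pfsm[state]:
            # An impossible symbol carries infinite information content.
            return math.inf
        prob, state = pfsm[state][symbol]
        total += math.log2(1.0 / prob)
    return total / len(text)

print(bits_per_symbol("babab", TOY_PFSM))  # 1.0
print(bits_per_symbol("bx", TOY_PFSM))     # inf
```

A text the toy model predicts well scores about one bit per symbol,
while a single symbol in the wrong state pushes the total to infinity.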

The PFSMs I showed were optimized for the texts I used.  The empty state
transitions represent symbols that never show up in those states in that
text, so a zero probability is optimum for those texts.  If a PFSM were
optimized for a different text, it might assign non-zero probabilities
to those transitions, but unless there were something unusual about the
text, those transitions would presumably still have a low probability.

The text I used for English was a portion of Anne of Green Gables.  If
you look at the 2-state PFSM for English that I sent out, you'll see
that the letter 'x' never appears in state 1.  This is reasonable,
because every instance of 'x' in the text comes after a vowel, and
vowels always put the PFSM into state 0.  Obviously this model would
fail (it would assign infinite information content) on any text which
has the letter 'x' after a letter which puts the PFSM into state 1, as
for example in the name Beckx.
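
The 'x' case can be reproduced with a small Python sketch.  It makes
two assumptions that are not from the PFSMs sent out earlier: the state
is chosen by a fixed rule (vowels lead to state 0, every other symbol
to state 1), and the probabilities are plain relative frequencies
counted from a made-up training string in which 'x' only ever follows a
vowel.  The real PFSMs were optimized differently; this only
illustrates why an unseen transition ends up with probability zero.

```python
import math
from collections import Counter, defaultdict

VOWELS = set("aeiou")

def next_state(symbol):
    # Assumed rule: vowels lead to state 0, everything else to state 1.
    return 0 if symbol in VOWELS else 1

def fit(text, start_state=1):
    """Relative-frequency symbol probabilities for each state."""
    counts = defaultdict(Counter)
    state = start_state
    for symbol in text:
        counts[state][symbol] += 1
        state = next_state(symbol)
    return {s: {sym: n / sum(c.values()) for sym, n in c.items()}
            for s, c in counts.items()}

def total_bits(text, probs, start_state=1):
    """Sum of log2(1/p) over the text; infinite if any symbol has p = 0."""
    state, bits = start_state, 0.0
    for symbol in text:
        p = probs.get(state, {}).get(symbol, 0.0)
        bits += math.inf if p == 0.0 else math.log2(1.0 / p)
        state = next_state(symbol)
    return bits

# Made-up training text: every 'x' comes directly after a vowel.
probs = fit("beck fixes six boxes")
print("x" in probs[0], "x" in probs[1])  # True False
print(total_bits("beck", probs))         # finite
print(total_bits("beckx", probs))        # inf
```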

-Ben

-----Original Message-----
From: owner-vms-list@xxxxxxxxxxx [mailto:owner-vms-list@xxxxxxxxxxx] On
Behalf Of Nick Pelling
Sent: Monday, March 21, 2005 9:08 AM
To: vms-list@xxxxxxxxxxx
Subject: RE: VMs: Do 2-state PFSMs distinguish vowels/consonants in VMs?

Hi Ben,

Can I ask you a quick question about your PFSMs? Your output state
machines seem to associate individual letters with *state transitions*
(rather than with states themselves, per se), which surely means that
the underlying model is only expecting to see a certain set of letters
in any given state? What then is the information content of letters
which appear in the "wrong" state?

Just trying to get a grip on what you're measuring. :-o

Cheers, .....Nick Pelling.....


______________________________________________________________________
To unsubscribe, send mail to majordomo@xxxxxxxxxxx with a body saying:
unsubscribe vms-list

