
RE: VMs: Do 2-state PFSMs distinguish vowels/consonants in VMs?



The texts were of various sizes.  The English PFSMs were generated from
a text of 21,260 words and 108,357 characters.  The Turkish PFSMs were
generated from a text of 4,387 words and 33,478 characters.  The VMs
PFSMs were generated from the complete set of Language B pages.
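For a rough sense of how the two known-language corpora compare, the figures above can be reduced to a couple of ratios. This is only a sketch: it assumes the two character counts were made the same way (it is not stated whether spaces or punctuation are included), and the names used are made up for illustration.

```python
# Corpus sizes as quoted in this message: (words, characters).
# Assumption: both character counts follow the same counting convention.
CORPORA = {
    "English": (21260, 108357),
    "Turkish": (4387, 33478),
}

def chars_per_word(words, chars):
    """Average number of counted characters per word."""
    return chars / words

for name, (words, chars) in CORPORA.items():
    print(f"{name}: {chars_per_word(words, chars):.2f} characters per word")

# How many times larger the English sample is, measured in words.
size_ratio = CORPORA["English"][0] / CORPORA["Turkish"][0]
print(f"English/Turkish word-count ratio: {size_ratio:.2f}")
```

The English sample is several times larger than the Turkish one, which is worth keeping in mind when comparing machines trained on each.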

I should probably also have acknowledged that most of the explanations I
suggested were not new, and have already been discussed on this list.

-Ben

-----Original Message-----
From: owner-vms-list@xxxxxxxxxxx [mailto:owner-vms-list@xxxxxxxxxxx] On
Behalf Of Dennis
Sent: Saturday, March 19, 2005 12:15 AM
To: vms-list@xxxxxxxxxxx
Subject: Re: VMs: Do 2-state PFSMs distinguish vowels/consonants in VMs?

Ben Preece wrote:

> What would explain this? 

	How big are your text corpora in the known languages?

> 6) None of the transcription alphabets yet accurately reflect the 
> actual letters in the text (Is "daiin" supposed to be d-a-i-i-n, or 
> d-a-ii-n, or d-aii-n, or d-a-iin, or d-ai-in, or what?)

	Yes.

	It does seem clear to me that in most cases, -iin is a single
glypheme.  We have similar clarity on some of the others, but not many.
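To make the ambiguity concrete, here is a minimal sketch of how the same transcribed word splits differently depending on which multi-character glyphs the alphabet is assumed to contain. The glyph inventories and the greedy longest-match rule are illustrative assumptions, not any established transcription scheme.

```python
def tokenize(word, alphabet):
    """Split `word` into glyphs from `alphabet` by greedy longest match."""
    glyphs = sorted(alphabet, key=len, reverse=True)  # try longest glyphs first
    out = []
    i = 0
    while i < len(word):
        for g in glyphs:
            if word.startswith(g, i):
                out.append(g)
                i += len(g)
                break
        else:
            # Character not in the alphabet: keep it as a single symbol.
            out.append(word[i])
            i += 1
    return out

# Hypothetical inventories: single strokes plus one candidate compound glyph.
SINGLE = {"d", "a", "i", "n"}
CANDIDATES = {
    "d-a-i-i-n": SINGLE,
    "d-a-ii-n": SINGLE | {"ii"},
    "d-a-iin": SINGLE | {"iin"},
    "d-aii-n": SINGLE | {"aii"},
    "d-ai-in": SINGLE | {"ai", "in"},
}

for label, alphabet in CANDIDATES.items():
    print(label, "->", tokenize("daiin", alphabet))
```

Each inventory yields a different symbol sequence for the same word, so any state machine trained on the text sees a different alphabet and different transition statistics.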

> If this were the problem, then we might be able to identify vowels
> once the correct transcription alphabet is identified.
> And in fact, this might be one test for what would qualify as a good
> transcription alphabet.

	I agree.  This could help us to select the "right" 
transcription alphabet.

Dennis
______________________________________________________________________
To unsubscribe, send mail to majordomo@xxxxxxxxxxx with a body saying:
unsubscribe vms-list

