
Re: VMs: Grove words



  > [John E Koontz:] Out of curiosity, have you tried adding 
  > Grove-prefixes to the grammar?
  
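Yes and no. My grammar actually generates all the VMS words, Grove
words included. However, the topmost rules are as follows (the columns
are token count, fraction of the parent category, and cumulative
fraction):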

  Word:
    33827 0.96296 0.96296 NormalWord
     1301 0.03704 1.00000 AbnormalWord

  AbnormalWord:
      716 0.55035 0.55035 Multiple
      213 0.16372 0.71407 GroveWord
      372 0.28593 1.00000 Weird   

  Multiple:
      208 0.29050 0.29050 MultiCore
      278 0.38827 0.67877 MultiCoreMantle
      206 0.28771 0.96648 EmbeddedAIN
       24 0.03352 1.00000 EmbeddedYQ

The grammar proper (with the crust-mantle-core division and all that)
hangs from category NormalWord, which accounts for 33827 tokens (96.3%
of the text). The categories GroveWord, Weird, MultiCore,
MultiCoreMantle, EmbeddedAIN, and EmbeddedYQ are merely long lists of
words, most of them with occurrence count 1. Together, these account
for 1301 tokens, or 3.7% of the total.
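
For anyone who wants to check the arithmetic, here is a small Python
sketch that recomputes the fraction and cumulative-fraction columns
from the token counts quoted above. The helper name rule_table is made
up for this illustration and is not part of the grammar tools. The
counts are copied from the tables.

```python
from itertools import accumulate

def rule_table(counts):
    """Return (name, count, fraction, cumulative fraction) per alternative."""
    total = sum(counts.values())
    running = accumulate(counts.values())
    return [(name, n, n / total, c / total)
            for (name, n), c in zip(counts.items(), running)]

# Token counts copied from the rules quoted in this message.
word = {"NormalWord": 33827, "AbnormalWord": 1301}
abnormal = {"Multiple": 716, "GroveWord": 213, "Weird": 372}
multiple = {"MultiCore": 208, "MultiCoreMantle": 278,
            "EmbeddedAIN": 206, "EmbeddedYQ": 24}

for title, counts in [("Word", word), ("AbnormalWord", abnormal),
                      ("Multiple", multiple)]:
    print(f"{title}:")
    for name, n, frac, cum in rule_table(counts):
        print(f"  {n:9d} {frac:.5f} {cum:.5f} {name}")
```

The counts of each sub-rule add up to the count of its parent (716 for
Multiple, 1301 for AbnormalWord), so the tables are consistent with
one another.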

The category GroveWord could be defined as G.NormalWord, where
category G yields the four gallows.  However, that would introduce
ambiguous parsings for words like EVA "kody": is that a GroveWord
"k"+"ody", or a NormalWord "kody"?
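
To make the ambiguity concrete, here is a toy Python sketch. It assumes
that the four gallows are written k, t, p, f in EVA, and it replaces
the real NormalWord grammar with a tiny hypothetical word set; the
function parses and the set toy_normal are invented for this
illustration only.

```python
# Assumed EVA letters for the four gallows (category G).
GALLOWS = ("k", "t", "p", "f")

def parses(word, normal_words):
    """List every derivation of `word` under the rules
         Word      -> NormalWord | GroveWord
         GroveWord -> G NormalWord
    with NormalWord approximated by membership in `normal_words`."""
    found = []
    if word in normal_words:
        found.append(("NormalWord", word))
    if word[:1] in GALLOWS and word[1:] in normal_words:
        found.append(("GroveWord", word[0], word[1:]))
    return found

# Hypothetical stand-in for the NormalWord language.
toy_normal = {"kody", "ody"}

for w in ("kody", "ody"):
    print(w, parses(w, toy_normal))
```

Whenever both a word and its gallows-stripped remainder are
NormalWords, the word gets two derivations, which is exactly the case
that listing the GroveWords explicitly avoids.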

As a computer scientist, I have been pavlovized to avoid ambiguous
grammars, and that prejudice explains several questionable choices
in my paradigm. A linguist would probably have
cared more about simplicity and less (or not at all) about unique
parsings, and produced a shorter grammar.

All the best,

--stolfi