
VMs: Re: Monkey authorship

Greetings all,

And welcome to another academic year :)

> From: Milo Velimirovic
> >"If you put a billion monkeys in front of a billion typewriters typing
> >at random, they would reproduce the entire collected works of Usenet in
> >about ... five minutes."

Just a thought before someone goes off to employ monkeys to
solve the Voynich enigma.....

Schroeder (1991) [Fractals, Chaos, Power Laws] argues that, although
Mandelbrot demonstrated that a Monkey hitting typewriter keys at random
produces a language that obeys Zipf's hyperbolic law, the resulting
plethora of word-forms far exceeds that found in a real natural
language.  Schroeder goes on to state that detailed analysis shows that
if a Monkey's typewriter has N equally probable letter keys and a space
bar (with probability P0), then the words generated (with N = 26 and
P0 = 1/5 [from the average word length]) form a lexicon unlike any
generated by an intelligent agent.
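
For anyone who wants to watch the Monkey at work, here is a rough Python
sketch of the typing model just described.  The function name, the seed
and the number of keystrokes are my own choices for illustration, not
Schroeder's; the only things taken from the text are the N equally
probable letter keys and the space bar with probability P0.

```python
import random
from collections import Counter

def monkey_words(n_keystrokes, n_letters=26, p_space=0.2, seed=0):
    """Simulate random typing and return the non-empty words produced.

    Each keystroke is a space with probability p_space; otherwise it is
    one of n_letters (at most 26) equally probable letters.
    """
    rng = random.Random(seed)
    alphabet = "abcdefghijklmnopqrstuvwxyz"[:n_letters]
    words, current = [], []
    for _ in range(n_keystrokes):
        if rng.random() < p_space:
            # A space ends the current word; consecutive spaces are ignored.
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(rng.choice(alphabet))
    return words

words = monkey_words(200_000)
counts = Counter(words)
print(len(words), "tokens,", len(counts), "distinct word-forms")
```

Even a short run like this should show a large share of the tokens
being word-forms that occur only once, which is the "plethora" referred
to above.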

To illustrate this, the median word ranks of the respective lexicons were
compared: that is, the number of top-ranked (most frequent) word-forms it
takes to reach a total probability of 0.5.  In the English language, the
median word rank ranges from 100 to 500, depending on the author's
lexicon and therefore their literacy.  In contrast, the Monkey attains a
median rank of 1,895,761 for only a nine-letter alphabet.  Thus the
Monkey, while strictly clinging to Zipf's law, generates a lexicon,
potentially forming a Cantor set, whose letter combinations resemble
those produced by a random letter generator.
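
The median word rank as defined above is easy to measure on any
tokenised text.  The following is only a minimal sketch: the function
name and the toy sentence are mine, and ties between equally frequent
word-forms are broken arbitrarily (by order of first appearance).

```python
from collections import Counter

def median_word_rank(counts):
    """Number of most frequent word-forms needed to reach probability 0.5.

    counts is a Counter mapping each word-form to its frequency.
    Returns 0 for an empty Counter.
    """
    total = sum(counts.values())
    cumulative = 0
    for rank, (_, freq) in enumerate(counts.most_common(), start=1):
        cumulative += freq
        # Compare 2 * cumulative with total to stay in exact integers.
        if 2 * cumulative >= total:
            return rank
    return 0

tokens = "the cat sat on the mat and the dog sat on the log".split()
print(median_word_rank(Counter(tokens)))
```

Run on an author's text and on Monkey output of comparable size, this
gives the kind of comparison described above, although a small sample
will not reach the figures quoted from Schroeder.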

Natural languages do not 'grow' all the possible branches of self-similar
combinations; rather, they use only a sub-set, owing to physiological
constraints on production, possible ambiguity in reception, efficiency in
communication and the requirement that the language be learnable in a
child's formative years during neoteny, with morphographemic recursion
employed whenever possible.  However, this is tempered at word level to
avoid irresolvable ambiguity due to homographs and incognates.  Most
branches of letter combinations would, in effect, be dead, as these would
'grow' non-words.

Of course, anything is possible (see, I'm an optimist at heart): for
instance, a plastic cup being blown down a road can sound like the
clip-clop of horses' hoofs.

regards to all,


Dr John Elliott
Centre for Computer Analysis of Language and Speech
University of Leeds.  http://www.comp.leeds.ac.uk/jre/
and Computational Intelligence Group, School of Computing
Leeds Metropolitan University
email:  jre@xxxxxxxxxxxxxxxx  or J.Elliott@xxxxxxxxxxxxxx
Phone: 0113 283 2600 ext 5157
