
Re: VMs: Odd Thoughts



Quoting Dennis <tsalagi@xxxxxxxx>:

> ... 
> 	One hears that one never sees strings of the same word
> repeated in natural languages.  This is not so.  ...

I partly agree. In German, the definite articles and the relative pronouns 
share the same forms, which can lead to constructs like --

"Er dachte, dass das das Richtige war."

("He thought that that the right thing was" -- keeping the German "verb last" 
word order. Note that the first word, "dass", has a different spelling from 
the two that follow it.)

"Die Katze, die die Milch trank."

("The cat, which the milk drank.")

These are fairly straightforward constructions which you might readily 
encounter. But:

*) The VM often repeats comparatively long words, whereas the articles in 
question are naturally short.

*) The VM has sequences of (long) words which differ only by a single letter or 
so. This is very unusual, because "grammar proximity" is not usually reflected 
by "shape proximity" (i.e. words with related functions do not necessarily look 
alike).

So I'd still say the repetitions are a remarkable feature of the VM.
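
To make the two points above measurable, here is a rough Python sketch that 
counts adjacent exact repeats and adjacent words differing by one substituted 
letter. The function names, the min_len parameter and the second token list are 
my own inventions for illustration; the tokens are placeholders, not taken from 
any VM transcription. Insertions and deletions ("a single letter or so") are 
deliberately not counted here.

```python
def differs_by_one_letter(a, b):
    """True if a and b have equal length and differ in exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def adjacent_repeats(words, min_len=1):
    """Count (exact, near) repeats among neighbouring words.

    Pairs in which either word is shorter than min_len are skipped, so that
    short function words such as articles can be filtered out.
    """
    exact = near = 0
    for a, b in zip(words, words[1:]):
        if len(a) < min_len or len(b) < min_len:
            continue
        if a == b:
            exact += 1
        elif differs_by_one_letter(a, b):
            near += 1
    return exact, near


german = "er dachte dass das das richtige war".split()
print(adjacent_repeats(german))             # (1, 0): only "das das"
print(adjacent_repeats(german, min_len=5))  # (0, 0): the repeat is a short word

# Invented placeholder tokens, NOT real transcription data.
invented = ["abcde", "abcde", "abcdf"]
print(adjacent_repeats(invented, min_len=5))  # (1, 1)
```

Run over a full transcription and over a natural-language text of similar 
size, the two pairs of counts could then be compared directly.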

Besides, the numerical value for entropy which you get from text analysis 
simply _is_ lower than that of most natural languages (i.e. the text is more 
predictable), so it's not just the appearance of the text.
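
For anyone who wants to check the entropy figure themselves, a minimal Python 
sketch follows. It assumes the usual Shannon definitions: h1 is the entropy of 
single characters, and h2 is the conditional entropy of a character given the 
one before it, estimated as H(pair) - H(first character of pair). The function 
names are mine, and the plain-count estimate is biased for short texts, so 
treat small samples with caution.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of the distribution given by a Counter."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def char_entropies(text):
    """Return (h1, h2): single-character and conditional character entropy."""
    h1 = entropy(Counter(text))
    pairs = Counter(zip(text, text[1:]))
    # H(next | current) = H(current, next) - H(current),
    # where "current" ranges over every character except the last one.
    h2 = entropy(pairs) - entropy(Counter(text[:-1]))
    return h1, h2


h1, h2 = char_entropies("abab")
print(h1, h2)  # h1 is 1 bit; h2 is 0 because each letter fixes the next one
```

A text with many repeated or nearly repeated words will tend to push h2 down, 
which is why the repetitions and the entropy value are related observations.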

> 	I've never closely studied the Zipf's laws, but I've
> always been somewhat skeptical of them.  In days of
> old, engineers had a saying: "Anything makes a straight
> line if you plot it on log-log paper."  

Yeah, I remember that! ;-)

> In other words,
> everything, to some extent, follows a power law - which
> is what the Zipf's laws are.  

Uhm... I'd take it with a grain of salt. Log-log plots certainly aren't 
completely useless, but they have to be used with care. AFAIU, you retrieve 
parameters of the underlying language from the Zipf plots, and these at least 
seem to indicate that the structure of the language underlying the VM is 
non-trivial.
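
As an illustration of what "retrieving parameters" from a Zipf plot could look 
like, here is a small Python sketch that fits a straight line to log(frequency) 
against log(rank) by ordinary least squares and reports the slope and R^2. This 
is only one simple way to do it, chosen by me for brevity; the function name is 
hypothetical, and a high R^2 on log-log axes is exactly the thing the engineers' 
saying warns about, so it should not be read as proof of a power law.

```python
import math
from collections import Counter


def zipf_fit(words):
    """Least-squares fit of log(frequency) against log(rank).

    Returns (slope, r_squared). An ideal Zipf distribution gives slope -1.
    """
    freqs = sorted(Counter(words).values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct words")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # If all frequencies are equal the line is flat and fits exactly.
    r_squared = (sxy * sxy) / (sxx * syy) if syy else 1.0
    return slope, r_squared


# Synthetic word list with frequencies 12, 6, 4, 3 (that is, 12 / rank).
sample = ["a"] * 12 + ["b"] * 6 + ["c"] * 4 + ["d"] * 3
slope, r2 = zipf_fit(sample)
print(round(slope, 6), round(r2, 6))  # -1.0 1.0
```

The slope is the parameter people usually quote; comparing it, and the shape of 
the residuals, against texts in known languages is more informative than the 
straightness of the line alone.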

> 
> ...
> Dennis

Tallyho,

   Elmar, who currently has no idea how to tackle the VM next



