
Re: AW: VMs: Character repetition



Hi John,

On Friday 17 September 2004 21:54, Koontz John E wrote:
> So, whatever else they are, the tokens or words of the VMs behave
> statistically like other cases of higher-level coding units and presumably
> represent some kind of significant unit.

Yes, that is what I think because I can see the same effect in other 
languages.

> Or maybe what you're actually saying, since you say it rather carefully,
> is that word spacing produces units similar in length to the units
> implied by spectral analysis, and the simplest assumption is that the two sets
> of entities are one and the same?

Short: yes.
Long: There are several features in those correlation plots. I mostly 
concentrated on the short-length correlations. What I am saying :-) is that 
the analysis still picks up fluctuations in symbol occurrences that peak at 
the same length as the modal token length in various languages. Of 
course this could be a coincidence. The only way I can see to test this is to 
create surrogate data, and that is precisely what I did: character scrambling 
and token scrambling. The first destroys the effect, the second doesn't. I 
therefore suggested that this peak has to do with word construction plus the 
relative frequencies of words, and not with sentence construction (i.e. the 
position of the tokens does not seem to affect it).
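
To make the two surrogate types concrete, here is a minimal Python sketch. 
It is an illustration only, not the code used for the plots: the function 
names, the fixed seed and the choice to drop spaces before character 
scrambling are all assumptions.

```python
import random

def token_scramble(text, seed=0):
    """Shuffle whole tokens: every word stays intact, only its position changes."""
    rng = random.Random(seed)
    tokens = text.split()
    rng.shuffle(tokens)
    return " ".join(tokens)

def char_scramble(text, seed=0):
    """Shuffle single characters (spaces dropped first, an assumption),
    which destroys word structure but keeps the character frequencies."""
    rng = random.Random(seed)
    chars = [c for c in text if not c.isspace()]
    rng.shuffle(chars)
    return "".join(chars)

if __name__ == "__main__":
    sample = "the quick brown fox jumps over the lazy dog"
    print(token_scramble(sample))
    print(char_scramble(sample))
```

Both surrogates keep the character counts of the original. Only token 
scrambling keeps the words themselves and their relative frequencies, which 
is why it is the one that can separate word-level effects from 
sentence-level ones.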

Note that Chaucer's modal *verse* length can also be seen in the correlation 
plots (all this in space-less texts!). This 2nd peak and the long-range 
correlation slope disappear after both token and character scrambling. Of 
course, this makes sense too: moving tokens around in the text breaks the 
rhyme.

If one uses a polyalphabetic substitution with more than 2 alphabets, then 
these correlations quickly disappear as well -- as does Zipf's law. This 
could be seen as another piece of evidence against Strong's "solution", 
because those features exist in the VMs and become increasingly unlikely in 
polyalphabetic substitutions as the number of alphabets grows.
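
A small Python sketch of why this happens. It uses a Vigenere-style shift 
cipher as a stand-in for "polyalphabetic substitution"; that choice, the 
function names and the toy text are assumptions for illustration, and this is 
not necessarily the scheme Strong proposed. The point is only that the same 
plaintext word is enciphered differently depending on where it falls in the 
key cycle, so word frequencies get spread over several ciphertext forms.

```python
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def poly_encrypt(text, shifts):
    """Vigenere-style cipher: the i-th letter is shifted by shifts[i % len(shifts)].
    Characters outside ALPHABET (e.g. spaces) are copied and do not advance the key."""
    out = []
    i = 0
    for ch in text:
        if ch in ALPHABET:
            k = shifts[i % len(shifts)]
            out.append(ALPHABET[(ALPHABET.index(ch) + k) % len(ALPHABET)])
            i += 1
        else:
            out.append(ch)
    return "".join(out)

def token_counts(text):
    """Frequency of each space-delimited token."""
    return Counter(text.split())

if __name__ == "__main__":
    plain = "the cat sat on the mat and the dog"
    cipher = poly_encrypt(plain, [1, 2, 3])  # three alphabets
    print(token_counts(plain).most_common(3))
    print(token_counts(cipher).most_common(3))
```

In this toy text the three occurrences of "the" no longer share one 
ciphertext form, so the most frequent token loses part of its count and the 
number of distinct tokens goes up. With more alphabets and a longer text the 
rank-frequency curve flattens further.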

Regards,

Gabriel
