
Re: AW: VMs: Character repetition



On Thursday 16 September 2004 04:50, Koontz John E wrote:
> On Wed, 15 Sep 2004, Gabriel Landini wrote:
> > That is the reason why I believe that the spaces really separate the
> > tokens.
>
> Where the tokens are encoded characters?  In such proposals one wonders
> about labels, though obviously labels like "a", "b", etc., are not
> improbable, and I suspect have a fairly ancient pedigree.

Hi John,
I do not understand the comment above. By "token" I meant any string
delimited by spaces or by the start or end of a line. By "word" I mean a
type of token.

My comment was: I find it remarkable that, after removing the spaces and
doing a spectral analysis of the space-less stream, one can still see the
modal token length and the verse length in (for instance) Chaucer. The same
token-length-related peak (this time at 5.9) appears in the space-less VMS,
and it corresponds to the modal token length when spaces are taken as the
delimiters (5 or 6, depending on how one measures it). If one scrambles the
characters (keeping the same character distribution), this peak disappears,
so it evidently has to do with the word construction rules (the peak does
not disappear under word-scrambling).
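
For anyone who wants to try the comparison, here is a minimal sketch of one
way it could be set up. It is not necessarily the analysis described above:
it assumes the "spectrum" is the summed power of the per-character indicator
sequences, evaluated at a chosen period (which may be non-integer, e.g. 5.9).
The function names and the toy vocabulary are made up for illustration and
are not VMS or Chaucer data.

```python
import cmath
import random
from collections import defaultdict


def power_at_period(stream, period):
    """Summed power of the mean-removed character indicator sequences
    at the given period (in characters). Assumed definition, see above."""
    n = len(stream)
    counts = defaultdict(int)      # occurrences of each character
    sums = defaultdict(complex)    # phase sum over each character's positions
    total = 0j                     # phase sum over all positions
    for t, ch in enumerate(stream):
        w = cmath.exp(-2j * cmath.pi * t / period)
        sums[ch] += w
        counts[ch] += 1
        total += w
    # Subtracting counts/n * total removes each indicator's mean level.
    return sum(abs(sums[ch] - counts[ch] / n * total) ** 2 for ch in sums) / n


def scramble_characters(tokens, rng):
    """Shuffle every character; the character distribution is unchanged."""
    chars = list("".join(tokens))
    rng.shuffle(chars)
    return "".join(chars)


def scramble_words(tokens, rng):
    """Shuffle whole tokens; each token's internal structure is unchanged."""
    shuffled = list(tokens)
    rng.shuffle(shuffled)
    return "".join(shuffled)


# Toy stream: 200 five-character tokens with a fixed positional structure.
rng = random.Random(0)
vocabulary = ["abcde", "abxde", "fbcdy", "fbxdy"]  # hypothetical tokens
tokens = [rng.choice(vocabulary) for _ in range(200)]

original = power_at_period("".join(tokens), 5)
by_word = power_at_period(scramble_words(tokens, rng), 5)
by_char = power_at_period(scramble_characters(tokens, rng), 5)
```

In this toy case the power at period 5 survives word-scrambling but
collapses under character-scrambling, which is the qualitative pattern
described above. With real text one would scan a range of periods rather
than a single value.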

Cheers,

G.
