Re: VMs: Petr's repeated strings plots

--- Jorge Stolfi <stolfi@xxxxxxxxxxxxx> wrote:
> Petr asks
> > I have one strange effect that I don't understand. If I scan a
> > small part of the VMS (test04) I get a recognizable pattern. Then
> > if I remove all the spaces (test08) I find a whole lot less
> > matching strings. [...]
> If you are looking for strings of the same length (say 12 characters)
> then in the second case you probably get more varied strings (i.e.
> more bits of information), because spaces are fairly predictable from
> the other letters.

I'm not sure I agree, because all letters are predictable from
context to some degree. However, if you count the space as a letter,
it is clearly the most frequent one, and that is what you are
discarding. What remains is more varied, so fewer strings of a given
length repeat. Also, the same stretch of text is shorter without
spaces, so you should set the limit lower than 12 for the new text
to get an honest comparison.
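
To make the point about the limit concrete, here is a rough sketch in
Python. It is not the procedure behind the test04/test08 plots; the
function names, the scaling rule and the toy string are all made up
for illustration. It counts the distinct strings of a given length
that occur more than once, and scales the length by the fraction of
characters left after the spaces are removed.

```python
from collections import Counter


def repeated_ngrams(text, n):
    """Count the distinct length-n substrings that occur more than once."""
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    return sum(1 for c in counts.values() if c > 1)


def equivalent_length(text, n):
    """Scale n by the fraction of characters that survive space removal.

    Assumption: a simple proportional rule, chosen only for illustration.
    """
    kept = len(text.replace(" ", ""))
    return max(1, round(n * kept / len(text)))


# Toy string, NOT a VMS transcription.
sample = "ab cd ab cd ab cd"
stripped = sample.replace(" ", "")

with_spaces = repeated_ngrams(sample, 6)
same_limit = repeated_ngrams(stripped, 6)
scaled_limit = repeated_ngrams(stripped, equivalent_length(sample, 6))
print(with_spaces, same_limit, scaled_limit)
```

On real text the proportional rule is only a first guess; the right
limit depends on how long the words actually are.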

Cheers, Rene
