Re: VMs: Number crunching the Fincher window
On Mon, 13 Sep 2004, Elmar Vogt wrote:
> Here's what I got -- number of different sequences for different sequence
> lengths:
>
> Length     VM   German
>      4   4389     9435
>      5   8773    14949
>      6  14087    19623
>      7  19432    23443
>      8  23934    26264
>      9  27263    28237
>     10  29527    29609
>     11  30954    30543
>     12  31783    31187
>     13  32249    31651
>     14  32491    31964
>     15  32612    32190
>     16  32674    32346
>
> ...
> We seem to see that natural languages have a larger variety of short
> sequences. At the same time, for longer sequences, the VM gets more varied,
> until at a sequence length of 16, there were only 90 instances of phrases of
> 16 or more characters, which got repeated. (In German, we still had some 400
> duplicates.)
Pardon my denseness, but I don't see how we got from the preceding table
to the numbers in the text?
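
One possible reading, offered only as an assumption: if each table entry is the number of distinct character sequences of that length within a fixed-size sample, then the number of repeated sequences would be the total number of sequence positions minus the distinct count. The sample size is not stated in the quoted message, so the sketch below uses a toy string, and the function name is hypothetical.

```python
def count_distinct_sequences(text, length):
    """Return (distinct, repeats) over all substrings of the given length.

    'repeats' is the number of positions whose substring already occurred
    earlier in the text, i.e. total positions minus distinct substrings.
    """
    total = len(text) - length + 1
    if total <= 0:
        return 0, 0
    distinct = len({text[i:i + length] for i in range(total)})
    return distinct, total - distinct


# Toy input, not real VM or German text.
sample = "abcabcabd"
for n in (2, 3, 4):
    distinct, repeats = count_distinct_sequences(sample, n)
    print(n, distinct, repeats)
```

Run on the real transcription with the actual window size, this would show whether "total minus distinct" reproduces the figures of 90 and roughly 400 quoted above.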