VMs: Long-range correlations

  > [Gabriel:] So my conclusion on this [from spectral analysis] was
  > that there is some structural part of the text which is destroyed
  > with the token scrambling. If the vms text was a random collection
  > of tokens, these long range correlations would not exist in the
  > first place.

The long-range correlations are probably due to changes in vocabulary
induced by changes in subject matter.

Consider again the word "brother" in "War of the Worlds": although it
occurs over a hundred times in the book, all its occurrences are
confined to the middle third (more precisely, between 38% and 62% of
the text). The word "soldier" occurs 22 times, all but two of them in
the first half. The 3-letter sequence "kit" occurs 23 times, all in
the second half; whereas "bha", "obh", "sey" and "lvy", which are just
about as common, occur only in the first half. And so forth.

On the other hand, many 3-letter patterns like "the", "and", "ere",
"ght", "und", "hat", "her", have almost exactly the same frequency in
both halves.
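
The kind of count behind these figures can be sketched in a few lines
of Python. This is only an illustration, not the script actually used
for the numbers above; the function names and the toy text are made up
for the example, and a real run would read the novel from a file.

```python
import re

def occurrence_positions(text, pattern, whole_word=False):
    """Relative position (0.0 = start, 1.0 = end) of every occurrence."""
    if not text or not pattern:
        return []
    lowered = text.lower()
    needle = re.escape(pattern.lower())
    if whole_word:
        needle = r"\b" + needle + r"\b"
    # The lookahead lets overlapping occurrences be counted too.
    matcher = re.compile("(?=" + needle + ")")
    return [m.start() / len(lowered) for m in matcher.finditer(lowered)]

def half_counts(text, pattern, whole_word=False):
    """Return (occurrences in first half, occurrences in second half)."""
    positions = occurrence_positions(text, pattern, whole_word)
    first = sum(1 for p in positions if p < 0.5)
    return first, len(positions) - first

# Toy stand-in for a book whose subject changes halfway through.
toy_text = "the soldier ran. " * 3 + "the kitten sat. " * 3
```

On the toy text, half_counts(toy_text, "soldier", whole_word=True)
gives (3, 0), half_counts(toy_text, "kit") gives (0, 3), and
half_counts(toy_text, "the") gives (3, 3): two subject-bound patterns
and one evenly spread one.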

Needless to say, the VMS has plenty of these patterns, too: some words
and letter tuples have very uneven distributions, correlated with
sections and pages; while others are very evenly distributed. The
former are expected to produce long-range correlations of the sort
seen by Gabriel, Mark Perakh, and others. In spectral analysis, in
particular, they enhance the low-frequency components (long waves), at
the expense of the high-frequency ones.
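
To make the spectral effect concrete, here is a toy sketch. It assumes
a pattern is encoded as a 0/1 indicator sequence over token positions,
which is not necessarily the encoding Gabriel or anyone else actually
used. The two sequences and the function names are invented for the
illustration, and the transform is a naive DFT, fine for short inputs
only.

```python
import cmath

def power_spectrum(seq):
    """Naive O(N^2) discrete Fourier transform; returns |X_k|^2, k = 0..N-1."""
    n = len(seq)
    spectrum = []
    for k in range(n):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(seq))
        spectrum.append(abs(coeff) ** 2)
    return spectrum

def low_band_power(seq, k_max=4):
    """Total power in the k_max lowest nonzero frequencies (longest waves)."""
    return sum(power_spectrum(seq)[1:k_max + 1])

# 64 token positions, 16 occurrences of the pattern in each case.
# "clustered": all occurrences confined to one section of the text.
clustered = [1 if 16 <= t < 32 else 0 for t in range(64)]
# "spread": the same number of occurrences at regular intervals.
spread = [1 if t % 4 == 0 else 0 for t in range(64)]
```

Here low_band_power(clustered) is large, while low_band_power(spread)
is zero up to rounding error. Both sequences have the same total
power, so whatever the clustered one gains in the long waves is
missing from its short ones.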

By the way, Gordon Rugg's mechanism could in principle produce text
with similar features, by suitably changing tables and/or grilles.
That is not saying much, however, since it can reproduce any text
whatsoever, word by word. The table format that Gordon used in his
experiments has room for 40×13 = 520 triplets. So, with a single Rugg
table, sequentially scanned just once with a trivial grille, one could
produce a verbatim copy of Hamlet's monologue (275 words) followed by
the lyrics of "I'm My Own Grandpa" (220 words), and still have just
enough space left in the table for a quick limerick, e.g.

  A burlesque dancer, a pip
  Named Virginia, could peel in a zip;
  But she read science fiction
  And died of constriction
  Attempting a Moebius strip.

(25 words). Or one could generate the entire WotW novel (78000 words),
with all the subtle asymmetries and long-range correlations noted
above, with fewer than 160 Rugg tables.
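
For anyone who wants to check the arithmetic, a short sketch follows.
It assumes one table cell (triplet) per word, as the counts above do;
the variable names are just labels for this example.

```python
import math

cells_per_table = 40 * 13              # table format quoted above

pieces = {
    "Hamlet's monologue": 275,
    "I'm My Own Grandpa": 220,
    "limerick": 25,
}
words_needed = sum(pieces.values())
spare_cells = cells_per_table - words_needed   # cells left in one table

novel_words = 78000                    # WotW word count quoted above
tables_for_novel = math.ceil(novel_words / cells_per_table)
```

The three pieces fill the 520 cells of a single table exactly, and
the novel needs 150 tables, comfortably under the 160 quoted.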

So, yes, a medieval cryptographer could easily have generated several
pages of very convincing Voynichese by Rugg's method: all he had to do
was to get hold of the VMS, copy several pages of it into a table,

All the best,

