RE: VMs: Overfitting the Data
On Mon, 31 Jan 2005, Marke Fincher wrote:
> Most of the time I don't think it is necessary to use sophisticated
> mathematics to gauge overfitting. Given a generative system, just
> examine what percentage of the words it generates are found in the
> VMs. Some of the hypothetical wheel systems and their functional
> equivalents will unavoidably generate 60,000 or more words!
>
> Of course, if the VMs was generated by a process that involved a
> random element, and even if that EXACT process was repeated, you
> would not get exactly the same set of words again. But given the
> frequency distribution of VMs words that we see (which is far from
> flat) you would expect the hard-core of frequent words to reappear
> in any subsequent rerun.
In Rugg's experiments, Voynich (EVA character) words are treated as
consisting of three parts. Jorge Stolfi has used this approach (or one
similar to it), too, cf.
http://www.dcc.unicamp.br/~stolfi/voynich/97-11-12-pms/
Stolfi currently has a much more sophisticated analysis, which is at
http://www.dcc.unicamp.br/~stolfi/voynich/00-06-07-word-grammar/. (For a
summary of approaches to the grammar of Voynich words see Rene Zandbergen's
site, page http://www.voynich.nu/a_para.html.)
There are various ways of expressing this approach. If you provide three
lists, of prefixes, midfixes, and suffixes, or some generative grammar
notation precisely equivalent to this, you seem to imply that every
combination of one element from each list occurs. In addition, you may
seem to imply that all possible combinations of prefix, midfix, and suffix
are equally likely. Really all you intend is an underfitted statement
that this set of lists, or, equivalently, this generative grammar, will
include all of the Voynich words. Which words actually occur, and with
what frequencies, is not specified. These, at least, are the standard
assumptions of generative grammarians.
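To make the "any combination" reading concrete, here is a minimal Python
sketch. The three lists are made-up illustrations, not taken from Rugg's or
Stolfi's tables, and the function name is my own.

```python
from itertools import product

# Hypothetical example lists; NOT anyone's published prefix/midfix/suffix tables.
prefixes = ["qo", "sh", ""]
midfixes = ["k", "t", "ol"]
suffixes = ["edy", "aiin", "y"]

def all_combinations(prefixes, midfixes, suffixes):
    """Every prefix + midfix + suffix string the three lists permit."""
    return {p + m + s for p, m, s in product(prefixes, midfixes, suffixes)}

candidates = all_combinations(prefixes, midfixes, suffixes)
print(len(candidates))         # 27 strings from 3 x 3 x 3 elements
print("qokedy" in candidates)  # True
```

The lists say only that an attested word should be found in `candidates`;
they say nothing about which of the 27 strings actually occur or how often.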
And, of course, whatever the set of words these mechanisms produce, it
will cover only a subset of the Voynich words, because I think everyone
using this approach explicitly admits that it is a first approximation
that doesn't handle everything.
Rugg's approach is a little different in implication because it is
explicitly operational, not simply descriptive. I've indicated that Rugg
matrices really amount to lists of prefixes, midfixes, and suffixes. But
whether you slide these up and down next to each other and read across or,
equivalently, superimpose a grill with offset windows over the lists and
read what you see in the windows, this secondary mechanism explicitly
defines a subset of the possible combinations. Even if you combine
several lists with several grills or slidings, you end up with several
subsets, but you aren't likely to end up with the full set of possibilities
permitted by combining any prefix with any midfix and any suffix. If you
don't make it possible to select a given combination through your
arrangement of lists and windows, it won't occur.
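A rough Python sketch of how one grill selects a subset. The columns and
the offsets are invented for illustration, and I let rows wrap around (as
on a ring), which is an assumption of the sketch, not part of Rugg's
procedure.

```python
from itertools import product

# Hypothetical three-column matrix (prefix, midfix, suffix), four rows.
columns = [
    ["qo", "sh", "ch", "d"],
    ["k", "t", "ol", "a"],
    ["edy", "aiin", "y", "r"],
]

def grille_readings(columns, offsets):
    """Words read by placing the grill at every row of the matrix.

    offsets[k] is the row offset of window k relative to the grill position.
    Rows wrap around, an assumption made to keep the sketch short.
    """
    n = len(columns[0])
    return {
        "".join(col[(row + off) % n] for col, off in zip(columns, offsets))
        for row in range(n)
    }

full = {p + m + s for p, m, s in product(*columns)}
readings = grille_readings(columns, (0, 1, 2))

print(sorted(readings))          # ['chaedy', 'dkaiin', 'qoty', 'sholr']
print(len(readings), len(full))  # 4 64
```

One grill on this matrix reaches 4 of the 64 possible combinations. Adding
more grills takes the union of several such subsets, which still need not
cover all 64.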
In addition, a Rugg grill arrangement can repeat forms (prefixes,
midfixes, or suffixes) in any column and this makes a statement about
probability, or imposes it during random generation. If your set of
prefixes is 4 qo's and one shol, you have a matrix that should generate qo
4 times as frequently as shol. (I'm neglecting empties produced by a
window that reads off the edge of a column, if this is allowed.)
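The arithmetic, as a small Python sketch. It assumes the window lands on
each row of the column with equal probability and, as above, ignores
off-edge empties; the function name is mine.

```python
from collections import Counter
from fractions import Fraction

# The prefix column from the example: four qo's and one shol.
prefix_column = ["qo", "qo", "qo", "qo", "shol"]

def window_distribution(column):
    """Probability of each form if the window lands on a uniformly random row."""
    counts = Counter(column)
    total = len(column)
    return {form: Fraction(count, total) for form, count in counts.items()}

dist = window_distribution(prefix_column)
print(dist["qo"], dist["shol"])    # 4/5 1/5
print(dist["qo"] / dist["shol"])   # 4
```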
> ...and similarly for any modern proposed generating system to be
> considered successful, it should generate nearly all of the
> frequent VMs words, _but in similar proportions_
If you can deduce the structure of the words - perhaps the constituent
prefixes, midfixes, and suffixes, though this seems too simple to work -
then you can generate the lists for a Rugg grill. The set of observed
words constrains the set of alignments of these lists, while frequencies
suggest the number of repetitions of elements in each list. You could
then perhaps come up with a set of matrices and grills that would generate
the proper set of words in the proper proportions. Such a set, even if it
weren't the actual generating device, would be functionally equivalent to
it. If you have the proper distribution of the right things it doesn't
matter how you got it. Numerous equivalent mechanisms might exist. You
need only demonstrate a feasible mechanism with the right properties.
(That's a real big "only," of course.)
> At the moment wheels just don't do it for me, unless they are
> highly controlled via some other system from where the frequency,
> word order and phrase patterns originate.
It's easy to show that a three-slider wheel is equivalent to a Rugg
matrix. Cut the matrix apart into its constituent groups of three columns
(triplets of prefix, midfix, and suffix columns). Tape the bottom of the
first three to the top of the second three and so on until you have one
set of three columns. Now tape the top of your three columns to the
bottom to make a ring (in three dimensions with the rows parallel to the
axis). I hope you used topologist's infinitely stretchy paper and tape,
available at many fifth-dimension math supply stores, because next you
have to flatten the three parallel rings out into three concentric rings
by stretching one side or the other until the rows are along radii from
the axis. Stop when you have a perfect two dimensional ring.
Now cut along the circular lines separating the three rings so you can
rotate them relative to each other. If you prefer you can skip the last
ouroboros-style taping and the awkward stretching exercise and just cut
the three columns apart so that you have a prefix column, a midfix column,
and a suffix column. Instead of rotating the rings past each other, you
just slide your strips up and down next to each other.
Either way, any particular alignment of the three sliders is functionally
equivalent to making a grill with three windows in it. Sliding your next
slider up or down n rows is like offsetting your next window down or up n
rows. Reading across the three sliders on some random radius (or row) in
the proper order is functionally equivalent to plopping down the grill on
the matrix at some random location and reading the windows in the proper
order.
QED
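The same argument as a Python sketch, for anyone who would rather check it
than tape it. The columns are invented, the function names are mine, and
rows wrap around as on the ring.

```python
# Hypothetical three-column matrix (prefix, midfix, suffix), four rows.
columns = [
    ["qo", "sh", "ch", "d"],
    ["k", "t", "ol", "a"],
    ["edy", "aiin", "y", "r"],
]

def read_grille(columns, offsets, row):
    """Place the grill at `row`; window k sits offsets[k] rows further down."""
    n = len(columns[0])
    return "".join(col[(row + off) % n] for col, off in zip(columns, offsets))

def slide_up(column, n):
    """Slide a strip (or rotate a ring) up by n rows, wrapping around."""
    n %= len(column)
    return column[n:] + column[:n]

def read_sliders(columns, shifts, row):
    """Slide strip k up by shifts[k] rows, then read straight across `row`."""
    shifted = [slide_up(col, s) for col, s in zip(columns, shifts)]
    return "".join(col[row] for col in shifted)

offsets = (0, 1, 2)
for row in range(4):
    # Sliding a strip up n rows matches offsetting its window down n rows.
    print(read_grille(columns, offsets, row), read_sliders(columns, offsets, row))
```

Each printed pair is the same word twice, for every row and for any choice
of offsets.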
Well, QED modulo edge effects. If you consider it legitimate to place a
grill on a matrix so that one or more windows is off the matrix, counting
that window as blank, then you'll find you can't do that when you tape
the triplets of matrix columns together. Now a window off the bottom of
the matrix will read into the corresponding column in the next triplet. I
believe you could handle this by inserting as many blank rows between
concatenated triplets as the largest total offset you are prepared to
allow in your grills. I'll leave this as an exercise for the
reader, because it looks complicated.
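A partial Python sketch of the blank-row idea, under simplifying
assumptions that are mine: window offsets are non-negative, the first
window always lands on a real (non-blank) row, and the two triplets below
are invented. It does not settle the general case.

```python
# Two hypothetical triplets of (prefix, midfix, suffix) columns, three rows each.
blocks = [
    [["qo", "sh", "ch"], ["k", "t", "ol"], ["edy", "aiin", "y"]],
    [["d", "ot", "ok"], ["a", "e", "ch"], ["r", "l", "dy"]],
]
offsets = (0, 1, 2)

def read_flat(block, offsets, row):
    """Read one triplet; a window past the bottom edge counts as blank."""
    return "".join(
        col[row + off] if row + off < len(col) else ""
        for col, off in zip(block, offsets)
    )

def build_ring(blocks, pad):
    """Tape triplets end to end, with `pad` blank rows after each one."""
    ring = [[], [], []]
    for block in blocks:
        for k in range(3):
            ring[k].extend(block[k] + [""] * pad)
    return ring

def read_ring(ring, offsets, pos):
    """Read across the taped-up columns at position `pos`, wrapping around."""
    n = len(ring[0])
    return "".join(col[(pos + off) % n] for col, off in zip(ring, offsets))

# Last row of the first triplet: two windows hang off the bottom edge.
print(read_flat(blocks[0], offsets, 2))               # ch
print(read_ring(build_ring(blocks, 0), offsets, 2))   # chal (leaks into next triplet)
print(read_ring(build_ring(blocks, 2), offsets, 2))   # ch   (padding restores blanks)
```

Here the padding is `max(offsets)` blank rows. Placements that start on a
blank row, or grills with negative offsets, are the part that still looks
complicated.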