Some thoughts about process

These are a few thoughts, admittedly not completely thought through,
about the _process_ of trying to crack the VMS:

1.  Entropy is a popular measure to calculate and speculate with, but
    it depends sensitively on how the alphabet is defined. Would it be
    possible to develop some type of measure that is independent of
    the alphabet? I am thinking of something like the statistics used
    for data without numerical values (e.g. rankings rather than
    measurements). Even if such a measure would not allow you to say
    "this text has the same information content as Latin with every
    letter replaced by a pair" or something like that, it might allow
    you to say "the text becomes more repetitive in the middle than at
    the beginning" and so on.

2.  A lot of interesting ideas, such as the current discussion of
    gallows letters, are floated and calculations are produced, but
    then the ideas disappear into the archives. Is there some way to
    gather such quantitative questions or theories, and the resulting
    statistics about the VMS, into one place (say, a FAQ) where you
    could look at them all at once?

3.  In order to do item #2, would it be worthwhile to try to produce
    from the EVA transcription file a single machine-readable
    transcription in a convenient form for computerized processing,
    even if it were necessary to:

        make some kinds of assumptions
        omit parts that cannot be reconciled between the different versions
        convert to a/the standard alphabet
        standardize the line-numbering scheme

4.  If this were done, what would be a useful format? XML? Relational
    tables? Simple lists of lines or words? Something else?
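
As a rough illustration of the kind of measure asked about in item 1,
here is a minimal sketch in Python. It is only one possible approach,
not an established statistic: it compares whole words as opaque
strings, so it never needs to know how glyphs are grouped into letters,
and it reports how repetitive each stretch of text is. It assumes the
transcription is available as a list of whitespace-separated words and
that any recoding of the alphabet is applied consistently. The function
names and the window size are hypothetical choices made for the example.

```python
def repetition_rate(words):
    """Fraction of tokens that repeat an earlier token in the same list.

    Words are compared as whole strings, so the result does not depend
    on how the underlying alphabet is defined.
    """
    if not words:
        return 0.0
    return 1.0 - len(set(words)) / len(words)


def windowed_repetition(words, window=100):
    """Repetition rate for consecutive, non-overlapping windows.

    A trailing window shorter than `window` is dropped so that all
    reported rates are computed over the same number of words.
    """
    return [
        repetition_rate(words[i:i + window])
        for i in range(0, len(words) - window + 1, window)
    ]


if __name__ == "__main__":
    # Toy input, not real VMS data: four distinct words, then one
    # word repeated four times.
    toy = "w1 w2 w3 w4 w1 w1 w1 w1".split()
    print(windowed_repetition(toy, window=4))  # → [0.0, 0.75]
```

Comparing the values across windows would allow a statement such as
"the text becomes more repetitive in the middle than at the beginning",
although the result still depends on where the transcription places
word boundaries.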

Bruce Grant