Some thoughts about process
These are a few thoughts, admittedly not completely thought
through, about the _process_ of trying to crack the VMS:
1. Entropy is a popular measure to calculate and speculate with, but
it depends sensitively on the definition
of what the alphabet is. Would it be possible to develop some
type of measure which would be
independent of the alphabet? I am thinking of something like the
statistics used for data without
numerical values (e.g. rankings rather than measurements). Even
if such a measure would not allow
you to say "this text has the same information content as Latin
with every letter replaced by a pair"
or something like that, it might allow you to say "the text
becomes more repetitive in the middle than
at the beginning" and so on.
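
As one illustration of what such a measure might look like, here is a
minimal sketch in Python. It is only an assumption about what would
be useful, not an established statistic: it counts, in each sliding
window of words, the fraction of words that have already occurred
earlier in that window. Because it only tests whether two words are
identical, any one-to-one relabeling of the glyphs leaves the result
unchanged. It still depends on trusting the word divisions and on the
transcription distinguishing the same glyph shapes. The function
names and the window sizes are hypothetical.

```python
def repeat_rate(tokens):
    """Fraction of tokens that already occurred earlier in the same list."""
    if not tokens:
        return 0.0
    seen = set()
    repeats = 0
    for tok in tokens:
        if tok in seen:
            repeats += 1
        else:
            seen.add(tok)
    return repeats / len(tokens)


def windowed_repeat_rate(tokens, window=200, step=100):
    """Repeat rate of each sliding window, in text order.

    A text shorter than one window yields a single value for the
    whole text.
    """
    last_start = max(len(tokens) - window, 0)
    return [
        repeat_rate(tokens[start:start + window])
        for start in range(0, last_start + 1, step)
    ]
```

Comparing the values for windows near the beginning with those near
the middle would give the kind of "more repetitive here than there"
statement described above, without any claim about absolute
information content.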
2. A lot of interesting ideas, such as the current discussion of
gallows letters, are floated and calculations
are produced, but then the ideas disappear into the archives.
Is there some way to gather
such quantitative questions or theories and the resulting
statistics about the VMS into one place
(say, a FAQ) where you could look at it all at once?
3. In order to do item #2, would it be worthwhile to try to produce
from the EVA transcription file a
single machine-readable transcription in a convenient form for
computerized processing, even if
it were necessary to:
make some kinds of assumptions
omit parts that cannot be reconciled between the
different versions
convert to a/the standard alphabet
standardize the line-numbering scheme
etc.?
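
A rough sketch of the "omit parts that cannot be reconciled" step, in
Python, might look like the following. The input format here is an
assumption made for illustration only: each line is taken to look
like "<folio.line;X> word.word.word", where X is a one-letter
transcriber code and words are separated by periods. The real
transcription file would need its own parser. The names parse and
consensus are hypothetical.

```python
import re
from collections import defaultdict

# Assumed, simplified line format: <folio.line;transcriber> word.word.word
LINE_RE = re.compile(
    r"^<(?P<folio>[^.>]+)\.(?P<line>\d+);(?P<who>\w)>\s+(?P<text>\S+)\s*$"
)


def parse(lines):
    """Group readings by (folio, line number), then by transcriber."""
    readings = defaultdict(dict)
    for raw in lines:
        m = LINE_RE.match(raw)
        if m is None:
            continue  # skip comments and anything not in the assumed format
        key = (m["folio"], int(m["line"]))
        readings[key][m["who"]] = m["text"].split(".")
    return readings


def consensus(readings):
    """Keep one version per line; mark disputed words as None.

    Lines where the transcribers disagree on the number of words
    are omitted entirely.
    """
    out = {}
    for key, by_who in sorted(readings.items()):
        versions = list(by_who.values())
        if len({len(v) for v in versions}) != 1:
            continue
        out[key] = [
            words[0] if len(set(words)) == 1 else None
            for words in zip(*versions)
        ]
    return out
```

Marking a disputed word as None rather than dropping it keeps word
positions within a line stable, which matters for any statistic that
looks at position in the line.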
4. If this were done, what would be a useful format? XML?
Relational tables? Simple lists of lines or words? Something else?
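
To make the "relational tables" option concrete, here is a small
Python sketch using the standard sqlite3 module. The table layout
(one row per word, with folio, line number, and position in line) is
just one possible assumption, and the function names are
hypothetical. The input is taken to be a simple mapping from
(folio, line number) to a list of words.

```python
import sqlite3


def build_db(lines):
    """Load {(folio, line_no): [word, ...]} into an in-memory table."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE word ("
        "folio TEXT, line_no INTEGER, pos INTEGER, word TEXT)"
    )
    rows = [
        (folio, line_no, pos, word)
        for (folio, line_no), words in lines.items()
        for pos, word in enumerate(words, start=1)
    ]
    con.executemany("INSERT INTO word VALUES (?, ?, ?, ?)", rows)
    con.commit()
    return con


def word_counts(con):
    """Return (word, count) pairs, most frequent first."""
    cur = con.execute(
        "SELECT word, COUNT(*) FROM word "
        "GROUP BY word ORDER BY COUNT(*) DESC, word"
    )
    return cur.fetchall()
```

With one row per word, questions such as "which words occur first in
a line" or "how do counts differ between folios" become single
queries, whereas a simple list of lines would need a small program
for each question.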
Bruce Grant