Re: VMs: qo-words MORE
Dear Akinori-san,
Welcome to the list, and thank you very much for your two papers! :-)
More VMS-related links are here:-
http://www.geocities.co.jp/Technopolis/7220/voy/
My own central hypothesis is that, whereas almost all codes/ciphers of the
15th Century were designed by *cryptographers*, the VMS was designed by a
*cryptologist*, whose key design decision was to make direct statistical
analysis unrevealing.
I therefore think that blind (ie, non-theory-driven) statistical assault
will most likely not be helpful - and it would seem the last 90 years of
effort supports this idea. :-(
In fact, every statistical test we try seems to yield a new structural
artefact of some kind... which raises the question: how would you build a
code (or language) so that it displays a lot of structure at some levels
and almost no structure at others?
* * * * * *
Here are some ideas for theory-driven statistical assault [please feel free
to add to the list]:-
(1) Where are the numbers in the text? Has there been a proper statistical
assault looking for candidate numbering systems in the VMS?
The internal properties of the VMS' numbering systems might be
statistically distinctive:-
* It would probably have a "generative structure" - ie, its layout
would be rule-based, giving a large number of closely-related
[yet very slightly different] "words".
* It may follow Zipf's Law (in some contexts).
* Other text may exhibit different types of structure.
What would a generalised numbering-system filter look like?
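As a very rough starting point for such a filter, here is a minimal Python
sketch that groups vocabulary words differing in exactly one glyph position,
on the guess that a rule-based numbering system would show up as unusually
large families of near-identical "words". The function name, the "*" wildcard
(assumed not to be a glyph in your transcription) and the sample words are
all illustrative assumptions, not an established method.

```python
from collections import defaultdict

def one_substitution_families(vocabulary):
    """Group words that become identical when one position is wildcarded."""
    families = defaultdict(set)
    for word in set(vocabulary):
        for i in range(len(word)):
            # "*" marks the wildcarded position; assumed not to be a real glyph.
            pattern = word[:i] + "*" + word[i + 1:]
            families[pattern].add(word)
    # Keep only patterns shared by at least two distinct words.
    return {p: sorted(ws) for p, ws in families.items() if len(ws) > 1}

# Toy vocabulary in an EVA-like transcription; NOT real VMS data.
vocab = ["okedy", "otedy", "opedy", "daiin"]
for pattern, members in one_substitution_families(vocab).items():
    print(pattern, members)
```

Large families would only be candidates: ordinary inflected language would
presumably produce some families too, so the family-size distribution would
need comparing against a known plaintext.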
(2) One theory is that the VMS is an "embellished code" - ie, that many of
its "words" are actually number indices into a code-book (which is perhaps
hidden in the star paragraphs at the back, or simply lost), but written in
a way that makes them not obvious.
Here, most of the VMS would consist of indices: but it would require a
special "shift" code to indicate when the number following the code is
really a number (and not an index)... and I suspect that "q" is the shift-code.
One way of testing this might be to look for differences in (Zipf-style)
distribution between "q"- words and their related "non-q" words.
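A minimal sketch of how that comparison might be set up in Python, assuming a
transcription where words are whitespace-separated and the shift-code
candidate is a literal leading "q". The helper names and the sample text are
hypothetical; it only pairs up counts and does not carry out any significance
test.

```python
from collections import Counter

def split_q_pairs(words):
    """Map each stem to (count of 'q'+stem, count of stem without the q)."""
    counts = Counter(words)
    pairs = {}
    for word, n in counts.items():
        if word.startswith("q") and len(word) > 1:
            stem = word[1:]
            pairs[stem] = (n, counts.get(stem, 0))
    return pairs

def rank_frequency(frequencies):
    """Frequencies sorted in descending order (a Zipf-style rank list)."""
    return sorted(frequencies, reverse=True)

# Toy sample in an EVA-like transcription; NOT real VMS data.
sample = "qokedy okedy qokedy okeey qokeey okedy daiin qokedy".split()
pairs = split_q_pairs(sample)
q_ranks = rank_frequency(q for q, _ in pairs.values())
plain_ranks = rank_frequency(p for _, p in pairs.values())
print(pairs)
print(q_ranks, plain_ranks)
```

The two rank lists could then be plotted on log-log axes, or compared with
whatever goodness-of-fit test you prefer.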
(3) Another idea I've proposed is that the four gallows characters count in
tens, with sh (ie, ch with a plume) denoting fifty, etc. These would be
vaguely similar to the Cistercian numbering system:

    ch     v
    k      x
    t      xx
    p      xxx
    f      xl
    sh     l
    cKh    lx
    cTh    lxx
    cPh    lxxx
    cFh    xc
However, this may well simply have been the numbering system in the
shorthand alphabet the VMS was built upon... it's hard to tell. :-o
Are there any statistical properties this kind of system would exhibit in
the VMS which might indicate whether or not it's present?
(4) Another idea is that labels might have a different internal structure,
to prevent direct cryptological assaults on them - ie, they may be
anagrams etc. What are the statistical differences between labels and other
similar-length words in the corpus?
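One simple way to start on this, sketched in Python under some loud
assumptions: that you already have the labels and the running text as
separate word lists, that one transcription character equals one glyph, and
that total variation distance between glyph distributions is a reasonable
first measure (it is just my pick for illustration). All function names are
hypothetical.

```python
from collections import Counter

def glyph_distribution(words):
    """Relative frequency of each glyph across the given words."""
    counts = Counter("".join(words))
    total = sum(counts.values())
    return {g: n / total for g, n in counts.items()}

def total_variation(p, q):
    """Total variation distance between two discrete distributions (0 to 1)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def label_vs_corpus(labels, corpus):
    """Compare labels against corpus words of the same lengths only."""
    lengths = {len(w) for w in labels}
    matched = [w for w in corpus if len(w) in lengths]
    return total_variation(glyph_distribution(labels),
                           glyph_distribution(matched))

# Toy data; NOT real VMS labels or text.
print(label_vs_corpus(["ab"], ["ab", "cd", "xyz"]))
```

Anagramming would leave glyph frequencies untouched, so a test aimed at
anagrams specifically would need to look at glyph order (eg, bigram
distributions) rather than the unigram counts used here.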
(5) Yet another testable idea is that gallows characters (and/or
<c+gallows+h> glyphs, <ch>, <sh>, etc) somehow denote a change of
"code-page" - like triggers into different states within a Markov chain.
What are the statistical properties of pieces of text following each
gallows character (ie in each state)? When I looked at this before, there
seemed to be quite strong differences between them - but I didn't
investigate how local this effect was.
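To make the "how local" question concrete, here is a minimal Python sketch
that counts which glyphs appear in the few positions after each gallows
character. It assumes an EVA-like transcription in which k, t, p and f are
the gallows and each character is one glyph, and it ignores word spaces;
the `window` parameter and the names are my own illustrative choices.

```python
from collections import Counter, defaultdict

GALLOWS = ("k", "t", "p", "f")  # assumed EVA-style gallows letters

def following_counts(text, window=1):
    """For each gallows glyph, count the glyphs in the next `window` positions."""
    glyphs = text.replace(" ", "")  # word breaks ignored in this sketch
    result = defaultdict(Counter)
    for i, g in enumerate(glyphs):
        if g in GALLOWS:
            result[g].update(glyphs[i + 1:i + 1 + window])
    return result

# Toy sample; NOT real VMS data.
sample = "qokedy otedy qokeey okal"
after = following_counts(sample, window=1)
for gallows, counts in after.items():
    print(gallows, dict(counts))
```

Re-running with window=2, 3, 4... and watching where the per-gallows
distributions converge would give a crude estimate of how far any
"code-page" effect extends.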
(6) Grouping glyphs together into pairs (or triples) - like <oe>, <dy> etc
- to hide the real alphabet is a mechanism that was known at least by
1440... and there's evidence that suggests it might also be present in the VMS.
This would also have the effect of confusing many statistical tactics, as
you would (effectively) be measuring glyphs from different levels at the
same time, giving contradictory results.
Are there any clever statistical methods we might use to derive a
possible set of groupings? I'm still trying hand-designed groupings ATM,
but any ways of deriving these more definitively from the text itself would
perhaps be more convincing. :-o
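One candidate approach, sketched in Python: score each adjacent glyph pair by
pointwise mutual information (PMI), so that pairs occurring together far more
often than their individual frequencies predict float to the top as grouping
candidates. This is a standard collocation measure that I am swapping in for
illustration, not something already tested on the VMS; the `min_count`
threshold and the one-character-per-glyph assumption are both arbitrary.

```python
import math
from collections import Counter

def bigram_pmi(words, min_count=2):
    """Rank within-word adjacent glyph pairs by pointwise mutual information."""
    glyphs = Counter()
    bigrams = Counter()
    for w in words:
        glyphs.update(w)
        bigrams.update(zip(w, w[1:]))
    n_glyphs = sum(glyphs.values())
    n_bigrams = sum(bigrams.values())
    scores = {}
    for (a, b), n in bigrams.items():
        if n < min_count:
            continue  # rare pairs give unreliable PMI estimates
        p_ab = n / n_bigrams
        p_a = glyphs[a] / n_glyphs
        p_b = glyphs[b] / n_glyphs
        scores[a + b] = math.log2(p_ab / (p_a * p_b))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy sample in an EVA-like transcription; NOT real VMS data.
sample = "okedy otedy qokeey okal chedy".split()
ranked = bigram_pmi(sample)
for pair, score in ranked:
    print(pair, round(score, 2))
```

High-PMI pairs could then be merged into single tokens and the whole thing
re-run, to see whether triples emerge the same way.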
* * * * * *
Please feel free to run with any of these however you like! :-)
Cheers, .....Nick Pelling.....