VMs: Making a vms with meaning (long)
Hi,
I have been thinking about Gordon Rugg's "solution" for some time now. It is
certainly an interesting method for generating nonsense, but I am not convinced
that it logically follows that the vms is (or could be) a hoax of that type.
Although he has shown one method of generating something like the real thing,
it still has to be shown that:
1) the statistical properties of the output are in fact similar to those of
the vms, and
2) a hoaxer is at all likely to come up with grilles that produce so much text
with the degree of consistency seen in the vms.
Note that Gordon is aware that one needs several grilles to make something as
big as the vms (otherwise the output starts repeating). He is surely reverse-
engineering the grilles to fit the vms. Extrapolating to a possible hoaxer,
however, assumes that s/he intended to make something like the vms with a
large degree of consistency across the grilles. Ending up with a set of
grilles like that may, I think, be difficult to achieve without a clear idea
of where the grille cuts are to be made.
What I mean is that (I think) it would be quite unlikely to end up with many
grilles that all produce vms-like text, unless one already knew what one
intended to produce (a toy sketch of the mechanics follows).
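
To make this concrete, here is a toy sketch of a table-and-grille generator
in Python. The table contents and the offset-style grille are my own guesses
at the mechanics, not Gordon's actual materials:

import random

# Hypothetical table of word parts (prefix, midfix, suffix), loosely
# modelled on common Voynichese fragments; Rugg's real tables are larger.
TABLE = [
    ("qo", "ked", "y"),
    ("ok", "ai",  "in"),
    ("ch", "ed",  "y"),
    ("d",  "ai",  "in"),
    ("sh", "e",   "dy"),
    ("",   "ol",  ""),
]

def make_grille():
    # One grille = a fixed row offset per column.
    return tuple(random.randrange(len(TABLE)) for _ in range(3))

def generate(grille, n_words):
    # Slide the grille down the table, reading one part per column.
    return ["".join(TABLE[(step + off) % len(TABLE)][col]
                    for col, off in enumerate(grille))
            for step in range(n_words)]

print(" ".join(generate(make_grille(), 12)))

Note that a single grille here cycles after six words (the repetition
problem), and nothing guarantees that two independently chosen grilles
produce mutually consistent text.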
So I decided to give the 'hoax assumption' ("it has not been read because
there is nothing to be read") some hammering, by producing something that
looks like the vms and has similar properties, AND has meaning, AND is
difficult to crack.
The method of encoding I used is simple, straightforward and perhaps time-
consuming. Decoding by the author may be relatively difficult (it is
certainly possible to read back; I am not sure it could be read in real
time, perhaps not), but cracking it in its entirety as a third party is
(I think) *quite* difficult.
Here is a block of text cut at random from the encoded corpus:
otal oldar chor lkeedol eer ol dair chedy daiin ockhdar cpheol chedy
xar qokaiin y chedy kshdy ololdy aiin char y okeey oldar qokaiin lsho
daiin olsheam qoeey chedy dchos pshedaiin shedy d qol key sheol or
cpheeedol qokedy qokaiin daiin cthosy chedy ar aiir chedy teeol aiin
cheey y cheam oky qokaiin daldaiin loiii ar shtchy chedy aldaiin
ydchedy daiin shd okaiin qokain daiin qotcho chedy daiin lchy oloro
I produced a text of similar size to the vms; these are the entropy
statistics on the first 32000 characters, computed with Monkey:
     NewText    vms
h0   4.70044    4.64386
h1   3.85988    3.82127
h2   2.02343    2.14011
It also follows Zipf's law AND it has meaning.
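
For anyone who wants to reproduce figures like these, here is a minimal
sketch of the quantities as I understand Monkey to report them (this is not
Monkey's own code: h0 is log2 of the alphabet size, h1 the single-character
entropy, h2 the entropy of a character conditional on the previous one; the
file name is hypothetical):

import math
from collections import Counter

def entropies(text):
    singles = Counter(text)
    pairs = Counter(zip(text, text[1:]))
    n1 = sum(singles.values())
    n2 = sum(pairs.values())
    h0 = math.log2(len(singles))
    h1 = -sum(c / n1 * math.log2(c / n1) for c in singles.values())
    h12 = -sum(c / n2 * math.log2(c / n2) for c in pairs.values())
    return h0, h1, h12 - h1   # h2 = H(pair) - H(single)

def zipf_ranks(text, top=20):
    # Rank/frequency pairs; Zipf's law predicts frequency ~ 1/rank.
    return Counter(text.split()).most_common(top)

sample = open("newtext.txt").read()[:32000]   # hypothetical file name
print(entropies(sample))
print(zipf_ranks(sample))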
I can send the text to whoever wishes to have a look.
I thought of leaving this as a puzzle for the list to solve, but maybe it is
better to explain how it was made, so that even knowing the method one
realises that it may be quite difficult to crack.
I used a nomenclator to exchange the words for something artificial. I used
vms words, in their ranking order, to replace the plain text (PT) words: the
most common PT word becomes the most common vms word, and so on. Therefore
all character-derived statistics are due to the rules of construction of vms
words and have little to do with the underlying language (since most
languages follow Zipf's law anyway). Entropy is left almost unchanged, and
Zipf's law holds exactly as in the original PT.
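
A minimal sketch of that substitution (the file names and the whitespace
tokenisation are my assumptions):

from collections import Counter

def by_rank(words):
    # Distinct words sorted by descending frequency.
    return [w for w, _ in Counter(words).most_common()]

pt_words  = open("plaintext.txt").read().lower().split()   # hypothetical
vms_words = open("voynich_words.txt").read().split()       # hypothetical

# Rank-for-rank table: i-th most common PT word -> i-th most common vms
# word. Assumes the vms list has at least as many distinct words as the PT.
encode = dict(zip(by_rank(pt_words), by_rank(vms_words)))
decode = {v: k for k, v in encode.items()}

cipher = " ".join(encode[w] for w in pt_words)

The inverse table, decode, is exactly the two-way dictionary mentioned next.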
I imagined the author creating a two-way dictionary of imaginary words (each
corresponding unambiguously to a real word) as he writes the vms.
You know neither the PT source nor the language.
Any suggestions on how to crack this *in full*?
Cheers,
Gabriel