Re: VMs: Crypto question
Jeff wrote:
How would you go about cracking this cipher? What would be your first step?
BTW this is a famous quote. It works on a very simple principle. I can write
a program to produce more of this if of interest.
okdidka kadidukfod medidmo madadekgid kacodakbid
encodmocud mocuduk imdodikgadukfudombid
At the risk of moving a little further off-topic... my first step is frequency
counts: there are 13 letters used, and 'd' is way high. The pronounceable
output, low variability, and long apparent words suggest that the ciphertext is
expanded from the plaintext. I also look at repeated strings; in this case the
repeated string "dmocu[d]" is long enough to be useful. "did" appears a number of
times, "idka" and "dukf" twice each, and "dmo" several times. The repeats suggest that it's a
substitution system with no transposition step -- at least not at the point where
the final substitution is made.
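The counts above are easy to reproduce. Here is a minimal sketch, assuming the ciphertext exactly as quoted and treating word spaces as irrelevant when hunting for repeats; the function names are illustrative only.

```python
from collections import Counter

CIPHERTEXT = (
    "okdidka kadidukfod medidmo madadekgid kacodakbid "
    "encodmocud mocuduk imdodikgadukfudombid"
)

def letter_counts(text):
    """Count each letter, ignoring spaces."""
    return Counter(ch for ch in text if ch.isalpha())

def repeated_ngrams(text, n):
    """Return the n-grams that occur more than once, with spaces removed."""
    stream = text.replace(" ", "")
    grams = Counter(stream[i:i + n] for i in range(len(stream) - n + 1))
    return {g: c for g, c in grams.items() if c > 1}

counts = letter_counts(CIPHERTEXT)
print(len(counts), "distinct letters; most common:", counts.most_common(3))
print("repeated 3-grams:", repeated_ngrams(CIPHERTEXT, 3))
print("repeated 4-grams:", repeated_ngrams(CIPHERTEXT, 4))
```

Removing the spaces is a choice, not a necessity: it lets repeats such as "idka" show up even when one copy straddles a word break.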
I'd then try to determine the length of the cipher units. I assume the spaces represent
word divisions. Since the 8 "words" have lengths of mixed parity (7, 10, 7, 10, 10,
10, 7, 20) and the words are long, the cipher units are probably of different lengths.
I'll assume for openers that it's straight substitution -- i.e. individual letters
are substituted with reversible digraphs and trigraphs. I'll guess that double
consonants (at least) represent a letter break, so the initial breakup would
look like:
ok-did-ka kadiduk-fod medid-mo madadek-gid kacodak-bid
en-cod-mocud mocuduk im-dodik-gaduk-fudom-bid
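That first breakup is mechanical enough to sketch in a few lines. This assumes a, e, i, o, u are the only vowels and that a break falls between any two adjacent consonants; the function name is made up for illustration.

```python
VOWELS = set("aeiou")

def split_at_double_consonants(word):
    """Insert a break between every pair of adjacent consonants."""
    pieces, start = [], 0
    for i in range(1, len(word)):
        if word[i - 1] not in VOWELS and word[i] not in VOWELS:
            pieces.append(word[start:i])
            start = i
    pieces.append(word[start:])
    return "-".join(pieces)

CIPHERTEXT = (
    "okdidka kadidukfod medidmo madadekgid kacodakbid "
    "encodmocud mocuduk imdodikgadukfudombid"
)

print(" ".join(split_at_double_consonants(w) for w in CIPHERTEXT.split()))
```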
Let's assume that the 2- and 3-letter groups already isolated cannot arise from pieces
of other groups, so (for example) every "did" is the same unit and always represents the same thing.
(By the way, I'm now well beyond the point where I'm speculating wildly without
enough support for the chosen path, but I like to push guesses to the limit in
hopes that some insight will pop up to try on a different path.)
ok-did-ka ka-did-uk-fod me-did-mo madadek-gid ka-cod-ak-bid
en-cod-mo-cud mo-cud-uk im-dodik-gad-uk-fudom-bid
There are still some things to separate: madadek, dodik and fudom.
I'll guess that "dad" is a particle to match "did" (which would give
"ek" to match our "ak"), and perhaps "fud" to match "fod".
ok-did-ka ka-did-uk-fod me-did-mo ma-dad-ek-gid ka-cod-ak-bid
en-cod-mo-cud mo-cud-uk im-dodik-gad-uk-fud-om-bid
The final missing separator would be "dod-ik" or "do-dik". I suspect
the former, again by analogy with did and dad, and leaving ik to match
ek, uk and ak.
ok-did-ka ka-did-uk-fod me-did-mo ma-dad-ek-gid ka-cod-ak-bid
en-cod-mo-cud mo-cud-uk im-dod-ik-gad-uk-fud-om-bid
This doesn't look great: there are too many distinct "letters"
here -- of the 33 syllables, 22 are different -- which gives a very
low index of coincidence. Also, if there really are only about 33 letters
in the cryptogram, it's going to be hard to figure it out unless we spot
some kind of regularity, such as multiple syllables mapping to a single
letter (e.g. did, dad, dod -> e).
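The syllable tally can be checked with a short sketch. It assumes the final separation above is right and treats each syllable as one "letter"; the index of coincidence is the usual sum of n(n-1) over the counts, divided by N(N-1). For comparison, the commonly quoted figure for English letters is roughly 0.067.

```python
from collections import Counter

SEGMENTED = (
    "ok-did-ka ka-did-uk-fod me-did-mo ma-dad-ek-gid ka-cod-ak-bid "
    "en-cod-mo-cud mo-cud-uk im-dod-ik-gad-uk-fud-om-bid"
)

def syllables(segmented):
    """Flatten the hyphenated words into one list of syllables."""
    return [s for word in segmented.split() for s in word.split("-")]

def index_of_coincidence(units):
    """Chance that two units drawn without replacement are identical."""
    counts = Counter(units)
    n = len(units)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

units = syllables(SEGMENTED)
print(len(units), "syllables,", len(set(units)), "distinct")
print("index of coincidence:", round(index_of_coincidence(units), 4))
```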
One promising feature of this separation is that the 2- and 3-letter groups
alternate, without my having had to force that. Perhaps this means that each
plaintext letter may be encoded either as a digraph or a trigraph, depending
on the parity of its position. This would also explain the overly high
variability.
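The alternation is quick to verify. A minimal sketch, assuming the final separation above; it reports, word by word, whether the group lengths strictly alternate between 2 and 3.

```python
SEGMENTED = (
    "ok-did-ka ka-did-uk-fod me-did-mo ma-dad-ek-gid ka-cod-ak-bid "
    "en-cod-mo-cud mo-cud-uk im-dod-ik-gad-uk-fud-om-bid"
)

def group_lengths(word):
    """Lengths of the hyphen-separated groups in one word."""
    return [len(g) for g in word.split("-")]

def strictly_alternates(lengths):
    """True if only 2s and 3s occur and no two neighbours are equal."""
    only_2_or_3 = set(lengths) <= {2, 3}
    no_repeats = all(a != b for a, b in zip(lengths, lengths[1:]))
    return only_2_or_3 and no_repeats

for word in SEGMENTED.split():
    lengths = group_lengths(word)
    print(word, lengths, strictly_alternates(lengths))
```

In this separation every word also opens with a digraph, so the parity seems to restart at each word break rather than running through the whole message.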
The real first step, of course, is to look at provenance -- it's almost never
the case that a cipher drops out of the sky on a piece of paper in front
of you. If lives depended on it I'd also try a brute force aphorism search (since
this is a famous quote) looking for a quote with words of length 3/4/3/4/4/4/3/8.
But for now I guess I've theorized so far beyond my data that I'd better stop
wasting bits.
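For what it's worth, the word-length search mentioned above could be sketched as follows. The candidate quotes are placeholders chosen for illustration, not a real quotation corpus, and the function names are made up. A length match only nominates a candidate, which would still have to be tested against the syllable groups.

```python
import re

# Word lengths implied by the syllable separation: 3/4/3/4/4/4/3/8.
TARGET = (3, 4, 3, 4, 4, 4, 3, 8)

def length_pattern(quote):
    """Tuple of word lengths, counting letters only (apostrophes dropped)."""
    words = re.findall(r"[A-Za-z']+", quote)
    return tuple(len(w.replace("'", "")) for w in words)

def matching_quotes(quotes, target=TARGET):
    """Keep the quotes whose word-length pattern equals the target."""
    return [q for q in quotes if length_pattern(q) == target]

# Hypothetical stand-in for a large file of famous quotations.
CANDIDATES = [
    "To be or not to be that is the question",
    "I think therefore I am",
    "The only thing we have to fear is fear itself",
    "God does not play dice with the universe",
]

for quote in matching_quotes(CANDIDATES):
    print(quote)
```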
--
Jim Gillogly