Re: VMs: Crypto question



Jeff wrote:
How would you go about cracking this cipher? What would be your first step?
BTW this is a famous quote. It works on a very simple principle. I can write
a program to produce more of this if of interest.

okdidka kadidukfod medidmo madadekgid kacodakbid

encodmocud mocuduk imdodikgadukfudombid

At the risk of moving a little further off-topic... my first step is frequency
counts: there are 13 letters used, and 'd' is way high (20 of the 81 letters).
The pronounceable output, low variability, and long apparent words suggest that
the ciphertext is expanded from the plaintext.

I also look at repeated strings; in this case the string "dmocu[d]" repeats and
is long enough to be useful.  "did" appears three times, "idka" and "dukf" twice
each (ignoring word breaks), and "dmo" three times.  The repeats suggest that
it's a substitution system with no transposition step -- at least not at the
point where the final substitution is made.
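The counts above are easy to reproduce by machine.  What follows is only a
minimal sketch in Python, not anything the original poster supplied: the names
letter_counts and repeated_substrings are made up for illustration, and the
repeat search ignores spaces, which is an assumption about how the repeats
were found.

```python
from collections import Counter

CIPHERTEXT = (
    "okdidka kadidukfod medidmo madadekgid kacodakbid "
    "encodmocud mocuduk imdodikgadukfudombid"
)

def letter_counts(text):
    """Count each letter, ignoring spaces."""
    return Counter(ch for ch in text if ch.isalpha())

def repeated_substrings(text, min_len=3):
    """Return {substring: occurrences} for every substring of at least
    min_len letters that occurs more than once (spaces removed first,
    overlapping occurrences allowed)."""
    stream = text.replace(" ", "")
    seen = Counter()
    for length in range(min_len, len(stream) // 2 + 1):
        for start in range(len(stream) - length + 1):
            seen[stream[start:start + length]] += 1
    return {s: n for s, n in seen.items() if n > 1}

counts = letter_counts(CIPHERTEXT)
print(len(counts), "distinct letters;", counts.most_common(3))

# Longest repeats first, since those are the most useful to a solver.
repeats = repeated_substrings(CIPHERTEXT, min_len=4)
for s in sorted(repeats, key=len, reverse=True)[:5]:
    print(s, repeats[s])
```

Dropping the spaces is a design choice: it lets repeats that straddle a word
break show up, at the cost of a few repeats that may be accidental.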

I'd then try to determine the length of the cipher units.  I assume the spaces
represent word divisions.  Since the 8 "words" are of different parity (7, 10,
7, 10, 10, 10, 7, 20) and the words are long, the cipher units are probably of
different lengths.
I'll assume for openers that it's straight substitution -- i.e. individual letters
are substituted with reversible digraphs and trigraphs.  I'll guess that double
consonants (at least) represent a letter break, so the initial breakup would
look like:

ok-did-ka kadiduk-fod medid-mo madadek-gid kacodak-bid
en-cod-mocud mocuduk im-dodik-gaduk-fudom-bid
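The double-consonant guess can be applied mechanically.  Here is a small
sketch, assuming "consonant" just means any letter other than a, e, i, o, u;
the function name split_on_double_consonants is invented for this example.

```python
import re

CIPHERTEXT = (
    "okdidka kadidukfod medidmo madadekgid kacodakbid "
    "encodmocud mocuduk imdodikgadukfudombid"
)

def split_on_double_consonants(word):
    """Insert '-' at every point where two consonants are adjacent.

    The pattern is zero-width: it matches the empty position that has a
    non-vowel immediately before it and a non-vowel immediately after it.
    """
    return re.sub(r"(?<=[^aeiou])(?=[^aeiou])", "-", word)

first_pass = " ".join(split_on_double_consonants(w) for w in CIPHERTEXT.split())
print(first_pass)
```

This reproduces only the first breakup; the later splits (did, cod, cud and so
on) rest on guesses about recurring groups rather than on a fixed rule.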

Let's assume that groups we see separated into 2- or 3-letter groups cannot be
formed by others, so (for example) "did" will always represent the same thing.
(By the way, I'm now well beyond the point where I'm speculating wildly without
enough support for the chosen path, but I like to push guesses to the limit in
hopes that some insight will pop up to try on a different path.)

ok-did-ka ka-did-uk-fod me-did-mo madadek-gid ka-cod-ak-bid
en-cod-mo-cud mo-cud-uk im-dodik-gad-uk-fudom-bid

There are still some things to separate: madadek, dodik and fudom.
I'll guess that "dad" is a particle to match "did" (which would give
"ek" to match our "ak"), and perhaps "fud" to match "fod".

ok-did-ka ka-did-uk-fod me-did-mo ma-dad-ek-gid ka-cod-ak-bid
en-cod-mo-cud mo-cud-uk im-dodik-gad-uk-fud-om-bid

The final missing separator would be "dod-ik" or "do-dik".  I suspect
the former, again by analogy with did and dad, and leaving ik to match
ek, uk and ak.

ok-did-ka ka-did-uk-fod me-did-mo ma-dad-ek-gid ka-cod-ak-bid
en-cod-mo-cud mo-cud-uk im-dod-ik-gad-uk-fud-om-bid

This doesn't look great: there are too many resulting "letters" -- out
of the 33 syllables here, 22 are different -- that's a very low index
of coincidence.  Also, if there really are only about 33 letters in
the cryptogram, it's going to be hard to figure it out unless we spot
some kind of regularity, such as multiple syllables mapping to a single
letter (e.g. did, dad, dod -> e).
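To put a number on "very low", here is a sketch that treats each syllable of
the final separation as one cipher unit and computes the usual index of
coincidence, sum(n*(n-1)) / (N*(N-1)).  The helper names are hypothetical, and
the comparison figure for English single letters (roughly 0.066) is quoted
from memory rather than from this thread.

```python
from collections import Counter

SEGMENTED = (
    "ok-did-ka ka-did-uk-fod me-did-mo ma-dad-ek-gid ka-cod-ak-bid "
    "en-cod-mo-cud mo-cud-uk im-dod-ik-gad-uk-fud-om-bid"
)

def syllables(segmented):
    """Flatten the hyphen-separated words into one list of syllables."""
    return [s for word in segmented.split() for s in word.split("-")]

def index_of_coincidence(units):
    """Probability that two units drawn without replacement are equal."""
    counts = Counter(units)
    n = len(units)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

units = syllables(SEGMENTED)
ic = index_of_coincidence(units)
print(len(units), "syllables,", len(set(units)), "distinct, IC =", round(ic, 4))
```

With only 33 units the statistic is noisy, so it is better read as a hint than
as evidence.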

One promising feature of this separation is that the 2- and 3-letter groups
alternate within each word, always starting with a 2-letter group, without my
having had to force that.  Perhaps this means that each plaintext letter may be
encoded either as a digraph or a trigraph, depending on the parity of its
position in the word.  This would also explain the overly high variability.
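The alternation is quick to confirm.  This sketch checks each word of the
final separation and collects the groups seen in digraph and trigraph slots;
alternates and the two set names are made up here, and "position" is assumed
to mean position within the word.

```python
SEGMENTED = (
    "ok-did-ka ka-did-uk-fod me-did-mo ma-dad-ek-gid ka-cod-ak-bid "
    "en-cod-mo-cud mo-cud-uk im-dod-ik-gad-uk-fud-om-bid"
)

def alternates(word):
    """True if the groups in a word have lengths 2, 3, 2, 3, ... in order."""
    lengths = [len(group) for group in word.split("-")]
    return all(n == (2 if i % 2 == 0 else 3) for i, n in enumerate(lengths))

digraphs, trigraphs = set(), set()
for word in SEGMENTED.split():
    for i, group in enumerate(word.split("-")):
        # Even slots (0, 2, ...) hold the first, third, ... group of a word.
        (digraphs if i % 2 == 0 else trigraphs).add(group)

print(all(alternates(w) for w in SEGMENTED.split()))
print(sorted(digraphs))
print(sorted(trigraphs))
```

If the parity idea is right, the two sets are separate alphabets, which
roughly halves the number of distinct symbols each one has to account for.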

The real first step, of course, is to look at provenance -- it's almost never
the case that a cipher drops out of the sky on a piece of paper in front
of you.  If lives depended on it I'd also try a brute force aphorism search (since
this is a famous quote) looking for a quote with words of length 3/4/3/4/4/4/3/8.
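A word-length search of that kind might look like the sketch below.  It scans
running text for eight consecutive words with lengths 3/4/3/4/4/4/3/8.  The
function name matching_windows is invented, the sample sentence is made-up
filler chosen only to exercise the code (it is not offered as the solution),
and words are taken to be runs of plain letters, so apostrophes and hyphens
would split a word.

```python
import re

TARGET = (3, 4, 3, 4, 4, 4, 3, 8)

def matching_windows(text, target=TARGET):
    """Return every run of consecutive words whose lengths equal target."""
    words = re.findall(r"[A-Za-z]+", text)
    k = len(target)
    hits = []
    for i in range(len(words) - k + 1):
        window = words[i:i + k]
        if tuple(len(w) for w in window) == tuple(target):
            hits.append(" ".join(window))
    return hits

# Made-up filler text, not a real quotation.
sample = "Well, you must not lose your hope for tomorrow, friend."
print(matching_windows(sample))
```

In practice the text fed in would be a quotation collection, and the pattern
is only as good as the guess that each syllable stands for one letter.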

But for now I guess I've theorized so far beyond my data that I'd better stop
wasting bits.
--
	Jim Gillogly
