VMs: The Mystery of the Voynich Manuscript (Scientific American)


Scientific American

June 21, 2004   
The Mystery of the Voynich Manuscript   
New analysis of a famously cryptic medieval document suggests that it 
contains nothing but gibberish   
By Gordon Rugg    
In 1912 Wilfrid Voynich, an American rare-book dealer, made the find of a 
lifetime in the library of a Jesuit college near Rome: a manuscript some 230 
pages long, written in an unusual script and richly illustrated with bizarre 
images of plants, heavenly spheres and bathing women. Voynich immediately 
recognized the importance of his new acquisition. Although it superficially resembled 
the handbook of a medieval alchemist or herbalist, the manuscript appeared to 
be written entirely in code. Features in the illustrations, such as hairstyles, 
suggested that the book was produced sometime between 1470 and 1500, and a 
17th-century letter accompanying the manuscript stated that it had been 
purchased by Rudolph II, the Holy Roman Emperor, in 1586. During the 1600s, at least 
two scholars apparently tried to decipher the manuscript, and then it 
disappeared for nearly 250 years until Voynich unearthed it. 
Voynich asked the leading cryptographers of his day to decode the odd script, 
which did not match that of any known language. But despite 90 years of 
effort by some of the world's best code breakers, no one has been able to decipher 
Voynichese, as the script has become known. The nature and origin of the 
manuscript remain a mystery. The failure of the code-breaking attempts has raised 
the suspicion that there may not be any cipher to crack. Voynichese may contain 
no message at all, and the manuscript may simply be an elaborate hoax. 

Critics of this hypothesis have argued that Voynichese is too complex to be 
nonsense. How could a medieval hoaxer produce 230 pages of script with so many 
subtle regularities in the structure and distribution of the words? But I have 
recently discovered that one can replicate many of the remarkable features of 
Voynichese using a simple coding tool that was available in the 16th century. 
The text generated by this technique looks much like Voynichese, but it is 
merely gibberish, with no hidden message. This finding does not prove that the 
Voynich manuscript is a hoax, but it does bolster the long-held theory that an 
English adventurer named Edward Kelley may have concocted the document to 
defraud Rudolph II. (The emperor reportedly paid a sum of 600 ducats--equivalent 
to about $50,000 today--for the manuscript.) 

Perhaps more important, I believe that the methods used in this analysis of 
the Voynich mystery can be applied to difficult questions in other areas. 
Tackling this hoary puzzle requires expertise in several fields, including 
cryptography, linguistics and medieval history. As a researcher into expert 
reasoning--the study of the processes used to solve complex problems--I saw my work on 
the Voynich manuscript as an informal test of an approach that could be used to 
identify new ways of tackling long-standing scientific questions. The key 
step is determining the strengths and weaknesses of the expertise in the relevant 
fields. 

Baby God's Eye?
The first purported decryption of the Voynich manuscript came in 1921. 
William R. Newbold, a professor of philosophy at the University of Pennsylvania, 
claimed that each character in the Voynich script contained tiny pen strokes that 
could be seen only under magnification and that these strokes formed an 
ancient Greek shorthand. Based on his reading of the code, Newbold declared that 
the Voynich manuscript had been written by 13th-century philosopher-scientist 
Roger Bacon and described discoveries such as the invention of the microscope. 
Within a decade, however, critics debunked Newbold's solution by showing that 
the alleged microscopic features of the letters were actually natural cracks in 
the ink. 

Newbold's attempt was just the start of a string of failures. In the 1940s 
amateur code breakers Joseph M. Feely and Leonell C. Strong used substitution 
ciphers that assigned Roman letters to the characters in Voynichese, but the 
purported translations made little sense. At the end of World War II the U.S. 
military cryptographers who cracked the Japanese Imperial Navy's codes passed 
some spare time tackling ciphertexts--encrypted texts--from antiquity. The team 
deciphered every one except the Voynich manuscript. 

In 1978 amateur philologist John Stojko claimed that the text was written in 
Ukrainian with the vowels removed, but his translation--which included 
sentences such as "Emptiness is that what Baby God's Eye is fighting for"--did not 
jibe with the manuscript's illustrations or with Ukrainian history. In 1987 a 
physician named Leo Levitov asserted that the document had been produced by the 
Cathars, a heretical sect that flourished in medieval France, and was written 
in a pidgin composed of words from various languages. Levitov's translation, 
though, was at odds with the Cathars' well-documented theology. 

Furthermore, all these schemes used mechanisms that allowed the same 
Voynichese word to be translated one way in one part of the manuscript and a different 
way in another part. For example, one step in Newbold's solution involved the 
deciphering of anagrams, which is notoriously imprecise: the anagram ADER, 
for instance, can be interpreted as READ, DARE or DEAR. Most scholars agree that 
all the attempted decodings of the Voynich manuscript are tainted by an 
unacceptable degree of ambiguity. Moreover, none of these methods could encode 
plaintext--that is, a readable message--into a ciphertext with the striking 
properties of Voynichese. 
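
The anagram ambiguity can be illustrated with a few lines of Python. This is 
only a sketch: the five-word LEXICON is a made-up stand-in for a real 
dictionary, and the function name is hypothetical. 

```python
from itertools import permutations

# Hypothetical mini-lexicon, for illustration only.
LEXICON = {"READ", "DARE", "DEAR", "DRAW", "RED"}

def anagram_readings(letters, lexicon=LEXICON):
    """Return every lexicon word that uses exactly the given letters."""
    candidates = {"".join(p) for p in permutations(letters.upper())}
    return sorted(candidates & lexicon)

print(anagram_readings("ADER"))  # ['DARE', 'DEAR', 'READ']
```

Even this tiny lexicon yields three equally valid readings of ADER, so a 
decoder is free to pick whichever one suits the desired translation. 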

If the manuscript is not a code, could it be an unidentified language? Even 
though we cannot decipher the text, we know that it shows an extraordinary 
amount of regularity. For instance, the most common words often occur two or more 
times in a row. To represent the words, I will use the European Voynich 
Alphabet (EVA), a convention for transliterating the characters of Voynichese into 
Roman letters. An example from folio 78R of the manuscript reads: qokedy qokedy 
dal qokedy qokedy. This degree of repetition is not found in any known 
language. Conversely, Voynichese contains very few phrases where two or three 
different words regularly occur together. These characteristics make it unlikely 
that Voynichese is a human language--it is simply too different from all other 
languages. 
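
One simple way to quantify this kind of repetition is to count how often a 
word is immediately followed by itself. The sketch below applies that count 
to the folio 78R sample quoted above; the function name is an assumption made 
for illustration, not part of any established Voynich toolkit. 

```python
def adjacent_repeats(text):
    """Count positions where a word is immediately followed by itself."""
    words = text.split()
    return sum(1 for a, b in zip(words, words[1:]) if a == b)

sample = "qokedy qokedy dal qokedy qokedy"  # EVA sample quoted in the article
print(adjacent_repeats(sample))  # 2
```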

The third possibility is that the manuscript was a hoax devised for monetary 
gain or that it is some mad alchemist's meaningless ramblings. The linguistic 
complexity of the manuscript seems to argue against this theory. In addition 
to the repetition of words, there are numerous regularities in the internal 
structure of the words. The common syllable qo, for instance, occurs only at the 
start of words. The syllable chek may appear at the start of a word, but if it 
occurs in the same word as qo, then qo always comes before chek. The common 
syllable dy usually appears at the end of a word and occasionally at the start 
but never in the middle. 

A simple "pick and mix" hoax that combines the syllables at random could not 
produce a text with so many regularities. Voynichese is also much more complex 
than anything found in pathological speech caused by brain damage or 
psychological disorders. Even if a mad alchemist did construct a grammar for an 
invented language and then spent years writing a script that employed this grammar, 
the resulting text would not share the various statistical features of the 
Voynich manuscript. For example, the word lengths of Voynichese form a binomial 
distribution--that is, the most common words have five or six characters, and 
the occurrence of words with greater or fewer characters falls off steeply from 
that peak in a symmetric bell curve. This kind of distribution is extremely 
unusual in a human language. In almost all human languages, the distribution of 
word lengths is broader and asymmetric, with a higher occurrence of 
relatively long words. It is very unlikely that the binomial distribution of Voynichese 
could have been a deliberate part of a hoax, because this statistical concept 
was not invented until centuries after the manuscript was written. 
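
For readers unfamiliar with the shape being described, the following sketch 
prints a binomial distribution as a text histogram. The parameters n = 11 
and p = 0.5 are illustrative assumptions chosen so that the peak falls on 5 
and 6; they are not values fitted to the manuscript. 

```python
from math import comb

def binomial_pmf(n, p):
    """Probability of k successes in n trials, for k = 0..n."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

# n=11 and p=0.5 are assumptions for illustration only.
pmf = binomial_pmf(11, 0.5)
for k, prob in enumerate(pmf):
    print(f"{k:2d} {'#' * round(prob * 100)}")
```

The bars rise to a peak at 5 and 6 and fall away symmetrically on both 
sides, which is the profile the article contrasts with the broader, 
asymmetric distributions seen in human languages. 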

Expert Reasoning 
In summary, the Voynich manuscript appeared to be either an extremely unusual 
code, a strange unknown language or a sophisticated hoax, and there was no 
obvious way to resolve the impasse. It so happened that my colleague Joanne Hyde 
and I were looking for just such a puzzle a few years ago. We had been 
developing a method for critically reevaluating the expertise and reasoning used in 
the investigation of difficult research problems. As a preliminary test, I 
applied this method to the research on the Voynich manuscript. I started by 
determining the types of expertise that had previously been applied to the problem. 

The assessment that the features of Voynichese are inconsistent with any 
human language was based on substantial relevant expertise from linguistics. This 
conclusion appeared sound, so I proceeded to the hoax hypothesis. Most people 
who have studied the Voynich manuscript agreed that Voynichese was too complex 
to be a hoax. I found, however, that this assessment was based on opinion 
rather than firm evidence. There is no body of expertise on how to mimic a long 
medieval ciphertext, because there are hardly any examples of such texts, let 
alone hoaxes of this genre. 

Several researchers, such as Jorge Stolfi of the University of Campinas in 
Brazil, had wondered whether the Voynich manuscript was produced using random 
text-generation tables. These tables have cells that contain characters or 
syllables; the user selects a sequence of cells--perhaps by throwing dice--and 
combines them to form a word. This technique could generate some of the 
regularities within Voynichese words. Under Stolfi's method, the table's first column 
could contain prefix syllables, such as qo, that occur only at the start of 
words; the second column could contain midfixes (syllables appearing in the middle 
of words) such as chek, and the third column could contain suffix syllables 
such as y. Choosing a syllable from each column in sequence would produce words 
with the characteristic structure of Voynichese. Some of the cells might be 
empty, so that one could create words lacking a prefix, midfix or suffix. 
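
A three-column table of this kind is easy to sketch in Python. Only qo, chek 
and y come from the description above; every other syllable, and the use of 
a seeded random generator in place of dice, is an assumption made for 
illustration. 

```python
import random

# Hypothetical columns; "" models an empty cell.
PREFIXES = ["qo", "o", "ch", ""]
MIDFIXES = ["chek", "ke", "te", ""]
SUFFIXES = ["y", "dy", "al", ""]

def random_word(rng):
    """Pick one cell from each column, in order, and join the syllables."""
    return rng.choice(PREFIXES) + rng.choice(MIDFIXES) + rng.choice(SUFFIXES)

rng = random.Random(0)  # stands in for throwing dice
words = [w for w in (random_word(rng) for _ in range(12)) if w]
print(" ".join(words))
```

Because qo appears only in the prefix column, it can occur only at the start 
of a generated word, and it always precedes chek when both are present. 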

Other features of Voynichese, however, are not so easily reproduced. For 
instance, some characters are individually common but rarely occur next to each 
other. The characters transcribed as a, e and l are common, as is the 
combination al, but the combination el is very rare. This effect cannot be produced by 
randomly mixing characters from a table, so Stolfi and others rejected this 
approach. The key term here, though, is "randomly." To modern researchers, 
randomness is an invaluable concept. Yet it is a concept developed long after the 
manuscript was created. A medieval hoaxer probably would have used a different 
way of combining syllables that might not have been random in the strict 
statistical sense. I began to wonder whether some of the features of Voynichese 
might be side effects of a long-obsolete device. 

The Cardan Grille 
It looked as if the hoax hypothesis deserved further investigation. My next 
step was to attempt to produce a hoax document to see what side effects 
emerged. The first question was, Which techniques to use? The answer depended on the 
date when the manuscript was produced. Having worked in archaeology, a field 
in which dating artifacts is an important concern, I was wary of the general 
consensus among Voynich researchers that the manuscript was created before 1500. 
It was illustrated in the style of the late 1400s, but this attribute did not 
conclusively pin down the date of its origin; artistic works are often 
produced in the style of an earlier period, either innocently or to make the 
document look older. I therefore searched for a coding technique that was available 
during the widest possible range of origin dates--between 1470 and 1608. 

A promising possibility was the Cardan grille, which was introduced by 
Italian mathematician Girolamo Cardano in 1550. It consists of a card with slots cut 
in it. When the grille is laid over an apparently innocuous text produced 
with another copy of the same card, the slots reveal the words of the hidden 
message. I realized that a Cardan grille with three slots could be used to select 
permutations of prefixes, midfixes and suffixes from a table to generate 
Voynichese-style words. 

A typical page of the Voynich manuscript contains about 10 to 40 lines, each 
consisting of about eight to 12 words. Using the three-syllable model of 
Voynichese, a single table of 36 columns and 40 rows would contain enough syllables 
to produce an entire manuscript page with a single grille. The first column 
would list prefixes, the second midfixes and the third suffixes; the following 
columns would repeat that pattern. You can align the grille to the upper left 
corner of the table to create the first word of Voynichese and then move it 
three columns to the right to make the next word. Or you can move the grille to 
a column farther to the right or to a lower row. By successively positioning 
the grille over different parts of the table, you can create hundreds of 
Voynichese words. And the same table could then be used with a different grille to 
make the words of the next page. 
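
The procedure can be sketched as a small simulation. The table dimensions 
follow the description above; the syllable pools, the representation of a 
grille as three row offsets, and the wrap-around at the bottom of the table 
are all assumptions made for this illustration. 

```python
import random

ROWS, COLS = 40, 36  # table size described in the article

# Hypothetical syllable pools; "" models an empty cell.
POOLS = [
    ["qo", "o", "ch", "sh", ""],     # prefix columns
    ["ke", "te", "chek", "ai", ""],  # midfix columns
    ["y", "dy", "al", "in", ""],     # suffix columns
]

def build_table(rng):
    """Fill the table so columns cycle prefix, midfix, suffix."""
    return [[rng.choice(POOLS[c % 3]) for c in range(COLS)] for _ in range(ROWS)]

def read_word(table, grille, row, col):
    """Read one word; slot i sits over column col+i, shifted down grille[i] rows."""
    return "".join(
        table[(row + off) % ROWS][col + i] for i, off in enumerate(grille)
    )

def generate_line(table, grille, row):
    """Slide the grille three columns at a time across one row of the table."""
    words = [read_word(table, grille, row, col) for col in range(0, COLS, 3)]
    return " ".join(w for w in words if w)

rng = random.Random(1)
table = build_table(rng)
grille = (0, 2, 1)  # hypothetical slot layout
for row in range(3):
    print(generate_line(table, grille, row))
```

With 36 columns read three at a time, each pass across the table yields at 
most 12 words, and the 40 rows give up to 40 lines. Swapping in a different 
grille tuple reuses the same table to produce different text. 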

I drew up three tables by hand, which took two or three hours per table. Each 
grille took two or three minutes to cut out. (I made about 10.) After that, I 
could generate text as fast as I could transcribe it. In all, I produced 
between 1,000 and 2,000 words this way. 

I found that this method could easily reproduce most of the features of 
Voynichese. For example, you can ensure that some characters never occur together 
by carefully designing the tables and grilles. If successive grille slots are 
always on different rows, then the syllables in horizontally adjacent cells in 
the table will never occur together, even though they may be very common 
individually. The binomial distribution of word lengths can be generated by mixing 
short, medium-length and long syllables in the table. Another characteristic 
of Voynichese--that the first words in a line tend to be longer than later 
ones--can be reproduced simply by putting most of the longer syllables on the left 
side of the table. 
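
The effect of placing successive slots on different rows can be shown with 
a toy example. The four-row, two-column table below is invented for the 
purpose; it is not taken from the tables used in the reconstruction. 

```python
# Hypothetical toy table: "e" and "l" are horizontally adjacent on rows 0 and 2.
TABLE = [
    ["e", "l"],
    ["a", "d"],
    ["e", "l"],
    ["a", "d"],
]

def words_for(grille):
    """Read one word per row; grille[c] is the row offset of the slot over column c."""
    rows = len(TABLE)
    return [
        "".join(TABLE[(r + off) % rows][c] for c, off in enumerate(grille))
        for r in range(rows)
    ]

print(words_for((0, 0)))  # ['el', 'ad', 'el', 'ad']
print(words_for((0, 1)))  # ['ed', 'al', 'ed', 'al']
```

With both slots on the same row, el is produced freely. Shifting the second 
slot down one row makes al common and el impossible, even though e and l 
each still appear in half of the generated words. 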

The Cardan grille method therefore appears to be a mechanism by which the 
Voynich manuscript could have been created. My reconstructions suggest that one 
person could have produced the manuscript, including the illustrations, in just 
three or four months. But a crucial question remains: Does the manuscript 
contain only meaningless gibberish or a coded message? 

I found two ways to employ the grilles and tables to encode and decode 
plaintext. The first was a substitution cipher that converted plaintext characters 
to midfix syllables that are then embedded within meaningless prefixes and 
suffixes using the method described above. The second encoding technique assigned 
a number to each plaintext character and then used these numbers to specify 
the placement of the Cardan grille on the table. Both techniques, however, 
produce scripts with much less repetition of words than Voynichese. This finding 
indicates that if the Cardan grille was indeed used to make the Voynich 
manuscript, the author was probably creating cleverly designed nonsense rather than a 
ciphertext. I found no evidence that the manuscript contains a coded message. 
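
The first of these techniques can be sketched as follows. The four-letter 
key and the padding syllables are hypothetical, chosen so that no midfix 
begins or ends like a prefix or suffix and decoding stays unambiguous; the 
actual tables used in the experiments are not reproduced here. 

```python
import random

# Hypothetical key: each plaintext letter maps to one midfix syllable.
MIDFIX_FOR = {"a": "ke", "b": "te", "c": "chek", "d": "ai"}
LETTER_FOR = {mid: letter for letter, mid in MIDFIX_FOR.items()}
PREFIXES = ["qo", "o", ""]  # meaningless padding
SUFFIXES = ["dy", "y", ""]  # meaningless padding

def encode(plaintext, rng):
    """Wrap each letter's midfix in a randomly chosen prefix and suffix."""
    return " ".join(
        rng.choice(PREFIXES) + MIDFIX_FOR[ch] + rng.choice(SUFFIXES)
        for ch in plaintext
    )

def decode_word(word):
    """Strip a known prefix and suffix, then look up the remaining midfix."""
    for prefix in PREFIXES:
        for suffix in SUFFIXES:
            if len(word) < len(prefix) + len(suffix):
                continue
            if word.startswith(prefix) and word.endswith(suffix):
                core = word[len(prefix):len(word) - len(suffix)]
                if core in LETTER_FOR:
                    return LETTER_FOR[core]
    raise ValueError(f"cannot decode {word!r}")

def decode(ciphertext):
    return "".join(decode_word(w) for w in ciphertext.split())

cipher = encode("abcd", random.Random(0))
print(cipher)
print(decode(cipher))  # abcd
```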

This absence of evidence does not prove that the manuscript was a hoax, but 
my work shows that the construction of a hoax as complex as the Voynich 
manuscript was indeed feasible. This explanation dovetails with several intriguing 
historical facts: Elizabethan scholar John Dee and his disreputable associate 
Edward Kelley visited the court of Rudolph II during the 1580s. Kelley was a 
notorious forger, mystic and alchemist who was familiar with Cardan grilles. Some 
experts on the Voynich manuscript have long suspected that Kelley was the 
author. 

My undergraduate student Laura Aylward is currently investigating whether 
more complex statistical features of the manuscript can be reproduced using the 
Cardan grille technique. Answering this question will require producing large 
amounts of text using different table and grille layouts, so we are writing 
software to automate the method. 

This study yielded valuable insights into the process of reexamining 
difficult problems to determine whether any possible solutions have been overlooked. A 
good example of such a problem is the question of what causes Alzheimer's 
disease. We plan to examine whether our approach could be used to reevaluate 
previous research into this brain disorder. Our questions will include: Have the 
investigators neglected any field of relevant expertise? Have the key 
assumptions been tested sufficiently? And are there subtle misunderstandings between 
the different disciplines that are involved in this work? If we can use this 
process to help Alzheimer's researchers find promising new directions, then a 
medieval manuscript that looks like an alchemist's handbook may actually prove to 
be a boon to modern medicine. 

GORDON RUGG became interested in the Voynich manuscript about four years ago. 
At first he viewed it as merely an intriguing puzzle, but later he saw it as 
a test case for reexamining complex problems. He earned his Ph.D. in 
psychology at the University of Reading in 1987. Now a senior lecturer in the School of 
Computing and Mathematics at Keele University in England, Rugg is editor in 
chief of Expert Systems: The International Journal of Knowledge Engineering and 
Neural Networks. His research interests include the nature of expertise and 
the modeling of information, knowledge and beliefs. 
© 1996-2004 Scientific American, Inc. All rights reserved.
Reproduction in whole or in part without permission is prohibited. 