
VMs: MONKEY and Voynich concordance



I announced here several months ago that I was rewriting
MONKEY.

My original source code, written in Turbo Pascal 5.5 on the fly,
with features added as I thought of them, has turned out to be an
unmanageable mess. Not only unmanageable, but unreadable: I could
not figure out what it did or how it did it! And then I had
distractions: books to review for THES, CD-drive problems (still
unsolved), and an article on the rongorongo of Easter Island that
turned into a nightmare. Finally, programming MONKEY top-down
landed me in an awful mess. Conclusion: its many functions have to
be coded first, one by one, and then, and only then, can I design
the user interface. (And to add insult to injury, the space bar of
my keyboard is now sticking!)

Having realised the error of my ways, I turned to implementing
the most important function for my immediate purposes: a concordance.
And a letter-by-letter concordance, because I suspect that the
spaces are spurious. I had already written a concordance for the
Easter Island tablets, and I have just finished adapting it for
texts such as the VMS. I have abandoned Pascal for a little-known
programming language, Euphoria, which has many advantages. It is very
easy to learn and to use, it puts no limit on the amount of data
held in memory (it will even swap to disk automatically when you
run out, a feature I always disable), and more importantly, it is
available for DOS, Windows, Linux, and BSD. One drawback: it is
interpreted and thus slower than compiled languages. Nevertheless,
it is very fast as interpreted languages go. I just did a
letter-by-letter concordance of the 5,471 lines of the EVA
transcription ending with f115r. It took the program 14.34 seconds to
read and reformat the data, and only 37.24 seconds to produce the
concordance. I did not save it to disk, though: it is 191,545 lines
long and would occupy some SIXTEEN megabytes. Would make a nice book,
wouldn't it? Some 30,000 pages!

Here are the first ten lines:

<f79r.P.44;H>     aiin.chkam.ar.cheedy.ldy\ody.o|aan.okeey.dar.cheory\poldshedy
<f113v.P.45;H>    .lsar.lkeey.opchedy.qokchdy.ot|a.aram\olkaiin.cheey.lain.al.c
<f112v.P.38;H>    n.ar.qokaiin.chol.kedy.qokam\s|a.ar.oiin.okchey.al.chedy.chol
<f32v.P.10;H>     .chor.chol.daiin.cphol.cthol.d|aar\ol.sho.chy\tshdar.shdor.sh
<f115r.P.3;H>     od.qol.chedy.qockheos.cholor.d|aar.oraro\dchedain.qokeedy.olk
<f71v.S2.1;H>     .cheey.otal.al.shldy.otaly.\of|acfom\otalody\otalaiin\otar.sh
<f72r1.S1.5;H>    y\oaiin.ar.ary\okalam\ytal.shd|a\char.alf\otaraldy\otaiin.ota
<f42r.P1.1;H>     kol.chedy.okeey\sho.ofaiin.cth|achcthy.otcheey.pchear\sol.kai
<f22r.P.7;H>      hy.ctheen\kchol.shol.dsheor.sk|a.chdoly.ytaiin.olotchy.cphal\
<f68r3.C1.1;H>    pcheody.dchedy.daiin.oteeody.d|a*cheedy.oteeody.qochecth*m.dc

Unlike in MONKEY, the text appears as it is, complete with the line
"headers", e.g. <f79r.P.44;H>.

Each line shows a word, or part of a word, in its context. A pipe
(|) has been inserted to mark the beginning of the word (or part of
a word). So the entries above read: aanokeey..., aaramolkaiin...,
aaroiinokchey..., and so on. A backslash (\) marks an end-of-line.
Both special characters (the end-of-line marker and the pipe) are
user-definable, of course. For the purpose of sorting, only "active"
letters (also user-defined) are taken into account (here: 'a' to
'z'); everything else is ignored.
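
For anyone who wants to experiment, here is a minimal sketch of the
idea in Python (not Euphoria, and not the actual program). It assumes
the input is a list of (header, text) pairs; the names
concordance_entries and format_entry, the 30-character context width
and the 18-character header column are my own guesses from the sample
above, not taken from the real code.

```python
from bisect import bisect_right

EOL = "\\"    # end-of-line marker (assumed default, user-definable)
MARK = "|"    # marks where the entry begins (assumed default)
ACTIVE = set("abcdefghijklmnopqrstuvwxyz")  # letters that count for sorting


def concordance_entries(lines, width=30):
    """Return (header, left_context, right_context) tuples in sorted order.

    lines is a list of (header, text) pairs, one per transcription line.
    """
    # Join all lines into one stream, each followed by the EOL marker.
    stream = ""
    line_starts, headers = [], []
    for header, text in lines:
        line_starts.append(len(stream))
        headers.append(header)
        stream += text + EOL

    # Offsets of the active letters, and the text reduced to those letters.
    pos = [i for i, ch in enumerate(stream) if ch in ACTIVE]
    active = "".join(stream[i] for i in pos)

    # One entry per active letter, sorted by all active letters that follow.
    # Simple but memory-hungry: fine for a demo, not for a large text.
    order = sorted(range(len(pos)), key=lambda k: active[k:])

    entries = []
    for k in order:
        p = pos[k]
        # Header of the line in which this entry starts.
        header = headers[bisect_right(line_starts, p) - 1]
        left = stream[max(0, p - width):p]
        right = stream[p:p + width]
        entries.append((header, left, right))
    return entries


def format_entry(entry, width=30, header_width=18):
    """Lay out one entry as: header, left context, pipe, right context."""
    header, left, right = entry
    return f"{header:<{header_width}}{left:>{width}}{MARK}{right}"


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    sample = [("<x1.P.1;H>", "daiin.ol"), ("<x1.P.2;H>", "ar.al")]
    for entry in concordance_entries(sample):
        print(format_entry(entry))
```

Dots, asterisks and end-of-line markers stay in the displayed context
but play no part in the sort key, which is why an entry can run on
across a space or a line break.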

MONKEY had a limit to sorting accuracy, which was user-definable.
This version has no limit at all: entries are sorted right to
the end of the input text if necessary. I was very surprised that
it took barely over half a minute, and my computer is old: five
years old, an AMD-K6 running at only 200 MHz. With a modern monster,
I expect it would be done before you could count one, two, three.
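
To illustrate what a limit on sorting accuracy means, here is a small
Python sketch (again, not the Euphoria code, and I am only assuming
that MONKEY's limit worked roughly like this). The function name
sorted_offsets and the depth parameter are hypothetical.

```python
def sorted_offsets(text, depth=None):
    """Sort the start offsets of text by the letters that follow each one.

    depth=None compares right to the end of the text; an integer depth
    compares only that many letters, so longer ties stay in text order
    (Python's sort is stable).
    """
    if depth is None:
        key = lambda i: text[i:]
    else:
        key = lambda i: text[i:i + depth]
    return sorted(range(len(text)), key=key)


if __name__ == "__main__":
    print(sorted_offsets("abdabc", depth=2))
    print(sorted_offsets("abdabc"))
```

With depth=2 the two entries beginning "ab" stay in text order
(offset 0 before offset 3); with no limit, "abc" at offset 3 correctly
sorts ahead of "abdabc" at offset 0.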

If you have ideas or suggestions, now is the time to tell me about
them, before I get too deep into writing this "SON OF MONKEY".





