
Re: VMs: Word Endings



Hi everyone,

At 21:54 10/07/2003 -0500, GC wrote:
Because of recent comments to the nature of "the VMS has so few word
endings", I've posted three short studies of these in separate files.
There's a lot to consider on this topic, so I'll just touch on a few and see
if we can draw enough different perspectives to look into this a little
deeper.  Follow me to http://voynich.info/vgbt/xcrptn/first_last.pdf

I'm extremely interested in GC's transcription of the variously accessorised EVA <ch> characters: one interesting idea I keep returning to which might just explain these is my <generic shorthand completion token> hypothesis (which is similar to what Gordon just mentioned, but a little more specific) - I'll try to explain... [*]


It's clear from a closer examination of the glyphs (as GC has done so assiduously) that there are multiple subtly different variants of EVA <ch>... but why should this be? Certainly, the VMS' core alphabet has several pairs of characters that are hard to tell apart at a glance (such as o-a, ii-iii, r-s, perhaps even 2+ forms of y, and arguably some of the gallows), which is typical of the kind of Quattrocento cipher I'm familiar with. The obvious suggestion might therefore simply be that this set of ch-mutants (?) is a form of nomenclator, containing (say) 5 or 6 particular words which would be better hidden than spelled out in full repeatedly.

I'm not so sure. I have another idea...

Digression: in the same way that I don't think VMS glyphs are letters (but are typically half-pairs), I don't think VMS words are words (if you're going to the bother of hiding letters, why make the words easy to discern?) - which means that I think
--> (a) full-spaces are being inserted relatively systematically (though half-spaces may be meaningful) into an otherwise continuous stream, and
--> (b) that the real word boundaries are probably happening *inside* apparent words (which would tend to increase the apparent "vocabulary", as we observe).


So, given that I think that the plaintext is concatenated into a stream, encoded and then artificially divided so as to obscure real word divisions, what can we infer?

The key observation here might be that, looking specifically at GC's second table, EVA <ch> very rarely terminates an apparent word (i.e. a space-delimited one) - my inference (on the back of everything else I've just described) is that this is highly likely to indicate that EVA <ch> terminates *real* words... the author is going out of his way to ensure that <ch> almost never terminates an apparent word, but why?
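
For anyone who wants to check the "rarely word-final" observation against a transcription of their own, here is a minimal sketch in Python. The sample words are made up in EVA-style notation purely for illustration (they are not taken from GC's tables), and final_rate is just a hypothetical helper name:

```python
def final_rate(words, glyph="ch"):
    """Return (words ending in glyph, words containing glyph, share)."""
    containing = [w for w in words if glyph in w]
    ending = [w for w in containing if w.endswith(glyph)]
    share = len(ending) / len(containing) if containing else 0.0
    return len(ending), len(containing), share

# Made-up EVA-style words, for illustration only (not real VMS data).
sample = "chol chor daiin qokchy shol otch chedy okchey".split()

# Here 1 of the 6 words containing "ch" ends in it.
print(final_rate(sample))
```

Run over a full transcription file (split on spaces or on whatever word separator that transcription uses), a very low share would match what GC's second table shows; it says nothing by itself about *why* the share is low.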

All in all, I strongly suspect that <ch> functions as a "generic shorthand completion token", which I render as "..." (an "ellipsis", Unicode U+2026 if you're a code-junkie) - also note that *writing EVA <ch> on a wax tablet* (try it with a biro) has the same lurching three-step physical rhythm as writing three dots. In fact, I think that <ch> is a kind of *joined up ellipsis*.

Now, even though Shakespeare uses ellipses, the idea is actually much older. FYI, I quickly found one page (on St Bridget's Tractatus de summis pontificibus) which mentions a Sara Ekwall in Sweden (now how weird is that?) dating a manuscript with three dots to 1402:-
http://www.fordham.edu/halsall/basis/bridget-tractatus.html


But in fact, you can (apparently) trace it right back to Old Norse:-
        http://www.everything2.com/index.pl?node=ellipsis
        The ellipsis is first noted in Old Norse starting in about 200 BC,
        which is the first known written language to utilize the ellipsis.
        Often in Old Norse, writers would omit infinitive phrases and
        non-action verbs, which is the first known existence of such
        verbal omissions in written language. Old Norse was
        particularly well structured for this because the language was
        contextually very strong; writers and speakers were able to
        easily make it apparent what the subjects and objects were,
        meaning the verb became less important in many cases.

My hypothesis is therefore that EVA <ch> represents a <generic word completion token> (i.e. "finish the current word off in a sensible way"), in almost exactly the same way that Gordon suggested for EVA <y> for Latin.

However (and here's the twist, at long last), I believe that EVA <ch> gets accessorised when the encoder reviews the text and realises that what he's written was actually ambiguous - so, the extra marks (loops, tears, curls, dots, whatever) are for disambiguating the text stream *somehow*... they're decoding hints, to indicate that you may need to think more about what they're abbreviating.

So, all in all, I infer from this that the overall process for encoding a page went like this:-
(1) clearly re-write the original text as continuous shorthand/tachygraphy onto a wax tablet
(2) review it for ambiguities - mark up any ambiguous word endings with tears, loops, etc
(3) encode it, leaving many of the original letters in place but faking word boundaries
(4) pass the wax tablet to a scribe for copying onto vellum
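
To make steps (1) and (3) concrete, here is a toy sketch in Python of the kind of process I mean. Everything in it is an assumption for illustration: the ellipsis character stands in for EVA <ch>, "keep the first three letters" is an arbitrary abbreviation rule, the chunk lengths are random, and no letter substitution is attempted at all. Step (2), the ambiguity mark-up, is not modelled. The names abbreviate, respace and encode are hypothetical.

```python
import random

TOKEN = "\u2026"  # ellipsis, standing in for EVA <ch> (an assumption)

def abbreviate(word, keep=3):
    """Keep the first `keep` letters; mark any omitted ending with TOKEN."""
    return word if len(word) <= keep else word[:keep] + TOKEN

def respace(stream, rng, min_len=3, max_len=6):
    """Cut a continuous stream into fake words of random length."""
    out, i = [], 0
    while i < len(stream):
        j = min(len(stream), i + rng.randint(min_len, max_len))
        # Push the cut forward so that TOKEN is never left word-final
        # (except possibly at the very end of the stream).
        while j < len(stream) and stream[j - 1] == TOKEN:
            j += 1
        out.append(stream[i:j])
        i = j
    return out

def encode(plaintext, seed=0):
    """Steps (1) and (3) only: abbreviate, concatenate, fake the spaces."""
    stream = "".join(abbreviate(w) for w in plaintext.split())
    return respace(stream, random.Random(seed))

print(" ".join(encode("gather the roots under the waning moon")))
```

The point of the toy is only that the completion token marks every real (abbreviated) word ending in the stream, yet after re-spacing it sits inside the apparent words rather than at their ends, which is the pattern described above.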


It's not 100% cryptography, it's not 100% shorthand, but it makes a lot of sense to me: does it make any sense to you?

Cheers, .....Nick Pelling.....

[*] note ironic ellipsis! :-)

