Re: VMs: Work on the relation penstroke -> letters?

Hi Elmar,

At 08:34 13/01/2004 +0100, Elmar Vogt wrote:
Yet most people seem to take the current transcription schemes for granted, and
only give a fleeting glance to this question which I feel is very basic and
fundamental. So, did I miss research which clearly answered that question, or
are people simply taking the transcription for granted, since it's easier to
tackle with the statistical apparatus we have?

You're absolutely correct - EVA isn't (what GC would call) a glyph-transcription, but is rather (closer to) a stroke-transcription. Given the ambiguity over what-is-or-isn't-a-glyph, this is deliberate: it lets you construct your own glyph-transcription (for example, if you think iiin is a glyph, pre-process the text so that it *is* a glyph) before doing your own statistical analysis.
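
As a rough illustration of that pre-processing step, here is a minimal Python sketch that regroups an EVA string into glyph tokens by greedy longest match. The function name `tokenize`, the glyph-set and the sample words are all assumptions made up for this example (not any agreed standard); swap in whatever glyph-set you believe in.

```python
def tokenize(word, glyphs):
    """Split an EVA word into glyph tokens, preferring the longest match."""
    # Try longer candidate glyphs first so "iiin" wins over any shorter overlap.
    ordered = sorted(glyphs, key=len, reverse=True)
    tokens = []
    i = 0
    while i < len(word):
        for g in ordered:
            if word.startswith(g, i):
                tokens.append(g)
                i += len(g)
                break
        else:
            # No multi-character glyph starts here: keep the single EVA character.
            tokens.append(word[i])
            i += 1
    return tokens


# Hypothetical glyph-set, for illustration only.
GLYPHS = {"iiin", "ch", "sh", "cfh"}

print(tokenize("daiiin", GLYPHS))  # ['d', 'a', 'iiin']
print(tokenize("chedy", GLYPHS))   # ['ch', 'e', 'd', 'y']
```

Greedy longest match is only one possible policy; where candidate glyphs overlap, a different rule could give a different tokenisation, which is itself part of the modelling assumption.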

For example, looking at "raw" EVA word lengths - though relatively interesting - is fairly likely to be misleading, because of the distortion introduced by multi-character groups such as "ch", "sh", "cfh", etc (add your preferred glyph-set here), each of which EVA counts as two or three characters.
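
To make the size of that distortion concrete, here is a small Python sketch comparing mean word length counted in raw EVA characters against mean length counted in glyphs. The word list and the glyph-set are illustrative assumptions, not figures taken from any particular transcription file.

```python
import re

# Hypothetical glyph-set; longest alternatives go first in the pattern.
GLYPHS = ["cfh", "ch", "sh"]
PATTERN = re.compile(
    "|".join(re.escape(g) for g in sorted(GLYPHS, key=len, reverse=True)) + "|."
)


def glyph_length(word):
    """Count glyphs in a word: each listed group is one glyph, else one per character."""
    return len(PATTERN.findall(word))


# Illustrative sample words only.
words = ["chedy", "shedy", "daiin", "cfhol"]

raw_mean = sum(len(w) for w in words) / len(words)
glyph_mean = sum(glyph_length(w) for w in words) / len(words)

print(raw_mean)    # 5.0
print(glyph_mean)  # 4.0
```

Even on this toy sample the mean drops by a whole character, so any word-length distribution should say which of the two counts it is using.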

And as for verbose cipher candidate pairs (like "qo", "ee", "dy", "ol", "or", to name but five), these appear so frequently that, if the VMs *is* a cipher, then many of our most cherished statistical insights (as to letter adjacency, letter frequency, apparent vowels/consonants, word length, etc) may be largely worthless.

Unfortunately, the "why-not-use-EVA" (ie "the-map-is-the-territory") assumption is being actively used by a number of people (most notably Jeff, but I suspect others as well), which (I'm afraid) probably only serves to add noise to the overall signal here. :-(

Of course, any statistical analysis of the VMs' text should detail the exact transcription and glyph-set used for pre-processing (for reproducibility), as this is a key modelling assumption - but this important step occasionally gets overlooked in the excitement. :-(
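
One lightweight way to avoid overlooking it is to emit a small record alongside every set of results. The Python sketch below is only a suggestion: the function name `describe_run`, the field names and the example label are all hypothetical, and the hash is simply a way to pin down exactly which text was analysed.

```python
import hashlib
import json


def describe_run(transcription, glyph_set, text):
    """Build a record of the modelling assumptions behind one analysis run."""
    return {
        "transcription": transcription,  # which transcription was used
        "glyph_set": sorted(glyph_set),  # groups treated as single glyphs
        # Fingerprint of the exact input text, so others can check they match.
        "text_sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
    }


# Example values are placeholders, not a real dataset.
record = describe_run("EVA (example label)", {"ch", "sh", "cfh"}, "chedy daiin")
print(json.dumps(record, indent=2))
```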

Cheers, .....Nick Pelling.....

