# Re: VMs: An improved version of the voynichese-coding

```Hello John,

and thank you very much for your improving comments!

I have been planning to make some kind of logical VMs-page instead of the
somewhat confusing mix of notes that started as a personal notes file for
myself only. I'll make the improvements as soon as I have time
(unfortunately, I'm now very busy with the job I'm paid for...) and also
delve into your many suggestions then. I'll only briefly answer some of
them here:

> There are several things about your encryption algorithm that puzzle me.
> First, it appears that the formula for representing a letter is
>
> row-label table-label offset
>
> The row-label is omitted if it is null, i.e., for the first row.
> The table-label (a gallows) is clearly included in the first letter
> encoded with a table as in 0-p-chee to encode the first e (might help to
> number your substitution examples for discussion purposes).
>
> But sometimes thereafter it seems to me that it is randomly included in
> contravention of this principle, as in o-p-chee for the first e of kelley
> when we are already in the p-table, and then in o-p-aii for the first l in
> kelley immediately thereafter.
>
Yes, it could be omitted or included, so as to give the impression that
p is just a frequent letter.

> It also appears that in giving the offsets that you can start either at
> the first column of the body of the table, or at the first column of the
> second half of the table (column 7, if we start numbering columns at 1).
> This principle seems to be applied regularly.
>
Yes, this is what I mean.

> However, sometimes you omit ch from the start of the offset in the first
> part of the table, as in o-*p-*ch-e (or oe) for the first d in edvard or
> d-*k-*cheeee (deeee) for r in edvard.
>
This is again intentional. The point of all this is to keep the sequences
from revealing themselves by repeating too often.

> It seems to me that you could reformulate your tables as follows:
>
> Write your alphabet down the edge of the paper.  Opposite each letter
> write the ways of encoding it - 2 occurrences per table x 4 tables.
>
> For example
>
> a 0-k-ch       o-k-ch
>   qo-t-aiiiii  d-t-aiiiii
>   qo-p-aiiii   d-p-aiiii
>   qo-f-aiii    d-f-aiii
>
> b 0-k-che      o-k-che
>   0-t-ch       o-t-ch
>   qo-p-aiiiii  d-p-aiiiii
>   qo-f-aiiii   d-f-aiiii
>
> etc.
>
> (A cryptographical defect of this scheme is that all letters, regardless
> of their frequency, have the same number of representations, so, while the
> scheme is complex, it doesn't suppress frequencies.  The handling of the
> ligatures below does modify this contention somewhat.)
>
> The idea is that you can omit the table-label (gallows) if you stay in the
> same table, but have to include it to indicate a switch of tables.  You
> have eight encodings for each letter.
>
> You have also introduced a new scheme whereby changing
>
> a => y
> e => s
> ch => sh
> i => n, l, r, or m
>
> allows you to omit also the row-label, so that opchee opai (e l) can be
> rendered opchesain.  Actually, I'd make this opchee oaii already, since
> the p wasn't needed as long as we were in the same table.
>
> Apparently we can still (or must) change the last character in the
> encoding to the ligatured form at the end of a word.  We can run encodings
> together into a word or divide them as we like.
>
We can, if we like, but we must only if the next word does not start with
qo, o, d or a gallows.

> But there are final e, a, and i in the texts:  chokoishe, odshe, ska,
> ykyka, orai, aiidalaii.
>
Yes, it is allowed if the next word starts with qo, o, d or gallows.

> I assume we don't really need a table-label or gallows in every word,
> though you did this in the edvard kelley example.  A table-label can hold
> across words.  In fact, it holds until a new one crops up.
>
Yes, this is what I mean.

> I can see some problems with this scheme in terms of producing Voynichese.
>
> - I think, on further consideration, and in spite of my earlier
> suggestion, it wouldn't make any sense to see embedding or split gallows
> (gallows with other characters embedded within them) as distributing a
> table marker across a section of encodings, because the relevant zone of
> application of a gallows table-label follows it, and even a split gallows
> has a following region.  So what is the internal region?
>
This split gallows thing has to be considered; actually, for now I just take
them as 'red herrings'. Or, maybe there are 8 tables :-))

Actually, I originally started by considering the question: why do VMs
letters seem to include the numbers 0, 2, 4 (EVA o, r, l, OR q if l is X=10)
and 8, but no 1, 3, 5 or 7? If these are present, they are Roman numerals,
not Arabic. Of course, there is 9, so everything is very confusing, but one
has to start somewhere...
Also, there are 4 (or 8) gallows, two of which have 2 legs and two of which
have only 1, a detail which also hints at some kind of sequential or
numbering system, or at least some kind of logic behind all the
gibberish-looking repetitions...

> - I believe that this approach does not explain the decreasing frequencies
> of the longer sequences of e, i, and ai
>
Yes, this is a problem, but it can be diminished by changing the order of
the plaintext alphabet (now I have them almost in the same arrangement in
all the tables; only shifted by one).

> - It doesn't generate oo sequences, which do occur, e.g., ooiin, kooiin.
>
It does, actually, since one can omit all the letters after o except the
iii-sequence. And we can also write many o's and it only means one, since
the first one is null (or the first two, in e.g. ooo).

> - It doesn't generate final o, as in cheo, o, kcho.
>
If the word spaces are arbitrary, this can be done by inserting a space
between the o and the following gallows that the o belongs to,
e.g. che opaiin --> cheo paiin

> - It doesn't explain e-sequences without leading ch, though I am not sure
> I understand your ch-omission rule.
>
As I have explained above, the ch does not necessarily have to be written in
front of the following letters, but it can be written. The preceding letters
must be written only when their number is significant: in the e- and
i-sequences. I.e., we can write e.g. okiii as
okaiii
okchiii
okcheiii
okcheeiii
okchaiii
okcheaiii
and so on, just as long as the sequence does not terminate with qo, o, d,
a gallows or a ligature, i.e. okshaiii would be 2 letters, oksh + aiii.

> - The same for i-sequences without leading a, with the same caveat
> regarding a possible a-omission rule, so it doesn't generate o, etc., + i
> or i-ligatures, without leading a, e.g., chor, chol, qotychor, olaen,
> qooko.iiincheom, dld.iir, qokyshey.ithey.
>
All these can, actually, be written in my scheme, e.g. qooko iincheom
= qookoiiin che om = qookoiin che oi, where qoo are nulls and n, which
is a ligature, starts a new sequence from the first row of the k table.

> - Except as random elaboration it doesn't explain initial sh, as in sho,
> shol, sheky.  And actually, final sh is rather rare - only sh and ash,
> once each.
>
> - By making word boundaries somewhat arbitrary it is out of step with
> various evidence (summarized recently by Stolfi and Landini) that they are
> not.
>
This and the preceding comment are the most serious drawbacks in my scheme;
it has become clear that one of the most important features of Voynichese is
the binomial distribution of word lengths (shown by Stolfi). It will be
interesting to test what kind of distribution my Voynichese-lookalike
produces. Unfortunately, it will have to wait unless someone else is willing
to test that... (too bad that there are no scholarships for VMs-research..)