RE: VMs: truncated repeating sequences

Yes, the phrases were a bit contrived - but a more common letter sequence might have seemed less artificial. Whether this sort of thing is compatible with your observations depends in part on how many such sequences there are in the MS. I look forward to seeing your results - it's a very interesting piece of work.
FWIW I also think it might tie in with the way Jeff believes the MS to have been encrypted - but I'll leave it up to him to comment on that if he's reading this thread.
I am impressed with your example, but you've hand-picked 17 quite specific phrases
which together exhibit the phenomenon.  If you weren't specifically trying to create the
effect, I still find it unlikely to happen naturally.
So far I have found 10 related sets and I'm still picking them out manually.   Shortly
I will add some code to look for them and we'll see how many I can find in total.
The next step is to see if >99% of the VMs can be created by pasting together
decent sized chunks taken from a small set of "master sequences".  After all, this
would be like being able to create a small novel from phrases of the first page.
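
A rough sketch of how that coverage test could look, in Python. The function names, the minimum chunk length k and the sample master sequence are all assumptions made for illustration, not anything from the actual analysis. It is also a looser check than true "pasting": it counts a character as covered if it lies inside any k-character window that also occurs in a master sequence, so overlapping chunks are allowed and the figure is an upper bound.

```python
def kgrams(s, k):
    """All distinct substrings of length k in s (empty set if s is shorter than k)."""
    return {s[i:i + k] for i in range(len(s) - k + 1)}


def coverage(text, masters, k):
    """Fraction of characters of text lying inside some k-character window
    that also occurs in at least one of the master sequences."""
    pool = set()
    for m in masters:
        pool |= kgrams(m, k)

    covered = [False] * len(text)
    for i in range(len(text) - k + 1):
        if text[i:i + k] in pool:
            for j in range(i, i + k):
                covered[j] = True

    return sum(covered) / len(text) if text else 0.0


if __name__ == "__main__":
    # Hypothetical master sequence and test string, for illustration only.
    masters = ["this is an experiment"]
    print(coverage("this is an expert", masters, 4))
```

Raising k makes the test stricter; a result near 1.0 with a small set of masters and a largish k would be the interesting case.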
Hi Marke,

Don't get me wrong, I'm not attacking or dismissing your work - in fact I think it is very interesting. I'm just wondering what it could imply or suggest about the way the VMS text was generated.

Given that the VMS text often appears to be made of symbol pairs/sequences (or, ol, iin, ch, cPh, etc.) rather than individual symbols, we could imagine a plaintext sequence taking around half the number of characters seen in the VMS text. (Also, spaces in the VMS could simply be indicators that the Markov/state machine is in a particular state [to ease decryption], rather than corresponding to plaintext word breaks.) So your example consisting of 20 EVA chars might correspond to a 10-character plaintext sequence, which I feel is a little more plausible.

Take the flask
THrow away the residue
THInk about the results
THIStles are prickly
THIS Item is expensive
THIS IS not important
THIS IS A short example
THIS IS AN article of faith
THIS IS AN Endothermic reaction
THIS IS AN EXample of encryption
 HIS IS AN EXPonential curve
ourS IS AN EXPEdition of discovery
that IS AN EXPERt opinion
perform AN EXPERIment
a pleasant EXPERIence 
 a beautiful PERIwinkle
 to study the wRIting

Well, it was off the top of my head, but I'm sure you get the idea!
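
For anyone wanting to look for this kind of staircase automatically, here is a minimal sketch in Python. It is only one possible way of doing it: the names truncation_profile and min_len are made up for illustration, and it handles only the growing left-hand side of the pattern (the phrases that start with TH, THI, THIS, ...). The eroding right-hand side could be examined by running the same function on reversed strings.

```python
def common_prefix_len(a, b):
    """Number of leading characters that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def truncation_profile(text, master, min_len=3):
    """For every place in text where the first min_len characters of master
    occur, record how far the match with master continues before breaking off.
    Many different lengths suggest a 'truncated sequence' staircase."""
    seed = master[:min_len]
    lengths = []
    start = text.find(seed)
    while start != -1:
        window = text[start:start + len(master)]
        lengths.append(common_prefix_len(window, master))
        start = text.find(seed, start + 1)
    return sorted(lengths)


if __name__ == "__main__":
    # Lower-cased versions of a few of the phrases above, joined into one text.
    sample = "throw away the residue. think about the results. this is a short example."
    print(truncation_profile(sample, "this is an experiment", min_len=2))
```

The text and the master need to be in the same case before comparing, since the capitals in my example are only there to mark the matching part.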


> A Markov chain system that is driven by the next plaintext
> character is really just a polyalphabetical cypher where
> the "state" of what just happened determines which alphabet
> to use next.   And that means that to get these sort of
> repeated subsequences in the coded text a similar pattern
> would have to exist in the plaintext.
> > Did anybody run a comparable test on plaintext language,
> > to see if the "truncated sequences" pattern would show up
> > there as well?
> Are you saying that you might expect a set of repeating strings
> like below to occur in a 'sensible' english text???
> le t
> le te
> le tes
> le test o
> le test on
> le test on 
> le test on p
> le test on pla
> le test on plain
> le test on plaint
> le test on plaintex
> le test on plaintext
> le test on plaintext l
> le test on plaintext lang
>    test on plaintext lang
>     est on plaintext lang
>      st on plaintext lang
>         on plaintext lang
>         on plaintext lang
>            plaintext lang
>             laintext lang
>                ntext lang
>                  ext lang
> You can call me presumptuous or dogmatic but I say that
> they don't.  :->
> Marke
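
The point about a state-driven system behaving like a polyalphabetic cipher can be illustrated with a toy model in Python. Everything here is an assumption chosen for simplicity: the state is just the previous plaintext character, each state selects a Caesar-shifted alphabet, and the alphabet is lower-case letters plus space. It is not a proposal for how the VMS was actually produced.

```python
import string

# Toy alphabet (an assumption): lower-case letters plus space.
ALPHABET = string.ascii_lowercase + " "


def encrypt(plain, start_state=" "):
    """Encipher plain with a cipher alphabet chosen by the current state.
    The state is the previous plaintext character, so each output character
    depends only on the pair (previous char, current char).
    Raises ValueError for characters outside ALPHABET."""
    out = []
    state = start_state
    for ch in plain:
        shift = ALPHABET.index(state) + 1  # the state picks the alphabet
        out.append(ALPHABET[(ALPHABET.index(ch) + shift) % len(ALPHABET)])
        state = ch  # move to the state named by the character just enciphered
    return "".join(out)


if __name__ == "__main__":
    phrase = "this is an exp"
    a = encrypt("xx " + phrase)
    b = encrypt("yyyy " + phrase)
    # The phrase follows a space in both texts, so it starts from the same
    # state and its ciphertext comes out identical in both.
    print(a[-len(phrase):] == b[-len(phrase):])
```

In this model a repeated stretch of ciphertext can only come from a repeated stretch of plaintext entered from the same state, which is the sense in which the pattern would already have to exist in the plaintext.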

