Imagine English written in the *ultimate* deficient alphabet:
only one letter!
The pattern above (Cv, cvc vc vcc. Vcvcvcv v ... etc., etc.) then becomes:
xx xxx xx xxx xxxxxxx x
The original question was: what is the probability of
finding the same word exactly P positions apart?
The new question is: what is the probability of
finding the _same-length_ word exactly P positions
apart?
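
For anyone who wants to try the new question on a text, here is a
minimal sketch in Python. It assumes the text has already been split
into words on whitespace. The function names (same_length_rate,
chance_rate) are made up for illustration, and the "chance" baseline
assumes word lengths are drawn independently, which a real language
need not satisfy.

```python
from collections import Counter


def same_length_rate(words, p):
    """Fraction of positions i where words[i] and words[i + p] have equal length."""
    pairs = len(words) - p
    if p < 1 or pairs < 1:
        raise ValueError("need 1 <= p < len(words)")
    hits = sum(1 for i in range(pairs) if len(words[i]) == len(words[i + p]))
    return hits / pairs


def chance_rate(words):
    """Baseline: probability that two independently drawn words have equal length."""
    counts = Counter(len(w) for w in words)
    n = len(words)
    return sum((c / n) ** 2 for c in counts.values())


# The one-letter-alphabet sample from above.
sample = "xx xxx xx xxx xxxxxxx x".split()
for p in range(1, 4):
    print(p, same_length_rate(sample, p))
print("chance", chance_rate(sample))
```

Six words are far too few to mean anything; the point is only that
the rate at each P can be compared against the chance baseline.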
If the language forbids the same word from occurring twice
in a row, I think that constraint will still show up in the
same-length statistics (presumably as fewer matches than
expected at P = 1).
Does anyone care to comment on this hypothesis before I test
it?