Diringer and 's "imprecision" and copy(-daiin) was: intercultural artefact
26/01/02 01:11:20, "Rafal T. Prinke" <rafalp@xxxxxxxxxx> wrote:
>The interesting thing about it is that Diringer says it is
>"imprecise", ie. there are the same letters for different
>[sounds], so that different words are spelt the same. This would
>probably affect the copy(-x) stats so interestingly explained
No, not at all. Imagine a writing system in which the same letter
is used for all the consonants, and another letter for all
the vowels. Thus:
Cv, cvc vc vcc. Vcvcvcv v ... etc., etc.
You will observe far fewer different words, but the
probability of finding the same word exactly P positions
apart is still given by the same expression: (n-1)/(N-1),
where n is the number of occurrences of this word in the
text, and N the number of words in the text.
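
A minimal sketch of the two quantities being compared, assuming
copy(-P) is measured as the fraction of positions whose word equals
the word P positions earlier (my reading of the statistic, not a
definition given in this thread). The function names are
hypothetical. The chance figure averages (n-1)/(N-1) over all
positions, i.e. it weights each word by its frequency n/N.

```python
from collections import Counter

def copy_rate(words, p):
    """Observed fraction of positions i >= p where words[i] == words[i - p]."""
    pairs = len(words) - p
    if pairs <= 0:
        return 0.0
    hits = sum(1 for i in range(p, len(words)) if words[i] == words[i - p])
    return hits / pairs

def chance_rate(words):
    """Expected match rate if word order were random.

    A word with n occurrences matches with probability (n-1)/(N-1);
    weighting by its frequency n/N and summing over all words gives
    sum(n * (n-1)) / (N * (N-1)). The value does not depend on P.
    """
    total = len(words)
    if total < 2:
        return 0.0
    counts = Counter(words)
    return sum(n * (n - 1) for n in counts.values()) / (total * (total - 1))
```

If copy_rate(words, 1) comes out well below chance_rate(words),
the text avoids immediate repetition more than a random ordering
of the same words would.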
I even think that, in the case of English (and most European
languages), we will still see that the frequency of copy(-1),
that is, the same word occurring twice in a row, will be
quite low: not as low as in properly spelt English, but
still significantly lower than chance would predict. I don't
feel like testing it right now, but here is a thought experiment.
Imagine English written in the *ultimate* deficient alphabet:
only one letter!
The above Cv, cvc vc vcc. Vcvcvcv v ... etc., etc., becomes:
xx xxx xx xxx xxxxxxx x
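
The two deficient alphabets can be sketched as follows. This is
only an illustration: the function names are hypothetical, and
treating "y" as a consonant is my assumption. The input used in
the usage note is the opening of this reply, which seems to be
what the Cv example encodes.

```python
VOWELS = set("aeiou")  # assumption: 'y' is treated as a consonant

def to_cv(text):
    """Replace each vowel with v and each consonant with c.

    Case, spaces and punctuation are kept as they are.
    """
    out = []
    for ch in text:
        if not ch.isalpha():
            out.append(ch)
            continue
        symbol = "v" if ch.lower() in VOWELS else "c"
        out.append(symbol.upper() if ch.isupper() else symbol)
    return "".join(out)

def to_one_letter(text):
    """Drop punctuation and replace every letter with x.

    Only the word lengths survive this transformation.
    """
    words = ("".join(ch for ch in tok if ch.isalpha()) for tok in text.split())
    return " ".join("x" * len(w) for w in words if w)
```

to_cv("No, not at all. Imagine a") gives
"Cv, cvc vc vcc. Vcvcvcv v", and to_one_letter on the same input
gives "xx xxx xx xxx xxxxxxx x".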
The original question was: what is the probability of
finding the same word exactly P positions apart?
The new question is: what is the probability of
finding a _same-length_ word exactly P positions apart?
If the language forbids the same word occurring twice
in a row, I think it will still show in the statistics.
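
One way the test might be run, as a sketch only: reduce the text
to word lengths (all that survives a one-letter alphabet), measure
how often a length repeats P positions later, and compare that
with the average over random shuffles of the same lengths. The
function names, the number of trials and the fixed seed are all
my assumptions.

```python
import random

def word_lengths(text):
    """Word lengths after stripping non-letters from each token."""
    cleaned = ("".join(ch for ch in tok if ch.isalpha()) for tok in text.split())
    return [len(w) for w in cleaned if w]

def match_rate(seq, p=1):
    """Fraction of positions i >= p where seq[i] == seq[i - p]."""
    pairs = len(seq) - p
    if pairs <= 0:
        return 0.0
    return sum(1 for i in range(p, len(seq)) if seq[i] == seq[i - p]) / pairs

def shuffled_rate(seq, p=1, trials=200, seed=0):
    """Average match rate over random reorderings of seq (chance baseline)."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    work = list(seq)
    total = 0.0
    for _ in range(trials):
        rng.shuffle(work)
        total += match_rate(work, p)
    return total / trials
```

If match_rate(word_lengths(text), 1) is clearly below
shuffled_rate(word_lengths(text), 1) for a long text, the
avoidance of repeated words would still be showing through the
word lengths alone.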
Does anyone care to comment on this hypothesis before I test it?