
Re: VMs: RE: Nabataean /sefira?



With models which involve direct encoding of natural language, we keep coming
back to the twin problems of (a) the high number of repeated "words" and (b) the
low number of repeated "phrases". The first of these can conceivably be explained
as an artefact of apparent versus real divisions between "words", but the second
should be unaffected by apparent points of division between words - if anything,
extra divisions would increase the number of "words" forming repeated "phrases"
rather than decrease it (if a single repeated "word" were really two "words" run
together, each repetition of it would then also count as a repeated two-word
"phrase").

One simple test, which as far as I know hasn't been performed, is to see whether
the probability of a given pair of "words" occurring together in the VMS is
significantly different from chance. So, for instance, if the probability of the
word "qoteedy" occurring is 1/100, is the probability of "qoteedy qoteedy"
significantly different from 1/10,000? The analysis would need to be treated
with some caution, since character distributions in the VMS are not completely
random with regard to position in a line. It would also need to look at
frequencies across the whole set of duplicated words, rather than cherry-picking
individual pairs, but it should give an order-of-magnitude figure.
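To make the counting concrete, here is a rough Python sketch of what I have in
mind. It assumes a plain-text transliteration (e.g. an EVA file) with one
manuscript line per text line and spaces between "words"; the file name below is
only a placeholder. Pairs are counted within lines only, which at least partly
sidesteps the line-position effects mentioned above.

from collections import Counter

def adjacent_pair_ratios(lines):
    """Compare observed counts of adjacent "word" pairs (counted within
    lines only) with the counts expected if words occurred independently."""
    words = [w for line in lines for w in line.split()]
    n_words = len(words)

    unigrams = Counter(words)
    bigrams = Counter()
    n_slots = 0                      # number of adjacent-pair positions
    for line in lines:
        toks = line.split()
        bigrams.update(zip(toks, toks[1:]))
        n_slots += max(len(toks) - 1, 0)

    results = []
    for (w1, w2), observed in bigrams.items():
        # Expected count under independence: P(w1) * P(w2) * number of slots
        expected = (unigrams[w1] / n_words) * (unigrams[w2] / n_words) * n_slots
        results.append((w1, w2, observed, expected, observed / expected))
    # Pairs most strongly above chance come first
    return sorted(results, key=lambda r: r[4], reverse=True)

# Usage (the file name stands in for whatever transliteration is to hand):
# with open("voynich_eva.txt") as f:
#     lines = [line.strip() for line in f if line.strip()]
# for w1, w2, obs, exp, ratio in adjacent_pair_ratios(lines)[:20]:
#     print(f"{w1} {w2}: observed {obs}, expected {exp:.2f}, ratio {ratio:.1f}")

The ratio column would show, pair by pair, how far above or below the chance
expectation each repeated pair sits, which is exactly the order-of-magnitude
figure referred to above.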

The results from this would seriously constrain the possible explanations for
how the manuscript was constructed.

Best wishes,

Gordon

