VMs: Spaces



--- Nick Pelling <incoming@xxxxxxxxxxxxxxxxx> wrote:

(I concluded:)

> >So, I don't see anything particularly suspicious in
> >the VMs spaces.
> 
> Unless you can compare data-sets across a
> time-dimension, statistics talk 
> of correlation, not of causality. 

We'll have to wait with the causality bit until
after we've deciphered the MS, so correlation is
the best we can do. The Sukhotin algorithm
works with correlations and, for languages whose
script behaves sufficiently well, is capable of
coming up with the right answer. (English is not
one of them, mainly because it uses 'th' and 'sh'
to represent single phonemes.)
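
For reference, a minimal sketch of the Sukhotin
vowel-identification procedure as it is usually
described: count how often each pair of different
symbols occurs side by side inside words, then
repeatedly promote the symbol with the largest
remaining adjacency sum to "vowel". The function
name, the one-character-per-symbol assumption and
the alphabetical tie-breaking are mine, for
illustration only.

```python
from collections import defaultdict

def sukhotin_vowels(words):
    """Return the set of symbols classified as vowels (illustrative sketch)."""
    # Symmetric adjacency counts; pairs of identical symbols are ignored,
    # which corresponds to zeroing the diagonal of the matrix.
    adj = defaultdict(lambda: defaultdict(int))
    for w in words:
        for a, b in zip(w, w[1:]):
            if a != b:
                adj[a][b] += 1
                adj[b][a] += 1

    symbols = {ch for w in words for ch in w}
    sums = {s: sum(adj[s].values()) for s in symbols}

    vowels = set()
    while True:
        # Everything not yet promoted is still treated as a consonant.
        candidates = {s: v for s, v in sums.items() if s not in vowels}
        if not candidates:
            break
        # sorted() makes ties resolve alphabetically (an arbitrary choice).
        best = max(sorted(candidates), key=lambda s: candidates[s])
        if candidates[best] <= 0:
            break
        vowels.add(best)
        # Neighbours of the new vowel become less vowel-like.
        for s in candidates:
            if s != best:
                sums[s] -= 2 * adj[s][best]
    return vowels

print(sorted(sukhotin_vowels(["ata", "iki", "atika", "ikata"])))  # ['a', 'i']
```

Note that a digraph such as 'th' is seen here as
two separate symbols, which is exactly the problem
with English mentioned above.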

Anyway:
I'd like to stress that what I have analysed
is the original reason why people started
to wonder about spaces: are they actually like
word spaces in a statistical sense? Well, it can
be shown that they are. There is no statistical
evidence that they have just been inserted after
certain standard character shapes.

Having shown that the original premise is wrong,
and observing that label words occur in the text
separated by spaces on both sides, I cannot find
a single good argument for doubting the meaning
of spaces in the VMs. 
Note also that:
- word length distribution is reasonable
- vocabulary size is reasonable
- word frequency distribution is reasonable
- Mark Perakh has shown that long-range word
  correlations are normal
- Gabriel has done analysis showing that the
  average word length can be computed even if
  spaces are removed

All of this would be affected if the spaces were
not word spaces.
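
The first three items in the list are simple
counts over a space-delimited text. A minimal
sketch, assuming the transcription has already been
reduced to words separated by whitespace; the
function name and the sample line are made up for
illustration and are not taken from any particular
transcription file.

```python
from collections import Counter

def word_stats(text):
    """Basic word statistics for a whitespace-delimited text (sketch)."""
    words = text.split()
    freq = Counter(words)
    lengths = Counter(len(w) for w in words)
    return {
        "tokens": len(words),                        # running words
        "vocabulary": len(freq),                     # distinct words
        "length_dist": dict(sorted(lengths.items())),
        "ranked": freq.most_common(),                # for rank/frequency plots
    }

sample = "daiin ol daiin chedy ol daiin"             # invented sample line
stats = word_stats(sample)
print(stats["tokens"], stats["vocabulary"], stats["length_dist"])
# 6 3 {2: 2, 5: 4}
```

Running the same counts on a text where the spaces
have been moved or removed is one way to see how
strongly these statistics depend on them.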

About 'half spaces': they are a bit of a pain.
When Gabriel and I transcribed, we preferred to
call them 'uncertain spaces'. The official EVA
symbol for them is a comma. It is probably fair
to say that when transcribing one is tempted
to decide one way or the other, based on
'experience' with what are VMs words and what are
not.
It is easy to argue that this is bad practice,
but just think how in everyday life one reads
handwritten text (think of something you wrote
hastily five years ago) and you will see that
this very mechanism is required all the time.
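
One way to avoid deciding while transcribing is to
keep the comma in the file and resolve it only at
analysis time, once as a space and once as no
space, and compare the results. A minimal sketch,
assuming '.' marks a certain word break and ','
an uncertain one; the function name and the sample
line are hypothetical.

```python
def resolve_uncertain_spaces(line, as_space):
    """Split a transcribed line into words (illustrative sketch).

    Assumes '.' is a certain word break and ',' an uncertain one.
    as_space=True treats every ',' as a break; False joins across it.
    """
    line = line.replace(",", "." if as_space else "")
    return [w for w in line.split(".") if w]

line = "daiin.ol,chedy"                        # invented sample line
print(resolve_uncertain_spaces(line, True))    # ['daiin', 'ol', 'chedy']
print(resolve_uncertain_spaces(line, False))   # ['daiin', 'olchedy']
```

If a statistic comes out much the same under both
readings, the uncertain spaces are not what is
driving it.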

Cheers, Rene
