Re: VMs: Some methods - do you know them ?
On Fri, 31 Oct 2003, Jacques Guy wrote:
> 30/10/2003 11:31:00 PM, "PK#01" <pklist01@xxxxxxxxx> wrote:
>
> >I'm reading "Deciphering the Indus Script" by Asko Prpola and on pages
If he is Finnish, could it be "Orpola"? On the keyboard "o" is next to "p",
and "prp" is an impossible word beginning in Finnish.
> >97-100 he mentions some methods for attacking relationships within unknown
> >languages:
>
> >- the syntagmatic approach that studies the probabilities of sign
> >co-occurrence
>
> Hmm...
>
> >- an approach based on the work by Zellig Harris
>
> I can only think of what Harris proposed for the segmentation
> of continuous text into its constituent morphs. Let me call it
> "the surprise factor". You read an English text. So far, you
> have read "fea". You are not very surprised when the next letter
> turns out to be 't': "feat". You are even less surprised when
> the next is 'h': "feath", and from then on, there are no surprises
> until you have read "feather". From then on, you are no longer
> so sure. The next letter could be an 's', or it could be
> anything, being the beginning of another word. That is what
> Harris wrote, just translated into plain, pedestrian English.
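
A minimal sketch of the "surprise factor" described above, in Python. It is
only one possible reading of the idea, not Harris's own procedure: for every
prefix it counts how many distinct letters have been seen to follow it in a
word list. A high count after a prefix means the next letter is hard to
predict, which hints at a boundary. The word list, the "#" end-of-word marker
and the function names are all made up for this illustration.

```python
from collections import defaultdict


def successor_variety(words):
    """Map each prefix to the set of symbols seen immediately after it."""
    followers = defaultdict(set)
    for word in words:
        for i in range(len(word)):
            followers[word[:i]].add(word[i])
        followers[word].add("#")  # "#" marks "a word may end here"
    return followers


def variety_profile(word, followers):
    """Count the distinct continuations after each prefix of `word`."""
    return [len(followers[word[:i]]) for i in range(1, len(word) + 1)]


# Toy word list, invented for illustration only.
words = ["feat", "feather", "feathers", "feathered", "featherbed"]
followers = successor_variety(words)
profile = variety_profile("feathers", followers)

# Pair each prefix with its count; the largest count marks the likeliest cut.
for i, count in enumerate(profile, start=1):
    print("feathers"[:i], count)
```

On this toy list the count stays at 1 through "fea", rises after "feat"
(itself a word), falls back to 1 for "feath" and "feathe", and is highest
after "feather", where the text could go on in several ways.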
>
> >- the paradigmatic approach based on the degree of similarity between two
> >signs - he gives an example that finds synonyms and antonyms in an English
> >text
>
> You mean, I suppose (I haven't read Parpola's book) the degree of
> similarity between the contexts of two signs? I tried that many
> years ago (about 20-25 years) on other data, and it worked nicely.
> Stephen (or Steven?) Finch did it too in his PhD thesis, on
> computer-generated text, and it worked nicely too.
> But to do it on the VMs, you need to segment the text first,
> which no-one has done. I did try a text segmentation algorithm,
> in 1977 or so, and it worked... sort of. Not reliably enough,
> though. It was based on Sukhotin's segmentation algorithm,
> but I had added a measure of the "surprise factor" which
> relied on the hypergeometric distribution (I remember having
> reinvented Pascal's triangle for the purpose).
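
A rough sketch of the context-similarity idea mentioned above, assuming the
text has already been segmented into tokens. It is not the method used by
anyone named in this thread: it simply represents each token by counts of its
immediate left and right neighbours and compares two tokens by the cosine of
those count vectors. The sample sentence, the "<s>" and "</s>" edge markers
and the function names are invented for the illustration.

```python
from collections import Counter
from math import sqrt


def context_vectors(tokens):
    """Count the left and right neighbours of every token."""
    vectors = {}
    for i, tok in enumerate(tokens):
        left = tokens[i - 1] if i > 0 else "<s>"
        right = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
        vec = vectors.setdefault(tok, Counter())
        vec[("L", left)] += 1   # neighbour on the left
        vec[("R", right)] += 1  # neighbour on the right
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Toy text, invented for illustration only.
tokens = ("the hot day was long the cold day was long "
          "the hot night was short").split()
vectors = context_vectors(tokens)
sim_hot_cold = cosine(vectors["hot"], vectors["cold"])
sim_hot_was = cosine(vectors["hot"], vectors["was"])
print(sim_hot_cold, sim_hot_was)
```

Here "hot" and "cold" share their contexts and score high, while "hot" and
"was" share none and score zero. Note that antonyms come out as similar as
synonyms would, because they occur in the same surroundings.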
>
> >Before I start looking into the algorithms (they're not in the book, I have
> >to dig deeper in the literature) I would like to know if someone ever tried
> >this on the VMS.
>
> You'll find all the necessary algorithms in the archives, adapted from
> the French (by yours faithfully) themselves translated from the Russian.
> Well, I think, they are all there. It's a long time ago.
>
>
>
> ______________________________________________________________________
> To unsubscribe, send mail to majordomo@xxxxxxxxxxx with a body saying:
> unsubscribe vms-list
>
--
contact info and phone numbers:
http://www.ehi.ee/~mesinik