
Re: identify a text's author or language

To: voynich@xxxxxxxx
Subject: Re: identify a text's author or language
From: Gold residence <gold@xxxxxx>
Date: Tue, 29 Jan 2002 02:34:57 -0500
In-reply-to: <200201292317.g0TNHd228839@mail2.alphalink.com.au>

At 11:01 PM 1/29/02 +0000, Jacques Guy wrote:

29/01/02 11:50:17, "Anders, Claus" <Claus.Anders@xxxxxxxxxxxxx> wrote:


>1. Take any text greater than n bytes and compress it with ZIP; this is the "known" text.
>2. Add more text and compress it too; the added part is the "unknown" text.
>3. Compare the difference in length of the compressed text from steps 1 and 2. If you
>get a minimum difference, they claim, the "unknown" text is derived from
>the "known" text's language or even from the same author.
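The three steps quoted above can be sketched roughly as follows. This is only an illustration, assuming Python's zlib (DEFLATE, the method ZIP normally uses) as a stand-in for "ZIP"; the function names compressed_size, compression_delta and closest_corpus are made up for this sketch and come from nobody's published method.

```python
import zlib
from typing import Mapping


def compressed_size(data: bytes) -> int:
    """Length in bytes of the DEFLATE-compressed data (maximum compression level)."""
    return len(zlib.compress(data, 9))


def compression_delta(known: bytes, unknown: bytes) -> int:
    """Steps 1 to 3: extra compressed bytes needed once `unknown` is appended to `known`."""
    return compressed_size(known + unknown) - compressed_size(known)


def closest_corpus(corpora: Mapping[str, bytes], unknown: bytes) -> str:
    """Name of the known text that yields the minimum difference for `unknown`."""
    return min(corpora, key=lambda name: compression_delta(corpora[name], unknown))
```

Note that this only ranks candidate "known" texts against each other; it does not say how small the difference must be in absolute terms. DEFLATE also looks back only 32 KB, so for a long "known" text just its tail influences how the appended part compresses.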

I would say "congruent with" or "drawn from the same corpus", rather
than "derived from". But this is nit-picking.

I'd agree. It would also be useless with material that passed from physical or oral transmission into text, or that is based on something secret or esoteric, e.g. Carlos Castaneda or L. Ron Hubbard.

The question: how small is "minimum"?

I would also say that producing the zipped files is unnecessary, and,
in fact, amounts to throwing out a great deal of information, since
you end up with a single figure. It would be far more informative
to compare the two Huffman trees computed in the first stage of
the algorithm.
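One possible reading of "compare the two Huffman trees", as a rough sketch only: build a Huffman code for each text from its raw byte frequencies and compare the code lengths symbol by symbol. This is an assumption for illustration. ZIP's DEFLATE builds its trees over LZ77 literals and match lengths rather than raw bytes, and the names huffman_code_lengths and code_length_distance, as well as the penalty for symbols missing from one text, are invented here.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code_lengths(data: bytes) -> dict:
    """Map each byte value in `data` to the length of its Huffman code, in bits."""
    if not data:
        raise ValueError("data must not be empty")
    freq = Counter(data)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    tiebreak = count()  # keeps heap entries comparable when frequencies are equal
    heap = [(n, next(tiebreak), [sym]) for sym, n in freq.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freq, 0)
    while len(heap) > 1:
        n1, _, syms1 = heapq.heappop(heap)
        n2, _, syms2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for sym in syms1 + syms2:
            lengths[sym] += 1
        heapq.heappush(heap, (n1 + n2, next(tiebreak), syms1 + syms2))
    return lengths


def code_length_distance(a: bytes, b: bytes) -> float:
    """Mean absolute difference in Huffman code length over all symbols seen in a or b."""
    la, lb = huffman_code_lengths(a), huffman_code_lengths(b)
    # Assumed penalty: a symbol absent from a text counts as one bit longer
    # than that text's longest code.
    missing_a = max(la.values()) + 1
    missing_b = max(lb.values()) + 1
    symbols = set(la) | set(lb)
    total = sum(abs(la.get(s, missing_a) - lb.get(s, missing_b)) for s in symbols)
    return total / len(symbols)
```

Unlike the single compressed-size figure, the per-symbol code lengths show which symbols account for the difference between two texts.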

(All this is off the top of my head, before I forget it)

I like the sentence structure analysers. The shareware ones are adequate, but I'd like to have the industrial-grade ones used by the three-letter agencies. Those give a more in-depth analysis, because they can find deviations in sentence structure that may indicate cut-and-paste passages or emotional content. There is a nice balance between cryptological and psychological analysis, not just informational analysis.

Turiyan

