
Re: meaning-less/full

To: voynich@xxxxxxxx
Subject: Re: meaning-less/full
From: Jorge Stolfi <stolfi@xxxxxxxxxxxxxx>
Date: Mon, 24 Jan 2000 22:05:23 -0200 (EDT)
Delivered-to: reeds@research.att.com
In-reply-to: <12CrXU-0oye8mC@fwd01.sul.t-online.de>
References: <200001221227.KAA02811@coruja.dcc.unicamp.br> <12CPS7-0TZJ44C@fwd02.sul.t-online.de> <388B4068.C18A1B8E@nctimes.net> <12CrXU-0oye8mC@fwd01.sul.t-online.de>
Reply-to: stolfi@xxxxxxxxxxxxxx
Sender: jim@xxxxxxxxxxxxx

    > A mathematical definition doesn't seem feasible.
    
Right. Meaning is not a property of the message alone. In fact,
information theory says that a stream of messages carrying the maximum
information per symbol will be statistically indistinguishable from a
stream of perfectly random strings. Any deviation from uniform
probabilities, or any correlation between bits, wastes channel
capacity.
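
To make "deviation from uniform probabilities" concrete, here is a
minimal Python sketch of the zeroth-order Shannon entropy of a byte
string. The function name is only illustrative, and the measure ignores
correlations between symbols, so it captures only the first of the two
kinds of waste mentioned above.

```python
import math
from collections import Counter

def entropy_bits_per_byte(data: bytes) -> float:
    """Zeroth-order Shannon entropy of `data`, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    # Sum of p * log2(1/p) over the byte values that actually occur.
    return sum((c / n) * math.log2(n / c) for c in counts.values())

uniform = bytes(range(256))      # every byte value exactly once
skewed = b"a" * 255 + b"b"       # one symbol dominates

print(entropy_bits_per_byte(uniform))  # the maximum, 8 bits per byte
print(entropy_bits_per_byte(skewed))   # far below 8: wasted capacity
```

A source that scores well below 8 bits per byte here is spending
channel capacity on predictability rather than on information.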

A message has meaning only to the extent that it mirrors some
information that the sender wishes to send. Therefore, in order to
define meaning, one must specify the information to be sent, and the
encoding algorithm; then one can analyze how much of that information
is preserved by the encoding.

Consider that an ideal text compression algorithm should take
"typical" texts and turn them into random-looking strings of bits. Of
course this transformation preserves meaning (as long as one has the
decompression algorithm!); but, for maximum compression, the program
should equalize the bit probabilities and remove any correlations.
Modern compressors like PKZIP go a long way in that direction. The
compressed text, being shorter than the original, will actually have
more meaning per unit length; but it will look like perfect gibberish to
LSC-like tests.
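
A rough way to see this effect, assuming Python's standard zlib module
(DEFLATE, the same family of algorithm as PKZIP) as the compressor. The
word list and sample text are made up for the demonstration; the sample
is itself monkey text, since only its statistics matter here.

```python
import math
import random
import zlib
from collections import Counter

def entropy_bits_per_byte(data: bytes) -> float:
    """Zeroth-order Shannon entropy, in bits per byte."""
    n = len(data)
    return sum((c / n) * math.log2(n / c) for c in Counter(data).values())

# Hypothetical stand-in for a "typical" text: words from a tiny vocabulary.
WORDS = ["the", "of", "and", "star", "root", "leaf", "water", "moon",
         "herb", "oil", "seed", "flower", "fire", "salt", "day", "night"]
rng = random.Random(2000)
raw = " ".join(rng.choice(WORDS) for _ in range(20000)).encode("ascii")

packed = zlib.compress(raw, 9)

print(len(raw), len(packed))           # compressed form is shorter
print(entropy_bits_per_byte(raw))      # skewed byte frequencies
print(entropy_bits_per_byte(packed))   # much closer to 8 bits per byte
assert zlib.decompress(packed) == raw  # nothing was lost
```

The compressed bytes carry exactly the same content as the original,
yet their byte frequencies are far flatter, which is what a
frequency-based test would see.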

Or, consider a meaningful plaintext XORed with the binary expansion of
pi. The result will have uniform bit probabilities, and no visible
correlations; but it will still carry the original meaning, which can
be easily recovered. It would take a very sophisticated algorithm (one
that knows that pi is a "special" number) to notice that the text is
not an entirely random string of bits.
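
The pi example can be sketched in a few lines of Python. The details
here are assumptions for illustration: the key stream is taken from the
fractional part of pi (hex 0.243F6A88...), computed with Machin's
formula in integer fixed-point arithmetic, and all function names are
made up.

```python
def arctan_inv(x: int, one: int) -> int:
    """Fixed-point arctan(1/x), scaled by `one`, via the alternating series."""
    term = one // x
    total = term
    x2 = x * x
    n = 1
    sign = -1
    while term:
        term //= x2
        n += 2
        total += sign * (term // n)
        sign = -sign
    return total

def pi_fraction_bytes(nbytes: int) -> bytes:
    """First `nbytes` bytes of the binary expansion of pi's fractional part."""
    guard = 64                          # extra bits to absorb rounding error
    one = 1 << (8 * nbytes + guard)
    # Machin: pi = 16*arctan(1/5) - 4*arctan(1/239)
    pi_fixed = 4 * (4 * arctan_inv(5, one) - arctan_inv(239, one))
    frac = pi_fixed - 3 * one           # drop the integer part
    return (frac >> guard).to_bytes(nbytes, "big")

def xor_with_pi(data: bytes) -> bytes:
    """XOR `data` with the pi key stream; applying it twice restores `data`."""
    key = pi_fraction_bytes(len(data))
    return bytes(a ^ b for a, b in zip(data, key))

plaintext = b"a meaningful plaintext"
scrambled = xor_with_pi(plaintext)
recovered = xor_with_pi(scrambled)
assert recovered == plaintext
```

Anyone who knows the key stream recovers the text with the same
one-line XOR; a test that looks only at symbol statistics has no way to
know that.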

So the LSC and possible variants are not tests of `meaning' but rather
of `naturalness.' They work because natural language uses its medium
rather inefficiently, but in a rather peculiar way: it uses symbols
with unequal frequencies (a feature that mechanical monkeys can
imitate), but changes those frequencies over long distances (something
which simple monkeys won't do).

However, with slightly smarter monkeys one *can* generate meaningless
texts that fool the LSC; and the same applies to any "meaning
detector" that looks only at the message. Conversely, one can always
encode a meaningful text so as to make it look "random" to the LSC. In
short, a naturally produced (and natural-looking) text can be quite
meaningless, while a meaningful text may be (and look) quite
unnatural.
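
As a sketch of the two kinds of monkey, in Python: one draws letters
from a fixed skewed distribution, the other reshuffles the letter
weights at every block, so its frequencies drift over long distances.
The alphabet, weights, block size, and the drift_score statistic are
all made up for illustration; drift_score is a crude stand-in that
compares letter counts in adjacent chunks, not the LSC itself.

```python
import random
from collections import Counter

ALPHABET = "abcdefgh"
WEIGHTS = [8, 4, 2, 1, 1, 1, 1, 1]   # unequal letter frequencies
BLOCK = 500                           # letters per block

def simple_monkey(n_blocks: int, rng: random.Random) -> str:
    """Fixed letter frequencies throughout the text."""
    return "".join(rng.choices(ALPHABET, weights=WEIGHTS, k=n_blocks * BLOCK))

def drifting_monkey(n_blocks: int, rng: random.Random) -> str:
    """Same weights, but reassigned to different letters in every block."""
    out = []
    for _ in range(n_blocks):
        w = WEIGHTS[:]
        rng.shuffle(w)
        out.extend(rng.choices(ALPHABET, weights=w, k=BLOCK))
    return "".join(out)

def drift_score(text: str, chunk: int = BLOCK) -> float:
    """Mean squared difference in letter counts between adjacent chunks."""
    counts = [Counter(text[i:i + chunk])
              for i in range(0, len(text) - chunk + 1, chunk)]
    total = 0
    for a, b in zip(counts, counts[1:]):
        total += sum((a[s] - b[s]) ** 2 for s in ALPHABET)
    return total / (len(counts) - 1)

simple_text = simple_monkey(50, random.Random(1))
drifting_text = drifting_monkey(50, random.Random(2))

print(drift_score(simple_text))    # sampling noise only
print(drift_score(drifting_text))  # much larger: frequencies change over distance
```

Both texts are equally meaningless, but only the second shows the
long-range frequency changes that a statistic of this kind rewards.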

All the best,

--stolfi

