Re: meaning-less/full



    > A mathematical definition doesn't seem feasible.
    
Right. Meaning is not a property of the message alone. In fact,
information theory says that a stream of messages with maximum meaning
per bit will be indistinguishable from a stream of perfectly random
strings. Any deviation from uniform probabilities, or any correlation
between bits, wastes channel capacity.
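
To make the capacity argument concrete, here is a minimal sketch that
measures the empirical Shannon entropy of a byte string. The function
name and the two sample strings are my own illustrations. Note that it
only captures unequal symbol frequencies, not correlations between
neighboring symbols.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(data: bytes) -> float:
    """Empirical Shannon entropy of the byte frequencies, in bits per byte."""
    n = len(data)
    counts = Counter(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Every byte value appears equally often: all 8 bits per byte are used.
uniform = bytes(range(256)) * 4
# Same alphabet, but one symbol dominates: part of the capacity is wasted.
skewed = b"e" * 3072 + bytes(range(256))

print(entropy_bits_per_symbol(uniform))
print(entropy_bits_per_symbol(skewed))
```

The uniform string reaches the 8 bits per byte that the channel can
carry; the skewed one falls well short of that.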

A message has meaning only to the extent that it mirrors some
information that the sender wishes to send. Therefore, in order to
define meaning, one must specify the information to be sent, and the
encoding algorithm; then one can analyze how much of that information
is preserved by the encoding.

Consider that an ideal text compression algorithm should take
"typical" texts and turn them into random-looking strings of bits. Of
course this transformation preserves meaning (as long as one has the
decompression algorithm!); but, for maximum compression, the program
should equalize the bit probabilities and remove any correlations.
Modern compressors like PKZIP go a long way in that direction. The
compressed text, being shorter than the original, will actually have
more meaning per unit length; but it will look like perfect gibberish to
LSC-like tests.
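
As a rough illustration, one can compress a text with zlib (which
implements DEFLATE, the method used by PKZIP) and compare byte
entropies before and after. The word list below is a hypothetical
stand-in for a "typical" text, not real prose, and byte entropy is
only a crude proxy for what an LSC-like test would see.

```python
import math
import random
import zlib
from collections import Counter


def entropy_bits_per_byte(data: bytes) -> float:
    """Empirical Shannon entropy of the byte frequencies, in bits per byte."""
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())


# Hypothetical stand-in for a typical text: words from a small vocabulary.
WORDS = ["the", "message", "has", "meaning", "only", "to", "extent", "that",
         "it", "mirrors", "some", "information", "sender", "wishes", "send"]
rng = random.Random(0)
text = " ".join(rng.choices(WORDS, k=5000)).encode("ascii")

packed = zlib.compress(text, 9)
restored = zlib.decompress(packed)  # the meaning is fully recoverable

print(len(text), entropy_bits_per_byte(text))
print(len(packed), entropy_bits_per_byte(packed))
```

The packed version is shorter and its byte frequencies are much closer
to uniform, yet decompression restores the original exactly.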

Or, consider a meaningful plaintext XORed with the binary expansion of
pi. The result will have uniform bit probabilities, and no visible
correlations; but it will still carry the original meaning, which can
be easily recovered. It would take a very sophisticated algorithm (one
that knows that pi is a "special" number) to notice that the text is
not an entirely random string of bits.
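
Here is a sketch of that experiment, assuming we take the bits of the
fractional part of pi, computed with Machin's formula in integer
arithmetic. The helper names and the 64 guard bits are my own choices,
not anything prescribed above.

```python
def pi_fraction_bytes(n_bytes: int) -> bytes:
    """First n_bytes bytes of the binary expansion of pi's fractional part."""
    guard = 64  # extra low-order bits to absorb truncation error
    scale_bits = 8 * n_bytes + guard
    one = 1 << scale_bits

    def arctan_inv(x: int) -> int:
        # arctan(1/x) scaled by `one`, summed from its alternating Taylor series
        total = term = one // x
        x2 = x * x
        k = 1
        while term:
            term //= x2
            k += 2
            delta = term // k
            total += -delta if (k // 2) % 2 else delta
        return total

    # Machin: pi = 16*arctan(1/5) - 4*arctan(1/239)
    pi_scaled = 4 * (4 * arctan_inv(5) - arctan_inv(239))
    frac = pi_scaled - 3 * one
    return (frac >> guard).to_bytes(n_bytes, "big")


def xor_with_pi(data: bytes) -> bytes:
    """XOR data with the leading bits of pi; applying it twice undoes it."""
    pad = pi_fraction_bytes(len(data))
    return bytes(a ^ b for a, b in zip(data, pad))


message = b"A message has meaning only to the extent that it mirrors information."
scrambled = xor_with_pi(message)
recovered = xor_with_pi(scrambled)
print(scrambled.hex())
print(recovered.decode("ascii"))
```

The scrambled bytes show no obvious structure, but anyone who knows
the pad came from pi recovers the plaintext with a second XOR.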

So the LSC and possible variants are not tests of `meaning' but rather
of `naturalness.' They work because natural language uses its medium
inefficiently, and in a peculiar way: it uses symbols with unequal
frequencies (a feature that mechanical monkeys can imitate), but
changes those frequencies over long distances (something that simple
monkeys won't do).
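
To illustrate the difference, here is a toy "simple monkey" that types
letters independently with the frequencies of a sample text, plus a
crude drift measure that compares letter frequencies in the two halves
of a text. This is not the LSC itself; the sample text, the function
names, and the drift measure are all assumptions made for the sketch.

```python
import random
from collections import Counter


def unigram_monkey(source: str, length: int, seed: int = 0) -> str:
    """Type `length` letters independently, with the frequencies of `source`."""
    counts = Counter(source)
    symbols = list(counts)
    weights = [counts[s] for s in symbols]
    rng = random.Random(seed)
    return "".join(rng.choices(symbols, weights=weights, k=length))


def half_drift(text: str) -> float:
    """Total variation distance between letter frequencies of the two halves."""
    mid = len(text) // 2
    first, second = Counter(text[:mid]), Counter(text[mid:])
    n1, n2 = mid, len(text) - mid
    return 0.5 * sum(abs(first[s] / n1 - second[s] / n2) for s in set(text))


# Contrived "natural" sample: the subject changes partway through.
sample = "the cat sat on the mat " * 60 + "quantum flux drives zephyr " * 60
monkey = unigram_monkey(sample, len(sample))

print(half_drift(sample))
print(half_drift(monkey))
```

The monkey matches the overall letter frequencies of the sample, but
its two halves look alike, while the sample's frequencies shift with
the change of subject.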

However, with slightly smarter monkeys one *can* generate meaningless
texts that fool the LSC; and the same applies to any "meaning
detector" that looks only at the message. Conversely, one can always
encode a meaningful text so as to make it look "random" to the LSC. In
short, a naturally produced (and natural-looking) text can be quite
meaningless, while a meaningful text may be (and look) quite
unnatural.

All the best,

--stolfi