

Mark wrote:

>  Rene, if you have in mind a rigorous mathematical definition of meaningful vs
> meaningless, it may be indeed a hard task (but certainly possible). However,
> for the beginning we just may be satisfied with the fact that when we see a
> meaningful text we recognize it as such (if we know the language). Of course,
> when you deal with an unknown language, the situation is different, but what
> we can do is to test various forms of meaningless texts, compare them to a
> variety of meaningful texts in various languages, and try to determine common
> features in each of those two types of texts. LSC can probably be useful as
> one of the tools, though it had better be complemented by other tools.

A mathematical definition doesn't seem feasible.  And yes, I have got some
intuitive feeling for what's meaningful and what isn't. But the more I
think of it, the more I realise that it isn't enough. Why? If we want to
check whether the LSC correctly identifies a text as meaningful or meaningless,
we must already know whether the text is meaningful or not, and then check
whether the LSC curve agrees. All your examples are very clear cut.
(Except one: the VMs, but about that later).

What are the doubtful cases? 

- Think of Jacques' telegram-style recipes. These are more towards the
meaningful side, but they could leave someone very puzzled.
- Or a text in which every third word has been struck out. It is
ungrammatical, but depending on the complexity of the original text, it
may be right in the middle between meaningful and meaningless.
- Or take a text in which every character has been replaced by the
next one in the alphabet. Totally meaningless. Yet the LSC defines it
as fully meaningful. And it becomes meaningful once you know the trick
and learn to read it (half an hour's practice might be enough to develop
a good reading speed).
- Probably (definitely) many, many more.
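
For anyone who wants to generate two of these doubtful cases for
testing, here is a minimal Python sketch. The function names are my own,
and chunk_variation is only a simplified stand-in for an LSC-like
statistic (squared differences of letter counts between adjacent
chunks), not Mark's actual definition. It is there to illustrate why a
measure built purely on letter counts cannot tell the shifted text from
the original: the shift merely relabels the letters.

```python
import string
from collections import Counter

ALPHABET = string.ascii_lowercase


def shift_next(text):
    """Replace every letter by the next one in the alphabet (z wraps to a)."""
    table = str.maketrans(ALPHABET, ALPHABET[1:] + ALPHABET[:1])
    return text.lower().translate(table)


def drop_every_third_word(text):
    """Strike out every third word (the 3rd, 6th, 9th, ...)."""
    words = text.split()
    return " ".join(w for i, w in enumerate(words, start=1) if i % 3 != 0)


def chunk_variation(text, n):
    """Hypothetical stand-in for an LSC-like measure, NOT the real LSC.

    Splits the letters into consecutive chunks of n letters and sums,
    over all letters, the squared difference in counts between each
    pair of adjacent chunks.
    """
    letters = [c for c in text.lower() if c in ALPHABET]
    chunks = [Counter(letters[i:i + n])
              for i in range(0, len(letters) - n + 1, n)]
    total = 0
    for a, b in zip(chunks, chunks[1:]):
        total += sum((a[c] - b[c]) ** 2 for c in ALPHABET)
    return total


sample = "the quick brown fox jumps over the lazy dog and runs far away"
shifted = shift_next(sample)
print(shifted)
print(drop_every_third_word(sample))
# The shift only renames letters, so the count-based statistic is unchanged.
print(chunk_variation(sample, 10) == chunk_variation(shifted, 10))
```

The struck-out text, by contrast, does change the letter counts, so
where it lands on such a measure is an open question.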

Is 'bogorodice djevo raduysia' Russian or the result of a Russian character
monkey? Before last Xmas I wouldn't have known, but the LSC could have told
me (given more text, of course). And this gets us to the question of the 
VMs. That is as readable to me as Russian. In fact, it is more readable
to me than Arabic. The LSC classifies it as meaningful, and all the
experiments Mark has done help to reinforce the conclusion.
But could it be in the grey area above?

The point: we're not sure what we're measuring. And that isn't the 
first time in the history of the VMs, to put it mildly.
Still, as an engineer, I feel that it shouldn't stop us from experimenting.

Cheers, Rene