Re: LSC and the VMS
Rene wrote:
> Jacques already pointed out that we don't actually know how to define
> meaningful and meaningless.
Rene, if you have in mind a rigorous mathematical definition of meaningful vs.
meaningless, that may indeed be a hard task (though certainly a possible one). However,
to begin with, we may be satisfied with the fact that when we see a
meaningful text we recognize it as such (if we know the language). Of course,
when you deal with an unknown language, the situation is different, but what we
can do is to test various forms of meaningless texts, compare them to a variety
of meaningful texts in various languages, and try to determine common features
in each of those two types of text. LSC can probably be useful as one of the
tools, though it had better be complemented by others. I tried
something in that vein using LSC plus letter frequency distributions. The
results, I agree, have not been really conclusive, so some additional tools
have to be invented. Even if none of these tools is by itself
sufficient to exclude doubt, in their totality they should at some point
provide overwhelming evidence in favor of a text being either meaningful or
gibberish. What tools? I can think of several. One example. Once, some time
ago, the Biblical texts were tested by comparing the frequencies of words in
the left and in the right halves of the verses. There was an obvious correlation,
which disappeared when the text was letter-permuted. No precise mathematical measure of
that correlation was derived, but there are at least a few people on the list
who are capable of taking up that idea and developing it. Another obvious
correlation which disappeared when a text was permuted was between the first
one, two, or three letters of consecutive words, observed along the
entire text. Some other similar correlations were also noticed. All these
things were tried before LSC became the first choice. These observations
were never developed beyond a few preliminary trials, never properly
quantified, and never well recorded. But they are real, and they could be developed
to the same extent as LSC was, providing additional information. Of course, many
other (fresh) ideas for analyzing texts can be suggested. Somebody needs to spend
time on that.

Cheers, Mark
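
P.S. To make the first-letter test a bit more concrete, here is a rough
sketch of how it might be quantified. Since no precise measure was ever
derived, the choice of mutual information (in bits) between the first
letters of adjacent words is purely my assumption, and the function names
first_letter_mi and permute_letters are invented for illustration. The
letter permutation shuffles all letters across the whole text while keeping
the word lengths, which is one possible reading of "letter-permuted".

```python
import math
import random
from collections import Counter


def first_letter_mi(words):
    """Plug-in estimate of the mutual information (bits) between the
    first letters of adjacent words. Assumes every word is non-empty."""
    pairs = [(a[0], b[0]) for a, b in zip(words, words[1:])]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)                 # counts of (first, next) letter pairs
    left = Counter(a for a, _ in pairs)    # marginal counts, first word
    right = Counter(b for _, b in pairs)   # marginal counts, following word
    mi = 0.0
    for (a, b), c in joint.items():
        mi += (c / n) * math.log2(c * n / (left[a] * right[b]))
    return mi


def permute_letters(words, rng):
    """Shuffle all letters across the text, keeping each word's length."""
    letters = [ch for w in words for ch in w]
    rng.shuffle(letters)
    out, i = [], 0
    for w in words:
        out.append("".join(letters[i:i + len(w)]))
        i += len(w)
    return out


if __name__ == "__main__":
    # Toy input only; a real trial would use a long transcription.
    text = "the cat saw the dog and the dog saw the cat " * 20
    words = text.split()
    rng = random.Random(0)  # fixed seed so the run is repeatable
    print("original:", first_letter_mi(words))
    print("permuted:", first_letter_mi(permute_letters(words, rng)))
```

On a short text the estimate is biased upward, so the permuted value will
not be exactly zero. A fairer comparison would average over many
permutations and look at how far the original value lies outside that
spread. The same skeleton could be extended to the first two or three
letters by replacing a[0] with a[:2] or a[:3].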