Re: VMs: Chinese thoughts [ was: languages etc]
At 19:27 17/03/2004 -0600, Dennis wrote:
"Maurizio M. Gavioli" wrote:
> 2) MORPHO-SYNTACTIC level (so to speak...). Here we stop speaking about
> characters and start speaking about words. In modern Chinese, many words
> (for instance most of the nouns) are multi-syllabic [...]
This is what I thought. So several characters may
represent one "true" (syntactic, that is) word.
Exactly.
> In some cases, these are very similar to compound words of most Germanic
> languages (like "sandbox" or "Weltanschauung");
> but in many cases, they
> have been created to disambiguate mono-syllabic words which, because of the
> modifications of Chinese phonology, became homophonic. It is supposed that
> ancient Chinese had a much higher rate of mono-syllabic words than recent
> or modern Chinese.
In practice, though, might these "true/syntactic"
words be indistinguishable in structure and form
from the ones mentioned above?
Well, this goes beyond my familiarity with Chinese. From a hobbyist CJK
lover's perspective, I would say there is no morphological distinction
between the two groups; usually a distinction can be made on semantic
grounds: real compounds (of the 'sandbox' type) have a meaning which is
different from the meaning of each component (and is a composition of both),
while dis-homophonic compounds (I'm sure there is a term for them, but I do
not know it) usually have a meaning equal or very similar to that of one of
the parts, the other being a sort of 'genus' indicator, telling which of the
many words with that pronunciation we are dealing with.
Anyway, I am pretty sure that there is no obvious morphological difference.
The point, though, would be that Chinese, written in
some sort of phonemic system, comprises independent
syllables, which in turn combine into larger, syntactic
"words" in the Indo-European sense. That would be
something conceivably consistent with Voynichese. Am I
correct about this?
I think that a more precise description would be: a phonetic rendering of
Chinese would show a sequence of syllables, ALL of which are also morphemes
(i.e. morphological elements) or sememes (i.e. meaning elements), which in
many cases combine into larger words (no need for quotes around the word
'word': they are words and listed as such even in native Chinese dictionaries).
This correspondence between syllables and morphemes-sememes is a
peculiarity of Chinese and, I believe, of most other languages of that
group, like, for instance, Tibetan, Vietnamese and so on (but I do not know
anything of these other languages and I speak mostly from second-hand reading).
The statistics aren't entirely consistent with this.
Stolfi's numbers show similar entropies for Chinese and
Voynichese. About 250 Voynichese words account for
about 80% of the character count. (I don't know about
the token count.) However, there are about 8200 total
Voynichese words, which I don't think one would expect
for something like Chinese.
Sorry, I do not understand what could not be expected in Chinese. Are you
saying that the ratio between high-frequency words and total words is
significantly lower in the VMs than in Chinese? If so, we need some frequency
ranking for Chinese before saying anything.
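As a rough sketch of the kind of comparison this would need, the Python
below counts how many of the most frequent word types are required to cover
a given share of a text, measured either by tokens or by characters. The
function name, its parameters and the sample tokens are illustrative
assumptions only; nothing here comes from Stolfi's data, and a real test
would need a tokenized Voynichese transcription and a word-segmented
Chinese corpus.

```python
from collections import Counter


def types_needed_for_coverage(tokens, target=0.80, weight_by_chars=False):
    """Return (n_top_types, n_total_types).

    n_top_types is the smallest number of most-frequent word types whose
    combined weight reaches at least `target` of the whole text.
    Weight is the token count, or the character count if weight_by_chars.
    """
    counts = Counter(tokens)
    # Weight of a type: occurrences, optionally multiplied by word length.
    weights = {
        word: n * (len(word) if weight_by_chars else 1)
        for word, n in counts.items()
    }
    total = sum(weights.values())
    if total == 0:
        return 0, 0

    covered = 0
    # Walk the types from most to least heavy until the target is reached.
    for rank, weight in enumerate(sorted(weights.values(), reverse=True), 1):
        covered += weight
        if covered >= target * total:
            return rank, len(weights)
    return len(weights), len(weights)


# Hypothetical toy input, only to show the call.
sample = "a a a a b b c d e f".split()
print(types_needed_for_coverage(sample, target=0.75))
```

Running the same function with the same target on both corpora would give
two pairs directly comparable to the "about 250 out of about 8200" figure
quoted above.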
PS: Not relevant to the VMs. Is it true that every
syllable in Chinese can be a free-standing unit? Bob
Brzustowicz said that not all Chinese morphemes are
free, so this must not be true.
A greater familiarity with Chinese vocabulary than mine is needed to answer
this question. I believe that practically every morpheme has been, at some
time, an independent unit, but that many may have a very low rate of
occurrence as an independent unit now. In part, the question is also
tautological: what is an independent unit?
Cheers,
Maurizio
Maurizio M. Gavioli - VistaMare Software
via San Bernardo 5, I-16030 Pieve Ligure, ITALY
http://www.vistamaresoft.com/