Re: John Chadwick (Linear B) of corpus size. Comments invited.
> Yes, my point was that Chadwick's formula is dead wrong.
> However, I would like other opinions. I know, I know, over the
> years, we have thrashed this matter to the death. .... I would
> hate to see nonsense like "Chadwick's formula" fed to a wide
> readership. IF it is nonsense. I think it is, but I prefer not
> to trust my judgment. Comments, everybody?
I had never heard of "Chadwick's formula", and I can't imagine how it
could be derived. Your binary re-encoding argument is a good point:
at best, the formula needs some special assumptions.
One can define 6 "limiting" types of undeciphered languages, depending
on whether (1) the script, (2) the language, and (3) the meaning of
the corpus texts are known or unknown. Thus Rongorongo, which is
almost surely in the local language, would be of type NYN (Y=known,
N=not known). Phaistos and Voynich are NNN, Etruscan would lie
somewhere between types YNY and YNN, etc.
Clearly, the amount of text one needs for successful decipherment
strongly depends on the language's type. Roughly, in
order of increasing difficulty:
scr lng mng  example                   decipherment needs:
--- --- ---  ------------------------  -----------------------------------
 N   Y   Y   Egyptian after Rosetta,   A fairly small corpus, basically
             almost.                   large enough for each glyph to
                                       occur at least once.

 N   Y   N   Linear B after Ventris's  A somewhat larger corpus, basically
             breakthrough, almost.     large enough for a few dozen function
                                       words and inflections to occur
                                       and be recognized.

 Y   N   Y   Etruscan after Pyrgi,     A fairly large corpus, large
             perhaps?                  enough to pinpoint the meaning
                                       of individual words (rather than
                                       whole sentences) and extract a
                                       basic vocabulary.

 N   N   Y   (no idea)                 Basically the same as the previous
                                       case.

 Y   N   N   Elamite, perhaps?         A very large corpus, large enough
                                       to spot syntactic structures
                                       and reliably guess their meaning.

 N   N   N   Voynich, Phaistos         Ditto, only harder.
(The type YYY means there is no problem to solve, and YYN is nonsensical.)
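
For anyone who wants to check the count, here is a minimal sketch that enumerates the three-letter codes and drops the two excluded ones. The names (classify, all_types, open_types) are made up for illustration; only the Y/N convention and the order script, language, meaning come from the table.

```python
from itertools import product

def classify(code):
    """Return a status for a three-letter code (script, language, meaning)."""
    if code == "YYY":
        return "nothing to decipher"   # everything is already known
    if code == "YYN":
        return "nonsensical"           # script and language known, meaning not
    return "open problem"

def all_types():
    """All 2**3 combinations of Y (known) and N (not known)."""
    return ["".join(t) for t in product("YN", repeat=3)]

# The codes that actually pose a decipherment problem.
open_types = [t for t in all_types() if classify(t) == "open problem"]

if __name__ == "__main__":
    for t in open_types:
        print(t)
```

Eight combinations minus the two excluded ones leaves the 6 types listed in the table.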
The last two entries of the table include cases where the language is
in fact a known one but has not yet been identified as such. Thus
Linear B and Egyptian used to be NNN, but once the language was
identified they became NYN or NYY, and decipherment soon followed.
(Hopefully the same will happen to Voynichese.)
Chadwick's formula does not make sense if it ignores the "little
details" of language and text meaning. Your binary encoding trick may
be said to simplify the script, but it makes the language much more
obscure.
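
To make the point concrete, here is a rough sketch of one possible reading of the binary trick, assuming it means rewriting every glyph as a fixed-width string of two symbols. The function name binary_reencode and the sample text are invented for illustration, and nothing here reproduces Chadwick's formula itself.

```python
from math import ceil, log2

def binary_reencode(text):
    """Rewrite each distinct symbol of text as a fixed-width string of 0s and 1s.

    Returns the re-encoded text and the code width used per symbol.
    """
    if not text:
        return "", 0
    symbols = sorted(set(text))
    # Enough bits to give every distinct symbol its own code (at least 1).
    width = max(1, ceil(log2(len(symbols))))
    code = {s: format(i, "0{}b".format(width)) for i, s in enumerate(symbols)}
    return "".join(code[s] for s in text), width

sample = "abracadabra"          # stand-in for a transcribed inscription
encoded, width = binary_reencode(sample)

if __name__ == "__main__":
    print(len(set(sample)), "signs,", len(sample), "tokens")
    print(len(set(encoded)), "signs,", len(encoded), "tokens")
```

The script shrinks to two signs and the text gets longer by a factor of width, yet the information content is unchanged, so any rule that looks only at sign inventory and corpus length would give a different answer for the same text.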
All the best,