Jacques Guy writes:

> Yes, my point was that Chadwick's formula is dead
> wrong. However, I would like other opinions.

OK, here is my $0.02. The formula may not be as wrong as everybody here seems to suppose, at least if we approach it with a little goodwill. First, it should be taken to apply only to "NYN" type languages. Robinson must be made aware of the incredible importance of getting the language right, but once we have that much, it's not that hard to justify the formula.

The 0th step of the analysis is to arrange the symbols in frequency order. The 1st step is to construct a grid of what can follow what. As we try to grok the pattern, it is this grid that gets rearranged over and over again: something that is extremely hard to do with higher order statistics, because we don't really have the visual means to deal with higher dimensional grids. Given our human limitations in constructing, displaying, and comprehending higher order data, it is likely that 1st order statistics will continue to play a very significant role in solving the puzzle, even if computers can store (and selectively display) the higher order material with ease.

Now, to fill in a bigram grid for n symbols with any chance of random fluctuations not totally overwhelming the true pattern, we need n^2 data points (actually, some constant times n^2 is better), so there is a fair bit of engineering wisdom in the formula.

This of course applies only to ordinary phoneme-, mora-, or syllable-based scripts, where the usual goal (if systems created by a long evolutionary process can be said to have a goal) is to map sounds to symbols in a simple fashion. For the VMS, we don't even know whether there is a spoken system behind it (though I personally strongly suspect there is), and the goal of the script seems to be to deliberately obscure, rather than plainly present, the relationship between sounds and symbols (so a corpus larger than n^2 should be required).

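For what it's worth, the two steps and the n^2 rule of thumb can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text is taken to be an already-transcribed sequence of symbols, the function names bigram_grid and coverage are made up for this sketch, and the constant k is a free parameter that nothing above pins down.

```python
from collections import Counter


def bigram_grid(symbols):
    """Return (alphabet in frequency order, grid of bigram counts).

    grid[i][j] counts how often alphabet[j] immediately follows alphabet[i].
    """
    freq = Counter(symbols)
    # 0th step: arrange the symbols in descending frequency order
    # (ties broken alphabetically so the result is deterministic).
    alphabet = sorted(freq, key=lambda s: (-freq[s], s))
    index = {s: i for i, s in enumerate(alphabet)}
    n = len(alphabet)
    grid = [[0] * n for _ in range(n)]
    # 1st step: record what follows what.
    for a, b in zip(symbols, symbols[1:]):
        grid[index[a]][index[b]] += 1
    return alphabet, grid


def coverage(symbols, k=1):
    """Return (bigram observations available, k * n^2 rule-of-thumb target).

    k is an assumed constant; the n^2 target is the number of grid cells.
    """
    n = len(set(symbols))
    observations = max(len(symbols) - 1, 0)
    return observations, k * n * n


alphabet, grid = bigram_grid("banana")
print(alphabet)            # ['a', 'n', 'b']
print(grid)                # [[0, 2, 0], [2, 0, 0], [1, 0, 0]]
print(coverage("banana"))  # (5, 9)
```

With three symbols the grid has 9 cells but "banana" supplies only 5 bigrams, so most cells are necessarily empty or nearly so; that is the sense in which a corpus much shorter than n^2 cannot fill the grid.
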
I think this restriction to scripts with a simple sound-to-symbol mapping goes a long way towards explaining why the kind of binary encoding suggested as a counterexample renders the formula meaningless: once such an encoding is performed, the relationship between the symbols and the sounds is anything but straightforward.

Andras Kornai

