Re: Dovetail battlements in Rome?
> Can anybody explain how Sukhotin does the job of sorting out
> vowels and consonants?
I recall that Jacques posted an explanation to the mailing list,
a couple of years ago.
The basis of the algorithm is the observation that in most languages
Vs and Cs have a tendency to alternate: Vs are mostly surrounded by
Cs and vice versa. So the algorithm tries to find a partition of the
alphabet into two classes X and Y that maximizes the number of XY and YX
pairs, and minimizes the number of XX and YY pairs.
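For concreteness, here is a minimal Python sketch of that objective.
It is not Sukhotin's actual procedure, only a simple hill-climbing
search that moves one letter at a time between the two classes as long
as the count of XY/YX pairs goes up. The function names and the toy
word list are made up for illustration, and the search can stop at a
local optimum.

```python
from collections import Counter


def pair_counts(words):
    """Count unordered adjacent letter pairs inside each word.

    Doubled letters (e.g. "aa") are skipped: they always fall inside
    one class, so they cannot change which partition scores best.
    """
    counts = Counter()
    for word in words:
        for a, b in zip(word, word[1:]):
            if a != b:
                counts[frozenset((a, b))] += 1
    return counts


def cross_pairs(counts, class_x):
    """Number of adjacent pairs with exactly one letter in class_x."""
    return sum(n for pair, n in counts.items() if len(pair & class_x) == 1)


def greedy_partition(words):
    """Hill-climb toward a partition (X, Y) with many XY/YX pairs.

    Since the total number of pairs is fixed, maximizing XY/YX pairs
    is the same as minimizing XX/YY pairs.
    """
    counts = pair_counts(words)
    alphabet = sorted(set("".join(words)))
    class_x = set()  # start with every letter in class Y
    improved = True
    while improved:
        improved = False
        for letter in alphabet:
            trial = class_x ^ {letter}  # move this letter to the other class
            if cross_pairs(counts, trial) > cross_pairs(counts, class_x):
                class_x = trial
                improved = True
    return class_x, set(alphabet) - class_x


if __name__ == "__main__":
    toy_words = ["banana", "tomato", "salami"]  # illustrative input only
    x, y = greedy_partition(toy_words)
    print(sorted(x), sorted(y))
```

Note that the result only says which letters go together; deciding
which of the two classes is "the vowels" needs some extra rule.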
> Did anybody apply that method to VMS, and if yes, were the
> symbols in VMS reliably shown to be either vowels or consonants?
Jacques did, and I gather that the results were inconclusive.
I have tried to do roughly the same thing, by hand and by ad-hoc
algorithms. I did find some structure in the alphabet, which is now
part of the crust-mantle-core paradigm.
Basically, one can distinguish several classes of letters (gallows,
benches, dealers, etc.) which have similar digraph statistics; but
there doesn't seem to be any simple mapping of those classes to a
plausible `vowels and consonants' bipartition. Moreover, although
those statistical classes seem clear-cut, and are fairly compatible
with the morphological classification of the symbols, if I slightly
change the similarity measure, I get a very different set of classes
--- also clear-cut and compatible.
But two days ago I noticed another weird thing about the digraph
frequencies, which sort of explains that ambiguity (and why Sukhotin's
algorithm couldn't possibly work). Stay tuned...
All the best,
--stolfi