
Re: VMs: Voynichese as an Abugida

To: vms-list@xxxxxxxxxxx
Subject: Re: VMs: Voynichese as an Abugida
From: william edmondson <w.h.edmondson@xxxxxxxxxxxxx>
Date: Mon, 26 Jul 2004 10:50:35 +0100
In-reply-to: <Pine.GSO.4.58.0407260001310.19750@spot.colorado.edu>
References: <Pine.GSO.4.58.0407260001310.19750@spot.colorado.edu>
Reply-to: vms-list@xxxxxxxxxxx
Sender: owner-vms-list@xxxxxxxxxxx

Hi John

Interesting post. I too have wondered about syllabic representations.

A few thoughts of encouragement. Don't be put off by the apparently large number of syllables in English. The figures may be high for text but nothing like so high for actual sounds. Couple this with the fact that naive intuitions regarding syllables can be good - the written form for Cherokee, if I recall correctly, was invented by a non-linguist and serves well - and it seems likely that someone of, say, Kelly's abilities would have no trouble devising a serviceable syllabary for, say, Latin or other European 'phonetic' languages.

We should note that the consonant-vowel distinctions in Semitic languages are managed independently for morphological reasons (patterns of vowels are morphemes, and they interleave with patterns of consonants, which are also morphemes). We would not expect to find that in the VMS if it is a rendering of anything other than Hebrew or Arabic.

I'd go for consonants with distinct symbols, plus some sort of simple-minded coding for vowel sounds (or even their omission).

I'll take another look at Stolfi's grammar when I get home tonight.

Cheers

William

On 26 Jul 2004, at 07:51, Koontz John E wrote:

I am wondering if anyone has looked at the Voynich ms. script as an abugida (alphasyllabary) or abjad (consonantal script)? I've actually encountered these terms only this evening myself, though the phenomena to which they refer are not new to me and are probably familiar to most of you reading this.

Abugida refers to scripts in which consonant symbols C indicate inherently a particular CV syllable, usually Ca. The consonant symbols are combined with or modified by additional marks to indicate vowels other than the default, or to indicate the absence of a vowel, maybe the presence of other modifiers, like nasalization, e.g., C<nasal-V>, possibly representing CVN, and so on. The Brahmic scripts of India (e.g., Dev(a)nagari for Sanskrit) are familiar types of this approach. Tolkien's Tengwar are also an abugida, with a systematically generated set of C-symbols.

Abjad refers to consonant-only scripts, primarily the Semitic scripts, in which the consonant symbols C may be augmented with modifying marks that indicate associated vowels.

The distinction between abugida and abjad may correspond fairly well to historical development - syllabary > abjad > abugida, with true alphabets falling somewhere in that sequence - but logically, perhaps one should say cryptographically, it is somewhat moot, turning on whether the C-grade or some CV-grade is the unmarked term in a series.

I'm aware that the possibility that the Voynich script is syllabic is generally rejected on the grounds that there are not enough distinct symbols to represent the syllables of the Western European languages likely to underlie the script. I'm also aware that vowel-identification procedures identify some of the characters in the text as likely vowels, and therefore presumably the rest as consonants. I gather this sort of analysis underlies the EVA transcriptions of the script.

However, I notice that the usual transcription tables and the more sophisticated analysis in Stolfi's Grammar for Voynichese Words reveal a system that lends itself to a tabular presentation of series (consonants?) and grades (vowels?).

For example, Stolfi's similar R and N sets - in combination the EVA characters d l r s n m x - I'll just call them R - occur alone and with one to four i's preceding - I'll call these I. These really look like one to four (once five) strokes with a distinguishing twiddle at the end, which called to my mind Tolkien's approach with the Tengwar, if rather vaguely, and so led to my speculations here. Conventionally, the last stroke is taken with the twiddle as the R letter, in EVA transcription, though I see that Frogguy transcription followed my instinct on this. I'll stick with the EVA version, since that is what is used.

As Stolfi's grammar shows, the R series also occur with one or two of the o a y characters preceding - I'll just call these O - or with o or a plus one to three i's preceding - which I'll call OI. It seems that you can have O(0:2)I(0:4)R(1:1) - where X(i:j) means i to j instances of an X. If these were syllables, I'd assume they were RIO syllables, or CVN, where C = syllable onset, V = syllable peak, and N = syllable coda, something like pan or par. In other words, the syllables are inverted, though this is perhaps more due to taking the constituents of a syllable code as individual characters rather than a formulaic whole.
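The O(0:2)I(0:4)R(1:1) formula above can be sketched as a regular expression over EVA characters. This is only an illustration of the notation, not a claim about what actually occurs in the manuscript: the function name parse_group and the returned tuple layout are hypothetical, and the pattern deliberately ignores the narrower restrictions mentioned in the text (for example, that only o or a precede the i's).

```python
import re

# O = one of o a y, I = the character i, R = one of d l r s n m x (EVA letters).
# The counts follow the formula O(0:2) I(0:4) R(1:1) from the text.
GROUP = re.compile(r"(?P<o>[oay]{0,2})(?P<i>i{0,4})(?P<r>[dlrsnmx])")

def parse_group(group):
    """Split an EVA character group into (O part, number of i's, R letter).

    Returns None when the whole string does not fit O(0:2)I(0:4)R(1:1).
    """
    m = GROUP.fullmatch(group)
    if m is None:
        return None
    return (m.group("o"), len(m.group("i")), m.group("r"))

if __name__ == "__main__":
    for g in ["l", "iil", "aiin", "iiiin", "iiiiin", "ooor"]:
        print(g, parse_group(g))
```

Read as a RIO (CVN) syllable, the tuple would be taken in reverse order: the R letter as the onset, the i-count as the peak, and the O part as the coda.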

I am not clear whether the two-O sequences before R are always oo or aa or yy or can be mixed arbitrarily. If only doubled, perhaps they represent sporadic nn, rr, etc.

There are, of course, some holes in the pattern, e.g., only iiiin (once?) has 4 i's and only id and ix occur in the i + d or x series, if I understand the presentation. The holes should, of course, reflect rare or impossible CV combinations.
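One way to look for such holes would be to tabulate each R letter against the number of preceding i's and list the empty cells. The following is a minimal sketch under stated assumptions: the helper names grade_table and holes are hypothetical, and the token list is an invented toy input, not counts from any transcription.

```python
from collections import Counter

R_SERIES = "dlrsnmx"   # the seven R characters named in the text
MAX_I = 4              # grades run from zero to four preceding i's

def grade_table(tokens):
    """Count tokens of the form i...iR by (R letter, number of i's)."""
    counts = Counter()
    for tok in tokens:
        if not tok or tok[-1] not in R_SERIES:
            continue
        prefix = tok[:-1]
        if set(prefix) <= {"i"} and len(prefix) <= MAX_I:
            counts[(tok[-1], len(prefix))] += 1
    return counts

def holes(counts):
    """List the (R letter, i-count) cells that never occur."""
    return [(r, n) for r in R_SERIES for n in range(MAX_I + 1)
            if counts[(r, n)] == 0]

if __name__ == "__main__":
    # Toy input only, invented for illustration.
    toy = ["l", "il", "iil", "iiil", "n", "in", "iin", "iiin", "iiiin", "id"]
    table = grade_table(toy)
    print(holes(table))
```

Run against a real transcription, the empty cells would be the candidates for rare or impossible CV combinations.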

To return to the logic of the system, I'm suggesting that a series like l, il, iil, iiil represents, e.g., pa pe pi po or p pa pi pu or pa pi pu p or something like that. The set of grades in a series R IR IIR IIIR (once IIIIR?) doesn't provide for many vowel distinctions. I'm assuming that the O characters are something different - syllable codas - but perhaps they simply augment the vowel set. There are languages with only three (or four) vowels, but this is not typical of Western Europe, where the Classical Languages have aeiou systems (with length and diphthongs), and many of the modern languages added additional rounded front vowels.

The 7 R characters d l r s n m x also don't make a very large set of series markers or consonants, especially since d and x are limited in their combinations with i. In a typical European language we'd expect the 14 characters in the set p b t d k g f v s z m n r l w y at minimum and maybe some from h th dh sh zh ch j kh ny ly too. I'm being fairly imprecise in representing these sounds orthographically, I realize.

In regard to these lists, Latin is closer to the minimum, with most medieval and modern languages in Europe having more consonantal distinctions.

Of course, we'd expect a phonological analysis in line with the orthographic traditions of the underlying language, if any, and not necessarily in line with modern linguistic theory. For example w and y might be handled as vowels and palatals might not be distinctly represented - collapsed with velars and dentals plus certain vowels or represented as geminates, clusters, etc. Nasal vowels would be likely to be handled as vowel plus n or m, and so on.

Note that m is also a bit limited in its combinations with O (only o and a plus m). If O letters are codas this is like allowing pan and par, but not pal, to pick arbitrary examples.

In regard to the shortage of series I notice that there are a number of additional patterned sets of characters that might provide additional series. For example, the gallows sets p t cph cth and f k cfh ckh, in those orders, each have 1 2 3 and 4 strokes reaching the base line, and in that respect resemble the patterns in R series like l il iil iiil. However, I don't see how to relate this to the ch and sh (looks like a ligature or modification of ch), and the ch certainly looks like it is involved in the cph cth cfh ckh forms.

Another possible case of series behavior involves e ee eee eeee and o a y followed by e or ee, but the orthographic logic of the series here seems somewhat different. Perhaps these are analogous to R iR iiR iiiR OR OOR?

Finally q and Oq - but only oq and yq occur - may be like the restricted d and x series.

I have to admit that at the most optimistic I have perhaps 11 series here, some rather defective, where I would expect upwards of 14.

I have not addressed the issue of complex syllable initials, which in European languages are chiefly of the form SC (s or sh plus a simple initial) and CR (stops plus r or l).

I haven't considered the possibility that the I distinctions might be consonants and the R's vowels, because the numbers seem even worse.

Other weaknesses: It seems to me also that this analysis does nothing to explain the repeated word phenomenon, or the rather restricted length of words (ten characters maximum?). In regard to the latter, if anything this makes words shorter, as syllables are encoded in somewhat longer character sequences than an alphabetic system would employ.

Additionally, even if the script is on the basis suggested, it might be naive to assume that the text is not enciphered in some way anyway.

John E. Koontz
http://spot.colorado.edu/~koontz
______________________________________________________________________
To unsubscribe, send mail to majordomo@xxxxxxxxxxx with a body saying:
unsubscribe vms-list

Dr William Edmondson
School of Computer Science
University of Birmingham
Edgbaston B15 2TT
UK

Voice: +44-121-414-4763
email: w.h.edmondson@xxxxxxxxxx
