Re: VMs: Worry - information loss in transcription - pictures ...
At 00:00 02/09/2003 -0700, Rene Zandbergen wrote:
> >What's the 4.0 mean? And what about the 4.08?
> 4.0 / 4.08 / 4.36 are the h1 values (ie, the average
> number of bits per
> token) for each transcription.
I wondered.... It's a bit on the high side. For a
normal English/Latin text it would be about that,
but for the VMs, it's about 3.8.
OK - I should own up to my reasoning, as that will probably make things a
little clearer. It goes like this. There is evidence that the VMS is a
copy (most notably the indented text in the starred paragraph section,
which apparently tries to duplicate a flaw in the original manuscript),
and that the copier(s) tried to retain the page-layout of the original (and
hence the line-size). But the coding system appears to be based on a
verbose cipher in some way, which would lengthen the text. To retain the
original shape, the text must therefore have been
abbreviated/shortened/truncated.
Tironian notae aside (and many Quattrocento documents still used a small
(<10) number of abbreviatory tokens, like "p-with-a-bar-through-the-stem"
for "pr"), the only two shorthand systems I can reasonably argue the case
for existing pre-Bright's-Characterie are (a) Radcliff's vowel-reduction
technique, and (b) siyaqat from the Ottoman state apparatus (which I've now
had a look at - more on that later). Both of these try to remove redundancy
in the text by omitting letters which - to a suitably accultured reader -
aren't strictly necessary for comprehension. This is what I think is
happening in the VMS as well.
Therefore, the most valid statistical comparison would be with a text
abbreviated in much the same way: removing many (largely redundant) vowels
and truncating long words when it becomes clear what they are.
Unsurprisingly, these texts would typically have a higher h1 value (as
they're not being thinned down by high vowel-counts) - but h2 testing would
be interesting... :-)
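For anyone who wants to try this comparison, here is a minimal sketch in Python. It rests on several assumptions that are not from the discussion above: h1 is taken as single-character Shannon entropy in bits, h2 as the conditional entropy of a character given the one before it (pair entropy minus single entropy), and abbreviate() is a purely hypothetical rule (keep each word's first letter, drop later vowels, cut words to max_len characters) standing in for whatever abbreviation scheme is really at work. Which way h1 moves after abbreviation depends on the text, so no expected figures are given.

```python
import math
from collections import Counter

VOWELS = set("aeiouAEIOU")


def entropy(counts):
    """Shannon entropy in bits of the distribution given by a Counter."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(n / total * math.log2(total / n) for n in counts.values())


def h1(text):
    """First-order entropy: average bits per character."""
    return entropy(Counter(text))


def h2(text):
    """Conditional entropy of a character given its predecessor.

    Assumed definition: H(adjacent pairs) - H(first member of each pair).
    """
    if len(text) < 2:
        return 0.0
    pairs = Counter(zip(text, text[1:]))
    firsts = Counter(text[:-1])
    return entropy(pairs) - entropy(firsts)


def abbreviate(text, max_len=4):
    """Hypothetical abbreviation rule, for illustration only.

    Keeps the first letter of each word, drops all later vowels,
    then truncates the result to max_len characters.
    """
    out = []
    for word in text.split():
        kept = word[0] + "".join(c for c in word[1:] if c not in VOWELS)
        out.append(kept[:max_len])
    return " ".join(out)


def compare(text):
    """Return (h1, h2) for the plain text and for its abbreviated form."""
    short = abbreviate(text)
    return {
        "plain": (h1(text), h2(text)),
        "abbreviated": (h1(short), h2(short)),
    }
```

Calling compare() on a reasonably long plain-text sample (several thousand characters at least, since h2 estimates are unreliable on short texts) gives the two pairs of values side by side.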
Cheers, .....Nick Pelling.....