
VMs: Interlinear block codes...

Hi everyone,

Please look again at the interlinear text and you'll see a large number of line codes which (unlike the codes you listed) were not documented in the header, but appear to have been allocated almost at random.

I've just trawled through all the interlinear, trying to understand how its block code system works (each block of text is allocated a letter in a semi-structured way), which (AFAICT) is like this:-

P       paragraph
L       labels
        (except for f55v, where it means "Left column")
        (except for f68r2, where it means "the circle with sun")
        (except for f68v3, where it means "top Left quarter")
T       titles
        (except for f100v, where it means "Top row of plant labels")
C       circular text
        (except for f67v2, where it means "Captions")
        (except for f70r2, where it means "Central star text")
R       ring of text
        (except for f66v, where it means "Right hand text")
        (except for f67v2, where it means "Radial labels")
        (except for f68r3, where it means "Radial lines")
        (except for f69r, where it means "Radial lines")
        (except for f76r, where it means [didn't write this one down, sorry])
        (except for f101v, where it means "Row of plants")
S       star labels
        (except for f67r1, where it means "labels in circle Sectors")
        (except for f67v2, where it means "words above the Square")
        (except for f68v3, where it means "text on Spokes")
        (except for f69r, where it means "inner ring of text")
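To show what handling this in a program looks like, here is a minimal Python sketch (not my JavaScript tool, and not the interlinear's own format): a default meaning per code plus a per-page override table. The names DEFAULT_MEANING, OVERRIDES and block_meaning are made up for illustration, and the table only covers the exceptions listed above, not the three further pages' worth.

```python
# Default meaning of each block code, as listed above.
DEFAULT_MEANING = {
    "P": "paragraph",
    "L": "labels",
    "T": "titles",
    "C": "circular text",
    "R": "ring of text",
    "S": "star labels",
}

# Per-page exceptions, keyed by (page, code). Only the cases listed above.
OVERRIDES = {
    ("f55v", "L"): "Left column",
    ("f68r2", "L"): "the circle with sun",
    ("f68v3", "L"): "top Left quarter",
    ("f100v", "T"): "Top row of plant labels",
    ("f67v2", "C"): "Captions",
    ("f70r2", "C"): "Central star text",
    ("f66v", "R"): "Right hand text",
    ("f67v2", "R"): "Radial labels",
    ("f68r3", "R"): "Radial lines",
    ("f69r", "R"): "Radial lines",
    ("f76r", "R"): None,  # an exception exists, but its meaning was not written down
    ("f101v", "R"): "Row of plants",
    ("f67r1", "S"): "labels in circle Sectors",
    ("f67v2", "S"): "words above the Square",
    ("f68v3", "S"): "text on Spokes",
    ("f69r", "S"): "inner ring of text",
}


def block_meaning(page, code):
    """Return the meaning of a block code on a page, or None if unknown."""
    key = (page, code)
    if key in OVERRIDES:
        # A page-specific exception wins over the default, even when it is None.
        return OVERRIDES[key]
    return DEFAULT_MEANING.get(code)
```

Every new exception means another row in OVERRIDES, which is exactly the pile of special cases I'm complaining about.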

I've got three more pages of this rubbish, but inflicting any more on you would be far too cruel. :-o

ISTM that this was a well-intentioned approach which unfortunately wasn't implemented systematically, making it very hard to exploit in a general-purpose way from a program (like my JavaScript analysis tool) without adding a vast pile of special cases (like the ones listed above, which are just the start).

ISTM that the basic line-content categories users would be most likely to want to select between are:-
(1) first line of paragraphs (because of all the single-leg gallows stuff)
(2) remaining lines of paragraphs
(3) titles
(4) circular rings of text
(5) plant labels
(6) star labels
(7) other labels
(8) any other labels
(9) loose letters
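
As a rough sketch of how a flat scheme like this could be used for selection, here are the nine categories as a Python enumeration with a simple filter. Everything here (LineCategory, paragraph_category, select, and the idea of tagging each line with exactly one category) is an assumption for illustration, not something the interlinear file provides.

```python
from enum import IntEnum


class LineCategory(IntEnum):
    """The nine line-content categories, numbered as in the list above."""
    PARAGRAPH_FIRST_LINE = 1
    PARAGRAPH_OTHER_LINE = 2
    TITLE = 3
    CIRCULAR_RING = 4
    PLANT_LABEL = 5
    STAR_LABEL = 6
    OTHER_LABEL = 7
    ANY_OTHER_LABEL = 8  # listed separately from (7); the difference is not spelled out
    LOOSE_LETTERS = 9


def paragraph_category(line_in_paragraph):
    """Split paragraph lines into categories (1) and (2) by 1-based position."""
    if line_in_paragraph < 1:
        raise ValueError("line positions are 1-based")
    if line_in_paragraph == 1:
        return LineCategory.PARAGRAPH_FIRST_LINE
    return LineCategory.PARAGRAPH_OTHER_LINE


def select(lines, wanted):
    """Return the text of every (category, text) pair whose category is wanted."""
    wanted = set(wanted)
    return [text for category, text in lines if category in wanted]
```

For example, with placeholder text rather than real transcription, select([(paragraph_category(1), "first"), (paragraph_category(2), "second")], [LineCategory.PARAGRAPH_FIRST_LINE]) keeps only "first".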

*sigh* I'll try to implement this for my next tool release, but this semi-improvised block code system makes it not much fun at all (programmers like me hate lists of special cases). Or I might simply write a fat stupid lump of Perl regexps to process the interlinear file into a flatter, more usable form... though neither option is much fun. :-(((

Cheers, .....Nick Pelling.....
