VMs: Re: VMS & XML...?



Dear Nick,

> Ummm... I've just gone through a load of the TEI docs (thankfully they were
> nice, easy reading), and am not 100% clear on the relationship between TEI,
> the VMS and Unicode. :-|
> 
> In some places it seems as though all TEI final (leaf) content has to be
> expressible in Unicode, but in others it seems that as long as your dtd can
> translate it into Unicode, Everything Will Turn Out Fine In The End.

The original TEI was SGML-based, so entities and character maps were
the basis for encoding and display. The new P4 version supports XML
too (while still remaining SGML), so Unicode in UTF-8 is the default
encoding. However, character entities are still allowed.

> I know people have debated the ins and outs of Unicode here before - that
> as there's no evidence of a corpus of texts written in the VMS' alphabet, a
> portion of the main Unicode coding space shouldn't really be allocated to
> it, so it should reside in a user-defined "soft" area.

Yes, I don't think the Unicode Consortium would accept
our proposal to include the VMS alphabet - especially as
it is not defined well enough. But the Private Use Area is
available for specific projects - and there are also "higher planes"
of Unicode, two of which are entirely available for private use
(planes 15 and 16). Those planes are not supported by
software yet, however :-)
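
As a rough illustration of what a Private Use Area assignment could
look like, here is a minimal Python sketch. The letter inventory and
the choice of U+E000 as the starting point are purely hypothetical;
nothing like this has been agreed on, and any two projects would have
to share the same table to exchange text.

```python
import unicodedata

# Start of the Private Use Area in the Basic Multilingual Plane
# (U+E000..U+F8FF).
PUA_BASE = 0xE000

# Hypothetical inventory of transliteration letters; the order, and
# therefore every code point below, is an assumption for illustration.
TRANSLIT_LETTERS = "acdehiklnoqrsty"

# Map each transliteration letter to one private-use code point.
LETTER_TO_PUA = {
    letter: chr(PUA_BASE + index)
    for index, letter in enumerate(TRANSLIT_LETTERS)
}


def to_private_use(word):
    """Re-encode a transliterated word as private-use characters."""
    return "".join(LETTER_TO_PUA[letter] for letter in word)


encoded = to_private_use("dain")
code_points = [f"U+{ord(char):04X}" for char in encoded]
categories = {unicodedata.category(char) for char in encoded}

print(code_points)  # ['U+E002', 'U+E000', 'U+E005', 'U+E008']
print(categories)   # {'Co'}, the Unicode category for private use
```

The text itself carries no meaning without the mapping table, which is
the main practical drawback of private-use code points.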

> But if the point of the TEI is to make an encoding permanently accessible
> in the future, doesn't that rely on the VMS' alphabet having a permanent
> place in the main Unicode space?

No, I don't think that is necessary at all. I was thinking
about encoding the VMS in TEI by using the transliterated
interlinear file - rather than creating entities or Unicode code points
for the alphabet itself. Actually, that is exactly what you
proposed:

<transcription author='RZ' date='12Aug2001'>
   qoteedy.dain
</transcription>

If needed, it can be transformed into an &entities;-based
transcription later - or used to generate various
transcriptions, depending on one's views/needs.
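
A minimal sketch of such a transformation, in Python, might look like
the following. It assumes that "." separates words in the
transliteration and that each letter gets its own entity; the "vms"
entity prefix and the function name are made up for illustration and
are not part of TEI or of any existing transcription.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<transcription author='RZ' date='12Aug2001'>
   qoteedy.dain
</transcription>"""


def to_entities(transliteration, prefix="vms"):
    """Turn 'qoteedy.dain' into one entity reference per letter.

    Assumes '.' is the word separator; words are joined with a space.
    The entity names (prefix + letter) are hypothetical.
    """
    words = transliteration.strip().split(".")
    return " ".join(
        "".join(f"&{prefix}{letter};" for letter in word)
        for word in words
    )


# Pull the transliterated text out of the element, then convert it.
element = ET.fromstring(SAMPLE)
entity_text = to_entities(element.text)

print(element.get("author"))  # RZ
print(entity_text)
```

The same function could be pointed at a different prefix or letter
grouping to produce the other transcriptions mentioned above.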

TEI predates XML by almost a decade, and a great deal of
thought and discussion from international authorities
on all aspects of humanities texts went into it - so it
is certainly worth considering. And the very fact of
publishing a TEI version of the VMS may draw the attention
of more textual scholars to it and bring the day of
cracking it closer (this is just rhetoric...).

Best regards,

Rafal