VMs: Re: VMS & XML...?
Dear all,
having been involved with TEI for some time, I can only recommend
making the effort to encode the VMS in TEI. Even though TEI
positions itself as a general standard, it is *especially*
well suited to (old) manuscripts, since many projects have
been built using that portion of TEI.
I am not talking about the encoding issue - any entity-based
scheme can be adopted until the VMS alphabet issues are
solved (if ever) - but rather about all the rest: headers,
comments, glosses, folio info, etc. The mere fact that
the VMS would be encoded according to an established standard
(and there is no other proven standard for manuscripts but TEI
today) will be worth the effort.
I also recommend that anyone who seriously considers
working on this become a member of the TEI Consortium
(see www.tei-c.org).
- Jan
On Mon, 30 Sep 2002, Rafal T. Prinke wrote:
>
> Dear Nick,
>
> > Ummm... I've just gone through a load of the TEI docs (thankfully they were
> > nice, easy reading), and am not 100% clear on the relationship between TEI,
> > the VMS and Unicode. :-|
> >
> > In some places it seems as though all TEI final (leaf) content has to be
> > expressible in Unicode, but in others it seems that as long as your dtd can
> > translate it into Unicode, Everything Will Turn Out Fine In The End.
>
> The original TEI was SGML, so entities and character maps were the
> basis for encoding/display. The new P4 version supports XML, too
> (while still remaining SGML), so Unicode UTF-8 is the default encoding.
> However, character entities are also allowed.
>
> > I know people have debated the ins and outs of Unicode here before - that
> > as there's no evidence of a corpus of texts written in the VMS' alphabet, a
> > portion of the main Unicode coding space shouldn't really be allocated to
> > it, so it should reside in a user-defined "soft" area.
>
> Yes, I don't think the Unicode Consortium would accept
> our proposal for including the VMS alphabet - especially as
> it is not defined well enough. But the Private Use Area is
> available for specific projects - and there are also "higher planes"
> of Unicode, two of which (planes 15 and 16) are entirely
> reserved for private use. Those planes are not supported by
> software yet, however :-)
>
> > But if the point of the TEI is to make an encoding permanently accessible
> > in the future, doesn't that rely on the VMS' alphabet having a permanent
> > place in the main Unicode space?
>
> No, I don't think it is necessary at all. I was thinking
> about encoding the VMS in TEI using the transliterated
> interlinear file - rather than creating entities or Unicode code
> points for the alphabet itself. Actually, that is exactly what you
> proposed:
>
> <transcription author='RZ' date='12Aug2001'>
> qoteedy.dain
> </transcription>
>
> If needed, it can be transformed into &entities; based
> transcription later - or used to generate various
> transcriptions, depending on one's views/needs.
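
To make the "transformed into &entities; later" step concrete, here is a rough sketch in Python, not anything TEI itself prescribes. It takes the sample line quoted above and rewrites each transliterated letter as a named entity reference. The entity names (&vms.q; and so on), the `to_entities` helper, and the reading of "." as a word separator are all assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Sample copied from the quoted proposal above; it is not real TEI markup.
SAMPLE = """<transcription author='RZ' date='12Aug2001'>
qoteedy.dain
</transcription>"""


def to_entities(xml_text, prefix="vms"):
    """Rewrite each transliterated letter as a named entity reference.

    Assumptions for this sketch only: entity names look like &vms.q;
    (invented here), and "." separates words and is emitted as a space.
    """
    element = ET.fromstring(xml_text)
    words = element.text.strip().split(".")
    return " ".join(
        "".join(f"&{prefix}.{ch};" for ch in word) for word in words
    )


print(to_entities(SAMPLE))
```

The point is only that the plain transliteration loses nothing: a different mapping function could just as easily emit Private Use Area code points or another transcription alphabet from the same source file.
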
>
> TEI predates XML by almost a decade, and a great deal of
> thought and discussion from international authorities
> on all aspects of humanities texts went into it - so it
> is certainly worth considering. And the very fact of
> publishing a TEI version of the VMS may draw the attention
> of more textual scholars to it and bring the day of
> cracking it closer (this is just rhetoric...).
>
> Best regards,
>
> Rafal
>