
VMs: Re: VMS & XML...?



Hi Rafal,

What I suggest is using the Text Encoding Initiative (TEI)
Guidelines, which are the de facto standard for encoding
texts in the humanities (and the VMS is such a text). They are
very flexible and ensure future data sharing with others, as
well as support from future text-analysis software.

The V4 Guidelines sprawl out from here: http://www.tei-c.org/Guidelines2/index.html

If my suggestion is accepted, I may prepare a suggested
subset of TEI tags for encoding the VMS interlinear file.

Ummm... I've just gone through a load of the TEI docs (thankfully they were nice, easy reading), and am not 100% clear on the relationship between TEI, the VMS and Unicode. :-|


In some places it seems as though all final (leaf) content in a TEI document has to be expressible in Unicode, but in others it seems that as long as your DTD can translate it into Unicode, Everything Will Turn Out Fine In The End.
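
To make the second reading concrete, here is a minimal sketch of what "the DTD translates it into Unicode" could look like in practice. It assumes an entity declared in the document's internal DTD subset; the entity name (vmsGlyphA), the element name (line) and the code point (U+E000) are all made up for illustration and come from neither the TEI Guidelines nor any agreed VMS transcription.

```python
import xml.etree.ElementTree as ET

# Hypothetical entity name and code point, chosen for illustration only.
# The internal DTD subset declares the entity; the parser expands it.
DOC = """<!DOCTYPE line [
  <!ENTITY vmsGlyphA "&#xE000;">
]>
<line n="1">&vmsGlyphA;&vmsGlyphA;</line>"""


def leaf_text(xml_source: str) -> str:
    """Parse the document and return the root element's text,
    with any internally declared entities already expanded."""
    return ET.fromstring(xml_source).text


if __name__ == "__main__":
    text = leaf_text(DOC)
    # Show which code points the entity references turned into.
    print([hex(ord(ch)) for ch in text])
```

If that is how it works, the transcription itself only ever names the glyph, and the choice of code point lives in one declaration that could be changed later.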

I know people have debated the ins and outs of Unicode here before: since there's no evidence of a corpus of texts written in the VMS' alphabet, a portion of the main Unicode coding space shouldn't really be allocated to it, so it should instead reside in a user-defined "soft" area.
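
As a rough sketch of what living in a "soft" area might mean, assuming that area is Unicode's Private Use Area (U+E000 to U+F8FF in the Basic Multilingual Plane): the glyph labels below are hypothetical placeholders, not an agreed transcription alphabet, and the assignment order is arbitrary.

```python
import unicodedata

# First code point of the BMP Private Use Area (U+E000..U+F8FF).
PUA_START = 0xE000

# Hypothetical glyph labels, for illustration only.
GLYPH_LABELS = ["glyph_a", "glyph_b", "glyph_c"]


def assign_private_use(labels):
    """Map each label to consecutive Private Use code points."""
    return {label: chr(PUA_START + i) for i, label in enumerate(labels)}


def is_private_use(ch):
    """True if Unicode classifies the character as Private Use (category 'Co')."""
    return unicodedata.category(ch) == "Co"


if __name__ == "__main__":
    table = assign_private_use(GLYPH_LABELS)
    for label, ch in table.items():
        print(label, hex(ord(ch)), is_private_use(ch))
```

The catch is that such a table is purely a private agreement: the standard itself attaches no meaning to these code points, so the mapping has to travel with the text.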

But if the point of the TEI is to make an encoding permanently accessible in the future, doesn't that rely on the VMS' alphabet having a permanent place in the main Unicode space?

Hopefully I haven't missed the point too much... :-)

Cheers, .....Nick Pelling.....