VMs: VMS & XML...?
At 06:56 30/09/02 -0700, you wrote:
In the end, it depends on how one wants to use a
transcription file. The XML can be generated
automatically from a straight text file,
just like GC's PDF file is a product of some tool
that reads something from either an ASCII file
or some kind of database table.
The requirements coming from the desire to
visualise the page in VMs-lookalike form
are completely different from the requirements
to allow automated text processing.
So it is just my old-fashioned opinion that the
'meat' is the transcribed text and the file format
This is more or less what I replied to John Grove last year: that XML
didn't seem to serve a purpose for our existing set of transcriptions.
More directly useful would be things like having a central tool repository
(for entropy functions, etc), plus a standard set of (say) Perl regexp
filters... and an agreed single transcription (for a given set of glyphs).
However, this last part is perhaps the most contentious... :-)
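
To make the "tool repository" idea a bit more concrete, here is a rough sketch of what one shared filter and one shared entropy function might look like. It is written in Python rather than Perl purely for illustration. The function names are made up, and the filter rules (comments in {braces}, '.' and ',' as word separators) are assumptions about the input file rather than an agreed standard.

```python
import math
import re
from collections import Counter


def normalise(line: str) -> str:
    """Hypothetical clean-up filter for one transcription line.

    Assumes inline comments sit in {braces} and that '.' and ','
    mark word breaks; both are illustrative conventions only.
    """
    line = re.sub(r"\{[^}]*\}", "", line)   # drop {comments}
    line = re.sub(r"[.,]+", " ", line)      # separators -> spaces
    return " ".join(line.split())           # collapse whitespace


def char_entropy(text: str) -> float:
    """First-order Shannon entropy, in bits per character."""
    counts = Counter(text)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

With filters like these kept in one place, everybody's statistics would at least start from the same cleaned-up text, whichever transcription they prefer.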
Unless I'm misunderstanding GC, part of what he's suggesting involves many
parties trying to collaborate so as to agree a consensus transcription, as
well as to achieve some kind of reasonable modern consensus about the core
alphabet, as opposed to the core stroke-set (which EVA would seem - with
only a few minor differences of opinion - to fully embody).
XML would seem to be a good candidate for a bridging "glue" technology to
bind that kind of thing together.
Collaboration can be easy... *if* the collaborating parties have a
well-defined shared purpose. AIUI, GC's intent here isn't to transcribe the
VMS' strokes, but rather to transcribe the VMS' letters, in a glyph-based
way, and to open out a wide ongoing dialogue about the connection between
stroke + glyph.
The problem with the item 'word' is that it is
not uniquely defined. Since lines are not too
long, I am in favour of keeping that as the
Fair enough: something like...
<?xml version='1.0' encoding='windows-1252' standalone='yes'?>
<page name='f0r' hand='A' section='astro'>
<transcription author='GC' date='09Sep2002'>4o8cc89
...which is not so broadly different from the existing interlinear
transcription set. :-)
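
As a sketch of how a record like the one above could be generated automatically from plain text, something along these lines would do. It uses Python's standard xml.etree.ElementTree module. The element and attribute names are simply the ones from the fragment above, the function name is hypothetical, and none of this is an agreed schema.

```python
import xml.etree.ElementTree as ET


def line_to_xml(page, hand, section, author, date, text):
    """Wrap one transcribed line in the page/transcription layout
    sketched above (illustrative layout only, not a fixed schema)."""
    page_el = ET.Element(
        "page", {"name": page, "hand": hand, "section": section}
    )
    tr_el = ET.SubElement(
        page_el, "transcription", {"author": author, "date": date}
    )
    tr_el.text = text
    # encoding="unicode" returns a str instead of encoded bytes
    return ET.tostring(page_el, encoding="unicode")


xml_record = line_to_xml("f0r", "A", "astro", "GC", "09Sep2002", "4o8cc89")
```

The result is a single well-formed page element with the transcription nested inside it, so the plain text file stays the master copy and the XML is just a derived view.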
But within such a broadly collaborative exercise, alternate
transcriptions for each line can be built up within the underlying database
from the contributions of all participants. Votes for and against different
transcriptions could even be included. Just a thought. :-)
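
For what it's worth, the underlying records might be as simple as the sketch below. The class and field names are invented for illustration, and "most votes for minus votes against" is only one possible way of picking a consensus reading.

```python
from dataclasses import dataclass, field


@dataclass
class Reading:
    """One participant's transcription of a single line."""
    author: str
    text: str
    votes_for: int = 0
    votes_against: int = 0

    @property
    def score(self) -> int:
        return self.votes_for - self.votes_against


@dataclass
class LineRecord:
    """All alternate readings collected for one line of one page."""
    page: str
    line_no: int
    readings: list = field(default_factory=list)

    def add(self, author: str, text: str) -> Reading:
        reading = Reading(author, text)
        self.readings.append(reading)
        return reading

    def consensus(self):
        """Highest-scoring reading, or None if there are none yet.
        Ties go to whichever reading was added first."""
        return max(self.readings, key=lambda r: r.score, default=None)


record = LineRecord("f0r", 1)
first = record.add("GC", "4o8cc89")
second = record.add("reader2", "4o8ccc9")  # made-up alternate reading
first.votes_for = 2
second.votes_for = 1
second.votes_against = 1
```

Here record.consensus() returns the first reading, but the alternate stays in the record, so nobody's contribution is thrown away just because it was outvoted.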
Cheers, .....Nick Pelling.....