
VMs: VMS & XML...?

Hi Rene,

At 06:56 30/09/02 -0700, you wrote:
In the end, it depends on how one wants to use a
transcription file. The XML can be generated
automatically from a straight text file,
just like GC's PDF file is a product of some tool
that reads something from either an ASCII file
or some kind of database table.
The requirements coming from the desire to
visualise the page in VMs-lookalike form
are completely different from the requirements
to allow automated text processing.
So it is just my old-fashioned opinion that the
'meat' is the transcribed text and the file format
is 'dressing'.

This is more or less what I replied to John Grove last year: that XML didn't seem to serve a purpose for our existing set of transcriptions.

More directly useful would be things like having a central tool repository (for entropy functions, etc), plus a standard set of (say) Perl regexp filters... and an agreed single transcription (for a given set of glyphs). However, this last part is perhaps the most contentious... :-)
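As a rough idea of the kind of small shared function such a repository might hold, here is a minimal sketch of a first-order (single-character) entropy measure in Python. The function name, and the choice to ignore '.', spaces and newlines as word separators, are my own assumptions for illustration rather than anything agreed on the list.

```python
import math
from collections import Counter

def char_entropy(text, ignore=". \n"):
    """First-order Shannon entropy, in bits per character, of a transcription string.

    Characters listed in `ignore` (assumed here to be word separators) are skipped.
    """
    symbols = [c for c in text if c not in ignore]
    if not symbols:
        return 0.0
    counts = Counter(symbols)
    total = len(symbols)
    # H = -sum(p * log2(p)) over the observed symbol frequencies
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Note that the figure depends entirely on which transcription alphabet the text is in, which is one more reason an agreed transcription matters.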

Unless I'm misunderstanding GC, part of what he's suggesting involves many parties trying to collaborate so as to agree a consensus transcription, as well as to achieve some kind of reasonable modern consensus about the core alphabet, as opposed to the core stroke-set (which EVA would seem - with only a few minor differences of opinion - to fully embody).

XML would seem to be a good candidate for a bridging "glue" technology to bind that kind of thing together.

Collaboration can be easy... *if* the collaborating parties have a well-defined shared purpose. AIUI, GC's intent here isn't to transcribe the VMS' strokes, but rather to transcribe the VMS' letters, in a glyph-based way, and to open out a wide ongoing dialogue about the connection between stroke + glyph.

The problem with the item 'word' is that it is
not uniquely defined. Since lines are not too
long, I am in favour of keeping that as the
smallest unit.

Fair enough: something like...

<?xml version='1.0' encoding='windows-1252' standalone='yes'?>
<page name='f0r' hand='A' section='astro'>
<line name='X.01'>
<transcription author='GC' date='09Sep2002'>4o8cc89 8am</transcription>
<transcription author='RZ' date='12Aug2001'>qoteedy.dain</transcription>
</line>
</page>

...which is not so broadly different from the existing interlinear transcription set. :-)
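To illustrate Rene's point that the XML can be generated automatically from simpler data, here is a minimal Python sketch that builds the same page/line/transcription layout as the fragment above, using the standard library's xml.etree.ElementTree. The build_page function and the tuple layout of its input are hypothetical; only the element and attribute names come from the fragment.

```python
import xml.etree.ElementTree as ET

def build_page(name, hand, section, lines):
    """Build a <page> element from (line_name, [(author, date, text), ...]) tuples."""
    page = ET.Element("page", name=name, hand=hand, section=section)
    for line_name, readings in lines:
        line = ET.SubElement(page, "line", name=line_name)
        for author, date, text in readings:
            t = ET.SubElement(line, "transcription", author=author, date=date)
            t.text = text
    return page

# The two readings from the fragment above, as plain data.
page = build_page("f0r", "A", "astro", [
    ("X.01", [
        ("GC", "09Sep2002", "4o8cc89 8am"),
        ("RZ", "12Aug2001", "qoteedy.dain"),
    ]),
])
xml_text = ET.tostring(page, encoding="unicode")
```

The same records could just as easily be written back out as interlinear text, which is really the point: the data is the meat, and the file format is dressing.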

But within such a broadly collaborative exercise, alternative transcriptions for each line could be built up within the underlying database from the contributions of all participants. Votes for and against different transcriptions could even be included. Just a thought. :-)
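Purely as a sketch of how per-line voting might be tallied, here is a small Python example. The LineVotes class and its method names are hypothetical, and a real database would presumably also record who voted and when.

```python
from collections import defaultdict

class LineVotes:
    """Net vote tally for the alternative readings of a single line."""

    def __init__(self):
        self._scores = defaultdict(int)

    def vote(self, reading, delta=1):
        # delta=+1 is a vote for the reading, delta=-1 a vote against it
        self._scores[reading] += delta

    def ranked(self):
        # Highest net score first; ties are broken alphabetically by reading
        return sorted(self._scores.items(), key=lambda kv: (-kv[1], kv[0]))

votes = LineVotes()
votes.vote("qoteedy.dain")
votes.vote("qoteedy.dain")
votes.vote("4o8cc89 8am")
votes.vote("4o8cc89 8am", -1)
```

Here votes.ranked() would list "qoteedy.dain" first with a net score of 2, followed by "4o8cc89 8am" with a net score of 0. In practice, readings in different alphabets would first need mapping to a common form before their votes could sensibly be compared.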

Cheers, .....Nick Pelling.....