Re: Benchmark transcription file

Hi Bruce,

At 11:00 11/08/01 -0400, Bruce Grant wrote:
It seems to me that it would be nice to have a sort of "benchmark transcription"
of the VMS extracted from the EVA transcription which would have these
characteristics: </snip>

Is my mind really that easy to read? :-)

Even though such a file would contain less information than the EVA, it would be
useful for the various statistical tests that people do, making these
tests directly comparable and repeatable, and allowing the group to accumulate a
set of consistent and useable statistics over time.

Each transcription system has its merits and demerits, and as we have yet to agree on decoding a single letter in any of them, we should be careful not to close doors if we can help it. :-/

Also: until we have high-quality digitised images of the VMS that we can (for example) run through enhancement filters, a significant proportion of quill-strokes are too indistinct to categorise definitively. And if the VMS was copied by someone who didn't understand it, a definitive categorisation may always remain beyond our reach. :-/

However, in the context of opening out the statistical analyses to a wider audience, locking to a single transcription text (while also building filters/maps to transform it into other spaces) seems like a good idea, at least for a while.
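As a rough sketch of what one such filter/map might look like: the Python below rewrites a transcription string through a lookup table, trying the longest source glyphs first so that multi-character groups are not split. The table EXAMPLE_MAP and the function name transform are made up for illustration; the target symbols are placeholders, not a real mapping between any two transcription alphabets.

```python
# Hypothetical table: the target symbols are placeholders only,
# not a real mapping between transcription alphabets.
EXAMPLE_MAP = {"ch": "1", "sh": "2", "d": "3"}


def transform(text, mapping):
    """Rewrite text using mapping, preferring the longest matching key."""
    keys = sorted(mapping, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(mapping[key])
                i += len(key)
                break
        else:
            # No key matches here: pass the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(transform("chod.sho", EXAMPLE_MAP))
```

Keeping the locked text as the only stored copy and deriving every other form through a table like this would mean a correction to the base text propagates to all the derived versions.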

Perhaps a later iteration could merge the various interleaved lines into a single line, and assign a fuzzy probability to each possible reading? Just an idea. :-)
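To make the idea a little more concrete, here is a minimal sketch in Python. It assumes the alternative readings of a line have already been aligned character by character and are the same length (real interleaved transcriptions may well not be), and it simply treats each transcriber's vote as equally weighted. The function name merge_readings and the sample readings are invented for illustration.

```python
from collections import Counter


def merge_readings(readings):
    """Merge aligned readings into one line of per-position probabilities.

    Assumes every reading has the same length and is already aligned;
    each reading counts as one equally weighted vote.
    """
    if len({len(r) for r in readings}) != 1:
        raise ValueError("readings must be aligned to the same length")
    merged = []
    for column in zip(*readings):
        counts = Counter(column)
        total = len(column)
        merged.append({glyph: n / total for glyph, n in counts.items()})
    return merged


# Invented sample: four readings of the same word.
for position in merge_readings(["daiin", "daiin", "doiin", "daiir"]):
    print(position)
```

Relative vote counts are only the crudest stand-in for a "fuzzy probability"; weighting by transcriber, or by how legible the stroke is, would be an obvious refinement.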

Cheers, .....Nick Pelling.....