
Re: Benchmark transcription file

To: voynich@xxxxxxxx
Subject: Re: Benchmark transcription file
From: Nick Pelling <incoming@xxxxxxxxxxxxxxxxx>
Date: Sun, 12 Aug 2001 11:48:10 +0100
In-reply-to: <3B75F107.14A07955@mail.msen.com>
References: <20010809222149.68887.qmail@web9107.mail.yahoo.com> <5.1.0.14.0.20010811090144.0269d9b0@mail.globalnet.co.uk> <5.1.0.14.0.20010811170954.0269ed10@mail.globalnet.co.uk>

Hi Bruce,

That's why I think it makes sense to have both a "variorum" version, like the EVA interlinear file, and a "benchmark" file for analysis. (Of course, if there are serious disagreements in interpretation, you could have more than one benchmark file, but something less than a different one for each analysis or analyzer would be nice.)

For a benchmark file, the EVA transcription style is well enough thought out that you can remap from it to other transcriptions without huge difficulty, so that's pretty much OK.

The interlinear file is more a set of interleaved interpretations than a set of interleaved transcriptions.

The only question, then, is to what degree (and at what stage) do we resolve ambiguous characters between those interpretations in order to create our benchmark file?

We could introduce a mark into the benchmark (like "~") to indicate "next character is ambiguous" - some scripts/filters/rules could then keep the flagged character as transcribed, while others could remap it to "*".
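As a rough sketch of how two such filters might look (Python; the function names and the sample word are purely illustrative, and the only conventions assumed are that "~" flags the character that follows it and "*" stands for an unknown character):

```python
AMBIGUOUS_MARK = "~"  # assumed marker: flags the NEXT character as ambiguous
UNKNOWN = "*"         # assumed placeholder for an unresolved character


def keep_ambiguous(text: str) -> str:
    """Drop the marks but keep the flagged characters as transcribed."""
    return text.replace(AMBIGUOUS_MARK, "")


def mask_ambiguous(text: str) -> str:
    """Replace each mark plus the character it flags with UNKNOWN."""
    out = []
    chars = iter(text)
    for ch in chars:
        if ch == AMBIGUOUS_MARK:
            # Consume the flagged character; a trailing mark flags nothing.
            if next(chars, None) is not None:
                out.append(UNKNOWN)
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    word = "qo~kedy"  # illustrative only, not taken from any real transcription
    print(keep_ambiguous(word))
    print(mask_ambiguous(word))
```

Either filter leaves the benchmark file itself untouched, so each analysis can choose how cautious to be about the flagged characters.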

Or - if we still have access to the OCR scans separated by character - we could simply vote on the most contentious ones to form a unified consensus?
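If it came to a vote, tallying it could be as simple as the sketch below (Python again). It assumes the competing readings of a line have already been aligned to the same length, which real transcriptions may well not be; the function name, the threshold and the sample readings are all made up for illustration.

```python
from collections import Counter


def consensus(readings, min_share=0.5, unknown="*"):
    """Majority vote per character position across aligned readings.

    A position becomes `unknown` unless one character gets strictly
    more than `min_share` of the votes.
    """
    if len({len(r) for r in readings}) != 1:
        raise ValueError("readings must be aligned to the same length")
    out = []
    for column in zip(*readings):
        char, count = Counter(column).most_common(1)[0]
        out.append(char if count / len(readings) > min_share else unknown)
    return "".join(out)


if __name__ == "__main__":
    # Three hypothetical readings of the same word.
    print(consensus(["daiin", "daiin", "dairn"]))
    # A tie stays unresolved.
    print(consensus(["okedy", "okeey"]))
```

Positions with no clear majority come out as "*", so the contentious characters stay visible in the result rather than being silently resolved.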

Or we could simply agree which single transcription to lock to (and just get on with it)?

Cheers, .....Nick Pelling.....

