Re: Consensus transcription?



    > I recall seeing someone refer to a transcription based on majority vote 
    > of the collated existing transcriptions, but can't seem to find it. I'm 
    > revisiting finite state automaton induction on the "words" in the Biological
    > B folios, and want to have the best transcript possible.

I have a "majority vote" and a "consensus" version of the interlinear file.
I thought I had mentioned it here, but I notice that it is not listed 
in my Voynich pages.  Anyway, here it is:

http://www.dcc.unicamp.br/~stolfi/voynich/Notes/045/inter-cm.evt

It is in EVA encoding, basically in the EVMT format used by Gabriel
and Rene. The majority version is marked with transcriber code ";A>",
the consensus one with ";Y>". You should be able to use Rene's VTT
tool to extract the one you need.
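If VTT is not at hand, a few lines of Python can pull out one version.
This is only a rough sketch, not a description of how VTT works. It
assumes each text line starts with a locus tag of the form
"<locus;X>" (X being the one-letter transcriber code) followed by the
EVA text; the sample locus "f75r.P1.1" and the function name
extract_version are made up for illustration.

```python
import re

# Assumed line layout: "<locus;X>   text", where X is the transcriber code.
LOCUS = re.compile(r"^<([^;<>]+);(\w)>\s*(.*)$")

def extract_version(lines, code):
    """Return (locus, text) pairs for the lines tagged with the given code."""
    selected = []
    for line in lines:
        match = LOCUS.match(line.rstrip())
        # Comments and page headers have no ";X>" tag, so they do not match.
        if match and match.group(2) == code:
            selected.append((match.group(1), match.group(3)))
    return selected

if __name__ == "__main__":
    sample = [
        "# a comment line",
        "<f75r.P1.1;H>  kchedy.qokain",
        "<f75r.P1.1;A>  kchedy.qokaiin",
    ]
    for locus, text in extract_version(sample, "A"):
        print(locus, text)
```

To run it on the real file, pass an open file handle in place of the
sample list.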

You should also consider using Takeshi's version (code ";H>"), which
is the only complete one so far.

I used simple majority for the "A" version. Perhaps I should have
given different weights to different transcribers. But that would
not have been easy: a transcriber's reliability seems to vary with
page and character. (For instance, Currier often disagrees with
Friedman/D'Imperio on "i" vs "ii".) That may be another reason to use
Takeshi's version...
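
For what it is worth, the voting idea can be sketched in Python as
follows. This is an illustration under stated assumptions, not the
script that produced the "A" version: it assumes the readings of a
line are already aligned to the same length, takes the most frequent
character at each position, and writes a placeholder ("?", chosen
arbitrarily here) when the top candidates tie. The optional weights
argument and the function name majority_vote are hypothetical.

```python
from collections import Counter

def majority_vote(readings, weights=None, unknown="?"):
    """Combine aligned readings {transcriber_code: text} position by position."""
    if not readings:
        return ""
    texts = list(readings.values())
    length = len(texts[0])
    if any(len(t) != length for t in texts):
        raise ValueError("readings must be aligned to the same length")
    weights = weights or {}
    result = []
    for i in range(length):
        tally = Counter()
        for code, text in readings.items():
            # Each transcriber counts as 1 unless a weight is supplied.
            tally[text[i]] += weights.get(code, 1.0)
        ranked = tally.most_common(2)
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            result.append(unknown)  # tie between the top two candidates
        else:
            result.append(ranked[0][0])
    return "".join(result)

if __name__ == "__main__":
    print(majority_vote({"C": "daiin", "F": "dairn", "H": "daiin"}))
```

A single weight per transcriber, as above, cannot capture reliability
that varies with page and character; that would need weights indexed
by those as well.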

All the best,

--stolfi