
VMs: Bitrans, entropy, etc



Hi Gabriel,

At 10:10 03/07/2003 +0100, Gabriel wrote:
On Wednesday 02 July 2003 22:09, steve ekwall wrote:
> IF every gallows character was a "paired character" and the lower case
> 'pointers' were just _'single' coded characters_ how would that
> affect/effect its overall entropy for European languages?
>
> I don't have the means to run this Ratio/Percentages question...

I think that this would lower the entropy even more, because there are only a
few right-hand gallows halves to choose from once you encounter the left half.
The Frogguy alphabet codes the gallows like that. You can see the entropies of
most transliterations here (near the bottom of the page):

http://web.bham.ac.uk/G.Landini/evmt/commas.htm

I interpreted Steve to mean "if every gallows character that could be obviously paired with an adjacent character [like EVA <ot>, etc] was treated as a pair (with everything else treated as a normal character), how would that affect the overall entropy?"


Your page graphs character count against entropy per character: for a (fully or partially) pairified text, the corresponding character count would be lower, since each pair collapses into a single symbol, and hence the entropy per character would be higher. In fact, a partially pairified text would seem to be one of the most parsimonious explanations for the VMS's lower entropy.
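For anyone who wants to run the comparison themselves, here is a minimal Python sketch (not something from Gabriel's page). It measures entropy per symbol before and after pairing. The function names, the toy sample and the three pairs are all made up for illustration. On a sample this small the result can go either way, so treat it as a way to run the numbers on a real transcription, not as evidence in itself.

```python
import math
from collections import Counter


def entropy(symbols):
    """First-order Shannon entropy, in bits per symbol."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conditional_entropy(symbols):
    """Entropy of a symbol given the one before it: H(pair) - H(first of pair)."""
    pairs = list(zip(symbols, symbols[1:]))
    return entropy(pairs) - entropy([first for first, _ in pairs])


# Toy EVA-like sample, invented for illustration (not real VMS text).
sample = "qokeedy qokedy otedy okedy chedy shedy"

# Hypothetical pairing: treat "ot", "ok" and "dy" as single symbols.
paired = sample.replace("ot", "G").replace("ok", "I").replace("dy", "K")

for label, text in (("plain", sample), ("paired", paired)):
    print(label, len(text), round(entropy(text), 3),
          round(conditional_entropy(text), 3))
```

The conditional figure is included because pairing mainly removes predictable transitions between adjacent characters, which the first-order figure alone does not show.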

To date, I haven't used Bitrans (I've been homebrewing in Perl and JavaScript, shame on me), but I guess a pairifying Bitrans transliteration script (taking a fully lower-case EVA transcription as input) would probably look something like this:

#=~
<(comment)> <(comment)>
{(comment)} {(comment)}
#(comment) #(comment)
cth A
cph B
ckh C
cfh D
ch F
sh E
ot G
op H
ok I
of J
dy K
ee L
ol M
ar N
or O
od P
qo Q
al R
cc S
iiii 4
iii 3
ii 2
i 1
[[[[...with everything else passing through unchanged]]]]
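For those without Bitrans to hand, the same table can be sketched in Python. This is only an approximation under stated assumptions: I do not know Bitrans's actual matching rules, so the sketch scans left to right and, at each position, tries the longest source string first. The name `pairify` is made up, and the three comment pass-through rules at the top of the table are left out, so it expects bare transcription text.

```python
# Pair table copied from the rules above: (EVA source, single-symbol output).
RULES = [
    ("cth", "A"), ("cph", "B"), ("ckh", "C"), ("cfh", "D"),
    ("ch", "F"), ("sh", "E"),
    ("ot", "G"), ("op", "H"), ("ok", "I"), ("of", "J"),
    ("dy", "K"), ("ee", "L"), ("ol", "M"), ("ar", "N"), ("or", "O"),
    ("od", "P"), ("qo", "Q"), ("al", "R"), ("cc", "S"),
    ("iiii", "4"), ("iii", "3"), ("ii", "2"), ("i", "1"),
]


def pairify(text, rules=RULES):
    """Replace known pairs with single symbols; pass everything else through."""
    # Assumption: longest source first, so "cth" wins over "ch" and "iii" over "ii".
    ordered = sorted(rules, key=lambda rule: len(rule[0]), reverse=True)
    out = []
    pos = 0
    while pos < len(text):
        for src, dst in ordered:
            if text.startswith(src, pos):
                out.append(dst)
                pos += len(src)
                break
        else:
            # No rule starts here: copy the character unchanged.
            out.append(text[pos])
            pos += 1
    return "".join(out)


print(pairify("qokeedy chedy daiin otcthy"))
```

Note that a left-to-right scan turns "qok" into "Q" plus "k" and not "q" plus "I"; whether that is the pairing you want is a judgment call about the script, not something the table settles.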

Cheers, .....Nick Pelling.....

