Paragraph-initial <q>
> [Gabriel:] Here is the list of paragraph-initial-<q> words
Quite intriguing... Here is the breakdown by section, with the
corresponding number of plain (non-label) text tokens in brackets:
hea.1 [6866]
> <f10v.4> qotchytor.shoiin.daiin...
> <f23r.4> qokoldy.okaiir.ykaiil,g...
> <f45v.5> qotol.choiin.okchar...
hea.2 [868] (f87r+v,f90r1-v1,f93r+v,f96r+v)
heb.1 [2901]
> <f34r.5> qoteedy.shedy.shedy....
> <f37v.8> qotor.choiin.chetchy...
heb.2 [557] (f94r-f95v1)
zod.1 [1010]
cos.1 [185] (f57v)
cos.2 [1491] (f67r1-f70r2)
cos.3 [884] (f85r2,f86v4,f85v2,f86v3)
bio.1 [6828]
> <f76v.37> qoeedy.lchedy.cheeb...
> <f77v.1> qetedy.shedy.qotol...
> <f78v.20> qofcheol.opchedy.qokain...
> <f82r.1> qosheedy.qokeol.daiin...
> <f82v.29> qody.shar.a(ith)y...
> <f83r.25> qokeedy.qolchey.qokeey...
> <f83v.19> qokeed.qokaiin.sheolkain...
> <f84r.10> qotchsdy.ykeedy.qokal...
pha.1 [926]
> <f89r1.1> qoar.shar.qopcholy...
> <f89r2.1> qokcheody.cheodal.dair...
pha.2 [1426]
> <f99v.9> qokeeoy.chokal.qokeeo...
str.1 [755] (f58r+v)
str.2 [10768]
> <f103r.27> qokechy.okeey.qokeey...
> <f103r.33> qokeey.chechy.qokey...
> <f103r.35> qokeear.chain.olain...
> <f103r.40> qokeey.sheeol.shckhy...
> <f103r.45> qokeedy.qokeedy.shol...
> <f103v.17> qokeedy.chedy.qoteey...
> <f108r.21> qolshy.qoeedy.lkeal,shedy...
> <f108v.23> qokeeor.okeey.qoeey...
> <f111r.13> qosheo.lchdy.lshedy...
> <f111r.20> qokeey.qokeey.lchedy...
> <f111v.16> qokain.sheol.qokain...
> <f111v.23> qokaiin.sheckhy.qokar...
> <f112r.23> qoain.qoiin.olcheedy...
> <f116r.24> qokedy.okain.chcthy...
unk.1 [213] (f1r)
unk.2 [140] (f49v)
> <f49v.18> qotcho.cheol.chol,s...
unk.3 [47] (f65r+v)
unk.4 [302] (f66r)
unk.5 [342] (f85r1)
unk.6 [489] (f86v6)
unk.7 [387] (f85v5)
I should have listed the number of paragraphs in each section, rather
than the number of tokens; but I don't have that statistic handy.
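Anyway: in the stars and bio sections, we see a little more than one
paragraph-initial <q> every 800 tokens (22 in the 17596 tokens of str.2
and bio.1). The same ratio seems to hold for the pharma section (3 in
2352), modulo statistical error. The phenomenon is definitely scarce in
the herbal sections (5 in 11192 tokens), and no instances at all are
listed for the cosmo and zodiac sections.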
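
For anyone who wants to redo the arithmetic, here is a minimal sketch in Python. The token counts are the bracketed figures above, and the hit counts are simply the number of quoted lines under each section, so they assume that Gabriel's list is complete. The grouping of subsections under each label, and the names SECTIONS, GROUPS, and tokens_per_hit, are my own choices for illustration.

```python
# (plain-text tokens, paragraph-initial <q> lines) per section, transcribed
# from the breakdown above. Sections with no quoted lines get 0 hits.
SECTIONS = {
    "hea.1": (6866, 3), "hea.2": (868, 0),
    "heb.1": (2901, 2), "heb.2": (557, 0),
    "zod.1": (1010, 0),
    "cos.1": (185, 0), "cos.2": (1491, 0), "cos.3": (884, 0),
    "bio.1": (6828, 8),
    "pha.1": (926, 2), "pha.2": (1426, 1),
    "str.1": (755, 0), "str.2": (10768, 14),
    "unk.1": (213, 0), "unk.2": (140, 1), "unk.3": (47, 0),
    "unk.4": (302, 0), "unk.5": (342, 0), "unk.6": (489, 0),
    "unk.7": (387, 0),
}

# Assumed groupings: which subsections belong under each label is a guess.
GROUPS = {
    "stars+bio": ["str.2", "bio.1"],
    "pharma": ["pha.1", "pha.2"],
    "herbal": ["hea.1", "hea.2", "heb.1", "heb.2"],
    "cosmo": ["cos.1", "cos.2", "cos.3"],
    "zodiac": ["zod.1"],
}


def tokens_per_hit(names):
    """Pool the named sections; return (tokens, hits, tokens per hit or None)."""
    tokens = sum(SECTIONS[n][0] for n in names)
    hits = sum(SECTIONS[n][1] for n in names)
    return tokens, hits, (tokens / hits if hits else None)


for label, names in GROUPS.items():
    tokens, hits, ratio = tokens_per_hit(names)
    shown = f"one per {ratio:.0f} tokens" if ratio else "none listed"
    print(f"{label:10s} {hits:3d} in {tokens:6d} tokens: {shown}")
```

With so few hits per section the ratios are only rough; a count per paragraph, as noted above, would be the better statistic.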
Note that the sections rich in paragraph-initial <q> are precisely those
where each topic is likely to span multiple paragraphs. So this datum too
is consistent with the theory that <q> = "and".
Incidentally, the rule against starting a sentence with "and" is a
modern academicism; the very fact that the rule has to be taught
means that otherwise people would do it naturally, all the time.
And not just in the Middle Ages...
All the best,
--stolfi