Re: Doubled words
Hi!
Quick question on this one:
--- Jorge Stolfi <stolfi@xxxxxxxxxxxxx> wrote:
> > [Philip Neal:] If the current word is qokeey, there is a 6%
> > chance that the next word will be qokeey.
> The repetitions of "qokeey" are indeed exceptional,
>
> [...]
>
> I looked for doublets (consecutive word repeats, ignoring punctuation)
> in some of my reference texts, see the table below.
> The columns are
>
>   ndup   number of doublets in the text
>   fdup   frequency of doublets relative to num of tokens
>   topwd  the most frequent word appearing in those doublets
>   ntd    count of "topwd topwd" doublets
Here's my question: are ndup and fdup based
on the sum over all words, or for the most
commonly reduplicated word only?
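To make the two readings concrete, here is a minimal sketch of how such a count could be done. It is only my guess at the procedure, not Jorge's actual script: the function name doublet_stats, the whitespace tokenization, and the extra "ftop" value (frequency of the top word's doublets alone) are all assumptions of mine.

```python
import re
from collections import Counter

def doublet_stats(text):
    """Count consecutive word repeats (doublets), ignoring punctuation.

    Assumption: tokens are whitespace-separated and case is ignored.
    """
    # Strip punctuation from each token and drop tokens that end up empty.
    tokens = [w for w in (re.sub(r"[^\w]", "", t) for t in text.lower().split()) if w]
    # Count every position where a token equals the one that follows it.
    pairs = Counter(a for a, b in zip(tokens, tokens[1:]) if a == b)
    ndup = sum(pairs.values())                       # summed over all words
    topwd, ntd = pairs.most_common(1)[0] if pairs else (None, 0)
    n = len(tokens)
    return {
        "ndup": ndup,
        "fdup": ndup / n if n else 0.0,              # reading 1: all doublets
        "topwd": topwd,
        "ntd": ntd,
        "ftop": ntd / n if n else 0.0,               # reading 2: top word only
    }
```

Under the first reading fdup is ndup divided by the token count; under the second it would be what I call ftop here.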
> sample   language   book                    ndup   fdup topwd      ntd
> -------- ---------- ----------------------- ---- ------ ---------- ---
> chin/red Mandarin   Dream_of_Red_Mansion     351 .01002 lao3 (*)    44
So: what would fdup for the most commonly reduplicated
word be?
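If fdup really is ndup divided by the number of tokens (that is exactly the assumption I am asking about), the figure for the top word can be backed out from the row above. A small sketch of the arithmetic, with a made-up function name:

```python
def topwd_frequency(ndup, fdup, ntd):
    # Assumption: fdup = ndup / tokens, so the token count is ndup / fdup.
    tokens = ndup / fdup
    # Frequency of "topwd topwd" doublets relative to the number of tokens.
    return ntd / tokens

# Values from the chin/red row: ndup = 351, fdup = .01002, ntd = 44
print(round(topwd_frequency(351, 0.01002, 44), 5))  # 0.00126
```

That would put "lao3 lao3" at about 0.13% of tokens, but only if the assumption holds.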
By the way, what is table-guessed Pinyin? :-)
Cheers, Rene