Help me parse Arabic text!
I'm interested in using Arabic as a language for
comparison with the VMS. However, I find myself out of my league and hope
someone can offer advice.
I found a copy of the Koran in ISO-8859-6 encoding, but tools like MONKEY
and TACT choke on the character set. I'm not sure how to do character-
and term-frequency counts on it. Is there a better tool for this job? I
want to generate Zipfian curves for comparison, after removing characters
likely to correspond to nulls in the VMS.
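
To make the request concrete, here is a rough sketch of the kind of counting I have in mind, written in Python. It assumes the file really is ISO-8859-6 throughout and that words are separated by whitespace. The function names, and the `drop` parameter for stripping candidate null characters, are made up for illustration and do not come from any existing tool.

```python
from collections import Counter


def frequency_tables(raw, drop=""):
    """Return (char_counts, word_counts) for ISO-8859-6 encoded bytes.

    `drop` is a hypothetical parameter: a string of characters to delete
    before counting, e.g. letters treated as candidate nulls.
    """
    # Strict decoding: bytes that ISO-8859-6 leaves undefined raise an error.
    text = raw.decode("iso8859_6")
    if drop:
        # Mapping a code point to None deletes that character.
        text = text.translate({ord(ch): None for ch in drop})
    words = text.split()  # assumes whitespace-delimited words
    char_counts = Counter(ch for word in words for ch in word)
    word_counts = Counter(words)
    return char_counts, word_counts


def zipf_ranks(counts):
    """Return (rank, token, frequency) tuples, most frequent token first."""
    return [
        (rank, token, freq)
        for rank, (token, freq) in enumerate(counts.most_common(), start=1)
    ]
```

The bytes would come from reading the file in binary mode. Plotting log(rank) against log(frequency) from `zipf_ranks` should then give the curve to compare, once with and once without the `drop` characters.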
Any assistance or advice would be most appreciated!
School of Information and Library Science
UNC Chapel Hill