Re: VMs: 'Zip' language detection
4/12/03 5:42:13 PM, ajb@xxxxxxxxxxxxxxxxxx wrote:
>This week's New Scientist refers to a technique for language
>detection using 'zip' compression.
>http://www.newscientist.com/news/news.jsp?id=ns99993602
I had leafed through it at the newsagent's. I decided to
keep my money and spend it instead on a 6-pack of beer.
There is nothing new there. It is just another way of
counting the frequencies of digraphs, trigraphs, and so
on, in two different texts. If they match closely, then
those two texts are probably in the same language.
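It fails lamentably when you have exactly the same text,
one plain, the other in a simple-substitution cipher. The
cipher relabels every letter, so the digraph and trigraph
counts of the two texts no longer line up, even though the
underlying language is identical.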
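For anyone who wants to try this at home, here is a rough
Python sketch of the idea as I understand it. It is not taken
from the article: I am assuming the method amounts to
compressing a reference text, then the reference with the
sample appended, and looking at how many extra bytes the
sample costs. zlib stands in for 'zip', ROT13 stands in for
the simple-substitution cipher, and the names extra_bytes,
REFERENCE and ROT13 are my own.

```python
import string
import zlib


def compressed_size(text: str) -> int:
    """Size in bytes of the zlib-compressed UTF-8 encoding of text."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def extra_bytes(reference: str, sample: str) -> int:
    """Bytes the sample adds when compressed right after the reference.

    A small value means the compressor found the sample's letter
    sequences already present in the reference.
    """
    return compressed_size(reference + sample) - compressed_size(reference)


# ROT13 used as a stand-in for a simple-substitution cipher (assumption:
# any fixed letter-for-letter substitution would behave the same way).
ROT13 = str.maketrans(
    string.ascii_lowercase + string.ascii_uppercase,
    string.ascii_lowercase[13:] + string.ascii_lowercase[:13]
    + string.ascii_uppercase[13:] + string.ascii_uppercase[:13],
)

# Made-up reference text, purely for illustration.
REFERENCE = (
    "The quick way to guess the language of a text is to compare it "
    "with samples whose language is already known. Common letter pairs "
    "and triples differ a great deal from one language to another, and "
    "a general purpose compressor picks up on those repeated patterns "
    "without being told anything about grammar or spelling. "
)

# Exactly the same text, once plain and once enciphered.
SAMPLE = REFERENCE
plain_cost = extra_bytes(REFERENCE, SAMPLE)
cipher_cost = extra_bytes(REFERENCE, SAMPLE.translate(ROT13))

print("plain sample costs   ", plain_cost, "extra bytes")
print("ciphered sample costs", cipher_cost, "extra bytes")
```

The plain copy should cost next to nothing, because the
compressor just points back at the reference. The enciphered
copy should cost far more, although it is the same text in the
same language, which is exactly the failure described above.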
Actually... I am lying. I didn't spend it on a 6-pack
of beer. I spent it on one fifth of a 30-pack (which in
this country costs about one half of what you would pay
if you bought them separately). Still, keep your money,
and don't waste your time on it.