
Re: VMs: Huffman compression (was: VMs: Declaration of WAR against EVA)

To: vms-list@xxxxxxxxxxx
Subject: Re: VMs: Huffman compression (was: VMs: Declaration of WAR against EVA)
From: Nick Pelling <incoming@xxxxxxxxxxxxxxxxx>
Date: Thu, 06 Mar 2003 16:43:31 +0000
In-reply-to: <200303060844.h268iq213531@mail3.alphalink.com.au>
Reply-to: vms-list@xxxxxxxxxxx
Sender: owner-vms-list@xxxxxxxxxxx

Hi Jacques,

The key problems with using data-compression algorithms to analyse the VMS are: (a) they produce flat stats, whereas any underlying language would have peaky stats; and (b) if they use pattern-matching (like LZ77), they find effective groups (whatever compresses best), not semantic groups.

For example, LZ77 (a commonly-used pattern-matching front-end) would quickly start finding matches above the likely semantic level of usefulness: on the first line of f1r, it would probably match ".shol" backwards within the same line (ie, including the "." as well as the "shol").
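
To make that concrete, here is a rough sketch of the backward search an LZ77-style front-end performs. The function name, the window size and the minimum match length are all made up for illustration, and the sample line is one EVA reading of the start of f1r, so treat it as an example string rather than an authoritative transcription.

```python
def longest_back_match(text, pos, window=4096, min_len=3):
    """Return (distance, length) of the longest earlier match for text[pos:].

    Brute-force LZ77-style search: try every start position in the window
    before `pos` and keep the longest run of equal characters.
    Returns None if no match of at least `min_len` characters exists.
    """
    best = None
    start = max(0, pos - window)
    for cand in range(start, pos):
        length = 0
        # Matches may run past `pos` (overlapping copies are legal in LZ77).
        while pos + length < len(text) and text[cand + length] == text[pos + length]:
            length += 1
        if length >= min_len and (best is None or length > best[1]):
            best = (pos - cand, length)
    return best


# Illustrative sample only: an EVA-style reading of the first line of f1r.
line = "fachys.ykal.ar.ataiin.shol.shory.cthres.y.kor.sholdy"
pos = line.rfind(".shol")
match = longest_back_match(line, pos)
if match is not None:
    distance, length = match
    print("copy", repr(line[pos:pos + length]), "from", distance, "chars back")
```

The point is that the match swallows the word separator along with the letters: the search only cares about which characters repeat, not about where a word might begin or end.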

For LZH, Huffman is then used as a "back-end coder", so that all the copy command parameters (effectively) get stored as compactly as possible (that's what the "LZ" and "H" in "LZH" stand for: a Lempel-Ziv front-end feeding a Huffman back-end). :-)
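
As a sketch of what the Huffman back-end does with whatever symbols the front-end hands it, the following computes static Huffman code lengths from symbol counts. The function name and the toy input are hypothetical; a real LZH coder builds separate tables for its literal/length and distance symbols, which this does not attempt.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a static Huffman code."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        # A lone symbol still needs one bit in practice.
        return {next(iter(freq)): 1}
    # Heap entries: (count, tiebreak, {symbol: depth so far}).
    # The unique tiebreak keeps the dicts from ever being compared.
    heap = [(count, i, {sym: 0}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        c1, _, d1 = heapq.heappop(heap)
        c2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {sym: depth + 1 for sym, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (c1 + c2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


# Toy input: frequent symbols end up with short codes, rare ones with long codes.
lengths = huffman_code_lengths("aaaabbc")
for sym in sorted(lengths):
    print(sym, lengths[sym])
```

Frequent symbols get short codes and rare ones get long codes, which is exactly the flattening of the stats mentioned under (a).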

Also: adaptive Huffman schemes achieve better performance (ie, smaller compressed files) by updating the distributional stats (and hence the bit-lengths) as the compression proceeds. While this is a nice feature, it too might well get in the way of understanding what's going on.
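
A small sketch of why adaptation muddies the picture. This is not a real adaptive Huffman coder (no FGK or Vitter tree updating); it simply tracks running counts and reports the ideal cost, -log2(p), of each symbol at the moment it is coded. The function name and the "every symbol starts with a count of 1" initialisation are my own assumptions.

```python
import math
from collections import Counter


def adaptive_costs(text, alphabet=None):
    """Return the ideal cost in bits of each character under running counts."""
    if alphabet is None:
        alphabet = sorted(set(text))
    # Assumption: every symbol starts with a count of 1 so nothing has zero probability.
    counts = Counter({sym: 1 for sym in alphabet})
    total = len(alphabet)
    costs = []
    for ch in text:
        costs.append(-math.log2(counts[ch] / total))
        # Update the model only after coding, as a decoder would have to.
        counts[ch] += 1
        total += 1
    return costs


for ch, bits in zip("aab", adaptive_costs("aab")):
    print(ch, round(bits, 3))
```

The same symbol costs a different number of bits depending on where it falls in the text, so there is no single table of code lengths to inspect afterwards.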

It would be interesting to try the language-comparison-by-compression-performance approach on some grouped-glyph texts: I think they would show up as being closer to known languages than VMS text in EVA (or Currier) transcription... but probably still not *too* close.
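
One crude way to run such a comparison, sketched here with Python's zlib (a DEFLATE implementation, ie LZ77 plus Huffman). The function name is hypothetical and the two sample strings are placeholders: you would substitute a transcription in each scheme plus some known-language texts of similar length, since very short inputs are dominated by zlib's fixed overhead.

```python
import zlib


def bits_per_byte(text, level=9):
    """Compressed size in bits per input byte, using zlib (DEFLATE)."""
    data = text.encode("utf-8")
    if not data:
        return 0.0
    return 8 * len(zlib.compress(data, level)) / len(data)


# Placeholder inputs only: swap in real texts of comparable length.
samples = {
    "highly repetitive": "abc" * 1000,
    "short": "abc",
}
for name, text in samples.items():
    print(name, round(bits_per_byte(text), 2))
```

Lower figures mean more redundancy was found, but as noted above, that redundancy is whatever the pattern-matcher could exploit, not necessarily anything linguistic.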

Worth trying, though. :-)

Cheers, .....Nick Pelling.....
