Data Compression White Papers

Comparison of Width-wise and Length-wise Language Model Compression

Overview
In this paper we investigate the extent to which Katz back-off language models can be compressed through a combination of parameter quantization (width-wise compression) and parameter pruning (length-wise compression) while preserving performance.

We compare the compression and performance achieved using entropy-based pruning against those achieved using only parameter quantization. We then compare combinations of both methods. We show that a broadcast news language model can be compressed by up to 83%, to only 12.6Mb, with no loss in performance on a broadcast news task. Compressing the language model further by quantization, to 10.3Mb, resulted in only a 0.4% degradation in word error rate, which is better than can be achieved through entropy-based pruning alone.
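The distinction between the two kinds of compression can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the uniform quantizer, and the per-n-gram "score" used for pruning are assumptions made for illustration only. In real entropy-based pruning the score would be derived from the change in relative entropy caused by removing an n-gram.

```python
# Illustrative sketch only; not the method or code from the white paper.

def quantize(values, bits):
    """Width-wise: store each parameter as an integer code of `bits` bits.

    Uses a simple uniform quantizer (an assumption; the paper may use
    a different scheme). Returns (codes, lo, step).
    """
    levels = 2 ** bits
    lo, hi = min(values), max(values)
    step = (hi - lo) / (levels - 1) if hi > lo else 1.0
    codes = [round((v - lo) / step) for v in values]
    return codes, lo, step


def dequantize(codes, lo, step):
    """Recover approximate parameter values from integer codes."""
    return [lo + c * step for c in codes]


def prune(ngrams, threshold):
    """Length-wise: drop n-grams whose score falls below `threshold`.

    `ngrams` maps an n-gram tuple to (log_prob, score). The score is a
    hypothetical stand-in for an entropy-based importance measure.
    """
    return {g: v for g, v in ngrams.items() if v[1] >= threshold}


# Hypothetical toy model: n-gram -> (log probability, importance score).
model = {
    ("the", "news"): (-0.5, 0.9),
    ("news", "at"): (-1.25, 0.02),
    ("at", "ten"): (-3.0, 0.4),
}

smaller = prune(model, threshold=0.1)              # fewer parameters
log_probs = [lp for lp, _ in smaller.values()]
codes, lo, step = quantize(log_probs, bits=4)      # fewer bits per parameter
restored = dequantize(codes, lo, step)
```

Pruning reduces the number of stored parameters, while quantization reduces the number of bits used for each remaining parameter, so the two can be combined as the abstract describes.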

Further White Paper Details
Publisher: Mitsubishi Electric Research Laboratories (MERL)
File Format: HTML & PDF
Date Published: December 2001
Downloads: 103
Format: White Papers