Krasimir D. Kolarov
Interval Research, Palo Alto, CA
An image transform codec (compression/decompression algorithm) consists of three steps: 1) a reversible transform, often linear, of the pixels for the purpose of decorrelation, 2) quantization of the transform values, and 3) entropy coding of the quantized transform coefficients. This talk presents an entropy codec that is fast, efficient in silicon area (for implementation in hardware), efficient in coding, and practical when the transform is a wavelet pyramid. The use of short wavelet bases is particularly appropriate for our focus on natural scene images quantized to match the human visual system (HVS).
We will discuss the statistical characteristics of quantized wavelet pyramids derived from NTSC video quantized to be viewed under standard conditions. The resulting video pyramids have substantial runs of zeros and also substantial runs of non-zeros. To take advantage of these structures we will introduce a motion Wavelet transform Zero Tree (WZT) codec, which achieves very good compression ratios and is implementable in a single ASIC of modest size and very low cost. WZT includes a number of trade-offs that reduce the compression ratio but simplify the implementation and reduce the cost.
The codec employs a group of pictures (GOP) of two interlaced video frames (i.e., four video fields). The results of the wavelet transform are coded using the zero-tree method, which is well known in the data compression literature. Specific features that contribute to an implementation in a small single chip are:
* Motion image compression is used in place of motion compensation.
* Transform filters are short and use dyadic rational coefficients with small numerators. Implementation can be accomplished with adds and shifts.
* Processing can be decoupled into the processing of stripes of 8 scan lines each. This reduces the RAM requirements to the point that the RAM can be placed in the ASIC itself, which lowers the chip count and makes the RAM bandwidth requirements easier to meet.
* Quantization denominators are powers of two, enabling implementation by shifts.
* Zero-Tree coding yields a progressive (i.e., embedded) encoding which is easily rate-controlled.
* The codec itself imposes a very low delay of less than 3.5 ms within a field and 67 ms for a GOP.
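
The shift-and-add claims in the list above can be illustrated with a minimal sketch. It does not reproduce the WZT filters, which this abstract does not specify. As a stand-in it uses the well-known LeGall 5/3 integer lifting wavelet, whose coefficients (1/2 and 1/4) are dyadic rationals with small numerators, followed by quantization with a power-of-two denominator. The function names and the mirrored edge handling are assumptions made for illustration only.

```python
# Assumption: LeGall 5/3 lifting stands in for the (unspecified) WZT filters.
# Every operation below is an add, a subtract, or a shift; no multiplications.

def forward_53(x):
    """One level of the 5/3 integer wavelet on an even-length list of ints."""
    n = len(x)
    assert n >= 2 and n % 2 == 0
    half = n // 2
    even, odd = x[0::2], x[1::2]
    # Predict step: detail = odd - floor((left_even + right_even) / 2)
    d = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]  # mirror at right edge
        d.append(odd[i] - ((even[i] + right) >> 1))
    # Update step: smooth = even + floor((left_detail + detail + 2) / 4)
    s = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]  # mirror at left edge
        s.append(even[i] + ((left + d[i] + 2) >> 2))
    return s, d

def inverse_53(s, d):
    """Undo forward_53 exactly by running the lifting steps in reverse."""
    half = len(s)
    even = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]
        even.append(s[i] - ((left + d[i] + 2) >> 2))
    x = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        x.append(even[i])
        x.append(d[i] + ((even[i] + right) >> 1))
    return x

def quantize_shift(c, k):
    """Quantize by the denominator 2**k: shift the magnitude, keep the sign."""
    mag = abs(c) >> k
    return -mag if c < 0 else mag

def dequantize_shift(q, k):
    """Approximate reconstruction: shift the magnitude back up by k bits."""
    mag = abs(q) << k
    return -mag if q < 0 else mag

samples = [10, 12, 14, 13, 9, 7, 7, 8]
smooth, detail = forward_53(samples)
restored = inverse_53(smooth, detail)  # equals samples: the transform is reversible
```

Because the inverse repeats the same integer shifts as the forward pass, the transform is exactly reversible before quantization. All loss comes from the shift in `quantize_shift`.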
The technical innovations that enable the above feature set are:
* Edge filters which enable blockwise processing while preserving quadratic continuity across block boundaries, greatly reducing blocking artifacts.
* Field image compression which reduces memory requirements for fields within a GOP.
Our simulations show that WZT performs significantly better than the commercially available ADV601 wavelet codec from Analog Devices, both perceptually and in peak signal-to-noise ratio (PSNR), where the gain is 1 to 2.3 dB. WZT is also significantly faster: it requires no multiplications, whereas the ADV601 requires 55 million multiplications per second for 480x640 video frames. In addition, WZT achieves compression performance comparable to that of high-quality commercial MPEG-2 compressors at significantly lower computational cost.
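
For readers who want to interpret the PSNR figures quoted above, the standard definition can be sketched as follows. The peak value of 255 assumes 8-bit samples, which the abstract does not state, and the function name is hypothetical.

```python
import math

def psnr(reference, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((a - b) ** 2 for a, b in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak * peak / mse)
```

Under this definition, a gain of 1 to 2.3 dB at the same bit rate corresponds to a mean squared error that is roughly 21% to 41% lower.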