VLSI architectures for video compression applications

Advances in digital technology have given rise to many digital image applications, such as visual communications. However, the major obstacle for many of these applications is the vast amount of data required to represent a digital image. One active research area is the development of fast image compression techniques that reduce communication bandwidth and data storage requirements. In this dissertation, we present VLSI architectures for several representative problems in image processing and compression. We first present two methodologies for the design of area-efficient, high-throughput Data Format Converters: Two-Dimensional Data Format Converters and Dual Buffer Data Format Converters. Efficient data format conversion between processing modules is important for the design of many hybrid VLSI systems. The Discrete Wavelet Transform (DWT) is an alternative to existing time-frequency transforms such as the DFT and DCT, and is useful for multi-resolution representation and low bit-rate image compression. We present a state-of-the-art design methodology for computing the 2-D DWT. Our design is the first single-chip architecture to handle parallel block-based I/O, and it provides high throughput and low latency as well as area efficiency. Tree-structured entropy coding based on hierarchical representations such as the DWT provides a high compression ratio and superior image quality. We present a general design technique for various tree-structured entropy coding schemes, such as the Embedded Zerotree Wavelet algorithm, Embedded Zerotree Lossless, Lewis' algorithm, and Space-Frequency Quantization. The resulting architecture achieves high throughput and area efficiency. Recently, H.263, a new low bit-rate video compression standard, has been developed. We present the design of a hardware/software codesigned system for an H.263 encoder.
Various encoding techniques are considered, such as advanced motion vector mode, unrestricted motion vector mode, local search mode, and half-pel computation. Our parallel VLSI architecture performs motion estimation based on a fast search algorithm. It can process QCIF images at 76 fps and CIF images at 19 fps. The small on-chip memory and computation module enable a compact single-chip design.
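
The two quoted frame rates can be checked against each other with a little arithmetic. The sketch below is only an illustration, not part of the dissertation: it assumes the standard luminance frame sizes (QCIF 176x144, CIF 352x288) and the 16x16 macroblocks used by H.263, and the function names are hypothetical. Under those assumptions, both operating points correspond to the same number of macroblocks processed per second.

```python
# Assumed standard luminance frame sizes in pixels (not stated in the abstract).
FRAME_SIZES = {"QCIF": (176, 144), "CIF": (352, 288)}

# Assumed macroblock edge length in pixels (H.263 uses 16x16 macroblocks).
MACROBLOCK = 16

# Frame rates quoted in the abstract.
QUOTED_FPS = {"QCIF": 76, "CIF": 19}


def macroblocks_per_frame(width, height, mb=MACROBLOCK):
    """Number of whole mb x mb macroblocks that tile a width x height frame."""
    return (width // mb) * (height // mb)


def macroblocks_per_second(fmt):
    """Macroblock throughput implied by the quoted frame rate for a format."""
    width, height = FRAME_SIZES[fmt]
    return macroblocks_per_frame(width, height) * QUOTED_FPS[fmt]


if __name__ == "__main__":
    for fmt, (width, height) in FRAME_SIZES.items():
        per_frame = macroblocks_per_frame(width, height)
        per_second = macroblocks_per_second(fmt)
        print(f"{fmt}: {per_frame} macroblocks/frame, {per_second} macroblocks/s")
```

A CIF frame holds four times as many macroblocks as a QCIF frame, so a fixed macroblock throughput gives one quarter of the frame rate. This matches the ratio between the 76 fps and 19 fps figures.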