# An Introduction to Wavelets and Image Representation

Wednesday, September 13, 2000 - 2:00pm - 3:30pm

Keller 3-180

Jianhong Shen (University of Minnesota, Twin Cities)

The 10-digit phone number system and 9-digit zip code system have greatly facilitated the routing of phone calls and mails. And in music, any complex melody can be written down using a simple set of musical notes. Hidden in these daily life examples is my main point of this abstract --- the power of a good representation of information. Similarly, good image representations are also crucial for various tasks in image processing, among which are: image data compression and transmission, image registration and search engines, feature and pattern recognition, and image restoration. The pixel-wise representation of an image cannot be optimal since it assumes no structures in image data. But images are not white noises. They carry rich geometrical, statistical, and visual contents, and image information is highly correlated across the image domain. This observation explains that there must exist more efficient representations for image data.

First, what is the representation we are discussing here? It means that we have pre-selected a dictionary of building blocks A = { ga a in I}, where I is an appropriate set of indices. To represent an arbitrarily given image f is to express it by f = Suma in I caga . Three natural cases then follow up: if the dictionary A is too small (or incomplete), we may ask for the best approximation to minimize the error d(f, fr), where d is an error measure and fr is the right hand side of the last equation; if A is too large and thus redundant (called a frame), then we may fix the number m of terms, and ask for the best approximation; finally, if A is just right so that we have a basis, then the representation is unique, and even stable if luckily the dictionary is very close to an orthonormal basis. The Fourier Transform chooses harmonic waves as the dictionary: gw=exp(2 pi i w x), w in R or Z. The Windowed Fourier Transform favors a dictionary that is indexed by a=(x, w), with x indicating the location of a fixed window, and w the frequency inside. In a wavelets representation, the dictionary consists of small waves, which are indexed by a=(x, s), with x and s indicating the location and the scale of each small wave. The Fourier Transform works well for nearly periodic or band-limited signals. If the signal's frequencies vary gently with time or location, then the Windowed Fourier Transform can perform much better. But if the frequencies or scales depend sensitively on time or location, then the wavelet representation is the best. Most typical ECG/seismic/image signals fall into this last category. In this short course, we shall browse the basic theory of frequency analysis, short-time (or windowed) frequency analysis, and wavelets/multiresolution analysis, with most weight put on the last topic. We will also touch the design, implementation, and major applications of wavelets. The connection to subband coding and filter banks is the heart of the lecture.

First, what is the representation we are discussing here? It means that we have pre-selected a dictionary of building blocks A = { ga a in I}, where I is an appropriate set of indices. To represent an arbitrarily given image f is to express it by f = Suma in I caga . Three natural cases then follow up: if the dictionary A is too small (or incomplete), we may ask for the best approximation to minimize the error d(f, fr), where d is an error measure and fr is the right hand side of the last equation; if A is too large and thus redundant (called a frame), then we may fix the number m of terms, and ask for the best approximation; finally, if A is just right so that we have a basis, then the representation is unique, and even stable if luckily the dictionary is very close to an orthonormal basis. The Fourier Transform chooses harmonic waves as the dictionary: gw=exp(2 pi i w x), w in R or Z. The Windowed Fourier Transform favors a dictionary that is indexed by a=(x, w), with x indicating the location of a fixed window, and w the frequency inside. In a wavelets representation, the dictionary consists of small waves, which are indexed by a=(x, s), with x and s indicating the location and the scale of each small wave. The Fourier Transform works well for nearly periodic or band-limited signals. If the signal's frequencies vary gently with time or location, then the Windowed Fourier Transform can perform much better. But if the frequencies or scales depend sensitively on time or location, then the wavelet representation is the best. Most typical ECG/seismic/image signals fall into this last category. In this short course, we shall browse the basic theory of frequency analysis, short-time (or windowed) frequency analysis, and wavelets/multiresolution analysis, with most weight put on the last topic. We will also touch the design, implementation, and major applications of wavelets. The connection to subband coding and filter banks is the heart of the lecture.