IMA Tutorial
Short Course: Mathematical Methods in Speech and Image Analysis
September 11-15, 2000



Speakers:

Stu Geman
Applied Mathematics
Brown University
Stuart_Geman@brown.EDU

Basilis Gidas
Division of Applied Mathematics
Brown University
gidas@brownvm.brown.edu

Peter J. Olver
School of Mathematics
University of Minnesota
olver@math.umn.edu

Guillermo Sapiro
Department of Electrical & Computer Engineering
University of Minnesota
sapiro@ece.umn.edu

Jackie Shen
School of Mathematics
University of Minnesota
jhshen@math.umn.edu

Allen Tannenbaum
Electrical & Computer Engineering
Georgia Institute of Technology
allen.tannenbaum@ee.gatech.edu


The purpose of the short course is to introduce basic mathematical methods used in speech and image analysis. There will be a brief introduction to the state of the art in speech and image analysis and to some of the related mathematical issues of current interest. Following the introduction there will be five tutorials: one on partial differential equations, one on signal processing, one on probability and statistics, one on Lie methods in computer vision, and one on conformal flows in computer vision. These will cover some of the core mathematical technologies of modern speech and image analysis, including nonlinear PDEs for image denoising, deblurring, and shape analysis; spectral (Fourier) and multiscale (wavelet) signal transformations; and random fields, graphical models, and dynamic programming in image and language analysis.

1. Introduction & overview: Stu Geman

2. PDE methods: Guillermo Sapiro

3. Spectral and multiscale (wavelet) methods: Jackie Shen

4. Probabilistic and statistical methods: Basilis Gidas

5. Symmetry in Computer Vision: Peter J. Olver

6. Conformal Flows in Computer Vision: Allen Tannenbaum

SHORT COURSE SCHEDULE

MONDAY, SEPTEMBER 11
All talks are in EE/CS 3-180 unless otherwise noted.
8:30 am Coffee and Registration Reception Room EE/CS 3-176

9:25 am Willard Miller Introduction
9:30-10:30 am Stu Geman
Brown University
Introduction & Overview
10:30 am Break Reception Room EE/CS 3-176
11:00 am-12:00 pm Stu Geman
Brown University
Introduction & Overview (continued)
2:00-3:30 pm Basilis Gidas
Brown University
Probabilistic and Statistical Methods
TUESDAY, SEPTEMBER 12
All talks are in EE/CS 3-180 unless otherwise noted.
9:15 am Coffee Reception Room EE/CS 3-176
9:30-10:30 am Jackie Shen
University of Minnesota
An Introduction to Wavelets and Image Representation
10:30 am Break Reception Room EE/CS 3-176
11:00 am-12:00 pm Basilis Gidas
Brown University
Probabilistic and Statistical Methods
2:00-3:00 pm Jackie Shen
University of Minnesota
An Introduction to Wavelets and Image Representation (continued)
WEDNESDAY, SEPTEMBER 13
All talks are in EE/CS 3-180 unless otherwise noted.
9:15 am Coffee IMA East, Lind Hall 400
9:30-10:30 am Allen Tannenbaum
Georgia Institute of Technology
Conformal Flows in Computer Vision
10:30 am Break Reception Room EE/CS 3-176
11:00 am-12:00 pm Basilis Gidas
Brown University
Probabilistic and Statistical Methods
2:00-3:00 pm Jackie Shen
University of Minnesota
An Introduction to Wavelets and Image Representation
THURSDAY, SEPTEMBER 14
All talks are in EE/CS 3-180 unless otherwise noted.
9:15 am Coffee IMA East, Lind Hall 400
9:30-10:30 am Guillermo Sapiro
University of Minnesota
Geometric Partial Differential Equations and Image Analysis
10:30 am Break Reception Room EE/CS 3-176
11:00 am-12:00 pm Guillermo Sapiro
University of Minnesota
Geometric Partial Differential Equations and Image Analysis (continued)
2:00-3:00 pm Guillermo Sapiro
University of Minnesota
Geometric Partial Differential Equations and Image Analysis (continued)
FRIDAY, SEPTEMBER 15
All talks are in EE/CS 3-180 unless otherwise noted.
9:15 am Coffee IMA East, Lind Hall 400
9:30-10:30 am Peter J. Olver
University of Minnesota
Symmetry in Computer Vision
10:30 am Break IMA East, Lind Hall 400

Geometric Partial Differential Equations and Image Analysis

Guillermo Sapiro, University of Minnesota

This short tutorial is an introduction to the use of geometric partial differential equations (PDEs) in image processing and computer vision. This relatively new research area brings a number of new concepts into the field and provides, among other things, a fundamental and formal approach to image processing. State-of-the-art practical results in problems such as image segmentation, stereo, image enhancement, distance computations, and object tracking have been obtained with algorithms based on PDE formulations. We develop the PDE approach starting with curves and surfaces deforming with intrinsic velocities, passing through surfaces moving with image-based velocities, and building up to the use of PDEs for families of images. A large number of applications are presented, including image segmentation, shape analysis, image enhancement, stereo, and tracking. Some basic connections among the PDEs themselves, as well as to other more classical approaches to image processing, will be discussed. This tutorial will be a useful resource for researchers and practitioners. It is intended for people investigating new solutions to image processing problems as well as for people searching for existing advanced solutions. Basic knowledge of mathematics and image processing is needed to get the most out of this tutorial.
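As an illustration of a curve deforming with an intrinsic velocity, the sketch below evolves a closed polygon by a finite-difference approximation of the curvature flow C_t = kappa N (curve shortening). It is a minimal sketch, not the tutorial's own material: the function name `curvature_flow_step`, the explicit Euler time stepping, and the polygon discretization are all assumptions made for this example.

```python
import math

def curvature_flow_step(points, dt):
    """One explicit Euler step of C_t = kappa * N for a closed polygon.

    points: list of (x, y) vertices in order; the last connects to the first.
    dt: time step (hypothetical parameter; it must be small relative to the
        squared edge length for the explicit scheme to stay stable).
    """
    n = len(points)
    new_points = []
    for i in range(n):
        xp, yp = points[i - 1]
        x, y = points[i]
        xn, yn = points[(i + 1) % n]
        lp = math.hypot(x - xp, y - yp)   # length of the previous edge
        ln = math.hypot(xn - x, yn - y)   # length of the next edge
        # Finite-difference approximation of the second derivative with
        # respect to arc length, C_ss, which equals kappa * N.
        scale = 2.0 / (lp + ln)
        vx = scale * ((xn - x) / ln - (x - xp) / lp)
        vy = scale * ((yn - y) / ln - (y - yp) / lp)
        new_points.append((x + dt * vx, y + dt * vy))
    return new_points

def mean_radius(points):
    """Average distance of the vertices from the origin."""
    return sum(math.hypot(x, y) for x, y in points) / len(points)

# Usage: a unit circle sampled at 40 points shrinks under the flow.
circle = [(math.cos(2 * math.pi * k / 40), math.sin(2 * math.pi * k / 40))
          for k in range(40)]
for _ in range(100):
    circle = curvature_flow_step(circle, dt=0.001)
print(mean_radius(circle))
```

For a circle of initial radius r0 the continuous flow gives r(t) = sqrt(r0^2 - 2t), which offers a simple check on the discretization.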


 

An Introduction to Wavelets and Image Representation

Jackie Shen, University of Minnesota

The 10-digit phone number system and the 9-digit zip code system have greatly facilitated the routing of phone calls and mail. And in music, any complex melody can be written down using a simple set of musical notes. Hidden in these everyday examples is the main point of this abstract: the power of a "good" representation of information. Similarly, good image representations are crucial for various tasks in image processing, among which are image data compression and transmission, image registration and search engines, feature and pattern recognition, and image restoration. The pixel-wise representation of an image cannot be optimal, since it assumes no structure in the image data. But images are not white noise. They carry rich geometrical, statistical, and visual content, and image information is highly correlated across the image domain. This observation suggests that more efficient representations of image data must exist.

First, what is the representation we are discussing here? It means that we have pre-selected a dictionary of building blocks $A = \{ g_a \mid a \in I \}$, where $I$ is an appropriate set of indices. To represent an arbitrarily given image $f$ is to express it as $f = \sum_{a \in I} c_a g_a$. Three natural cases follow: if the dictionary $A$ is too small (incomplete), we may ask for the best approximation, that is, the one minimizing the error $d(f, f_r)$, where $d$ is an error measure and $f_r$ is the right-hand side of the last equation; if $A$ is too large and thus redundant (called a frame), then we may fix the number $m$ of terms and ask for the best approximation; finally, if $A$ is just right, so that we have a basis, then the representation is unique, and even stable if the dictionary is very close to an orthonormal basis.

The Fourier Transform chooses harmonic waves as the dictionary: $g_w = \exp(2 \pi i w x)$, with $w \in \mathbb{R}$ or $w \in \mathbb{Z}$. The Windowed Fourier Transform favors a dictionary indexed by $a = (x, w)$, with $x$ indicating the location of a fixed window and $w$ the frequency inside it. In a wavelet representation, the dictionary consists of small waves indexed by $a = (x, s)$, with $x$ and $s$ indicating the location and the scale of each small wave. The Fourier Transform works well for nearly periodic or band-limited signals. If the signal's frequencies vary gently with time or location, then the Windowed Fourier Transform can perform much better. But if the frequencies or scales depend sensitively on time or location, then the wavelet representation is the best choice. Most typical ECG, seismic, and image signals fall into this last category.

In this short course, we shall survey the basic theory of frequency analysis, short-time (or windowed) frequency analysis, and wavelets/multiresolution analysis, with most weight put on the last topic. We will also touch on the design, implementation, and major applications of wavelets. The connection to subband coding and filter banks is the heart of the lecture.
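To make the "just right" case concrete, the sketch below uses the orthonormal Haar basis as the dictionary, computes the coefficients of a one-dimensional signal, and keeps only the m largest of them. The choice of the Haar wavelet, the restriction to signals whose length is a power of two, and the function names are assumptions made for this illustration; the lectures themselves cover far more general constructions.

```python
import math

def haar_forward(signal):
    """Orthonormal Haar transform of a list whose length is a power of two."""
    coeffs = list(signal)
    n = len(coeffs)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    s = math.sqrt(2.0)
    while n > 1:
        half = n // 2
        # Scaled pairwise sums (coarse part) and differences (detail part).
        averages = [(coeffs[2 * i] + coeffs[2 * i + 1]) / s for i in range(half)]
        details = [(coeffs[2 * i] - coeffs[2 * i + 1]) / s for i in range(half)]
        coeffs[:n] = averages + details
        n = half
    return coeffs

def haar_inverse(coeffs):
    """Invert haar_forward."""
    out = list(coeffs)
    total = len(out)
    s = math.sqrt(2.0)
    n = 1
    while n < total:
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged.extend([(a + d) / s, (a - d) / s])
        out[:2 * n] = merged
        n *= 2
    return out

def best_m_terms(coeffs, m):
    """Keep the m largest-magnitude coefficients and zero the rest."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:m])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]

# Usage: a piecewise-constant signal needs very few Haar coefficients.
signal = [4.0, 4.0, 4.0, 4.0, 1.0, 1.0, 1.0, 1.0]
coeffs = haar_forward(signal)
approx = haar_inverse(best_m_terms(coeffs, 2))
print(coeffs)
print(approx)
```

Because the basis is orthonormal, the squared error of an m-term approximation equals the sum of the squares of the discarded coefficients, which is why keeping the largest coefficients is the natural choice here.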


 

Probabilistic and Statistical Methods

Basilis Gidas, Brown University

In this tutorial we will present a solid introduction to some of the main stochastic model-based paradigms for image analysis and interpretation; the relevance of the paradigms to speech recognition, expert systems, coding theory, and linguistics will also be pointed out. The paradigms are based on rigorous mathematical principles from Bayesian statistics, information theory, signal processing, and other disciplines; they support Monte Carlo and dynamic programming type computational algorithms, as well as powerful parameter (parametric and nonparametric) estimation techniques. The tutorial will emphasize both methodology and applications. It will focus on three main topics:

(1) Stochastic Graphical Models and Applications. Here we will describe Markov Random Fields (MRF) and their Gibbs representations, with dependency graphs that include linear graphs (relevant to speech, filtering, convolutional codes, and other applications), regular lattices (relevant to "low-level" vision tasks), and tree-structured graphs (relevant to "high-level" vision tasks, linguistics, error-correcting codes, etc.). Dynamic Monte Carlo and dynamic programming algorithms for sampling, optimization, or mean estimation will be presented, together with a summary of EM and variants of ML parameter estimation procedures. The main application to be treated will be texture segmentation and identification, but speech recognition, image enhancement, and tomographic reconstruction will also be indicated.

(2) Object Recognition. After a brief discussion of the main issues (e.g. invariance, contextual and global constraints) and some methodologies (especially templates and compositional/syntactic models), we will focus on the decision-tree approach to object recognition. We will begin with the classical Huffman code and the constrained 20 questions problem. These are special cases of the statistical decision-tree approach to object recognition. The basic building blocks of these trees are the "queries" (or "tests" or "experiments"), i.e. a family of image data features. The choice of queries is critical. Most real-world recognition problems require a nearly infinite family of queries, and standard decision-tree construction based on a fixed-length feature vector is not feasible. We will present a procedure (developed by Y. Amit and D. Geman) that simultaneously selects features and builds trees by inductive learning; the recognition algorithm employs multiple decision trees. We will describe the procedure using primarily the handwritten, binary, digit recognition problem as an example.

(3) Simultaneous Tracking and Recognition. Here we describe a coherent framework for tracking and recognition from video image sequences that contains three basic models: (a) an object model that articulates the overall shape architecture of an object, together with the shape's random variabilities (position, orientation, non-rigid elastic deformations); (b) a dynamic model that describes an object's dynamical motions; and (c) a data (or observation) model that relates the image gray-level data (or functions thereof) to the object and dynamic models, and articulates random variability of the image data due to factors of uncertainty such as clutter, occlusion, noise, and blur. The combination of these models leads to a (typically) nonlinear filtering problem which is equivalent to an HMM. The solution of this filtering problem requires a computational (filtering) algorithm. The framework will be demonstrated using deformable templates for object representation, dynamical equations derived from Lagrangian mechanics, and a data model based on nonparametric (rank-type) statistics. We will describe a Monte Carlo type filtering algorithm (first employed by Blake and Isard), and compare it with the classical extended Kalman filter. Variations of the procedure based on compositional/syntactic models for object representation will also be described. The performance of the procedure will be demonstrated with a video showing the tracking of objects moving in highly cluttered environments.
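Topic (1) mentions dynamic programming on linear dependency graphs. The sketch below is one minimal instance under stated assumptions: a chain of discrete labels with a per-site data cost and a pairwise smoothness cost, minimized exactly by a Viterbi-style recursion. The function name `chain_map`, the cost tables, and the toy binary example are hypothetical choices for illustration, not the models treated in the tutorial.

```python
def chain_map(unary, pairwise):
    """Exact minimizer of a chain energy by dynamic programming.

    Minimizes  sum_t unary[t][x_t] + sum_t pairwise[x_t][x_{t+1}]
    over all label sequences x, and returns (labels, minimal_cost).
    """
    n_labels = len(unary[0])
    cost = list(unary[0])       # best cost of a path ending in each label
    back = []                   # back-pointers, one list per transition
    for t in range(1, len(unary)):
        new_cost, pointers = [], []
        for j in range(n_labels):
            best_i = min(range(n_labels), key=lambda i: cost[i] + pairwise[i][j])
            new_cost.append(cost[best_i] + pairwise[best_i][j] + unary[t][j])
            pointers.append(best_i)
        cost = new_cost
        back.append(pointers)
    last = min(range(n_labels), key=lambda j: cost[j])
    best_cost = cost[last]
    labels = [last]
    for pointers in reversed(back):     # trace the optimal path backwards
        labels.append(pointers[labels[-1]])
    labels.reverse()
    return labels, best_cost

# Usage: smooth a noisy binary sequence (cost 1 per disagreement with the
# observation, cost 1.5 per label change between neighbours).
observed = [0, 0, 1, 0, 0, 1, 1, 1]
unary = [[0.0 if label == o else 1.0 for label in (0, 1)] for o in observed]
pairwise = [[0.0, 1.5], [1.5, 0.0]]
print(chain_map(unary, pairwise))
```

The recursion visits each site once and compares all label pairs, so the work grows linearly with the chain length rather than exponentially, which is what makes linear graphs so convenient.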
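Topic (2) starts from the classical Huffman code. As a reminder of how that construction works, here is a minimal sketch using Python's standard `heapq` module; the function name `huffman_code` and the example weights are assumptions for illustration. Each bit of a codeword can be read as the answer to one yes/no question, so the average code length is the average number of questions needed to identify a symbol.

```python
import heapq

def huffman_code(freqs):
    """Return {symbol: bitstring} for a dict of positive symbol weights."""
    if len(freqs) == 1:
        return {next(iter(freqs)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: partial_code}); the
    # unique tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        w0, _, left = heapq.heappop(heap)
        w1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w0 + w1, counter, merged))
        counter += 1
    return heap[0][2]

# Usage: frequent symbols receive short codewords.
weights = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = huffman_code(weights)
average_length = sum(weights[s] * len(code[s]) for s in weights)
print(code, average_length)
```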
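Topic (3) refers to a Monte Carlo type filtering algorithm. The sketch below shows the general idea with a generic bootstrap (sampling-importance-resampling) particle filter on a toy one-dimensional problem. It is not the tracking procedure of the tutorial: the random-walk dynamic model, the Gaussian observation model, the initial particle spread, and all parameter names are assumptions chosen to keep the example short.

```python
import math
import random

def particle_filter(observations, n_particles=500, process_std=1.0,
                    obs_std=1.0, seed=0):
    """Bootstrap particle filter for a 1-D random walk observed in noise.

    Assumed toy model:  x_t = x_{t-1} + N(0, process_std^2),
                        z_t = x_t + N(0, obs_std^2).
    Returns the posterior-mean estimate of x_t after each observation.
    """
    rng = random.Random(seed)
    # Broad initial guess for the state (assumed prior: N(0, 5^2)).
    particles = [rng.gauss(0.0, 5.0) for _ in range(n_particles)]
    estimates = []
    for z in observations:
        # Predict: push every particle through the dynamic model.
        particles = [x + rng.gauss(0.0, process_std) for x in particles]
        # Weight: likelihood of the observation under the data model.
        weights = [math.exp(-0.5 * ((z - x) / obs_std) ** 2) for x in particles]
        total = sum(weights)
        if total == 0.0:  # all likelihoods underflowed; fall back to uniform
            weights = [1.0] * n_particles
            total = float(n_particles)
        # Estimate: weighted mean of the particles.
        estimates.append(sum(w * x for w, x in zip(weights, particles)) / total)
        # Resample: draw a new population in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates

# Usage: track a slowly drifting state from noisy measurements.
noise = random.Random(1)
truth = [0.1 * t for t in range(50)]
measurements = [x + noise.gauss(0.0, 1.0) for x in truth]
print(particle_filter(measurements)[-5:])
```

Unlike the extended Kalman filter, this scheme does not linearize the models or assume a Gaussian posterior, so in principle it can follow multimodal distributions, at the price of a computational cost that grows with the number of particles.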

 

 

Symmetry in Computer Vision

Peter J. Olver, School of Mathematics, University of Minnesota

Recent advances in computer vision, based on geometrically invariant nonlinear diffusion processes, have underscored the importance of Lie groups in the equations of image processing and object recognition. In this talk, I will survey how the invariants of Lie groups are used to construct both invariant differential equations and invariant signatures for objects in images. New, noise-resistant, invariant numerical schemes for approximating differential invariants, based on joint invariants, will be presented. Finally, practical applications to image processing, including multi-scale resolution, denoising, edge detection, segmentation, and object recognition, will be illustrated.
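One simple example of approximating a differential invariant by joint invariants, sketched here under the assumption that the group is the Euclidean group of the plane: the distances between points are joint invariants, and the curvature at a point of a curve can be approximated by the curvature of the circle through three nearby points, which depends only on those distances (via Heron's formula for the triangle area). The function name is hypothetical, and the formula yields the unsigned curvature only.

```python
import math

def three_point_curvature(a, b, c):
    """Curvature of the circle through three planar points.

    Uses only the pairwise distances, so the value is unchanged by
    rotations and translations of the points:  kappa = 4 * area / (ab * bc * ca).
    """
    ab = math.hypot(a[0] - b[0], a[1] - b[1])
    bc = math.hypot(b[0] - c[0], b[1] - c[1])
    ca = math.hypot(c[0] - a[0], c[1] - a[1])
    s = 0.5 * (ab + bc + ca)
    # Heron's formula for the squared triangle area, clamped against
    # small negative values caused by rounding.
    area_sq = max(s * (s - ab) * (s - bc) * (s - ca), 0.0)
    return 4.0 * math.sqrt(area_sq) / (ab * bc * ca)

def rigid_motion(p, angle, shift):
    """Rotate p by angle about the origin, then translate by shift."""
    x, y = p
    return (math.cos(angle) * x - math.sin(angle) * y + shift[0],
            math.sin(angle) * x + math.cos(angle) * y + shift[1])

# Usage: three points on a circle of radius 2, before and after a rigid motion.
pts = [(2 * math.cos(t), 2 * math.sin(t)) for t in (0.1, 0.4, 0.9)]
moved = [rigid_motion(p, 0.7, (3.0, -1.0)) for p in pts]
print(three_point_curvature(*pts), three_point_curvature(*moved))
```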


LIST OF CONFIRMED PARTICIPANTS (in addition to postdocs and long term participants)

as of 9/8/2000
Name                    Department                            Affiliation
Emil Cornea             Mathematical Sciences                 Northern Illinois University
Zachariah Dietz         Statistics                            Iowa State University
Stu Geman               Applied Mathematics                   Brown University
Basilis Gidas           Applied Mathematics                   Brown University
Jongwoo Jeon            Statistics                            Seoul National University
Ioannis Konstantinidis  Mathematics                           University of Maryland
Bradley Love            Laboratory of Survival and Longevity  Max Planck Institute for Demographic Research
Aaron Luttman           Research and Development              Pointcloud Inc.
Gennady Lyubeznik       Mathematics                           University of Minnesota
Peter J. Olver          Mathematics                           University of Minnesota
Michael Pearson         Mathematics                           Mississippi State University
Alexander Powell        Mathematics                           University of Maryland
Chaunxi Qian            Mathematics & Statistics              Mississippi State University
Errol Rowe              Mathematics                           North Carolina A&T State University
Guillermo Sapiro        Electrical & Computer Engineering     University of Minnesota
Erik Schlicht           Psychology                            University of Minnesota
Michael Schonwetter     Research                              Lernout & Hauspie
Kevin Schweiker         Engineering                           Freestyle Technologies, Inc.
Sunder Sethuraman       Mathematics                           Iowa State University
Jackie Shen             Mathematics                           University of Minnesota
Michael Steinbach       Computer Science                      University of Minnesota
Allen Tannenbaum        Electrical & Computer Engineering     Georgia Institute of Technology
Andrew Torok            Mathematics                           University of Houston


2000-2001 Program: Mathematics in Multimedia
