Natural Image Statistics Enable Us to Quantitatively Model Visual Grouping and Figure-ground Cues

Monday, March 6, 2006 - 9:20am - 10:20am
EE/CS 3-180
Jitendra Malik (University of California, Berkeley)
Visual grouping and figure-ground discrimination were first studied by
the Gestalt school of visual perception nearly a century ago. By the use
of cleverly constructed examples, they were able to demonstrate the role
of factors such as proximity, similarity, curvilinear continuity and
common fate in visual grouping and factors such as convexity, size, and
symmetry in figure-ground discrimination. However, this left open (at
least) three major problems:
(1) there wasn't a precise operationalization of these factors for
general images,
(2) the interaction of these cues was ill understood, and
(3) there was no justification for why these factors might be helpful
to an observer interacting with the visual

Over the last few years, we have been pursuing these problems in the
following paradigm:
(1) We start with a set of natural images and use human observers to
mark the perceptual groups and assign figure-ground labels to the
various boundary contours.
(2) We construct computational models of various grouping and
figure-ground factors.
(3) We calibrate and optimally combine the grouping and figure-ground
factors by using the principle that vision evolved to be adaptive to
the statistics of objects in the natural world.
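
As a rough illustration of step (3), the sketch below calibrates and
combines three figure-ground cues by fitting a logistic model to
labeled boundary contours. It is not the method from the talk: the
choice of logistic regression, the cue encoding (each cue measured as
"side A minus side B" across a contour), the function names, and the
generating weights are all assumptions made for this example, and the
data are synthetic stand-ins for human-labeled contours.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(features, labels, steps=500, lr=0.5):
    """Fit cue weights by batch gradient ascent on the log-likelihood."""
    n_cues = len(features[0])
    weights = [0.0] * n_cues
    for _ in range(steps):
        grad = [0.0] * n_cues
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
            for j, xi in enumerate(x):
                grad[j] += (y - p) * xi
        weights = [w + lr * g / len(features) for w, g in zip(weights, grad)]
    return weights


def predict_figure(weights, x):
    """Estimated probability that side A of the contour is the figure."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))


# Synthetic data only: each row holds hypothetical cue differences
# (size, lower-region, convexity) between the two sides of a contour.
# ASSUMED_WEIGHTS is made up so that there is something to recover; it
# says nothing about the real strength of these cues.
ASSUMED_WEIGHTS = [1.0, 2.0, 0.5]
rng = random.Random(0)
features, labels = [], []
for _ in range(400):
    x = [rng.gauss(0.0, 1.0) for _ in ASSUMED_WEIGHTS]
    p_true = sigmoid(sum(w * xi for w, xi in zip(ASSUMED_WEIGHTS, x)))
    features.append(x)
    labels.append(1 if rng.random() < p_true else 0)

weights = fit_logistic(features, labels)
print("fitted cue weights:", [round(w, 2) for w in weights])
```

The model has no bias term on purpose: swapping the two sides of a
contour negates every cue difference and flips the label, so a contour
with no cue evidence should come out at probability 0.5.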

In my talk I will report on two recent results in this paradigm. One is
on understanding the power of the figure-ground cues, specifically size,
lower-region and convexity. We compared the predictions of such a model
with psychophysics and found a pleasing agreement. The second is an
attempt at a unified probabilistic framework for mid-level vision using
conditional random fields defined on constrained Delaunay triangulations
of image edges.

This talk draws on joint work with Charless Fowlkes, David Martin and
Xiaofeng Ren; various papers can be found on the web site
