Seminar
on Industrial Problems
Geometric
Compression of Complex 3D Data Bases
October
3, 1997
Presented
by:
Gabriel
Taubin, taubin@watson.ibm.com
Visual and Geometric Computing
IBM T. J. Watson Research Center
IBM, Yorktown Heights
The primary means of representing 3D objects for industry and
consumer use and for transmission over the Internet is through
polyhedron models. Polyhedron models are particularly effective
for hardware-assisted rendering, which is important for video
games, virtual reality, fly-through, and electronic mock-up
applications involving complex CAD models. In comparison to
image and video compression, little attention has been devoted
to the compression of 3D shapes. However, this situation is
like ly to change rapidly:
- The
increasing complexity of CAD models raises significantly
the cost of the memory and storage required by these models.
- The
distribution of 3D models over networks for collaborative
design, gaming, rapid prototyping or virtual interactions
is seriously limited by the available bandwidth.
- The
graphics performance of high level hardware adapters is
limited by insufficient on-board memory to store the entire
model or by a data transfer bottleneck.
The
Virtual Reality Modeling Language
(VRML), sometimes pronounced verml, is rapidly becoming
the standard file format for transmitting 3D virtual worlds
across the Int ernet. Unfortunately, even what would be considered
small VRML files tend to be quite large. The workhorse VRML
node that is designed to contain geometry is the IndexedFaceSet
node (IFS). Any polygonal shape can be described with an IFS,
but this is expe nsive in terms of storage requirements. (Since
arbitrary polygonal faces may easily triangulated, we restrict
the exposition to triangular meshes.) A typical VRML file containing
an object of a couple hundred faces will have a size of about
30 kilobytes. The description of a large virtual world can easily
reach 10 megabytes. One easy way to reduce the storage requirements
is to use a binary rather than the standard ASCII representation
of coordinates. Another standard technique involves removing
redunda nt vertex references. In VRML several topologically
adjacent faces store the same vertex index. (In particular,
3 vertices are stored for every triangular face.) Models can
be compressed by constructing triangle strips and triangle fans,
so that a single vertex defines each new triangle in the strip
and redundancy is reduced. However, as the speaker showed, the
most dramatic reductions in storage requirements result from
the topologically assisted compression of the IFS. Here, the
specific geometry of t he mesh can be used to reduce storage
requirements.
The
speaker described the concepts behind IBM's Geometric Compression
Technology, originally presented in the IBM Research technical
report number RC-20340 (January 16, 1996), "Geometric
Compression Through Topological Surgery", by Gabriel Taubin
and Jarek Rossignac. Gabriel Taubin, Bill Horn, and Francis
Lazarus have extended the technology to handle features found
in VRML97. Compression ratios of up to 50:1 or more can be obtai
ned for large VRML97 models. The algorithms compress connectivity
data without loss of information, to approximately two bits
per triangle.
We
briefly describe the method of Taubin and Rossignac for representing
triangular meshes in compressed form. A triangular mesh is defined
by the location of its vertices (positions) and by the association
between each triangle and its sustaining vert ices (connectivity).
To develop the basic concepts, we first concentrate on simple
triangular meshes: triangulated connected oriented manifolds
without boundary, of Euler characteristic 2, i.e., meshes that
can be continuously deformed into a sphe re. (Recall that the
Euler characteristic of a triangular mesh of V vertices,
E edges and T triangles is V-E+T. This
number is 2 for a mesh that can be continuously deformed to
a sphere and 1 for a mesh that can be continuousl y deformed
to a topological disk.)
In
the compression scheme, the connectivity information of the
mesh is encoded by first constructing a vertex
spanning tree in the graph of vertices and edges. (Recall
that a vertex spanning tree is a connected graph with the same
vertices and a subset of the edges of the original graph, and
which contains no cycles. It is not unique.) The vertices of
a vertex spanning tree can be divided into leaf nodes
(vertices ad jacent to a single edge) and branch nodes
(vertices adjacent to two edges).
The
branching nodes and leaf nodes decompose the tree into vertex
runs. A vertex run is a sequence of edges connecting
a starting leaf or branching node to subsequent re gular nodes
and ending in a leaf or branching node. The vertex spanning
tree is represented as an array of triplets, each triplet is
composed of a length, a branching bit,
and a leaf bit, and describing a vertex run. These tri
plets are ordered according to when the corresponding runs are
visited during depth-first traversal of the tree starting from
the root leaf node. Runs with a common first node are ordered
according to the global orientation of the mesh. The length
is the number of edges of the run, the branching bit indicates
whether the run is the last run sharing the current branching
node or not, and the leaf bit indicates if the run ends
in a leaf or branching node. This representation efficiently
encodes the structure of the vertex spanning tree. To increase
the compression ratio, the compression algorithm attempts to
build vertex spanning trees with the least number of runs. A
peeling an orange approach
seems to be most efficient.
Now,
to encode the connectivity, the mesh is cut through the edges
(called the cut edges) of the vertex spanning tree. (Thus
each cut edge corresponds to exactly two boundary edges of the
new mesh.) The result is a triangulated simply connected polygon,
i.e., is continuously deformable to a topological disk.
Indeed
the result of cutting through the vertex spanning tree is a
connected oriented triangular mesh with boundary forming a single
bounding loop of edges and no internal vertic es. Furthermore,
a spanning tree of V nodes has V-1 edges, so the
bounding loop has 2V-2 edges and vertices. Therefore,
the resulting mesh has 2V-2, vertices, E+V-1 edges,
and T triangles. Thus the Euler characte ristic is
(2V-2)-(E+V-1)+T=(V-E+T)-1=1.
Now
consider the dual graph of this polygon. (Recall that the vertices
of the dual graph are the triangles of the polygon. Two vertices
of the dual graph are joined by an edge if and only if the corresponding
triangles of the original graph are adjace nt.) Clearly, the
dual graph of the polygon forms a binary spanning tree of the
triangles of the mesh, which can also be decomposed into runs.
This triangle spanning tree is
e ncoded in the same way as the vertex spanning tree is encoded.
Together, the vertex and triangle spanning trees permit the
recovery of the length and boundary of each triangle run and
the vertices that bound each triangle.
The
internal edges of the polygon are called marching
edges. Within each triangle run, each marching edge shares
a vertex with the previous marching edge in the run. That shared
vertex could lie on the left or the right boundary. A single
bit of information per marching edge is used to encode the correct
side.
Although
there are other compression techniques used in the authors'
method to reduce the size of the files needed to encode the
data for a triangular mesh, the above
construction is the main idea. The compression algorithm
involves the construction and encoding of the vertex spanning
tree and the triangle spanning tree. Mainly the total number
of runs of the vertex and triangle trees determines the compression
ratio, and optimal compression is achieved by minimizing this
number. (This combinatorial optimization problem is conjectured
to be NP complete, but the authors have found deterministic
methods that produce good results.) The decompression algorithm
generates the original mesh from the vertex and triangle spanning
trees. The method can fairly easily be generalized to more general
meshes with arbitrary Euler characteristic, with boundary,
non-or ientable, etc. Further, it can be generalized to take
into account properties of the mesh such as normals, colors,
and texture mapping vectors used for shading purposes.
The
authors have found that compression ratios of 50 to 1 with respect
to standard VRML files are not unusual. Other existing compression
methods either do not preserve connectivity of the mesh, or
do not achieve these compression ratios.
There
are a number of open problems
connected with this construction. First of all, the compression
algorithm involves a number of heuristics. How good are they?
What is the o ptimal way to generate a spanning tree? In particular
how can the length of each run be maximized and the number of
runs minimized? What is the tradeoff between file size and compression
time? How to compress time dependent objects in which the connectiv
ity changes?
Biographical
Information: Gabriel Taubin is Manager of the Visual
and Geometric Computing group at the IBM T.J. Watson Research
Center, where he leads a group of researchers in the creation
of new geometric computation and image-based algori thms and
technologies for 3D modeling, network-based graphics, and data
visualization. Before this assignment, he spent five years as
a member of the Exploratory Computer Vision group. Gabriel holds
a PhD in EE from Brown University in the area of Comput er Vision,
and a MSc in Pure Mathematics from the University of Buenos
Aires, Argentina. He authored 9 patents, and published over
30 papers.
Report
prepared by Willard Miller