The ellipsis in mathematical documents

Friday, December 8, 2006 - 4:00pm - 4:30pm
EE/CS 3-180
Alan Sexton (University of Birmingham), Volker Sorge (University of Birmingham)
An ellipsis is a series of dots which indicates the omission of some part of
a text which the reader should be able to reconstruct from its context. The
most complex and sophisticated uses of ellipses occur in matrix expressions,
where whole classes of matrices of variable dimension are described with their
use. But ellipses also occur in discussions of sequences, series, polynomials, sets,
systems of equations and generally wherever there is a collection of mathematical
objects described by a pattern rather than an explicit enumeration or a closed
form. However, while ellipses are very common in mathematical and scientific
documents, relatively little work on their recognition, semantic analysis, formal
representation, and electronic communication has been carried out.

In our work, we have shown how a matrix expression containing ellipses can
be analysed to extract a semantic representation that can be used for a number
of purposes including validating and improving optical character recognition
of matrix expressions, symbolic calculation of expressions with such matrices
and re-representation as lambda expressions for use by theorem provers. This
work has opened a number of new research avenues for machine support for
mathematicians, scientists and engineers: since we can represent underspecified
matrices with ellipses, we can develop systems to solve matrix problems for
arbitrary dimensions directly, rather than only for individual subcases of specific
dimension; we can consider the question of generalising specific solutions without
ellipses to general patterns of solutions with ellipses.
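
To make the idea of a lambda-style representation concrete, the following is a
minimal sketch and not the representation used in the work described here. It
assumes a matrix family written with ellipses can be modelled as a function
from a dimension n to an entry function on 1-based indices (i, j). The names
MatrixFamily, upper_triangular_ones, instantiate and diagonal_product are
hypothetical and chosen only for illustration.

```python
from typing import Callable, List

# Assumption: a matrix family is a function from a dimension n to an
# entry function (i, j) -> value, using 1-based row and column indices.
MatrixFamily = Callable[[int], Callable[[int, int], int]]


def upper_triangular_ones() -> MatrixFamily:
    """Hypothetical family for the pattern written with ellipses as

        1 1 ... 1
        0 1 ... 1
        . .  .  .
        0 0 ... 1
    """
    return lambda n: (lambda i, j: 1 if i <= j else 0)


def instantiate(family: MatrixFamily, n: int) -> List[List[int]]:
    """Expand the pattern into an explicit n-by-n matrix."""
    entry = family(n)
    return [[entry(i, j) for j in range(1, n + 1)] for i in range(1, n + 1)]


def diagonal_product(family: MatrixFamily, n: int) -> int:
    """Product of the diagonal entries; for a triangular family this
    equals the determinant."""
    entry = family(n)
    product = 1
    for i in range(1, n + 1):
        product *= entry(i, i)
    return product
```

For example, instantiate(upper_triangular_ones(), 3) expands the pattern to an
explicit 3-by-3 matrix, while diagonal_product works for any chosen n without
a separate definition per dimension. This sketch still fixes n before
computing; reasoning about arbitrary n symbolically, as described above, would
require a richer representation.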

In this talk we shall summarise our research results in this area to date and
outline their possible generalisation to deal with ellipsis constructs in other areas.
We shall also suggest a structured way of representing ellipses in a uniform
format suitable for electronic communication.