# The ellipsis in mathematical documents

Friday, December 8, 2006 - 4:00pm - 4:30pm

EE/CS 3-180

Alan Sexton (University of Birmingham), Volker Sorge (University of Birmingham)

An ellipsis is a series of dots that indicates the omission of some part of a text which the reader should be able to reconstruct from its context. The most complex and sophisticated uses of ellipses occur in matrix expressions, where whole classes of matrices of variable dimension are described with their use. But ellipses also occur in discussions of sequences, series, polynomials, sets, systems of equations and generally wherever there is a collection of mathematical objects described by a pattern rather than an explicit enumeration or a closed form. However, while ellipses are very common in mathematical and scientific documents, relatively little work on their recognition, semantic analysis, formal representation, and electronic communication has been carried out.

In our work, we have shown how a matrix expression containing ellipses can be analysed to extract a semantic representation that can be used for a number of purposes, including validating and improving optical character recognition of matrix expressions, symbolic calculation of expressions with such matrices, and re-representation as lambda expressions for use by theorem provers. This work has opened a number of new research avenues for machine support for mathematicians, scientists and engineers: since we can represent underspecified matrices with ellipses, we can develop systems to solve matrix problems for arbitrary dimensions directly, rather than only for individual subcases of specific dimension; we can also consider the question of generalising specific solutions without ellipses to general patterns of solutions with ellipses.
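As a purely illustrative sketch of what such a re-representation might look like (the matrix, the symbols $a_{ij}$ and $n$, and the conditional notation are assumptions chosen for this example, not the representation presented in the talk), an upper triangular matrix of arbitrary dimension written with ellipses could correspond to a function of the row and column indices:

```latex
% Illustrative only: an n-by-n upper triangular matrix written with ellipses
% (requires the amsmath package for pmatrix and \text).
\[
U =
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
0      & a_{22} & \cdots & a_{2n} \\
\vdots & \ddots & \ddots & \vdots \\
0      & \cdots & 0      & a_{nn}
\end{pmatrix}
\]
% One possible reading of the same matrix as a lambda expression over indices,
% valid for 1 <= i, j <= n.
\[
U = \lambda i\, j.\;
\begin{cases}
a_{ij} & \text{if } i \le j, \\
0      & \text{otherwise.}
\end{cases}
\]
```

The second form fixes no particular dimension, which is what would allow a prover or a computer algebra system to reason about every $n$ at once.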

In this talk we shall summarise our research results in this area to date and outline their possible generalisation to deal with ellipsis constructs in other areas. We shall also suggest a structured way of representing ellipses in a uniform format suitable for electronic communication.
