Solving mixed integer programs arising in statistical data editing

Wednesday, July 27, 2005 - 2:45pm - 3:30pm
EE/CS 3-180
Juan-Jose Salazar-Gonzalez (University of La Laguna)
National Statistical Agencies needs automatic tools for detecting
errors in records not satisfying a set of consistency rules.
The topic is known as Statistical Data Editing, and a very
interesting Optimization Problem is the so-called Error Localization
It concerns finding the minimum number of fields in an invalid record
such that by modifying these values the new record satisfies all the rules.
It is an NP-hard problem because the Minimum Weight Solution to Linear
is a particular case. Also the Error Localization Problem is very important
in Signal Processing (the problem is then called Basis Selection Problem),

where information must be encoded and sent, or received and decoded,
through a Telecommunication Network.

We present an exact method, based on a Benders' decomposition approach,
solving instances with up to 100 fields and 40 edits. We also produce
a variant of this approach to solve heuristically larger instances.
Some procedures of this descending search make use of the Farkas' Lemma
in Linear Programming. Computational experience on randomly
generated instances shows that the approach can deal with
instances of up to 1000 fields and 400 edits.

The generality of the consistency rules makes the Error Localization Problem
very close to a general Mixed Integer Programming. Still, there is some
Therefore, we will open a discussion to the audience on how new results in
Mixed Integer Programming could help solving Error Localization Problems,
and viceversa!

This is part of a joint research work with Jorge Riera-Ledesma, University
of La Laguna, Tenerife, Spain.