Talk abstract:
Universal Modeling and Coding
Jorma Rissanen, IBM Almaden Research Center
Universal models are an outgrowth of the ideas rooted in universal
coding, stochastic complexity, and modeling by the shortest
code length or the MDL principle, which themselves are
modifications of algorithmic complexity, as introduced by Solomonoff.
Much like algorithmic complexity, which permits the definition
of a universal probability model, albeit a noncomputable one,
stochastic complexity can be used to define a computable model,
which is universal for a class of probability measures. Although
it can fully learn to imitate these models from their samples
only asymptotically, we can prove for many classes of models
that asymptotically no "estimate" of the data-generating machinery
can perform better than the universal model, which then may
be regarded as the ultimate "estimator".
In reality, such a foundation for statistics does not require
the data, let alone the parameters, to be a sample from any
population. The selected probability models are just a convenient
way via the Kraft inequality to describe good codes without
need to resort to the (false) claim that any of them has generated
the data. The central issue then becomes model fitting rather
than estimation, in which the elusive model complexity plays
a natural and essential part. This brings advantages well beyond
the reach of the traditional techniques, especially in complex
modeling problems.
This talk will outline algorithms for universal density and
regression models as well as a new way to do universal coding.
Back to Workshop Schedule
1996-1997
Mathematics in High Performance Computing
|