Fast Algorithms for Dealing with Data and Understanding Them


NOTE: This synopsis of S. Smale's talk was written by S. Adams. Credit for the novelty and depth of the results should go to Smale, but blame for inaccuracies should go to Adams.


Traditionally, if $(x_i,y_i)$ are data points, with $x_i,y_i\in{{\fam\msbfam\relax R}}$, then one seeks to minimize $\sum_i [(f(x_i))-y_i]^2$ over linear functions $f$. Under the algorithm proposed in Smale's talk, one requires $y_i\in{{\fam\msbfam\relax R}}$, but $x_i$ ranges inside some set $X$ with a prespecified symmetric positive semidefinite kernel $K:X\times X\to{{\fam\msbfam\relax R}}$. One then seeks to minimize a functional (to be explained in a moment) over all functions $f:X\to{{\fam\msbfam\relax R}}$; in fact, no sense can be made of linearity at this level of generality, since the set $X$ does not have the structure of a linear space.

Recall that we say $K$ is symmetric in case $K(x,x')=K(x',x)$, for all $x,x'\in X$. Recall that we say that $K$ is positive semidefinite in case, for all $n$, for all $x'_1,...,x'_n\in X$, we have that the matrix $[K(x'_i,x'_j)]$ is positive semidefinite. For all $x\in X$, let $K_x:X\to{{\fam\msbfam\relax R}}$ be defined by $K_x(x')=K(x,x')$.

The functional described in Smale's talk carries a function $f$ to

\begin{displaymath}\left\{\sum_i[(f(x_i))-y_i]^2\right\}+\vert f\vert^2_K,\end{displaymath}

where $\vert f\vert _K$, the $K$-norm of $f$, extends, to its quotient-completion, the positive semidefinite bilinear form $(K_x,K_{x'})_K=K(x,x')$. Smale also explained how, through quite simple calculus of variations arguments, this minimization problem leads to a linear programming problem whose solution yields the "curve of best fit" for the given data points, with respect to $K$.

A wide application of this interpolation method is guaranteed by the fact that positive semidefinite kernels occur with some regularity in nature. For example, if $X={{\fam\msbfam\relax R}}^n$, then $K(x,x')=\exp(-\vert x-x\vert^2)$ is positive semidefinite. In fact, if $D={\rm diag}(\exp(-\vert x'_1\vert^2),\ldots,\exp(-\vert x'_n\vert^2))$ and $S=[\exp(2(x'_i,x'_j))]$. then $[K(x'_i,x'_j)] = D S D$, so it suffices, then to show that $S$ is positive semidefinite. Since $[2(x'_i,x'_j)]$ is positive semidefinite one needs only show that the entry-by-entry exponential of a positive semidefinite matrix is again positive semidefinite. Through power series, it therefore suffices to show that the entry-by-entry product of two positive semidefinite matricies is again positive semidefinite. However this entry-by-entry product sits as a diagonal block in the tensor product, and positive semidefiniteness is closed under tensor product and passage to submatrices sitting on the diagonal.


Scot Adams 2003-05-21