Simultaneous variable and rank selection for optimal estimation of high dimensional matrices

Wednesday, September 7, 2011 - 3:45pm - 4:25pm
Keller 3-180
Florentina Bunea (Cornell University)
Modeling high dimensional data has become a ubiquitous task, and reducing the dimensionality a typical solution. This talk is devoted to optimal dimension reduction in sparse multivariate response regression models in which both the number of responses and the number of predictors may exceed the sample size. Sometimes viewed as complementary, predictor selection and rank reduction are the most popular strategies for obtaining lower dimensional approximations of the parameter matrix in such models. Neither of them alone is tailored to simultaneous selection and rank reduction, so neither can be minimax rate optimal for low rank models that involve only a few of the available predictors. To date, no estimator has been proved to have this property. The work presented here attempts to bridge this gap. We point out that, somewhat surprisingly, a procedure that first selects predictors and then reduces the rank does not always yield estimates that are minimax adaptive. We show that this can be remedied by performing joint rank and predictor selection. The methods we propose are based on penalized least squares, with new penalties designed with the appropriate notions of matrix sparsity in mind. Of special importance is the fact that these penalties are rather robust to data-adaptive choices of the tuning parameters, which makes them particularly appealing in practice. Our results can be applied immediately to standard multivariate analyses such as sparse PCA or CCA, as particular cases, and can be easily extended to inference in functional data. We support our theoretical results with an extensive simulation study and offer a concrete data example.