Bandwidth selection for kernel density estimators of multivariate level sets and highest density regions

Tuesday, February 6, 2018 - 1:25pm - 2:25pm
Lind 305
Charles Doss (University of Minnesota, Twin Cities)
We consider bandwidth matrix selection for kernel density estimators (KDEs) of density level sets in $\mathbb{R}^d$, $d \ge 2$. We also consider estimation of highest density regions, which differs from level set estimation in that one specifies the probability content of the set rather than the level itself; this complicates the problem. Bandwidth selection for KDEs is well studied, but most methods aim to minimize a global loss for the density or its derivatives. The loss we consider here is instead the measure of the symmetric difference between the true set and the estimated set. We derive an asymptotic approximation to the corresponding risk. The approximation depends on unknown quantities that can be estimated; minimizing the resulting estimate yields a bandwidth choice that performs well in our simulations.
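
To make the loss concrete, here is a minimal numerical sketch in Python. It is not the bandwidth selector from the talk. It is an oracle illustration that assumes the true density is known (a standard bivariate normal), restricts the bandwidth matrix to the form H = h^2 I, and approximates all integrals by Riemann sums on a grid. The function names (`kde`, `hdr_level`, `sym_diff_measure`), the sample size, the grid, and the candidate values of h are all hypothetical choices made for illustration.

```python
import numpy as np

def kde(points, data, H):
    """Gaussian KDE with bandwidth matrix H, evaluated at each row of `points`."""
    n, d = data.shape
    H_inv = np.linalg.inv(H)
    const = 1.0 / (n * np.sqrt((2.0 * np.pi) ** d * np.linalg.det(H)))
    diff = points[:, None, :] - data[None, :, :]      # shape (m, n, d)
    quad = ((diff @ H_inv) * diff).sum(axis=-1)       # (x - X_i)^T H^{-1} (x - X_i)
    return const * np.exp(-0.5 * quad).sum(axis=1)

def hdr_level(dens, cell_area, prob):
    """Level c at which the grid mass of {dens >= c} first reaches `prob`."""
    vals = np.sort(dens.ravel())[::-1]                # densities, largest first
    mass = np.cumsum(vals) * cell_area                # Riemann-sum mass of the top cells
    idx = min(np.searchsorted(mass, prob), len(vals) - 1)
    return vals[idx]

def sym_diff_measure(set_a, set_b, cell_area):
    """Approximate Lebesgue measure of the symmetric difference of two grid sets."""
    return cell_area * np.count_nonzero(set_a ^ set_b)

# Toy setup (assumed): standard bivariate normal sample and an evaluation grid.
rng = np.random.default_rng(0)
data = rng.standard_normal((200, 2))
axis = np.linspace(-4.0, 4.0, 61)
xx, yy = np.meshgrid(axis, axis)
grid = np.column_stack([xx.ravel(), yy.ravel()])
cell_area = (axis[1] - axis[0]) ** 2

# True highest density region with probability content `prob`.
true_dens = np.exp(-0.5 * (grid ** 2).sum(axis=1)) / (2.0 * np.pi)
prob = 0.5
true_level = hdr_level(true_dens, cell_area, prob)    # close to (1 - prob) / (2 * pi)
true_set = true_dens >= true_level

# Oracle scan over scalar bandwidths h, with H = h^2 * I.
losses = {}
for h in (0.2, 0.4, 0.8):
    est = kde(grid, data, h ** 2 * np.eye(2))
    est_set = est >= hdr_level(est, cell_area, prob)  # plug-in HDR estimate
    losses[h] = sym_diff_measure(est_set, true_set, cell_area)

best_h = min(losses, key=losses.get)
print(losses, best_h)
```

Replacing the estimated level `hdr_level(est, cell_area, prob)` with a fixed constant turns the same scan into the level set version of the problem. In practice the true set is unknown, which is why a data-driven selector has to work from an estimate of the risk rather than from the loss computed here.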

I am an assistant professor in the School of Statistics at the University of Minnesota. I have worked largely on problems related to nonparametric estimation of, and inference for, density or regression functions. In particular, I have worked on shape-constrained problems, where the function or some transformation of it is concave. Shape constraints serve as natural regularizers and automatically yield many nice properties; they often lead to estimators that select their tuning parameters automatically.