Factored Gradient Descent: on Dropping Convexity for Faster Optimization

Monday, May 16, 2016 - 10:00am - 10:50am
Keller 3-180
Sujay Sanghavi (The University of Texas at Austin)
Local algorithms like gradient descent are widely used in non-convex optimization, typically with few performance guarantees. In this talk we consider the class of problems given by

min_{U,V} f(UV’)

where f is a convex function on the space of matrices. The problem is non-convex, but “only” because we adopt the bilinear factored representation UV’, with tall matrices U, V.

We delve into the geometry of the problem, and establish an O(1/t) local convergence rate when f is smooth, and linear convergence when f is strongly convex. We show that the constants depend not only on the shape of f (as in the convex case) but also on the condition number of the optimal point.
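To make the setup concrete, here is a minimal sketch of factored gradient descent for the simplest instance, f(X) = ||X - M||_F^2 with X = UV'. This is an illustration only, not the authors' implementation; the step size, initialization scale, and iteration count are illustrative choices.

```python
import numpy as np

def factored_gd(M, r, step=0.01, iters=500, seed=0):
    """Minimize f(UV') = ||UV' - M||_F^2 over tall matrices U, V
    by gradient descent on the factors (illustrative parameters)."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    # Small random initialization of the factors.
    U = 0.1 * rng.standard_normal((m, r))
    V = 0.1 * rng.standard_normal((n, r))
    for _ in range(iters):
        G = U @ V.T - M  # gradient of (1/2)||X - M||_F^2 at X = UV'
        # Chain rule: grad wrt U is G V, grad wrt V is G' U.
        U, V = U - step * (G @ V), V - step * (G.T @ U)
    return U, V

# Usage: recover a random rank-2 matrix.
rng = np.random.default_rng(1)
M = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 15))
U, V = factored_gd(M, r=2)
rel_err = np.linalg.norm(U @ V.T - M) / np.linalg.norm(M)
```

Note that the iterates never form the full matrix variable as an optimization variable: all updates act on the thin factors U and V, which is the source of the computational savings (and of the non-convexity).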

Factored gradient descent is widely used in matrix estimation; our work provides a generic (as opposed to specific application-dependent) view on its performance.

(Joint work with Srinadh Bhojanapalli, Anastasios Kyrillidis and Dohyung Park)