Multivariate Regression with Calibration

Lie Wang (MIT)
E62-587

We propose a new method named calibrated multivariate regression (CMR) for fitting high-dimensional multivariate regression models. Compared to existing methods, CMR calibrates the regularization for each regression task with respect to its noise level, so that it is simultaneously tuning-insensitive and achieves improved finite-sample performance. We also develop an efficient smoothed proximal gradient algorithm to implement it. Theoretically, we prove that CMR achieves the optimal rate of convergence in parameter estimation. We illustrate the usefulness…
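One common way to make this calibration concrete (a sketch of the idea, not necessarily the exact objective used in the talk) is to use an unsquared loss for each task, paired with a group penalty across tasks:

min over B of  sum_{k=1}^m || Y_k - X B_k ||_2  +  lambda * sum_{j=1}^d || B_{j,*} ||_2

Because each task's loss enters unsquared, its gradient is automatically rescaled by that task's residual norm, so a single choice of lambda can serve tasks with different noise levels; this is the sense in which such a procedure is tuning-insensitive.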

Consistency of Co-clustering exchangeable graph data

David Choi (Heinz College, Carnegie Mellon University)
E62-587

We analyze the problem of partitioning a 0-1 array or bipartite graph into subgroups (also known as co-clustering), under a relatively mild assumption that the data is generated by a general nonparametric process. This problem can be thought of as co-clustering under model misspecification; we show that the additional error due to misspecification can be bounded by O(n^(-1/4)). Our result suggests that under certain sparsity regimes, community detection algorithms may be robust to modeling assumptions, and that their usage is…
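As a much simpler illustration of the co-clustering task itself, the sketch below partitions the rows and columns of a synthetic 0-1 array with an off-the-shelf spectral method (scikit-learn's SpectralCoclustering); it is hypothetical example code, not the estimator or the misspecification analysis from the talk.

import numpy as np
from sklearn.cluster import SpectralCoclustering

# Hypothetical 0-1 array with a planted 2x2 block structure.
rng = np.random.default_rng(0)
row_blocks = rng.integers(0, 2, size=200)
col_blocks = rng.integers(0, 2, size=150)
density = np.array([[0.30, 0.05],
                    [0.05, 0.30]])          # within- vs between-block edge densities
A = (rng.random((200, 150)) < density[np.ix_(row_blocks, col_blocks)]).astype(int)

# Jointly partition rows and columns into two subgroups each.
model = SpectralCoclustering(n_clusters=2, random_state=0)
model.fit(A)
row_groups = model.row_labels_       # subgroup label for each row
col_groups = model.column_labels_    # subgroup label for each column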

Semimartingale reflecting Brownian motions: tail asymptotics for stationary distributions

Jim Dai (Cornell University)
E62-587

Multidimensional semimartingale reflecting Brownian motions (SRBMs) arise as the diffusion limits for stochastic networks. I will describe a powerful methodology to obtain the tail asymptotics of the stationary distribution of an SRBM. The methodology uses a moment generating function version of the basic adjoint relationship that characterizes the stationary distribution. The tail asymptotics can be used to predict quality of service in stochastic networks. It can also be used to speed up an algorithm, devised in Dai and Harrison (1992),…
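For orientation, one standard form of that moment generating function relationship (a sketch from the general SRBM literature; sign and domain conventions may differ from the talk) is

γ(θ) φ(θ) = sum_{i=1}^d γ_i(θ) φ_i(θ),   with γ(θ) = -( (1/2) θ' Σ θ + μ' θ )  and  γ_i(θ) = (R e_i)' θ,

where φ is the moment generating function of the stationary distribution, the φ_i are transforms of the boundary measures on the faces of the orthant, Σ and μ are the covariance and drift of the driving Brownian motion, and R is the reflection matrix. The tail asymptotics are then read off from where these transforms cease to be finite.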

Sublinear Optimization

Elad Hazan (Technion)
E62-587

In many modern optimization problems, specifically those arising in machine learning, the amount of data is too large to apply standard convex optimization methods. We'll discuss new optimization algorithms that use randomization to prune the data and produce a correct solution while running in time smaller than the size of the data representation, i.e., sublinear running time. We'll present such sublinear-time algorithms for linear classification, support vector machine training, semidefinite programming, and other optimization problems. These new algorithms are based on…
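A core primitive behind many such methods is to replace each full inner product with an unbiased estimate built from a few sampled coordinates, so the per-step cost no longer scales with the dimension. The snippet below illustrates only that sampling primitive (a hypothetical helper, not the algorithms from the talk).

import numpy as np

def sampled_dot(x, w, k, rng):
    # Unbiased estimate of <x, w> from k uniformly sampled coordinates.
    n = len(x)
    idx = rng.integers(0, n, size=k)
    return n * np.mean(x[idx] * w[idx])

rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000)
w = rng.standard_normal(1_000_000)
approx = sampled_dot(x, w, k=1_000, rng=rng)   # touches 1,000 coordinates instead of 10^6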
