Learning and estimation: separated at birth, reunited at last

Alexander Rakhlin (University of Pennsylvania, The Wharton School)
E62-587

Abstract: We consider the problem of regression in three scenarios: (a) random design under the assumption that the model F is correctly specified, (b) distribution-free statistical learning with respect to a reference class F, and (c) online regression with no assumption on the generative process. The first problem is often studied in the literature on nonparametric estimation, the second falls within the purview of statistical learning theory, and the third is studied within the online learning community. It is recognized…

Linear and Conic Programming Approaches to High-Dimensional Errors-in-variables Models

Alexandre Tsybakov (CREST-ENSAE)
E62-587

We consider the regression model with observation error in the design when the dimension can be much larger than the sample size and the true parameter is sparse. We propose two new estimators, based on linear and conic programming, and we prove that they satisfy oracle inequalities similar to those for the model with exactly known covariates. The only difference is that they contain additional scaling with the l1 or l2 norm of the true parameter. The scaling with the…
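The talk's estimators for the errors-in-variables setting are not reproduced here, but the flavor of a linear-programming regression estimator can be illustrated with the classical Dantzig selector, which minimizes the l1 norm subject to a bound on the correlation of the residual with the design. A minimal sketch (the function name and the toy data are illustrative, not from the talk):

```python
import numpy as np
from scipy.optimize import linprog

def dantzig_selector(X, y, lam):
    """min ||b||_1  subject to  ||X^T (y - X b)||_inf <= lam, as an LP.

    Split b = u - v with u, v >= 0, so that ||b||_1 = sum(u) + sum(v).
    """
    p = X.shape[1]
    G = X.T @ X
    c = np.ones(2 * p)                      # objective: sum(u) + sum(v)
    # Constraint -lam <= X^T y - G(u - v) <= lam, as two blocks of <= rows
    A_ub = np.vstack([np.hstack([G, -G]),   #  G(u - v) <= lam + X^T y
                      np.hstack([-G, G])])  # -G(u - v) <= lam - X^T y
    rhs = X.T @ y
    b_ub = np.concatenate([lam + rhs, lam - rhs])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * (2 * p), method="highs")
    u, v = res.x[:p], res.x[p:]
    return u - v

# Toy check: noiseless data and a tiny lam recover the sparse parameter
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))
beta = np.zeros(10)
beta[0], beta[3] = 2.0, -1.5
b_hat = dantzig_selector(X, X @ beta, lam=1e-6)
```

The talk's point is that a similar LP or conic program can be made to work even when X itself is observed with error, at the price of the extra scaling mentioned in the abstract.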

Integer Feasibility of Random Polytopes

Karthekeyan Chandrasekaran (Harvard)
E62-587

Optimization problems are ubiquitous in contemporary engineering. A principal barrier to solving several real-world optimization problems is input uncertainty. In this talk, I will present new tools to study probabilistic instances of integer programs. As an application, I will show a phase-transition phenomenon in a simple distribution model for random integer programs. Our main tool is an elementary connection between integer programming and matrix discrepancy. I will describe this connection and derive matching upper and lower bounds on the discrepancy…
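The central object in the abstract, matrix discrepancy, has a compact definition: the discrepancy of a matrix A is the minimum over sign vectors x in {-1, +1}^n of the largest entry of |Ax|. A brute-force sketch for small instances (exponential in the number of columns, for illustration only):

```python
import itertools
import numpy as np

def discrepancy(A):
    """disc(A) = min over x in {-1,+1}^n of ||A x||_inf, by brute force."""
    n = A.shape[1]
    return min(np.abs(A @ np.array(x)).max()
               for x in itertools.product((-1, 1), repeat=n))
```

For example, a single all-ones row of even length has discrepancy 0 (the signs can cancel exactly), while the identity matrix has discrepancy 1.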

On the Power of Adversarial Infections in Networks

Michael Brautbar (MIT)
E62-587

Over the last decade we have witnessed the rapid proliferation of online networks and Internet activity. Such activity is widely considered a blessing, but it brings with it a large increase in the risk of computer malware: malignant software that actively spreads from one computer to another. To date, the majority of existing models of malware spread use stochastic behavior, where the set of neighbors infected from the current set of infected nodes is chosen obliviously. In this work, we…
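As a point of contrast for the adversarial infections studied in the talk, the standard stochastic spread that the abstract alludes to can be sketched as an independent cascade, in which each newly infected node infects each healthy neighbor independently with a fixed probability (the graph, probability, and seed set below are illustrative):

```python
import random

def independent_cascade(neighbors, seeds, p, rng=None):
    """Oblivious stochastic spread: every newly infected node tries to
    infect each of its uninfected neighbors independently with prob. p."""
    rng = rng or random.Random(0)
    infected = set(seeds)
    frontier = list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in neighbors[u]:
                if v not in infected and rng.random() < p:
                    infected.add(v)
                    nxt.append(v)
        frontier = nxt
    return infected

# A path graph 0-1-2-3: with p = 1 the infection reaches every node,
# with p = 0 it never leaves the seed set
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
```

An adversarial model instead lets the malware choose which neighbors to infect, which is what makes the worst-case analysis in the talk different.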

Degree-degree dependencies in random graphs with heavy-tailed degrees

Nelly Litvak (University of Twente)
E62-587

Mixing patterns in large self-organizing networks, such as the Internet, the World Wide Web, and social and biological networks, are often characterized by degree-degree dependencies between neighboring nodes. In assortative networks the degree-degree dependencies are positive (nodes with similar degrees tend to connect to each other), while in disassortative networks these dependencies are negative. One of the problems with the commonly used Pearson correlation coefficient, also known as the assortativity coefficient, is that its magnitude decreases with the network size in…
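The assortativity coefficient the abstract refers to is simply the Pearson correlation between the degrees at the two ends of a uniformly random edge. A minimal sketch on an edge list (pure Python/NumPy, for illustration):

```python
import numpy as np

def assortativity(edges):
    """Pearson correlation between the end-point degrees over all edges,
    with each undirected edge counted in both orientations."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    x, y = [], []
    for u, v in edges:            # both orientations, for symmetry
        x += [deg[u], deg[v]]
        y += [deg[v], deg[u]]
    return np.corrcoef(x, y)[0, 1]

# A star graph is maximally disassortative: every edge joins the
# high-degree hub to a degree-1 leaf, so the coefficient is -1
coeff = assortativity([(0, i) for i in range(1, 5)])
```

The talk's observation is that for heavy-tailed degree sequences this Pearson-based statistic shrinks toward zero as the network grows, motivating rank-based alternatives.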

Conditional Phenomena in Time and Space

Ramon van Handel (Princeton University)
E62-587

Random phenomena are ubiquitous throughout science and engineering. Beyond the study of stochastic models in their own right, in many applications it is important to understand what information can be extracted from observed data. Mathematically, such "data assimilation" problems motivate the investigation of probabilistic phenomena that arise from conditioning. This topic has connections to several areas of probability, ergodic theory, measure theory, and statistical mechanics, as well as practical implications for the design and analysis of…

Multivariate Regression with Calibration

Lie Wang (MIT)
E62-587

We propose a new method named calibrated multivariate regression (CMR) for fitting high-dimensional multivariate regression models. Compared to existing methods, CMR calibrates the regularization for each regression task with respect to its noise level, so that it is simultaneously tuning-insensitive and achieves improved finite-sample performance. We also develop an efficient smoothed proximal gradient algorithm to implement it. Theoretically, we prove that CMR achieves the optimal rate of convergence in parameter estimation. We illustrate the usefulness…

Consistency of Co-clustering Exchangeable Graph Data

David Choi (Heinz College, Carnegie Mellon University)
E62-587

We analyze the problem of partitioning a 0-1 array or bipartite graph into subgroups (also known as co-clustering), under a relatively mild assumption that the data is generated by a general nonparametric process. This problem can be thought of as co-clustering under model misspecification; we show that the additional error due to misspecification can be bounded by O(n^(-1/4)). Our result suggests that under certain sparsity regimes, community detection algorithms may be robust to modeling assumptions, and that their usage is…
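The co-clustering task itself can be illustrated with a generic baseline that is not the paper's estimator: cluster the rows and columns of a 0-1 array independently, using k-means on row profiles and on column profiles (all names and the planted-block toy data below are illustrative):

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def co_cluster(A, k_rows, k_cols, seed=0):
    """Partition the rows and columns of a 0-1 array into subgroups by
    clustering row profiles and column profiles separately (a simple
    baseline, not the consistency-analyzed estimator from the talk)."""
    _, row_labels = kmeans2(A.astype(float), k_rows, minit="++", seed=seed)
    _, col_labels = kmeans2(A.T.astype(float), k_cols, minit="++", seed=seed)
    return row_labels, col_labels

# A planted 2x2 block structure is recovered up to label permutation
A = np.zeros((6, 6))
A[:3, :3] = 1.0
A[3:, 3:] = 1.0
rows, cols = co_cluster(A, 2, 2)
```

The point of the talk is what happens when no such block model actually generated the data: the misspecification error of the best blockwise-constant fit can still be controlled.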

Semimartingale reflecting Brownian motions: tail asymptotics for stationary distributions

Jim Dai (Cornell University)
E62-587

Multidimensional semimartingale reflecting Brownian motions (SRBMs) arise as the diffusion limits for stochastic networks. I will describe a powerful methodology to obtain the tail asymptotics of the stationary distribution of an SRBM. The methodology uses a moment generating function version of the basic adjoint relationship that characterizes the stationary distribution. The tail asymptotics can be used to predict quality of service in stochastic networks. It can also be used to speed up an algorithm, devised in Dai and Harrison (1992),…

Sublinear Optimization

Elad Hazan (Technion)
E62-587

In many modern optimization problems, specifically those arising in machine learning, the amount of data is too large to apply standard convex optimization methods. We'll discuss new optimization algorithms that use randomization to prune the data and produce a correct solution while running in time smaller than the size of the data representation, i.e., in sublinear time. We'll present such sublinear-time algorithms for linear classification, support vector machine training, semidefinite programming, and other optimization problems. These new algorithms are based on…
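The sublinear-time algorithms in the talk combine multiplicative weights with careful sampling; as a much simpler illustration of the underlying randomization idea, a learner can make progress while touching only a small random subsample of the rows per step instead of scanning all the data (a toy sketch, not the talk's algorithm; all names and data are illustrative):

```python
import numpy as np

def sampled_perceptron(X, y, epochs=200, sample=5, seed=0):
    """Perceptron-style training that, at each step, looks only at a
    small random subsample of the data rows (toy randomization sketch)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        idx = rng.choice(len(X), size=sample, replace=False)
        for i in idx:
            if y[i] * (X[i] @ w) <= 0:   # sampled point is misclassified
                w += y[i] * X[i]         # standard perceptron update
    return w

# Linearly separable toy data with a unit margin on the first coordinate
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 3))
y = np.sign(X[:, 0])
X[:, 0] += y
w = sampled_perceptron(X, y)
```

The actual algorithms in the talk go further, achieving provable guarantees with total running time sublinear in the size of the input.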


© MIT Statistics + Data Science Center | 77 Massachusetts Avenue | Cambridge, MA 02139-4307 | 617-253-1764