## Past Events › Stochastics and Statistics Seminar

### Events List Navigation

## Statistical theory for deep neural networks with ReLU activation function

Johannes Schmidt-Hieber (Leiden)

Abstract: The universal approximation theorem states that neural networks are capable of approximating any continuous function up to a small error that depends on the size of the network. The expressive power of a network does, however, not guarantee that deep networks perform well on data. For that, control of the statistical estimation risk is needed. In the talk, we derive statistical theory for fitting deep neural networks to data generated from the multivariate nonparametric regression model. It is shown…

Find out more »## Optimality of Spectral Methods for Ranking, Community Detections and Beyond

Jianqing Fan (Princeton University)

Abstract: Spectral methods have been widely used for a large class of challenging problems, ranging from top-K ranking via pairwise comparisons, community detection, factor analysis, among others. Analyses of these spectral methods require super-norm perturbation analysis of top eigenvectors. This allows us to UNIFORMLY approximate elements in eigenvectors by linear functions of the observed random matrix that can be analyzed further. We first establish such an infinity-norm pertubation bound for top eigenvectors and apply the idea to several challenging problems…

Find out more »## Testing degree corrections in Stochastic Block Models

Subhabrata Sen (Microsoft)

Abstract: The community detection problem has attracted significant attention in re- cent years, and it has been studied extensively under the framework of a Stochas- tic Block Model (SBM). However, it is well-known that SBMs t real data very poorly, and various extensions have been suggested to replicate characteristics of real data. The recovered community assignments are often sensitive to the model used, and this naturally begs the following question: Given a network with community structure, how to decide whether…

Find out more »## Inference, Computation, and Visualization for Convex Clustering and Biclustering

Genevera Allen (Rice)

Abstract: Hierarchical clustering enjoys wide popularity because of its fast computation, ease of interpretation, and appealing visualizations via the dendogram and cluster heatmap. Recently, several have proposed and studied convex clustering and biclustering which, similar in spirit to hierarchical clustering, achieve cluster merges via convex fusion penalties. While these techniques enjoy superior statistical performance, they suffer from slower computation and are not generally conducive to representation as a dendogram. In the first part of the talk, we present new convex…

Find out more »## Size-Independent Sample Complexity of Neural Networks

Ohad Shamir (Weizman Institute)

Abstract: I'll describe new bounds on the sample complexity of deep neural networks, based on the norms of the parameter matrices at each layer. In particular, we show how certain norms lead to the first explicit bounds which are fully independent of the network size (both depth and width), and are therefore applicable to arbitrarily large neural networks. These results are derived using some novel techniques, which may be of independent interest. Joint work with Noah Golowich (Harvard) and Alexander…

Find out more »## Dynamic Incentive-aware Learning: Robust Pricing in Contextual Auctions

Adel Javanmard (USC)

Abstract: Motivated by pricing in ad exchange markets, we consider the problem of robust learning of reserve prices against strategic buyers in repeated contextual second-price auctions. Buyers’ valuations for an item depend on the context that describes the item. However, the seller is not aware of the relationship between the context and buyers’ valuations, i.e., buyers’ preferences. The seller’s goal is to design a learning policy to set reserve prices via observing the past sales data, and her objective is…

Find out more »## Fitting a putative manifold to noisy data

Hariharan Narayanan (Tata Institute of Fundamental Research, Mumbai)

Abstract: We give a solution to the following question from manifold learning. Suppose data belonging to a high dimensional Euclidean space is drawn independently, identically distributed from a measure supported on a low dimensional twice differentiable embedded compact manifold M, and is corrupted by a small amount of i.i.d gaussian noise. How can we produce a manifold M whose Hausdorff distance to M is small and whose reach (normal injectivity radius) is not much smaller than the reach of M? This…

Find out more »## Variational problems on random structures and their continuum limits

Dejan Slepčev (Carnegie Mellon University)

Abstract: We will discuss variational problems arising in machine learning and their limits as the number of data points goes to infinity. Consider point clouds obtained as random samples of an underlying "ground-truth" measure. Graph representing the point cloud is obtained by assigning weights to edges based on the distance between the points. Many machine learning tasks, such as clustering and semi-supervised learning, can be posed as minimizing functionals on such graphs. We consider functionals involving graph cuts, graph laplacians…

Find out more »## An Information-Geometric View of Learning in High Dimensions

Gregory Wornell (MIT)

Abstract: We consider the problem of data feature selection prior to inference task specification, which is central to high-dimensional learning. Introducing natural notions of universality for such problems, we show a local equivalence among them. Our analysis is naturally expressed via information geometry, and represents a conceptually and practically useful learning methodology. The development reveals the key roles of the singular value decomposition, Hirschfeld-Gebelein-Renyi maximal correlation, canonical correlation and principle component analyses, Tishby's information bottleneck, Wyner's common information, Ky Fan…

Find out more »## Unsupervised Ensemble Learning

Boaz Nadler (Weizmann Institute)

Abstract: In various applications, crowdsourcing in particular, one is given the advice or predictions of several classifiers of unknown reliability, over multiple questions or queries. This scenario is different from standard supervised learning where classifier accuracy can be assessed from available labeled training or validation data, and raises several questions: given only the predictions of several classifiers of unknown accuracies, over a large set of unlabeled test data, is it possible to: a) reliably rank them, and b) construct a…

Find out more »