Sanjay Shakkottai, UT Austin

Title: On the Throughput vs Accuracy Trade-Off for Streaming Unsupervised Classification

We study an online and streaming unsupervised classification system. Our setting consists of a collection of classifiers (with unknown confusion matrices) each of which can classify one sample per unit time, and which are accessed by a stream of unlabeled samples. Each sample is dispatched to one or more classifiers, and depending on the labels collected from these classifiers, may be sent to other classifiers to collect additional labels. The labels are continually aggregated. Once the aggregated label has high enough accuracy (a pre-specified threshold for accuracy) or the sample is sent to all the classifiers, the now labeled sample is ejected from the system. For any given pre-specified threshold for accuracy, the objective is to sustain the maximum possible rate of arrival of new samples, such that the number of samples in memory does not grow unbounded.

In this talk, we characterize the Pareto-optimal region of accuracy and arrival rate, and discuss an algorithm that can operate at any point within this region. Our algorithm uses queueing-based routing and scheduling approaches combined with online tensor decomposition method to learn the hidden parameters, and provides Pareto-optimality guarantees. We finally verify our theoretical results through simulations on two ensembles formed using AlexNet, VGG, and ResNet deep image classifiers. Based on joint work with Soumya Basu, Steven Gutstein and Brent Lance.


Shivani Agarwal, U Penn

Title: Designing Statistically Consistent Learning Algorithms: The Role of Surrogate Loss Functions

For any machine learning problem, a basic desideratum of a good learning algorithm is that as it gets more and more training data, the performance of the learned model should converge to that of a Bayes optimal model -- in other words, the learning algorithm should be statistically consistent. How, then, can one design statistically consistent learning algorithms that are also computationally efficient?

In particular, in many learning problems where directly optimizing the target performance measure is computationally difficult, a practical approach to designing computationally efficient learning algorithms is that of minimizing a convex surrogate loss. A natural question that arises then is: Can such surrogate losses provide statistical consistency guarantees?

This tutorial will provide answers to some of these questions. In particular, we will review the theory of convex calibrated surrogate losses, which lead to statistically consistent learning algorithms, as well as "proper" surrogate losses and their connections with property elicitation. We will also discuss recent results establishing that all calibrated surrogates can effectively be viewed as proper surrogates that estimate a suitable "sufficient" property of the data distribution. With these tools, the task of designing of a statistically consistent learning algorithm reduces to the task of identifying a suitable "sufficient" property for the target learning problem, and then designing a proper surrogate loss for estimating this property.


Aaditya Ramdas, CMU

Title: Assumption-free uncertainty quantification for black-box algorithms

The fields of statistics and machine learning have made tremendous progress in the last few decades in designing accurate black-box prediction methods (boosting, random forests, bagging, neural nets, etc.) but for deployment in the real world, it would be useful to have uncertainty quantification for those point-predictions. In this tutorial, I will summarize recent work that my collaborators and I have done over the last few years, for designing a large class of such methods that work without any assumptions on the algorithm, or the distribution of the covariates, or the distribution of the labels, etc, just relying on “exchange ability” of the test and training points.

This talk is based on a sequence of joint works by the BaCaRaTi group (Rina Barber, Emmanuel Candes, myself, Ryan Tibshirani), + some recent work with a collaborator, Arun Kumar Kuchibhotla, and my student Chirag Gupta.


Sanjoy Mitter, MIT

Title:  Information and Entry Flow in Statistical Filtering.

In this talk I discuss Statistical Filtering as an Informationally Dissipative System leading to a relation between Shannon Information and fisher Information. When viewed as a problem in Bayesian Inference the Optimal Fileter (Conditional Distribution ) corresponds to a variational Principle related to Free Energy Minimization. These ideas are related to the recent work of Otto, Jordan and Kinderlehrer on the variational Principle for the Fokker Planck Equation.