Stochastics and Statistics Seminar

Active learning with seed examples and search queries

April 14, 2017 @ 11:00 am - 12:00 pm

Daniel Hsu (Columbia)

Abstract: Active learning is a framework for supervised learning that explicitly models, and permits one to control and optimize, the costs of labeling data. The hope is that by carefully selecting which examples to label in an adaptive manner, the number of labels required to learn an accurate classifier is substantially reduced. However, in many learning settings (e.g., when some classes are rare), it is difficult to identify which examples are most informative to label, and existing active learning algorithms are prone to labeling uninformative examples.

I’ll describe some improvements to active learning algorithms, and to the active learning framework itself, that are better at identifying the informative examples. I’ll formalize the common practice of using seed examples and database search in learning, and demonstrate its benefits in active learning.

Based on joint work with Alekh Agarwal, Alina Beygelzimer, Nicholas Herrera, TK Huang, John Langford, Rob Schapire, and Chicheng Zhang.

http://www.cs.columbia.edu/~djhsu/


MIT Statistics + Data Science Center
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139-4307
617-253-1764