Learning to Price Electricity for Optimal Demand Response
On October 24, 2025, from 11:00 am to 12:00 pm, E18-304
Abstract:
The time at which renewable (e.g., solar or wind) energy resources produce electricity cannot generally be controlled. In many settings, however, consumers have some flexibility in their energy consumption needs, and there is growing interest in demand-response programs that leverage this flexibility to shift energy consumption to better match renewable production — thus enabling more efficient utilization of these resources. We study optimal demand response in a setting where consumers use home energy management systems (HEMS) to autonomously adjust their electricity consumption. Our core assumption is that HEMS operationalize flexibility by querying the consumer for their preferences and computing the “indifference set” of all energy consumption profiles that can be used to satisfy these preferences. Then, given an indifference set, HEMS can respond to grid signals while guaranteeing user-defined comfort and functionality; e.g., if a consumer sets a temperature range, a HEMS can precool and preheat to align with peak renewable production, thus improving efficiency without sacrificing comfort. We show that while price-based mechanisms are not generally optimal for demand response, they become asymptotically optimal in large markets under a mean-field limit. Furthermore, we show that optimal dynamic prices can be efficiently computed in large markets by only querying HEMS about their planned consumption under different price signals. We leverage this result to build an online contextual pricing algorithm, and show that it enables a considerable reduction in peak system load in simulators calibrated to a number of major US cities.
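For intuition only, here is a minimal Python sketch of the query-based idea described above; it is not the speaker's algorithm. Simulated HEMS agents report their planned hourly consumption under a candidate price signal, and prices are nudged iteratively to flatten the aggregate load. The softmax response model, the heuristic price update, and all constants are assumptions made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

T = 24                                     # hourly periods in one day
N = 200                                    # number of simulated HEMS agents
hours = np.arange(T)
daily_need = rng.uniform(5.0, 15.0, N)     # total daily energy need per home (kWh, made up)
pref_hour = rng.normal(18, 2, N)           # preferred consumption hour (evening peak)

def hems_response(prices, need, pref):
    """Toy HEMS: trade off price against distance from the preferred hour,
    a stand-in for optimizing within the consumer's indifference set."""
    score = -2.0 * prices - 0.1 * (hours - pref) ** 2
    weights = np.exp(score - score.max())
    weights /= weights.sum()
    return need * weights                  # planned consumption per hour

def total_load(prices):
    return sum(hems_response(prices, n, p) for n, p in zip(daily_need, pref_hour))

prices = np.ones(T)                        # start from a flat price signal
print("initial peak/average load:", total_load(prices).max() / total_load(prices).mean())

for _ in range(100):
    load = total_load(prices)              # query every HEMS under the current prices
    # Raise prices in above-average-load hours and lower them elsewhere:
    # a simple heuristic aimed at flattening the aggregate load curve.
    prices += 0.1 * (load - load.mean()) / load.mean()
    prices = np.clip(prices, 0.1, None)

print("final peak/average load:  ", total_load(prices).max() / total_load(prices).mean())
```

In this toy market the peak-to-average load ratio falls as the loop proceeds, which is the qualitative effect the abstract describes; the talk itself concerns when and why such price signals are asymptotically optimal.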
Bio:
Stefan Wager is an associate professor of Operations, Information, and Technology at the Stanford Graduate School of Business, an associate professor of Statistics (by courtesy), and the Philip F. Maritz Faculty Scholar for 2025-26. His research lies at the intersection of causal inference, optimization, and statistical learning. He is particularly interested in developing new solutions to problems in statistics, economics and decision making that leverage recent advances in machine learning. He is currently serving as an associate editor for several publications including Biometrika, Management Science, Operations Research, and the Journal of the American Statistical Association. He has worked with or consulted for several Silicon Valley companies, including Dropbox, Facebook, Google, and Uber.
Attention Sinks: A ‘Catch, Tag, Release’ Mechanism for Embeddings
On October 31, 2025, from 11:00 am to 12:00 pm, E18-304
Abstract:
Large language models (LLMs) often concentrate their attention on a small set of tokens—referred to as attention sinks. Common examples include the first token, a prompt-independent sink, and punctuation tokens, which are prompt-dependent. Although these tokens often lack inherent semantic meaning, their presence is critical for model performance, particularly under model compression and KV-caching. Yet, the function, semantic role, and origin of attention sinks—especially those beyond the first token—remain poorly understood.
In this talk, I’ll present a comprehensive investigation revealing that attention sinks catch a sequence of tokens, tag them with a shared perturbation, and release them back into the residual stream, where they are later retrieved based on the tags they carry. Probing experiments show that these tags encode semantically meaningful information, such as the truth of a statement.
This mechanism persists in models with query-key normalization—where prompt-dependent, non-BOS sinks have become more common—and in DeepSeek-distilled models, where it spans more heads and accounts for greater variance in the embeddings. To support future theoretical work, we introduce a minimal task that is solvable via the catch, tag, release mechanism, and in which the mechanism naturally emerges through training.
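To make the probing idea concrete, here is a rough sketch of the generic recipe (not the speaker's actual setup): fit a linear probe on last-token residual-stream embeddings to decode whether a statement is true. The choice of GPT-2, the probe site (the final token), and the tiny hand-made dataset are all assumptions for illustration.

```python
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

# Illustrative stand-ins for the models and probing data used in the actual work.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2").eval()

data = [
    ("Paris is the capital of France.", 1),
    ("The sun rises in the west.", 0),
    ("Water freezes at zero degrees Celsius.", 1),
    ("Two plus two equals five.", 0),
    ("The Pacific is the largest ocean.", 1),
    ("Spiders are mammals.", 0),
]

def embed(text):
    """Return the residual-stream embedding of the final token of `text`."""
    with torch.no_grad():
        ids = tok(text, return_tensors="pt")
        hidden = model(**ids).last_hidden_state     # shape (1, seq_len, d_model)
    return hidden[0, -1].numpy()

X = [embed(s) for s, _ in data]
y = [label for _, label in data]

# A linear probe: if truth is linearly decodable from the embedding,
# the "tag" carried by the token plausibly encodes it.
probe = LogisticRegression(max_iter=1000).fit(X, y)
print("train accuracy:", probe.score(X, y))
```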
Bio:
Vardan Papyan is an Assistant Professor in the Department of Mathematics at the University of Toronto, cross-appointed with the Department of Computer Science. He completed his postdoctoral studies at the Department of Statistics at Stanford University, under the guidance of David Donoho, and his PhD at the Department of Computer Science at the Technion – Israel Institute of Technology, under the supervision of Michael Elad.
Back to the future – data-efficient language modeling
On November 7, 2025, from 11:00 am to 12:00 pm, E18-304
Abstract:
Compute scaling has dominated the conversation around modern language models, leading to an impressive array of algorithms that optimize performance for a given training (and sometimes inference) compute budget. But as compute has grown cheaper and more abundant, data is starting to become a bottleneck, and our ability to exchange compute for data efficiency may be crucial to future model scaling. In this talk, I will discuss some of our recent work on synthetic data and algorithmic approaches to data efficiency, and show that in both cases, classical statistical perspectives based on nonparametric modeling and ensembling bring new insights and empirical benefits to modern questions of scaling and data efficiency.
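As a toy illustration of the classical ensembling perspective mentioned above (a generic bagging example, not taken from the talk and unrelated to language models): averaging a high-variance estimator over bootstrap resamples extracts more predictive accuracy from the same fixed dataset.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Small synthetic regression dataset standing in for a limited-data regime.
n, d = 200, 5
X = rng.uniform(-2, 2, size=(n, d))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.3, size=n)
X_test = rng.uniform(-2, 2, size=(2000, d))
y_test = np.sin(X_test[:, 0]) + 0.5 * X_test[:, 1] ** 2

# A single deep tree fit on all the data (low bias, high variance).
single = DecisionTreeRegressor(random_state=0).fit(X, y)

# Bagged ensemble: the same model class fit on bootstrap resamples, predictions averaged.
preds = []
for b in range(100):
    idx = rng.integers(0, n, size=n)
    preds.append(DecisionTreeRegressor(random_state=b).fit(X[idx], y[idx]).predict(X_test))
ensemble_pred = np.mean(preds, axis=0)

print("single tree MSE:    ", mean_squared_error(y_test, single.predict(X_test)))
print("bagged ensemble MSE:", mean_squared_error(y_test, ensemble_pred))
```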
Biography:
Tatsunori Hashimoto is an Assistant Professor in the Computer Science Department at Stanford University. Work from his group spans many areas within statistical machine learning and language models, including language model post-training, uncertainty quantification, and data selection. He received his Ph.D. at MIT under the supervision of Tommi Jaakkola and David Gifford, and is the recipient of the NSF CAREER award, the Samsung AI Researcher of the Year award, and a Kavli fellowship, as well as best paper awards at ICML, ICLR, and CHI.