Transformers Learn Generalizable Chain-of-Thought Reasoning via Gradient Descent
October 3, 2025, 11:00 am to 12:00 pm
Yuejie Chi, Yale University

Do Large Language Models (Really) Need Statistical Foundations?
October 10, 2025, 11:00 am to 12:00 pm
Weijie Su, University of Pennsylvania

Hard-Constrained Neural Networks
October 17, 2025, 11:00 am to 12:00 pm
Navid Azizan, MIT

Learning to Price Electricity for Optimal Demand Response
October 24, 2025, 11:00 am to 12:00 pm
Stefan Wager, Stanford University

Attention Sinks: A ‘Catch, Tag, Release’ Mechanism for Embeddings
October 31, 2025, 11:00 am to 12:00 pm
Vardan Papyan, University of Toronto