Papers
Reviewed papers come first, ordered by the date we reviewed them. Unreviewed papers appear at the bottom.
Gradient-based learning applied to document recognition.
LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P.
Contribution: Convolutional Neural Networks (CNNs), particularly LeNet-5.
Citation: Proceedings of the IEEE, 86(11), 2278-2324. (1998).
Attention is all you need.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I.
Contribution: The Transformer architecture.
Citation: Advances in Neural Information Processing Systems, 30, 5998-6008. (2017).
BERT: Pre-training of deep bidirectional transformers for language understanding.
Devlin, J., Chang, M. W., Lee, K., & Toutanova, K.
Contribution: Bidirectional Encoder Representations from Transformers (BERT).
Citation: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT), 4171-4186. (2019).
On the Power of Context-Enhanced Learning in LLMs.
Xingyu Zhu, Abhishek Panigrahi, Sanjeev Arora.
Contribution: Analysis of context-enhanced learning in LLMs.
Citation: Proceedings of the 42nd International Conference on Machine Learning (ICML). (2025).
Improving language understanding by generative pre-training.
Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I.
Contribution: Generative Pre-trained Transformer (GPT-1).
Citation: OpenAI Technical Report. (2018).
Which Attention Heads Matter for In-Context Learning?
Kayo Yin, Jacob Steinhardt.
Contribution: Analysis of attention head roles in in-context learning.
Citation: Proceedings of the 42nd International Conference on Machine Learning (ICML). (2025).
Generalized brain-state modeling with KenazLBM.
Graham W. Johnson, Ghassan S. Makhoul, Derek J. Doss, Bruno Hidalgo Monroy Lerma, Leon Y. Cai, Emily Liao, Danika L. Paulo.
Citation: bioRxiv 2025.08.10.669538. (2025).
A Finite-Time Analysis of Temporal Difference Learning With Linear Function Approximation.
Jalaj Bhandari, Daniel Russo, Raghav Singal.
Citation: 31st Conference on Learning Theory (COLT), PMLR 75, 1691-1692. (2018).
Byzantine-Robust Distributed Learning: Towards Optimal Statistical Rates.
Dong Yin, Yudong Chen, Kannan Ramchandran, Peter Bartlett.
Citation: 35th International Conference on Machine Learning (ICML), PMLR 80, 5650-5659. (2018).
CNN Explainer: Learning Convolutional Neural Networks with Interactive Visualization.
Zijie J. Wang, Robert Turko, Omar Shaikh, Haekyu Park, Nilaksh Das, Fred Hohman, Minsuk Kahng, and Duen Horng (Polo) Chau.
Contribution: Interactive visualization tool for learning CNNs.
Citation: IEEE Transactions on Visualization and Computer Graphics, 27(2), 1396-1406. (2021).
Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation.
Danny Halawi, Alexander Wei, Eric Wallace, Tony T. Wang, Nika Haghtalab, Jacob Steinhardt.
Contribution: Analysis of malicious finetuning vulnerabilities in LLMs.
Citation: Proceedings of the 41st International Conference on Machine Learning (ICML). (2024).
Estimating Divergence Functionals and the Likelihood Ratio by Convex Risk Minimization.
XuanLong Nguyen, Martin J. Wainwright, Michael I. Jordan.
Citation: IEEE Transactions on Information Theory, 56(11), 5847-5861. (2010).
Explaining Vulnerabilities to Adversarial Machine Learning through Visual Analytics.
Yuxin Ma, Tiankai Xie, Jundong Li, Ross Maciejewski.
Contribution: Visual analytics for explaining adversarial ML vulnerabilities.
Citation: IEEE Transactions on Visualization and Computer Graphics, 26(1), 1075-1085. (2020).
Language models are unsupervised multitask learners.
Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I.
Contribution: GPT-2 and demonstrating zero-shot task learning.
Citation: OpenAI Blog, 1(8). (2019).
Learning Safety Constraints for Large Language Models.
Xin Chen, Yarden As, Andreas Krause.
Contribution: Method for learning safety constraints for LLMs.
Citation: Proceedings of the 42nd International Conference on Machine Learning (ICML). (2025).
Multiple Forecast Visualizations (MFVs): Trade-offs in Trust and Performance in Multiple COVID-19 Forecast Visualizations.
Lace Padilla, Racquel Fygenson, Spencer C. Castro, Enrico Bertini.
Contribution: Study on trust and performance in forecast visualizations.
Citation: IEEE Transactions on Visualization and Computer Graphics, 29(1), 589-599. (2023).
Provably Learning a Multi-head Attention Layer.
Sitan Chen, Yuanzhi Li.
Contribution: Theoretical analysis of learning multi-head attention.
Citation: Proceedings of the 57th Annual ACM Symposium on Theory of Computing (STOC). (2025).
Score-Based Generative Modeling Through Stochastic Differential Equations.
Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, Ben Poole.
Citation: International Conference on Learning Representations (ICLR). (2021).
Score-Based Hypothesis Testing for Unnormalized Models.
Suya Wu, Enmao Diao, Khalil Elkhalil, Jie Ding, Vahid Tarokh.
Citation: IEEE Access.
Towards LLM Unlearning Resilient to Relearning Attacks: A Sharpness-Aware Minimization Perspective and Beyond.
Chongyu Fan, Jinghan Jia, Yihua Zhang, Anil Ramakrishna, Mingyi Hong, Sijia Liu.
Contribution: Method for LLM unlearning resilient to attacks.
Citation: Proceedings of the 42nd International Conference on Machine Learning (ICML). (2025).
Uncertainty-Aware Multidimensional Scaling.
David Hägele, Tim Krake, and Daniel Weiskopf.
Contribution: Uncertainty-aware multidimensional scaling technique.
Citation: IEEE Transactions on Visualization and Computer Graphics, 29(9), 3740-3754. (2023).
VATLD: A Visual Analytics System to Assess, Understand and Improve Traffic Light Detection.
Liang Gou, Lincan Zou, Nanxiang Li, Michael Hofmann, Arvind Kumar Shekar, Axel Wendt and Liu Ren.
Contribution: Visual analytics system for traffic light detection models.
Citation: IEEE Transactions on Visualization and Computer Graphics, 28(1), 328-338. (2021).
VisEval: A Benchmark for Data Visualization in the Era of Large Language Models.
Nan Chen, Yuge Zhang, Jiahang Xu, Kan Ren, and Yuqing Yang.
Contribution: Benchmark for data visualization tasks for LLMs.
Citation: IEEE Transactions on Visualization and Computer Graphics. (2025).
