Publications

Here you can find my Google Scholar page. I don’t like Google Scholar, and I think it has genuinely harmed the ML research community, but for some bureaucratic reason I have to keep it open for now. I discourage everyone from using it. If you can, please use the following list instead of the above link.

Feature Learning Theory

Feature Learning in Linear-Width Two-Layer Networks: Two vs. One Step of Gradient Descent
Under Submission, 2026. [PDF]

On the Concurrence of Layer-wise Preconditioning Methods and Provable Feature Learning
International Conference on Machine Learning (ICML), 2025. [PDF]

A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks
International Conference on Machine Learning (ICML), 2024. [PDF]

Deep Learning Phenomenology

On the Mechanisms of Weak-to-Strong Generalization: A Theoretical Perspective
Advances in Neural Information Processing Systems (NeurIPS), 2025. [PDF]

Asymptotics of Linear Regression with Linearly Dependent Data
Learning for Dynamics & Control Conference (L4DC), 2025. [PDF]

Demystifying Disagreement-on-the-Line in High Dimensions
International Conference on Machine Learning (ICML), 2023. [PDF]

Information Theory

Information-Theoretic Analysis of Minimax Excess Risk
IEEE Transactions on Information Theory, 2023. [PDF]

Rate-Distortion Analysis of Minimum Excess Risk in Bayesian Learning
International Conference on Machine Learning (ICML), 2021. [PDF] (Oral Presentation)

Empirical Studies

Evaluating the Performance of Large Language Models via Debates
Conference of the North American Chapter of ACL (NAACL), 2025. [PDF]

PSD Convolution: A Novel Structural Regularization Technique for Deep Convolutional Networks
Technical Report, 2020. [arXiv]