Xiangming Gu

I am a student researcher at Google Deepmind and a final-year Ph.D. candidate at the National University of Singapore. I obtained my bachelor's degrees from Tsinghua University in 2021, and was a research intern at Sea AI Lab.

My recent research focuses on understanding, advancing, and safely deploying generative models and agents. Looking ahead, I am interested in two questions: (i) what is the next-generation thinking paradigm that enables LLMs to solve challenging problems, e.g., scientific discovery? (ii) what is the next-generation attention paradigm in LLMs?

I am looking for full-time positions as a research scientist or member of technical staff. Please contact me if you are interested in my research.

Email / Google Scholar / Openreview / Linkedin / Twitter / Github

News

[2026.01]: I released gemma_penzai, a JAX package to look into LLM internals and debug LLMs with multi-modal support. This work was one of my projects at Google Deepmind. See the tutorials.
[2026.01]: I was invited to give a talk about Demystifying Attention Sink in LLMs and its Applications to Architecture Design by AER Labs at Network School.
[2025.11]: I was invited to give a talk about Attention Sink in LLMs and its Applications by Department of Electronic Engineering, Tsinghua University and Tencent Hunyuan.
[2025.10]: I was glad to give a final presentation to wrap up my time as a student researcher at Google Deepmind: Looking into LLMs: From Tokens to Solutions.
[2025.10]: We released a technical report titled Extracting Alignment Data in Open Models!
[2025.09]: One paper got accepted to Advances in Neural Information Processing Systems (NeurIPS'2025)!
[2025.07]: One paper got accepted to Conference on Language Modeling (COLM'2025)!
[2025.06]: I was glad to give a talk about Understanding Attention Sink in (Large) Language Models to the Deep Learning: Agent Frontier team at Google Deepmind.
[2025.05]: I was invited to give a talk on When Attention Sink Emerges in Language Models: An Empirical View by ASAP Seminar Series!
[2025.05]: I joined Google Deepmind (GDM) as a student researcher in London, United Kingdom!
[2025.02]: I was invited to give a talk titled On the Interpretability and Safety of Generative Models at the NUS Research Week Open House!
[2025.01]: Two papers (one spotlight and one poster) got accepted to International Conference on Learning Representations (ICLR'2025), and one paper got accepted to Transactions on Machine Learning Research (TMLR'2025)!
[2025.01]: I gave a poster presentation about Agent Smith during Global Young Scientists Summit 2025!

Selected Research

* denotes equal contribution. Please see my Google Scholar for the full list.

LLMs Reasoning

Parallel and Sequential Test-Time-Scaling in Large Reasoning Models
Xiangming Gu and the Team
Google Deepmind Internal Technical Report, 2025.

LLMs Pre-training and Attention

When Attention Sink Emerges in Language Models: An Empirical View
Xiangming Gu, Tianyu Pang, Chao Du, Qian Liu, Fengzhuo Zhang, Cunxiao Du, Ye Wang, Min Lin
International Conference on Learning Representations (ICLR), Singapore, Singapore, 2025. (Spotlight)
Also in Annual Conference on Neural Information Processing Systems Workshop on Attributing Model Behavior at Scale (ATTRIB @ NeurIPS), Vancouver, Canada, 2024. (Oral)
pdf / code / video / long talk / slides / poster

Why Do LLMs Attend to the First Token?
Federico Barbero*, Álvaro Arroyo*, Xiangming Gu, Christos Perivolaropoulos, Michael Bronstein, Petar Veličković, Razvan Pascanu
Conference on Language Modeling (COLM), Montreal, Canada, 2025.
pdf / slides

Memorization, Generalization, and Safety

Extracting Alignment Data in Open Models
Federico Barbero, Xiangming Gu, Christopher A. Choquette-Choo, Chawin Sitawarin, Matthew Jagielski, Itay Yona, Petar Veličković, Ilia Shumailov, Jamie Hayes
Technical Report, 2025.
pdf

Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
Xiangming Gu*, Xiaosen Zheng*, Tianyu Pang*, Chao Du, Qian Liu, Ye Wang, Jing Jiang, Min Lin
International Conference on Machine Learning (ICML), Vienna, Austria, 2024.
Also in International Conference on Learning Representations Workshop on Large Language Model Agents (LLMAgents @ ICLR), Vienna, Austria, 2024.
pdf / project page / code / video / slides / ICML poster / GYSS poster / WIRED press

On Memorization in Diffusion Models
Xiangming Gu, Chao Du, Tianyu Pang, Chongxuan Li, Min Lin, Ye Wang
Transactions on Machine Learning Research (TMLR), 2025.
pdf / code

Open-Sourced Projects

gemma_penzai: A JAX Research Toolkit for Visualizing, Manipulating, and Understanding Gemma Models with Multi-modal Support. Tutorials on attention sink, logit-lens, Gemma Scope 1 and 2.

Experience and Education

	Google Deepmind Student Researcher 05.2025 - 10.2025 (London, United Kingdom), 11.2025 - 01.2026 (Singapore) Hosted by Petar Veličković and Larisa Markeeva. Also worked closely with Razvan Pascanu and Soham De. Research on reasoning and test-time-scaling of LLMs. Developing gemma_penzai to debug LLMs.
	Sea AI Lab (Sea Limited) Research Intern 03.2023 - 04.2025 (Singapore) Mentored by Tianyu Pang and Chao Du. Also worked closely with Qian Liu and Min Lin. Understanding, advancing, and safely deploying generative models and agents.
	National University of Singapore Ph.D. candidate in Computer Science 08.2021 - 02.2026 (Singapore) Supervised by Prof. Ye Wang. Research on speech, singing and multi-modality.
	Tsinghua University B.E. degree in Electronic Engineering and B.S. degree in Finance 08.2017 - 06.2021 (Beijing, China) Supervised by Prof. Jiansheng Chen. Research on computer vision.

Honors and Awards

Dean's Graduate Research Excellence Award, National University of Singapore, 2024
Research Achievement Award, National University of Singapore, 2025/2022
MM'22 Top Paper Award, Association for Computing Machinery, 2022
President's Graduate Fellowship, National University of Singapore, 2021-2025
Tsinghua's Friend - Zheng Geru Scholarship (Academic Excellence Scholarship), Tsinghua University, 2018

Talks and Sharings

[2026.01]: AER Labs and Network School, invited talk on Demystifying Attention Sink in LLMs and its Applications to Architecture Design.
[2025.11]: Department of Electronic Engineering, Tsinghua University and Tencent Hunyuan, invited talk on Attention Sink in LLMs and its Applications.
[2025.10]: Google Deepmind Team DL: Agent Frontier, talk on Looking into LLMs: From Tokens to Solutions.
[2025.06]: Google Deepmind Team DL: Agent Frontier, talk on Understanding Attention Sink in (Large) Language Models.
[2025.05]: ASAP Seminar Series, invited talk on When Attention Sink Emerges in Language Models: An Empirical View.
[2025.04]: Singapore Alignment Workshop, poster presentation on Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast.
[2025.02]: NUS Research Week Open House, invited talk on On the Interpretability and Safety of Generative Models.
[2025.01]: Global Young Scientists Summit, poster presentation on Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast.

Academic Services

Conference reviewer for NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, ACL ARR, MM, IJCAI, AISTATS
Journal reviewer for TPAMI, TOMM, TASLP, RA-L

Teaching Services

Teaching Assistant, CS4347/CS5647, Sound and Music Computing, Fall 2024
Teaching Assistant, CS6212, Topics in Media, Spring 2024
Teaching Assistant, CS5242, Neural Networks and Deep Learning, Spring 2023
Teaching Assistant, CS3244, Machine Learning, Fall 2022
Teaching Assistant, CS4243, Computer Vision and Pattern Recognition, Spring 2022

Miscellaneous

I love travelling, movies, food, etc. I have lived in 🇨🇳🇸🇬🇬🇧, and travelled to 🇹🇭🇫🇮🇵🇹🇧🇪🇺🇸🇭🇰🇲🇾🇨🇦🇦🇪🇦🇹🇯🇵🇭🇺🇨🇿🇮🇹🇻🇦🇭🇷🇫🇷🇨🇭🇩🇪🇳🇱🇰🇷 for holidays/conferences.

You've probably seen this website template before, thanks to Jon Barron.