Xiangming Gu

I am a third-year Ph.D. candidate in the NUS Sound and Music Computing Lab, where I am supervised by Prof. Ye Wang. I am affiliated with the Integrative Sciences and Engineering Programme and the School of Computing at the National University of Singapore. Before that, I obtained my B.E. degree in Electronic Engineering and B.S. degree in Finance from Tsinghua University in 2021.

I am also a research intern at Sea AI Lab (SAIL), where I am mentored by Dr. Tianyu Pang and Dr. Chao Du and work closely with Dr. Qian Liu and Dr. Min Lin.

My research interests span two directions: (i) fundamental research on generative models [arXiv'2023], (multimodal) large language models, and AI agents [ICML'2024]; (ii) applications of machine learning, e.g., multimodal learning [MM'2022 Oral, TOMM'2024] and multi-distribution learning (domain adaptation [ISMIR'2022, TOMM'2024], fairness [MM'2023]), to singing and speech techniques.

Email  /  CV  /  Google Scholar  /  OpenReview  /  LinkedIn  /  Twitter  /  GitHub


  • [2024.05]: One paper got accepted to the International Conference on Machine Learning (ICML'2024)!
  • [2024.02]: One paper got accepted to ACM Transactions on Multimedia Computing Communications and Applications (TOMM'2024)!
  • [2024.02]: We released Agent Smith, which was covered by WIRED magazine as "Here Come the AI Worms"!
  • [2023.09]: One paper got accepted to IEEE Transactions on Audio, Speech and Language Processing (TASLP'2023)!
  • [2023.07]: One paper got accepted to the ACM International Conference on Multimedia (MM'2023)!
  • [2023.01]: I received the Research Achievement Award from the School of Computing, NUS!
  • [2022.12]: One paper got accepted to Transactions on Machine Learning Research (TMLR'2022)!
  • [2022.12]: I passed my Ph.D. Qualifying Examination (PQE) and became a Ph.D. candidate!
  • [2022.09]: One paper got accepted to Advances in Neural Information Processing Systems (NeurIPS'2022)!
  • [2022.07]: One paper got accepted to the International Society for Music Information Retrieval Conference (ISMIR'2022)!
  • [2022.06]: One paper got accepted to the ACM International Conference on Multimedia (MM'2022) as an oral presentation and also won the Top Paper Award!
  • [2022.05]: One paper got accepted to IEEE Transactions on Image Processing (TIP'2022)!

* denotes equal contribution, † denotes correspondence.
On Memorization in Diffusion Models
Xiangming Gu, Chao Du†, Tianyu Pang†, Chongxuan Li, Min Lin, Ye Wang
Preprints, 2023.
pdf / code

Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
Xiangming Gu*, Xiaosen Zheng*, Tianyu Pang*†, Chao Du, Qian Liu, Ye Wang†, Jing Jiang†, Min Lin
International Conference on Machine Learning (ICML'2024), Vienna, Austria.
Also in the International Conference on Learning Representations Workshop on Large Language Model Agents (LLMAgents @ ICLR'2024).
ICML / ICLR workshop / project page / code / video / slides / poster / press

Automatic Lyric Transcription and Automatic Music Transcription from Multimodal Singing
Xiangming Gu, Longshen Ou, Wei Zeng, Jianan Zhang, Nicholas Wong, Ye Wang
ACM Transactions on Multimedia Computing Communications and Applications (TOMM'2024).
pdf / code / data

Disentangled Adversarial Domain Adaptation for Phonation Mode Detection in Singing and Speech
Yixin Wang, Wei Wei, Xiangming Gu, Xiaohong Guan, Ye Wang
IEEE Transactions on Audio, Speech and Language Processing (TASLP'2023).
IEEE document / code

Elucidate Gender Fairness in Singing Voice Transcription
Xiangming Gu, Wei Zeng, Ye Wang
ACM International Conference on Multimedia (MM'2023), Ottawa, Canada.
pdf / code / video / poster

Unsupervised Mismatch Localization in Cross-Modal Sequential Data with Application to Mispronunciations Localization
Wei Wei*, Hengguan Huang*, Xiangming Gu, Hao Wang, Ye Wang
Transactions on Machine Learning Research (TMLR'2022).
pdf / code

Extrapolative Continuous-time Bayesian Neural Network for Fast Training-free Test-time Adaptation
Hengguan Huang†, Xiangming Gu, Hao Wang, Chang Xiao, Hongfu Liu, Ye Wang
Advances in Neural Information Processing Systems (NeurIPS'2022), New Orleans, USA.
pdf / code / video

Transfer Learning of wav2vec 2.0 for Automatic Lyric Transcription
Longshen Ou*, Xiangming Gu*, Ye Wang
International Society for Music Information Retrieval Conference (ISMIR'2022), Bengaluru, India.
pdf / code

MM-ALT: A Multimodal Automatic Lyric Transcription System
Xiangming Gu*, Longshen Ou*, Danielle Ong, Ye Wang
ACM International Conference on Multimedia (MM'2022), Lisbon, Portugal. (Oral, Top Paper Award)
pdf / appendix / project page / code / data / video / press

Boosting Monocular 3D Human Pose Estimation with Part Aware Attention
Youze Xue, Jiansheng Chen†, Xiangming Gu, Huimin Ma, Hongbing Ma
IEEE Transactions on Image Processing (TIP'2022).
IEEE document / code

Laser Endoscopic Manipulator Using Spring-Reinforced Multi-DoF Soft Actuator
Boyu Zhang, Penghui Yang, Xiangming Gu, Hongen Liao
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS'2021), Virtual.
Also in IEEE Robotics and Automation Letters (RA-L'2021).
IEEE document

Awards and Scholarships
Research Incentive Award, National University of Singapore, 2023
Research Achievement Award, National University of Singapore, 2022
MM'22 Top Paper Award, Association for Computing Machinery, 2022
MM'22 Student Travel Grant, Association for Computing Machinery, 2022
President's Graduate Fellowship, National University of Singapore, 2021
Visiting Undergraduate Student Scholarship, Tsinghua University, 2020
Tsinghua's Friend - Zheng Geru Scholarship, Tsinghua University, 2018

Academic Services
Conference reviewer for EMNLP 2024, NeurIPS 2024, MM 2024, ECCV 2024, IJCAI 2024, ICCV 2023, AISTATS 2021
Journal reviewer for TASLP, RA-L

Teaching Assistant, CS4347/CS5647, Sound and Music Computing, Fall 2024
Teaching Assistant, CS6212, Topics in Media, Spring 2024
Teaching Assistant, CS5242, Neural Networks and Deep Learning, Spring 2023
Teaching Assistant, CS3244, Machine Learning, Fall 2022
Teaching Assistant, CS4243, Computer Vision and Pattern Recognition, Spring 2022

You've probably seen this website template before, thanks to Jon Barron.
Last Updated July 2024.