Xiangyu Wu (武祥宇)

Ph.D. student
Knowledge Mining Group
Nanjing University of Science and Technology
836178735@qq.com
wxy_yyjhl@njust.edu.cn

Google Scholar CV

About me

I am currently a third-year Ph.D. student in the Knowledge Mining Group at Nanjing University of Science and Technology.
I am supervised by Prof. Jianfeng Lu and Prof. Yang Yang.
My current research interests include multimodal large language models (MLLMs), text generation, test-time adaptation, and prompt learning.

News

[08/2025] Four papers submitted to ICLR 2026.
[07/2025] Two papers accepted by ACM MM 2025.
[01/2025] One paper accepted by ICLR 2025.
[08/2024] One paper accepted by ACML 2024.
[04/2024] One paper accepted by IJCAI 2024.
[03/2024] One paper accepted by ICME 2024.
[06/2022] One paper accepted by ICIP 2022.

Research Intern

Alibaba International Digital Commerce Group (2023.08 to present)
Working on visual and multimodal algorithms.

Publications

Text as Any-Modality for Zero-Shot Classification by Consistent Prompt Tuning [Github]
Xiangyu Wu, Feng Yu, Yang Yang*, Jianfeng Lu*
Proceedings of the 33rd ACM International Conference on Multimedia (ACM MM'2025)

Noise Self-Correction via Relation Propagation for Robust Cross-Modal Retrieval [Github]
Ruoxuan Li, Xiangyu Wu, Yang Yang*
Proceedings of the 33rd ACM International Conference on Multimedia (ACM MM'2025)

Multi-Label Test-Time Adaptation with Bound Entropy Minimization [Github]
Xiangyu Wu, Feng Yu, Qing-Guo Chen, Yang Yang*, Jianfeng Lu*
Proceedings of the 13th International Conference on Learning Representations (ICLR'2025)

Refining Visual Perception for Decoration Display: A Self-Enhanced Deep Captioning Model [Github]
Longfei Huang, Xiangyu Wu, Jingyuan Wang, Weili Guo, Yang Yang*
Proceedings of the 16th Asian Conference on Machine Learning (ACML'2024)

TAI++: Text as Image for Multi-Label Image Classification by Co-Learning Transferable Prompt [Github]
Xiangyu Wu, Qingyuan Jiang, Yifeng Wu, Qingguo Chen, Yang Yang*, Jianfeng Lu*
Proceedings of the 33rd International Joint Conference on Artificial Intelligence (IJCAI'2024)

CoVLR: Coordinating Cross-Modal Consistency and Intra-Modal Relations for Vision-Language Retrieval [Github]
Fengqiang Wan, Xiangyu Wu, Zhihao Guan, Yang Yang*
IEEE International Conference on Multimedia and Expo (ICME'2024)

Ques-to-Visual Guided Visual Question Answering
Xiangyu Wu, Jianfeng Lu, Zhuanfeng Li, Fengchao Xiong
IEEE International Conference on Image Processing (ICIP'2022)

Awards & Honors

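Excellence Award WWW 2025 Multimodal Dialogue System Intent Recognition Challenge 2025.01
First Place 2024 3rd Guangdong-Hong Kong-Macao Greater Bay Area International Algorithm Case Competition: Comprehensive Enhancement of Multimodal Large Model Subject Capabilities 2024.10
Second Prize 2024 Global AI Technology Innovation Competition, Track 1: Dual-Light Object Detection from a UAV Perspective 2024.06
First Place CVPR 2024 New Frontiers for Zero-shot Image Captioning Evaluation 2024.03
Second Place 2023 2nd Guangdong-Hong Kong-Macao Greater Bay Area International Algorithm Case Competition: Language-Enhanced Novel Image Category Discovery 2024.01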
First Place ICCV 2023 The First Scientific Figure Captioning (SCICAP) Challenge 2023.10
First Place ICCV 2023 Multi-modal Algorithmic Reasoning (SMART-101) Challenge 2023.10
First Place 2023 Global AI Technology Innovation Competition, Track 1: Radiology NLP - Medical Imaging Diagnostic Report Generation 2023.06
First Place CVPR 2023 Foundation Model Challenge: Cross-Modal Image Retrieval 2023.05
First Place CVPR 2023 New Frontiers for Zero-shot Image Captioning Evaluation 2023.05
Second Place WSDM 2023 Cup: Visual Question Answering Challenge 2023.01

Academic Services

Journal Reviewer for Pattern Recognition (PR)
Conference Reviewer for AAAI, ICLR, NeurIPS