Speaker: Assistant Professor Yinpeng Dong
Affiliation: School of Artificial Intelligence, Tsinghua University
Organizer: School of Computer Science and Information Engineering
Time: 9:00, Thursday, October 16, 2025
Venue: Lecture Hall B501, Science and Education Building, Feicui Lake Campus
Report Abstract:
As the capabilities of large language models continue to grow, the problem of safety alignment in complex reasoning and decision-making scenarios has become increasingly prominent. How to achieve deep reasoning-level safety and value alignment without degrading model performance is a key challenge in the current development of artificial intelligence. This talk will focus on reasoning-enhanced safety alignment for large models and explore new approaches to improving models' self-reflection and adherence to safety constraints at the reasoning level. It will present recent research progress on strengthening safety reasoning in models, balancing safety and efficiency, and safety alignment in multimodal scenarios, and will use practical application cases to demonstrate the potential of reasoning enhancement for improving model trustworthiness and robustness. Through these explorations, we hope to promote a shift from outcome-oriented safety constraints toward a comprehensive safety alignment paradigm centered on the reasoning process.
Speaker Profile:
Yinpeng Dong is an Assistant Professor at the School of Artificial Intelligence, Tsinghua University. He has published more than 60 papers in journals and conferences such as TPAMI, IJCV, CVPR, and NeurIPS, with more than 12,000 citations on Google Scholar, and serves as an Area Chair for ICLR, ICML, and NeurIPS. He has received the CCF Outstanding Doctoral Dissertation Award, the Tsinghua University Outstanding Postdoctoral Fellowship, the Microsoft Research Fellowship, and the Baidu Fellowship, among other honors, and has been named to the World's Top 2% Scientists list for four consecutive years.