Academic Report Notice (Reference Number: 2025-23) - School of Computer Science and Information Engineering (School of Artificial Intelligence)

Speaker: Specially Appointed Professor Xiang Wang

Affiliation: University of Science and Technology of China

Organizer: School of Computer Science and Information Engineering

Time: 14:00, Thursday, October 16, 2025

Venue: Lecture Hall B501, Science and Education Building, Feicui Lake Campus

Report Abstract:

As large models continue to grow in scale and capability, their potential safety risks and lack of controllability have become pressing challenges. Traditional safety alignment methods are often confined to a single stage, which makes comprehensive, fine-grained control of model behavior difficult. This report presents an attempt at systematic intervention in risky behaviors across the entire model life cycle, from training and alignment to deployment.

In the training stage, AlphaSteer is introduced to apply early safety correction at the level of the model's internal representations through a safety-prior activation guidance mechanism. In the alignment stage, AlphaAlign is designed to refine the model's values with a safety-reinforced incentive mechanism, so that its behavior patterns are deeply aligned with safety criteria. In the deployment stage, AlphaEdit is developed to perform real-time, atomic correction of specific risky behaviors in the deployed model through risk-triggered model editing.

Speaker Profile:

Xiang Wang is a Specially Appointed Professor and Ph.D. Supervisor at the University of Science and Technology of China, and a National Young Talent. His research interests include information recommendation and mining, large models, and trustworthy artificial intelligence. He has published more than 70 papers in top international conferences (such as SIGIR, WWW, NeurIPS, and ICLR) and top journals (such as IEEE TPAMI and ACM TOIS), with more than 30,000 citations on Google Scholar and an H-index of 60, and has been named an Elsevier Highly Cited Chinese Scholar.

More than 10 of his papers have been listed among the most influential papers or named Best Paper Candidates at international conferences. He won the ICLR Outstanding Paper Award in 2025, the Frontier Science Award of the International Congress of Basic Science in both 2023 and 2025, and the ACM SIGIR Young Scholar Award and the Wu Wenjun Artificial Intelligence Natural Science First Prize in 2024. Also in 2024, he was named to the MIT Technology Review TR35 list and selected as an AI100 Young Pioneer.
