Report Title: Towards Social AI Agents
Time: 14:30, Tuesday, January 6, 2026
Venue: Lecture Hall B501, Science and Education Building, Feicui Lake Campus
Speaker: Xu Cao
Affiliation: University of Illinois Urbana-Champaign, USA
Organizer: School of Computer Science and Information Engineering
Report Abstract:
Social interaction is the cornerstone of human collaboration. Despite significant advances in artificial intelligence (AI) for surface-level semantic analysis (e.g., understanding dialogue content), building social agents that can understand and participate in interactions seamlessly and predict user behavior accurately remains a grand challenge. The core difficulty lies in bridging the gap between high-level reasoning and low-level perception. Although current Multimodal Large Language Models (MLLMs) have general content understanding capabilities, they remain limited in interpreting fine-grained social cues when exercising high-level skills such as Theory of Mind (ToM) in real social scenarios. Human social cognition relies heavily on complex signals, including tone, gaze direction, body gestures, and micro-expressions, to convey intentions, negotiate turn-taking, and build trust. Without accurate detection of these cues, models struggle to capture the deep coordination underlying human behavior. This report reviews the evolution of social artificial intelligence and proposes an MLLM-based social agent framework that equips models with high-level social cognitive abilities, enabling them to parse the subtleties of human social collaboration as precisely as they process code or text.
Speaker Profile:
Xu Cao is a PhD candidate in Computer Science at the University of Illinois Urbana-Champaign. He is also a Student Researcher at Google and a recipient of the CIFAR PhD Fellowship from the Canadian Institute for Advanced Research. His research focuses on social artificial intelligence, augmented reality, vision-language models, multimodal large language models, and pediatric AI. He has published 37 papers in top AI conferences and journals, including CVPR, NeurIPS, AAAI, ICCV, IJCAI, COLM, IROS, UAI, KDD, Information Fusion, and Nature Scientific Data, with more than 1,800 citations on Google Scholar. While he was at Tencent, the automatic annotation system for high-precision maps that he led and helped develop won the AAAI 2022 AI Innovation Application Award. In academic service, he has organized and chaired workshops at several top AI conferences, including serving as Founder and General Chair of the ICLR 2025 Workshop on AI for Children and Pediatrics (AI4CHL), Founder and General Chair of the WACV 2024 Workshop on Large Language and Vision Models for Autonomous Driving (LLVM-AD), Roundtable Chair of ML4H 2024, and co-organizer of the 2025 ICCV/CVPR Workshop on Model Distillation for Autonomous Driving (WDFM-AD) and of the LLVM-AD workshops at ITSC 2024–2025 and WACV 2025.