Report Title: Learning to Perceive and Generate 3D World
Speaker: Dr. Yinghao Xu, Research Scientist
Affiliation: Research Department, Ant Group
Time: 10:00, Wednesday, June 25, 2025
Venue: Meeting Room A1104, Science and Education Building, Feicui Lake Campus
Report Abstract:
Perceiving and generating the 3D world from visual inputs is the foundation of human understanding of, and interaction with, the physical environment. Although computer vision has made remarkable progress in 2D scene understanding, capturing the complete spatial and dynamic characteristics of the 3D world remains difficult. In this lecture, I will introduce a human-like 3D perception system that can learn to understand 3D structures from multi-view images, usually without extensive supervision. Such a system not only achieves general 3D reconstruction and perception, but also lays the foundation for generating and manipulating 3D scenes. Furthermore, I will demonstrate how to combine 3D modeling with generative models to achieve structured control of virtual scenes and intelligent agents, thereby advancing artificial intelligence in spatial reasoning, interaction, and environment creation towards human-level 3D intelligence.
Speaker Profile:
Yinghao Xu is currently a Research Scientist in the Research Department of Ant Group and will join the Department of Computer Science and Engineering at the Hong Kong University of Science and Technology (HKUST) as an Assistant Professor in the spring of 2026. Previously, he was a Postdoctoral Researcher at the Computational Imaging Lab of Stanford University, supervised by Professor Gordon Wetzstein. He received his Ph.D. from the Chinese University of Hong Kong, supervised by Professor Bolei Zhou and Professor Dahua Lin, and his Bachelor's degree from the Department of Information Engineering, Zhejiang University. During his undergraduate studies, he was a visiting student at the University of California, San Diego, supervised by Professor Hao Su. His research focuses on the intersection of 3D computer vision, computer graphics, and generative artificial intelligence. He has published many papers at top conferences such as CVPR, ICCV, ECCV, SIGGRAPH, SIGGRAPH Asia, ICLR, NeurIPS, and ICML, and his work has been selected for Oral or Spotlight presentation many times; one of his papers was nominated as a CVPR 2022 Best Paper Candidate. He was named a WAIC Rising Star in 2024 and a Snap Fellowship Nominee in 2022.