【Academic Leadership】Congratulations! Undergraduate Zhang Ran's Research Paper Presented at ICME

Release time: 2025-04-16

When faced with unknown objects, how can robots quickly identify operable key parts and achieve precise manipulation? The GASEM framework, developed by undergraduate student Zhang Ran under the guidance of Associate Professor Liu Liu, equips robots with "intelligent perception eyes." By capturing the motion patterns of objects, it enables cross-category operable part segmentation and pose estimation.

The paper "GASEM," authored by Zhang Ran (co-first author), a 2023 undergraduate in the Computer Science and Technology program, has been accepted at the 2025 IEEE International Conference on Multimedia & Expo (ICME, CCF-B ranked). The paper proposes a motion-aware embodied intelligence framework (GASEM) that integrates multi-view scene alignment with dynamic motion modeling to achieve cross-category operable part segmentation and pose estimation. The research received high praise from reviewers, with no negative feedback in any of the evaluations, and stood out among numerous submissions, demonstrating our school's strength in cultivating talent in the field of artificial intelligence.

Paper Title: GASEM: Boosting Generalized and Actionable Parts Segmentation and Pose Estimation via Object Motion Perception

Authors: Ran Zhang, Liu Liu, Wenbo Xu, Li Zhang, Yiming Tang, Qi Wu, Hao Wu

Fig. 1: The overall pipeline of GASEM

Fig. 2: Qualitative results on GAPart segmentation (left) and pose estimation (right) of GAPartNet dataset

Abstract: Category-level object understanding has progressed, but generalized part perception remains underexplored. This paper introduces GASEM, a framework for Generalizable and Actionable Parts (GAPart) Segmentation and pose Estimation via object Motion perception. GASEM utilizes point-wise motion data from observed point clouds and cross-perspective alignment to learn object motion using a scene flow model. It features a segmentation proposal architecture for GAPart segmentation and a Normalized Part Coordinate Space (NPCS) branch for pose estimation. Additionally, a reinforcement learning agent is trained for robust GAPart manipulation in both simulation and real-world environments. Experiments on the GAPartNet dataset show that GASEM outperforms state-of-the-art methods. This work promises advancements in embodied intelligence applications such as robot-object interaction and generalizable manipulation.
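For readers curious how per-point normalized coordinates can yield a part pose, the sketch below may help. The abstract does not say how GASEM converts its NPCS predictions into a pose, so this is only an illustration of one common approach: fitting a similarity transform (scale, rotation, translation) between predicted normalized coordinates and the observed points, using the closed-form Umeyama solution. The function name `umeyama_alignment` and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np

def umeyama_alignment(src, dst):
    """Least-squares similarity transform so that dst ~ s * R @ src + t.

    src, dst: (N, 3) arrays of corresponding points.
    Illustrative only; not the paper's implementation.
    """
    n = src.shape[0]
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst

    # Cross-covariance between the centered point sets.
    cov = dst_c.T @ src_c / n
    U, D, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so R is a proper rotation (det = +1).
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0

    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / n
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t

# Synthetic check: normalized coordinates mapped by a known pose.
rng = np.random.default_rng(0)
npcs_points = rng.uniform(-0.5, 0.5, size=(200, 3))  # stand-in for predicted coordinates
angle = 0.7
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
s_true, t_true = 0.3, np.array([0.1, -0.2, 0.9])
observed = s_true * npcs_points @ R_true.T + t_true

s_est, R_est, t_est = umeyama_alignment(npcs_points, observed)
```

With real predictions the correspondences are noisy, so a fit like this is typically wrapped in an outlier-rejection loop such as RANSAC.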

ICME is one of the premier annual international conferences in multimedia research, with significant academic influence. Recognized as a CCF-B ranked conference in multimedia computing, it maintains rigorous standards, accepting only 27.3% of 3,737 submissions this year.

