Pengxiang Ding 丁鹏翔

Hi! I am Pengxiang Ding (丁鹏翔 in Chinese). I am a second-year Ph.D. student at Zhejiang University, advised by Prof. Donglin Wang. I am also part of a joint program with Westlake University, where I am a member of the Machine Intelligence Laboratory (MiLAB). Before starting my Ph.D., I received my M.Sc. degree from the School of Artificial Intelligence, Beijing University of Posts and Telecommunications, in 2022.

Research Interests

Currently, my research centers on embodied AI, including:

Vision-Language-Action models: Foundation models for robots.
Efficient Learning: Practical acceleration paradigms for foundation models.
Data-Centric Optimization: Improving the utilization of limited robot data.

I am always open to collaborations in these areas, some of which have already led to publications at top venues. If you would like to get in touch, feel free to drop me an email.

News

[May 02, 2025] Three papers, BC-IB, OTPR, and ReinboT, were accepted to ICML 2025.
[Mar 28, 2025] Unicorn is available on arXiv.
[Mar 27, 2025] Exploring the Evolution of Physics Cognition in Video Generation: A Survey is available on arXiv.
[Mar 04, 2025] PD-VLA is available on arXiv.
[Feb 27, 2025] BC-IB is available on arXiv.
[Feb 21, 2025] OTPR is available on arXiv.
[Feb 21, 2025] Humanoid-VLA is available on arXiv.
[Jan 28, 2025] Two papers, MoRE and QUART-Online, were accepted to ICRA 2025.
[Jan 23, 2025] Two papers, VLAS and GEVRM, were accepted to ICLR 2025.
[Dec 12, 2024] SDP is available on arXiv.
[Dec 10, 2024] CARP is available on arXiv.
[Dec 10, 2024] One paper, Cobra, was accepted to AAAI 2025.
[Jul 31, 2024] One paper, DHRNet, was accepted to KBS 2024.
[Jul 16, 2024] One paper, ProFD, was accepted to ACMMM 2024.
[Jul 02, 2024] Two papers, QUAR-VLA and PiTe (Oral), were accepted to ECCV 2024.
[Jun 30, 2024] One paper, GeRM, was accepted to IROS 2024.
[May 04, 2024] One paper, RL2AC, was accepted to RSS 2024.
[Dec 09, 2023] One paper, EAI, was accepted to AAAI 2024.

Hiring: We are looking for postdoctoral researchers, research assistants, and visiting students to join MiLAB at Westlake University. More information about the requirements can be found here; if you are still in school, you are also welcome to apply as a visiting student. If you are interested, please email your CV to mi_lab[AT]westlake.edu.cn. In particular, if you are interested in my research direction and would like to collaborate with me after joining, please say so in the email and send a copy to me as well.

Publications

†: Equal contribution

Peer-reviewed Conference

Shuanghao Bai, Wanqi Zhou, Pengxiang Ding, Wei Zhao, Donglin Wang, Badong Chen, "Rethinking Latent Representations in Behavior Cloning: An Information Bottleneck Approach for Robot Manipulation". [paper]

Hongyin Zhang, Zifeng Zhuang, Han Zhao, Pengxiang Ding, Hongchao Lu, Donglin Wang, "ReinboT: Amplifying Robot Visual-Language Manipulation with Reinforcement Learning". ICML 2025.

Mingyang Sun, Pengxiang Ding, Weinan Zhang, Donglin Wang, "Score-Based Diffusion Policy Compatible with Reinforcement Learning via Optimal Transport". [paper]

Han Zhao, Wenxuan Song, Donglin Wang, Xinyang Tong, Pengxiang Ding, Xuelian Cheng, Zongyuan Ge, "MoRE: Unlocking Scalability in Reinforcement Learning for Quadruped Vision-Language-Action Models". ICRA 2025. [paper]

Xinyang Tong†, Pengxiang Ding†, Donglin Wang, Wenjie Zhang, Can Cui, Mingyang Sun, Yiguo Fan, Han Zhao, Hongyin Zhang, Yonghao Dang, Siteng Huang, Shangke Lyu, "QUART-Online: Latency-Free Large Multimodal Language Model for Quadruped Robot Learning". ICRA 2025. [paper] [Project]

Hongyin Zhang, Pengxiang Ding, Shangke Lyu, Ying Peng, Donglin Wang, "GEVRM: Goal-Expressive Video Generation Model For Robust Visual Manipulation". The Thirteenth International Conference on Learning Representations (ICLR2025). [paper]

Wei Zhao, Pengxiang Ding, Min Zhang, Zhefei Gong, Shuanghao Bai, Han Zhao, Donglin Wang, "VLAS: Vision-Language-Action Model with Speech Instructions for Customized Robot Manipulation". The Thirteenth International Conference on Learning Representations (ICLR2025). [paper]

Han Zhao, Min Zhang, Wei Zhao, Pengxiang Ding, Siteng Huang, Donglin Wang, "Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference". AAAI2025. [paper] [project page] [Chinese intro] [github] [demo] [Twitter@AK]

Yonghao Dang, Jianqin Yin, Liyuan Liu, Pengxiang Ding, Yuan Sun, Yanzhu Hu, "DHRNet: A Dual-path Hierarchical Relation Network for multi-person pose estimation". Knowledge-Based Systems 2024 (KBS2024). [paper][code]

Can Cui, Siteng Huang, Wenxuan Song, Pengxiang Ding, Min Zhang, Donglin Wang, "ProFD: Prompt-Guided Feature Disentangling for Occluded Person Re-Identification". ACM Multimedia 2024 (ACMMM24). [paper][code]

Pengxiang Ding, Han Zhao, Wenxuan Song, Wenjie Zhang, Min Zhang, Siteng Huang, Ningxi Yang, Donglin Wang, "QUAR-VLA: Vision-Language-Action Model for Quadruped Robots". The 18th European Conference on Computer Vision (ECCV2024). [paper] [Project]

Yang Liu†, Pengxiang Ding†, Siteng Huang, Min Zhang, Han Zhao, Donglin Wang, "PiTe: Pixel-Temporal Alignment for Large Video-Language Model". The 18th European Conference on Computer Vision (ECCV2024). [paper][code]

Wenxuan Song, Han Zhao, Pengxiang Ding, Can Cui, Shangke Lyu, Yaning Fan, Donglin Wang, "GeRM: A Generalist Robotic Model with Mixture-of-experts for Quadruped Robot". IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS2024). [paper]

Shangke Lyu, Xin Lang, Han Zhao, Hongyin Zhang, Pengxiang Ding, Donglin Wang, "RL2AC: Reinforcement Learning-based Rapid Online Adaptive Control for Legged Robot Robust Locomotion". Robotics: Science and Systems 2024 (RSS24).

Pengxiang Ding, Qiongjie Cui, Min Zhang, Mengyuan Liu, Haofan Wang, Donglin Wang, "Expressive Forecasting of 3D Whole-body Human Motions". In Proceedings of the 38th AAAI Conference on Artificial Intelligence. [paper] [code]

Chao Qi, Jianqin Yin, Jinghang Xu, Pengxiang Ding, "Instance-incremental Scene Graph Generation from Real-world Point Clouds via Normalizing Flows". In IEEE Transactions on Circuits and Systems for Video Technology. [paper][code]

Pengxiang Ding, Jianqin Yin, "Towards more realistic human motion prediction with attention to motion coordination". In IEEE Transactions on Circuits and Systems for Video Technology. [paper][code]

Xiaoli Liu, Jianqin Yin, Jin Liu, Pengxiang Ding, Jun Liu, Huaping Liu, "TrajectoryCNN: a new spatio-temporal feature learning network for human motion prediction". In IEEE Transactions on Circuits and Systems for Video Technology. [paper][code]

Experience

Research Intern - DAMO Academy, Machine Intelligence Laboratory (达摩院/机器智能实验室)
- Advisor: Xin Li
- Time: Jan 2025 - Mar 2025.
Research Intern - RedBook, Intelligent Creation Group (小红书/智能创作组)
- Advisor: Haofan Wang
- Time: Sep 2022 - Mar 2023.
Research Intern - SenseTime, Smart City Group (商汤/智慧城市事业群)
- Advisor: Dongliang Wang
- Time: Sep 2021 - Mar 2022.

Services

Journal/Conference Reviewer

ICML, ICLR, NeurIPS
CVPR, ICCV, ACMMM, ICME
ICRA, IROS
TNNLS, TASE, TCSVT

Talk

End-to-End Quadruped Robot Large Model (端到端四足机器人大模型), Shenlan Xueyuan (深蓝学院)

Misc

Feel free to follow me on RedBook.