Pengxiang Ding 丁鹏翔

Hi! I am Pengxiang Ding (丁鹏翔 in Chinese). I am a second-year Ph.D. student at Zhejiang University, advised by Prof. Donglin Wang. I am also enrolled in a joint program with Westlake University as a member of the Machine Intelligence Laboratory (MiLAB). Before my Ph.D., I received my M.Sc. degree from the School of Artificial Intelligence, Beijing University of Posts and Telecommunications, in 2022.

Research Interests

My research currently centers on multi-modal large models (mainly vision-language models), including:

  • Multi-modal large models: multimodal large language models (MLLM), vision-language pre-trained models (VLM)
  • Embodied AI: foundation models for robotics
  • AIGC: human motion analysis

I am always open to collaborations in these areas, some of which have led to top-tier publications. If you are interested in working together, feel free to drop me an email.

News

  • [December 10, 2024] One paper, "Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference", was accepted to AAAI 2025.

  • [July 31, 2024] One paper, "DHRNet: A Dual-path Hierarchical Relation Network for multi-person pose estimation", was accepted to KBS 2024.

  • [July 16, 2024] One paper, "ProFD: Prompt-Guided Feature Disentangling for Occluded Person Re-Identification", was accepted to ACM MM 2024.

  • [July 2, 2024] Two papers, "QUAR-VLA: Vision-Language-Action Model for Quadruped Robots" and "PiTe: Pixel-Temporal Alignment for Large Video-Language Model", were accepted to ECCV 2024.

  • [June 30, 2024] One paper, "GeRM: A Generalist Robotic Model with Mixture-of-experts for Quadruped Robot", was accepted to IROS 2024.

  • [May 4, 2024] One paper, "RL2AC: Reinforcement Learning-based Rapid Online Adaptive Control for Legged Robot Robust Locomotion", was accepted to RSS 2024.

  • [March 21, 2024] A new paper on Cobra, an efficient multi-modal large language model, was released. The project page and a demo are now available, and the paper was featured by Hugging Face Daily Papers!

  • [December 9, 2023] One paper on whole-body human motion prediction was accepted to AAAI 2024.

Hiring: We are looking for postdocs, research assistants, and visiting students for MiLAB at Westlake University. More information about the requirements can be found here; if you are still in school, you are also welcome to join as a visiting student. Please email your CV to mi_lab[AT]westlake.edu.cn if you are interested. In particular, if you are interested in my research direction and would like to collaborate with me after joining, please mention this in the email and send a copy to me as well.

Publications

Google Scholar (†: equal contribution)

Peer-reviewed Conference

Han Zhao, Min Zhang, Wei Zhao, Pengxiang Ding, Siteng Huang, Donglin Wang, "Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference". AAAI2025. [paper] [project page] [Chinese intro] [github] [demo] [Twitter@AK]

Yonghao Dang, Jianqin Yin, Liyuan Liu, Pengxiang Ding, Yuan Sun, Yanzhu Hu, "DHRNet: A Dual-path Hierarchical Relation Network for multi-person pose estimation". Knowledge-Based Systems 2024 (KBS2024). [paper] [code]

Can Cui, Siteng Huang, Wenxuan Song, Pengxiang Ding, Min Zhang, Donglin Wang, "ProFD: Prompt-Guided Feature Disentangling for Occluded Person Re-Identification". ACM Multimedia 2024 (ACMMM24). [paper] [code]

Pengxiang Ding, Han Zhao, Wenxuan Song, Wenjie Zhang, Min Zhang, Siteng Huang, Ningxi Yang, Donglin Wang, "QUAR-VLA: Vision-Language-Action Model for Quadruped Robots". The 18th European Conference on Computer Vision (ECCV2024). [paper] [Project]

Yang Liu†, Pengxiang Ding†, Siteng Huang, Min Zhang, Han Zhao, Donglin Wang, "PiTe: Pixel-Temporal Alignment for Large Video-Language Model". The 18th European Conference on Computer Vision (ECCV2024). [paper] [code]

Wenxuan Song, Han Zhao, Pengxiang Ding, Can Cui, Shangke Lyu, Yaning Fan, Donglin Wang, "GeRM: A Generalist Robotic Model with Mixture-of-experts for Quadruped Robot". IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS2024). [paper]

Shangke Lyu, Xin Lang, Han Zhao, Hongyin Zhang, Pengxiang Ding, Donglin Wang, "RL2AC: Reinforcement Learning-based Rapid Online Adaptive Control for Legged Robot Robust Locomotion". Robotics: Science and Systems 2024 (RSS24).

Pengxiang Ding, Qiongjie Cui, Min Zhang, Mengyuan Liu, Haofan Wang, Donglin Wang, "Expressive Forecasting of 3D Whole-body Human Motions". In Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI2024). [paper] [code]

Chao Qi, Jianqin Yin, Jinghang Xu, Pengxiang Ding, "Instance-incremental Scene Graph Generation from Real-world Point Clouds via Normalizing Flows". In IEEE Transactions on Circuits and Systems for Video Technology. [paper] [code]

Pengxiang Ding, Jianqin Yin, "Towards more realistic human motion prediction with attention to motion coordination". In IEEE Transactions on Circuits and Systems for Video Technology. [paper] [code]

Xiaoli Liu, Jianqin Yin, Jin Liu, Pengxiang Ding, Jun Liu, Huaping Liu, "TrajectoryCNN: A new spatio-temporal feature learning network for human motion prediction". In IEEE Transactions on Circuits and Systems for Video Technology. [paper] [code]

Preprints & Under Submission

Pengxiang Ding, Jianqin Yin, "Uncertainty-aware Human Motion Prediction". arXiv preprint arXiv:2107.03575. [paper]

Experience

  • Research Intern - Xiaohongshu (RedBook), Intelligent Creation Group (小红书/智能创作组)
  • Research Intern - SenseTime, Smart City Group (商汤/智慧城市事业群)

Services

Journal Reviewer

  • IEEE Transactions on Neural Networks and Learning Systems (TNNLS)
  • IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)

Conference Reviewer

  • International Conference on Machine Learning (ICML)
  • International Conference on Learning Representations (ICLR)
  • Annual Conference on Neural Information Processing Systems (NeurIPS)
  • International Conference on Robotics and Automation (ICRA)
  • International Conference on Intelligent Robots and Systems (IROS)
  • IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • IEEE International Conference on Multimedia and Expo (ICME)

Misc

Feel free to follow me on Zhihu.