Research
(* indicates equal contribution)
|
|
ModSkill: Physical Character Skill Modularization
Yiming Huang, Zhiyang Dou, Lingjie Liu
ICCV 2025
arxiv / website
We introduce ModSkill, a novel skill learning framework that decouples complex full-body skills into compositional, modular skills for independent body parts, using an inductive bias inspired by body structure to improve skill learning performance.
|
|
PhysHMR: Learning Humanoid Control Policies from Vision for Physically Plausible Human Motion Reconstruction
Qiao Feng, Yiming Huang, Yufu Wang, Jiatao Gu, Lingjie Liu
SIGGRAPH Asia 2025 (Conditionally Accepted)
PhysHMR learns a visual-to-action policy that directly predicts control signals from visual input for physically plausible motion reconstruction.
|
|
PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation
Chen Wang*, Chuhao Chen*, Yiming Huang, Zhiyang Dou, Yuan Liu, Jiatao Gu, Lingjie Liu
NeurIPS 2025
website
PhysCtrl achieves controllable and physics-grounded video generation from an initial force input.
|
|
Vid2Sim: Generalizable, Video-based Reconstruction of Appearance, Geometry and Physics for Mesh-free Simulation
Chuhao Chen, Zhiyang Dou, Chen Wang, Yiming Huang, Anjun Chen, Qiao Feng, Jiatao Gu, Lingjie Liu
CVPR 2025
arxiv / code / website
Vid2Sim achieves high-quality, simulation-ready reconstruction of appearance, geometry, and physics from multi-view videos.
|
|
CoMo: Controllable Motion Generation through Language Guided Pose Code Editing
Yiming Huang, Weilin Wan, Yue Yang, Chris Callison-Burch, Mark Yatskar, Lingjie Liu
ECCV 2024
arxiv / code / website
We present CoMo, a unified framework for fine-grained, text-driven human motion generation and editing using discrete and semantically meaningful pose codes.
|
|
Kristen Grauman, Andrew Westbury, (et al., including Yiming Huang)
CVPR 2024 (Oral)
arxiv / video / website
We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset and benchmark challenge, centered around simultaneously captured egocentric and exocentric video of skilled human activities (e.g., sports, music, dance, bike repair).
|
|
Weilin Wan*, Yiming Huang*, Shutong Wu, Taku Komura, Wenping Wang, Dinesh Jayaraman, Lingjie Liu
arXiv 2023
arxiv / video / code / website
We propose a network encoder that converts motion sequences into periodic signals, and a conditional diffusion model that predicts periodic motion parameters from text descriptions and the starting pose, enabling the generation of a broader variety of high-quality, longer motion sequences.
|
Academic Service
Reviewer: ECCV, CVPR, SIGGRAPH, SIGGRAPH Asia, CGI
|
Teaching
Graduate Teaching Assistant at the University of Pennsylvania:
Fall 2023, Spring 2023, Fall 2022: CIT 5900 Programming Languages and Techniques
Fall 2023, Spring 2023: CIS 4190/5190 Applied Machine Learning
Summer 2023: ESE 5410 Machine Learning for Data Science
Course Development at the University of Pennsylvania:
EAS 5740 How to Use Data
Undergraduate Teaching Assistant at NYU Shanghai:
Summer 2022, Spring 2019: CSCI-SHU 101 Intro to Computer Science
Spring 2022: CSCI-SHU 220 Algorithms
Spring 2022, Fall 2021: CSCI-SHU 2314 Discrete Mathematics
Fall 2021, Fall 2018: CSCI-UA 9002/CSCI-SHU 11 Intro to Computer Programming
Spring 2021: CENG-SHU 202 Computer Architecture
Fall 2020: MATH-SHU 235 Probability and Statistics
Other:
2021-2022: Fablab O Shanghai STEAM Instructor
|
Misc
I enjoy doodling!
I'm also a member of the Penn Enchord Acapella Group. Check out some of our performances!
Here's a small stash of some of my acapella arrangements :)