CMU MRSD Capstone Project (Fall 2025), sponsored by Nissan and Field AI
Advised by Prof. Guanya Shi
View the project website

Skills

Computer Vision, Point Clouds, Robot Foundation Models, Vision-Language-Action Models
Frameworks and Tools: PyTorch, PyTorch3D, MuJoCo, PCL, Open3D, Apple Vision Pro, Rerun

Collected 800+ high-quality teleoperated manipulation demonstrations on the Unitree G1 robot using Apple Vision Pro and a custom-built teleoperation and data collection pipeline. Fine-tuned NVIDIA GR00T N1.5 with LoRA and deployed the resulting policy. Developed and benchmarked other diffusion policies, varying the modality (point-cloud perception) and architecture (DDPM and ACT), with a focus on latency and reliability.

GR00T N1.5 VLA on the Unitree G1 performing box manipulation.

Project Poster

Download PDF