Shuhan Tan

Hi there! I am currently a PhD student at The University of Texas at Austin, advised by Philipp Krähenbühl.

My research focuses on simulating realistic human behavior and environment evolution. Specifically, my current research focuses on content generation for autonomous driving simulation systems, with the aim of making autonomous driving safe and easily accessible for everyone.

My long-term goal is to develop realistic world modeling and simulation systems that can be used to train and test intelligent agents in a variety of domains.

Email / CV / Scholar / LinkedIn

Education

The University of Texas at Austin
PhD in Computer Science • Aug. 2021 - Present

Sun Yat-Sen University
B.E. in Computer Science • Sep. 2016 - Jun. 2021
Ranking: 2/189

Research Experience

Waymo Research
Research Intern • Jun. 2024 - Nov. 2024
With Chiyu "Max" Jiang

NVIDIA Research
Autonomous Vehicle Research Group
Research Intern • Aug. 2023 - Jun. 2024; Jun. 2025 - Present
With Boris Ivanovic, Xinshuo Weng, Yuxiao Chen,
Boyi Li, Yulong Cao, Marco Pavone

The University of Texas at Austin
Research Assistant • Jan. 2023 - Present
PhD Advisor: Philipp Krähenbühl
Research Assistant • Aug. 2021 - Dec. 2022
With Kristen Grauman

The Chinese University of Hong Kong
Research Assistant • Sep. 2020 - Mar. 2021
With Bolei Zhou

Uber ATG Toronto
Research Intern • Sep. 2019 - Aug. 2020
With Raquel Urtasun, Shenlong Wang

Boston University
Research Assistant • Jul. 2019 - Sep. 2019
With Kate Saenko, Xingchao Peng

Sun Yat-Sen University
Undergraduate Researcher • Sep. 2017 - Jun. 2019
With Wei-Shi Zheng

News

[06/2025] I am joining NVIDIA Research as a Research Intern this summer!

[05/2025] RIPT-VLA released! Check it out!

[03/2025] SceneDiffuser++ got accepted to CVPR 2025!

[09/2024] ProSim got accepted to CoRL 2024!

[06/2024] I started my internship at Waymo Research. Let's catch up in the Bay Area!

[06/2024] I finished my fantastic internship at NVIDIA Research. Shoutout to my great mentors!

[05/2024] I gave a 30-min invited talk at the Long-term Human Motion Prediction Workshop at ICRA 2024. [Slides]

Talks

Towards realistic, controllable and reactive
traffic simulation

Long-term Human Motion Prediction Workshop
ICRA 2024. Yokohama, Japan.

Slides / Workshop

Publications

RIPT-VLA: Interactive Post-Training for Vision-Language-Action Models.

Shuhan Tan, Kairan Dou, Yue Zhao, Philipp Krähenbühl

arXiv:2505.17016 preprint.

Paper / Page / Code / Model

Interactive post-training for any pretrained Vision-Language-Action (VLA) model using only sparse binary success rewards.

SceneDiffuser++: City-Scale Traffic Simulation via a Generative World Model.

Shuhan Tan, John Lambert, Hong Jeon, Sakshum Kulshrestha, Yijing Bai, Jing Luo, Dragomir Anguelov, Mingxing Tan, Chiyu Max Jiang

CVPR 2025.

Paper / Page

Diffusion-based World Model for city-scale traffic simulation.

Promptable Closed-loop Traffic Simulation.

Shuhan Tan, Boris Ivanovic, Yuxiao Chen, Boyi Li, Xinshuo Weng, Yulong Cao, Philipp Krähenbühl, Marco Pavone

CoRL 2024.

Paper / Project page / Code / Video / Colab Demo / Dataset

Simulate multi-agent interactions with multimodal prompts.

Language Conditioned Traffic Generation.

Shuhan Tan, Boris Ivanovic, Xinshuo Weng, Marco Pavone, Philipp Krähenbühl

CoRL 2023.

Paper / Project page / Code / Video / Colab Demo

Language-conditioned traffic scene generation using an LLM.

EgoDistill: Egocentric Head Motion Distillation for Efficient Video Understanding.

Shuhan Tan, Tushar Nagarajan, Kristen Grauman

NeurIPS 2023.

Paper / Project page

Efficient egocentric video understanding with head motion data from IMU.

TrafficGen: Learning to Generate Diverse and Realistic Traffic Scenarios.

Lan Feng, Quanyi Li, Zhenghao Peng, Shuhan Tan, Bolei Zhou

ICRA 2023.

Paper / Project page / Video

Synthesize new traffic scenarios and replay them in simulation.

Improving the Fairness of Deep Generative Models without Retraining.

Shuhan Tan, Yujun Shen, Bolei Zhou.

arXiv:2012.04842 preprint.

Paper / Project page / Colab

Mitigate biases of GAN models without retraining.

SceneGen: Learning to Generate Realistic Traffic Scenes.

Shuhan Tan*, Kelvin Wong*, Shenlong Wang, Sivabalan Manivasagam, Mengye Ren, Raquel Urtasun.

CVPR 2021.

Paper / Video / Bibtex

Generate realistic traffic scenes automatically.

Class-imbalanced Domain Adaptation: An Empirical Odyssey.

Shuhan Tan, Xingchao Peng, Kate Saenko.

TASK-CV Workshop, ECCV 2020.

Paper / Bibtex

Align feature distributions across domains when the label distributions of the two domains also differ.

LidarSIM: Realistic LiDAR Simulation by Leveraging the Real World

Sivabalan Manivasagam, Shenlong Wang, Kelvin Wong, Wenyuan Zeng, Mikita Sazanovich, Shuhan Tan, Bin Yang, Wei-Chiu Ma and Raquel Urtasun.

CVPR 2020 (Oral).

Paper / Supplement / Bibtex

Realistic sensor simulation for LiDAR.

Biomarker Localization by Combining CNN Classifier and Generative Adversarial Network

Rong Zhang, Shuhan Tan, Ruixuan Wang, Siyamalan Manivannan, Jingjing Chen, Haotian Lin, Wei-Shi Zheng.

MICCAI 2019.

Paper / Bibtex

We proposed a novel deep neural network architecture to effectively localize potential biomarkers in medical images, when only the image-level labels are available during model training.

Weakly Supervised Open-set Domain Adaptation by Dual-domain Collaboration

Shuhan Tan, Jiening Jiao, Wei-Shi Zheng

CVPR 2019.

Paper / Supplement / Bibtex

We proposed a practical weakly supervised setting for open-set domain adaptation, where two scarcely-labeled domains collaboratively learn from each other.

Invited for presentation at WebVision 2019.

Selected Honors

Distinguished Graduate Thesis, Sun Yat-sen University

Patent

Systems and Methods for Simulating Traffic Scenes. US Patent App. 17/528,277.

Shuhan Tan, Kelvin Ka Wing Wong, Shenlong Wang, Sivabalan Manivasagam, Mengye Ren, Raquel Urtasun

Academic Service

Reviewer: CVPR, ECCV, ICCV, ICRA, RA-L, TCSVT, TMI.

Cat Service

Shufa, aka Maomi
British Longhair Boy😺

Born in 2022, California.

Instagram / Wiki

My little colleague, who always sleeps on my desk
and puts his paw on the F5 key to prevent VS Code debugging.

Updated Sep 2024.

Special thanks to Jon Barron for the website template.