Zheng Wang (汪政)



Bio

Zheng Wang is a Research Scientist on the Kernel & Optimization team at Meta Superintelligence Labs. He received his Ph.D. in Computer Science from UC San Diego, where he was advised by Prof. Yufei Ding. His research focuses on high-performance system design and end-to-end optimization for large-scale deep learning, in particular large language models (LLMs) and deep learning recommendation models (DLRMs).

News

  • [09/2025] Our Yggdrasil paper has been accepted to NeurIPS’25.
  • [04/2025] Our GMI-DRL paper has been accepted to ATC’25.
  • [03/2025] Our WLB-LLM paper has been accepted to OSDI’25.
  • [02/2025] Our FastTree paper has been accepted to MLSys’25.

Internships

  • [06/2025-09/2025] Superintelligence Labs Infra Team, Meta.
  • [06/2024-09/2024] AI & System Co-design Team, Meta.
  • [06/2023-09/2023] AI & System Co-design Team, Meta.
  • [06/2022-09/2022] Research Intern, Pacific Northwest National Laboratory (PNNL).
  • [11/2020-01/2021] Software Engineering Intern, Tencent.

Publications

[NeurIPS’2025] Yue Guan, Changming Yu, Shihan Fang, Weiming Hu, Zaifeng Pan, Zheng Wang, Zihan Liu, Yangjie Zhou, Yufei Ding, Minyi Guo, Jingwen Leng. Yggdrasil: Bridging Dynamic Speculation and Static Runtime for Latency-Optimal Tree-Based LLM Decoding. (to appear)

[ATC’2025] Yuke Wang, Boyuan Feng, Zheng Wang, Guyue Huang, Tong Geng, Ang Li, Yufei Ding. GMI-DRL: Empowering Multi-GPU Deep Reinforcement Learning with GPU Spatial Multiplexing. [Link]

[OSDI’2025] Zheng Wang, Anna Cai, Xinfeng Xie, Zaifeng Pan, Yue Guan, Weiwei Chu, Jie Wang, Shikai Li, Jianyu Huang, Chris Cai, Yuchen Hao, Yufei Ding. WLB-LLM: Workload-Balanced 4D Parallelism for Large Language Model Training. [Link]

[MLSys’2025] Zaifeng Pan, Yitong Ding, Yue Guan, Zheng Wang, Zhongkai Yu, Xulong Tang, Yida Wang, Yufei Ding. FastTree: Optimizing Attention Kernel and Runtime for Tree-Structured LLM Inference. [Link]

[ATC’2024] Zheng Wang, Yuke Wang, Boyuan Feng, Guyue Huang, Dheevatsa Mudigere, Bharath Muthiah, Ang Li, Yufei Ding. OPER: Optimality-Guided Embedding Table Parallelization for Large-scale Recommendation Model. [Link]

[ASPLOS’2024] Zheng Wang, Yuke Wang, Jiaqi Deng, Da Zheng, Ang Li, Yufei Ding. RAP: Resource-aware Automated GPU Sharing for Multi-GPU Recommendation Model Training and Input Preprocessing. [Link]

[ASPLOS’2024] Boyuan Feng, Zheng Wang, Yuke Wang, Shu Yang, Yufei Ding. ZENO: A Type-based Optimization Framework for Zero-Knowledge Neural Network Inference. [Link]

[ATC’2023] Yuke Wang, Boyuan Feng, Zheng Wang, Guyue Huang, Yufei Ding. TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs. [Link]

[OSDI’2023] Yuke Wang, Boyuan Feng, Zheng Wang, Tong Geng, Ang Li, Kevin Barker, Yufei Ding. MGG: Accelerating Graph Neural Networks with Fine-Grained Intra-Kernel Communication-Computation Pipelining on Multi-GPU Platforms. [Link]

[ISCA’2023] Siqi Li, Fengbin Tu, Liu Liu, Jilan Lin, Zheng Wang, Yangwook Kang, Yufei Ding, Yuan Xie. ECSSD: Hardware/Data Layout Co-Designed In-Storage-Computing Architecture for Extreme Classification. [Link]

[SC’2022] Zheng Wang, Yuke Wang, Boyuan Feng, Dheevatsa Mudigere, Bharath Muthiah, Yufei Ding. EL-Rec: Efficient Large-scale Recommendation Model Training via Tensor-train Embedding Table. [Link]

[ATC’2022] Boyuan Feng, Tianqi Tang, Yuke Wang, Zhaodong Chen, Zheng Wang, Shu Yang, Yuan Xie, Yufei Ding. Faith: An Efficient Framework for Transformer Verification on GPUs. [Link]


Selected Awards

[04/2023] 2023 Meta Fellowship Finalist [Link]


Professional Services

  • [01/2023] PLDI’23 Artifact Evaluation Committee
  • [01/2023] EuroSys’23 Artifact Evaluation Committee
  • [11/2022] ECOOP’23 Extended Review Committee
  • [11/2022] PPoPP’23 Artifact Evaluation Committee
  • [10/2022] CGO’23 Artifact Evaluation Committee
  • [07/2022] MICRO’22 Artifact Evaluation Committee
  • [06/2022] SIGCOMM’22 Artifact Evaluation Committee
  • [04/2022] Teaching Assistant for CS160 (Translation of Programming Languages)