I’m an Associate Professor of CSE at the University of Michigan, Ann Arbor, and I direct SymbioticLab.
CV | Bio | Students | Google Scholar | Software
Interests: I build large-scale systems for AI/ML and data-intensive workloads. Recent work spans fault-tolerant large model training in the cloud, energy-optimal AI/ML workloads, and memory disaggregation over CXL.
Impact: Everything we build is open source. SymbioticLab produced the first memory disaggregation software (Infiniswap), the first software-only GPU sharing system for deep learning (Salus), the largest federated learning benchmark platform and runtime (FedScale), and the first AI energy optimizer (Zeus). Our work has earned paper awards at NSDI, OSDI, ATC, and MICRO.
I am one of the original co-creators of Apache Spark. My earlier work on coflow and virtual network embedding opened two research directions that the community continues to build on.
Teaching
- CSE 585: Advanced Scalable Systems for X [Agentic AI – F26, W26] [GenAI – F25, F24]
- EECS 489: Computer Networks [W25, W24, F21, F20, F19, F18, W17]
- EECS 598: Systems for X [GenAI – W24] [AI – W21, W20] [Big Data – W19, F17]
- EECS 582: Advanced Operating Systems [F16, W16]
Service
- Program Co-Chair
- Program Committees
  - 2026: SOSP, MLSys, ICML, NeurIPS
  - 2025: OSDI, NSDI, SIGCOMM, TTODLer-FM
  - 2024: MLSys, HotCarbon
  - 2023: OSDI, MLSys
  - 2022: SIGCOMM, HotCarbon
  - 2021: OSDI, NSDI, SIGCOMM, NeurIPS D&B
  - 2020: ICDCS
  - 2019: NSDI, SIGCOMM, BigData, APNet, WORD, NSysS
  - 2018: SIGCOMM, ICDCS
  - 2017: NSDI, SIGCOMM, KBNets, APNet, NSysS
  - 2016: CoNEXT, SIGCOMM Poster/Demo, ICCIT
  - 2015: CoNEXT Student Workshop