Publications
See a full list on Google Scholar
High-Throughput Generative Inference of Large Language Models with a Single GPU
Ying Sheng, Lianmin Zheng, Binhang Yuan, Zhuohan Li, Max Ryabinin, Daniel Y Fu, Zhiqiang Xie, Beidi Chen, Clark Barrett, Joseph E Gonzalez, Percy Liang, Christopher Ré, Ion Stoica, Ce Zhang
ICML 2023
| paper | code |
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving
Zhuohan Li*, Lianmin Zheng*, Yinmin Zhong*, Vincent Liu, Ying Sheng, Xin Jin, Yanping Huang, Zhifeng Chen, Hao Zhang, Joseph E Gonzalez, Ion Stoica
OSDI 2023
| paper | code |
TensorIR: An Abstraction for Automatic Tensorized Program Optimization
Siyuan Feng, Bohan Hou, Hongyi Jin, Wuwei Lin, Junru Shao, Ruihang Lai, Zihao Ye, Lianmin Zheng, Cody Hao Yu, Yong Yu, Tianqi Chen
ASPLOS 2023
| paper | code |
Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning
Lianmin Zheng*, Zhuohan Li*, Hao Zhang*, Yonghao Zhuang, Zhifeng Chen, Yanping Huang, Yida Wang, Yuanzhong Xu, Danyang Zhuo, Eric P. Xing, Joseph E. Gonzalez, Ion Stoica
OSDI 2022
| paper | code | slides | talk |
TenSet: A Large-scale Program Performance Dataset for Learned Tensor Compilers
Lianmin Zheng*, Ruochen Liu*, Junru Shao, Tianqi Chen, Joseph Gonzalez, Ion Stoica, Ameer Haj-Ali
NeurIPS 2021 (Datasets and Benchmarks Track)
| paper | code | slides | talk |
ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training
Jianfei Chen*, Lianmin Zheng*, Zhewei Yao, Dequan Wang, Ion Stoica, Michael W. Mahoney, Joseph E. Gonzalez
ICML 2021
| paper | code | slides | talk |
Ansor: Generating High-Performance Tensor Programs for Deep Learning
Lianmin Zheng, Chengfan Jia, Minmin Sun, Zhao Wu, Cody Hao Yu, Ameer Haj-Ali, Yida Wang, Jun Yang, Danyang Zhuo, Koushik Sen, Joseph E. Gonzalez, Ion Stoica
OSDI 2020
| paper | code | tutorial | slides | talk |
A Hardware-Software Blueprint for Flexible Deep Learning Specialization
Thierry Moreau, Tianqi Chen, Luis Vega, Jared Roesch, Eddie Yan, Lianmin Zheng, Josh Fromm, Ziheng Jiang, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy
IEEE Micro 2019 (Best paper award)
| paper | code |
Learning to Optimize Tensor Programs
Tianqi Chen, Lianmin Zheng, Eddie Yan, Ziheng Jiang, Thierry Moreau, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy
NeurIPS 2018
| paper | code | tutorial |
TVM: An Automated End-to-End Optimizing Compiler for Deep Learning
Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Meghan Cowan, Haichen Shen, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy
OSDI 2018
| paper | code |
MAgent: A Many-Agent Reinforcement Learning Platform for Artificial Collective Intelligence
Lianmin Zheng*, Jiacheng Yang*, Han Cai, Weinan Zhang, Jun Wang, Yong Yu
AAAI 2018 (Demo Track)
| paper | code | video |