Zhen Zheng Zhen Zheng, Chanyoung Oh, Jidong Zhai, Xipeng Shen, Youngmin Yi, Wenguang Chen. [SC’16] “Refactoring and Optimizing the Community Atmosphere Model (CAM) on the Sunway TaihuLight Supercomputer”
AStitch: Enabling a New Multi-dimensional Optimization Space . . . - Zhen Zheng ASPLOS ’22, February 28 – March 4, 2022, Lausanne, Switzerland. Zhen Zheng, et al. [Figure 1: Ratio of memory-intensive computations; bars show kernel execution time ratio and kernel number ratio for CRNN, ASR, BERT, Transformer, and DIEN.] The ratio is the proportion of memory-intensive ops’ metrics to
DREW: Efficient Winograd CNN Inference with Deep Reuse - Zhen Zheng WWW ’22, April 25–29, 2022, Virtual Event, Lyon, France. Ruofan Wu, Feng Zhang, Jiawei Guan, Zhen Zheng, Xiaoyong Du, and Xipeng Shen. needs to be designed for exploiting the similarities and saving computations. • Introduced overhead. Deep reuse is an on-line process in which the similarity detection process among neuron vectors happens
Quant-LLM: Accelerating the Serving of Large Language . . . - Zhen Zheng Haojun Xia, University of Sydney; Zhen Zheng and Xiaoxia Wu, Microsoft; Shiyang Chen, Rutgers University; Zhewei Yao, Stephen Youn, Arash Bakhtiari, and Michael Wyatt, Microsoft; Donglin Zhuang and Zhongzhu Zhou, University of Sydney; Olatunji Ruwase, Yuxiong He, and Shuaiwen Leon Song, Microsoft
Optimizing Distributed Training Deployment in . . . - Zhen Zheng Zhen Zheng², Jun Yang², Wei Lin². {xdyi,swzhang,zyluo,cwu}@cs.hku.hk, {guopinglong.lgp,lansong.dls,james.zz,muzhuo.yj,weilin.lw}@alibaba-inc.com. The University of Hong Kong¹, Alibaba². ABSTRACT: This paper proposes HeteroG, an automatic module to accelerate deep neural network training in heterogeneous GPU clusters. To
WiseGraph: Optimizing GNN with Joint Workload Partition of . . . - Zhen Zheng Kezhao Huang, Jidong Zhai, Liyan Zheng, Haojie Wang, Yuyang Jin, Qihao Zhang, Runqing Zhang, Zhen Zheng, Youngmin Yi, and Xipeng Shen. 2024. WiseGraph: Optimizing GNN with Joint Workload Partition of Graph and Operations. In Nineteenth European Conference on Computer Systems (EuroSys ’24), April 22–25, 2024, Athens, Greece.
Blog posts - Zhen Zheng Zhen Zheng (郑祯), ML System Research Engineer.
RECom: A Compiler Approach to Accelerate Recommendation . . . - Zhen Zheng Zaifeng Pan, Zhen Zheng, Feng Zhang, Ruofan Wu, Hao Liang, Dalin Wang, Xiafei Qiu, Junjie Bai, Wei Lin, and Xiaoyong Du. 2023. RECom: A Compiler Approach to Accelerate Recommendation Model Inference with Massive Embedding Columns. In 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems
MonoNN: Enabling a New Monolithic Optimization Space for . . . - Zhen Zheng Donglin Zhuang †∗⋄, Zhen Zheng ‡∗, Haojun Xia †⋄, Xiafei Qiu ‡, Junjie Bai ‡, Wei Lin ‡, Shuaiwen Leon Song †. †The University of Sydney, ‡Alibaba Group. Abstract: In this work, we reveal that the kernel-by-kernel execution scheme in the existing machine learning optimizing compilers