Recent News
All updates in reverse chronological order.
- 2026/05/19 Project Thrilled to release OSCAR: Offline Spectral Covariance-Aware Rotation for 2-bit KV Cache Quantization. INT2 KV-cache serving at 2.28 effective bits per element — ~8× KV-memory reduction and up to ~7× higher large-batch throughput with near-BF16 accuracy on Qwen3-4B/8B/32B and GLM-4.7-FP8.
- 2026/05/18
- 2026/04/03 Event I will be attending the International Conference on Learning Representations (ICLR 2026) in Rio. Looking forward to connecting with amazing researchers — see you in Rio!
- 2026/02/25
- 2026/02/25
- 2026/02/04
- 2026/01/30 Event I will be attending HPCA / PPoPP / CGO / CC 2026 as the web chair of HPCA 2026. Happy to see everyone there.
- 2026/01/27 Paper Our paper KITTY: Accurate and Efficient 2-bit KV Cache Quantization with Dynamic Channel-wise Precision Boost has been accepted by MLSys 2026! Congrats to Haojun!
- 2026/01/27 Paper Our paper CARE: Covariance-Aware and Rank-Enhanced Decomposition for Enabling Multi-Head Latent Attention has been accepted by ICLR 2026!
- 2025/11/13 Talk I will join a panel discussion on AI in the Cross-Border Digital Technologies Ecosystem: Infrastructure, Platforms & Global Launchpads.
- 2025/10/01 Service Joined 32nd IEEE International Symposium on High-Performance Computer Architecture (HPCA 2026) organization as Web Chair.
- 2025/09/02 Paper My first theoretical work on RL has been released on arXiv. Feel free to discuss!
- 2025/07/18 Full Time I will join Together Computer as a Senior AI Researcher on January 12, 2026.
- 2025/07/01 Service Nominated as Reviewer for Thirty-ninth Conference on Neural Information Processing Systems (NeurIPS 2025).
- 2025/06/15 Honor I passed my Progress Evaluation Meeting as satisfactory or excellent — thank you USYD! Preparing for my thesis!
- 2025/05/02 Paper Our paper Ladder-Residual has been accepted by ICML 2025. Congrats to Muru. See you in Vancouver, Canada!
- 2025/04/24 Event I will be attending the International Conference on Learning Representations (ICLR 2025) in Singapore. Looking forward to connecting with amazing researchers — see you in Singapore!
- 2025/03/16 Event I will be attending the NVIDIA GPU Technology Conference (GTC 2025) in San Jose, California. Excited to meet fellow GPU enthusiasts — see you there!
- 2024/12/07 Paper Our survey, RenAIssance, has finally been accepted by TPAMI! Congratulations to Bobby!
- 2024/09/26 Paper Our paper, CorDA, has been accepted by NeurIPS 2024.
- 2024/07/10 Event I will be attending the USENIX Annual Technical Conference (ATC 2024) online. Looking forward to connecting!
- 2024/06/15 Honor I passed my Progress Evaluation Meeting as satisfactory or excellent — thank you USYD!
- 2024/06/06 Paper Our paper, Quant-LLM, has been accepted by USENIX ATC 2024. Congratulations to Haojun again!
- 2024/05/01 Intern Joined Together Computer as a Research Consultant.
- 2024/03/01 Intern Joined Dolby as a Research Intern.
- 2023/11/22 Honor Received APR Intern Program Scholarship (SC3600). Thank you USYD and Dolby!
- 2023/10/06 Paper DeepSpeed4Science has been released! Hope to participate in more AI4Science projects!
- 2023/07/13 Service Joined ACM International Conference on Architectural Support for Programming Languages and Operating Systems Artifact Evaluation Committee (ASPLOS'24 AEC).
- 2023/06/20 Paper Our paper, Flash-LLM, has been accepted by VLDB 2024. Congratulations to Haojun!
- 2023/06/15 Honor I passed my Progress Evaluation Meeting as satisfactory or excellent — thank you Dolby and USYD!
- 2023/04/04 Paper The DeepSpeed Chat project I participated in has reached #11 on Zhihu and #7 on GitHub this week — DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales.
- 2023/04/03 Service Nominated as Reviewer for Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023).
- 2023/03/25 Event I will be attending the ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2023) online. Excited to see you there!
- 2023/03/01 Intern Joined DeepSpeed Team, Microsoft as a Research Intern.
- 2023/03/01 Service Joined Computer Science Research Methods (CSRM 2023) — INFO5993 / INFO4990 at the University of Sydney — Program Committee.
- 2022/11/13 Event I will be attending the International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2022) online. Looking forward to engaging with the global HPC community!
- 2022/10/01 Honor Received The Jingdong Technology (JD) Co Ltd Research Scholarship in Artificial Intelligence. Thank you JD and USYD!
- 2021/12/16 Event I will be attending the China National Computer Congress (CNCC 2021) online.
- 2021/11/14 Event I will be attending the International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2021) online. See you online at SC — always an inspiring HPC gathering!
- 2021/07/01 Honor Received SYSU Overseas Visiting and Collaborative Research Program Funding Plan. Thank you SYSU!
- 2021/06/24 Event I will be attending ISC High Performance (ISC 2021) online. Looking forward to exchanging ideas with HPC experts worldwide!
- 2021/06/14 Event I will be attending the International Symposium on Computer Architecture (ISCA 2021) online.
- 2020/07/11 Talk I will give a talk on JSidentify at ICSE 2020. Looking forward to connecting!
- 2020/05/23 Event I will be attending the International Conference on High Performance Big Data and Intelligent Systems (HPBD&IS 2020) online.