CV
Education
- Bachelor of Computer Science and Technology, University of Science and Technology of China (USTC), Sept. 2018 - June 2022
- Ph.D. student at the Institute of Computing Technology, Chinese Academy of Sciences, Sept. 2022 - present
Work experience
- Aug. 2021 - Jan. 2023 & Aug. 2023 - Present: Research Intern
- Feb. 2021 - July 2021: Teaching Assistant for Mathematical Analysis B2, undergraduate course
- USTC
- Supervisor: Professor Yelong Zheng
- Sept. 2020 - Jan. 2021: Teaching Assistant for Mathematical Analysis B1, undergraduate course
- USTC
- Supervisor: Professor Yelong Zheng
Awards
- Outstanding Teaching Assistant at USTC, 2021
- Silver Award of Outstanding Student Scholarship at USTC, 2020
- Huawei Scholarship, 2020
- Silver Award of Outstanding Student Scholarship at USTC, 2019
Services
- Journal Reviewer: IEEE Transactions on Affective Computing.
- Conference Reviewer: ICLR, ICCV, ECCV, EMNLP, ACL Rolling Review.
Talks
Preprints & Publications
Scalable and Efficient Foundation Models
- BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs. Hongyu Wang*, Shuming Ma*, Furu Wei
- BitNet b1.58 2B4T Technical Report. Shuming Ma*, Hongyu Wang*, Shaohan Huang, Xingxing Zhang, Ying Hu, Ting Song, Yan Xia, Furu Wei
- BitNet a4.8: 4-bit Activations for 1-bit LLMs. Hongyu Wang*, Shuming Ma*, Furu Wei
- Bitnet.cpp: Efficient Edge Inference for Ternary LLMs. Jinheng Wang, Hansong Zhou, Ting Song, Shijie Cao, Yan Xia, Ting Cao, Jianyu Wei, Shuming Ma, Hongyu Wang, Furu Wei. ACL 2025.
- Q-Sparse: All Large Language Models can be Fully Sparsely-Activated. Hongyu Wang*, Shuming Ma*, Ruiping Wang, Furu Wei
- DeepNet: Scaling Transformers to 1,000 Layers. Hongyu Wang*, Shuming Ma*, Li Dong, Shaohan Huang, Dongdong Zhang, Furu Wei. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024.
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits. Shuming Ma*, Hongyu Wang*, Lingxiao Ma, Lei Wang, Wenhui Wang, Shaohan Huang, Li Dong, Ruiping Wang, Jilong Xue, Furu Wei
- BitNet: Scaling 1-bit Transformers for Large Language Models. Hongyu Wang*, Shuming Ma*, Li Dong, Shaohan Huang, Huaijie Wang, Lingxiao Ma, Fan Yang, Ruiping Wang, Yi Wu, Furu Wei
- Magneto: A Foundation Transformer. Hongyu Wang*, Shuming Ma*, Shaohan Huang, Li Dong, Wenhui Wang, Zhiliang Peng, Yu Wu, Payal Bajaj, Saksham Singhal, Alon Benhaim, Barun Patra, Zhun Liu, Vishrav Chaudhary, Xia Song, Furu Wei. International Conference on Machine Learning (ICML), 2023.
- TorchScale: Transformers at Scale. Shuming Ma*, Hongyu Wang*, Shaohan Huang, Wenhui Wang, Zewen Chi, Li Dong, Alon Benhaim, Barun Patra, Vishrav Chaudhary, Xia Song, Furu Wei
Multimodal and Robotics
- M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models. Hongyu Wang*, Jiayu Xu*, Senwei Xie*, Ruiping Wang, Jialin Li, Zhaojie Xie, Bin Zhang, Chuyan Xiong, Xilin Chen
- Robotic Programmer: Video Instructed Policy Code Generation for Robotic Manipulation. Senwei Xie, Hongyu Wang, Zhanqi Xiao, Ruiping Wang, Xilin Chen