CV
Education
- Bachelor of Computer Science and Technology, University of Science and Technology of China (USTC), Sept. 2018 - June 2022
- Ph.D. student at Institute of Computing Technology, Chinese Academy of Sciences, Sept. 2022 - present
Work experience
- Aug. 2021 - Jan. 2023 & Aug. 2023 - June 2025: Research Intern
- Feb. 2021 - July 2021: Teaching Assistant for Mathematical Analysis B2, undergraduate course
  - USTC
  - Supervisor: Professor Yelong Zheng
- Sept. 2020 - Jan. 2021: Teaching Assistant for Mathematical Analysis B1, undergraduate course
  - USTC
  - Supervisor: Professor Yelong Zheng
Awards
- Outstanding Teaching Assistant at USTC, 2021
- Silver Award of Outstanding Student Scholarship at USTC, 2020
- Huawei Scholarship, 2020
- Silver Award of Outstanding Student Scholarship at USTC, 2019
Services
- Journal Reviewer: IEEE Transactions on Affective Computing.
- Conference Reviewer: ICLR, ICCV, ECCV, EMNLP, ACL Rolling Review.
Talks
Preprints & Publications
Scalable and Efficient Foundation Model
- MoTE: Mixture of Ternary Experts for Memory-efficient Large Multimodal Models. Hongyu Wang, Jiayu Xu, Ruiping Wang, Yan Feng, Yitao Zhai, Peng Pei, Xunliang Cai, Xilin Chen.
- BitVLA: 1-bit Vision-Language-Action Models for Robotics Manipulation. Hongyu Wang, Chuyan Xiong, Ruiping Wang, Xilin Chen.

- BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs. Hongyu Wang*, Shuming Ma*, Furu Wei.
- BitNet b1.58 2B4T Technical Report. Shuming Ma*, Hongyu Wang*, Shaohan Huang, Xingxing Zhang, Ying Hu, Ting Song, Yan Xia, Furu Wei.

- BitNet a4.8: 4-bit Activations for 1-bit LLMs. Hongyu Wang*, Shuming Ma*, Furu Wei.
- Bitnet.cpp: Efficient Edge Inference for Ternary LLMs. Jinheng Wang, Hansong Zhou, Ting Song, Shijie Cao, Yan Xia, Ting Cao, Jianyu Wei, Shuming Ma, Hongyu Wang, Furu Wei. ACL 2025.

- Q-Sparse: All Large Language Models can be Fully Sparsely-Activated. Hongyu Wang*, Shuming Ma*, Ruiping Wang, Furu Wei.
- BitNet: 1-bit Pre-training for Large Language Models. Hongyu Wang*, Shuming Ma*, Lingxiao Ma, Lei Wang, Wenhui Wang, Li Dong, Shaohan Huang, Huaijie Wang, Jilong Xue, Ruiping Wang, Yi Wu, Furu Wei. Journal of Machine Learning Research (JMLR 2025) [BitNet b1] [BitNet b1.58]

- DeepNet: Scaling Transformers to 1,000 Layers. Hongyu Wang*, Shuming Ma*, Li Dong, Shaohan Huang, Dongdong Zhang, Furu Wei. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI 2024).

- Magneto: A Foundation Transformer. Hongyu Wang*, Shuming Ma*, Shaohan Huang, Li Dong, Wenhui Wang, Zhiliang Peng, Yu Wu, Payal Bajaj, Saksham Singhal, Alon Benhaim, Barun Patra, Zhun Liu, Vishrav Chaudhary, Xia Song, Furu Wei. International Conference on Machine Learning (ICML), 2023.

- TorchScale: Transformers at Scale. Shuming Ma*, Hongyu Wang*, Shaohan Huang, Wenhui Wang, Zewen Chi, Li Dong, Alon Benhaim, Barun Patra, Vishrav Chaudhary, Xia Song, Furu Wei.

Multimodal and Robotics
- M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models. Hongyu Wang*, Jiayu Xu*, Senwei Xie*, Ruiping Wang, Jialin Li, Zhaojie Xie, Bin Zhang, Chuyan Xiong, Xilin Chen.
- Robotic Programmer: Video Instructed Policy Code Generation for Robotic Manipulation. Senwei Xie*, Hongyu Wang*, Zhanqi Xiao*, Ruiping Wang, Xilin Chen. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2025.