Biography

I am Hongyu Wang (王鸿钰 in Chinese), a third-year Ph.D. candidate at the Institute of Computing Technology, Chinese Academy of Sciences (ICT, CAS), under the supervision of Professor Xilin Chen. I received my B.Eng. degree from the Department of Computer Science and Technology, University of Science and Technology of China (USTC), where I was advised by Associate Researcher Chao Qian. Since Aug. 2021, I have been a research intern in the General Artificial Intelligence (GenAI) group at MSR-Asia, under the supervision of Dr. Furu Wei and Shuming Ma.

I am greatly interested in the following topics:

  1. Scale efficiently! Efficient architectures for large-scale foundation models
  2. Multimodal reasoning and robotics

Contact: why0711@mail.ustc.edu.cn

News:

  • [04/2025] BitNet v2, native 4-bit activations for 1-bit LLMs.
  • [04/2025] Introducing BitNet b1.58 2B4T, the first native 1-bit LLM trained at scale! Model weights and technical report are public! Cooking larger models now…
  • [11/2024] BitNet a4.8, enabling 4-bit activations for 1-bit LLMs. BitNet a4.8 activates only 55% of its parameters and further supports a 3-bit KV cache without extra training.
  • [10/2024] bitnet.cpp, the official inference framework for BitNet b1.58! Run a 100B BitNet b1.58 model on a single CPU at human reading speed!
  • [07/2024] Q-Sparse, the fully sparsely-activated LLM.
  • [04/2024] DeepNet is accepted as a regular paper by TPAMI 2024.
  • [03/2024] BitNet b1.58: Training Tips, Code and FAQ.
  • [02/2024] BitNet b1.58, the first ternary LLM that matches the performance of FP16 LLMs with a significant reduction in inference cost (latency, memory, throughput, and energy consumption); see the ternary-quantization sketch after this list.
  • [10/2023] BitNet, the first binary LLM, with performance competitive with FP16 LLMs and SoTA 8-bit quantization methods.
  • [05/2023] Magneto is accepted by ICML 2023
  • [11/2022] TorchScale: Transformers at Scale
  • [10/2022] Magneto, a foundation Transformer, outperforms the de facto Transformer variants designed for various applications, including language modeling (e.g., BERT and GPT), machine translation, vision pretraining (e.g., BEiT), speech recognition, and multimodal pretraining (e.g., BEiT-3).
  • [03/2022] DeepNet, Scaling Transformers to 1,000 Layers! It outperforms M2M-100 by 5 BLEU points on massive multilingual benchmarks.
  • [08/2021] Started my internship at MSR-Asia~
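
For readers curious how the ternary weights in BitNet b1.58 work in practice, below is a minimal, unofficial sketch of absmean weight quantization as described in the public technical report: weights are scaled by their mean absolute value, then rounded and clipped to {-1, 0, +1}. Function and variable names are illustrative; this is not the released code.

    # Minimal sketch of absmean ternary weight quantization in the spirit of
    # BitNet b1.58. Illustrative only, not the official implementation.
    import torch

    def absmean_ternary_quantize(w: torch.Tensor, eps: float = 1e-5):
        """Quantize a weight matrix to {-1, 0, +1} with a per-tensor scale."""
        gamma = w.abs().mean().clamp(min=eps)       # per-tensor scale (mean |W|)
        w_q = (w / gamma).round().clamp_(-1, 1)     # ternary values {-1, 0, +1}
        return w_q, gamma                           # dequantize as w_q * gamma

    if __name__ == "__main__":
        w = torch.randn(4, 8)
        w_q, gamma = absmean_ternary_quantize(w)
        print(w_q)                                  # entries in {-1., 0., 1.}
        print("reconstruction error:", (w - w_q * gamma).abs().mean().item())

With weights restricted to {-1, 0, +1}, matrix multiplication largely reduces to additions and subtractions, which is the source of the latency, memory, and energy savings noted above.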