Biography

I am Hongyu Wang (王鸿钰 in Chinese), a third-year Ph.D. candidate at the Chinese Academy of Sciences (CAS). I received my B.Eng. degree from the Department of Computer Science and Technology, University of Science and Technology of China (USTC), where I was advised by Associate Researcher Chao Qian. Since Aug. 2021, I have been a research intern at the General Artificial Intelligence (GenAI) group, MSR-Asia, under the supervision of Dr. Furu Wei and Shuming Ma.

I have a great interest in the following topics:

  1. Scale efficiently! Efficient architectures for large-scale foundation models
  2. Self-supervised learning for robotics

Contact: why0711@mail.ustc.edu.cn

News:

  • [11/2024] BitNet a4.8, enabling 4-bit activations for 1-bit LLMs. BitNet a4.8 activates only 55% of its parameters and further supports a 3-bit KV cache without extra training. A 2B BitNet a4.8 model trained on 2T tokens achieves 50.30% accuracy on MMLU!
  • [10/2024] bitnet.cpp, the official inference framework for BitNet b1.58! Run a 100B BitNet b1.58 model on a single CPU at human reading speed!
  • [07/2024] Q-Sparse, a fully sparsely-activated LLM.
  • [04/2024] DeepNet is accepted as a regular paper by TPAMI 2024.
  • [03/2024] BitNet b1.58: Training Tips, Code and FAQ.
  • [02/2024] BitNet b1.58, the first ternary LLM that matches the performance of FP16 LLMs with a significant reduction in inference cost (latency, memory, throughput, and energy consumption).
  • [10/2023] BitNet, the first binary LLM with performance competitive with FP16 LLMs and SoTA 8-bit quantization methods.
  • [05/2023] Magneto is accepted by ICML 2023.
  • [11/2022] TorchScale: Transformers at Scale
  • [10/2022] Magneto, a foundation Transformer, outperforms the de facto Transformer variants designed for various applications, including language modeling (i.e., BERT and GPT), machine translation, vision pretraining (i.e., BEiT), speech recognition, and multimodal pretraining (i.e., BEiT-3).
  • [03/2022] DeepNet, Scaling Transformers to 1,000 Layers! It outperforms M2M-100 by 5 BLEU points on massive multilingual benchmarks.
  • [08/2021] Started my internship at MSR-Asia ~