1-bit LLMs, an extreme yet promising form of model quantization in which weights (and potentially activations) are constrained to binary {-1, +1} or ternary {-1, 0, +1} values, offer a compelling answer to these efficiency challenges.
https://github.com/microsoft/BitNet
- Quantization: Native 1.58-bit weights and 8-bit activations (W1.58A8).
- Weights are quantized to ternary values {-1, 0, +1} using absmean quantization during the forward pass.
- Activations are quantized to 8-bit integers using absmax quantization (per-token).
- Crucially, the model was trained from scratch with this quantization scheme, not post-training quantized.
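The two quantizers above can be sketched in a few lines. This is a minimal illustration of the idea (absmean for ternary weights, per-token absmax for int8 activations), not BitNet's actual implementation, which applies these inside the forward pass with straight-through gradient estimation during training:

```python
def absmean_ternary(weights):
    """Quantize a weight matrix to {-1, 0, +1} using absmean scaling."""
    flat = [abs(v) for row in weights for v in row]
    # scale = mean absolute value of all weights (the "absmean")
    scale = sum(flat) / len(flat) or 1e-8
    # divide by the scale, round, then clip into the ternary set
    q = [[max(-1, min(1, round(v / scale))) for v in row] for row in weights]
    return q, scale

def absmax_int8(token):
    """Quantize one activation vector (per-token) to int8 using absmax scaling."""
    # scale maps the largest-magnitude entry to the int8 boundary
    scale = 127.0 / (max(abs(v) for v in token) or 1e-8)
    q = [max(-128, min(127, round(v * scale))) for v in token]
    return q, scale
```

With ternary weights, the matrix multiply in each linear layer reduces to additions, subtractions, and skips (for zeros), which is why CPU inference can be so cheap.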
Testing it myself, the response speed was faster than expected, and desktop CPU usage stayed low.