openbmb/minicpm5

Details

Updated yesterday

08239e8f70e0 · 688MB ·

model

arch: llama

parameters: 1.08B

quantization: Q4_K_M

688MB

template

<s>{{ if .System }}<|im_start|>system {{ .System }}<|im_end|> {{ end }}{{ range .Messages }}<|im_sta

172B

params

{ "stop": [ "<|im_end|>", "<|endoftext|>" ], "temperature": 0.7, "to

90B
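
As a rough illustration of how these defaults could be supplied or overridden per request, the sketch below builds a non-streaming request body for Ollama's `/api/chat` endpoint. It assumes a local Ollama server on the default port; `build_chat_payload` and `chat` are hypothetical helper names, and only the two settings visible in the truncated params excerpt above are reproduced.

```python
import json
import urllib.request

# Assumption: a local Ollama server listening on its default port.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"

# Only the settings visible in the params excerpt; the remainder is truncated there.
VISIBLE_DEFAULTS = {
    "stop": ["<|im_end|>", "<|endoftext|>"],
    "temperature": 0.7,
}


def build_chat_payload(user_text, system=None, overrides=None):
    """Build a non-streaming /api/chat request body (hypothetical helper)."""
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": user_text})

    # Start from the visible defaults, then apply per-request overrides.
    options = dict(VISIBLE_DEFAULTS)
    options.update(overrides or {})

    return {
        "model": "openbmb/minicpm5",
        "messages": messages,
        "stream": False,
        "options": options,
    }


def chat(payload, url=OLLAMA_CHAT_URL):
    """POST the payload and return the assistant's reply text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["message"]["content"]
```

With the model pulled and the server running, `chat(build_chat_payload("Hello", overrides={"temperature": 0.2}))` would send one request with a lower temperature than the packaged default.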

MiniCPM Tech Report | GitHub Repo | UltraData | MiniCPM Desk Pet | Online Demo

Highlights

We are releasing MiniCPM5-1B, the first model in the MiniCPM5 series. It is a dense 1B-parameter Transformer built for on-device use, local deployment, and resource-constrained scenarios, and it reaches state-of-the-art (SOTA) results among open-source models in the 1B class.

🏆 1B-class open-source SOTA: compared with strong open-source models in the same size class, MiniCPM5-1B reaches SOTA within this comparison set. Its advantage is most visible in agentic tool use, code generation, and difficult reasoning.

🧠 Hybrid Reasoning: the chat template has built-in <think> support, toggled via enable_thinking. The same checkpoint serves as both a fast assistant and a deliberate reasoner.

🛠️ Deployment / Fine-tuning Resources: the MiniCPM GitHub repo provides single-page cookbooks and Agent Skills for major inference backends and fine-tuning frameworks.

🐱 Desktop Pet: a local-LLM desktop pet driven by MiniCPM5-1B.

Model List

Use this directory to choose the model format that matches your runtime:

MiniCPM5-1B · ModelScope · BF16 final release (post-trained with RL + OPD) 👈 you are here
MiniCPM5-1B-SFT · ModelScope · BF16 SFT-only checkpoint (before RL / OPD)
MiniCPM5-1B-Base · ModelScope · BF16 base checkpoint (pre-training only)
MiniCPM5-1B-GGUF · ModelScope · GGUF for llama.cpp / Ollama / LM Studio
MiniCPM5-1B-MLX · ModelScope · MLX / 4bit for Apple Silicon

Model Information

MiniCPM5-1B has the following features:

Type: Causal Language Model
Architecture: Standard LlamaForCausalLM
Number of Parameters: 1,080,632,832
Number of Non-Embedding Parameters: 679,552,512
Number of Layers: 24
Number of Attention Heads (GQA): 16 for Q and 2 for KV
Context Length: 131,072
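
The figures above allow some back-of-the-envelope sizing. The sketch below derives the embedding parameter count, the BF16 weight size, and a KV-cache estimate. The per-head dimension is not stated on this page, so `head_dim` is left as an explicit argument, and the value 128 used in the usage note is purely a hypothetical placeholder.

```python
# Figures copied from the Model Information list.
TOTAL_PARAMS = 1_080_632_832
NON_EMBEDDING_PARAMS = 679_552_512
NUM_LAYERS = 24
NUM_Q_HEADS = 16
NUM_KV_HEADS = 2
CONTEXT_LENGTH = 131_072

BF16_BYTES = 2  # bytes per BF16 value


def embedding_params():
    """Parameters held in embedding matrices (total minus non-embedding)."""
    return TOTAL_PARAMS - NON_EMBEDDING_PARAMS


def bf16_weight_bytes():
    """Size of the full weight set when stored in BF16."""
    return TOTAL_PARAMS * BF16_BYTES


def kv_cache_bytes(num_tokens, head_dim, bytes_per_value=BF16_BYTES):
    """KV-cache size: K and V tensors per layer, one slice per KV head per token.

    head_dim is an assumption supplied by the caller; it is not given in the source.
    """
    return 2 * NUM_LAYERS * NUM_KV_HEADS * head_dim * bytes_per_value * num_tokens


def gqa_cache_reduction():
    """How many times smaller the KV cache is than with one KV head per Q head."""
    return NUM_Q_HEADS // NUM_KV_HEADS
```

For example, `embedding_params()` gives 401,080,320 and `gqa_cache_reduction()` gives 8. Under the hypothetical `head_dim=128`, `kv_cache_bytes(CONTEXT_LENGTH, 128)` comes to 3,221,225,472 bytes (3 GiB) for a full-length context in BF16; a different head dimension scales that figure proportionally.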

Introduction

MiniCPM5-1B is the first checkpoint in the MiniCPM5 series. It is designed for local assistants, coding agents, tool-use workflows, and reasoning scenarios where a compact model is preferred. The model keeps a small deployment footprint while providing native long-context support and both Think / No Think chat modes through the same checkpoint.
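
A minimal sketch of switching between the Think and No Think modes is shown below. It assumes that `enable_thinking` is passed as an extra keyword to the tokenizer's `apply_chat_template` (Hugging Face Transformers forwards extra keywords to the chat template), and that the BF16 checkpoint is published under the repo id `openbmb/MiniCPM5-1B`. Both are assumptions, not confirmed by this page, and `render_prompt` and `load_tokenizer` are hypothetical helper names.

```python
def render_prompt(tokenizer, user_text, enable_thinking):
    """Render a single-turn prompt in Think or No Think mode.

    Assumption: the chat template reads an `enable_thinking` variable.
    """
    messages = [{"role": "user", "content": user_text}]
    return tokenizer.apply_chat_template(
        messages,
        tokenize=False,              # return the prompt as a string
        add_generation_prompt=True,  # end with the assistant turn opener
        enable_thinking=enable_thinking,
    )


def load_tokenizer(repo_id="openbmb/MiniCPM5-1B"):
    """Load the tokenizer; the default repo id is an assumption."""
    # Imported lazily so render_prompt can be used without transformers installed.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(repo_id)
```

Calling `render_prompt(load_tokenizer(), "Explain GQA briefly.", enable_thinking=False)` would then produce a prompt for the fast assistant mode, and passing `True` would request the deliberate reasoning mode from the same checkpoint.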

Note: For local deployment, refer to this document.

Highly efficient large language models (LLMs) designed explicitly for end-side devices.
