openbmb

minicpm-o2.6

A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

26.6K Pulls 13 Tags Updated 11 months ago

A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone

17.6K Pulls 11 Tags Updated 9 months ago

A Pocket-Sized MLLM for Ultra-Efficient Image and Video Understanding on Your Phone

6,952 Pulls 13 Tags Updated 6 hours ago

A Gemini 2.5 Flash Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone

6,901 Pulls 12 Tags Updated 3 months ago

A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

2,477 Pulls 12 Tags Updated 11 months ago

A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

1,826 Pulls 12 Tags Updated 9 months ago

Highly efficient large language models (LLMs) designed explicitly for end-side devices

1,279 Pulls 1 Tag Updated 8 months ago

A GPT-4V Level Multimodal LLM on Your Phone

433 Pulls 13 Tags Updated 11 months ago

Highly efficient large language models (LLMs) designed explicitly for end-side devices

222 Pulls 4 Tags Updated yesterday