
gemma-4 A4B 26B v3, pruned to 98 experts

Capabilities: tools, thinking
ollama run mannix/gemma4-98e:IQ3_M

Applications

Claude Code: ollama launch claude --model mannix/gemma4-98e:IQ3_M
OpenClaw: ollama launch openclaw --model mannix/gemma4-98e:IQ3_M
Hermes Agent: ollama launch hermes --model mannix/gemma4-98e:IQ3_M
Codex: ollama launch codex --model mannix/gemma4-98e:IQ3_M
OpenCode: ollama launch opencode --model mannix/gemma4-98e:IQ3_M

Models


26 models

gemma4-98e:IQ3_M

9.8GB · 256K context window · Text · 3 days ago

gemma4-98e:Q3_K_XL

11GB · 256K context window · Text · 3 days ago

gemma4-98e:Q6_K_L

18GB · 256K context window · Text · 3 days ago

gemma4-98e:Q3_K_S

9.7GB · 256K context window · Text · 3 days ago

gemma4-98e:Q3_K_M

10GB · 256K context window · Text · 3 days ago

gemma4-98e:Q3_K_L

11GB · 256K context window · Text · 3 days ago

gemma4-98e:Q4_0

11GB · 256K context window · Text · 3 days ago

gemma4-98e:Q4_1

13GB · 256K context window · Text · 3 days ago

gemma4-98e:Q4_K_S

12GB · 256K context window · Text · 3 days ago

gemma4-98e:Q4_K_M

11GB · 256K context window · Text · 3 days ago

gemma4-98e:Q5_K_S

14GB · 256K context window · Text · 3 days ago

gemma4-98e:Q5_K_M

13GB · 256K context window · Text · 3 days ago

gemma4-98e:Q6_K

15GB · 256K context window · Text · 3 days ago

gemma4-98e:IQ2_XXS

7.4GB · 256K context window · Text · 3 days ago

gemma4-98e:IQ2_XS

7.8GB · 256K context window · Text · 3 days ago

gemma4-98e:IQ2_S

7.8GB · 256K context window · Text · 3 days ago

gemma4-98e:IQ2_M

8.2GB · 256K context window · Text · 3 days ago

gemma4-98e:IQ3_XXS

8.9GB · 256K context window · Text · 3 days ago

gemma4-98e:IQ3_XS

9.2GB · 256K context window · Text · 3 days ago

gemma4-98e:IQ4_XS

11GB · 256K context window · Text · 3 days ago

gemma4-98e:IQ4_NL

11GB · 256K context window · Text · 3 days ago

gemma4-98e:CD-Q2_K

8.6GB · 256K context window · Text · 3 days ago

gemma4-98e:CD-Q3_K_M

10GB · 256K context window · Text · 3 days ago

gemma4-98e:CD-Q4_K_M

11GB · 256K context window · Text · 3 days ago

gemma4-98e:CD-Q5_K_M

13GB · 256K context window · Text · 3 days ago

gemma4-98e:CD-Q6_K

15GB · 256K context window · Text · 3 weeks ago
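The quantization tags above map roughly to bits per weight, which you can sanity-check from the listed file sizes. A minimal sketch, assuming ~26B total parameters (from the model name) and ignoring GGUF metadata overhead, so the figures are approximate:

```python
# Rough effective bits-per-weight for a quantized GGUF, estimated from
# its size on disk. Assumes ~26B total parameters and treats "GB" as
# 10^9 bytes; metadata and tokenizer overhead are ignored.

def bits_per_weight(file_size_gb: float, n_params_b: float = 26.0) -> float:
    return file_size_gb * 1e9 * 8 / (n_params_b * 1e9)

for tag, size_gb in [("IQ3_M", 9.8), ("Q4_K_M", 11.0), ("Q6_K", 15.0)]:
    print(f"{tag}: ~{bits_per_weight(size_gb):.1f} bits/weight")
```

This is only a back-of-the-envelope estimate; actual per-tensor bit widths vary within each quantization scheme.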

Readme

gemma-4-A4B-98e-v3 is pruned specifically to keep reasoning intact; as a trade-off, its token usage is higher than the original 128e version's. It is superseded by v4 (https://ollama.com/mannix/gemma4-98e-v4), which scores better and stays within the token usage of the original 128e model:

HumanEval-chat token usage (164 problems × max=3072)

  ┌──────────────┬─────┬─────┬─────┬─────┬──────┬─────┐
  │   variant    │ min │ p10 │ p50 │ p90 │ max  │ avg │
  ├──────────────┼─────┼─────┼─────┼─────┼──────┼─────┤
  │ 128e @3072   │  35 │ 125 │ 314 │ 589 │  917 │ 334 │
  ├──────────────┼─────┼─────┼─────┼─────┼──────┼─────┤
  │ 98e-v4       │  35 │ 114 │ 304 │ 648 │  895 │ 340 │
  ├──────────────┼─────┼─────┼─────┼─────┼──────┼─────┤
  │ 98e-v3 @3072 │  35 │ 206 │ 490 │ 897 │ 1013 │ 512 │
  └──────────────┴─────┴─────┴─────┴─────┴──────┴─────┘
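The per-variant overhead can be read off the avg column of the table above. A quick check, with the averages copied from the table:

```python
# Average-token overhead relative to the original 128e model, using the
# "avg" column of the HumanEval-chat table above.
avg = {"128e": 334, "98e-v4": 340, "98e-v3": 512}

for variant in ("98e-v4", "98e-v3"):
    overhead = avg[variant] / avg["128e"]
    print(f"{variant}: {overhead:.2f}x the 128e average")
```

So v3 uses roughly 1.5x the tokens of the original on average, while v4 is within about 2% of it.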

The chat template has been fixed for tool usage.
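With the template fixed, the model can be exercised through Ollama's standard tool-calling interface. A minimal sketch of a request body for Ollama's /api/chat endpoint; the get_weather function here is hypothetical, made up purely for illustration:

```python
import json

# Example request body for Ollama's /api/chat endpoint, exercising the
# tool-calling template. POST it to http://localhost:11434/api/chat.
# The get_weather tool is a hypothetical placeholder.
payload = {
    "model": "mannix/gemma4-98e:IQ3_M",
    "messages": [{"role": "user", "content": "What's the weather in Rome?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    "stream": False,
}

body = json.dumps(payload)
```

If the template works, the response's message should contain a tool_calls entry naming get_weather instead of plain text.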

Model on HF:

https://huggingface.co/ManniX-ITA/gemma-4-A4B-98e-v3-it

Full GGUF:

https://huggingface.co/ManniX-ITA/gemma-4-A4B-98e-v3-it-GGUF