410 Downloads Updated yesterday
ollama run mannix/gemma4-98e-v7-coder
ollama launch claude --model mannix/gemma4-98e-v7-coder
ollama launch codex-app --model mannix/gemma4-98e-v7-coder
ollama launch openclaw --model mannix/gemma4-98e-v7-coder
ollama launch hermes --model mannix/gemma4-98e-v7-coder
ollama launch codex --model mannix/gemma4-98e-v7-coder
ollama launch opencode --model mannix/gemma4-98e-v7-coder
Name
57 models
gemma4-98e-v7-coder:latest
13GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:IQ3_M
9.8GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:IQ4_K_M
10GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:Q2_K_L
8.6GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:Q3_K_XL
11GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:Q4_K_L
13GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:Q5_K_L
15GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:Q6_K_L
18GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:qat
11GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:Q3_K_S
9.7GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:Q3_K_M
11GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:Q3_K_L
11GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:Q4_0
11GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:Q4_1
13GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:Q4_K_S
12GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:Q4_K_M
13GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:Q5_K_S
14GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:Q5_K_M
15GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:Q6_K
18GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:Q8_0
21GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:IQ2_XS
7.8GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:IQ4_XS
11GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:IQ4_NL
11GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:CD-Q2_K
8.8GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:CD-Q3_K_L
10GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:CD-Q4_K_M
11GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:CD-qat-Q4_K_M
11GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:CD-Q5_K_M
13GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:CD-Q6_K
16GB · 256K context window · Text · yesterday
gemma4-98e-v7-coder:vision-CD-Q2_K
10GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-CD-Q3_K_L
11GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-CD-Q4_K_M
12GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-CD-qat-Q4_K_M
12GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-CD-Q5_K_M
14GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-CD-Q6_K
17GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-IQ3_M
11GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-IQ4_K_M
11GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-Q2_K_L
9.8GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-Q3_K_XL
12GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-Q4_K_L
15GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-Q5_K_L
16GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-Q6_K_L
19GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-qat
13GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-Q3_K_S
11GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-Q3_K_M
12GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-Q3_K_L
12GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-Q4_0
13GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-Q4_1
14GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-Q4_K_S
13GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-Q4_K_M
14GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-Q5_K_S
15GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-Q5_K_M
16GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-Q6_K
19GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-Q8_0
22GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-IQ2_XS
9.0GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-IQ4_XS
12GB · 256K context window · Text, Image · yesterday
gemma4-98e-v7-coder:vision-IQ4_NL
13GB · 256K context window · Text, Image · yesterday
20.8B params · 98 experts (30 dropped) · ~4B active · code map + targeted_gpqa
A research checkpoint that takes Gemma-4-26B-A4B-it and drops 30⁄128 experts per layer using a code-targeted recipe on the rebuilt v7 competence maps (audited producers, 10 classes, multilingual category), on a [24,40] per-layer floor — plus a targeted_gpqa class at weight 1.5 that protects a science-specialist keep-set derived from GPQA-diamond pass-traces. Same router, attention, and norms as base, plus the mandatory shared-FFN α=1.2 upweight every coder variant carries.
A coder that kept all its science: it holds v6-coder’s top-tier code profile while pulling GPQA-diamond to 70.71% — +9.6pp over v6-coder and at parity with the unpruned 128e (67.17 on the same Q6_K run). For maximal coding throughput with science left at baseline, see the sibling v7-coderx.
Full model card & methodology: ManniX-ITA/gemma-4-A4B-98e-v7-coder-it on Hugging Face.
Other formats:
- GGUF (29 tiers, imatrix, CD-* per-layer mixes + F16 + mmproj): ManniX-ITA/gemma-4-A4B-98e-v7-coder-it-GGUF
- NVFP4A16 (native vLLM, ~13 GB): ManniX-ITA/gemma-4-A4B-98e-v7-coder-NVFP4A16
| GPQA-D | AIME | MATH-500 | GSM8K | ARC | IFEval | HE | HE+ | LCB-55 | LCB-100 | MultiPL-E |
|---|---|---|---|---|---|---|---|---|---|---|
| 70.71 | 76.67 | 92.00 | 93.00 | 94.80 | 95.00 | 98.78 | 92.68 | 96.36 | 97.00 | 88.67 |
Reference columns on the same Q6_K run: unpruned 128e GPQA 67.17 / AIME 73.33 / HE 97.56 / LCB-55 96.36; v6-coder GPQA 61.11 / HE 98.17 / LCB-55 92.73. v7-coder matches or beats 128e on GPQA, AIME and GSM8K while holding the cohort code profile. (Small benches — GPQA 198q, AIME 30q — carry run-to-run variance; read GPQA/AIME as “recovered the science gap”, not a robust win over the base.)
Every K-quant and CD tier was scored on HumanEval+ (164) and MultiPL-E-100 (llama.cpp, greedy T=0), with per-problem completion length from token_stats. bpw is the true bits-per-weight (8 × bytes ÷ 19,877,953,946). ⭐ marks a recommended pick.
| Tier | Size (GB) | bpw | HE+ % | HE+ tok p50/p90/max | MPE-100 % | MPE tok p50/p90/max |
|---|---|---|---|---|---|---|
| Q8_0 | 21.16 | 8.52 | 91.46 | 231/440/1796 | 88.67 | 83/190/1012 |
| Q6_K_L | 17.98 | 7.24 | 92.07 | 233/443/1039 | 90.00 | 84/188/976 |
| ⭐ Q6_K | 17.81 | 7.17 | 93.29 | 238/451/1238 | 89.33 | 84/188/975 |
| Q5_K_L | 15.25 | 6.14 | 91.46 | 236/455/16209 | 88.67 | 84/193/1013 |
| Q5_K_M | 15.07 | 6.07 | 91.46 | 240/463/6367 | 88.67 | 88/190/1011 |
| Q5_K_S | 14.19 | 5.71 | 90.24 | 231/466/4754 | 89.00 | 88/199/1011 |
| Q4_K_L | 13.42 | 5.40 | 92.68 | 243/455/2047 | 88.67 | 84/188/1013 |
| Q4_K_M | 13.24 | 5.33 | 92.68 | 254/539/16210 | 88.00 | 85/189/1013 |
| Q4_1 | 12.61 | 5.08 | 90.85 | 233/442/4061 | 88.67 | 87/175/704 |
| Q4_K_S | 12.21 | 4.91 | 91.46 | 242/461/1952 | 88.33 | 84/185/1012 |
| IQ4_NL | 11.42 | 4.60 | 90.85 | 234/429/1429 | 87.67 | 85/186/1011 |
| Q4_0 | 11.42 | 4.60 | 89.02 | 242/555/3513 | 87.00 | 87/212/1012 |
| ⭐ qat-Q4_0 | 11.42 | 4.60 | 93.29 | 234/508/16210 | 87.00 | 84/181/974 |
| IQ4_XS | 11.01 | 4.43 | 90.85 | 238/438/2905 | 88.67 | 83/185/883 |
| Q3_K_L | 10.94 | 4.40 | 91.46 | 236/459/1819 | 88.33 | 86/207/1017 |
| ⭐ CD-qat-Q4_K_M | 10.83 | 4.36 | 92.68 | 234/483/3283 | 87.67 | 89/204/1011 |
| Q3_K_XL | 10.69 | 4.30 | 90.85 | 233/427/2524 | 88.00 | 86/200/1013 |
| Q3_K_M | 10.51 | 4.23 | 90.85 | 232/438/3528 | 88.00 | 87/207/957 |
| CD-Q3_K_L | 10.22 | 4.11 | 91.46 | 234/558/2054 | 89.67 | 86/218/1013 |
| Q3_K_S | 9.68 | 3.89 | 89.02 | 257/620/16208 | 84.00 | 93/223/1017 |
| ⭐ CD-Q2_K | 8.82 | 3.55 | 91.46 | 237/568/3124 | 87.67 | 90/234/1013 |
| Q2_K_L | 8.58 | 3.45 | 86.59 | 240/1141/16220 | 82.33 | 95/459/1017 |
| IQ2_XS | 7.77 | 3.13 | 73.78 | 249/15917/16293 | 72.00 | 91/626/1016 |
Recommended picks:
The K-quant and CD tiers hold HE+ in the 90–93% band with length essentially identical to Q6_K; the 2-bit Q2_K_L / IQ2_XS tiers are the cliff (HE+ into the 80s/70s, token p90 blows out). The K-quant CD tiers are the recommended low-bit path — CD-IQ* i-quant bodies are not offered (the pruned MoE degenerates on an IQ-family body, score → 0). Prefer Q4_K_M or higher for production.
Pairing by tier name is misleading — this is a ~20.8B-total MoE and Qwen2.5-Coder-14B is a 14.7B dense model, so the same tier name lands at a different size. The fair comparison is iso-disk: at a given GB budget, which scores higher on HumanEval+? Same rig (RTX 3090, llama.cpp, greedy). Qwen GGUFs are bartowski’s (83–85% across the ladder). At every band the MoE runs lower bpw at the same disk and still scores higher.
| Disk band | Qwen2.5-Coder-14B (size / bpw / HE+) | v7-coder best (size / bpw / HE+) | Δ HE+ |
|---|---|---|---|
| ~21.2 GB | (none — Qwen ceiling Q8_0 15.70 GB) | Q8_0 21.16 / 8.52 / 91.46% | new top |
| ~17.8 GB | (none — Qwen ceiling Q8_0 15.70 GB) | Q6_K 17.81 / 7.17 / 93.29% | new top |
| ~15.1 GB | Q8_0 15.70 / 8.54 / 84.76% | Q5_K_M 15.07 / 6.07 / 91.46% | +6.70 |
| ~13.2 GB | Q6_K 12.12 / 6.60 / 84.76% | Q4_K_M 13.24 / 5.33 / 92.68% | +7.92 |
| ~12.2 GB | Q6_K 12.12 / 6.60 / 84.76% | Q4_K_S 12.21 / 4.91 / 91.46% | +6.70 |
| ~11.4 GB | Q5_K_M 10.51 / 5.72 / 83.54% | qat-Q4_0 11.42 / 4.60 / 93.29% — ⭐ ties top HE+ | +9.75 |
| ~10.8 GB | Q5_K_M 10.51 / 5.72 / 83.54% | CD-qat-Q4_K_M 10.83 / 4.36 / 92.68% — ⭐ recommended | +9.14 |
| ~10.5 GB | Q5_K_M 10.51 / 5.72 / 83.54% | Q3_K_M 10.51 / 4.23 / 90.85% — iso-disk (same 10.5 GB) | +7.31 |
| ~10.2 GB | Q5_K_M 10.51 / 5.72 / 83.54% | CD-Q3_K_L 10.22 / 4.11 / 91.46% | +7.92 |
| ~8.8 GB | Q4_K_M 8.99 / 4.89 / 85.37% | CD-Q2_K 8.82 / 3.55 / 91.46% — ⭐ smallest 90%+ | +6.09 |
ollama pull mannix/gemma4-98e-v7-coder # :latest = Q4_K_M
ollama pull mannix/gemma4-98e-v7-coder:CD-qat-Q4_K_M # ⭐ recommended — 92.68% HE+, 10.8 GB
ollama pull mannix/gemma4-98e-v7-coder:Q6_K # max quality — 93.29% HE+ (bench tier)
ollama pull mannix/gemma4-98e-v7-coder:CD-Q2_K # smallest 90%+ — 91.46% HE+, 8.8 GB
ollama pull mannix/gemma4-98e-v7-coder:vision-Q4_K_M # + SigLIP vision tower
Inherits Gemma 4’s thinking format — serve with the reasoning parser enabled (--reasoning-format deepseek --reasoning-budget 8192 on llama-server).
Derivative of Gemma 4 — Gemma Terms of Use.