410 yesterday

An even more improved version of Gemma-4 98e coder variant, the best 20b coder

vision tools thinking
ollama run mannix/gemma4-98e-v7-coder

Applications

Claude Code
Claude Code ollama launch claude --model mannix/gemma4-98e-v7-coder
Codex App
Codex App ollama launch codex-app --model mannix/gemma4-98e-v7-coder
OpenClaw
OpenClaw ollama launch openclaw --model mannix/gemma4-98e-v7-coder
Hermes Agent
Hermes Agent ollama launch hermes --model mannix/gemma4-98e-v7-coder
Codex
Codex ollama launch codex --model mannix/gemma4-98e-v7-coder
OpenCode
OpenCode ollama launch opencode --model mannix/gemma4-98e-v7-coder

Models

View all →

57 models

gemma4-98e-v7-coder:latest

13GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:IQ3_M

9.8GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:IQ4_K_M

10GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:Q2_K_L

8.6GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:Q3_K_XL

11GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:Q4_K_L

13GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:Q5_K_L

15GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:Q6_K_L

18GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:qat

11GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:Q3_K_S

9.7GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:Q3_K_M

11GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:Q3_K_L

11GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:Q4_0

11GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:Q4_1

13GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:Q4_K_S

12GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:Q4_K_M

13GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:Q5_K_S

14GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:Q5_K_M

15GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:Q6_K

18GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:Q8_0

21GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:IQ2_XS

7.8GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:IQ4_XS

11GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:IQ4_NL

11GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:CD-Q2_K

8.8GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:CD-Q3_K_L

10GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:CD-Q4_K_M

11GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:CD-qat-Q4_K_M

11GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:CD-Q5_K_M

13GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:CD-Q6_K

16GB · 256K context window · Text · yesterday

gemma4-98e-v7-coder:vision-CD-Q2_K

10GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-CD-Q3_K_L

11GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-CD-Q4_K_M

12GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-CD-qat-Q4_K_M

12GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-CD-Q5_K_M

14GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-CD-Q6_K

17GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-IQ3_M

11GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-IQ4_K_M

11GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-Q2_K_L

9.8GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-Q3_K_XL

12GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-Q4_K_L

15GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-Q5_K_L

16GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-Q6_K_L

19GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-qat

13GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-Q3_K_S

11GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-Q3_K_M

12GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-Q3_K_L

12GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-Q4_0

13GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-Q4_1

14GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-Q4_K_S

13GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-Q4_K_M

14GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-Q5_K_S

15GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-Q5_K_M

16GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-Q6_K

19GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-Q8_0

22GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-IQ2_XS

9.0GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-IQ4_XS

12GB · 256K context window · Text, Image · yesterday

gemma4-98e-v7-coder:vision-IQ4_NL

13GB · 256K context window · Text, Image · yesterday

Readme

Gemma 4 26B-A4B 98e v7-coder — science-augmented code prune

20.8B params · 98 experts (30 dropped) · ~4B active · code map + targeted_gpqa

A research checkpoint that takes Gemma-4-26B-A4B-it and drops 30128 experts per layer using a code-targeted recipe on the rebuilt v7 competence maps (audited producers, 10 classes, multilingual category), on a [24,40] per-layer floor — plus a targeted_gpqa class at weight 1.5 that protects a science-specialist keep-set derived from GPQA-diamond pass-traces. Same router, attention, and norms as base, plus the mandatory shared-FFN α=1.2 upweight every coder variant carries.

A coder that kept all its science: it holds v6-coder’s top-tier code profile while pulling GPQA-diamond to 70.71%+9.6pp over v6-coder and at parity with the unpruned 128e (67.17 on the same Q6_K run). For maximal coding throughput with science left at baseline, see the sibling v7-coderx.

Full model card & methodology: ManniX-ITA/gemma-4-A4B-98e-v7-coder-it on Hugging Face.

Other formats: - GGUF (29 tiers, imatrix, CD-* per-layer mixes + F16 + mmproj): ManniX-ITA/gemma-4-A4B-98e-v7-coder-it-GGUF - NVFP4A16 (native vLLM, ~13 GB): ManniX-ITA/gemma-4-A4B-98e-v7-coder-NVFP4A16

Scores (Q6_K, llama.cpp, greedy, same host)

GPQA-DAIMEMATH-500GSM8KARCIFEvalHEHE+LCB-55LCB-100MultiPL-E
70.7176.6792.0093.0094.8095.0098.7892.6896.3697.0088.67

Reference columns on the same Q6_K run: unpruned 128e GPQA 67.17 / AIME 73.33 / HE 97.56 / LCB-55 96.36; v6-coder GPQA 61.11 / HE 98.17 / LCB-55 92.73. v7-coder matches or beats 128e on GPQA, AIME and GSM8K while holding the cohort code profile. (Small benches — GPQA 198q, AIME 30q — carry run-to-run variance; read GPQA/AIME as “recovered the science gap”, not a robust win over the base.)

Quantizations — HE+ / MultiPL-E-100 score, size & answer length

Every K-quant and CD tier was scored on HumanEval+ (164) and MultiPL-E-100 (llama.cpp, greedy T=0), with per-problem completion length from token_stats. bpw is the true bits-per-weight (8 × bytes ÷ 19,877,953,946). ⭐ marks a recommended pick.

TierSize (GB)bpwHE+ %HE+ tok p50/p90/maxMPE-100 %MPE tok p50/p90/max
Q8_021.168.5291.46231/440/179688.6783/190/1012
Q6_K_L17.987.2492.07233/443/103990.0084/188/976
Q6_K17.817.1793.29238/451/123889.3384/188/975
Q5_K_L15.256.1491.46236/455/1620988.6784/193/1013
Q5_K_M15.076.0791.46240/463/636788.6788/190/1011
Q5_K_S14.195.7190.24231/466/475489.0088/199/1011
Q4_K_L13.425.4092.68243/455/204788.6784/188/1013
Q4_K_M13.245.3392.68254/539/1621088.0085/189/1013
Q4_112.615.0890.85233/442/406188.6787/175/704
Q4_K_S12.214.9191.46242/461/195288.3384/185/1012
IQ4_NL11.424.6090.85234/429/142987.6785/186/1011
Q4_011.424.6089.02242/555/351387.0087/212/1012
qat-Q4_011.424.6093.29234/508/1621087.0084/181/974
IQ4_XS11.014.4390.85238/438/290588.6783/185/883
Q3_K_L10.944.4091.46236/459/181988.3386/207/1017
CD-qat-Q4_K_M10.834.3692.68234/483/328387.6789/204/1011
Q3_K_XL10.694.3090.85233/427/252488.0086/200/1013
Q3_K_M10.514.2390.85232/438/352888.0087/207/957
CD-Q3_K_L10.224.1191.46234/558/205489.6786/218/1013
Q3_K_S9.683.8989.02257/620/1620884.0093/223/1017
CD-Q2_K8.823.5591.46237/568/312487.6790/234/1013
Q2_K_L8.583.4586.59240/1141/1622082.3395/459/1017
IQ2_XS7.773.1373.78249/15917/1629372.0091/626/1016

Recommended picks:

  • Q6_K ⭐ (17.81 GB) — max quality — 93.29% HE+ (17.8 GB).
  • qat-Q4_0 ⭐ (11.42 GB) — ties the top HE+ (93.29%) at 11.4 GB — QAT base.
  • CD-qat-Q4_K_M ⭐ (10.83 GB) — recommended default — 92.68% HE+ at 10.8 GB.
  • CD-Q2_K ⭐ (8.82 GB) — smallest tier still in the 90%+ band — 91.46% HE+ at 8.8 GB.

The K-quant and CD tiers hold HE+ in the 90–93% band with length essentially identical to Q6_K; the 2-bit Q2_K_L / IQ2_XS tiers are the cliff (HE+ into the 80s/70s, token p90 blows out). The K-quant CD tiers are the recommended low-bit path — CD-IQ* i-quant bodies are not offered (the pruned MoE degenerates on an IQ-family body, score → 0). Prefer Q4_K_M or higher for production.

Head-to-head by file size — v7-coder vs Qwen2.5-Coder-14B (iso-disk)

Pairing by tier name is misleading — this is a ~20.8B-total MoE and Qwen2.5-Coder-14B is a 14.7B dense model, so the same tier name lands at a different size. The fair comparison is iso-disk: at a given GB budget, which scores higher on HumanEval+? Same rig (RTX 3090, llama.cpp, greedy). Qwen GGUFs are bartowski’s (83–85% across the ladder). At every band the MoE runs lower bpw at the same disk and still scores higher.

Disk bandQwen2.5-Coder-14B (size / bpw / HE+)v7-coder best (size / bpw / HE+)Δ HE+
~21.2 GB(none — Qwen ceiling Q8_0 15.70 GB)Q8_0 21.16 / 8.52 / 91.46%new top
~17.8 GB(none — Qwen ceiling Q8_0 15.70 GB)Q6_K 17.81 / 7.17 / 93.29%new top
~15.1 GBQ8_0 15.70 / 8.54 / 84.76%Q5_K_M 15.07 / 6.07 / 91.46%+6.70
~13.2 GBQ6_K 12.12 / 6.60 / 84.76%Q4_K_M 13.24 / 5.33 / 92.68%+7.92
~12.2 GBQ6_K 12.12 / 6.60 / 84.76%Q4_K_S 12.21 / 4.91 / 91.46%+6.70
~11.4 GBQ5_K_M 10.51 / 5.72 / 83.54%qat-Q4_0 11.42 / 4.60 / 93.29% — ⭐ ties top HE++9.75
~10.8 GBQ5_K_M 10.51 / 5.72 / 83.54%CD-qat-Q4_K_M 10.83 / 4.36 / 92.68% — ⭐ recommended+9.14
~10.5 GBQ5_K_M 10.51 / 5.72 / 83.54%Q3_K_M 10.51 / 4.23 / 90.85% — iso-disk (same 10.5 GB)+7.31
~10.2 GBQ5_K_M 10.51 / 5.72 / 83.54%CD-Q3_K_L 10.22 / 4.11 / 91.46%+7.92
~8.8 GBQ4_K_M 8.99 / 4.89 / 85.37%CD-Q2_K 8.82 / 3.55 / 91.46% — ⭐ smallest 90%++6.09

Pull

ollama pull mannix/gemma4-98e-v7-coder                  # :latest = Q4_K_M
ollama pull mannix/gemma4-98e-v7-coder:CD-qat-Q4_K_M    # ⭐ recommended — 92.68% HE+, 10.8 GB
ollama pull mannix/gemma4-98e-v7-coder:Q6_K             # max quality — 93.29% HE+ (bench tier)
ollama pull mannix/gemma4-98e-v7-coder:CD-Q2_K          # smallest 90%+ — 91.46% HE+, 8.8 GB
ollama pull mannix/gemma4-98e-v7-coder:vision-Q4_K_M    # + SigLIP vision tower

Inherits Gemma 4’s thinking format — serve with the reasoning parser enabled (--reasoning-format deepseek --reasoning-budget 8192 on llama-server).

Derivative of Gemma 4 — Gemma Terms of Use.