Applications

Application	Launch command
Claude Code	ollama launch claude --model mannix/gemma4-98e-v7-coderx
Codex App	ollama launch codex-app --model mannix/gemma4-98e-v7-coderx
OpenClaw	ollama launch openclaw --model mannix/gemma4-98e-v7-coderx
Hermes Agent	ollama launch hermes --model mannix/gemma4-98e-v7-coderx
Codex	ollama launch codex --model mannix/gemma4-98e-v7-coderx
OpenCode	ollama launch opencode --model mannix/gemma4-98e-v7-coderx

Gemma 4 26B-A4B 98e v7-coderx — code-maximal prune

20.8B params · 98 experts (30 dropped) · ~4B active · code-maximal drop map

A research checkpoint that takes Gemma-4-26B-A4B-it and drops 30 of 128 experts per layer using a code-maximal recipe on the rebuilt v7 competence maps (audited producers, 10 classes): generic-code 3× plus LiveCodeBench-medium 2× on a [24,40] per-layer floor, with no science or multilingual targeting. The router, attention, and norms are the same as in the base model, plus the mandatory shared-FFN α=1.2 upweight that every coder variant carries.

The strongest coder in the cohort: it spends its whole prune budget on code and lands LiveCodeBench-medium-55 at 98.18% and LCB-100 at 99.0% — the highest of any Gemma-4 prune to date, +1.8pp / +2.0pp past the unpruned 128e (96.36 / 97.0). The trade is graduate science (GPQA 48.48). If you need the science back without giving up the code profile, use the sibling v7-coder (GPQA 70.71, LCB-55 96.36).

Full model card & methodology: ManniX-ITA/gemma-4-A4B-98e-v7-coderx-it on Hugging Face.

Other formats:
- GGUF (29 tiers, imatrix, CD-* per-layer mixes + F16 + mmproj): ManniX-ITA/gemma-4-A4B-98e-v7-coderx-it-GGUF
- NVFP4A16 (native vLLM, ~13 GB): ManniX-ITA/gemma-4-A4B-98e-v7-coderx-NVFP4A16

Scores (Q6_K, llama.cpp, greedy, same host)

LCB-55	LCB-100	MultiPL-E	HE	HE+	IFEval	GSM8K	MATH-500	AIME	ARC	GPQA-D
98.18	99.00	90.00	95.73	92.68	95.00	91.00	89.00	70.00	94.28	48.48

Reference points from the same Q6_K run: unpruned 128e LCB-55 96.36 / LCB-100 97.00 / MultiPL-E 90.00; v6-coder LCB-55 92.73 / LCB-100 94.00. v7-coderx leads or ties the cohort on every code/instruction axis (it ties the unpruned 128e on MultiPL-E at 90.00); the cost falls almost entirely on graduate science (GPQA 48.48, vs 67.17 for 128e).

Quantizations — HE+ / MultiPL-E-100 score, size & answer length

Every K-quant and CD tier was scored on HumanEval+ (164) and MultiPL-E-100 (llama.cpp, greedy T=0), with per-problem completion length from token_stats. bpw is the true bits-per-weight (8 × bytes ÷ 19,877,953,946). ⭐ marks a recommended pick.

Tier	Size (GB)	bpw	HE+ %	HE+ tok p50/p90/max	MPE-100 %	MPE tok p50/p90/max
Q8_0	21.16	8.52	90.85	233/430/1391	88.67	83/189/1012
Q6_K_L	17.98	7.24	92.07	236/443/1233	89.00	84/174/1012
Q6_K	17.81	7.17	92.07	236/440/1335	90.67	83/178/973
Q5_K_L	15.25	6.14	90.24	230/448/5932	89.33	85/188/935
Q5_K_M	15.07	6.07	90.85	232/463/5316	88.33	85/194/1013
Q5_K_S	14.19	5.71	92.07	235/466/3979	87.67	86/196/1013
Q4_K_L	13.42	5.40	92.07	245/476/2814	88.33	84/179/1012
⭐ Q4_K_M	13.24	5.33	93.29	241/445/11365	89.00	86/183/1003
Q4_1	12.61	5.08	92.68	223/450/3495	89.00	85/170/826
Q4_K_S	12.21	4.91	91.46	242/448/1749	87.67	84/185/1011
IQ4_NL	11.42	4.60	90.24	230/439/1908	89.00	85/173/724
Q4_0	11.42	4.60	92.07	251/531/15918	85.67	85/192/1012
IQ4_XS	11.01	4.43	90.85	234/431/1977	90.33	85/185/920
Q3_K_L	10.94	4.40	92.07	234/439/2498	88.00	84/200/1013
CD-qat-Q4_K_M	10.83	4.36	90.85	242/508/4383	86.00	87/188/507
Q3_K_XL	10.69	4.30	90.85	237/438/1657	88.00	86/196/1009
Q3_K_M	10.51	4.23	92.07	237/440/3068	87.33	87/190/1013
⭐ CD-Q3_K_L	10.22	4.11	93.90	239/504/2671	87.00	86/213/1013
Q3_K_S	9.68	3.89	87.80	250/636/16227	87.67	92/223/1017
⭐ CD-Q2_K	8.82	3.55	90.24	241/492/3072	86.33	92/200/1012
Q2_K_L	8.58	3.45	84.76	248/1480/16218	81.00	99/594/1017
IQ2_XS	7.77	3.13	75.61	251/6383/16239	71.00	94/670/1012
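The bpw column can be reproduced from the Size column with the formula quoted above the table. The following is a minimal sketch, assuming the sizes are decimal gigabytes (1 GB = 10^9 bytes); the function name `bits_per_weight` is illustrative and not part of any tool. Because the table sizes are rounded to two decimals, the results are approximate.

```python
# Weight count used as the bpw denominator (quoted in the text above the table).
WEIGHTS = 19_877_953_946


def bits_per_weight(size_gb: float, weights: int = WEIGHTS) -> float:
    """Return bits per weight for a file of `size_gb` gigabytes."""
    # Assumption: decimal gigabytes (1 GB = 1e9 bytes).
    size_bytes = size_gb * 1e9
    return 8 * size_bytes / weights


# Sizes copied from the table (a subset, for brevity).
sizes_gb = {
    "Q8_0": 21.16,
    "Q6_K": 17.81,
    "Q4_K_M": 13.24,
    "CD-Q3_K_L": 10.22,
    "CD-Q2_K": 8.82,
}

for tier, gb in sizes_gb.items():
    print(f"{tier:10s} {gb:6.2f} GB  {bits_per_weight(gb):.2f} bpw")
```

For these five tiers the computed values round to the bpw figures listed in the table.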

Recommended picks:

Q4_K_M ⭐ (13.24 GB): best K-quant, 93.29% HE+.
CD-Q3_K_L ⭐ (10.22 GB): best overall, highest HE+ of any tier (93.90%).
CD-Q2_K ⭐ (8.82 GB): smallest tier still in the 90%+ band, 90.24% HE+.

Most K-quant and CD tiers hold HE+ between 90% and 94% with median completion length essentially identical to Q6_K; Q3_K_S (87.80%) is the exception. The 2-bit Q2_K_L and IQ2_XS tiers are the cliff: HE+ falls to 84.76% and 75.61%, and token p90 blows out (1480 and 6383, vs 440 for Q6_K). The CD tiers with K-quant bodies are the recommended low-bit path. CD-IQ* i-quant bodies are not offered, because the pruned MoE degenerates on an IQ-family body (score → 0). Prefer Q4_K_M or higher for production.
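One way to turn the table into a choice for a given disk or VRAM budget is to take the highest-scoring tier that fits. The sketch below is only an illustration: it ranks on HE+ alone (a single benchmark), uses a subset of rows copied from the table, and `best_tier_under` is a hypothetical helper name.

```python
# (tier, size in GB, HE+ %) copied from the quantization table; a subset for brevity.
TIERS = [
    ("Q6_K", 17.81, 92.07),
    ("Q4_K_M", 13.24, 93.29),
    ("Q4_K_S", 12.21, 91.46),
    ("CD-Q3_K_L", 10.22, 93.90),
    ("Q3_K_S", 9.68, 87.80),
    ("CD-Q2_K", 8.82, 90.24),
    ("Q2_K_L", 8.58, 84.76),
    ("IQ2_XS", 7.77, 75.61),
]


def best_tier_under(budget_gb, tiers=TIERS):
    """Return the (tier, size, HE+) row with the highest HE+ that fits the budget."""
    fitting = [t for t in tiers if t[1] <= budget_gb]
    if not fitting:
        return None  # nothing fits in this budget
    # Highest HE+ wins; on a tie, prefer the smaller file.
    return max(fitting, key=lambda t: (t[2], -t[1]))


for budget in (24.0, 10.0, 9.0, 7.0):
    print(budget, best_tier_under(budget))
```

With these rows, a 9 GB budget selects CD-Q2_K and any budget of 10.22 GB or more selects CD-Q3_K_L, which agrees with the starred picks on the HE+ axis.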

Head-to-head by file size — v7-coderx vs Qwen2.5-Coder-14B (iso-disk)

Pairing by tier name is misleading: this is a ~20.8B-total MoE and Qwen2.5-Coder-14B is a 14.7B dense model, so the same tier name lands at a different file size. The fair comparison is iso-disk: at a given GB budget, which model scores higher on HumanEval+? Both were run on the same rig (RTX 3090, llama.cpp, greedy). The Qwen GGUFs are bartowski’s and score 83% to 85% across the ladder. In every band where a Qwen file exists, the MoE runs at lower bpw for about the same disk size and still scores higher.

Disk band	Qwen2.5-Coder-14B (size / bpw / HE+)	v7-coderx best (size / bpw / HE+)	Δ HE+
~21.2 GB	(none — Qwen ceiling Q8_0 15.70 GB)	Q8_0 21.16 / 8.52 / 90.85%	new top
~17.8 GB	(none — Qwen ceiling Q8_0 15.70 GB)	Q6_K 17.81 / 7.17 / 92.07%	new top
~15.1 GB	Q8_0 15.70 / 8.54 / 84.76%	Q5_K_M 15.07 / 6.07 / 90.85%	+6.09
~13.2 GB	Q6_K 12.12 / 6.60 / 84.76%	Q4_K_M 13.24 / 5.33 / 93.29% — ⭐ best K-quant	+8.53
~12.2 GB	Q6_K 12.12 / 6.60 / 84.76%	Q4_K_S 12.21 / 4.91 / 91.46%	+6.70
~11.0 GB	Q5_K_M 10.51 / 5.72 / 83.54%	IQ4_XS 11.01 / 4.43 / 90.85%	+7.31
~10.5 GB	Q5_K_M 10.51 / 5.72 / 83.54%	Q3_K_M 10.51 / 4.23 / 92.07% — iso-disk (same 10.5 GB)	+8.53
~10.2 GB	Q5_K_M 10.51 / 5.72 / 83.54%	CD-Q3_K_L 10.22 / 4.11 / 93.90% — ⭐ best overall 93.90%	+10.36
~8.8 GB	Q4_K_M 8.99 / 4.89 / 85.37%	CD-Q2_K 8.82 / 3.55 / 90.24% — ⭐ smallest 90%+	+4.87
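The Δ HE+ column can be recomputed from the two score columns. The sketch below assumes each v7-coderx tier is paired with the Qwen file closest to it in size; that pairing rule is inferred from the rows above rather than stated by the source, and the helper names are hypothetical. The two top bands are left out because they have no Qwen counterpart.

```python
# (tier, size in GB, HE+ %) copied from the iso-disk table.
QWEN = [
    ("Q8_0", 15.70, 84.76),
    ("Q6_K", 12.12, 84.76),
    ("Q5_K_M", 10.51, 83.54),
    ("Q4_K_M", 8.99, 85.37),
]
V7_CODERX = [
    ("Q5_K_M", 15.07, 90.85),
    ("Q4_K_M", 13.24, 93.29),
    ("Q4_K_S", 12.21, 91.46),
    ("IQ4_XS", 11.01, 90.85),
    ("Q3_K_M", 10.51, 92.07),
    ("CD-Q3_K_L", 10.22, 93.90),
    ("CD-Q2_K", 8.82, 90.24),
]


def nearest_by_size(size_gb, candidates):
    """Return the candidate row whose file size is closest to size_gb."""
    return min(candidates, key=lambda c: abs(c[1] - size_gb))


def iso_disk_deltas(ours=V7_CODERX, theirs=QWEN):
    """Pair each of our tiers with the nearest-size rival and return HE+ deltas."""
    rows = []
    for tier, size, score in ours:
        rival = nearest_by_size(size, theirs)
        rows.append((tier, rival[0], round(score - rival[2], 2)))
    return rows


for tier, rival, delta in iso_disk_deltas():
    print(f"{tier:10s} vs Qwen {rival:7s} {delta:+.2f}")
```

Under that assumption the pairings and deltas match the table rows from the ~15.1 GB band down to the ~8.8 GB band.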

Pull

ollama pull mannix/gemma4-98e-v7-coderx                 # :latest = Q4_K_M (best K-quant, 93.29% HE+)
ollama pull mannix/gemma4-98e-v7-coderx:CD-Q3_K_L       # ⭐ best overall — 93.90% HE+, 10.2 GB
ollama pull mannix/gemma4-98e-v7-coderx:Q6_K            # max fidelity (bench tier)
ollama pull mannix/gemma4-98e-v7-coderx:CD-Q2_K         # smallest 90%+ — 90.24% HE+, 8.8 GB
ollama pull mannix/gemma4-98e-v7-coderx:vision-Q4_K_M   # + SigLIP vision tower
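Once a tag is pulled, it can be queried through Ollama's local HTTP API. This is a minimal sketch, not part of the model card: it assumes an Ollama server on the default local port (11434), picks the CD-Q3_K_L tag as an example, and sets temperature 0 to approximate the greedy decoding used in the benchmarks. The function names are illustrative.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Example tag; any tag from the pull list above can be substituted.
MODEL = "mannix/gemma4-98e-v7-coderx:CD-Q3_K_L"


def build_request(prompt, model=MODEL, url=OLLAMA_URL):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,                 # return one JSON object, not a stream
        "options": {"temperature": 0},   # approximate greedy decoding
    }
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt):
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt), timeout=600) as resp:
        return json.loads(resp.read())["response"]


# Usage (requires a running server with the model pulled):
#   print(generate("Write a Python function that reverses a string."))
```

Building the request separately from sending it keeps the payload easy to inspect without a server running.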

Inherits Gemma 4’s thinking format — serve with the reasoning parser enabled (--reasoning-format deepseek --reasoning-budget 8192 on llama-server).

Derivative of Gemma 4 — Gemma Terms of Use.

Gemma-4 98e coder-max variant: top-notch coding skills at the expense of science knowledge.