AI Researcher @ delta : kitsune
-
nanbeige4.1
3B model that shouldn't be this good - crushes benchmarks through deep chain-of-thought reasoning
1,432 Pulls 1 Tag Updated 3 months ago
-
nanbeige4.1-python-deepthink
Fine-tuned version of Nanbeige 4.1 3B specialized for Python code generation with direct, focused output.
3b 599 Pulls 3 Tags Updated 3 months ago
-
codellama-python-13b-q6
CodeLlama variant fine-tuned for Python code generation. 13B parameters, Q6_K quantization (very high quality, minimal loss).
515 Pulls 1 Tag Updated 3 months ago
-
gemma-4-E4B-it-textonly-sysprmpt-Q4_K_M
A structurally extracted, text-only iteration of Google's multimodal gemma-4-E4B-it model. Vision and audio encoders have been fully decoupled to minimize VRAM footprint for text-centric workloads. A system prompt is included to address the abilities lost in extraction.
tools thinking 417 Pulls 1 Tag Updated 1 month ago
-
arch-router
A specialized 1.5B parameter model for intelligent routing between multiple LLMs based on domain and action preferences.
1.5b 162 Pulls 1 Tag Updated 3 months ago
-
squishy
A tiny 150M completion model trained from scratch for short story generation and small-model pipeline validation. It is best suited to generating short story prompts. (Think of a young child telling little stories, cute at times.)
150m 79 Pulls 1 Tag Updated 1 month ago
-
properly
Properly is the proofreader that doesn't steal your voice. Trained on Grammarly's CoEdit dataset, using LoRA on Gemma 1b-it.
28 Pulls 1 Tag Updated 1 month ago
-
Deimos-A1
Deimos A1 is a concise chain-of-thought (CCoT) fine-tune of Qwen3.5-4B. It produces dense, stepwise <think> blocks averaging ~1/8 the tokens of the base model while improving accuracy on every reasoning benchmark measured.
26 Pulls 1 Tag Updated 1 month ago
-
MINT-empathy-Qwen3-4B
MINT (Multi-turn Inter-tactic Novelty Training) model for empathic dialogue, fine-tuned from Qwen/Qwen3-4B.
12 Pulls 1 Tag Updated 1 month ago