fauxpaslife

AI Researcher @ delta : kitsune

nanbeige4.1

3B model that shouldn't be this good - crushes benchmarks through deep chain-of-thought reasoning

1,432 Pulls 1 Tag Updated 3 months ago
nanbeige4.1-python-deepthink

Fine-tuned version of Nanbeige 4.1 3B specialized for Python code generation with direct, focused output.

3b

599 Pulls 3 Tags Updated 3 months ago
codellama-python-13b-q6

CodeLlama variant fine-tuned specifically for Python code generation. 13B parameters, Q6_K quantization (very high quality, minimal loss).

515 Pulls 1 Tag Updated 3 months ago
gemma-4-E4B-it-textonly-sysprmpt-Q4_K_M

A structurally extracted, text-only iteration of Google's multimodal gemma-4-E4B-it model. Vision and audio encoders have been fully decoupled to minimize VRAM footprint for text-centric workloads. A system prompt is included to address the lost abilities.

tools thinking

417 Pulls 1 Tag Updated 1 month ago
arch-router

A specialized 1.5B parameter model for intelligent routing between multiple LLMs based on domain and action preferences.

1.5b

162 Pulls 1 Tag Updated 3 months ago
squishy

A tiny 150M completion model trained from scratch for short story generation and small-model pipeline validation. It is best at generating small story prompts, in the style of a young child telling short stories, and it can be cute at times.

150m

79 Pulls 1 Tag Updated 1 month ago
properly

Properly is the proofreader that doesn't steal your voice. A LoRA fine-tune of Gemma 1b-it, trained on Grammarly's CoEdIT dataset.

28 Pulls 1 Tag Updated 1 month ago
Deimos-A1

Deimos A1 is a concise chain-of-thought (CCoT) fine-tune of Qwen3.5-4B. It produces dense, stepwise <think> blocks averaging ~1/8 the tokens of the base model while improving accuracy on every reasoning benchmark measured.

26 Pulls 1 Tag Updated 1 month ago
MINT-empathy-Qwen3-4B

MINT (Multi-turn Inter-tactic Novelty Training) model for empathic dialogue, fine-tuned from Qwen/Qwen3-4B.

12 Pulls 1 Tag Updated 1 month ago