A curated collection of all ruGPT-3 family models converted to GGUF and packaged for Ollama. These are foundational (base) language models for Russian, originally developed by SberDevices / ai-forever and published in 2021. They are not instruction-tuned: they perform text completion only.
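Because these are base models, prompt them with text to continue rather than with instructions or questions. A minimal sketch (the prompt text is illustrative; any Russian opening phrase works):

```shell
# Base models continue the given text, so supply an opening phrase,
# not a command like "translate this" or "summarize this".
ollama run evilfreelancer/rugpt3:small "Искусственный интеллект — это"
```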
Think of this repository as a museum of classic Russian NLP models, preserved and made accessible with modern tooling. All four sizes from the ruGPT-3 family are available here in quantized GGUF format, ready to run locally via Ollama.
Details in the paper: *A Family of Pretrained Transformer Language Models for Russian*.
ollama run evilfreelancer/rugpt3
All models share a 2048-token context window and use a BPE tokenizer with 50,264 tokens.
Based on ai-forever/rugpt3small_based_on_gpt2. Standard GPT-2 architecture pretrained on 80B tokens of Russian text for ~3 epochs, then finetuned with a 2048-token context.
| Tag | Quantization | Size |
|---|---|---|
| small, 125m, 125m-q8_0, small-q8_0 | Q8_0 | 182 MB |
| 125m-fp16, small-fp16 | FP16 | 334 MB |
Based on ai-forever/rugpt3medium_based_on_gpt2. Pretrained on 80B tokens for 3 epochs, then finetuned with a 2048-token context. Test perplexity: 17.4.
| Tag | Quantization | Size |
|---|---|---|
| medium, 356m, 356m-q8_0, medium-q8_0 | Q8_0 | 443 MB |
| 356m-fp16, medium-fp16 | FP16 | 823 MB |
Based on ai-forever/rugpt3large_based_on_gpt2. Pretrained on 80B tokens for 3 epochs, then finetuned for 1 epoch with a 2048-token context. Test perplexity: 13.6.
| Tag | Quantization | Size |
|---|---|---|
| large, 760m, 760m-q8_0, large-q8_0 | Q8_0 | 904 MB |
| 760m-fp16, large-fp16 | FP16 | 1.7 GB |
Based on ai-forever/rugpt3xl. A deeply modified GPT-2 architecture (Pre-LayerNorm, fused QKV projections, Megatron-LM style). Trained from scratch on 80B tokens for 4 epochs using DeepSpeed + Megatron-LM, then finetuned with a 2048-token context. Test perplexity: 12.05.
The XL variant was converted from the original Megatron-LM checkpoint to HuggingFace format and then to GGUF. More quantization options are available due to its larger size.
| Tag | Quantization | Size |
|---|---|---|
| xl, 1.3b, 1.3b-q8_0, xl-q8_0, latest | Q8_0 | 1.5 GB |
| 1.3b-q4_k_m, xl-q4_k_m | Q4_K_M | 935 MB |
| 1.3b-q4_0, xl-q4_0 | Q4_0 | 845 MB |
| 1.3b-fp16, xl-fp16 | FP16 | 2.9 GB |
Run with a specific size:
ollama run evilfreelancer/rugpt3:small
ollama run evilfreelancer/rugpt3:medium
ollama run evilfreelancer/rugpt3:large
ollama run evilfreelancer/rugpt3:xl
Run a specific quantization:
ollama run evilfreelancer/rugpt3:1.3b-q4_k_m
ollama run evilfreelancer/rugpt3:760m-fp16
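The models can also be queried over Ollama's HTTP API once the server is running (default port 11434). A sketch assuming a local install; the prompt and model tag are illustrative:

```shell
# Request a completion from a locally running Ollama server.
# "stream": false returns the whole response as one JSON object.
curl -s http://localhost:11434/api/generate -d '{
  "model": "evilfreelancer/rugpt3:small",
  "prompt": "Москва — столица",
  "stream": false
}'
```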