For language translation. A GGUF model import from https://huggingface.co/TheBloke/ALMA-13B-GGUF
752 Pulls Updated 6 months ago
972ba8ee8843 · 14GB
ALMA
ALMA (Advanced Language Model-based trAnslator) is an LLM-based translation model.
This is an import of the GGUF versions of the ALMA-13B model. More information is available at: https://huggingface.co/TheBloke/ALMA-13B-GGUF
Tags
The default when pulling the latest tag is the Q4_K_M version. Please see the table below for more versions.
Name | Quant method | Bits | Size | Max RAM required | Use case |
---|---|---|---|---|---|
winkefinger/alma-13b:Q2_K | Q2_K | 2 | 5.43 GB | 7.93 GB | smallest, significant quality loss - not recommended for most purposes |
winkefinger/alma-13b:Q3_K_S | Q3_K_S | 3 | 5.66 GB | 8.16 GB | very small, high quality loss |
winkefinger/alma-13b:Q3_K_M | Q3_K_M | 3 | 6.34 GB | 8.84 GB | very small, high quality loss |
winkefinger/alma-13b:Q3_K_L | Q3_K_L | 3 | 6.93 GB | 9.43 GB | small, substantial quality loss |
winkefinger/alma-13b:Q4_0 | Q4_0 | 4 | 7.37 GB | 9.87 GB | legacy; small, very high quality loss - prefer using Q3_K_M |
winkefinger/alma-13b:Q4_K_S | Q4_K_S | 4 | 7.41 GB | 9.91 GB | small, greater quality loss |
winkefinger/alma-13b:Q4_K_M | Q4_K_M | 4 | 7.87 GB | 10.37 GB | medium, balanced quality - recommended - the default when pulling latest (or without tag name) |
winkefinger/alma-13b:Q5_0 | Q5_0 | 5 | 8.97 GB | 11.47 GB | legacy; medium, balanced quality - prefer using Q4_K_M |
winkefinger/alma-13b:Q5_K_S | Q5_K_S | 5 | 8.97 GB | 11.47 GB | large, low quality loss - recommended |
winkefinger/alma-13b:Q5_K_M | Q5_K_M | 5 | 9.23 GB | 11.73 GB | large, very low quality loss - recommended |
winkefinger/alma-13b:Q6_K | Q6_K | 6 | 10.68 GB | 13.18 GB | very large, extremely low quality loss |
winkefinger/alma-13b:Q8_0 | Q8_0 | 8 | 13.83 GB | 16.33 GB | very large, extremely low quality loss - not recommended |
Note: The above RAM figures assume no GPU offloading. If layers are offloaded to the GPU, RAM usage drops and VRAM is used instead.
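In every row of the table, the Max RAM figure equals the file size plus 2.5 GB. The following is a minimal Python sketch, assuming that pattern holds and that no layers are offloaded to the GPU, which picks the largest quantization whose listed RAM requirement fits a given budget. The helper names `max_ram_gb` and `largest_fitting_tag` are hypothetical and chosen only for illustration.

```python
# File sizes in GB, copied from the table above.
SIZES_GB = {
    "Q2_K": 5.43, "Q3_K_S": 5.66, "Q3_K_M": 6.34, "Q3_K_L": 6.93,
    "Q4_0": 7.37, "Q4_K_S": 7.41, "Q4_K_M": 7.87, "Q5_0": 8.97,
    "Q5_K_S": 8.97, "Q5_K_M": 9.23, "Q6_K": 10.68, "Q8_0": 13.83,
}

# Observed in the table: Max RAM = size + 2.5 GB (CPU only, no GPU offloading).
OVERHEAD_GB = 2.5


def max_ram_gb(quant):
    """Return the listed Max RAM figure for a quantization method."""
    return round(SIZES_GB[quant] + OVERHEAD_GB, 2)


def largest_fitting_tag(ram_budget_gb):
    """Return the largest tag whose listed Max RAM fits the budget, or None."""
    fitting = [q for q in SIZES_GB if max_ram_gb(q) <= ram_budget_gb]
    if not fitting:
        return None
    best = max(fitting, key=lambda q: SIZES_GB[q])
    return f"winkefinger/alma-13b:{best}"


if __name__ == "__main__":
    print(largest_fitting_tag(12))
```

This only compares against the table's figures. It does not account for quality trade-offs, so a legacy format such as Q5_0 may still be a worse choice than a K-quant of the same size.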
No specific adjustments have been made to the model files.
Usage
This works with the following prompt template:
Translate this from German to English:
German: {prompt}
English:
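The template above can be filled programmatically. The sketch below is one possible way to do so in Python, not an official client. It makes several assumptions: that a local Ollama server is listening on its default address `http://localhost:11434`, that the same template shape works when the language names are swapped, and that sending the prompt with `raw` set to true (bypassing any template built into the model) is appropriate. If the imported model already applies this template itself, pass only the source text instead. The function names `build_prompt` and `translate` are hypothetical.

```python
import json
import urllib.request


def build_prompt(text, source="German", target="English"):
    """Fill the prompt template shown above with the text and language names."""
    return f"Translate this from {source} to {target}:\n{source}: {text}\n{target}:"


def translate(text, source="German", target="English",
              model="winkefinger/alma-13b",
              url="http://localhost:11434/api/generate"):
    """Send the filled template to a local Ollama server and return the reply."""
    payload = {
        "model": model,
        "prompt": build_prompt(text, source, target),
        "raw": True,      # assumption: skip any template built into the model
        "stream": False,  # ask for a single JSON object instead of a stream
    }
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"].strip()


if __name__ == "__main__":
    # Printing the prompt needs no server; translate() needs Ollama running.
    print(build_prompt("Guten Morgen"))
```

Calling `translate("Guten Morgen")` requires the model to have been pulled first and the Ollama server to be running.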
ALMA officially supports 10 translation directions: English↔German, English↔Czech, English↔Icelandic, English↔Chinese, English↔Russian.
Some other languages may also work, but they are not officially supported.
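The count of 10 comes from five language pairs, each usable in both directions. As a small illustration, the Python sketch below enumerates them and checks whether a requested direction is on the official list. The names `PAIRS`, `supported_directions`, and `is_officially_supported` are hypothetical.

```python
# The five officially supported language pairs; each works in both directions.
PAIRS = [
    ("English", "German"),
    ("English", "Czech"),
    ("English", "Icelandic"),
    ("English", "Chinese"),
    ("English", "Russian"),
]


def supported_directions():
    """Expand each pair into its two (source, target) directions."""
    directions = []
    for first, second in PAIRS:
        directions.append((first, second))
        directions.append((second, first))
    return directions


def is_officially_supported(source, target):
    """Return True if translating source -> target is on the official list."""
    return (source, target) in supported_directions()


if __name__ == "__main__":
    print(len(supported_directions()))
```

A direction that is not on the list, such as German to Czech, may still produce output, but nothing here indicates how good that output will be.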
License
The GGUF models from https://huggingface.co/TheBloke/ALMA-13B-GGUF use the MIT license. They are based on the original (non-GGUF) ALMA-13B model.
Those models are in turn based on Llama 2, so the Meta Llama 2 license might also apply.
For more information, see the GGUF version page or the original model page.