MoE implementation based on cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser
13B
222 Pulls Updated 8 months ago
6a342d3f0558 · 9.1GB
model
arch llama · parameters 12.9B · quantization Q5_K_M
9.1GB
template
<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
107B
system
You are a helpful coding assistant that can help users write code in various programming languages. You can generate code from scratch, debug existing code, and answer questions about coding concepts. You can also suggest improvements to code and help users find the best libraries and tools for their needs. Do your best to help with whatever the user asks.
382B
params
{"num_ctx":16384,"repeat_penalty":1.1,"stop":["</s>","USER:","ASSSISTANT:","[INST]","[/INST]","<<SYS>>","<</SYS>>","<|im_end|>"],"temperature":0.7}
208B
Readme
Description
This model is a medium-sized MoE implementation based on cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser.
A 2x7B configuration offers better performance than a standard 7B model, even when loaded in 4-bit (about 9 GB of VRAM).
Loaded in 4-bit, this 2x7B model scores 0.8270 on HellaSwag, which is higher than the base model achieves on its own in full precision.
| Name | Quant method | Bits | Size (GB) | Max RAM required (GB) | Use case |
| --- | --- | --- | --- | --- | --- |
| exer/laser-dolphin-mixtral:2x7b-dpo-q5_K_M | Q5_K_M | 5 | 9.13 | 11.63 | large, very low quality loss - recommended |
| exer/laser-dolphin-mixtral:2x7b-dpo-q6_K | Q6_K | 6 | 10.57 | 13.07 | very large, extremely low quality loss |
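To try one of these tags locally, you can pull it and send a request through the Ollama REST API. The snippet below is a minimal sketch, not part of this model card: it assumes an Ollama server running on the default localhost:11434, and the prompt is made up for illustration.

```python
# Sketch: pull the recommended q5_K_M tag and run one generation.
# Assumes a local Ollama server on the default port; the prompt is illustrative.
import requests

BASE = "http://localhost:11434"
MODEL = "exer/laser-dolphin-mixtral:2x7b-dpo-q5_K_M"

# Pull the model; with stream=False the call blocks until the download finishes.
requests.post(f"{BASE}/api/pull", json={"name": MODEL, "stream": False}, timeout=None)

# Single completion; the baked-in template, system prompt, and params are applied automatically.
resp = requests.post(
    f"{BASE}/api/generate",
    json={
        "model": MODEL,
        "prompt": "Write a Python function that reverses a string.",
        "stream": False,
    },
    timeout=600,
)
print(resp.json()["response"])
```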
Prompt Format
This model uses the same ChatML prompt format as the base dolphin model.
Prompt format:
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
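When the model is called through Ollama's /api/chat endpoint, this template is filled in automatically from the system and user messages, so the <|im_start|> tokens never need to be written by hand. The following is a minimal sketch, again assuming a local server on the default port and the q5_K_M tag from the table above; the options dict shows how the defaults from the params block (num_ctx, temperature, repeat_penalty) can be overridden per request.

```python
# Sketch: chat-style request; Ollama applies the ChatML template itself.
# Assumes a local Ollama server and the q5_K_M tag listed in the table above.
import requests

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "exer/laser-dolphin-mixtral:2x7b-dpo-q5_K_M",
        "messages": [
            {"role": "system", "content": "You are a helpful coding assistant."},
            {"role": "user", "content": "Explain what a Python list comprehension is."},
        ],
        "stream": False,
        # Per-request overrides of the defaults shown in the params block above.
        "options": {"num_ctx": 16384, "temperature": 0.7, "repeat_penalty": 1.1},
    },
    timeout=600,
)
print(resp.json()["message"]["content"])
```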