MoE implementation based on cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser
13B
222 Pulls Updated 8 months ago
6a342d3f0558 · 9.1GB
model
arch llama · parameters 12.9B · quantization Q5_K_M
9.1GB
template
<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
107B
system
You are a helpful coding assistant that can help users write code in various programming languages. You can generate code from scratch, debug existing code, and answer questions about coding concepts. You can also suggest improvements to code and help users find the best libraries and tools for their needs. Do your best to help with whatever the user asks.
382B
params
{"num_ctx":16384,"repeat_penalty":1.1,"stop":["</s>","USER:","ASSSISTANT:","[INST]","[/INST]","<<SYS>>","<</SYS>>","<|im_end|>"],"temperature":0.7}
208B
Readme
Description
This model is a medium-sized MoE implementation based on cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser.
A 2x7B configuration offers better performance than a standard 7B model, even when loaded in 4-bit (about 9 GB of VRAM).
Loaded in 4-bit, this 2x7B model scores 0.8270 on HellaSwag, which is higher than the base model achieves on its own in full precision.
| Name | Quant method | Bits | Size (GB) | Max RAM required (GB) | Use case |
| --- | --- | --- | --- | --- | --- |
| exer/laser-dolphin-mixtral:2x7b-dpo-q5_K_M | Q5_K_M | 5 | 9.13 | 11.63 | large, very low quality loss - recommended |
| exer/laser-dolphin-mixtral:2x7b-dpo-q6_K | Q6_K | 6 | 10.57 | 13.07 | very large, extremely low quality loss |
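To try one of these tags locally, you can pull it and send a request through the Ollama REST API. The snippet below is a minimal sketch, not part of this model card: it assumes an Ollama server running on the default localhost:11434, and the prompt is made up for illustration.

```python
# Sketch: pull the recommended q5_K_M tag and run one generation.
# Assumes a local Ollama server on the default port; the prompt is illustrative.
import requests

BASE = "http://localhost:11434"
MODEL = "exer/laser-dolphin-mixtral:2x7b-dpo-q5_K_M"

# Pull the model; with stream=False the call blocks until the download finishes.
requests.post(f"{BASE}/api/pull", json={"name": MODEL, "stream": False}, timeout=None)

# Single completion; the baked-in template, system prompt, and params are applied automatically.
resp = requests.post(
    f"{BASE}/api/generate",
    json={
        "model": MODEL,
        "prompt": "Write a Python function that reverses a string.",
        "stream": False,
    },
    timeout=600,
)
print(resp.json()["response"])
```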
Prompt Format
This model uses the same ChatML prompt format as the base dolphin model.
Prompt format:
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
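When the model is called through Ollama's /api/chat endpoint, this template is filled in automatically from the system and user messages, so the <|im_start|> tokens never need to be written by hand. The following is a minimal sketch, again assuming a local server on the default port and the q5_K_M tag from the table above; the options dict shows how the defaults from the params block (num_ctx, temperature, repeat_penalty) can be overridden per request.

```python
# Sketch: chat-style request; Ollama applies the ChatML template itself.
# Assumes a local Ollama server and the q5_K_M tag listed in the table above.
import requests

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "exer/laser-dolphin-mixtral:2x7b-dpo-q5_K_M",
        "messages": [
            {"role": "system", "content": "You are a helpful coding assistant."},
            {"role": "user", "content": "Explain what a Python list comprehension is."},
        ],
        "stream": False,
        # Per-request overrides of the defaults shown in the params block above.
        "options": {"num_ctx": 16384, "temperature": 0.7, "repeat_penalty": 1.1},
    },
    timeout=600,
)
print(resp.json()["message"]["content"])
```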