koesn / llama3-8b-instruct

Fixes num_ctx to 8192 and corrects the EOS token. This Llama 3 8B Instruct model is ready to use with the model's full 8k context window.

389 Pulls · Updated 6 months ago

c40cd0e317f3 · 4.7GB

parameters 8.03B

quantization Q4_0

You are a helpful, smart, kind, and efficient AI assistant. You always fulfill the user's requests

{"num_ctx":8192,"num_keep":24,"stop":["\u003c|start_header_id|\u003e","\u003c|end_header_id|\u003e",

{{ if .System }}<|start_header_id|>system<|end_header_id|> {{ .System }}<|eot_id|>{{ end }}{{ if .

Readme

Meta-Llama-3-8B-Instruct

Model quant	Size	Bits	Perplexity increase
llama3-8b-instruct:Q4_0	4.7GB	4	+0.2166 ppl
llama3-8b-instruct:Q4_K_M	4.9GB	4	+0.0532 ppl
llama3-8b-instruct:Q5_K_M	5.7GB	5	+0.0122 ppl
llama3-8b-instruct:Q6_K	6.6GB	6	+0.0008 ppl
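
The table trades file size against perplexity increase. As a rough sketch of how one might choose a tag from it, the helper below picks the quant with the smallest listed perplexity increase that fits a size budget. The function name `pick_quant` and the idea of a "budget" are illustrative assumptions, and the file size is only a lower bound on memory use, since the context cache needs additional room.

```python
# Rows copied from the table above: (tag, file size in GB, perplexity increase).
QUANTS = [
    ("llama3-8b-instruct:Q4_0", 4.7, 0.2166),
    ("llama3-8b-instruct:Q4_K_M", 4.9, 0.0532),
    ("llama3-8b-instruct:Q5_K_M", 5.7, 0.0122),
    ("llama3-8b-instruct:Q6_K", 6.6, 0.0008),
]


def pick_quant(budget_gb):
    """Return the tag with the lowest perplexity increase whose file fits budget_gb.

    Hypothetical helper: returns None when no listed quant fits the budget.
    """
    candidates = [row for row in QUANTS if row[1] <= budget_gb]
    if not candidates:
        return None
    # Smallest perplexity increase wins among the quants that fit.
    return min(candidates, key=lambda row: row[2])[0]
```

For example, `pick_quant(5.0)` selects the Q4_K_M tag, because Q4_0 also fits but has a larger listed perplexity increase.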

Config

"max_position_embeddings": 8192
"rope_theta": 500000.0
"vocab_size": 128256

Remarks

The 'latest' tag points to Q4_0.
The Modelfile sets num_ctx to 8192 (Ollama's default is only 2048).
The EOS token is fixed, so responses no longer repeat.
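
Although the Modelfile already sets num_ctx, the context size can also be passed per request. The following is a minimal sketch, assuming a local Ollama server on its default port and assuming the model name `koesn/llama3-8b-instruct` (inferred from the page title, not confirmed by the source). The helper names `build_payload` and `generate` are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Assumption: model name inferred from the page title.
MODEL = "koesn/llama3-8b-instruct"


def build_payload(prompt, num_ctx=8192, model=MODEL):
    """Build the JSON body for a non-streaming generate request."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON object instead of a stream
        "options": {"num_ctx": num_ctx},  # per-request context window
    }


def generate(prompt, num_ctx=8192, url=OLLAMA_URL):
    """Send the request and return the generated text (needs a running server)."""
    body = json.dumps(build_payload(prompt, num_ctx)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["response"]
```

Calling `generate("Why is the sky blue?")` would then return the model's reply as a string, provided the server is running and the model has been pulled.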