Llama 3 8B Instruct with num_ctx fixed to 8192 and a corrected EOS token. The model is ready to use with its full 8k context window.



Meta-Llama-3-8B-Instruct

Model Quants                 Size    Bits  Perplexity
llama3-8b-instruct:Q4_0      4.7 GB  4     +0.2166 ppl
llama3-8b-instruct:Q4_K_M    4.9 GB  4     +0.0532 ppl
llama3-8b-instruct:Q5_K_M    5.7 GB  5     +0.0122 ppl
llama3-8b-instruct:Q6_K      6.6 GB  6     +0.0008 ppl

Config

"max_position_embeddings": 8192
"rope_theta": 500000.0
"vocab_size": 128256

Remarks
  • The 'latest' tag points to Q4_0.
  • The Modelfile sets num_ctx to 8192 (Ollama's default is only 2048).
  • The EOS token is fixed, so responses no longer repeat.
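
The num_ctx remark can also be checked per request. Below is a minimal Python sketch, assuming a local Ollama server at its default address (http://localhost:11434) and the tag name as it appears in the table above; a published model may need a namespace prefix in front of the tag. The helper names build_payload and generate are hypothetical, chosen only for illustration.

```python
import json
import urllib.request

# Assumption: a local Ollama server listening on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Assumption: tag copied from the table; a namespace prefix may be required.
MODEL_TAG = "llama3-8b-instruct:Q4_K_M"


def build_payload(prompt, num_ctx=8192, model=MODEL_TAG):
    """Build a non-streaming /api/generate body with an explicit context size."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx},
    }


def generate(prompt, num_ctx=8192, url=OLLAMA_URL):
    """Send the request to a running Ollama server and return the generated text."""
    body = json.dumps(build_payload(prompt, num_ctx)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

With the server running, generate("Hello") should return the model's reply as a string. Since the Modelfile already sets num_ctx to 8192, passing it in options is optional here; the sketch only makes the setting explicit.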