Fietje: An open and efficient LLM for Dutch (chat)
225 Pulls Updated 6 months ago
8394fc31afbe · 3.0GB
model
arch phi2 · parameters 2.78B · quantization Q8_0 · 3.0GB
params
{"num_ctx":2048,"stop":["<|im_start|>","<|im_end|>"]}
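The params above set a 2048-token context window and the two ChatML stop tokens. In Ollama Modelfile syntax, an equivalent configuration would look like the sketch below (the GGUF path is hypothetical):

```
FROM ./fietje-2b-chat-Q8_0.gguf
PARAMETER num_ctx 2048
PARAMETER stop <|im_start|>
PARAMETER stop <|im_end|>
```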
74B
template
{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
156B
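The template above is a Go-template ChatML prompt format (truncated in this listing at the user turn). Assuming the full template follows the usual ChatML convention of closing the user turn and opening an assistant turn, a minimal Python sketch of the rendering — the helper name and example strings are illustrative — would be:

```python
def render_chatml(system: str, prompt: str) -> str:
    """Render a system/user pair into a ChatML prompt string,
    mirroring the conditional structure of the template above."""
    parts = []
    if system:
        parts.append(f"<|im_start|>system\n{system}<|im_end|>\n")
    if prompt:
        parts.append(f"<|im_start|>user\n{prompt}<|im_end|>\n")
    # Assumed continuation of the truncated template: open the
    # assistant turn so the model generates the reply after it.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

print(render_chatml("Je bent een behulpzame assistent.", "Hallo!"))
```

The `stop` tokens in the params section (`<|im_start|>`, `<|im_end|>`) ensure generation halts at the end of the assistant turn rather than continuing into a new one.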
Readme
This repository contains quantized versions of BramVanroy/fietje-2b-chat.
Available quantization types and expected performance differences compared to the base f16 model; higher perplexity = worse (from llama.cpp):
Q3_K_M : 3.07G, +0.2496 ppl @ LLaMA-v1-7B
Q4_K_M : 3.80G, +0.0532 ppl @ LLaMA-v1-7B
Q5_K_M : 4.45G, +0.0122 ppl @ LLaMA-v1-7B
Q6_K : 5.15G, +0.0008 ppl @ LLaMA-v1-7B
Q8_0 : 6.70G, +0.0004 ppl @ LLaMA-v1-7B
F16 : 13.00G @ 7B
Quants were made with release b2777 of llama.cpp.
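For reference, quants like these are produced with llama.cpp's `quantize` tool (its name at release b2777) from an f16 GGUF file; the file names below are hypothetical:

```sh
# Convert an f16 GGUF to Q8_0, the quant type used by this model
./quantize fietje-2b-chat-f16.gguf fietje-2b-chat-Q8_0.gguf Q8_0
```

Note that the sizes and perplexity deltas in the table above are llama.cpp's reference measurements on LLaMA-v1-7B, not on this 2.78B model, so they indicate relative quality loss per quant type rather than the exact file sizes listed at the top of this page.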