It uses a single Q4_K_M-imat (4.89 BPW) quant, intended for context sizes up to 12288 on GPUs with less than 8 GB of VRAM.
Vision
8B
803 Pulls • Updated 4 months ago
1 Tag
e3c3c83ca732 • 5.5GB • Updated 4 months ago