Versions of command-r using either the default system prompt or the RAG/tool-use prompt suggested by Cohere.
In the Hugging Face `CohereForAI/c4ai-command-r-v01` repo, two system prompts are specified in `tokenizer_config.json`:
“Default”:

> You are Command-R, a brilliant, sophisticated, AI-assistant trained to assist human users by providing thorough responses. You are trained by Cohere.
“Tool_use” and “Rag” are the same:

> ## Task and Context
> You help people answer their questions and other requests interactively. You will be asked a very wide array of requests on all kinds of topics. You will be equipped with a wide range of search engines or similar tools to help you, which you use to research your answer. You should focus on serving the user's needs as best you can, which will be wide-ranging.
>
> ## Style Guide
> Unless the user asks for a different style of answer, you should answer in full sentences, using proper grammar and spelling.
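If you want to pull these prompts straight from the source, a minimal sketch along these lines should work. It assumes the repo stores its chat templates in `tokenizer_config.json` as a named list (the exact field layout may differ between revisions):

```python
# Minimal sketch: fetch tokenizer_config.json and print the named chat
# templates, which embed the system prompts quoted above. Assumes the
# "chat_template" field is a list of {"name": ..., "template": ...} entries.
import json
from huggingface_hub import hf_hub_download

path = hf_hub_download("CohereForAI/c4ai-command-r-v01", "tokenizer_config.json")
with open(path) as f:
    config = json.load(f)

for entry in config["chat_template"]:
    print(entry["name"])
    print(entry["template"][:300])  # system prompts are embedded in the templates
    print("---")
```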
I am offering both options here in case they are useful. In my testing they sometimes make a difference, though it is not entirely clear to me how or why. I benchmarked using an example script from the langroid-examples repo, `examples/docqa/chat-multi-extract-local.py`. This script uses multiple agents to extract information from a simple lease document. The document is short, and both a human and GPT-4-turbo can easily extract the requested information from it, yet I have found that many models fail to get the right answers. Results using command-r with the different prompts are below (a sketch of pointing langroid at a locally served model follows after the table):
Model | Quant | System prompt | Temperature | Start date | End date | Rent | Deposit | Address |
---|---|---|---|---|---|---|---|---|
command-r | Q2_K | Default | 0.2 | ❔ | ❔ | ❔ | ❔ | ❔ |
command-r | Q2_K | Tool_use | 0.2 | ❔ | ❔ | ❔ | ❌ | ❌ |
command-r | Q3_K_L | Default | 0.2 | ❔ | ❔ | ❔ | ❔ | ❔ |
command-r | Q3_K_L | Tool_use | 0.2 | ❔ | ❔ | ❔ | ❔ | ❔ |
command-r | Q3_K_M | Default | 0.2 | ❔ | ✅ | ✅ | ❔ | ❔ |
command-r | Q3_K_M | Tool_use | 0.2 | ❔ | ❔ | ❔ | ❔ | ❔ |
command-r | Q3_K_S | Default | 0.2 | ❔ | ✅ | ✅ | ✅ | ✅ |
command-r | Q3_K_S | Tool_use | 0.2 | ✅ | ✅ | ✅ | ❔ | ☑️ |
command-r | Q4_0 | Default | 0.2 | ✅ | ✅ | ✅ | ❔ | ✅ |
command-r | Q4_0 | Tool_use | 0.2 | ✅ | ✅ | ✅ | ✅ | ✅ |
command-r | Q4_1 | Default | 0.2 | ✅ | ✅ | ✅ | ✅ | ✅ |
command-r | Q4_1 | Tool_use | 0.2 | ✅ | ✅ | ✅ | ✅ | ☑️ |
command-r | Q4_K_M | Default | 0.2 | ✅ | ✅ | ✅ | ✅ | ✅ |
command-r | Q4_K_M | Tool_use | 0.2 | ✅ | ✅ | ✅ | ✅ | ✅ |
command-r | Q4_K_S | Default | 0.2 | ✅ | ✅ | ☑️ | ❔ | ✅ |
command-r | Q4_K_S | Tool_use | 0.2 | ✅ | ✅ | ✅ | ❔ | ✅ |
command-r | Q5_1 | Default | 0.2 | ✅ | ✅ | ✅ | ✅ | ☑️ |
command-r | Q5_1 | Tool_use | 0.2 | ✅ | ✅ | ✅ | ❌ | ✅ |
command-r | Q5_K_M | Default | 0.2 | ✅ | ✅ | ✅ | ✅ | ✅ |
command-r | Q5_K_M | Tool_use | 0.2 | ✅ | ✅ | ✅ | ❌ | ✅ |
command-r | Q5_K_S | Default | 0.2 | ✅ | ✅ | ✅ | ❌ | ☑️ |
command-r | Q5_K_S | Tool_use | 0.2 | ✅ | ✅ | ✅ | ❌ | ☑️ |
command-r | Q6_K | Default | 0.2 | ✅ | ✅ | ✅ | ❌ | ✅ |
command-r | Q6_K | Tool_use | 0.2 | ✅ | ✅ | ✅ | ✅ | ✅ |
command-r | Q8_0 | Default | 0.2 | ✅ | ✅ | ✅ | ✅ | ✅ |
command-r | Q8_0 | Tool_use | 0.2 | ✅ | ✅ | ✅ | ✅ | ☑️ |
command-r | f16 | Default | 0.2 | ✅ | ✅ | ✅ | ❔ | ✅ |
command-r | f16 | Tool_use | 0.2 | ✅ | ✅ | ✅ | ✅ | ✅ |
Key:
- ✅: Correct answer (an address missing a zip code is still accepted)
- ☑️: Incomplete but otherwise correct answer (address is missing the state)
- ❌: Incorrect answer
- ❔: Answer given is “DO NOT KNOW” or something similar
Note: the results above use the unaltered script. You may get better results by, for example, increasing the number of question variants beyond two.
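For reference, this is roughly how a locally served model is wired into langroid. A minimal sketch (not the benchmark script itself), assuming langroid's `ollama/<model>` naming convention for local models; the tag `command-r` stands in for whichever variant you have pulled:

```python
# Minimal sketch, assuming langroid's "ollama/<model>" convention for
# locally served models. The model tag below is a placeholder for
# whichever command-r variant you have pulled into Ollama.
import langroid as lr
import langroid.language_models as lm

llm_config = lm.OpenAIGPTConfig(
    chat_model="ollama/command-r",  # placeholder local model tag
    temperature=0.2,                # same temperature as used in the table
)
agent = lr.ChatAgent(lr.ChatAgentConfig(llm=llm_config))
reply = agent.llm_response("What is the start date of the lease?")
print(reply.content)
```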
More information on prompting Command-R can be found in Cohere's documentation.