C4AI Command R+ is an advanced multilingual AI model with 104 billion parameters, trained for reasoning, summarization, and question answering.
Version: 4B GPU: F16
This version is not quantized and a GPU is recommended.
Install our magic package manager:
curl -ssL https://magic.modular.com/ | bash
Then run the source command that's printed in your terminal.
Install MAX Pipelines to run this model:
magic global install max-pipelines && magic global update
Start a local endpoint for c4ai-command-r-plus/4B:
max-pipelines serve --huggingface-repo-id=CohereForAI/c4ai-command-r-plus-4bit
The endpoint is ready when you see the URI printed in your terminal:
Server ready on http://0.0.0.0:8000 (Press CTRL+C to quit)
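If you would rather script the wait than watch the terminal, the readiness check can be sketched as a small polling loop. This is a minimal sketch, not part of MAX: `wait_for_endpoint` is a hypothetical helper name, and it assumes the server answers plain HTTP requests at the address shown above once it is up. It treats any HTTP response, even an error status, as "ready".

```shell
# wait_for_endpoint URL [RETRIES] [DELAY_SECONDS]
# Hypothetical helper (not part of MAX). Returns 0 as soon as the server
# answers an HTTP request at URL, or 1 after RETRIES failed attempts.
wait_for_endpoint() {
  url="$1"
  retries="${2:-60}"   # assumed default: 60 attempts
  delay="${3:-5}"      # assumed default: 5 seconds between attempts
  attempt=1
  while [ "$attempt" -le "$retries" ]; do
    # Without --fail, curl exits 0 for any HTTP response, including 404.
    if curl --silent --output /dev/null --max-time 5 "$url"; then
      return 0
    fi
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  return 1
}
```

For example, `wait_for_endpoint http://0.0.0.0:8000/ && echo "endpoint is up"` blocks until the server responds or the retries run out. Large models can take a long time to load, so the retry count and delay may need to be raised.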
Now open another terminal to send a request using curl:
curl -N http://0.0.0.0:8000/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "CohereForAI/c4ai-command-r-plus-4bit",
"stream": true,
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Who won the World Series in 2020?"}
]
}' | grep -o '"content":"[^"]*"' | sed 's/"content":"//g' | sed 's/"//g' | tr -d '\n' | sed 's/\\n//g'
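The filter that follows the request is easier to read when wrapped in a function. The sketch below shows what each stage does; `extract_content` is a hypothetical name, and the two sample lines are a simplified assumption about what streamed chunks look like, not captured server output.

```shell
# extract_content: reads streamed chat-completion chunks on stdin and
# prints only the concatenated "content" strings.
extract_content() {
  grep -o '"content":"[^"]*"' |   # keep each "content":"..." pair
    sed 's/"content":"//g' |      # drop the leading key and opening quote
    sed 's/"//g' |                # drop the closing quote
    tr -d '\n' |                  # join all chunks into one line
    sed 's/\\n//g'                # strip literal backslash-n escapes in the text
}

# Simplified sample chunks (assumed shape, for illustration only).
printf '%s\n' \
  'data: {"choices":[{"delta":{"content":"Hello"}}]}' \
  'data: {"choices":[{"delta":{"content":" world"}}]}' | extract_content
# prints: Hello world
```

Because the pattern `[^"]*` stops at the first double quote, a chunk whose text contains an escaped quote is cut short. For anything beyond a quick smoke test, a real JSON parser is the safer choice.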
🎉 Hooray! You’re running Generative AI. Our goal is to make this as easy as possible.
DETAILS
MAX GitHub: Modular / MAX
Model: CohereForAI/c4ai-command-r-plus-4bit (published by CohereForAI)
© Copyright - Modular Inc - 2025