Version: PyTorch, F16, GPU

This version is not quantized and a GPU is recommended.

Install our magic package manager:
```
curl -ssL https://magic.modular.com/ | bash
```
Then run the source command that's printed in your terminal.

Install MAX Pipelines to run this model:

```
magic global install max-pipelines && magic global update
```

Start a local endpoint for aya-23-8B:

```
max-pipelines serve --huggingface-repo-id=CohereForAI/aya-23-8B
```

The endpoint is ready when you see the URI printed in your terminal:

```
Server ready on http://0.0.0.0:8000 (Press CTRL+C to quit)
```
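If you start the server from a script rather than watching the terminal, you need some other signal that it is up. The sketch below polls the TCP port until a connection succeeds. It assumes that an open port on 8000 is a reasonable proxy for the "Server ready" message, which this page does not confirm; `wait_for_port` is a hypothetical helper, not part of MAX.

```python
import socket
import time


def wait_for_port(host: str = "0.0.0.0", port: int = 8000,
                  timeout_s: float = 300.0, interval_s: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Assumption: the server only accepts connections once it is ready to serve.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

Calling `wait_for_port()` with the defaults matches the address shown in the log line above and gives the model up to five minutes to load.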

Now open another terminal to send a request using curl:

```
curl -N http://0.0.0.0:8000/v1/chat/completions -H "Content-Type: application/json" -d '{
    "model": "CohereForAI/aya-23-8B",
    "stream": true,
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the World Series in 2020?"}
    ]
}' | grep -o '"content":"[^"]*"' | sed 's/"content":"//g' | sed 's/"//g' | tr -d '\n' | sed 's/\\n//g'
```
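The same request can be sent from Python using only the standard library. This is a minimal sketch, not an official client: it assumes the stream follows the OpenAI server-sent-events convention (`data: {json}` lines carrying `choices[0].delta.content`, ending with `data: [DONE]`), which is what the grep for `"content"` in the curl pipeline suggests. The function names are hypothetical.

```python
import json
import urllib.request
from typing import Iterator, Optional

BASE_URL = "http://0.0.0.0:8000"  # address printed by max-pipelines serve
MODEL = "CohereForAI/aya-23-8B"


def build_chat_payload(user_prompt: str, stream: bool = True) -> dict:
    """Build the same request body as the curl example."""
    return {
        "model": MODEL,
        "stream": stream,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_prompt},
        ],
    }


def extract_delta(line: str) -> Optional[str]:
    """Return the text fragment carried by one streamed line, or None.

    Assumption: OpenAI-style chunks, i.e. 'data: {json}' lines and a final
    'data: [DONE]' marker.
    """
    line = line.strip()
    if not line.startswith("data:"):
        return None
    data = line[len("data:"):].strip()
    if not data or data == "[DONE]":
        return None
    chunk = json.loads(data)
    choices = chunk.get("choices") or []
    if not choices:
        return None
    return (choices[0].get("delta") or {}).get("content")


def stream_chat(user_prompt: str, base_url: str = BASE_URL) -> Iterator[str]:
    """POST the chat request and yield text fragments as they arrive."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_payload(user_prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        for raw_line in response:
            text = extract_delta(raw_line.decode("utf-8"))
            if text:
                yield text
```

With the server running, iterating over `stream_chat("Who won the World Series in 2020?")` and printing each fragment reproduces the curl output. Parsing the JSON instead of grepping for `"content"` also keeps quotes and escaped characters inside the reply intact.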

🎉 Hooray! You’re running Generative AI. Our goal is to make this as easy as possible.


Aya-23-8B

Model Summary

Aya 23 is a research-focused language model with advanced multilingual capabilities. It pairs pre-trained models from the Command family with the Aya Collection dataset. Developed by Cohere For AI and Cohere, Aya 23 supports 23 languages, including Arabic, Chinese, Czech, Dutch, and English; the full list appears under Model Details.

Developed by: Cohere For AI and Cohere
Contact: cohere.for.ai
License: CC-BY-NC; compliance with C4AI's Acceptable Use Policy is required.
Model: aya-23-8B
Size: 8 billion parameters

Model Details

Input: Text only
Output: Generates text
Architecture: Auto-regressive language model using an optimized transformer architecture.
Languages Supported: Arabic, Chinese, Czech, Dutch, English, French, German, Greek, Hebrew, Hindi, Indonesian, Italian, Japanese, Korean, Persian, Polish, Portuguese, Romanian, Russian, Spanish, Turkish, Ukrainian, Vietnamese
Context length: 8192 tokens

Evaluation

Performance metrics are illustrated through multilingual benchmarks and win rate analysis.

Terms of Use

Aya 23 aims to improve accessibility for community-based research by providing a high-performing multilingual model. It is governed by a CC-BY-NC License with an acceptable use addendum, requiring adherence to C4AI's Acceptable Use Policy.

Citation Info

@misc{aryabumi2024aya,
      title={Aya 23: Open Weight Releases to Further Multilingual Progress},
      author={Viraat Aryabumi and John Dang and Dwarak Talupuru and Saurabh Dash and David Cairuz and Hangyu Lin and Bharat Venkitesh and Madeline Smith and Kelly Marchisio and Sebastian Ruder and Acyr Locatelli and Julia Kreutzer and Nick Frosst and Phil Blunsom and Marzieh Fadaee and Ahmet Üstün and Sara Hooker},
      year={2024},
      eprint={2405.15032},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Metadata

architectures.0	CohereForCausalLM
model_type	cohere

Version: 8B GPU F16

This code works on compatible Linux machines.
We are actively working on enabling MAX Serve for macOS ARM64 as well.

You can quickly deploy aya-23-8B to an endpoint using our MAX container. It includes the latest version of MAX with GPU support and our Python-based inference server called MAX Serve.

With the following Docker command, you’ll get an OpenAI-compatible endpoint running aya-23-8B:

```
docker run --gpus 1 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_HUB_ENABLE_HF_TRANSFER=1" \
    --env "HF_TOKEN=" \
    -p 8000:8000 \
    docker.modular.com/modular/max-openai-api:nightly \
    --huggingface-repo-id CohereForAI/aya-23-8B
```

To download the model from Hugging Face, set the HF_TOKEN value to your access token. The token is not needed for models hosted at https://huggingface.co/modularai.
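If you launch the container from a script, you can read the token from the environment instead of pasting it into the command. The sketch below assembles the same `docker run` arguments as a list; `docker_run_args` is a hypothetical helper, and reading `HF_TOKEN` from the calling process's environment is an assumption about your setup, not something this page prescribes.

```python
import os
from typing import List, Optional

IMAGE = "docker.modular.com/modular/max-openai-api:nightly"


def docker_run_args(repo_id: str, hf_token: Optional[str] = None,
                    port: int = 8000) -> List[str]:
    """Assemble the docker run command shown above as an argv list.

    If hf_token is not given, fall back to the HF_TOKEN environment
    variable (empty string when unset, as in the original command).
    """
    token = hf_token if hf_token is not None else os.environ.get("HF_TOKEN", "")
    cache = os.path.expanduser("~/.cache/huggingface")
    return [
        "docker", "run", "--gpus", "1",
        "-v", f"{cache}:/root/.cache/huggingface",
        "--env", "HF_HUB_ENABLE_HF_TRANSFER=1",
        "--env", f"HF_TOKEN={token}",
        "-p", f"{port}:8000",  # host port mapped to the container's 8000
        IMAGE,
        "--huggingface-repo-id", repo_id,
    ]
```

Passing the result to `subprocess.run(docker_run_args("CohereForAI/aya-23-8B"), check=True)` starts the container. Using an argument list rather than a shell string avoids quoting problems if the token contains special characters.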

Learn more

For more information about the container image, see the MAX container documentation.

To learn more about how to deploy MAX to the cloud, check out our MAX Serve tutorials.

