mistral-7b

MAX Model


The 7B model released by Mistral AI, updated to version 0.3.

Run this model

  1. Install our magic package manager:

    curl -ssL https://magic.modular.com/ | bash

    Then run the source command that's printed in your terminal.

  2. Install MAX Pipelines to run this model:

    magic global install max-pipelines

  3. Start a local endpoint for mistral/7b:

    max-pipelines serve --huggingface-repo-id mistralai/Mistral-7B-Instruct-v0.2

    The endpoint is ready when you see the URI printed in your terminal:

    Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)

  4. Now open another terminal to send a request using curl:

    curl -N http://0.0.0.0:8000/v1/chat/completions -H "Content-Type: application/json" -d '{
        "model": "mistral/7b",
        "stream": true,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Who won the World Series in 2020?"}
        ]
    }' | grep -o '"content":"[^"]*"' | sed 's/"content":"//g' | sed 's/"//g' | tr -d '\n' | sed 's/\\n/\n/g'
  5. 🎉 Hooray! You’re running Generative AI. Our goal is to make this as easy as possible.
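The grep/sed pipeline in step 4 simply strips the streamed JSON chunks down to the generated text. The same extraction can be sketched in Python; the sample chunks below are illustrative stand-ins for a real streamed response, which follows the OpenAI chat-completions streaming format:

```python
import json

# Illustrative SSE lines; a real response from the endpoint streams many such
# "data:" chunks, terminated by "data: [DONE]".
sample_stream = [
    'data: {"choices": [{"delta": {"content": "The "}}]}',
    'data: {"choices": [{"delta": {"content": "Dodgers"}}]}',
    'data: [DONE]',
]

def collect_text(lines):
    """Concatenate the "content" deltas from a stream of chat-completion chunks."""
    parts = []
    for line in lines:
        payload = line.removeprefix("data: ").strip()
        if not payload or payload == "[DONE]":
            continue  # skip blank keep-alives and the end-of-stream marker
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        parts.append(delta.get("content", ""))
    return "".join(parts)

print(collect_text(sample_stream))  # The Dodgers
```

Parsing the JSON is more robust than the grep/sed approach, which can mangle content that itself contains quotes.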

About

Mistral 7B is a 7-billion-parameter language model released by Mistral AI under the Apache 2.0 license. It excels at instruction-following and text-completion tasks. Key highlights include:

  • Outperforming Llama 2 13B across all benchmarks.
  • Surpassing Llama 1 34B on many benchmarks.
  • Matching CodeLlama 7B performance in code-related tasks while maintaining robust English capabilities.

Versions

Tag             Date         Notes
v0.3 (latest)   05/22/2024   A new version of Mistral 7B that supports function calling.
v0.2            03/23/2024   A minor release of Mistral 7B.
v0.1            09/27/2023   Initial release.

Function calling

Version 0.3 introduces function calling capabilities with raw prompt structuring.

Example raw prompt

[AVAILABLE_TOOLS] [{"type": "function", "function": {"name": "get_current_weather", "description": "Get the current weather", "parameters": {"type": "object", "properties": {"location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}, "format": {"type": "string", "enum": ["celsius", "fahrenheit"], "description": "The temperature unit to use. Infer this from the users location."}}, "required": ["location", "format"]}}}][/AVAILABLE_TOOLS][INST] What is the weather like today in San Francisco [/INST]
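The bracketed sections of this prompt can be assembled programmatically from a JSON tool specification. A minimal sketch (the `build_tool_prompt` helper is hypothetical, not part of MAX or Mistral's tooling):

```python
import json

def build_tool_prompt(tools, user_message):
    """Format a Mistral v0.3 raw function-calling prompt:
    [AVAILABLE_TOOLS] <json tool list>[/AVAILABLE_TOOLS][INST] <message> [/INST]
    """
    return (
        "[AVAILABLE_TOOLS] " + json.dumps(tools)
        + "[/AVAILABLE_TOOLS]"
        + "[INST] " + user_message + " [/INST]"
    )

# The weather tool from the example raw prompt above.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
                },
                "format": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "The temperature unit to use.",
                },
            },
            "required": ["location", "format"],
        },
    },
}

prompt = build_tool_prompt([weather_tool],
                           "What is the weather like today in San Francisco")
print(prompt)
```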

Example response

[TOOL_CALLS] [{"name": "get_current_weather", "arguments": {"location": "San Francisco, CA", "format": "celsius"}}]
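A reply in this format can be parsed and dispatched to a local implementation of the named function. A sketch using a stub `get_current_weather` (hypothetical; a real tool would query a weather API):

```python
import json

def get_current_weather(location, format):
    # Stub for illustration; a real implementation would call a weather service.
    return f"18 degrees {format} in {location}"

# Registry mapping tool names from the prompt to local callables.
TOOLS = {"get_current_weather": get_current_weather}

def dispatch(model_output):
    """Parse a [TOOL_CALLS] response and invoke each named local function."""
    prefix = "[TOOL_CALLS] "
    calls = json.loads(model_output[len(prefix):])
    return [TOOLS[call["name"]](**call["arguments"]) for call in calls]

reply = ('[TOOL_CALLS] [{"name": "get_current_weather", '
         '"arguments": {"location": "San Francisco, CA", "format": "celsius"}}]')
print(dispatch(reply))  # ['18 degrees celsius in San Francisco, CA']
```

The results would then typically be fed back to the model in a follow-up turn so it can compose a natural-language answer.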

Variations

Type       Description
instruct   Instruct models are fine-tuned to follow instructions, and are best used for chat.
text       Text models are the base foundation model without any fine-tuning for conversations, and are best used for simple text completion.

References

HuggingFace

Mistral AI News Release

DETAILS

MODEL CLASS
MAX Model

MAX Models are highly optimized inference pipelines that deliver state-of-the-art performance for a given model on both CPU and GPU. Many of them are among the fastest available implementations of their model.

Browse 18+ MAX Models

MODULAR GITHUB

Modular

CREATED BY

mistralai

MODEL

mistralai/Mistral-7B-Instruct-v0.2

TAGS

arxiv:2310.06825
autotrain_compatible
conversational
endpoints_compatible
finetuned
license:apache-2.0
mistral
pytorch
region:us
safetensors
text-generation
text-generation-inference
transformers

© Copyright Modular Inc 2024