Llama-3.1-Tulu-3.1-8B-Q4_K_M

MAX Model
Q4_K_M
CPU
  1. Install our magic package manager:

    curl -ssL https://magic.modular.com/ | bash

    Then run the source command that's printed in your terminal.

  2. Install MAX Pipelines (the max-pipelines command) to run this model:

    magic global install max-pipelines && magic global update
  3. Start a local endpoint for Llama-3.1-Tulu-3.1-8B-Q4_K_M:

    max-pipelines serve --huggingface-repo-id=allenai/Llama-3.1-Tulu-3.1-8B \
    --weight-path=brittlewis12/Llama-3.1-Tulu-3.1-8B-GGUF/llama-3.1-tulu-3.1-8b.Q4_K_M.gguf

    The endpoint is ready when you see the URI printed in your terminal:

    Server ready on http://0.0.0.0:8000 (Press CTRL+C to quit)
  4. Now open another terminal to send a request using curl:

    curl -N http://0.0.0.0:8000/v1/chat/completions -H "Content-Type: application/json" -d '{
        "model": "allenai/Llama-3.1-Tulu-3.1-8B",
        "stream": true,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Who won the World Series in 2020?"}
        ]
    }' | grep -o '"content":"[^"]*"' | sed 's/"content":"//g' | sed 's/"//g' | tr -d '\n' | sed 's/\n//g'
  5. 🎉 Hooray! You’re running Generative AI. Our goal is to make this as easy as possible.
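
The filter at the end of the curl command in step 4 can be tried without a running server. The sketch below wraps the same grep/sed/tr idea in a helper and feeds it two hand-written chunks. The function name `extract_content` is hypothetical, and the sample lines are illustrative, assuming the endpoint streams OpenAI-style `data: {...}` chunks; they are not captured server output.

```shell
#!/bin/sh
# Hypothetical helper (not part of max-pipelines): reads streamed chunks on
# stdin and prints only the text fragments found in their "content" fields.
extract_content() {
    grep -o '"content":"[^"]*"' |       # keep only the "content":"..." pairs
        sed 's/"content":"//; s/"$//' | # strip the key and the closing quote
        tr -d '\n'                      # join the fragments into one line
}

# Illustrative input, assuming OpenAI-style streaming chunks.
printf '%s\n' \
    'data: {"choices":[{"delta":{"content":"The "}}]}' \
    'data: {"choices":[{"delta":{"content":"Dodgers"}}]}' \
    'data: [DONE]' |
    extract_content
echo
```

Because the pattern `[^"]*` stops at the first double quote, a fragment that contains an escaped quote (`\"`) is cut short. The filter is a convenience for reading output in a terminal, not a JSON parser.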

Deploy this model to cloud

DETAILS

Chat

MODEL CLASS
MAX Model

MAX Models are popular open-source models converted to MAX’s native graph format. Any model with this label is either SOTA or still being worked on. Learn more about MAX Models.

Browse all MAX Models

HARDWARE
CPU
QUANTIZATION
Q4_K_M
ARCHITECTURE
MAX Model

MAX GITHUB

Modular / MAX

BASE MODEL

allenai

allenai/Llama-3.1-Tulu-3.1-8B

QUANTIZED BY

brittlewis12

brittlewis12/Llama-3.1-Tulu-3.1-8B-GGUF


TAGS

transformers / safetensors / llama / text-generation / conversational / en / dataset:allenai/RLVR-GSM-MATH-IF-Mixed-Constraints / arxiv:2411.15124 / base_model:allenai/Llama-3.1-Tulu-3-8B-DPO / base_model:finetune:allenai/Llama-3.1-Tulu-3-8B-DPO / license:llama3.1 / autotrain_compatible / text-generation-inference / endpoints_compatible / region:us


@ Copyright - Modular Inc - 2025