Llama-3.1-Tulu-3 models excel across diverse tasks, built with advanced post-training techniques and fully open-source data.
MAX Models are popular open-source models converted to MAX's native graph format. Models carrying the MAX label are either state of the art (SOTA) or under active development. Learn more about MAX Models.
Browse all MAX Models
MAX GITHUB
Modular / MAX
BASE MODEL
allenai
allenai/Llama-3.1-Tulu-3.1-8B
QUANTIZED BY
brittlewis12
brittlewis12/Llama-3.1-Tulu-3.1-8B-GGUF
QUESTIONS ABOUT THIS MODEL?
Leave a comment
PROBLEMS WITH THE CODE?
File an Issue
Install our magic package manager:
curl -ssL https://magic.modular.com/ | bash
Then run the source command that's printed in your terminal.
Install MAX pipelines in order to run this model:
magic global install max-pipelines && magic global update
Start a local endpoint for Llama-3.1-Tulu-3.1-8B (Q4_K_M):
max-pipelines serve --huggingface-repo-id=allenai/Llama-3.1-Tulu-3.1-8B \
--weight-path=brittlewis12/Llama-3.1-Tulu-3.1-8B-GGUF/llama-3.1-tulu-3.1-8b.Q4_K_M.gguf
The endpoint is ready when you see the URI printed in your terminal:
Server ready on http://0.0.0.0:8000 (Press CTRL+C to quit)
Now open another terminal to send a request using curl:
curl -N http://0.0.0.0:8000/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "allenai/Llama-3.1-Tulu-3.1-8B",
  "stream": true,
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who won the World Series in 2020?"}
  ]
}' | grep -o '"content":"[^"]*"' | sed 's/"content":"//g' | sed 's/"//g' | tr -d '\n'
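If you prefer Python to the grep/sed pipeline above, the streamed response is easy to handle directly. Here is a minimal sketch using only the standard library; the payload and the `data: {...}` chunk shape follow the OpenAI chat-completions schema the endpoint exposes, while the helper names are our own:

```python
import json

def build_chat_request(model, user_msg, stream=True):
    # Same payload shape as the curl example above.
    return {
        "model": model,
        "stream": stream,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_msg},
        ],
    }

def extract_content(sse_line):
    # Pull the token text out of one "data: {...}" SSE chunk;
    # returns "" for non-data lines and the final "data: [DONE]" marker.
    if not sse_line.startswith("data: ") or sse_line.endswith("[DONE]"):
        return ""
    chunk = json.loads(sse_line[len("data: "):])
    return chunk["choices"][0]["delta"].get("content", "")

payload = build_chat_request("allenai/Llama-3.1-Tulu-3.1-8B",
                             "Who won the World Series in 2020?")
# POST json.dumps(payload) to http://0.0.0.0:8000/v1/chat/completions and
# feed each line of the streamed response through extract_content.
print(extract_content('data: {"choices":[{"delta":{"content":"The"}}]}'))  # The
```

This replaces the fragile text munging with real JSON parsing, so quoted strings inside the model's answer come through intact.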
🎉 Hooray! You're running Generative AI. Our goal is to make this as easy as possible.
Tülu 3 redefines instruction-following models, offering a fully open-source package of data and tools. It is designed to deliver top-tier performance across a range of applications, including MATH, GSM8K, and IFEval tasks.
Version Update: The latest Tülu model upgrades involve an improved RL training phase, showcasing significant strides in efficacy.
| Stage | Llama 3.1 8B (new) | Llama 3.1 70B |
|---|---|---|
| Base Model | meta-llama/Llama-3.1-8B | meta-llama/Llama-3.1-70B |
| SFT | allenai/Llama-3.1-Tulu-3-8B-SFT | allenai/Llama-3.1-Tulu-3-70B-SFT |
| DPO | allenai/Llama-3.1-Tulu-3-8B-DPO | allenai/Llama-3.1-Tulu-3-70B-DPO |
| Final Model (RLVR) | allenai/Llama-3.1-Tulu-3.1-8B | allenai/Llama-3.1-Tulu-3-70B |
| Reward Model (RM) | None with GRPO | allenai/Llama-3.1-Tulu-3-8B-RM |
| Stage | Llama 3.1 405B |
|---|---|
| Base Model | meta-llama/llama-3.1-405B |
| SFT | allenai/llama-3.1-Tulu-3-405B-SFT |
| DPO | allenai/llama-3.1-Tulu-3-405B-DPO |
| Final Model (RLVR) | allenai/llama-3.1-Tulu-3-405B |
| Reward Model (RM) | Same as 70B |
You may need to alter the `--max-length=` and `--max-batch-size=` parameters depending on the amount of memory you have access to.
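For example, a memory-constrained serve invocation might look like the following sketch; the `4096` and `1` values are illustrative placeholders, not tuned recommendations:

```shell
# Cap the context window and the batch size to fit a smaller memory budget.
max-pipelines serve --huggingface-repo-id=allenai/Llama-3.1-Tulu-3.1-8B \
  --weight-path=brittlewis12/Llama-3.1-Tulu-3.1-8B-GGUF/llama-3.1-tulu-3.1-8b.Q4_K_M.gguf \
  --max-length=4096 \
  --max-batch-size=1
```

Lowering `--max-length` shrinks the KV cache, which is usually the dominant memory cost at long contexts.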
Formatted user-assistant interaction examples are embedded in the tokenizer for easy use, including a default system prompt for Ai2 demos.
The Tülu 3 models have limited safety training and might produce problematic outputs. The training corpus details are unclear, likely involving a mix of web data and technical sources.
| Benchmark (eval) | Tülu 3 SFT 8B | Tülu 3.1 8B (new) | Llama 3.1 8B Instruct | Qwen 2.5 7B Instruct |
|---|---|---|---|---|
| Avg. | 60.4 | 66.3 | 62.2 | 66.5 |
| MMLU (0-shot, CoT) | 65.9 | 69.5 | 71.2 | 76.6 |
| GSM8K (8-shot, CoT) | 76.2 | 90.0 | 83.4 | 83.8 |
High scores and notable improvements emphasize the model's versatile performance across various benchmarks, including MMLU and GSM8K.
Key settings for RLVR with GRPO, along with training curves for Llama-3.1-Tulu-3.1-8B showing improvement milestones on the key evaluation metrics, are documented in the upstream model card.
All Llama 3.1 Tülu 3 models are under the Llama 3.1 Community License Agreement. They are intended primarily for research and educational purposes. For more information, refer to the Responsible Use Guidelines.
If Tülu 3 or related materials assisted your work, please cite:
@article{lambert2024tulu3,
title = {Tülu 3: Pushing Frontiers in Open Language Model Post-Training},
author = {Nathan Lambert et al.},
year = {2024},
email = {tulu@allenai.org}
}
version | 3 |
tensor_count | 292 |
kv_count | 45 |
general.architecture | llama |
general.type | model |
general.name | Llama 3.1 Tülu 3.1 8B |
general.author | AllenAI |
general.organization | AllenAI |
general.finetune | DPO |
general.basename | Llama-3.1-Tulu-3 |
general.quantized_by | Britt Lewis <britt@bl3.dev> |
general.size_label | 8B |
general.license | llama3.1 |
general.repo_url | https://huggingface.co/brittlewis12/Llama-3.1-Tulu-3.1-8B-GGUF |
general.source.repo_url | https://huggingface.co/allenai/Llama-3.1-Tulu-3.1-8B |
general.base_model.count | 1 |
general.base_model.0.name | Llama 3.1 Tulu 3 8B DPO |
general.base_model.0.organization | Allenai |
general.base_model.0.repo_url | https://huggingface.co/allenai/Llama-3.1-Tulu-3-8B-DPO |
general.dataset.count | 1 |
general.dataset.0.name | RLVR GSM MATH IF Mixed Constraints |
general.dataset.0.organization | Allenai |
general.dataset.0.repo_url | https://huggingface.co/allenai/RLVR-GSM-MATH-IF-Mixed-Constraints |
general.tags.0 | text-generation |
general.languages.0 | en |
llama.block_count | 32 |
llama.context_length | 131072 |
llama.embedding_length | 4096 |
llama.feed_forward_length | 14336 |
llama.attention.head_count | 32 |
llama.attention.head_count_kv | 8 |
llama.rope.freq_base | 500000 |
llama.attention.layer_norm_rms_epsilon | 0.000009999999747378752 |
llama.attention.key_length | 128 |
llama.attention.value_length | 128 |
llama.vocab_size | 128264 |
llama.rope.dimension_count | 128 |
general.quantization_version | 2 |
general.file_type | 15 |
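The version, tensor_count, and kv_count rows come from GGUF's fixed file header, which is simple to parse by hand. The following is a sketch of that header layout per the GGUF v3 specification, checked here against synthetic bytes rather than the real file:

```python
import struct

# GGUF v3 file header: 4-byte magic, uint32 version, uint64 tensor count,
# uint64 metadata key/value count -- all little-endian. These are exactly
# the version / tensor_count / kv_count rows in the table above.
HEADER = struct.Struct("<4sIQQ")

def read_gguf_header(blob):
    magic, version, tensors, kvs = HEADER.unpack_from(blob)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return {"version": version, "tensor_count": tensors, "kv_count": kvs}

# Synthetic header bytes matching this file's metadata (no download needed).
blob = HEADER.pack(b"GGUF", 3, 292, 45)
print(read_gguf_header(blob))  # {'version': 3, 'tensor_count': 292, 'kv_count': 45}
```

In practice you would read the first 24 bytes of the `.gguf` file and pass them to `read_gguf_header`; the remaining metadata keys above are stored as typed key/value pairs immediately after the header.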
Version: 8B CPU Q4_K_M
You can quickly deploy Llama-3.1-Tulu-3.1-8B to an endpoint using our MAX container. It includes the latest version of MAX with GPU support and our Python-based inference server called MAX Serve. With the following Docker command, you'll get an OpenAI-compatible endpoint running Llama-3.1-Tulu-3.1-8B:
docker run --gpus 1 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_HUB_ENABLE_HF_TRANSFER=1" \
--env "HF_TOKEN=" \
-p 8000:8000 \
docker.modular.com/modular/max-openai-api:nightly \
--huggingface-repo-id allenai/Llama-3.1-Tulu-3.1-8B \
--weight-path=brittlewis12/Llama-3.1-Tulu-3.1-8B-GGUF/llama-3.1-tulu-3.1-8b.Q4_K_M.gguf
In order to download the model from Hugging Face, you just need to fill in the HF_TOKEN value with your access token, unless the model is from https://huggingface.co/modularai.
For more information about the container image, see the MAX container documentation.
To learn more about how to deploy MAX to the cloud, check out our MAX Serve tutorials.
Llama 3.1 Version Release Date: July 23, 2024
By clicking "I Accept" below or by using or distributing any portion or element of the Llama Materials, you agree to be bound by this Agreement.
i. If you distribute or make available the Llama Materials (or any derivative works thereof), or a product or service (including another AI model) that contains any of them, you shall (A) provide a copy of this Agreement with any such Llama Materials; and (B) prominently display "Built with Llama" on a related website, user interface, blogpost, about page, or product documentation. If you use the Llama Materials or any outputs or results of the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include "Llama" at the beginning of any such AI model name.
ii. If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you.
iii. You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a "Notice" text file distributed as a part of such copies: "Llama 3.1 is licensed under the Llama 3.1 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved."
b. Subject to Meta's ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications.
c. If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 3.1 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials.
© Copyright Modular Inc 2025