Dolphin 3.0, a versatile AI model, empowers businesses with customizable and private solutions.
MAX Models are popular open-source models converted to MAX’s native graph format. Models carrying the MAX label are either state of the art (SOTA) or actively being optimized. Learn more about MAX Models.
Browse all MAX Models
MAX GITHUB: Modular / MAX
BASE MODEL: cognitivecomputations/Dolphin3.0-Llama3.1-8B
QUANTIZED BY: cognitivecomputations/Dolphin3.0-Llama3.1-8B-GGUF
QUESTIONS ABOUT THIS MODEL? Leave a comment
PROBLEMS WITH THE CODE? File an Issue
Choose Version (3 versions)
Install our magic package manager:
curl -ssL https://magic.modular.com/ | bash
Then run the source command that's printed in your terminal.
Install MAX Pipelines to run this model:
magic global install max-pipelines && magic global update
Start a local endpoint for Dolphin3.0-Llama3.1-8B-Q4_K_M:
max-pipelines serve --huggingface-repo-id=cognitivecomputations/Dolphin3.0-Llama3.1-8B \
--weight-path=bartowski/Dolphin3.0-Llama3.1-8B-GGUF/Dolphin3.0-Llama3.1-8B-Q4_K_M.gguf
The endpoint is ready when you see the URI printed in your terminal:
Server ready on http://0.0.0.0:8000 (Press CTRL+C to quit)
Now open another terminal to send a request using curl:
curl -N http://0.0.0.0:8000/v1/chat/completions -H "Content-Type: application/json" -d '{
"model": "cognitivecomputations/Dolphin3.0-Llama3.1-8B",
"stream": true,
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Who won the World Series in 2020?"}
]
}' | grep -o '"content":"[^"]*"' | sed 's/"content":"//g' | sed 's/"//g' | tr -d '
' | sed 's/\n//g'
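The pipeline above scrapes text out of the streaming response. If you only need the final answer, a non-streaming request is simpler to parse; the sketch below assumes python3 is available and relies on the OpenAI-compatible response shape, where the text lives at choices[0].message.content:
# Non-streaming variant: extract the answer from the JSON response (assumes python3 is installed)
curl http://0.0.0.0:8000/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "cognitivecomputations/Dolphin3.0-Llama3.1-8B",
  "stream": false,
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who won the World Series in 2020?"}
  ]
}' | python3 -c 'import json,sys; print(json.load(sys.stdin)["choices"][0]["message"]["content"])'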
🎉 Hooray! You’re running Generative AI. Our goal is to make this as easy as possible.
Dolphin 3.0, part of the Dolphin 3.0 Collection, is a state-of-the-art general-purpose model. Trained by Eric Hartford, Ben Gitter, BlouseJury, and Cognitive Computations, it aims to provide flexible and powerful AI capabilities for businesses. Dolphin 3.0 excels at coding, math, function calling, and more, filling a role similar to models like ChatGPT or Claude but with greater control for users.
Dolphin 3.0 enhances broadly applicable instruct-tuned models, making it ideal for a wide range of tasks. Businesses can utilize Dolphin's features without losing control over the system prompt, model versions, or data privacy, ensuring full autonomy and adaptability.
The system prompt allows users to tailor responses with specific character settings, moods, and behavior guidelines to suit various applications.
ChatML is employed as the chat template, facilitating smooth interaction.
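As a concrete example, the system message in a /v1/chat/completions request is where that tailoring happens; the persona below is purely illustrative:
# Example: steering responses with a custom system prompt (persona text is illustrative)
curl http://0.0.0.0:8000/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "cognitivecomputations/Dolphin3.0-Llama3.1-8B",
  "messages": [
    {"role": "system", "content": "You are Dolphin, a concise technical assistant. Answer in at most three sentences."},
    {"role": "user", "content": "Explain what a GGUF file is."}
  ]
}'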
Several other platforms can run the model as well. For example, with Ollama:
ollama run cognitivecomputations/Dolphin3.0-Llama3.1-8B-GGUF:Q4_0
Inside the interactive Ollama session, /set system lets you supply a custom system prompt.
Evaluation details will be available soon.
Thanks go to the dataset creators and other key contributors: Meta, Qwen, and OpenCoder for foundational models; RLHFlow for the reward model; and DeepSeek for data augmentation tools.
GGUF metadata (key | value):
version | 3 |
tensor_count | 292 |
kv_count | 81 |
general.architecture | llama |
general.type | model |
general.name | Dolphin 3.0 Llama 3.1 8B |
general.organization | Cognitivecomputations |
general.basename | Dolphin-3.0-Llama-3.1 |
general.size_label | 8B |
general.license | llama3.1 |
general.base_model.count | 1 |
general.base_model.0.name | Llama 3.1 8B |
general.base_model.0.organization | Meta Llama |
general.base_model.0.repo_url | https://huggingface.co/meta-llama/Llama-3.1-8B |
general.dataset.count | 13 |
general.dataset.0.name | Opc Sft Stage1 |
general.dataset.0.organization | OpenCoder LLM |
general.dataset.0.repo_url | https://huggingface.co/OpenCoder-LLM/opc-sft-stage1 |
general.dataset.1.name | Opc Sft Stage2 |
general.dataset.1.organization | OpenCoder LLM |
general.dataset.1.repo_url | https://huggingface.co/OpenCoder-LLM/opc-sft-stage2 |
general.dataset.2.name | Orca Agentinstruct 1M v1 |
general.dataset.2.version | v1 |
general.dataset.2.organization | Microsoft |
general.dataset.2.repo_url | https://huggingface.co/microsoft/orca-agentinstruct-1M-v1 |
general.dataset.3.name | Orca Math Word Problems 200k |
general.dataset.3.organization | Microsoft |
general.dataset.3.repo_url | https://huggingface.co/microsoft/orca-math-word-problems-200k |
general.dataset.4.name | Hermes Function Calling v1 |
general.dataset.4.version | v1 |
general.dataset.4.organization | NousResearch |
general.dataset.4.repo_url | https://huggingface.co/NousResearch/hermes-function-calling-v1 |
general.dataset.5.name | NuminaMath CoT |
general.dataset.5.organization | AI MO |
general.dataset.5.repo_url | https://huggingface.co/AI-MO/NuminaMath-CoT |
general.dataset.6.name | NuminaMath TIR |
general.dataset.6.organization | AI MO |
general.dataset.6.repo_url | https://huggingface.co/AI-MO/NuminaMath-TIR |
general.dataset.7.name | Tulu 3 Sft Mixture |
general.dataset.7.organization | Allenai |
general.dataset.7.repo_url | https://huggingface.co/allenai/tulu-3-sft-mixture |
general.dataset.8.name | Dolphin Coder |
general.dataset.8.organization | Cognitivecomputations |
general.dataset.8.repo_url | https://huggingface.co/cognitivecomputations/dolphin-coder |
general.dataset.9.name | Smoltalk |
general.dataset.9.organization | HuggingFaceTB |
general.dataset.9.repo_url | https://huggingface.co/HuggingFaceTB/smoltalk |
general.dataset.10.name | Samantha Data |
general.dataset.10.organization | Cognitivecomputations |
general.dataset.10.repo_url | https://huggingface.co/cognitivecomputations/samantha-data |
general.dataset.11.name | CodeFeedback Filtered Instruction |
general.dataset.11.organization | M A P |
general.dataset.11.repo_url | https://huggingface.co/m-a-p/CodeFeedback-Filtered-Instruction |
general.dataset.12.name | Code Feedback |
general.dataset.12.organization | M A P |
general.dataset.12.repo_url | https://huggingface.co/m-a-p/Code-Feedback |
general.languages.0 | en |
llama.block_count | 32 |
llama.context_length | 131072 |
llama.embedding_length | 4096 |
llama.feed_forward_length | 14336 |
llama.attention.head_count | 32 |
llama.attention.head_count_kv | 8 |
llama.rope.freq_base | 500000 |
llama.attention.layer_norm_rms_epsilon | 0.000009999999747378752 |
llama.attention.key_length | 128 |
llama.attention.value_length | 128 |
general.file_type | 15 |
llama.vocab_size | 128258 |
llama.rope.dimension_count | 128 |
general.quantization_version | 2 |
quantize.imatrix.file | /models_out/Dolphin3.0-Llama3.1-8B-GGUF/Dolphin3.0-Llama3.1-8B.imatrix |
quantize.imatrix.dataset | /training_dir/calibration_datav3.txt |
quantize.imatrix.entries_count | 224 |
quantize.imatrix.chunks_count | 125 |
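The metadata above is read from the Q4_K_M GGUF file itself. If you want that file locally for inspection or use outside of MAX (optional; the serve commands on this page already reference it by repo path), one way is the Hugging Face CLI, assuming huggingface_hub is installed:
# Optional: download the quantized GGUF weights referenced above
huggingface-cli download bartowski/Dolphin3.0-Llama3.1-8B-GGUF Dolphin3.0-Llama3.1-8B-Q4_K_M.gguf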
Version: 8B CPU Q4_K_M
You can quickly deploy Dolphin3.0-Llama3.1-8B to an endpoint using our MAX container. It includes the latest version of MAX with GPU support and our Python-based inference server, MAX Serve. With the following Docker command, you’ll get an OpenAI-compatible endpoint running Dolphin3.0-Llama3.1-8B:
docker run --gpus 1 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_HUB_ENABLE_HF_TRANSFER=1" \
--env "HF_TOKEN=" \
-p 8000:8000 \
docker.modular.com/modular/max-openai-api:nightly \
--huggingface-repo-id cognitivecomputations/Dolphin3.0-Llama3.1-8B \
--weight-path=bartowski/Dolphin3.0-Llama3.1-8B-GGUF/Dolphin3.0-Llama3.1-8B-Q4_K_M.gguf
To download the model from Hugging Face, fill in the HF_TOKEN value with your access token, unless the model is from https://huggingface.co/modularai.
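If the token is already exported in your shell, one option is to pass it through instead of pasting it inline; this is a small, optional variation on the Docker flags above:
# Optional: reuse a token already exported in your shell environment
export HF_TOKEN="<your Hugging Face access token>"
# then replace the HF_TOKEN line in the docker run command with:
--env "HF_TOKEN=${HF_TOKEN}"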
For more information about the container image, see the MAX container documentation.
To learn more about how to deploy MAX to the cloud, check out our MAX Serve tutorials.
Llama 3.1 Version Release Date: July 23, 2024
By clicking “I Accept” below or by using or distributing any portion or element of the Llama Materials, you agree to be bound by this Agreement.
i. If you distribute or make available the Llama Materials (or any derivative works thereof), or a product or service (including another AI model) that contains any of them, you shall (A) provide a copy of this Agreement with any such Llama Materials; and (B) prominently display “Built with Llama” on a related website, user interface, blogpost, about page, or product documentation. If you use the Llama Materials or any outputs or results of the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include “Llama” at the beginning of any such AI model name.
ii. If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you.
iii. You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a “Notice” text file distributed as a part of such copies: “Llama 3.1 is licensed under the Llama 3.1 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.”
b. Subject to Meta’s ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications.
c. If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 3.1 outputs or results, or any portion of any of the foregoing, constitutes infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials.