DeepSeek-llama3.1-Bllossom-8B enhances multilingual AI performance, notably improving Korean language inference.
Version: 8B-Q4_K_M (CPU, q4_k_m)
You can quickly deploy DeepSeek-llama3.1-Bllossom-8B-Q4_K_M to an endpoint using our MAX container, which includes the latest version of MAX with GPU support and our Python-based inference server, MAX Serve. The following Docker command starts an OpenAI-compatible endpoint running DeepSeek-llama3.1-Bllossom-8B-Q4_K_M:
docker run --gpus 1 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_HUB_ENABLE_HF_TRANSFER=1" \
  --env "HF_TOKEN=" \
  -p 8000:8000 \
  docker.modular.com/modular/max-openai-api:nightly \
  --huggingface-repo-id kenonix/DeepSeek-llama3.1-Bllossom-8B-Q4_K_M-GGUF \
  --weight-path=kenonix/DeepSeek-llama3.1-Bllossom-8B-Q4_K_M-GGUF/deepseek-llama3.1-bllossom-8b-q4_k_m.gguf
To download the model from Hugging Face, fill in the HF_TOKEN value with your access token, unless the model is hosted under https://huggingface.co/modularai.
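Once the container is running, you can smoke-test the endpoint with a standard OpenAI-style chat completion request. The snippet below is a minimal sketch, not part of the model card: it assumes the server is reachable at localhost:8000, and the "model" value is an assumption based on the repo ID above (query the /v1/models endpoint to confirm the exact ID the server expects).

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "kenonix/DeepSeek-llama3.1-Bllossom-8B-Q4_K_M-GGUF",
        "messages": [
          {"role": "user", "content": "안녕하세요! 한국어로 답해 주세요."}
        ],
        "max_tokens": 128
      }'

Because the response follows the OpenAI chat completions schema, existing OpenAI client libraries can also target this endpoint by overriding their base URL.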
For more information about the container image, see the MAX container documentation.
To learn more about how to deploy MAX to the cloud, check out our MAX Serve tutorials.
DETAILS
MAX Models are popular open-source models converted to MAX’s native graph format. Models carrying the MAX Model label are either state of the art (SOTA) or under active development. Learn more about MAX Models.
MAX GITHUB
Modular / MAX
BASE MODEL
kenonix/DeepSeek-llama3.1-Bllossom-8B-Q4_K_M-GGUF
QUANTIZED BY
kenonix/DeepSeek-llama3.1-Bllossom-8B-Q4_K_M-GGUF
TAGS
ENTERPRISES