A Llama 2-based model fine-tuned to improve Chinese dialogue ability.
Install our Magic package manager:
curl -ssL https://magic.modular.com/ | bash
Then run the source command that's printed in your terminal.
Install MAX Pipelines to run this model:
magic global install max-pipelines
Start a local endpoint for llama2-chinese/7b:
max-pipelines serve --huggingface-repo-id FlagAlpha/Llama2-Chinese-7b-Chat-LoRA
The endpoint is ready when you see the URI printed in your terminal:
Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
Now open another terminal to send a request using curl:
curl -N http://0.0.0.0:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama2-chinese/7b",
    "stream": true,
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Who won the World Series in 2020?"}
    ]
  }' | grep -o '"content":"[^"]*"' | sed 's/"content":"//g' | sed 's/"//g' | tr -d '\n' | sed 's/\\n/\n/g'
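The endpoint speaks the OpenAI-compatible chat completions API, so the same request can also be sent from Python. A minimal sketch using only the standard library, shown non-streaming for simplicity; the `build_chat_request` and `chat` helpers are illustrative, not part of MAX:

```python
import json
import urllib.request

# Endpoint and model name from the steps above.
URL = "http://0.0.0.0:8000/v1/chat/completions"
MODEL = "llama2-chinese/7b"


def build_chat_request(user_message: str, stream: bool = False) -> dict:
    """Build an OpenAI-compatible chat completion payload."""
    return {
        "model": MODEL,
        "stream": stream,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
    }


def chat(user_message: str) -> str:
    """POST a non-streaming request and return the assistant's reply."""
    payload = json.dumps(build_chat_request(user_message)).encode()
    req = urllib.request.Request(
        URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # The reply text lives in the first choice's message.
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(chat("Who won the World Series in 2020?"))
```

With `stream: false`, the full reply arrives in a single JSON response, so no grep/sed post-processing is needed.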
🎉 Hooray! You’re running Generative AI. Our goal is to make this as easy as possible.
Llama 2 Chinese Dialogue Fine-Tuned Model
This model is fine-tuned from the open-source Llama 2 Chat model released by Meta Platforms, Inc. Llama 2 was trained on two trillion tokens, with the supported context length increased to 4096, and its dialogue ability was further optimized with one million human-annotated examples.
Because the original Llama 2 adapts poorly to Chinese, the developers fine-tuned it on a Chinese instruction set, significantly strengthening its Chinese dialogue ability. Two Chinese fine-tuned models, with 7B and 13B parameters, have been released so far.
DETAILS
GITHUB: Modular
CREATED BY: FlagAlpha
MODEL: FlagAlpha/Llama2-Chinese-7b-Chat-LoRA

© Copyright - Modular Inc - 2024