A new small LLaVA model fine-tuned from Phi 3 Mini.
Install our magic package manager:

curl -ssL https://magic.modular.com/ | bash
Then run the source command that's printed in your terminal.
Install MAX Pipelines to run this model:
magic global install max-pipelines
Start a local endpoint for llava-phi3/3.8b:

max-pipelines serve --huggingface-repo-id xtuner/llava-phi-3-mini-gguf
The endpoint is ready when you see the URI printed in your terminal:
Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
Now open another terminal to send a request using curl:
curl -N http://0.0.0.0:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llava-phi3/3.8b",
    "stream": true,
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Who won the World Series in 2020?"}
    ]
  }' | grep -o '"content":"[^"]*"' | sed 's/"content":"//g' | sed 's/"//g' | tr -d '\n' | sed 's/\\n/\n/g'
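Each line of the streamed response is a server-sent event whose data payload is an OpenAI-style chat-completion chunk; the grep/sed pipeline above simply pulls the content deltas out of those chunks. Here is a minimal Python sketch of the same parsing step, assuming the endpoint follows the OpenAI streaming chunk shape (the sample line below is illustrative, not captured from a real run):

```python
import json

def extract_content(sse_line: str) -> str:
    """Pull the content delta out of one streamed SSE line, if present."""
    # Streamed lines look like: data: {...json chunk...}
    # and the stream ends with:  data: [DONE]
    if not sse_line.startswith("data: ") or sse_line.strip() == "data: [DONE]":
        return ""
    chunk = json.loads(sse_line[len("data: "):])
    # Each chunk carries a list of choices; the text delta may be absent.
    return chunk["choices"][0].get("delta", {}).get("content", "")

# Illustrative chunk shaped like an OpenAI streaming response.
line = 'data: {"choices": [{"delta": {"content": "The Dodgers"}}]}'
print(extract_content(line))  # prints: The Dodgers
```

Accumulating these deltas as they arrive gives you the full assistant reply, which is what the shell pipeline prints in one pass.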
🎉 Hooray! You're running Generative AI. Our goal is to make this as easy as possible.
llava-phi3 is a fine-tuned version of the LLaVA model based on the Phi 3 Mini 4k architecture, delivering performance competitive with the original LLaVA model. It demonstrates robust capabilities across diverse benchmarks, catering to applications that require efficient and effective language-vision alignment.
This model is designed to offer a balance between computational efficiency and performance, thanks to its optimization on the lightweight Phi 3 Mini 4k backbone. Its enhancements enable it to excel in tasks demanding strong visual and textual understanding, making it a versatile solution for research and application development.
DETAILS
GitHub: Modular
Created by: xtuner
Model: xtuner/llava-phi-3-mini-gguf

© Copyright 2024 Modular Inc