phi4-14b

PyTorch

1 version

Phi-4 is a 14B parameter, state-of-the-art open model from Microsoft.

Run this model

  1. Install Magic, our package manager:

    curl -ssL https://magic.modular.com/ | bash

    Then run the source command that's printed in your terminal.

  2. Install MAX Pipelines to run this model:

    magic global install max-pipelines

  3. Start a local endpoint for phi4/14b:

    max-pipelines serve --huggingface-repo-id unsloth/phi-4-GGUF

    The endpoint is ready when you see the URI printed in your terminal:

    Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)

  4. Now open another terminal to send a request using curl:

    curl -N http://0.0.0.0:8000/v1/chat/completions -H "Content-Type: application/json" -d '{
        "model": "phi4/14b",
        "stream": true,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Who won the World Series in 2020?"}
        ]
    }' | grep -o '"content":"[^"]*"' | sed 's/"content":"//g' | sed 's/"//g' | tr -d '\n' | sed 's/\\n/\n/g'
  5. 🎉 Hooray! You’re running Generative AI. Our goal is to make this as easy as possible.
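The endpoint speaks the OpenAI-style chat-completions protocol, so the grep/sed pipeline in step 4 can be replaced by a small parser that reads the streamed "data:" chunks and joins the content deltas. A minimal sketch in Python; the exact chunk shapes follow the OpenAI streaming convention and are an assumption here (the canned sample below stands in for what the live endpoint would emit):

```python
import json

def extract_stream_content(sse_lines):
    """Collect assistant text from OpenAI-style streaming chunks.

    Each streamed line looks like:
        data: {"choices":[{"delta":{"content":"..."}}]}
    and the stream ends with:
        data: [DONE]
    """
    parts = []
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip keep-alives and blank separator lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if delta.get("content"):
            parts.append(delta["content"])
    return "".join(parts)

# Canned sample chunks illustrating the streaming format (hypothetical output):
sample = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    'data: {"choices":[{"delta":{"content":"The Los Angeles"}}]}',
    'data: {"choices":[{"delta":{"content":" Dodgers."}}]}',
    'data: [DONE]',
]
print(extract_stream_content(sample))  # The Los Angeles Dodgers.
```

In a real session you would feed this function the lines of the curl response (or use an HTTP client that iterates the stream) instead of the canned sample.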

About

Phi-4 is a 14B parameter state-of-the-art open model built using synthetic datasets, filtered public domain website data, and acquired academic and Q&A datasets. It has been rigorously enhanced and aligned through supervised fine-tuning and direct preference optimization to ensure robust instruction adherence and safety.

Context length: 16k tokens

Phi-4 benchmark

Primary use cases

Phi-4 accelerates research on language models and serves as a foundation for generative AI applications. It is ideal for general-purpose systems requiring:

  1. Memory/compute constrained environments.
  2. Latency-sensitive scenarios.
  3. Strong reasoning and logic capabilities.

Out-of-scope use cases

Phi-4 is not explicitly designed for all downstream applications. Developers should:

  1. Assess and mitigate limitations (e.g., accuracy, safety, fairness) for specific use cases, particularly in high-risk areas.
  2. Comply with applicable laws and regulations, including privacy and trade compliance.
  3. Recognize that this document does not modify the model’s licensing.

Phi-4 performance eval by Microsoft

DETAILS

MODEL CLASS
PyTorch

MODULAR GITHUB

Modular

CREATED BY

unsloth

MODEL

unsloth/phi-4-GGUF

TAGS

arxiv:2412.08905
base_model:microsoft/phi-4
base_model:quantized:microsoft/phi-4
chat
code
conversational
en
endpoints_compatible
gguf
license:mit
math
nlp
phi
phi4
region:us
text-generation
transformers
unsloth

© Copyright Modular Inc. 2024