qwen2.5-0.5b

Qwen2.5 models are pretrained on Alibaba's latest large-scale dataset, encompassing up to 18 trillion tokens. The model supports context lengths of up to 128K tokens and offers multilingual support.

Run this model

  1. Install our Magic package manager:

    curl -ssL https://magic.modular.com/ | bash

    Then run the source command that's printed in your terminal.

  2. Install MAX Pipelines to run this model:

    magic global install max-pipelines
  3. Start a local endpoint for qwen2.5/0.5b:

max-pipelines serve --huggingface-repo-id Qwen/Qwen2.5-0.5B

    The endpoint is ready when you see the URI printed in your terminal:

    Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
  4. Now open another terminal to send a request using curl (an equivalent Python client sketch follows this list):

    curl -N http://0.0.0.0:8000/v1/chat/completions -H "Content-Type: application/json" -d '{
        "model": "Qwen/Qwen2.5-0.5B",
        "stream": true,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Who won the World Series in 2020?"}
        ]
    }' | grep -o '"content":"[^"]*"' | sed 's/"content":"//g' | sed 's/"//g' | tr -d '\n' | sed 's/\\n/\n/g'
  5. 🎉 Hooray! You’re running Generative AI. Our goal is to make this as easy as possible.
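
If you prefer a programmatic client, the endpoint exposes the OpenAI-style chat completions route shown in step 4, so an OpenAI-compatible client should work against it. Below is a minimal Python sketch, assuming the `openai` package is installed (`pip install openai`), the server from step 3 is still running on port 8000, and that the local endpoint does not check the API key (the `api_key` value here is a dummy placeholder).

    # Minimal sketch: stream a chat completion from the local endpoint,
    # mirroring the curl example in step 4.
    from openai import OpenAI

    # Assumption: local server, no real API key required.
    client = OpenAI(base_url="http://0.0.0.0:8000/v1", api_key="EMPTY")

    stream = client.chat.completions.create(
        model="Qwen/Qwen2.5-0.5B",
        stream=True,
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Who won the World Series in 2020?"},
        ],
    )

    # Print tokens as they arrive.
    for chunk in stream:
        delta = chunk.choices[0].delta
        if delta.content:
            print(delta.content, end="", flush=True)
    print()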

About

Qwen2.5 is the latest in the Qwen large language model series, featuring a variety of base and instruction-tuned models with sizes ranging from 0.5 to 72 billion parameters. This version introduces major improvements in several domains:

  • Enhanced Knowledge and Expertise: Qwen2.5 exhibits significantly greater knowledge along with specialized improvements in coding and mathematics through domain-specific expert models.
  • Advanced Capabilities: Improvements include superior instruction following, generation and comprehension of long texts (up to 8K tokens), structured data handling (e.g., tables), and production of structured outputs such as JSON (a sketch of requesting JSON output from the local endpoint follows this list). The model is also more resilient to diverse system prompts, which benefits chatbot interactions.
  • Extended Context Lengths: The model supports long contexts of up to 128K tokens, enabling robust long-form processing.
  • Multilingual Proficiency: Qwen2.5 supports over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.
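
To illustrate the structured-output point above against the local endpoint from the quickstart, here is a hedged Python sketch that prompts the model to answer in JSON and then parses the reply. It reuses the assumed `openai` client setup from the earlier example; note that the 0.5B model is only prompted, not constrained, to emit JSON, so the output should be validated.

    # Sketch: ask the served model for a structured JSON reply and parse it.
    # Assumes the same local endpoint and dummy API key as the earlier example.
    import json
    from openai import OpenAI

    client = OpenAI(base_url="http://0.0.0.0:8000/v1", api_key="EMPTY")

    response = client.chat.completions.create(
        model="Qwen/Qwen2.5-0.5B",
        messages=[
            {"role": "system", "content": "Reply only with a JSON object with keys 'city' and 'country'."},
            {"role": "user", "content": "Where is the Eiffel Tower?"},
        ],
    )

    text = response.choices[0].message.content
    try:
        print(json.loads(text))  # parsed dict if the model produced valid JSON
    except json.JSONDecodeError:
        print("Model did not return valid JSON:\n", text)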

Licensing details: all models except the 3B and 72B versions are available under the Apache 2.0 license, while the 3B and 72B models are under the Qwen license.

References

GitHub
Blog post
HuggingFace

DETAILS

MODEL CLASS
PyTorch

MODULAR GITHUB

Modular

CREATED BY

Qwen

MODEL

Qwen/Qwen2.5-0.5B

TAGS

arxiv:2407.10671
autotrain_compatible
conversational
en
endpoints_compatible
license:apache-2.0
qwen2
region:us
safetensors
text-generation
text-generation-inference
transformers

© Copyright - Modular Inc - 2024