qwen-0.5b

PyTorch

6 versions

Qwen 1.5 is a series of large language models by Alibaba Cloud, spanning from 0.5B to 110B parameters.

Run this model

  1. Install our magic package manager:

    curl -ssL https://magic.modular.com/ | bash

    Then run the source command that's printed in your terminal.

  2. Install MAX Pipelines to run this model.

    magic global install max-pipelines
  3. Start a local endpoint for qwen/0.5b:

    max-pipelines serve --huggingface-repo-id Qwen/Qwen2.5-0.5B

    The endpoint is ready when you see the URI printed in your terminal:

    Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
  4. Now open another terminal to send a request using curl (an equivalent Python snippet follows after these steps):

    curl -N http://0.0.0.0:8000/v1/chat/completions -H "Content-Type: application/json" -d '{
        "model": "qwen/0.5b",
        "stream": true,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Who won the World Series in 2020?"}
        ]
    }' | grep -o '"content":"[^"]*"' | sed 's/"content":"//g' | sed 's/"//g' | tr -d '\n' | sed 's/\\n/\n/g'
  5. 🎉 Hooray! You’re running Generative AI. Our goal is to make this as easy as possible.
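
If you'd rather call the endpoint from code than from curl, the server exposes an OpenAI-compatible /v1/chat/completions API, so a standard OpenAI client can talk to it. Below is a minimal sketch in Python using the openai package (an extra dependency, not installed by the steps above); it streams the same request as step 4 and prints tokens as they arrive:

    # Requires the 'openai' Python package (pip install openai) -- an assumption,
    # not something the steps above install.
    from openai import OpenAI

    # The local endpoint speaks the OpenAI chat completions API; the api_key
    # value is unused locally but required by the client constructor.
    client = OpenAI(base_url="http://0.0.0.0:8000/v1", api_key="EMPTY")

    stream = client.chat.completions.create(
        model="Qwen/Qwen2.5-0.5B",  # should match the repo id passed to serve
        stream=True,
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Who won the World Series in 2020?"},
        ],
    )

    # Print content deltas as they arrive, mirroring the grep/sed pipeline above.
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)
    print()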

About

Qwen is a series of transformer-based large language models by Alibaba Cloud, pre-trained on diverse datasets including web text, books, and code. The family now spans models from 0.5B to 110B parameters, with improved performance and multilingual capabilities.

Key Features

  • Scalability: Offered in multiple sizes, including 0.5B, 1.8B, 4B, 7B, 14B, 32B, 72B, and 110B.
  • Low-cost deployment: The minimum memory requirement for inference is under 2GB.
  • Long context support: Supports 8K context lengths for smaller models and up to 32K for larger ones.
  • High-quality training data: Pre-trained on 2.2 trillion tokens spanning Chinese, English, multilingual texts, code, and mathematics, with optimized corpus distribution.
  • Performance: Excels in benchmarks for reasoning, common sense, math, and coding, surpassing models of similar and larger scales.
  • Vocabulary: Uses over 150K tokens, ensuring robust multilingual support.
  • System Prompt: Enables role-playing, style transfer, task setting, and behavior customization (see the sketch below).

Qwen is designed for both general and professional applications, offering robust features and long-context capabilities across its diverse model sizes.
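
As a concrete illustration of the system prompt feature listed above, the local endpoint from the "Run this model" steps can be given a persona or style to follow. A minimal sketch, again assuming the openai Python package and a running endpoint (the system message shown here is only an example):

    from openai import OpenAI

    client = OpenAI(base_url="http://0.0.0.0:8000/v1", api_key="EMPTY")

    # The system message sets the behavior; swap it to change persona, style, or task.
    response = client.chat.completions.create(
        model="Qwen/Qwen2.5-0.5B",
        messages=[
            {"role": "system", "content": "You answer in exactly one short sentence."},
            {"role": "user", "content": "What is a context window?"},
        ],
    )
    print(response.choices[0].message.content)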

Reference

GitHub

Hugging Face

DETAILS

MODEL CLASS
PyTorch

MODULAR GITHUB

Modular

CREATED BY

Qwen

MODEL

Qwen/Qwen2.5-0.5B

TAGS

arxiv:2407.10671
autotrain_compatible
conversational
en
endpoints_compatible
license:apache-2.0
qwen2
region:us
safetensors
text-generation
text-generation-inference
transformers

© Copyright - Modular Inc - 2024