phind-codellama-34b

MAX Model

1 version

Code generation model based on Code Llama.

Run this model

  1. Install our magic package manager:

    curl -ssL https://magic.modular.com/ | bash

    Then run the source command that's printed in your terminal.

  2. Install MAX pipelines to run this model:

    magic global install max-pipelines
  3. Start a local endpoint for phind-codellama/34b:

    max-pipelines serve --huggingface-repo-id Phind/Phind-CodeLlama-34B-Python-v1

    The endpoint is ready when you see the URI printed in your terminal:

    Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
  4. Now open another terminal to send a request using curl:

    curl -N http://0.0.0.0:8000/v1/chat/completions -H "Content-Type: application/json" -d '{
        "model": "phind-codellama/34b",
        "stream": true,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Who won the World Series in 2020?"}
        ]
    }' | grep -o '"content":"[^"]*"' | sed 's/"content":"//g' | sed 's/"//g' | tr -d '\n' | sed 's/\\n/\n/g'
  5. 🎉 Hooray! You’re running Generative AI. Our goal is to make this as easy as possible.
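The shell pipeline in step 4 simply pulls the `content` fragments out of the streamed JSON chunks. The same extraction can be done programmatically; here is a minimal Python sketch that parses OpenAI-style server-sent-event lines (the chunk shape is assumed to follow the standard `chat.completion.chunk` streaming format, and the sample stream below is hand-written for illustration, not real server output):

```python
import json

def extract_content(sse_lines):
    """Collect assistant text from OpenAI-style streaming chat chunks.

    Each data line looks like:
        data: {"choices":[{"delta":{"content":"..."}}]}
    and the stream ends with:
        data: [DONE]
    """
    parts = []
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)

# Hand-written sample stream (illustrative only):
sample = [
    'data: {"choices":[{"delta":{"content":"The Dodgers"}}]}',
    'data: {"choices":[{"delta":{"content":" won in 2020."}}]}',
    "data: [DONE]",
]
print(extract_content(sample))  # The Dodgers won in 2020.
```

Unlike the `grep`/`sed` pipeline, this handles quotes and escape sequences inside the content correctly, since it parses the JSON rather than pattern-matching on it.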

About

Phind CodeLlama is a code generation model derived from CodeLlama 34B and fine-tuned for instruction-based use cases. It is available in two versions: v1 and v2. The v1 version is based on CodeLlama 34B and CodeLlama-Python 34B, while v2 builds upon v1 with training on an additional 1.5 billion tokens of high-quality programming-related data.

Memory Requirements

The 34B models typically require at least 32GB of RAM to run.
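A back-of-envelope calculation makes the 32GB figure plausible. Weight memory is roughly parameter count times bytes per parameter; the sketch below covers weights only and ignores the KV cache, activations, and runtime overhead, so actual usage depends on precision and runtime:

```python
# Rough weight-memory estimate for a 34B-parameter model (weights only).
PARAMS = 34e9
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}

for fmt, nbytes in BYTES_PER_PARAM.items():
    gb = PARAMS * nbytes / 1e9
    print(f"{fmt}: {gb:.0f} GB")
```

At fp16 the weights alone are about 68 GB, while 8-bit quantization brings them to about 34 GB, which is why quantized variants are what typically fit in the ~32GB range.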

References

Beating GPT-4 on HumanEval with a Fine-Tuned CodeLlama-34B

HuggingFace

DETAILS

MODEL CLASS
MAX Model

MAX Models are highly optimized inference pipelines that deliver state-of-the-art performance on both CPU and GPU. Many of them are the fastest versions of those models available.

Browse 18+ MAX Models

MODULAR GITHUB

Modular

CREATED BY

Phind

MODEL

Phind/Phind-CodeLlama-34B-Python-v1

TAGS

autotrain_compatible
code llama
endpoints_compatible
license:llama2
llama
model-index
pytorch
region:us
text-generation
text-generation-inference
transformers

© Copyright - Modular Inc - 2024