Learn How to Generate Embeddings with MAX Serve
Embeddings are a crucial component of intelligent agents, enabling efficient search and retrieval of proprietary information. MAX supports generating embeddings through an OpenAI-compatible API, including the ability to run the popular sentence-transformers/all-mpnet-base-v2 model from Hugging Face. When you run MPNet on MAX, you'll be serving a high-performance implementation of the model built by Modular engineers with the MAX Graph API.
In this recipe, you will serve the MPNet embedding model locally with MAX Serve and generate embeddings using the OpenAI Python client.
MPNet works by encoding not only tokens (words and parts of words) but also positional data about where those tokens appear in a sentence. Upon its publication in 2020, MPNet met or exceeded the capability of popular predecessors, BERT and XLNet. Today, it is one of the most popular open-source models for generating embeddings.
Please make sure your system meets our system requirements.
To proceed, ensure you have the magic CLI installed:
curl -ssL https://magic.modular.com/ | bash
or update it via:
magic self-update
A valid Hugging Face token is required to access the model.
Once you have obtained the token, include it in .env by running:
cp .env.example .env
then add your token to .env:
HUGGING_FACE_HUB_TOKEN=
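Optionally, you can sanity-check the token before starting the server. The following is a minimal sketch rather than part of the recipe: it assumes the python-dotenv and huggingface_hub packages are installed, and uses huggingface_hub's whoami() to confirm the token is accepted.
# check_token.py -- hypothetical helper, not part of the recipe.
# Assumes python-dotenv and huggingface_hub are installed.
import os

from dotenv import load_dotenv
from huggingface_hub import whoami

load_dotenv()  # loads HUGGING_FACE_HUB_TOKEN from .env into the environment

token = os.environ.get("HUGGING_FACE_HUB_TOKEN")
if not token:
    raise SystemExit("HUGGING_FACE_HUB_TOKEN is not set; add it to .env first.")

info = whoami(token=token)  # raises if the token is invalid or expired
print(f"Token OK; authenticated as {info['name']}")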
To run the app on a GPU, ensure your system meets these GPU requirements. Docker and Docker Compose are optional. Note that this recipe works on compatible Linux machines; we are actively working on enabling the MAX Serve Docker image for macOS ARM64 as well.
Clone the repository and start the app:
git clone https://github.com/modular/max-recipes.git
cd max-recipes/max-serve-openai-embeddings
magic run app
This command is defined in the pyproject.toml file and invokes the max-pipelines CLI using a Procfile for convenience.
MAX Serve is ready once you see a line containing the following in the Docker output:
Server running on http://0.0.0.0:8000/
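If you'd rather script the readiness check than watch the logs, a small polling loop against the embeddings endpoint also works. This is a minimal sketch, not part of the recipe; it assumes the requests package is installed and reuses the endpoint and model the recipe already serves.
# wait_for_server.py -- hypothetical helper, not part of the recipe.
# Polls the OpenAI-compatible embeddings endpoint until MAX Serve responds.
import time

import requests

URL = "http://localhost:8000/v1/embeddings"
PAYLOAD = {"model": "sentence-transformers/all-mpnet-base-v2", "input": ["ping"]}

for attempt in range(60):
    try:
        resp = requests.post(URL, json=PAYLOAD, timeout=5)
        if resp.status_code == 200:
            print("MAX Serve is ready.")
            break
    except requests.RequestException:
        pass  # server not accepting requests yet
    time.sleep(2)
else:
    raise SystemExit("Timed out waiting for MAX Serve to become ready.")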
When the embedding code in main.py runs, you should see output like this:
=== Generated embeddings with OpenAI client ===
Successfully generated embeddings!
Number of embeddings: 5
Embedding dimension: 768
First embedding, first few values: [0.36384445428848267, -0.7647817730903625, ...]
When you're done, clean up with:
magic run clean
The code for this recipe is intentionally simple — we're excited for you to start building your own project on MAX.
Open up main.py in your code editor. At the top of the file, you'll see the following:
from openai import OpenAI

MODEL_NAME = "sentence-transformers/all-mpnet-base-v2"  #1
BASE_URL = "http://localhost:8000/v1"
API_KEY = "local"

client = OpenAI(base_url=BASE_URL, api_key=API_KEY)  #2


def main():
    """Test embeddings using OpenAI client"""
    sentences = [  #3
        "Rice is often served in round bowls.",
        "The juice of lemons makes fine punch.",
        "The bright sun shines on the old garden.",
        "The soft breeze came across the meadow.",
        "The small pup gnawed a hole in the sock.",
    ]

    try:
        response = client.embeddings.create(  #4
            model=MODEL_NAME,
            input=sentences,
        )
        print("\n=== Generated embeddings with OpenAI client ===")
        print("Successfully generated embeddings!")
        print(f"Number of embeddings: {len(response.data)}")  #5
        print(f"Embedding dimension: {len(response.data[0].embedding)}")
        print(f"First embedding, first few values: {response.data[0].embedding[:5]}")
    except Exception as e:
        print(f"Error using client: {str(e)}")


if __name__ == "__main__":
    main()
Here's what the code does:
1. MODEL_NAME identifies the Hugging Face model that MAX Serve is hosting.
2. The OpenAI client is pointed at the local MAX Serve endpoint. (The API_KEY is a placeholder; MAX Serve does not use one, but the OpenAI client requires this value not be blank.)
3. sentences holds the example texts to embed.
4. client.embeddings.create() sends the sentences to the OpenAI-compatible embeddings endpoint served by MAX.
5. The response contains one embedding per input sentence; the script prints the count, the embedding dimension, and the first few values of the first embedding.
Note how the code is a drop-in replacement for the proprietary OpenAI API; this is a key advantage of building with MAX!
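A natural next step is to use the embeddings for semantic search: embed a query, then rank the stored sentences by cosine similarity. The sketch below is illustrative and not part of the recipe; it reuses the client setup and sentences from main.py, assumes NumPy is installed, and the query text and embed() helper are hypothetical.
# semantic_search.py -- illustrative sketch, not part of the recipe.
# Ranks the example sentences against a query by cosine similarity.
import numpy as np
from openai import OpenAI

MODEL_NAME = "sentence-transformers/all-mpnet-base-v2"
client = OpenAI(base_url="http://localhost:8000/v1", api_key="local")

sentences = [
    "Rice is often served in round bowls.",
    "The juice of lemons makes fine punch.",
    "The bright sun shines on the old garden.",
    "The soft breeze came across the meadow.",
    "The small pup gnawed a hole in the sock.",
]

def embed(texts):
    """Return one embedding vector per input text as a NumPy array."""
    response = client.embeddings.create(model=MODEL_NAME, input=texts)
    return np.array([item.embedding for item in response.data])

doc_vectors = embed(sentences)
query_vector = embed(["a dog chewing on clothing"])[0]

# Cosine similarity: dot product of L2-normalized vectors
doc_norms = doc_vectors / np.linalg.norm(doc_vectors, axis=1, keepdims=True)
query_norm = query_vector / np.linalg.norm(query_vector)
scores = doc_norms @ query_norm

for idx in np.argsort(scores)[::-1]:
    print(f"{scores[idx]:.3f}  {sentences[idx]}")
With a well-trained embedding model, the sentence about the pup gnawing the sock should rank highest for this query.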
Now that you've created embeddings with MAX Serve, you can explore more features and join our developer community. Here are some resources to help you continue your journey:
Learn more about the magic CLI in this Magic tutorial.