Agentic Building Blocks: Creating AI Agents with MAX Serve and OpenAI Function Calling
Building intelligent agents that interact seamlessly with users requires structured and efficient mechanisms for executing actions. OpenAI's function calling feature and Modular's MAX Serve provide a powerful combination for building dynamic, context-aware AI applications.
In this recipe, we will:

- Serve modularai/Llama-3.1-8B-Instruct-GGUF locally with MAX Serve
- Make single and multiple OpenAI-style function calls against it using mock tools
- Build a FastAPI weather assistant that routes function calls to a real weather API
Note that this feature is available in MAX nightly and the MAX Serve nightly Docker image.
To proceed, please make sure you have the magic CLI installed and that magic --version reports 0.7.2 or newer:
curl -ssL https://magic.modular.com/ | bash
Or update it via:
magic self-update
Then install max-pipelines via:
magic global install max-pipelines=="25.2.0.dev2025031705"
For this recipe, you will need:

- A valid Hugging Face access token (HUGGING_FACE_HUB_TOKEN) to download the model
- A free WeatherAPI key (WEATHERAPI_API_KEY) for the final FastAPI example, covered below

Set up your environment variables:

cp .env.sample .env
echo "HUGGING_FACE_HUB_TOKEN=your_hf_token" >> .env
Download the code for this recipe using the magic CLI:
magic init max-serve-openai-function-calling --from modular/max-recipes/max-serve-openai-function-calling
cd max-serve-openai-function-calling
Large Language Models (LLMs) are typically used for text-based interactions. However, in real-world applications, they often need to interact with APIs, databases, or external tools to fetch real-time information. OpenAI's function calling enables an LLM to enhance its responses by:

- Detecting when a user request should be handled by an external function
- Emitting a structured JSON payload with the function name and its arguments
- Letting the application execute the function and use the result in the conversation
To illustrate function calling, let's start with a simple example where the model retrieves the weather from a mock function.
Make sure port 8077 is available. You can adjust the port settings in pyproject.toml.
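For reference, tasks in a magic project are defined in pyproject.toml under the pixi task tables. A rough sketch of what the server task might look like follows; the exact flags and model path are assumptions, not the recipe's literal configuration:

[tool.pixi.tasks]
# Hypothetical task definition; check the recipe's pyproject.toml for the real one
server = "max-pipelines serve --model-path=modularai/Llama-3.1-8B-Instruct-GGUF --port 8077"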
First, run the server on port 8077 using the magic CLI:
magic run server
Note that the very first compilation of the model can take a few minutes. Subsequent invocations will be much faster.
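If you want to script the wait, you can poll the server until it responds. Assuming the standard OpenAI-compatible /v1/models route is exposed on this port (an assumption, not something the recipe states), a simple check is:

curl http://0.0.0.0:8077/v1/models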
Once the server is ready, in a separate terminal run your first function calling with:
magic run single_function_call
which outputs:
User message: What's the weather like in San Francisco?
Weather response: The weather in San Francisco is sunny with a temperature of 72°F
single_function_call.py
import json

from openai import OpenAI

# Point the OpenAI client at the local MAX Serve endpoint (port 8077, as configured above)
client = OpenAI(base_url="http://0.0.0.0:8077/v1", api_key="local")


def get_weather(city: str) -> str:
    """Mock weather function that returns a simple response."""
    return f"The weather in {city} is sunny with a temperature of 72°F"


TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather and forecast data for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "The city name to get weather for",
                    }
                },
                "required": ["city"],
            },
        },
    }
]


def main():
    user_message = "What's the weather like in San Francisco?"
    print("User message:", user_message)
    response = client.chat.completions.create(
        model="modularai/Llama-3.1-8B-Instruct-GGUF",
        messages=[{"role": "user", "content": user_message}],
        tools=TOOLS,
        tool_choice="auto",
    )
    output = response.choices[0].message
    if output.tool_calls:
        for tool_call in output.tool_calls:
            if tool_call.function.name == "get_weather":
                # Parse the arguments as JSON rather than eval() for safety
                city = json.loads(tool_call.function.arguments)["city"]
                weather_response = get_weather(city)
                print("Weather response:", weather_response)


if __name__ == "__main__":
    main()
The function definition follows OpenAI's structured format for tool specifications:
{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Get current weather and forecast data for a city",
    "parameters": {
      "type": "object",
      "properties": {
        "city": {"type": "string", "description": "The city name to get weather for"}
      },
      "required": ["city"]
    }
  }
}
Let's break down each component:

- type: Specifies this is a function tool (OpenAI supports different tool types)
- function: Contains the function's specification
  - name: The function identifier used by the LLM to call it
  - description: Helps the LLM understand when to use this function
  - parameters: JSON Schema defining the function's parameters
    - type: Defines this as an object containing parameters
    - properties: Lists all possible parameters and their types
    - required: Specifies which parameters must be provided

This schema enables the LLM to understand when the function is relevant, which arguments it requires, and how to format them.
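Concretely, when the model decides to use the tool, its response carries a tool call whose arguments field is a JSON string conforming to this schema. An illustrative payload (the id here is made up) looks like:

{
  "id": "call_abc123",
  "type": "function",
  "function": {
    "name": "get_weather",
    "arguments": "{\"city\": \"San Francisco\"}"
  }
}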
This script demonstrates how the model detects that a user request maps to a function, extracts the required arguments, and returns a structured tool call. This automates API calls within conversational AI agents, allowing for structured responses instead of free-text generation.
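The script above stops after executing the function locally. In a full agent loop, you would typically send the tool result back to the model so it can phrase a final natural-language answer. Here is a minimal sketch of that second round trip, reusing client, TOOLS, and get_weather from above and assuming the server accepts tool-role messages:

# Sketch: feed the tool result back so the model can produce a natural-language reply.
import json

messages = [{"role": "user", "content": "What's the weather like in San Francisco?"}]
response = client.chat.completions.create(
    model="modularai/Llama-3.1-8B-Instruct-GGUF",
    messages=messages,
    tools=TOOLS,
    tool_choice="auto",
)
message = response.choices[0].message

if message.tool_calls:
    tool_call = message.tool_calls[0]
    result = get_weather(json.loads(tool_call.function.arguments)["city"])
    # Append the assistant's tool call and the tool's result, then ask again
    messages.append(message)
    messages.append({"role": "tool", "tool_call_id": tool_call.id, "content": result})
    final = client.chat.completions.create(
        model="modularai/Llama-3.1-8B-Instruct-GGUF",
        messages=messages,
    )
    print(final.choices[0].message.content)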
For more complex applications, we can introduce multiple function calls. Below is an example that allows the LLM to fetch both weather and air quality data.
Now, in another terminal, run:
magic run multi_function_calls
which outputs:
User message: What's the weather like in San Francisco?
Weather response: The weather in San Francisco is sunny with a temperature of 72°F
User message: What's the air quality like in San Francisco?
Air quality response: The air quality in San Francisco is good with a PM2.5 of 10µg/m³
multi_function_calls.py
Let's include another mock function as follows:
from openai import OpenAI

# Point the OpenAI client at the local MAX Serve endpoint (port 8077, as configured above)
client = OpenAI(base_url="http://0.0.0.0:8077/v1", api_key="local")


def get_weather(city: str) -> str:
    return f"The weather in {city} is sunny with a temperature of 72°F"


def get_air_quality(city: str) -> str:
    return f"The air quality in {city} is good with a PM2.5 of 10µg/m³"


TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather and forecast data for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "The city name to get weather for",
                    }
                },
                "required": ["city"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "get_air_quality",
            "description": "Get air quality data for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "The city name to get air quality for",
                    }
                },
                "required": ["city"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
]
The LLM can now determine whether to call get_weather or get_air_quality based on user input. This makes it possible to dispatch multiple API calls dynamically, allowing AI assistants to retrieve data from various sources; a minimal dispatch loop is sketched below.
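This sketch reuses the client, mock functions, and TOOLS defined above; the run_query helper is illustrative, not part of the recipe:

# Sketch: route the model's tool calls to the matching local function.
import json

FUNCTION_MAP = {
    "get_weather": get_weather,
    "get_air_quality": get_air_quality,
}


def run_query(user_message: str) -> None:
    print("User message:", user_message)
    response = client.chat.completions.create(
        model="modularai/Llama-3.1-8B-Instruct-GGUF",
        messages=[{"role": "user", "content": user_message}],
        tools=TOOLS,
        tool_choice="auto",
    )
    message = response.choices[0].message
    for tool_call in message.tool_calls or []:
        handler = FUNCTION_MAP.get(tool_call.function.name)
        if handler is None:
            continue  # The model asked for a tool we don't provide
        args = json.loads(tool_call.function.arguments)
        print(handler(**args))


run_query("What's the weather like in San Francisco?")
run_query("What's the air quality like in San Francisco?")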
To better simulate a real-world use case, we use app.py, a FastAPI-based service that integrates function calling with a real API.
Before running the application, make sure you have a valid API key for weather data. To follow along, obtain your free API key WEATHERAPI_API_KEY from https://www.weatherapi.com/ and include it in the .env file:
WEATHERAPI_API_KEY=your_api_key_here
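To sanity-check the key, you can hit the same WeatherAPI endpoint the app uses directly (substitute your actual key; the city in the q parameter is arbitrary):

curl "http://api.weatherapi.com/v1/current.json?key=your_api_key_here&q=Toronto"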
Here is the code for the FastAPI weather app:
import json
import logging
import os
from typing import Any, Dict, Optional

import httpx
from dotenv import load_dotenv
from fastapi import FastAPI, HTTPException
from openai import AsyncOpenAI
from pydantic import BaseModel

# Load WEATHERAPI_API_KEY from the .env file (assumes python-dotenv is available)
load_dotenv()
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

WEATHER_API_KEY = os.getenv("WEATHERAPI_API_KEY")

app = FastAPI()
# Async client pointed at the local MAX Serve endpoint (port 8077, as configured above)
client = AsyncOpenAI(base_url="http://0.0.0.0:8077/v1", api_key="local")


class ChatRequest(BaseModel):
    message: str


class ChatResponse(BaseModel):
    type: str
    message: str
    data: Optional[Dict[str, Any]] = None


async def get_weather(city: str) -> Dict[str, Any]:
    """Get weather data for a city"""
    async with httpx.AsyncClient() as http_client:
        response = await http_client.get(
            f"http://api.weatherapi.com/v1/current.json?key={WEATHER_API_KEY}&q={city}"
        )
        if response.status_code != 200:
            raise HTTPException(
                status_code=response.status_code, detail="Weather API error"
            )
        data = response.json()
        return {
            "location": data["location"]["name"],
            "temperature": data["current"]["temp_c"],
            "condition": data["current"]["condition"]["text"],
        }


async def get_air_quality(city: str) -> Dict[str, Any]:
    """Get air quality data for a city"""
    async with httpx.AsyncClient() as http_client:
        response = await http_client.get(
            f"http://api.weatherapi.com/v1/current.json?key={WEATHER_API_KEY}&q={city}&aqi=yes"
        )
        if response.status_code != 200:
            raise HTTPException(
                status_code=response.status_code, detail="Air quality API error"
            )
        data = response.json()
        aqi = data["current"].get("air_quality", {})
        return {
            "location": data["location"]["name"],
            "aqi": aqi.get("us-epa-index", 0),
            "pm2_5": aqi.get("pm2_5", 0),
        }


TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "get_air_quality",
            "description": "Get air quality for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    },
]


@app.get("/api/health")
def health_check():
    return {"status": "healthy"}


@app.post("/api/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
    try:
        logger.info("Received request: %s", request.message)
        logger.info("Calling LLM...")
        response = await client.chat.completions.create(
            model="modularai/Llama-3.1-8B-Instruct-GGUF",
            messages=[
                {
                    "role": "system",
                    "content": "You are a weather assistant. Use the available functions to get weather and air quality data.",
                },
                {"role": "user", "content": request.message},
            ],
            tools=TOOLS,
            tool_choice="auto",
        )
        logger.info("LLM response received")
        message = response.choices[0].message
        if message.tool_calls:
            logger.info("Processing tool call...")
            tool_call = message.tool_calls[0]
            function_name = tool_call.function.name
            logger.info("Function called: %s", function_name)
            # Parse the arguments as JSON rather than eval() for safety
            function_args = json.loads(tool_call.function.arguments)
            if function_name == "get_weather":
                data = await get_weather(function_args["city"])
                return ChatResponse(
                    type="weather", message="Here's the weather data", data=data
                )
            elif function_name == "get_air_quality":
                data = await get_air_quality(function_args["city"])
                return ChatResponse(
                    type="air_quality", message="Here's the air quality data", data=data
                )
            else:
                raise HTTPException(status_code=400, detail="Unknown function call")
        return ChatResponse(type="chat", message=message.content)
    except HTTPException:
        # Preserve intentional HTTP errors instead of converting them to 500s
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
We deploy it locally using:
magic run app
This will run our FastAPI application on port 8078, which we can test with:
curl -X POST http://localhost:8078/api/chat \
-H "Content-Type: application/json" \
-d '{"message": "What is the weather in Toronto?"}'
The expected output is:
{
  "type": "weather",
  "message": "Here's the weather data",
  "data": {
    "location": "Toronto",
    "temperature": -3.0,
    "condition": "Partly cloudy"
  }
}
As another example, test the air quality function call:
curl -X POST http://localhost:8078/api/chat \
-H "Content-Type: application/json" \
-d '{"message": "What is the air quality in Vancouver?"}'
The expected output is:
{
"type": "air_quality",
"message": "Here's the air quality data",
"data": {
"location": "Vancouver",
"aqi": 1,
"pm2_5": 3.515
}
}
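If you prefer testing from Python rather than curl, a small httpx client works too. This sketch assumes the app is running locally on port 8078 as above:

# Sketch: call the FastAPI endpoint from Python instead of curl.
import httpx

resp = httpx.post(
    "http://localhost:8078/api/chat",
    json={"message": "What is the weather in Toronto?"},
    timeout=60.0,
)
resp.raise_for_status()
print(resp.json())  # e.g. {"type": "weather", "message": "...", "data": {...}}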
The app automates the full round trip: it receives a chat message, lets the LLM decide which function to call, executes the corresponding WeatherAPI request, and returns a structured JSON response to the caller.
OpenAI's function calling and MAX Serve together provide an efficient way to build intelligent, interactive agents. By leveraging these tools, developers can serve open models locally, connect them to live data sources and external tools, and return structured, machine-readable responses instead of free-form text.
Now that you've implemented function calling with MAX Serve, you can explore more advanced features and join our developer community. Here are some resources to help you continue your journey:
- Learn more about the magic CLI in this Magic tutorial

DETAILS
AUTHOR
Ehsan M. Kermani
AVAILABLE TASKS
magic run single_function_call
magic run multi_function_calls
magic run app
PROBLEMS WITH THE CODE?
File an Issue