Build a Code Execution Agent with Llama, E2B Sandbox and MAX Serve
This recipe demonstrates how to build a secure code execution assistant that combines a locally served Llama model (via MAX Serve), OpenAI-compatible structured output for reliable code generation, and E2B sandboxes for isolated code execution.
The assistant takes natural language queries, generates complete Python code, runs it in a sandbox, and explains the results.
Please make sure your system meets our system requirements.
To proceed, ensure you have the magic CLI installed and that magic --version reports 0.7.2 or newer:
curl -ssL https://magic.modular.com/ | bash
or update it via:
magic self-update
Then install max-pipelines via:
magic global install -u max-pipelines
This recipe requires a GPU with CUDA 12.5 support. Recommended GPUs:
E2B API Key (required): needed for sandbox access. Add it to your .env file: E2B_API_KEY=your_key_here
Hugging Face Token (optional): enables faster model downloads. Add it to your .env file: HF_TOKEN=your_token_here
Download the code using the magic CLI:
magic init code-execution-sandbox-agent-with-e2b --from modular/max-recipes/code-execution-sandbox-agent-with-e2b
cd code-execution-sandbox-agent-with-e2b
Copy the environment template:
cp .env.example .env
Add your API keys to .env
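Both the E2B SDK and the agent read these values through python-dotenv. If you want to verify the keys are picked up, a minimal sketch (run from the project root, where .env lives):

from dotenv import load_dotenv
import os

load_dotenv()  # loads variables from .env into the environment
assert os.getenv("E2B_API_KEY"), "E2B_API_KEY is missing from .env"
print("HF_TOKEN set:", bool(os.getenv("HF_TOKEN")))  # optional token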
Test the sandbox:
magic run hello
This command runs a simple test to verify your E2B sandbox setup. You'll see a "hello world" output and a list of available files in the sandbox environment, confirming that code execution is working properly.
Start the LLM server:
Make sure port 8010 is available. You can adjust the port settings in pyproject.toml.
magic run server
This launches the Llama model with MAX Serve, enabling structured output parsing for reliable code generation. The server runs locally on port 8010 and uses the --enable-structured-output flag for OpenAI-compatible function calling.
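Once the server is up, you can confirm the OpenAI-compatible endpoint responds before starting the agent. A minimal sketch using the openai client, mirroring the URL, API key, and model name the agent uses by default (your model id may differ if you changed the settings):

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8010/v1", api_key="local")
resp = client.chat.completions.create(
    model="modularai/Llama-3.1-8B-Instruct-GGUF",
    messages=[{"role": "user", "content": "Reply with the single word: ready"}],
)
print(resp.choices[0].message.content)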
Run the interactive agent:
magic run agent
This starts the interactive Python assistant. You can now type natural language queries, for example asking it to compute the factorial of 5, and the agent will generate, execute, and explain the code.
The demo below shows the agent in action: taking a query, generating code, running it in the sandbox, and explaining the result.
The system follows a streamlined flow for code generation and execution:
graph TB
    subgraph User Interface
        CLI[Rich CLI Interface]
    end
    subgraph Backend
        LLM[Llama Model]
        Parser[Structured Output Parser]
        Sandbox[E2B Sandbox]
        Executor[Code Executor]
    end
    CLI --> LLM
    LLM --> Parser
    Parser --> Executor
    Executor --> Sandbox
    Sandbox --> CLI
Here's how the components work together:
Rich CLI Interface: collects your natural language queries and renders the generated code, execution results, and explanations in styled panels.
Llama Model: served locally by MAX Serve, it turns each query into complete, executable Python code.
Structured Output Parser: validates the model's response against the CodeExecution schema so the code blocks are always well-formed.
Code Executor: joins the code blocks, submits them to the sandbox, and captures the output for display.
E2B Sandbox: runs the code in an isolated environment, so nothing executes directly on your machine.
The flow ensures secure and reliable code execution while providing a seamless user experience with clear feedback at each step.
The hello.py script demonstrates basic E2B sandbox functionality:
from e2b_code_interpreter import Sandbox
from dotenv import load_dotenv

load_dotenv()

sbx = Sandbox()  # Creates a sandbox environment
execution = sbx.run_code("print('hello world')")  # Executes Python code

# Access execution results
for line in execution.logs.stdout:
    print(line.strip())

# List sandbox files
files = sbx.files.list("/")
print(files)
Key features: creating a sandbox is a single Sandbox() call once E2B_API_KEY is loaded, run_code executes arbitrary Python and returns its captured stdout in execution.logs, and the sandbox filesystem is accessible through sbx.files.
The agent builds on this foundation to implement a complete code execution assistant. It starts with the LLM server configuration and the function-calling schema:
LLM_SERVER_URL = os.getenv("LLM_SERVER_URL", "http://localhost:8010/v1")
LLM_API_KEY = os.getenv("LLM_API_KEY", "local")
MODEL = os.getenv("MODEL", "modularai/Llama-3.1-8B-Instruct-GGUF")
tools = [{
    "type": "function",
    "function": {
        "name": "execute_python",
        "description": "Execute python code blocks in sequence",
        "parameters": CodeExecution.model_json_schema()
    }
}]
def execute_python(blocks: List[CodeBlock]) -> str:
    with Sandbox() as sandbox:
        full_code = "\n\n".join(block.code for block in blocks)

        # Step 1: Show the code to be executed
        console.print(Panel(
            Syntax(full_code, "python", theme="monokai"),
            title="[bold blue]Step 1: Code[/bold blue]",
            border_style="blue"
        ))

        execution = sandbox.run_code(full_code)
        output = execution.logs.stdout if execution.logs and execution.logs.stdout else execution.text
        output = ''.join(output) if isinstance(output, list) else output

        # Step 2: Show the execution result
        console.print(Panel(
            output or "No output",
            title="[bold green]Step 2: Result[/bold green]",
            border_style="green"
        ))

        return output
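For reference, this is how execute_python can be exercised on its own. A sketch with hypothetical code blocks; it assumes the same CodeBlock model and Rich console set up as in the agent:

blocks = [
    CodeBlock(type="python", code="import math\nvalue = math.sqrt(16)"),
    CodeBlock(type="python", code="print(f'sqrt(16) = {value}')"),
]
output = execute_python(blocks)  # renders the Step 1 and Step 2 panels
print(repr(output))              # captured stdout, e.g. 'sqrt(16) = 4.0\n'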
Three-Step Output Process: the agent shows the generated code (Step 1), the execution result (Step 2), and then a natural language explanation of both (Step 3), each in its own panel.
Interactive Session Management:
def main():
    console.print(Panel("Interactive Python Assistant (type 'exit' to quit)",
                        border_style="cyan"))

    while True:
        query = console.input("[bold yellow]Your query:[/bold yellow] ")

        if query.lower() in ['exit', 'quit']:
            console.print("[cyan]Goodbye![/cyan]")
            break

        # ... process query ...

        explanation_messages = [
            {
                "role": "system",
                "content": "You are a helpful assistant. Explain what the code did and its result clearly and concisely."
            },
            {
                "role": "user",
                "content": f"Explain this code and its result:\n\nCode:\n{code}\n\nResult:\n{result}"
            }
        ]
The agent uses OpenAI's structured output format to ensure reliable code generation and execution. Here's how it works:
from pydantic import BaseModel
from typing import List

# Define the expected response structure
class CodeBlock(BaseModel):
    type: str
    code: str

class CodeExecution(BaseModel):
    code_blocks: List[CodeBlock]

# Define the function calling schema
tools = [{
    "type": "function",
    "function": {
        "name": "execute_python",
        "description": "Execute python code blocks in sequence",
        "parameters": CodeExecution.model_json_schema()
    }
}]
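Because the schema comes straight from the Pydantic models, you can inspect and validate it without the LLM in the loop. A sketch with a hypothetical payload:

import json

# The JSON Schema the model is asked to conform to
print(json.dumps(CodeExecution.model_json_schema(), indent=2))

# Validate a hand-written payload the same way a parsed response is validated
payload = {"code_blocks": [{"type": "python", "code": "print('hi')"}]}
parsed = CodeExecution.model_validate(payload)
print(parsed.code_blocks[0].code)  # -> print('hi')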
from openai import OpenAI

# Configure the client with local LLM server
client = OpenAI(
    base_url=LLM_SERVER_URL,  # "http://localhost:8010/v1"
    api_key=LLM_API_KEY       # "local"
)
messages = [
    {
        "role": "system",
        "content": """You are a Python code execution assistant. Generate complete, executable code based on user queries.
        Important rules:
        1. Always include necessary imports at the top
        2. Always include print statements to show results
        3. Make sure the code is complete and can run independently
        4. Test all variables are defined before use
        """
    },
    {
        "role": "user",
        "content": query
    }
]
try:
    # Parse the response into structured format
    response = client.beta.chat.completions.parse(
        model=MODEL,
        messages=messages,
        response_format=CodeExecution
    )

    # Extract code blocks from the response
    code_blocks = response.choices[0].message.parsed.code_blocks

    # Execute the code
    result = execute_python(code_blocks)
except Exception as e:
    console.print(Panel(f"Error: {str(e)}", border_style="red"))
For example, a query asking for the factorial of 5 yields a parsed response like this:

{
    "code_blocks": [
        {
            "type": "python",
            "code": "def factorial(n):\n    if n == 0:\n        return 1\n    return n * factorial(n-1)\n\nresult = factorial(5)\nprint(f'Factorial of 5 is: {result}')"
        }
    ]
}
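On the Python side, the parsed object mirrors this JSON. A quick sketch to confirm what the model returned, assuming response is the parsed completion from above:

parsed = response.choices[0].message.parsed
print(parsed.model_dump_json(indent=2))  # prints JSON shaped like the example above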
# Generate explanation using vanilla completion
explanation_messages = [
    {
        "role": "system",
        "content": "You are a helpful assistant. Explain what the code did and its result clearly and concisely."
    },
    {
        "role": "user",
        "content": f"Explain this code and its result:\n\nCode:\n{code_blocks[0].code}\n\nResult:\n{result}"
    }
]

final_response = client.chat.completions.create(
    model=MODEL,
    messages=explanation_messages
)

explanation = final_response.choices[0].message.content
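In the agent, this explanation becomes the third panel of the output. A minimal sketch; the panel title and border style here are illustrative rather than taken from the source:

console.print(Panel(
    explanation,
    title="[bold magenta]Step 3: Explanation[/bold magenta]",  # illustrative title
    border_style="magenta"
))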
This structured approach ensures that the model's output always matches the CodeExecution schema, so the agent can extract and execute code blocks reliably instead of scraping code out of free-form text.
You can interact with the agent using natural language queries; it responds by generating the corresponding Python (for example, print("Hello")) and executing it.
System Prompt:
Code Execution Flow:
Error Handling:
MODEL = os.getenv("MODEL", "modularai/Llama-3.1-8B-Instruct-GGUF")
Sandbox(timeout=300) # Configure timeout
# Customize Rich themes and styles
console.print(Panel(..., theme="custom"))
Sandbox Issues
LLM Issues
Code Execution Issues
Enhance the System
Deploy to Production
Join the Community
Share what you build with #ModularAI on social media. We're excited to see what you'll build with this foundation!
DETAILS
AUTHOR
Ehsan M. Kermani
AVAILABLE TASKS
magic run hello
magic run server
magic run agent