Copy, customize, and deploy: the quickest way to get your GenAI app up and running while keeping total control over every layer.
Created by our community.
Model families that can be run using MAX.
MODEL FAMILY | VARIANTS | DESCRIPTION | MODALITY | TYPE(S) | HARDWARE |
---|---|---|---|---|---|
DeepSeek-R1-Distill-Llama | 2 | DeepSeek-R1 models improve reasoning through reinforcement learning and fine-tuning, outperforming major benchmarks. | Chat | MAX Model | CPU, GPU |
DeepSeek-R1-Distill-Qwen | 3 | DeepSeek-R1 models improve reasoning through reinforcement learning and fine-tuning, outperforming major benchmarks. | Chat | MAX Model | GPU |
Llama-3.2-Instruct | 4 | Llama 3.2 offers advanced multilingual generative language models for diverse commercial and research applications. | Chat | MAX Model | CPU, GPU |
Llama-3.2-Vision-Instruct | 1 | Llama 3.2-Vision models by Meta, integrating advanced image reasoning capabilities with text, enhance visual tasks. | Vision | MAX Model | GPU |
Llama-3.1-Instruct | 2 | Meta Llama 3.1 is a suite of multilingual large language models (LLMs) available in 8B, 70B, and 405B sizes optimized for text. | Chat | MAX Model | CPU, GPU |
Llama-Guard-3 | 2 | Llama Guard 3 improves content safety classification with high accuracy and supports multiple languages. | Chat | MAX Model | CPU, GPU |
all-mpnet-base-v2 | 1 | all-mpnet-base-v2 is a sentence-transformers model for efficient semantic sentence representation and clustering. | Embedding | MAX Model | GPU |
phi-4-llama-distill | 2 | Transform and enhance conversational AI with Phi-4 for improved language model performance and safety. | Chat, Code | MAX Model | CPU, GPU |
Mistral-Instruct-v0.3 | 1 | Mistral-7B-Instruct-v0.3 is a fine-tuned model with extended vocabulary and function calling. | Chat | MAX Model | GPU |
Mistral-Small-Instruct-2501 | 1 | Mistral Small 3 achieves state-of-the-art performance among small language models with 24 billion parameters. | Chat | MAX Model | GPU |
Mistral-Nemo-Instruct-2407 | 1 | Mistral-Nemo-Instruct-2407 outperforms similar models by integrating multilingual data and improved architecture. | Chat | MAX Model | GPU |
Ministral-Instruct-2410 | 1 | The Ministral-8B-Instruct-2410 model excels in multilingual tasks and code-related benchmarks, designed for on-device computing. | Chat | MAX Model | GPU |
Qwen2.5-Instruct | 5 | Qwen2.5 language models offer enhanced multilingual support, instruction-following, and long-context capabilities. | Chat, Code, Vision | MAX Model, PyTorch | GPU |
Qwen2.5-Instruct-1M | 2 | Qwen2.5-7B-Instruct-1M, a powerful language model, excels in long-context tasks with enhanced efficiency. | Chat | MAX Model | GPU |
Qwen2.5-Coder-Instruct | 2 | Qwen2.5-Coder excels in code generation, reasoning, and fixing with 128K context support. | Code | MAX Model | GPU |
Qwen2.5-Math | 2 | Qwen2.5-Math improves multilingual math problem-solving with enhanced reasoning and computational accuracy. | Chat | MAX Model, PyTorch | GPU |
QwQ-Preview | 1 | QwQ-32B-Preview is an AI model with potential but requires improvements in reasoning. | Chat | MAX Model | GPU |
aya-expanse | 2 | Aya Expanse 8B is a multilingual model leveraging advanced research breakthroughs for optimized performance. | Chat | PyTorch | GPU |
aya-23 | 2 | Aya 23 is a highly capable multilingual language model optimized for 23 languages and research use. | Chat | PyTorch | GPU |
aya-101 | 1 | The Aya model is a multilingual AI that outperforms rivals and supports 101 languages, enhancing communication globally. | Chat | PyTorch | GPU |
EXAONE-3.5-Instruct | 6 | Bilingual EXAONE 3.5 language models offer advanced, versatile text generation across device types. | Chat, Code | MAX Model | CPU, GPU |
EXAONE-3.0-Instruct | 2 | EXAONE-3.0-7.8B-Instruct is an advanced bilingual AI model offering competitive benchmark performance. | Chat, Code | MAX Model | CPU, GPU |
phi-4 | 1 | Phi-4 is a state-of-the-art model aimed at enhancing research in language understanding, offering precise text generation with safety measures. | Code | MAX Model | GPU |
Mistral-Small-Base-2501 | 1 | Mistral Small 3 redefines small Large Language Models with 24B parameters, innovative capabilities, and multilingual support. | Chat | MAX Model | GPU |
Mistral-v0.1 | 1 | This generative text model, Mistral-7B-v0.1, excels compared to similar models on multiple benchmarks. | Chat | MAX Model | GPU |
Mistral-Instruct-v0.2 | 1 | Mistral-7B-Instruct-v0.2 offers refined instruction-based capabilities for effective text generation tasks. | Chat | MAX Model | GPU |
Mistral-Instruct-v0.1 | 1 | Mistral-7B-Instruct-v0.1 is a fine-tuned, instruction-following language model. | Chat | MAX Model | GPU |
Mistral-Small-Instruct-2409 | 1 | Mistral-Small-Instruct-2409 is a fine-tuned model designed for sequence tasks with 22B parameters. | Chat | MAX Model | GPU |
Meta-Llama-3-Instruct | 2 | Meta Llama 3 models excel at generating text through optimized transformer architecture for diverse applications. | Chat | MAX Model | CPU, GPU |
Phi-3.5-mini-instruct | 1 | Phi-3.5-mini offers a powerful multilingual AI model designed for effective text generation across various use cases with enhanced reasoning capabilities and long-context understanding. | Code | MAX Model | GPU |
Phi-3.5-vision-instruct | 1 | Discover the Phi-3.5-vision model, a state-of-the-art, lightweight multimodal solution for image-text tasks. | Vision | PyTorch | GPU |
Llama-2-chat-hf | 2 | Llama 2 offers scalable generative text models enhancing dialogue applications with large parameter sizes. | Chat | MAX Model | CPU, GPU |
llava-v1.5 | 1 | LLaVA is an advanced open-source chatbot designed for multimodal instruction research and language processing. | Vision | PyTorch | GPU |
LLaVA-delta-v0 | 1 | LLaVA is a fine-tuned open-source chatbot model using GPT for multimodal instruction-following. | Chat | MAX Model | GPU |
CodeLlama-hf | 4 | Code Llama offers versatile models for code synthesis and understanding, designed for multiple programming needs. | Code | MAX Model | CPU, GPU |
TinyLlama-Chat-v1.0 | 2 | TinyLlama efficiently pretrains a compact 1.1B Llama model, optimizing resources and enhancing adaptability. | Chat | MAX Model | CPU, GPU |
starcoder2-instruct-v0.1 | 1 | StarCoder2-15B-Instruct is a self-aligned model optimizing code generation without human annotations. | Code | PyTorch | GPU |
DeepSeek-Coder-V2-Lite-Instruct | 1 | DeepSeek-Coder-V2 excels in code tasks, supports 338 languages, and boasts advanced features. | Code | PyTorch | GPU |
Dolphin3.0-R1-Mistral | 1 | Dolphin 3.0 R1, a flexible AI model, enables customized solutions for coding and general tasks. | Chat | MAX Model | GPU |
Dolphin3.0-Mistral | 1 | Dolphin 3.0 Mistral 24B is an adaptable, user-controlled AI model for diverse applications. | Chat | MAX Model | GPU |
Dolphin3.0-Llama3.1 | 2 | Dolphin 3.0, a versatile AI model, empowers businesses with customizable and private solutions. | Chat | MAX Model | CPU, GPU |
Dolphin3.0-Llama3.2 | 4 | Dolphin 3.0 is an adaptable AI model focused on privacy, control, and customization. | Chat | MAX Model | CPU, GPU |
Dolphin3.0-Qwen2.5 | 3 | Dolphin 3.0 is an advanced AI model offering personalized, general-purpose functionalities for diverse applications. | Chat | MAX Model | GPU |
WizardLM-2 | 1 | Introducing WizardLM-2, an advanced multilingual model with enhanced performance for complex tasks. | Chat | MAX Model | GPU |
OLMo-2-1124-Instruct | 2 | OLMo-2 models offer diverse language capabilities, suited to state-of-the-art tasks. | Chat | PyTorch | GPU |
OLMo-Instruct | 1 | OLMo 7B Instruct advances the science of language models and excels at question answering, with open access. | Chat | PyTorch | GPU |
OLMo-hf | 1 | OLMo is an open-source language model series by AI2, optimized for language model research. | Chat | MAX Model | GPU |
OLMo-0424 | 1 | OLMo 7B April 2024 improves language model performance using updated training techniques and datasets. | Chat | PyTorch | GPU |
OLMo-0724-hf | 1 | The OLMo 1B July 2024 model is enhanced with advanced dataset training, showing substantial performance improvements. | Chat | MAX Model | GPU |
OLMo-2-1124-Instruct-preview | 1 | OLMo-2 models excel in diverse tasks using advanced training techniques and a versatile dataset. | Chat | PyTorch | GPU |
Nous-Hermes | 1 | Nous-Hermes-13b is a cutting-edge language model that excels in long responses and task accuracy. | Chat | MAX Model | GPU |
Nous-Hermes-Llama2 | 2 | Nous-Hermes-Llama2-13b is optimized for long, accurate responses without censorship, outperforming predecessors. | Chat | MAX Model | CPU, GPU |
Nous-Hermes-2-Yi | 2 | Nous Hermes 2 - Yi-34B sets new standards with exceptional benchmark performance and usability. | Chat | MAX Model | CPU, GPU |
Nous-Hermes-2-SOLAR | 2 | Nous Hermes 2 on SOLAR 10.7B achieves enhanced performance on various AI benchmarks, closer to Yi-34B. | Chat | MAX Model | CPU, GPU |
Nous-Hermes-2-Mistral-DPO | 1 | Nous Hermes 2 Mistral 7B DPO model exhibits improved performance across multiple benchmarking tests. | Chat | MAX Model | GPU |
Nous-Hermes-llama-2 | 2 | Nous-Hermes-Llama2-7b excels with long responses and reduced hallucination using diverse datasets. | Chat | MAX Model | CPU, GPU |
c4ai-command-r-plus | 1 | C4AI Command R+ is a multilingual, advanced AI model trained for reasoning, summarization, and question answering using 104 billion parameters. | Chat | PyTorch | GPU |
c4ai-command-r-v01 | 2 | C4AI Command-R is a 35 billion parameter, multilingual generative model optimized for various tasks. | Chat, Code | PyTorch | GPU |
c4ai-command-r-08-2024 | 1 | C4AI Command R 08-2024 is a multidimensional AI model excelling in multilingual tasks, reasoning, and citations. | Chat | PyTorch | GPU |
Yi-1.5-Chat | 4 | Yi-1.5 enhances performance in language tasks, coding, and reasoning using extensive training. | Chat | MAX Model | CPU, GPU |
Yi-Coder | 2 | Yi-Coder offers advanced open-source code models supporting 52 languages with exceptional performance and efficiency. | Chat, Code | MAX Model | CPU, GPU |
Yi-Coder-Chat | 2 | Yi-Coder delivers top coding performance using models under 10 billion parameters for 52 languages. | Chat | MAX Model | CPU, GPU |
Yi | 4 | Explore Yi's next-gen, bilingual open-source models excelling in various benchmarks globally. | Chat | MAX Model | CPU, GPU |
Yi-200K | 4 | Yi's innovative bilingual language models excel in large-scale multilingual benchmarks, offering top-tier performance across tasks. | Chat | MAX Model | CPU, GPU |
Yi-Chat | 6 | Yi series models are powerful bilingual, open-source AI language models, excelling in various NLP tasks. | Chat | MAX Model | CPU, GPU |
Codestral-v0.1 | 1 | Codestral-22B-v0.1 excels in code generation, utilizing 80+ programming languages like Python and Java. | Code | MAX Model | GPU |
Mamba-Codestral-v0.1 | 1 | Codestral Mamba 7B is a leading open code model, excelling in multiple benchmarks. | Chat | PyTorch | GPU |
Sailor2-Chat | 3 | Sailor2 offers a multilingual language model for 15 South-East Asian languages, supporting diverse applications. | Chat | MAX Model | GPU |
Sailor2 | 3 | Sailor2 develops multilingual language models for Southeast Asia with accessible advanced technology. | Chat | MAX Model | GPU |
vicuna-delta-v0 | 2 | Vicuna is a chat assistant fine-tuned from LLaMA, intended primarily for research purposes. | Chat | MAX Model | GPU |
vicuna-v1.3 | 2 | Vicuna is an advanced chat assistant model designed for research in NLP, AI, and ML. | Chat | MAX Model | GPU |
vicuna-v1.5 | 2 | Vicuna, an AI chat assistant, excels in research on language models and chatbots, benefiting NLP enthusiasts. | Chat | MAX Model | GPU |
vicuna-v1.5-16k | 1 | Vicuna is a chat assistant model, fine-tuned on user-shared conversations for NLP research. | Chat | MAX Model | GPU |
vicuna-delta-v1.1 | 2 | Vicuna is a chat assistant trained on LLaMA for research in AI and NLP conversations. | Chat | MAX Model | GPU |
vicuna-v1.1 | 1 | Vicuna is a fine-tuned LLaMA model focused on chatbot research using user-shared conversations. | Chat | MAX Model | GPU |
Nemotron-Mini-Instruct | 1 | The Nemotron-Mini-4B-Instruct is an optimized language model for roleplay, Q&A, and function calls. | Chat | PyTorch | GPU |
Falcon3-Instruct | 5 | Falcon3-1B-Instruct is a versatile model excelling in reasoning, language, and inquiry tasks. | Chat | MAX Model | CPU, GPU |
falcon-instruct | 2 | Falcon-7B-Instruct is a powerful, optimized model for chat and instruction tasks. | Code | PyTorch | GPU |
WizardCoder-Python-V1.0 | 2 | WizardCoder models significantly advance code language models, reaching state-of-the-art results in evaluations. | Code | MAX Model | GPU |
WizardCoder-V1.1 | 2 | Explore WizardCoder, a leading Code Large Language Model with exceptional performance across multiple benchmarks. | Code | MAX Model | CPU, GPU |
CodeLlama-Instruct-hf | 6 | Code Llama offers versatile models for code synthesis and understanding across various programming tasks. | Code | MAX Model | CPU, GPU |
DeepHermes-3-Llama-3-Preview | 2 | DeepHermes 3, enhanced with advanced reasoning, offers flexible AI interaction and improved user alignment. | Chat | MAX Model | CPU, GPU |
OpenThinker | 2 | OpenThinker-7B excels in performance and open-source accessibility across multiple evaluation metrics. | Chat | MAX Model | GPU |
DeepSeek-R1-Distill-Qwen-abliterated | 1 | Uncensored DeepSeek model version created via abliteration to enhance LLM functionalities. | Chat | MAX Model | GPU |
DeepSeek-R1-Distill-Qwen-abliterated-v2 | 2 | DeepSeek-R1-Distill-Qwen-7B-abliterated-v2 offers an uncensored, refined language model version. | Chat | MAX Model | GPU |
Velvet | 2 | Velvet-2B, an Italian language model, excels in diverse linguistic applications, promoting ethical AI use. | Chat | MAX Model | GPU |
SmolLM2-Instruct | 2 | SmolLM2 offers efficient, compact language models excelling in diverse tasks and instruction-following. | Chat | MAX Model | CPU, GPU |
OpenR1-Qwen | 1 | Finetuned Qwen2.5-Math model enhances mathematical problem-solving capabilities with extended context length. | Chat | MAX Model | GPU |
QVikhr-2.5-Instruct-r | 1 | QVikhr-2.5-1.5B-Instruct-r is a bilingual language model specialized in Russian math datasets. | Chat | MAX Model | GPU |
ReaderLM-v2 | 1 | ReaderLM-v2 efficiently converts HTML into markdown or JSON, supporting 29 languages with 512K token handling. | Chat | MAX Model | GPU |
Llama-2-hf | 2 | Llama 2 offers advanced, scalable language models with enhanced performance and ethical considerations. | Chat | MAX Model | CPU, GPU |
Llama-3.1-Tulu-3.1 | 2 | Llama-3.1-Tulu-3 models excel in diverse tasks with advanced training techniques and open-source data. | Chat | MAX Model | CPU, GPU |
OREAL | 1 | OREAL mathematical reasoning models excel with innovative reinforcement learning, achieving significant accuracy improvements. | Chat | MAX Model | GPU |
DeepSeek-llama3.1-Bllossom | 2 | DeepSeek-llama3.1-Bllossom-8B enhances multilingual AI performance, notably improving Korean language inference. | Chat | MAX Model | CPU, GPU |
DeepSeek-R1-Distill-Llama-abliterated | 1 | DeepSeek-R1-Distill-Llama-8B-abliterated is an uncensored model offering advanced AI capabilities. | Chat | MAX Model | GPU |
YuE-s1-anneal-en-cot | 2 | YuE is an innovative music generation model transforming lyrics into complete songs, supporting diverse genres and languages. | Chat | MAX Model | CPU, GPU |
DeepSeek-R1-Distill-Qwen-Japanese | 1 | DeepSeek-R1-Distill-Qwen-32B-Japanese is a Japanese language model designed for advanced text generation. | Chat | MAX Model | GPU |
Llama-Krikri-Instruct | 2 | Llama-Krikri-8B-Instruct enhances Greek text generation through extensive pretraining and bilingual capabilities. | Chat | MAX Model | CPU, GPU |
Fino1 | 2 | Fino1-8B is a financial reasoning model fine-tuned for enhanced performance on specified tasks. | Chat | MAX Model | CPU, GPU |
Step-by-step guides for how to deploy GenAI using MAX.
Reusable projects built by the Modular community on MAX and Mojo.
© Copyright - Modular Inc - 2025