Copy, customize, and deploy: the quickest way to get your GenAI app up and running while keeping total control over every layer.
Model families that can be run using MAX.
MODEL FAMILY | VARIANTS | DESCRIPTION | MODALITY | TYPE(S) | HARDWARE |
---|---|---|---|---|---|
DeepSeek-R1-Distill-Llama | 3 | DeepSeek-R1 models improve reasoning through reinforcement learning and fine-tuning, outperforming major benchmarks. | Chat | MAX Model | CPU, GPU |
DeepSeek-R1-Distill-Qwen | 3 | DeepSeek-R1 models improve reasoning through reinforcement learning and fine-tuning, outperforming major benchmarks. | Chat | MAX Model | GPU |
Llama-3.2-Instruct | 6 | Llama 3.2 offers advanced multilingual generative language models for diverse commercial and research applications. | Chat | MAX Model | CPU, GPU |
Llama-3.2-Vision-Instruct | 1 | Llama 3.2-Vision models by Meta, integrating advanced image reasoning capabilities with text, enhance visual tasks. | Vision | MAX Model | GPU |
Llama-3.1-Instruct | 3 | Meta Llama 3.1 is a suite of multilingual, large language models (LLMs) available in 8B, 70B, and 405B sizes optimized for text. | Chat | MAX Model | CPU, GPU |
Llama-Guard-3 | 2 | Llama Guard 3 improves content safety classification with high accuracy and supports multiple languages. | Chat | MAX Model | CPU, GPU |
all-mpnet-base-v2 | 1 | all-mpnet-base-v2 is a sentence-transformers model for efficient semantic sentence representation and clustering. | Embedding | MAX Model | GPU |
phi-4-llama-distill | 1 | Transform and enhance conversational AI with Phi-4 for improved language model performance and safety. | Chat | MAX Model | CPU |
Mistral-Instruct-v0.3 | 1 | Mistral-7B-Instruct-v0.3 is a fine-tuned model with extended vocabulary and function calling. | Chat | MAX Model | GPU |
Mistral-Small-Instruct-2501 | 1 | Mistral Small 3 achieves state-of-the-art performance among small language models with 24 billion parameters. | Chat | MAX Model | GPU |
Mistral-Nemo-Instruct-2407 | 1 | Mistral-Nemo-Instruct-2407 outperforms similar models by integrating multilingual data and improved architecture. | Chat | MAX Model | GPU |
Ministral-Instruct-2410 | 1 | The Ministral-8B-Instruct-2410 model excels in multilingual tasks and code-related benchmarks, designed for on-device computing. | Chat | MAX Model | GPU |
Qwen2.5-Instruct | 5 | Qwen2.5 language models offer enhanced multilingual support, instruction-following, and long-context capabilities. | Chat, Code, Vision | MAX Model, PyTorch | GPU |
Qwen2.5-Instruct-1M | 2 | Qwen2.5-7B-Instruct-1M, a powerful language model, excels in long-context tasks with enhanced efficiency. | Chat | MAX Model | GPU |
Qwen2.5-Coder-Instruct | 2 | Qwen2.5-Coder excels in code generation, reasoning, and fixing with 128K context support. | Code | MAX Model | GPU |
Qwen2.5-Math | 2 | Qwen2.5-Math improves multilingual math problem-solving with enhanced reasoning and computational accuracy. | Chat | MAX Model | GPU |
QwQ-Preview | 1 | QwQ-32B-Preview is an AI model with potential but requires improvements in reasoning. | Chat | MAX Model | GPU |
aya-expanse | 2 | Aya Expanse 8B is a multilingual model leveraging advanced research breakthroughs for optimized performance. | Chat | PyTorch | GPU |
aya-23 | 2 | Aya 23 is a highly capable multilingual language model optimized for 23 languages and research use. | Chat | PyTorch | GPU |
aya-101 | 1 | Aya model is a multilingual AI outperforming rivals, supporting 101 languages, enhancing communication globally. | Chat | PyTorch | GPU |
EXAONE-3.5-Instruct | 9 | Bilingual EXAONE 3.5 language models offer advanced, versatile text generation across device types. | Chat, Code | MAX Model | CPU, GPU |
EXAONE-3.0-Instruct | 3 | EXAONE-3.0-7.8B-Instruct is an advanced bilingual AI model offering competitive benchmark performance. | Chat, Code | MAX Model | CPU, GPU |
Mistral-Small-Base-2501 | 1 | Mistral Small 3 redefines small Large Language Models with 24B parameters, innovative capabilities, multilingual support. | Chat | MAX Model | GPU |
Mistral-v0.1 | 1 | This generative text model, Mistral-7B-v0.1, excels compared to similar models on multiple benchmarks. | Chat | MAX Model | GPU |
Mistral-Instruct-v0.2 | 1 | Mistral-7B-Instruct-v0.2 offers refined instruction-based capabilities for effective text generation tasks. | Chat | MAX Model | GPU |
Mistral-Instruct-v0.1 | 1 | The Mistral-7B-Instruct-v0.1 is a fine-tuned, instruction-following language model. | Chat | MAX Model | GPU |
Mistral-Small-Instruct-2409 | 1 | Mistral-Small-Instruct-2409 is a fine-tuned model designed for sequence tasks with 22B parameters. | Chat | MAX Model | GPU |
Meta-Llama-3-Instruct | 2 | Meta Llama 3 models excel at generating text through optimized transformer architecture for diverse applications. | Chat | MAX Model | CPU, GPU |
Phi-3.5-mini-instruct | 1 | Phi-3.5-mini offers a powerful multilingual AI model designed for effective text generation across various use cases with enhanced reasoning capabilities and long-context understanding. | Code | MAX Model | GPU |
Phi-3.5-vision-instruct | 1 | Discover the Phi-3.5-vision model, a state-of-the-art, lightweight multimodal solution for image-text tasks. | Vision | PyTorch | GPU |
Llama-2-chat-hf | 3 | Llama 2 offers scalable generative text models enhancing dialogue applications with large parameter sizes. | Chat | MAX Model | CPU, GPU |
llava-v1.5 | 1 | LLaVA is an advanced open-source chatbot designed for multimodal instruction research and language processing. | Vision | PyTorch | GPU |
LLaVA-delta-v0 | 1 | LLaVA is a fine-tuned open-source chatbot model using GPT for multimodal instruction-following. | Chat | MAX Model | GPU |
CodeLlama-hf | 6 | Code Llama offers versatile models for code synthesis and understanding, designed for multiple programming needs. | Code | MAX Model | CPU, GPU |
TinyLlama-Chat-v1.0 | 2 | TinyLlama efficiently pretrains a compact 1.1B Llama model, optimizing resources and enhancing adaptability. | Chat | MAX Model | CPU, GPU |
starcoder2-instruct-v0.1 | 1 | StarCoder2-15B-Instruct is a self-aligned model optimizing code generation without human annotations. | Code | PyTorch | GPU |
DeepSeek-Coder-V2-Lite-Instruct | 1 | DeepSeek-Coder-V2 excels in code tasks, supports 338 languages, and boasts advanced features. | Code | PyTorch | GPU |
Dolphin3.0-R1-Mistral | 1 | Dolphin 3.0 R1, a flexible AI model, enables customized solutions for coding and general tasks. | Chat | MAX Model | GPU |
Dolphin3.0-Mistral | 1 | Dolphin 3.0 Mistral 24B is an adaptable, user-controlled AI model for diverse applications. | Chat | MAX Model | GPU |
Dolphin3.0-Llama3.1 | 3 | Dolphin 3.0, a versatile AI model, empowers businesses with customizable and private solutions. | Chat | MAX Model | CPU, GPU |
Dolphin3.0-Llama3.2 | 5 | Dolphin 3.0 is an adaptable AI model focused on privacy, control, and customization. | Chat | MAX Model | CPU, GPU |
Dolphin3.0-Qwen2.5 | 3 | Dolphin 3.0 is an advanced AI model offering personalized, general-purpose functionalities for diverse applications. | Chat | MAX Model | GPU |
WizardLM-2 | 1 | Introducing WizardLM-2, an advanced multilingual model with enhanced performance for complex tasks. | Chat | MAX Model | GPU |
OLMo-2-1124-Instruct | 1 | OLMo-2 models offer diverse language capabilities, perfect for cutting-edge state-of-the-art tasks. | Chat | PyTorch | GPU |
OLMo-Instruct | 1 | OLMo 7B Instruct enhances language model science, excels at question answering, with open access. | Code | PyTorch | GPU |
OLMo-hf | 1 | OLMo is an open-source language model series by AI2, optimized for language model research. | Chat | MAX Model | GPU |
OLMo-0424 | 1 | OLMo 7B April 2024 improves language model performance using updated training techniques and datasets. | Chat | PyTorch | GPU |
OLMo-0724-hf | 1 | The OLMo 1B July 2024 model is enhanced with advanced dataset training, showing substantial performance improvements. | Chat | MAX Model | GPU |
OLMo-2-1124-Instruct-preview | 1 | OLMo-2 models excel in diverse tasks using advanced training techniques and a versatile dataset. | Chat | PyTorch | GPU |
Nous-Hermes | 1 | Nous-Hermes-13b is a cutting-edge language model outperforming in long responses and task accuracy. | Chat | MAX Model | GPU |
Nous-Hermes-Llama2 | 2 | Nous-Hermes-Llama2-13b is optimized for long, accurate responses without censorship, outperforming predecessors. | Chat | MAX Model | CPU, GPU |
Nous-Hermes-2-Yi | 3 | Nous Hermes 2 - Yi-34B sets new standards with exceptional benchmark performance and usability. | Chat | MAX Model | CPU, GPU |
Nous-Hermes-2-SOLAR | 3 | Nous Hermes 2 on SOLAR 10.7B achieves enhanced performance on various AI benchmarks, closer to Yi-34B. | Chat | MAX Model | CPU, GPU |
Nous-Hermes-2-Mistral-DPO | 1 | Nous Hermes 2 Mistral 7B DPO model exhibits improved performance across multiple benchmarking tests. | Chat | MAX Model | GPU |
Nous-Hermes-llama-2 | 3 | Nous-Hermes-Llama2-7b excels with long responses and reduced hallucination using diverse datasets. | Chat | MAX Model | CPU, GPU |
c4ai-command-r-plus | 1 | C4AI Command R+ is a multilingual, advanced AI model trained for reasoning, summarization, and question answering using 104 billion parameters. | Chat | PyTorch | GPU |
c4ai-command-r-v01 | 2 | C4AI Command-R is a 35 billion parameter, multilingual generative model optimized for various tasks. | Chat, Code | PyTorch | GPU |
c4ai-command-r-08-2024 | 1 | C4AI Command R 08-2024 is a multidimensional AI model excelling in multilingual tasks, reasoning, and citations. | Chat | PyTorch | GPU |
Yi-1.5-Chat | 6 | Yi-1.5 enhances performance in language tasks, coding, and reasoning using extensive training. | Chat | MAX Model | CPU, GPU |
Yi-Coder | 2 | Yi-Coder offers advanced open-source code models supporting 52 languages with exceptional performance and efficiency. | Chat, Code | MAX Model | CPU, GPU |
Yi-Coder-Chat | 3 | Yi-Coder delivers top coding performance using models under 10 billion parameters for 52 languages. | Chat | MAX Model | CPU, GPU |
Yi | 5 | Explore Yi's next-gen, bilingual open-source models excelling in various benchmarks globally. | Chat | MAX Model | CPU, GPU |
Yi-200K | 6 | Yi's innovative bilingual language models excel in large-scale multilingual benchmarks, offering top-tier performance across tasks. | Chat | MAX Model | CPU, GPU |
Yi-Chat | 8 | Yi series models are powerful bilingual, open-source AI language models, excelling in various NLP tasks. | Chat | MAX Model | CPU, GPU |
Codestral-v0.1 | 1 | Codestral-22B-v0.1 excels in code generation, utilizing 80+ programming languages like Python and Java. | Code | MAX Model | GPU |
Mamba-Codestral-v0.1 | 1 | Codestral Mamba 7B is a leading open code model, excelling in multiple benchmarks. | Chat | PyTorch | GPU |
Sailor2-Chat | 3 | Sailor2 offers a multilingual language model for 15 South-East Asian languages, supporting diverse applications. | Chat | MAX Model | GPU |
Sailor2 | 3 | Sailor2 develops multilingual language models for Southeast Asia with accessible advanced technology. | Chat | MAX Model | GPU |
vicuna-delta-v0 | 2 | Vicuna is a chat assistant fine-tuned from LLaMA, intended primarily for research purposes. | Chat | MAX Model | GPU |
vicuna-v1.3 | 2 | Vicuna is an advanced chat assistant model designed for research in NLP, AI, and ML. | Chat | MAX Model | GPU |
vicuna-v1.5 | 2 | Vicuna, an AI chat assistant, excels in research on language models and chatbots, benefiting NLP enthusiasts. | Chat | MAX Model | GPU |
vicuna-v1.5-16k | 1 | Vicuna is a chat assistant model, fine-tuned on user-shared conversations for NLP research. | Chat | MAX Model | GPU |
vicuna-delta-v1.1 | 2 | Vicuna is a chat assistant trained on LLaMA for research in AI and NLP conversations. | Chat | MAX Model | GPU |
vicuna-v1.1 | 1 | Vicuna is a fine-tuned LLaMA model focused on chatbot research using user-shared conversations. | Chat | MAX Model | GPU |
Nemotron-Mini-Instruct | 1 | The Nemotron-Mini-4B-Instruct is an optimized language model for roleplay, Q&A, and function calls. | Chat | PyTorch | GPU |
Falcon3-Instruct | 7 | Falcon3-1B-Instruct is a versatile model excelling in reasoning, language, and inquiry tasks. | Chat | MAX Model | CPU, GPU |
falcon-instruct | 2 | Falcon-7B-Instruct is a powerful, optimized model for chat and instruction tasks. | Code | PyTorch | GPU |
WizardCoder-Python-V1.0 | 2 | WizardCoder models significantly advance code language models, reaching state-of-the-art results in evaluations. | Code | MAX Model | GPU |
WizardCoder-V1.1 | 3 | Explore WizardCoder, a leading Code Large Language Model with exceptional performance across multiple benchmarks. | Code | MAX Model | CPU, GPU |
CodeLlama-Instruct-hf | 9 | Code Llama offers versatile models for code synthesis and understanding across various programming tasks. | Code | MAX Model | CPU, GPU |
DeepHermes-3-Llama-3-Preview | 3 | DeepHermes 3, enhanced with advanced reasoning, offers flexible AI interaction and improved user alignment. | Chat | MAX Model | CPU, GPU |
OpenThinker | 2 | OpenThinker-7B excels in performance and open-source accessibility across multiple evaluation metrics. | Chat | MAX Model | GPU |
DeepSeek-R1-Distill-Qwen-abliterated | 1 | Uncensored DeepSeek model version created via abliteration to enhance LLM functionalities. | Chat | MAX Model | GPU |
DeepSeek-R1-Distill-Qwen-abliterated-v2 | 2 | DeepSeek-R1-Distill-Qwen-7B-abliterated-v2 offers an uncensored, refined language model version. | Chat | MAX Model | GPU |
Velvet | 2 | Velvet-2B, an Italian language model, excels in diverse linguistic applications, promoting ethical AI use. | Chat | MAX Model | GPU |
SmolLM2-Instruct | 2 | SmolLM2 offers efficient, compact language models excelling in diverse tasks and instruction-following. | Chat | MAX Model | CPU, GPU |
OpenR1-Qwen | 1 | Finetuned Qwen2.5-Math model enhances mathematical problem-solving capabilities with extended context length. | Chat | MAX Model | GPU |
QVikhr-2.5-Instruct-r | 1 | QVikhr-2.5-1.5B-Instruct-r is a bilingual language model specialized in Russian math datasets. | Chat | MAX Model | GPU |
ReaderLM-v2 | 1 | ReaderLM-v2 efficiently converts HTML into markdown or JSON, supporting 29 languages with 512K token handling. | Chat | MAX Model | GPU |
Llama-2-hf | 3 | Llama 2 offers advanced, scalable language models with enhanced performance and ethical considerations. | Chat | MAX Model | CPU, GPU |
Llama-3.1-Tulu-3.1 | 3 | Llama-3.1-Tulu-3 models excel in diverse tasks with advanced training techniques and open-source data. | Chat | MAX Model | CPU, GPU |
OREAL | 1 | OREAL mathematical reasoning models excel with innovative reinforcement learning, achieving significant accuracy improvements. | Chat | MAX Model | GPU |
DeepSeek-llama3.1-Bllossom | 2 | DeepSeek-llama3.1-Bllossom-8B enhances multilingual AI performance, notably improving Korean language inference. | Chat | MAX Model | CPU, GPU |
DeepSeek-R1-Distill-Llama-abliterated | 1 | DeepSeek-R1-Distill-Llama-8B-abliterated is an uncensored model offering advanced AI capabilities. | Chat | MAX Model | GPU |
YuE-s1-anneal-en-cot | 3 | YuE is an innovative music generation model transforming lyrics into complete songs, supporting diverse genres and languages. | Chat | MAX Model | CPU, GPU |
DeepSeek-R1-Distill-Qwen-Japanese | 1 | DeepSeek-R1-Distill-Qwen-32B-Japanese is a Japanese language model designed for advanced text generation. | Chat | MAX Model | GPU |
Llama-Krikri-Instruct | 1 | Llama-Krikri-8B-Instruct enhances Greek text generation through extensive pretraining and bilingual capabilities. | Chat | MAX Model | CPU |
Fino1 | 2 | Fino1-8B is a financial reasoning model fine-tuned for enhanced performance on specified tasks. | Chat | MAX Model | CPU, GPU |
© Copyright - Modular Inc - 2025