Phi-3.5 Mini Instruct
Model Summary
Phi-3.5-mini is a lightweight, state-of-the-art text generation model in the Phi-3 family. It was trained with a focus on high-quality data, including synthetic data and carefully filtered publicly available websites, and supports a context length of up to 128K tokens. Post-training combined supervised fine-tuning, proximal policy optimization (PPO), and direct preference optimization (DPO) to strengthen instruction adherence and safety.
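The direct preference optimization stage mentioned above typically minimizes the standard DPO objective (a sketch of the published loss; the exact recipe used for this model is not detailed here):

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}})
  = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}
    \left[\log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
    - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \right)\right]
```

Here $\pi_\theta$ is the policy being tuned, $\pi_{\mathrm{ref}}$ a frozen reference model, $(x, y_w, y_l)$ a prompt with preferred and dispreferred responses, $\sigma$ the logistic function, and $\beta$ a temperature controlling how far the policy may drift from the reference.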
Intended Uses
Primary Use Cases
Phi-3.5-mini is intended for commercial and research use in general-purpose AI systems. It performs well in memory- and compute-constrained environments and latency-bound scenarios, and is strongest on tasks requiring reasoning, logic, and mathematics. It can also serve as a building block for further research on language and multimodal models.
Use Case Considerations
While versatile, the model is not suitable for every downstream use. Developers should carefully evaluate potential accuracy, safety, and fairness issues before deploying it in critical or high-risk settings, and must comply with applicable legal, privacy, and trade regulations.
Release Notes
Following valuable user feedback, this update offers improved multilingual and reasoning capabilities through additional post-training data. Users are encouraged to test the model within their specific AI applications to maximize benefit from these enhancements and provide ongoing feedback to the developers.
Multilingual Capability
Phi-3.5-mini demonstrates competitive multilingual performance with its 3.8 billion active parameters, rivaling larger models in tasks such as Multilingual MMLU and MEGA datasets. It supports numerous languages, including Arabic, Chinese, Dutch, French, and more, ensuring broad usability across different linguistic contexts.
Long Context Capacity
With the ability to handle up to 128K context length, Phi-3.5-mini excels at tasks involving extended text comprehension, such as document summarization and long-form question answering. It outperforms models restricted to shorter context lengths and is comparable to larger models like Llama-3.1-8B-instruct and Mistral-7B-instruct-v0.3.
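When feeding long documents into the 128K window, it is still necessary to budget tokens for the generated output. The sketch below assumes a 131,072-token window (128K) and uses whitespace splitting as a stand-in tokenizer; a real pipeline would count tokens with the model's own tokenizer.

```python
def fits_context(token_count: int, max_tokens: int = 131_072,
                 reserved_for_output: int = 1_024) -> bool:
    """Check whether a prompt of `token_count` tokens leaves room for generation."""
    return token_count + reserved_for_output <= max_tokens


def truncate_tokens(tokens: list, max_tokens: int = 131_072,
                    reserved_for_output: int = 1_024) -> list:
    """Keep only as many leading tokens as the context window allows."""
    budget = max_tokens - reserved_for_output
    return tokens[:budget]


# Toy check with whitespace "tokens" (a real pipeline would use the tokenizer).
doc = "word " * 200_000
tokens = doc.split()
print(fits_context(len(tokens)))        # False: 200k tokens exceed the budget
print(len(truncate_tokens(tokens)))     # 130048 = 131072 - 1024
```

In practice, summarization over very long inputs often combines such truncation with chunked or hierarchical prompting rather than a single hard cut.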
Responsible AI Considerations
Acknowledging potential limitations like bias, incorrect information, or inappropriate content, developers should emphasize responsible AI practices by testing for accuracy and implementing suitable safety measures tailored to specific use cases. Phi-3.5-mini models are general-purpose and should be fine-tuned for specific applications while incorporating proper safeguards.
Training Information
Phi-3.5-mini is a 3.8 billion parameter dense, decoder-only Transformer model, optimized for chat format prompts with a 128K token context capacity. It was trained over 10 days on 512 H100-80G GPUs using 3.4 trillion tokens from well-curated public documents, synthetic data, and high-quality supervised data in chat format. Such meticulous data curation focuses on enhancing the model's reasoning ability while maintaining efficient use of its parameter scale.
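Since the model is optimized for chat-format prompts, inputs are expected in a role-tagged template. The helper below renders the `<|role|> ... <|end|>` marker scheme described in the public Phi-3 model cards; the exact whitespace is an assumption, and in practice `tokenizer.apply_chat_template` should be preferred, since it applies the template shipped with the model.

```python
def build_phi3_prompt(messages):
    """Render chat messages into a Phi-3 style prompt string.

    Assumes the <|role|> ... <|end|> marker scheme from the public Phi-3
    model cards; prefer tokenizer.apply_chat_template in real pipelines.
    """
    parts = []
    for msg in messages:
        parts.append(f"<|{msg['role']}|>\n{msg['content']}<|end|>\n")
    parts.append("<|assistant|>\n")  # cue the model to start responding
    return "".join(parts)


prompt = build_phi3_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the Phi-3 family in one sentence."},
])
print(prompt)
```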
Datasets and Fine-tuning
The training utilized diverse datasets, including quality-filtered public documents, synthetic datasets designed for teaching reasoning and general knowledge, and supervised chat format data. For further improvement, a fine-tuning module example is provided for multi-GPU setups.
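For multi-GPU fine-tuning, a launcher such as Hugging Face Accelerate is one common choice. The fragment below is a hypothetical single-node configuration (all values are placeholders to adapt to your cluster, not the configuration shipped with the model's fine-tuning example):

```yaml
# Hypothetical accelerate config for single-node multi-GPU fine-tuning.
compute_environment: LOCAL_MACHINE
distributed_type: MULTI_GPU
num_machines: 1
num_processes: 4        # one process per GPU
mixed_precision: bf16
machine_rank: 0
```

A training script would then be started with `accelerate launch --config_file <config.yaml> train.py`.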
Benchmarks and Safety
Phi-3.5-mini's reasoning abilities are tested against standard benchmarks, showcasing impressive multilingual and reasoning capabilities while being on par with or surpassing much larger models. Safety evaluations, including red-teaming and multilingual safety datasets, emphasize the need for comprehensive safety measures in various languages and risk areas, reinforcing the importance of community-wide collaboration for safer AI deployment.
Additional Information
The model runs efficiently on compatible GPU hardware, using libraries such as PyTorch, Transformers, and Flash-Attention for integration into existing systems. For best performance, keep these libraries up to date; detailed software and hardware requirements are provided in the model documentation.
Citations
For more technical details and academic references, please refer to the Phi-3 Technical Report and the Phi-3 Safety Post-Training Paper.