DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

1. Introduction

DeepSeek-Coder-V2 is an open-source Mixture-of-Experts (MoE) code language model offering performance similar to GPT4-Turbo in code-specific tasks. Enhanced with 6 trillion additional training tokens, it surpasses its predecessor, DeepSeek-Coder-33B, in coding and mathematical reasoning tasks. The model supports 338 programming languages and extends context length from 16K to 128K, excelling in coding and math benchmarks against closed-source models (Fig. 1).

DeepSeek-Coder-V2 Performance

2. Model Downloads

DeepSeek-Coder-V2 is available in variants with 16B and 236B parameters. Based on the DeepSeekMoE framework, models have active parameters of 2.4B and 21B. Below is a summary of model versions:

Model	#Total Params	#Active Params	Context Length	Download
DeepSeek-Coder-V2-Lite-Base	16B	2.4B	128k	[Download Link]
DeepSeek-Coder-V2-Lite-Instruct	16B	2.4B	128k	[Download Link]
DeepSeek-Coder-V2-Base	236B	21B	128k	[Download Link]
DeepSeek-Coder-V2-Instruct	236B	21B	128k	[Download Link]

3. Chat Website

Chat with DeepSeek-Coder-V2 at DeepSeek's official site: coder.deepseek.com.

4. API Platform

DeepSeek offers an OpenAI-compatible API at platform.deepseek.com, with competitive pricing.

API Pricing

5. How to run locally

Inference with vLLM (recommended)

To use vLLM for model inference, integrate relevant updates into your codebase. Setup and instructions are dependent on system capabilities.

6. License

The code repository is licensed under the MIT License. Usage of models follows the Model License, permitting commercial use.

7. Contact

For questions, please raise an issue or email service@deepseek.com.

Citations

For the framework referenced in this summary: DeepSeekMoE publication.