Mercury 2

Fastest LLM by output speed (735 tok/s) using diffusion

Inception Labs · Mercury · Released October 2025

Context Window

128K tokens

Parameters

Undisclosed

Source

Closed

Modality

Text, Code

Mercury 2 is a diffusion-based language model from Inception Labs. Its reported output throughput of 735 tokens/sec is the highest listed, a result attributed to its diffusion architecture.
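To put the throughput figure in practical terms, the sketch below converts an output length into an approximate generation time. It assumes the quoted 735 tokens/sec is sustained for the whole response and ignores network latency and time to first token, neither of which is stated here. The function name is illustrative and not part of any Inception Labs API.

```python
# Rough generation-time estimate from the quoted output throughput.
# Assumption: throughput is constant; latency to first token is not modeled.

OUTPUT_TOKENS_PER_SEC = 735.0  # figure quoted on this page


def estimated_generation_seconds(
    output_tokens: int,
    tokens_per_sec: float = OUTPUT_TOKENS_PER_SEC,
) -> float:
    """Return the approximate seconds needed to emit `output_tokens` tokens."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return output_tokens / tokens_per_sec


if __name__ == "__main__":
    # Print estimates for a few typical response lengths.
    for n in (256, 1024, 4096):
        print(f"{n} tokens: ~{estimated_generation_seconds(n):.2f} s")
```

Measured end-to-end latency will likely be higher than these estimates, since they cover only the token-emission phase.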

Specifications

Technical details

Developer

Inception Labs

Model Family

Mercury

Parameters

Undisclosed

Context Window

128,000 tokens (128K)

Modality

Text, Code

Open Source

No

License

Proprietary

API Available

Yes

Release Date

October 1, 2025

Pricing

$0.50 / $1.50 per 1M tokens (input / output)
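As a worked example of the pricing, the sketch below estimates the cost of a single request. It assumes the two listed figures are the input and output prices per million tokens, in that order, which is the usual convention but is not spelled out here. The constant and function names are illustrative.

```python
# Per-request cost estimate from the listed per-million-token prices.
# Assumption: $0.50 applies to input tokens and $1.50 to output tokens.

INPUT_USD_PER_M = 0.50   # assumed input price, USD per 1M tokens
OUTPUT_USD_PER_M = 1.50  # assumed output price, USD per 1M tokens


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 100,000-token prompt with a 20,000-token response.
    print(f"${estimate_cost_usd(100_000, 20_000):.4f}")
```

Under these assumptions, output tokens cost three times as much as input tokens, so long responses dominate the bill faster than long prompts do.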