Mercury 2

Fastest LLM by output speed (735 tokens/sec), built on a diffusion architecture

Context Window: 128K tokens
Parameters: Undisclosed
Source: Closed
Modality: Text, Code

Mercury 2 is a diffusion-based language model from Inception Labs, built on a novel architecture. It has the highest output throughput, at 735 tokens/sec.
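
To put the throughput figure in perspective, here is a rough sketch of how long a response of a given length might take to generate at 735 tokens/sec. It assumes a constant output rate and ignores time-to-first-token, queueing, and network latency, none of which are reported on this page. The function name `generation_seconds` is illustrative, not part of any Inception Labs API.

```python
# Illustrative estimate only: assumes a constant 735 tokens/sec output rate
# and ignores time-to-first-token, queueing, and network latency.
OUTPUT_TOKENS_PER_SEC = 735  # throughput figure quoted on this page


def generation_seconds(output_tokens: int,
                       tokens_per_sec: float = OUTPUT_TOKENS_PER_SEC) -> float:
    """Return the approximate seconds needed to generate `output_tokens` tokens."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return output_tokens / tokens_per_sec


if __name__ == "__main__":
    for n in (500, 2_000, 10_000):
        print(f"{n:>6} tokens ~ {generation_seconds(n):.2f} s")
```

Real-world latency will be higher than this lower bound, since the quoted figure covers output speed only.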

Specifications

Technical details
Developer: Inception Labs
Model Family: Mercury
Parameters: Undisclosed
Context Window: 128,000 tokens (128K)
Modality: Text, Code
Open Source: No
License: Proprietary
API Available: Yes
Release Date: October 1, 2025
Pricing: $0.50/$1.50 per million tokens
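
The pricing and context-window figures can be turned into a quick per-request estimate. The sketch below makes two assumptions that this page does not confirm: that the two prices are the input and output rates respectively (the usual convention for this notation), and that the 128,000-token window covers the prompt and the generated tokens combined. The helper names are hypothetical.

```python
# Assumption (not stated on this page): "$0.50/$1.50" means
# $0.50 per million input tokens and $1.50 per million output tokens.
INPUT_USD_PER_M = 0.50
OUTPUT_USD_PER_M = 1.50
CONTEXT_WINDOW = 128_000  # tokens, from the specifications above


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in US dollars."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


def fits_context(input_tokens: int, output_tokens: int) -> bool:
    """Check the request against the window.

    Assumes the window is shared by prompt and generated tokens.
    """
    return input_tokens + output_tokens <= CONTEXT_WINDOW


if __name__ == "__main__":
    prompt, completion = 100_000, 2_000
    print(f"fits in context: {fits_context(prompt, completion)}")
    print(f"estimated cost: ${estimate_cost_usd(prompt, completion):.4f}")
```

If the provider bills differently, for example with separate rates for cached input, the constants and formula would need adjusting.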