The quality score combines HLE, coding, factual grounding, context, and speed results from the published roster. For exact weights and coverage rules, see the Methodology page.
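As a rough illustration of how a composite like this can be computed, here is a minimal sketch of a weighted mean over the five named components. The weights, the 0-100 component scale, the `composite_score` name, and the rule of skipping missing components and renormalizing are all assumptions made for this example; the actual weights and coverage rules are the ones on the Methodology page and may differ.

```python
from typing import Mapping, Optional

# Hypothetical weights for illustration only. The real weights live on the
# Methodology page and are not reproduced here.
ILLUSTRATIVE_WEIGHTS = {
    "hle": 0.30,
    "coding": 0.25,
    "factual_grounding": 0.20,
    "context": 0.15,
    "speed": 0.10,
}


def composite_score(
    components: Mapping[str, Optional[float]],
    weights: Mapping[str, float] = ILLUSTRATIVE_WEIGHTS,
) -> Optional[float]:
    """Return a weighted mean of component scores on a 0-100 scale.

    Components that are None (for example a benchmark listed as "n/a")
    are skipped and the remaining weights are renormalized. That coverage
    rule is an assumption of this sketch, not the published one.
    """
    covered = {
        name: value
        for name, value in components.items()
        if value is not None and name in weights
    }
    total_weight = sum(weights[name] for name in covered)
    if total_weight == 0:
        return None  # nothing to score
    weighted_sum = sum(weights[name] * value for name, value in covered.items())
    return weighted_sum / total_weight
```

Under this assumed rule, a model with a missing component is scored only on what was measured, so its composite is not dragged toward zero by the gap.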
Buyer-facing table
| Rank | Model | Quality score | Trend | HLE | TTFT (provider) | Ease of use | Arc-AGI-2 | Context | Verdict |
|---|---|---|---|---|---|---|---|---|---|
| 01 | Gemini 3.1 Pro | 80.7 | Falling -0.9 | 44.4% | 0.28s Google AI Studio | 90% Ready to use | 77.1% | 1M | |
| 02 | Claude Opus 4.6 | 80.0 | Falling -1.3 | 62.7% | 0.35s Google Vertex | 90% Ready to use | 68.8% | 128K | |
| 03 | Claude Sonnet 4.6 | 70.0 | Falling -7.7 | 33.2% | 1.2s Google Vertex (Global) | 90% Ready to use | 58.3% | 128K | |
| 04 | GPT-5.4 (replaces GPT-5.2) | 68.1 | Falling -24.1 | 41.6% | 1.3s OpenAI | 90% Ready to use | 73.3% | 128K | |
| 05 | Gemini 3 Flash | 62.0 | Rising +6.1 | 33.7% | 0.13s Google AI Studio | 90% Ready to use | 33.6% | 1M | |
| 06 | Qwen 3.6 Plus Preview | 59.7 | Rising +20.3 | 46.3% | 3.6s Qwen | 90% Ready to use | n/a | 262.1K | |
| 07 | Gemini 3.1 Flash-Lite | 37.2 | Rising +10.4 | 16.0% | 0.09s Google AI Studio | 90% Ready to use | 12.0% | 1M | |
| 08 | Kimi K2.5 | 37.2 | Rising +8.5 | 29.4% | 0.65s Moonshot AI | 90% Ready to use | 12.1% | 262K | |
| 09 | MiniMax M2.7 | 34.7 | Rising +8.8 | 28.1% | 0.19s MiniMax | 90% Ready to use | n/a | 204.8K | |
| 10 | DeepSeek V3.2 (Thinking) | 32.4 | Rising +11.2 | 22.0% | 1.5s DeepSeek | 90% Ready to use | n/a | 163.8K | |

Editorial investigation
An investigation into the emergent structural behaviors of the latest frontier update and its impact on the latent quality scores.
"Quality is what still makes sense after the excitement fades."
PickAIModel Editorial