# Model Rankings
Models are ranked by a composite utility score across all benchmark-backed use cases. The score weights each use-case result by its confidence level, so models with broader, higher-confidence coverage rank higher.
## How the score works
Utility Score = sum of (use-case score × confidence) / sum of (confidence), taken over all scored use cases. A model that scores well on many use cases with high confidence therefore ranks above one that does well on only a few; coverage breadth serves as a tiebreaker.
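The weighting above is a standard confidence-weighted average. A minimal sketch in Python (the data shape and names here are illustrative, not the site's actual schema):

```python
def utility_score(results):
    """Confidence-weighted average of use-case scores.

    `results` is a list of (score, confidence) pairs, one per scored
    use case. Each use-case score is weighted by its confidence, then
    normalized by the total confidence.
    """
    total_confidence = sum(conf for _, conf in results)
    if total_confidence == 0:
        return 0.0  # no scored use cases: no utility signal
    weighted = sum(score * conf for score, conf in results)
    return weighted / total_confidence


# Example: two use cases, one scored with high confidence, one with low.
# (0.8 * 0.9 + 0.6 * 0.5) / (0.9 + 0.5) = 1.02 / 1.4 ≈ 0.729
print(utility_score([(0.8, 0.9), (0.6, 0.5)]))
```

Note that because the denominator normalizes by total confidence, breadth alone does not raise the score; that is why coverage breadth is applied separately as a tiebreaker.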
## Recent Changes
No use cases have changed their #1 model since the last scoring update. A total of 19,299 model-use-case pairings are currently scored.
Showing the top 100 models.
| Rank | Model | Utility Score |
|---|---|---|
| #1 | external/google/gemini-3-pro-preview | 24.8% |
| #2 | external/google/gemini-2-5-pro | 23.6% |
| #3 | external/openai/gpt-4-1-20250414 | 21.4% |
| #4 | external/anthropic/claude-sonnet-4-6 | 20.5% |
| #5 | external/xai/grok-4-0709 | 20.2% |
| #6 | katanemo/Arch-Agent-32B | 19.0% |
| #7 | external/openai/gpt-5-mini-2025-08-07 | 18.9% |
| #8 | external/google/gemini-3-1-pro-preview | 18.5% |
| #9 | external/openai/gpt-5-2025-08-07 | 18.5% |
| #10 | external/openai/gpt-5-4-2026-03-05 | 18.3% |
| #11 | external/anthropic/claude-sonnet-4-20250514 | 17.3% |
| #12 | external/google/gemini-2-5-flash | 17.2% |
| #13 | external/openai/gpt-5-1-2025-11-13 | 16.4% |
| #14 | external/anthropic/claude-opus-4-5-20251101 | 16.4% |
| #15 | external/google/gemini-3-flash-preview | 15.7% |
| #16 | external/openai/gpt-5-2-2025-12-11 | 15.6% |
| #17 | external/anthropic/claude-opus-4-6-thinking | 15.5% |
| #18 | external/google/gemini-3-1-flash-lite-preview | 15.3% |
| #19 | external/xai-org/grok-4-fast-reasoning | 15.2% |
| #20 | tiiuae/falcon-7b-instruct | 15.1% |
| #21 | meta-llama/Llama-2-7b-chat-hf | 15.0% |
| #22 | external/anthropic/claude-opus-4-5-20251101-thinking | 14.7% |
| #23 | external/openai/gpt-4o | 14.6% |
| #24 | external/xai-org/grok-4-1-fast-reasoning | 14.5% |
| #25 | external/kimi/kimi-k2-5-thinking | 14.0% |
| #26 | katanemo/Arch-Agent-3B | 13.6% |
| #27 | external/anthropic/claude-sonnet-4-5-20250929-thinking | 13.6% |
| #28 | external/qwen/qwen-2-5-72b-instruct | 13.4% |
| #29 | katanemo/Arch-Agent-1.5B | 13.2% |
| #30 | external/openai/gpt-4-1-mini-20250414 | 12.7% |
| #31 | HuggingFaceH4/zephyr-7b-beta | 12.6% |
| #32 | CohereLabs/c4ai-command-r-plus | 12.3% |
| #33 | EasyDeL/Kimi-VL-A3B-Instruct | 12.2% |
| #34 | external/kimi/kimi-k2-thinking | 12.1% |
| #35 | Laibaaaaa/GLM-5 | 12.0% |
| #36 | AIDC-AI/Ovis1.6-Gemma2-9B | 12.0% |
| #37 | external/alibaba/qwen3-5-flash | 12.0% |
| #38 | google/gemma-2b-it | 12.0% |
| #39 | google/gemma-7b-it | 11.8% |
| #40 | google/gemma-2-27b-it | 11.8% |
| #41 | openai/gpt-oss-120b | 11.7% |
| #42 | external/x-ai/grok-3 | 11.7% |
| #43 | external/anthropic/claude-haiku-4-5-20251001-thinking | 11.7% |
| #44 | external/z-ai/glm-4-7 | 11.5% |
| #45 | external/minimax/minimax-m2-1 | 11.4% |
| #46 | external/mistralai/mistral-large-2512 | 11.3% |
| #47 | external/openai/o3-20250416 | 11.3% |
| #48 | external/openai/gpt-4o-2024-08-06 | 11.2% |
| #49 | grimjim/mistralai-Mistral-Nemo-Instruct-2407 | 11.2% |
| #50 | contextboxai/Qwen3-1.7B-FC | 11.1% |
| #51 | Yura37/11 | 11.0% |
| #52 | external/openai/gpt-5 | 11.0% |
| #53 | maicomputer/alpaca-native | 11.0% |
| #54 | openai/gpt-oss-20b | 10.9% |
| #55 | external/xai-org/grok-4-1-fast-non-reasoning | 10.7% |
| #56 | meta-llama/Llama-3.3-70B-Instruct | 10.6% |
| #57 | EasyDeL/GLM-4.6V | 10.5% |
| #58 | external/anthropic/claude-opus-4-1-20250805 | 10.5% |
| #59 | meta-llama/Meta-Llama-3-8B-Instruct | 10.4% |
| #60 | CometAPI/grok4 | 10.4% |
| #61 | external/openai/gpt-4o-20241120 | 10.4% |
| #62 | external/openai/gpt-4o-2024-05-13 | 10.3% |
| #63 | unsloth/Kimi-K2-Instruct | 10.2% |
| #64 | external/deepseek/deepseek-r1 | 10.1% |
| #65 | external/qwen/qwen3-max | 10.1% |
| #66 | RedHatAI/Mistral-Small-24B-Instruct-2501 | 10.0% |
| #67 | mcrovero/gemma-3-27b-it | 9.9% |
| #68 | Qwen/Qwen2.5-32B-Instruct | 9.9% |
| #69 | CohereLabs/c4ai-command-r-plus-08-2024 | 9.8% |
| #70 | Qwen/Qwen2.5-Coder-7B | 9.8% |
| #71 | meta-llama/Llama-3.1-70B-Instruct | 9.7% |
| #72 | anyidea/Qwen3-Embedding-8B | 9.6% |
| #73 | google/gemma-2-9b-it | 9.5% |
| #74 | Open-Orca/Mistral-7B-OpenOrca | 9.5% |
| #75 | Qwen/Qwen3-Embedding-4B | 9.5% |
| #76 | anas125244235/GLM-4.5-Air | 9.4% |
| #77 | meta-llama/Meta-Llama-3-70B-Instruct | 9.3% |
| #78 | Qwen/Qwen3-32B | 9.1% |
| #79 | Mira190/Euler-Legal-Embedding-V1 | 9.1% |
| #80 | deepseek-ai/DeepSeek-V2.5 | 8.9% |
| #81 | external/openai/gpt-4o-mini-2024-07-18 | 8.9% |
| #82 | mistralai/Mistral-7B-Instruct-v0.2 | 8.7% |
| #83 | Qwen/QwQ-32B-Preview | 8.6% |
| #84 | microsoft/Phi-3-medium-128k-instruct | 8.6% |
| #85 | 1kxia/Qwen3-Embedding-0.6B | 8.6% |
| #86 | yokoe/baseline | 8.5% |
| #87 | external/xai-org/grok-4-fast-non-reasoning | 8.5% |
| #88 | microsoft/phi-4 | 8.4% |
| #89 | raydel-0307/Qwen3-2B | 8.3% |
| #90 | CometAPI/o3-pro | 8.2% |
| #91 | external/openai/o4-mini-20250416 | 8.1% |
| #92 | Qwen/Qwen-VL-Chat | 8.0% |
| #93 | ICT-TIME-and-Querit/BOOM_4B_v1 | 8.0% |
| #94 | Alibaba-NLP/gte-Qwen2-1.5B-instruct | 8.0% |
| #95 | unsloth/Nemotron-3-Nano-30B-A3B | 7.9% |
| #96 | GritLM/GritLM-7B | 7.9% |
| #97 | Qwen/Qwen2.5-14B-Instruct | 7.9% |
| #98 | GritLM/GritLM-8x7B | 7.8% |
| #99 | residuals/gemma-3-12b | 7.7% |
| #100 | moonshotai/Kimi-K2-Instruct-0905 | 7.6% |