BasedAGI
Rankings live

Model Rankings

Models ranked by composite utility score across all benchmark-backed use cases. The score weights each use-case result by its confidence level, so models with broader, higher-confidence coverage rank higher.

How the score works

Utility Score = Σ(use-case score × confidence) / Σ(confidence), summed over all scored use cases. A model that scores well on many use cases with high confidence therefore ranks above one that scores well on only a few. Coverage breadth is used as a tiebreaker.
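The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the site's actual implementation. The names `UseCaseResult`, `utility_score`, and `rank_models` are hypothetical, as are the example models and numbers. The sketch assumes that scores and confidences are non-negative floats and that "coverage breadth" means the number of scored use cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCaseResult:
    """One scored use case for a model (hypothetical structure)."""
    score: float       # use-case score, assumed to lie in [0, 1]
    confidence: float  # confidence weight, assumed to be >= 0


def utility_score(results):
    """Confidence-weighted mean of use-case scores."""
    total_confidence = sum(r.confidence for r in results)
    if total_confidence == 0:
        return 0.0  # no usable evidence; assumed fallback
    weighted = sum(r.score * r.confidence for r in results)
    return weighted / total_confidence


def rank_models(models):
    """Sort model names by utility score, then by coverage breadth.

    `models` maps a model name to its list of UseCaseResult entries.
    Breadth is taken to be the number of scored use cases (an assumption);
    the name is a final key so the ordering is deterministic.
    """
    return sorted(
        models,
        key=lambda name: (-utility_score(models[name]), -len(models[name]), name),
    )


# Made-up example data, for illustration only.
models = {
    "model_a": [UseCaseResult(0.30, 0.9), UseCaseResult(0.20, 0.5)],
    "model_b": [UseCaseResult(0.25, 1.0)],
    "model_c": [UseCaseResult(0.25, 0.5), UseCaseResult(0.25, 0.5)],
}

print(rank_models(models))  # ['model_a', 'model_c', 'model_b']
```

In this example `model_b` and `model_c` both have a utility score of 0.25, so the broader coverage of `model_c` (two use cases against one) places it higher. Note that in this sketch breadth only acts through the tiebreaker, because the weighted mean is normalized by total confidence.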

Recent Changes

No use cases changed their #1 model since the last scoring update. There are currently 19,299 scored model/use-case pairings.

Showing top 100 models
Rank | Model | Utility Score
#1 | external/google/gemini-3-pro-preview | 24.8%
#2 | external/google/gemini-2-5-pro | 23.6%
#3 | external/openai/gpt-4-1-20250414 | 21.4%
#4 | external/anthropic/claude-sonnet-4-6 | 20.5%
#5 | external/xai/grok-4-0709 | 20.2%
#6 | katanemo/Arch-Agent-32B | 19.0%
#7 | external/openai/gpt-5-mini-2025-08-07 | 18.9%
#8 | external/google/gemini-3-1-pro-preview | 18.5%
#9 | external/openai/gpt-5-2025-08-07 | 18.5%
#10 | external/openai/gpt-5-4-2026-03-05 | 18.3%
#11 | external/anthropic/claude-sonnet-4-20250514 | 17.3%
#12 | external/google/gemini-2-5-flash | 17.2%
#13 | external/openai/gpt-5-1-2025-11-13 | 16.4%
#14 | external/anthropic/claude-opus-4-5-20251101 | 16.4%
#15 | external/google/gemini-3-flash-preview | 15.7%
#16 | external/openai/gpt-5-2-2025-12-11 | 15.6%
#17 | external/anthropic/claude-opus-4-6-thinking | 15.5%
#18 | external/google/gemini-3-1-flash-lite-preview | 15.3%
#19 | external/xai-org/grok-4-fast-reasoning | 15.2%
#20 | tiiuae/falcon-7b-instruct | 15.1%
#21 | meta-llama/Llama-2-7b-chat-hf | 15.0%
#22 | external/anthropic/claude-opus-4-5-20251101-thinking | 14.7%
#23 | external/openai/gpt-4o | 14.6%
#24 | external/xai-org/grok-4-1-fast-reasoning | 14.5%
#25 | external/kimi/kimi-k2-5-thinking | 14.0%
#26 | katanemo/Arch-Agent-3B | 13.6%
#27 | external/anthropic/claude-sonnet-4-5-20250929-thinking | 13.6%
#28 | external/qwen/qwen-2-5-72b-instruct | 13.4%
#29 | katanemo/Arch-Agent-1.5B | 13.2%
#30 | external/openai/gpt-4-1-mini-20250414 | 12.7%
#31 | HuggingFaceH4/zephyr-7b-beta | 12.6%
#32 | CohereLabs/c4ai-command-r-plus | 12.3%
#33 | EasyDeL/Kimi-VL-A3B-Instruct | 12.2%
#34 | external/kimi/kimi-k2-thinking | 12.1%
#35 | Laibaaaaa/GLM-5 | 12.0%
#36 | AIDC-AI/Ovis1.6-Gemma2-9B | 12.0%
#37 | external/alibaba/qwen3-5-flash | 12.0%
#38 | google/gemma-2b-it | 12.0%
#39 | google/gemma-7b-it | 11.8%
#40 | google/gemma-2-27b-it | 11.8%
#41 | openai/gpt-oss-120b | 11.7%
#42 | external/x-ai/grok-3 | 11.7%
#43 | external/anthropic/claude-haiku-4-5-20251001-thinking | 11.7%
#44 | external/z-ai/glm-4-7 | 11.5%
#45 | external/minimax/minimax-m2-1 | 11.4%
#46 | external/mistralai/mistral-large-2512 | 11.3%
#47 | external/openai/o3-20250416 | 11.3%
#48 | external/openai/gpt-4o-2024-08-06 | 11.2%
#49 | grimjim/mistralai-Mistral-Nemo-Instruct-2407 | 11.2%
#50 | contextboxai/Qwen3-1.7B-FC | 11.1%
#51 | Yura37/11 | 11.0%
#52 | external/openai/gpt-5 | 11.0%
#53 | maicomputer/alpaca-native | 11.0%
#54 | openai/gpt-oss-20b | 10.9%
#55 | external/xai-org/grok-4-1-fast-non-reasoning | 10.7%
#56 | meta-llama/Llama-3.3-70B-Instruct | 10.6%
#57 | EasyDeL/GLM-4.6V | 10.5%
#58 | external/anthropic/claude-opus-4-1-20250805 | 10.5%
#59 | meta-llama/Meta-Llama-3-8B-Instruct | 10.4%
#60 | CometAPI/grok4 | 10.4%
#61 | external/openai/gpt-4o-20241120 | 10.4%
#62 | external/openai/gpt-4o-2024-05-13 | 10.3%
#63 | unsloth/Kimi-K2-Instruct | 10.2%
#64 | external/deepseek/deepseek-r1 | 10.1%
#65 | external/qwen/qwen3-max | 10.1%
#66 | RedHatAI/Mistral-Small-24B-Instruct-2501 | 10.0%
#67 | mcrovero/gemma-3-27b-it | 9.9%
#68 | Qwen/Qwen2.5-32B-Instruct | 9.9%
#69 | CohereLabs/c4ai-command-r-plus-08-2024 | 9.8%
#70 | Qwen/Qwen2.5-Coder-7B | 9.8%
#71 | meta-llama/Llama-3.1-70B-Instruct | 9.7%
#72 | anyidea/Qwen3-Embedding-8B | 9.6%
#73 | google/gemma-2-9b-it | 9.5%
#74 | Open-Orca/Mistral-7B-OpenOrca | 9.5%
#75 | Qwen/Qwen3-Embedding-4B | 9.5%
#76 | anas125244235/GLM-4.5-Air | 9.4%
#77 | meta-llama/Meta-Llama-3-70B-Instruct | 9.3%
#78 | Qwen/Qwen3-32B | 9.1%
#79 | Mira190/Euler-Legal-Embedding-V1 | 9.1%
#80 | deepseek-ai/DeepSeek-V2.5 | 8.9%
#81 | external/openai/gpt-4o-mini-2024-07-18 | 8.9%
#82 | mistralai/Mistral-7B-Instruct-v0.2 | 8.7%
#83 | Qwen/QwQ-32B-Preview | 8.6%
#84 | microsoft/Phi-3-medium-128k-instruct | 8.6%
#85 | 1kxia/Qwen3-Embedding-0.6B | 8.6%
#86 | yokoe/baseline | 8.5%
#87 | external/xai-org/grok-4-fast-non-reasoning | 8.5%
#88 | microsoft/phi-4 | 8.4%
#89 | raydel-0307/Qwen3-2B | 8.3%
#90 | CometAPI/o3-pro | 8.2%
#91 | external/openai/o4-mini-20250416 | 8.1%
#92 | Qwen/Qwen-VL-Chat | 8.0%
#93 | ICT-TIME-and-Querit/BOOM_4B_v1 | 8.0%
#94 | Alibaba-NLP/gte-Qwen2-1.5B-instruct | 8.0%
#95 | unsloth/Nemotron-3-Nano-30B-A3B | 7.9%
#96 | GritLM/GritLM-7B | 7.9%
#97 | Qwen/Qwen2.5-14B-Instruct | 7.9%
#98 | GritLM/GritLM-8x7B | 7.8%
#99 | residuals/gemma-3-12b | 7.7%
#100 | moonshotai/Kimi-K2-Instruct-0905 | 7.6%