BasedAGI
Rankings live

hr_recruiting

Best LLM for Job Descriptions

Ranked models for drafting job descriptions that match role requirements and tone.

#1 Recommendation

gpt-4.1-20250414

Strong on Galileo Agent Leaderboard v2 Avg TSQ (64%) and MMLongBench-Doc Leaderboard acc_score_pct (75%)

external/openai/gpt-4-1-20250414

Score: 23.7%
Confidence: 36.3%
Evidence: 24
Ranked Models: 30
Evidence Quality: 79%
Scoring: Benchmark-backed
Top Signal: Galileo Agent Leaderboard v2: Avg TSQ

All Ranked Models

30 of 30
Rank  Model                            Score
#1    gpt-4.1-20250414                 23.7%
      Strong on Galileo Agent Leaderboard v2 Avg TSQ (64%) and MMLongBench-Doc Leaderboard acc_score_pct (75%)
#2    gemini-2.5-flash                 17.7%
      Strong on Galileo Agent Leaderboard v2 Avg TSQ (100%) and LanguageBench Grammar/Clarity Official (Split) grammar_clarity_score_pct (100%)
#3    gpt-4.1-mini-20250414            17.5%
      Strong on Galileo Agent Leaderboard v2 Avg TSQ (62%) and OpenVLM OCRBench Official ocrbench_score_pct (88%)
#5    gemini-2.5-pro                   15.8%
#6    gpt-4o                           15.0%
#12   Grok-4-0709                      12.6%
#13   claude-sonnet-4-20250514         12.6%
#14   qwen-2.5-72b-instruct            12.6%
#20   gpt-5-2025-08-07                 11.5%
#23   google/gemini-2.0-flash-001      11.0%
#25   gpt-5-mini-2025-08-07            10.9%
#29   gemini-3-pro-preview             10.6%
#58   google/gemini-3.1-pro-preview    9.6%
#68   Llama-2-7b-chat-hf               9.0%
#87   openai/gpt-5.4-2026-03-05        8.7%
#100  gpt-5.1-2025-11-13               8.4%
#111  anthropic/claude-sonnet-4.6      8.4%
#113  claude-opus-4-5-20251101         8.3%
#117  Qwen3-Embedding-4B               8.2%
#120  GPT-4.1-nano-2025-04-14          8.1%
#127  gemma-7b-it                      7.9%
#144  Qwen-VL-Chat                     7.6%
#160  gemma-2b-it                      7.2%
#177  xai-org/grok-4-fast-reasoning    6.9%
#178  gpt-4o-20241120                  6.9%
#210  xai-org/grok-4-1-fast-reasoning  6.5%
#218  deepseek/deepseek-r1             6.5%
#260  openai/gpt-4o-mini-2024-07-18    5.9%
#288  phi-4                            5.5%
#386  gpt-4o-2024-05-13                3.8%

Head-to-Head: #1 vs #2

#1

Top Pick

gpt-4.1-20250414

Strong on Galileo Agent Leaderboard v2 Avg TSQ (64%) and MMLongBench-Doc Leaderboard acc_score_pct (75%)

Score: 23.7%
Confidence: 36.3%

#2

gemini-2.5-flash

Strong on Galileo Agent Leaderboard v2 Avg TSQ (100%) and LanguageBench Grammar/Clarity Official (Split) grammar_clarity_score_pct (100%)

Score: 17.7%
Confidence: 21.2%
