BasedAGI

cybersecurity

Best LLM for Threat Intelligence

Ranked models for analyzing threat reports, CVEs, and advisories into structured risk assessments.

#1 Recommendation

gemini-2.5-pro

Strong on BaxBench Leaderboard average_secure_pass_1_pct (44%) and FACTS Benchmark Suite facts_grounding_score_pct (100%)

external/google/gemini-2-5-pro

Score: 27.9%
Confidence: 43.6%
Evidence: 30
Ranked Models: 30
Evidence Quality: 79%
Scoring: Benchmark-backed
Top Signal: BaxBench Leaderboard: average_secure_pass_1_pct

All Ranked Models

30 of 30
Rank  Model                                          Score
#1    gemini-2.5-pro                                 27.9%
      Strong on BaxBench Leaderboard average_secure_pass_1_pct (44%) and FACTS Benchmark Suite facts_grounding_score_pct (100%)
#2    gemini-3-pro-preview                           19.8%
      Strong on FACTS Benchmark Suite facts_grounding_score_pct (88%) and FACTS Benchmark Suite facts_search_score_pct (100%)
#3    gpt-4.1-20250414                               19.3%
      Strong on Vectara HHEM Leaderboard overall_hallucination_error_pct (82%) and Vals CorpFin v2 overall_accuracy_pct (85%)
#4    gpt-5-2025-08-07                               16.2%
#5    gpt-5-mini-2025-08-07                          16.1%
#6    anthropic/claude-sonnet-4.6                    14.9%
#7    Grok-4-0709                                    14.4%
#8    google/gemini-3.1-pro-preview                  13.6%
#9    openai/gpt-5.4-2026-03-05                      13.4%
#10   gemini-2.5-flash                               13.1%
#11   claude-opus-4-5-20251101                       13.0%
#13   gpt-5.1-2025-11-13                             11.9%
#14   claude-sonnet-4-20250514                       11.8%
#15   openai/gpt-4.1                                 11.8%
#16   gemini-3-flash-preview                         11.6%
#17   x-ai/grok-3                                    11.5%
#19   google/gemini-3.1-flash-lite-preview           11.1%
#20   xai-org/grok-4-fast-reasoning                  11.0%
#21   gpt-4.1-mini-20250414                          10.9%
#23   xai-org/grok-4-1-fast-reasoning                10.4%
#24   anthropic/claude-opus-4-6-thinking             10.4%
#25   gpt-5.2-2025-12-11                             10.4%
#26   kimi/kimi-k2.5-thinking                        9.8%
#27   anthropic/claude-opus-4-5-20251101-thinking    9.6%
#28   deepseek/deepseek-r1                           9.5%
#29   gpt-4o                                         9.3%
#30   gpt-4o-2024-05-13                              9.1%
#31   anthropic/claude-sonnet-4-5-20250929-thinking  9.0%
#32   openai/gpt-4o-mini-2024-07-18                  8.9%
#33   grok/grok-4.20-beta-0309-reasoning             8.8%

Head-to-Head: #1 vs #2

#1

Top Pick

gemini-2.5-pro

Strong on BaxBench Leaderboard average_secure_pass_1_pct (44%) and FACTS Benchmark Suite facts_grounding_score_pct (100%)

27.9%

Conf 43.6%

#2

gemini-3-pro-preview

Strong on FACTS Benchmark Suite facts_grounding_score_pct (88%) and FACTS Benchmark Suite facts_search_score_pct (100%)

19.8%

Conf 25.8%
