What model should you use?
Benchmark-backed rankings for 143 use cases. No opinions. No vibes. Just evidence.
167/168
Benchmark sources
143
Use cases scored
Daily
Updates
Browse Use Cases
Find ranked models for any workflow
Explore →
Model Rankings
Cross-task leaderboard by utility score
View Rankings →
Top Models
Full Rankings →

| # | Model | Score |
|---|---|---|
| 1 | gemini-3-pro-preview | 25.8% |
| 2 | gemini-2.5-pro | 24.7% |
| 3 | gpt-4.1-20250414 | 22.5% |
| 4 | anthropic/claude-sonnet-4.6 | 21.1% |
| 5 | Grok-4-0709 | 21.1% |
Popular Use Cases
All Use Cases →
finance
Earnings call synthesis
Summarize earnings calls into key points, tone, and risks.
devops_sre
Log triage
Interpret logs and propose safe diagnostic steps.
business_productivity
Knowledge base Q&A (with citations)
Answer questions grounded in an internal KB, with evidence.
business_productivity
Document summarization
Summarize long business documents into scannable outputs.
legal
Contract term extraction
Extract key terms into structured fields with clause references.
customer_experience
Support bot (RAG grounded)
Support chatbot grounded in docs with optional citations and escalation.
Quick Lookups
50 indexed
Best LLM for Code Generation
Benchmark-backed ranking of models for generating correct, secure code from requirements.
Best LLM for Debugging
Find the top-ranked models for localizing bugs and proposing fixes with explanations.
Best LLM for Unit Test Generation
Ranked models for generating meaningful unit tests and edge cases from code.
Best LLM for Code Review
Compare models for automated PR review covering correctness, security, and maintainability.
Best LLM for Refactoring
Ranked models for safely refactoring code while preserving behavior and improving clarity.
Best LLM for IDE Code Completion
Compare models for fast, accurate local-context code completion and snippet generation.