BasedAGI

legal

Contract Drafting & Redlining

Drafting, reviewing, and suggesting edits to legal contracts and agreements.

#1 Recommendation

gemini-3-pro-preview

Strong on Vals Legal Bench overall_accuracy_pct (99%) and LEXam Leaderboard average_score_pct (76%)

external/google/gemini-3-pro-preview

34.6%

Score

47.4%

Confidence

Limited benchmark evidence for this use case.

54 ranked models with average evidence of 15.9 points. Rankings may shift as more benchmark data is ingested.

Ranked Models

30

Evidence Quality

84%

Scoring

Benchmark-backed

Top Signal

Vals Legal Bench: overall_accuracy_pct

All Ranked Models

30 of 30
Rank | Model | Score
#1 | gemini-3-pro-preview | 34.6%
    Strong on Vals Legal Bench overall_accuracy_pct (99%) and LEXam Leaderboard average_score_pct (76%)
#2 | gemini-2.5-pro | 34.5%
    Strong on LEXam Leaderboard average_score_pct (89%) and FACTS Benchmark Suite facts_grounding_score_pct (100%)
#3 | gpt-4.1-20250414 | 26.6%
    Strong on Vals Case Law v2 overall_accuracy_pct (86%) and Vals Legal Bench overall_accuracy_pct (91%)
#4 | gpt-5-mini-2025-08-07 | 25.9%
#5 | gpt-5-2025-08-07 | 24.4%
#6 | anthropic/claude-sonnet-4.6 | 24.3%
#7 | Grok-4-0709 | 24.1%
#8 | google/gemini-3.1-pro-preview | 23.2%
#9 | openai/gpt-5.4-2026-03-05 | 22.8%
#10 | gpt-5.1-2025-11-13 | 22.2%
#11 | gemini-2.5-flash | 21.5%
#12 | claude-opus-4-5-20251101 | 20.3%
#13 | claude-sonnet-4-20250514 | 20.3%
#14 | xai-org/grok-4-fast-reasoning | 20.0%
#15 | deepseek/deepseek-r1 | 19.5%
#16 | gemini-3-flash-preview | 19.2%
#17 | xai-org/grok-4-1-fast-reasoning | 18.7%
#18 | google/gemini-3.1-flash-lite-preview | 18.5%
#19 | gpt-5.2-2025-12-11 | 17.8%
#20 | anthropic/claude-opus-4-6-thinking | 17.2%
#21 | anthropic/claude-opus-4-5-20251101-thinking | 16.5%
#22 | mistralai/mistral-large-2512 | 16.4%
#24 | openai/gpt-4.1 | 16.1%
#25 | anthropic/claude-sonnet-4-5-20250929-thinking | 15.8%
#27 | Command A (03-2025) | 15.0%
#28 | x-ai/grok-3 | 14.8%
#29 | Kimi K2 Thinking | 14.8%
#30 | anthropic/claude-opus-4-1-20250805 | 14.7%
#31 | zai/glm-5-thinking | 14.0%
#32 | alibaba/qwen3.5-flash | 13.5%

Compare Models

Model A leads by +0.1%

Model A

gemini-3-pro-preview

external/google/gemini-3-pro-preview

34.6%

Rank #1

Confidence 47.4% · 26 evidence pts

Vals Legal Bench: overall_accuracy_pct

Value 99.2% · Conf 100.0% · Weight 3.9%

vals_legal_bench.overall_accuracy_pct (Mar 17, 2026)

LEXam Leaderboard: average_score_pct

Value 75.9% · Conf 100.0% · Weight 3.7%

lexam_leaderboard.average_score_pct (Mar 17, 2026)

FACTS Benchmark Suite: facts_grounding_score_pct

Value 88.3% · Conf 100.0% · Weight 3.3%

facts_benchmark_suite.facts_grounding_score_pct (Mar 17, 2026)

FACTS Benchmark Suite: facts_search_score_pct

Value 100.0% · Conf 100.0% · Weight 1.4%

facts_benchmark_suite.facts_search_score_pct (Mar 17, 2026)

Model B

gemini-2.5-pro

external/google/gemini-2-5-pro

34.5%

Rank #2

Confidence 51.5% · 33 evidence pts

LEXam Leaderboard: average_score_pct

Value 89.4% · Conf 100.0% · Weight 4.4%

lexam_leaderboard.average_score_pct (Mar 17, 2026)

FACTS Benchmark Suite: facts_grounding_score_pct

Value 100.0% · Conf 100.0% · Weight 3.8%

facts_benchmark_suite.facts_grounding_score_pct (Mar 17, 2026)

Vals Case Law v2: overall_accuracy_pct

Value 63.2% · Conf 100.0% · Weight 2.8%

vals_case_law_v2.overall_accuracy_pct (Mar 17, 2026)

LEXam Leaderboard: open_question_judge_score_pct

Value 94.9% · Conf 100.0% · Weight 1.7%

lexam_leaderboard.open_question_judge_score_pct (Mar 17, 2026)

Ranking Diagnostics & Missing Models

Source Lift

Ranked

54

Sources

8

Quality

Insufficient

Vals Legal Bench

vals_legal_bench

42 rows

3.5% avg lift

Vals CorpFin v2

vals_corp_fin_v2

42 rows

0.9% avg lift

Vals MedQA

vals_medqa

33 rows

0.2% avg lift

Vals Case Law v2

vals_case_law_v2

30 rows

2.4% avg lift

Missing Strong Models

gpt-4o-2024-05-13

external/openai/gpt-4o-2024-05-13

Rank #47

10.6%

Thin evidence after weighting

Taxonomy Details

Core Tasks

task.summarize_legal_contract
task.compare_docs_diff

Required Modes

mode.long_context
mode.citations

Domains

domain.legal_contracts
