BasedAGI
Rankings live

finance

Earnings call synthesis

Summarize earnings calls into key points, tone, and risks.

#1 Recommendation

gemini-3-pro-preview

Strong on Vals Finance Agent overall_accuracy_pct (87%) and Vals CorpFin v2 overall_accuracy_pct (87%)

external/google/gemini-3-pro-preview

Score: 42.4%

Confidence: 54.5%

Ranked Models: 30

Evidence Quality: 90%

Scoring: Benchmark-backed

Top Signal: Vals Finance Agent: overall_accuracy_pct

All Ranked Models

30 of 30
Rank · Model · Score
#1 · gemini-3-pro-preview · 42.4%
    Strong on Vals Finance Agent overall_accuracy_pct (87%) and Vals CorpFin v2 overall_accuracy_pct (87%)
#2 · gemini-2.5-pro · 38.0%
    Strong on FACTS Benchmark Suite facts_grounding_score_pct (100%) and Vals CorpFin v2 overall_accuracy_pct (78%)
#3 · anthropic/claude-sonnet-4.6 · 36.9%
    Strong on Vals Finance Agent overall_accuracy_pct (100%) and Vals CorpFin v2 overall_accuracy_pct (91%)
#4 · Grok-4-0709 · 36.5%
#5 · gpt-5-mini-2025-08-07 · 35.4%
#6 · gpt-5-2025-08-07 · 34.0%
#7 · openai/gpt-5.4-2026-03-05 · 33.9%
#8 · google/gemini-3.1-pro-preview · 33.8%
#9 · gpt-4.1-20250414 · 32.7%
#10 · gpt-5.1-2025-11-13 · 30.1%
#11 · gpt-5.2-2025-12-11 · 29.7%
#12 · anthropic/claude-opus-4-6-thinking · 29.1%
#13 · xai-org/grok-4-fast-reasoning · 29.0%
#14 · xai-org/grok-4-1-fast-reasoning · 28.5%
#15 · gemini-3-flash-preview · 28.2%
#16 · google/gemini-3.1-flash-lite-preview · 28.2%
#17 · claude-sonnet-4-20250514 · 27.8%
#18 · anthropic/claude-opus-4-5-20251101-thinking · 27.6%
#19 · kimi/kimi-k2.5-thinking · 27.2%
#20 · claude-opus-4-5-20251101 · 26.2%
#21 · anthropic/claude-sonnet-4-5-20250929-thinking · 25.8%
#23 · alibaba/qwen3.5-flash · 23.7%
#24 · zai/glm-5-thinking · 23.7%
#25 · anthropic/claude-haiku-4-5-20251001-thinking · 22.9%
#26 · mistralai/mistral-large-2512 · 20.3%
#27 · xai-org/grok-4-1-fast-non-reasoning · 20.2%
#28 · z-ai/glm-4.7 · 19.6%
#29 · qwen/qwen3-max · 19.4%
#30 · Kimi K2 Thinking · 19.1%
#31 · gpt-4.1-mini-20250414 · 19.0%

Compare Models

Model A leads by +4.3%

Model A

gemini-3-pro-preview

external/google/gemini-3-pro-preview

42.4%

Rank #1

Confidence 54.5% · 29 evidence pts

Vals Finance Agent: overall_accuracy_pct

Value 87.0% · Conf 100.0% · Weight 3.3%

vals_finance_agent.overall_accuracy_pct (Mar 12, 2026)

Vals CorpFin v2: overall_accuracy_pct

Value 86.7% · Conf 100.0% · Weight 3.1%

vals_corp_fin_v2.overall_accuracy_pct (Mar 12, 2026)

FACTS Benchmark Suite: facts_grounding_score_pct

Value 88.3% · Conf 100.0% · Weight 2.6%

facts_benchmark_suite.facts_grounding_score_pct (Mar 12, 2026)

FACTS Benchmark Suite: facts_search_score_pct

Value 100.0% · Conf 100.0% · Weight 2.3%

facts_benchmark_suite.facts_search_score_pct (Mar 12, 2026)

Model B

gemini-2.5-pro

external/google/gemini-2-5-pro

38.0%

Rank #2

Confidence 55.4% · 32 evidence pts

FACTS Benchmark Suite: facts_grounding_score_pct

Value 100.0% · Conf 100.0% · Weight 3.0%

facts_benchmark_suite.facts_grounding_score_pct (Mar 12, 2026)

Vals CorpFin v2: overall_accuracy_pct

Value 78.4% · Conf 100.0% · Weight 2.8%

vals_corp_fin_v2.overall_accuracy_pct (Mar 12, 2026)

Vals Finance Agent: overall_accuracy_pct

Value 65.5% · Conf 100.0% · Weight 2.5%

vals_finance_agent.overall_accuracy_pct (Mar 12, 2026)

FACTS Benchmark Suite: average_score_pct

Value 78.3% · Conf 100.0% · Weight 1.7%

facts_benchmark_suite.average_score_pct (Mar 12, 2026)
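
Each evidence row in the comparison lists a value, a confidence, and a weight, but the page does not say how these combine into a model's score. Purely as an illustration, the sketch below assumes each row contributes value × confidence × weight and sums the four rows shown for each model. The formula and the name `listed_contribution` are assumptions, not the site's documented method, and because only four of the 29 and 32 evidence points are listed, the result is not expected to reproduce the 42.4% and 38.0% scores.

```python
# Evidence rows copied from the comparison: (value, confidence, weight) as fractions.
MODEL_A = [  # gemini-3-pro-preview
    (0.870, 1.0, 0.033),  # Vals Finance Agent: overall_accuracy_pct
    (0.867, 1.0, 0.031),  # Vals CorpFin v2: overall_accuracy_pct
    (0.883, 1.0, 0.026),  # FACTS Benchmark Suite: facts_grounding_score_pct
    (1.000, 1.0, 0.023),  # FACTS Benchmark Suite: facts_search_score_pct
]
MODEL_B = [  # gemini-2.5-pro
    (1.000, 1.0, 0.030),  # FACTS Benchmark Suite: facts_grounding_score_pct
    (0.784, 1.0, 0.028),  # Vals CorpFin v2: overall_accuracy_pct
    (0.655, 1.0, 0.025),  # Vals Finance Agent: overall_accuracy_pct
    (0.783, 1.0, 0.017),  # FACTS Benchmark Suite: average_score_pct
]


def listed_contribution(rows):
    """Sum value * confidence * weight over the rows (assumed formula, see lead-in)."""
    return sum(value * conf * weight for value, conf, weight in rows)


if __name__ == "__main__":
    for name, rows in (("Model A", MODEL_A), ("Model B", MODEL_B)):
        print(f"{name}: {listed_contribution(rows) * 100:.1f}% from listed rows")
```

Under this assumed formula, Model A's listed rows contribute more than Model B's, which is at least consistent with the direction of the reported lead.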

Ranking Diagnostics & Missing Models

Source Lift

Ranked

49

Sources

8

Quality

Sufficient

Vals CorpFin v2

vals_corp_fin_v2

42 rows

1.7% avg lift

Vals Tax Eval v2

vals_tax_eval_v2

42 rows

1.7% avg lift

Vals GPQA

vals_gpqa

36 rows

0.7% avg lift

Vals Mortgage Tax

vals_mortgage_tax

30 rows

1.3% avg lift

Missing Strong Models

gpt-4o

external/openai/gpt-4o

Rank #22

15.2%

Thin evidence after weighting

gpt-4o-2024-05-13

external/openai/gpt-4o-2024-05-13

Rank #51

10.5%

Thin evidence after weighting
Taxonomy Details

Core Tasks

task.summarize_meeting_transcript · task.sentiment_classification

Required Modes

mode.long_context · mode.json_schema

Domains

domain.finance_equity_research
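
The use case asks for key points, tone, and risks, and the taxonomy lists mode.json_schema as a required mode. The page does not give a schema, so the sketch below shows one shape such structured output might take. All field names, the tone labels, and the helper `check_summary` are hypothetical, and the check is a minimal structural test, not a full JSON Schema validator.

```python
import json

# Hypothetical schema: field names and tone labels are illustrative, not from the page.
EARNINGS_SUMMARY_SCHEMA = {
    "type": "object",
    "properties": {
        "key_points": {"type": "array", "items": {"type": "string"}},
        "tone": {"type": "string", "enum": ["positive", "neutral", "negative"]},
        "risks": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["key_points", "tone", "risks"],
    "additionalProperties": False,
}


def check_summary(raw: str) -> dict:
    """Parse a model response and check it against the hypothetical schema above."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("summary must be a JSON object")
    props = EARNINGS_SUMMARY_SCHEMA["properties"]
    missing = [k for k in EARNINGS_SUMMARY_SCHEMA["required"] if k not in data]
    extra = [k for k in data if k not in props]
    if missing or extra:
        raise ValueError(f"missing fields: {missing}, unexpected fields: {extra}")
    if data["tone"] not in props["tone"]["enum"]:
        raise ValueError(f"unknown tone: {data['tone']!r}")
    for field in ("key_points", "risks"):
        items = data[field]
        if not isinstance(items, list) or not all(isinstance(x, str) for x in items):
            raise ValueError(f"{field} must be a list of strings")
    return data


if __name__ == "__main__":
    example = json.dumps({
        "key_points": ["Revenue grew year over year"],
        "tone": "neutral",
        "risks": ["Guidance depends on currency movements"],
    })
    print(check_summary(example))
```

Keeping tone as a closed set of labels mirrors the task.sentiment_classification tag, while the two string lists cover the summarization side of the task.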

Related Use Cases