engineering
Simulation setup assistant
Turn design requirements into simulation setup checklists and boundary notes.
#1 Recommendation
gemini-3-pro-preview
Strong on Vals SWE-bench overall_accuracy_pct (87.5%) and Vals LiveCodeBench overall_accuracy_pct (97.1%)
external/google/gemini-3-pro-preview
Score: 28.2%
Confidence: 34.4%
Ranked Models: 30
Evidence Quality: 85%
Scoring: Benchmark-backed
Top Signal: Vals SWE-bench: overall_accuracy_pct
Compare Models
Model A leads by +0.7%
Model A
gemini-3-pro-preview
external/google/gemini-3-pro-preview
Rank #1
Vals SWE-bench: overall_accuracy_pct
Value 87.5% · Conf 100.0% · Weight 2.6%
vals_swebench.overall_accuracy_pct (Mar 12, 2026)
Vals LiveCodeBench: overall_accuracy_pct
Value 97.1% · Conf 100.0% · Weight 2.5%
vals_lcb.overall_accuracy_pct (Mar 12, 2026)
Vals Terminal-Bench 2: overall_accuracy_pct
Value 81.0% · Conf 100.0% · Weight 2.1%
vals_terminal_bench_2.overall_accuracy_pct (Mar 12, 2026)
FACTS Benchmark Suite: average_score_pct
Value 100.0% · Conf 100.0% · Weight 0.5%
facts_benchmark_suite.average_score_pct (Mar 12, 2026)
Model B
google/gemini-3.1-pro-preview
external/google/gemini-3-1-pro-preview
Rank #2
Vals LiveCodeBench: overall_accuracy_pct
Value 100.0% · Conf 100.0% · Weight 2.6%
vals_lcb.overall_accuracy_pct (Mar 12, 2026)
Vals Terminal-Bench 2: overall_accuracy_pct
Value 100.0% · Conf 100.0% · Weight 2.6%
vals_terminal_bench_2.overall_accuracy_pct (Mar 12, 2026)
Vals SWE-bench: overall_accuracy_pct
Value 85.2% · Conf 100.0% · Weight 2.5%
vals_swebench.overall_accuracy_pct (Mar 12, 2026)
Vals Mortgage Tax: overall_accuracy_pct
Value 100.0% · Conf 100.0% · Weight 0.5%
vals_mortgage_tax.overall_accuracy_pct (Mar 12, 2026)
Ranking Diagnostics & Missing Models
Source Lift
Ranked: 50
Sources: 8
Quality: Sufficient
Vals CorpFin v2 (vals_corp_fin_v2) · 42 rows · 0.4% avg lift
Vals LiveCodeBench (vals_lcb) · 41 rows · 1.9% avg lift
Vals Legal Bench (vals_legal_bench) · 41 rows · 0.5% avg lift
Vals Tax Eval v2 (vals_tax_eval_v2) · 41 rows · 0.4% avg lift
Missing Strong Models
No obvious gaps right now.
Related Use Cases
engineering
Component selection assistant
Recommend components under constraints with evidence and tradeoffs.
Top: gemini-3-pro-preview
engineering
Verilog/VHDL generation
Generate RTL code and testbenches from functional specs.
Top: z-ai/glm-4.7
engineering
CAD scripting helper
Generate and debug CAD automation scripts and parametric geometry code.
Top: anthropic/claude-sonnet-4.6