BasedAGI

Model Profile

claude-opus-4-6

External Benchmark Shadow (external_benchmark_shadow), public
4,096 ctx

Use this page to decide where this model is a strong fit. The rankings below are grouped by use case and backed by benchmark results, with explicit confidence and contributor metrics.

Identity

ID: external/anthropic/claude-opus-4-6

Author: anthropic

Origin: external_benchmark_shadow

Arch: unknown

Benchmark Coverage

Scored use cases: 12

Avg confidence: 24.5%

Evidence points: 199

Raw rows: 196

Weighted rows: 36

Catalog Metadata

Parameters: unknown

Context window: 4096

Downloads: 0

Price / 1M tokens: $10.00 (blended 3:1)
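
The page does not say how the blended figure is derived or what the underlying input and output prices are. A common convention for a "3:1" blend is a weighted average of three parts input price to one part output price. The sketch below assumes that convention. The function name `blended_price` and the per-million-token prices passed to it are hypothetical values, chosen only to illustrate the arithmetic.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted average of input and output prices per 1M tokens.

    Assumes "blended 3:1" means 3 parts input to 1 part output;
    the page itself does not define the ratio.
    """
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


# Hypothetical prices per 1M tokens, for illustration only.
example = blended_price(input_price=5.00, output_price=25.00)
print(f"${example:.2f}")  # (3 * 5.00 + 1 * 25.00) / 4 -> $10.00
```

Under this reading, a workload that produces far more output than input would cost more per token than the blended figure suggests, so it may be worth recomputing with your own ratio.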

Intelligence Profile

IQ: 86% *

EQ: no score

Accuracy: 95% *

Creativity: 96% *

Based: 59% *

Dimension Breakdown

IQ (5 benchmarks): 85.5% *

EQ (0 benchmarks): insufficient data, no EQ benchmarks found

Accuracy (1 benchmark): 95.4% *

Creativity (2 benchmarks): 95.5% *

Based (1 benchmark): 59.0% *

* Low confidence — limited benchmark evidence for this dimension

4/5 dimensions scored · Last updated Apr 14, 2026

Benchmark Signals

Some fit rows have limited benchmark evidence.

6 of 12 scored use cases have low confidence or thin contributor coverage.

Coverage Diagnostics

Status: actively scored

Use-Case Scores: 120

Total Measurements: 196

Weighted Measurements: 36

Weighted Sources: 16

Raw Source Coverage

ugi_main: 57
vectara_hhem_leaderboard: 21
halluhard_leaderboard: 17
openhands_index: 13
swe_bench_additional_public: 12
swe_bench_leaderboard: 12

Weighted Source Coverage

vectara_hhem_leaderboard: 12
openhands_index: 5
halluhard_leaderboard: 3
ugi_main: 3
openhands_issue_resolution: 2
agentset_llms: 1

Best Use Cases for This Model

Use Case (ID): Score

Autonomous Coding Agent (use_case.dev.autonomous_coding_agent): 28.4%
IDE code completion (use_case.dev.ide_completion): 27.6%
CAD scripting helper (use_case.eng.cad_scripting_helper): 27.5%
Code generation (use_case.dev.code_generation): 26.6%
Agentic bug fixing (use_case.dev.agentic_bug_fixing): 24.5%
PR review agent (use_case.dev.pr_review_agent): 23.9%
Function Calling / Tool Use Agent (use_case.dev.function_calling_agent): 21.8%
Quant research code generation (use_case.fin.alpha_research_codegen): 17.7%
Poetry and lyrics (use_case.creative.poetry_lyrics): 15.4%
Screenplay scene writing (use_case.creative.screenplay_scene): 15.4%
Agentic incident response (use_case.sre.agentic_incident_response): 15.3%
Prompt injection resistance (eval) (use_case.security.prompt_injection_resistance_eval): 14.9%