BasedAGI

Model Profile

gpt-4o-2024-05-13

External Benchmark Shadow (external_benchmark_shadow), public
4,096 ctx

Use this page to decide where this model is a strong fit. Rankings below are benchmark-backed by use case, with explicit confidence and contributor metrics.

Identity

ID: external/openai/gpt-4o-2024-05-13

Author: openai

Origin: external_benchmark_shadow

Arch: unknown

Benchmark Coverage

Scored use cases: 12

Avg confidence: 31.1%

Evidence points: 118

Raw rows: 185

Weighted rows: 17

Catalog Metadata

Parameters: unknown

Context window: 4096

Downloads: 0

Some fit rows have limited benchmark evidence.

4 of 12 scored use cases have low confidence or thin contributor coverage.

Coverage Diagnostics

actively scored

Use-Case Scores: 109

Total Measurements: 185

Weighted Measurements: 17

Weighted Sources: 8

Raw Source Coverage

repoqa_leaderboard: 74
ugi_main: 57
llm_aggrefact_leaderboard: 12
vals_gpqa: 12
llm_trustworthy_leaderboard: 8
icelandic_llm_leaderboard: 7

Weighted Source Coverage

llm_trustworthy_leaderboard: 5
ugi_main: 3
aider_code_editing: 2
llm_aggrefact_leaderboard: 2
repoqa_leaderboard: 2
icelandic_llm_leaderboard: 1
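
As a rough reading of the coverage figures above, the sketch below computes what share of the raw rows carry weight and how much of the raw total the listed sources account for. The numbers are copied from this page. The variable and function names are illustrative and are not part of any BasedAGI API.

```python
# Figures copied from this page; all names here are illustrative only.
RAW_ROWS = 185
WEIGHTED_ROWS = 17

# Raw source coverage as listed on the page (source name -> row count).
raw_source_rows = {
    "repoqa_leaderboard": 74,
    "ugi_main": 57,
    "llm_aggrefact_leaderboard": 12,
    "vals_gpqa": 12,
    "llm_trustworthy_leaderboard": 8,
    "icelandic_llm_leaderboard": 7,
}


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


weighted_share = share(WEIGHTED_ROWS, RAW_ROWS)
listed_raw = sum(raw_source_rows.values())
listed_share = share(listed_raw, RAW_ROWS)

print(f"weighted rows: {weighted_share}% of raw rows")
print(f"listed sources: {listed_raw} of {RAW_ROWS} raw rows ({listed_share}%)")
```

The listed raw sources sum to 170 of the 185 raw rows, so the remaining rows presumably come from sources the page does not display. Only about 9% of raw rows are weighted, which is consistent with the low average confidence reported above.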

Best Use Cases for This Model

Use Case | ID | Score
Debugging assistant | use_case.dev.debugging | 21.6%
Overrefusal (eval) | use_case.security.overrefusal_eval | 21.5%
Jailbreak resistance (eval) | use_case.security.jailbreak_resistance_eval | 21.5%
Refusal profile (eval) | use_case.security.refusal_profile_eval | 21.5%
Scam and social engineering resistance (eval) | use_case.security.scam_social_engineering_resistance_eval | 21.5%
Crisis escalation protocol (eval) | use_case.safety.crisis_escalation_protocol | 21.5%
Unit test generation | use_case.dev.test_generation | 19.2%
Refactoring assistant | use_case.dev.refactoring | 17.1%
Vulnerability-oriented code review | use_case.cyber.vulnerability_review | 15.8%
Malware analysis report (defensive) | use_case.cyber.malware_analysis_report | 15.8%
Disinformation and manipulation resistance (eval) | use_case.security.disinformation_resistance_eval | 14.4%
Codebase onboarding brief | use_case.dev.codebase_onboarding | 13.6%

Deployment Fit Calculator

Model

gpt-4o-2024-05-13

external/openai/gpt-4o-2024-05-13

2-bit to 8-bit

Insufficient

Unknown parameter count. Cannot estimate deployment fit.

Required VRAM: ~0.0 GB

Est. Throughput: 0.00 tok/s

Deployment Fit Matrix

GPU | 4-bit | 6-bit | 8-bit
RTX 3060 12GB | Insufficient | Insufficient | Insufficient
RTX 3090 24GB | Insufficient | Insufficient | Insufficient
RTX 4090 24GB | Insufficient | Insufficient | Insufficient
Mac Studio M2 Ultra 192GB | Insufficient | Insufficient | Insufficient
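
Every cell reads "Insufficient" because the parameter count is unknown, so no estimate can be made. The sketch below shows one common way such an estimate could be computed if a parameter count were available. It is not the site's method. The formula (weights take roughly parameters × bits / 8 bytes), the 20% overhead factor, and the "Fits" / "Does not fit" labels are all assumptions made for illustration. Only the GPU names and the "Insufficient" label come from this page.

```python
from typing import Optional

# GPU names and memory sizes as listed in the matrix above.
GPU_VRAM_GB = {
    "RTX 3060 12GB": 12,
    "RTX 3090 24GB": 24,
    "RTX 4090 24GB": 24,
    "Mac Studio M2 Ultra 192GB": 192,
}


def required_vram_gb(
    params_billion: Optional[float], bits: int, overhead: float = 1.2
) -> Optional[float]:
    """Estimate memory in GB for the weights, or None if the size is unknown.

    Assumption: 1e9 parameters at `bits` bits each take bits/8 GB (decimal),
    scaled by a hypothetical overhead factor for activations and cache.
    """
    if params_billion is None:
        return None
    return params_billion * bits / 8 * overhead


def fit_label(params_billion: Optional[float], bits: int, vram_gb: float) -> str:
    """Return a fit label; 'Insufficient' mirrors the page when data is missing."""
    need = required_vram_gb(params_billion, bits)
    if need is None:
        return "Insufficient"
    return "Fits" if need <= vram_gb else "Does not fit"


# With an unknown parameter count, every cell collapses to "Insufficient".
for gpu, vram in GPU_VRAM_GB.items():
    print(gpu, [fit_label(None, bits, vram) for bits in (4, 6, 8)])
```

Returning `None` for an unknown size, instead of 0, keeps the missing input from being read as a real estimate such as "~0.0GB".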