BasedAGI

Model Profile

openai/gpt-4o-mini-2024-07-18

External Benchmark Shadow | external_benchmark_shadow | public
4,096 ctx

Use this page to decide where this model is a strong fit. The rankings below are grouped by use case and backed by benchmark results, each with an explicit confidence level and contributor metrics.

Identity

ID: external/openai/gpt-4o-mini-2024-07-18

Author: openai

Origin: external_benchmark_shadow

Arch: unknown

Benchmark Coverage

Scored use cases: 12

Avg confidence: 32.8%

Evidence points: 152

Raw rows: 320

Weighted rows: 20

Catalog Metadata

Parameters: unknown

Context window: 4096

Downloads: 0

Some fit rows have limited benchmark evidence.

4 of 12 scored use cases have low confidence or thin contributor coverage.

Coverage Diagnostics

actively scored

Use-Case Scores: 133

Total Measurements: 320

Weighted Measurements: 20

Weighted Sources: 12

Raw Source Coverage

vals_mmlu_pro: 60
vals_mgsm: 48
corpfin_taxeval_public: 28
vals_medqa: 28
vals_legal_bench: 18
vals_corp_fin_v2: 16

Weighted Source Coverage

llm_trustworthy_leaderboard: 5
vals_corp_fin_v2: 3
duckdb_nsql_leaderboard: 2
gaia_results_public: 2
icelandic_llm_leaderboard: 1
lmarena_arena_hard_v01: 1
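
The per-source counts listed above do not add up to the reported totals (320 raw rows, 20 weighted rows). This suggests, though the page does not say so, that only the largest contributors are shown. A quick tally, with the counts copied from the page and the helper name `listed_share` chosen here purely for illustration:

```python
# Per-source counts copied from the two coverage lists on this page.
raw_sources = {
    "vals_mmlu_pro": 60,
    "vals_mgsm": 48,
    "corpfin_taxeval_public": 28,
    "vals_medqa": 28,
    "vals_legal_bench": 18,
    "vals_corp_fin_v2": 16,
}
weighted_sources = {
    "llm_trustworthy_leaderboard": 5,
    "vals_corp_fin_v2": 3,
    "duckdb_nsql_leaderboard": 2,
    "gaia_results_public": 2,
    "icelandic_llm_leaderboard": 1,
    "lmarena_arena_hard_v01": 1,
}

# Totals reported elsewhere on the page.
RAW_TOTAL = 320
WEIGHTED_TOTAL = 20


def listed_share(counts: dict, total: int) -> float:
    """Fraction of the reported total accounted for by the listed sources."""
    return sum(counts.values()) / total


raw_listed = sum(raw_sources.values())
weighted_listed = sum(weighted_sources.values())
print(f"raw: {raw_listed} of {RAW_TOTAL} rows listed")
print(f"weighted: {weighted_listed} of {WEIGHTED_TOTAL} rows listed")
```

The remainder presumably comes from sources not shown in the lists; the page reports 12 weighted sources but names only six.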

Best Use Cases for This Model

| Use Case | ID | Score |
| --- | --- | --- |
| Refusal profile (eval) | use_case.security.refusal_profile_eval | 22.5% |
| Scam and social engineering resistance (eval) | use_case.security.scam_social_engineering_resistance_eval | 22.5% |
| Overrefusal (eval) | use_case.security.overrefusal_eval | 22.5% |
| Crisis escalation protocol (eval) | use_case.safety.crisis_escalation_protocol | 22.5% |
| Jailbreak resistance (eval) | use_case.security.jailbreak_resistance_eval | 22.5% |
| Metric definition workshop | use_case.data.metric_definition_workshop | 17.0% |
| Data quality assistant | use_case.data.data_quality_assistant | 15.9% |
| SQL debugging | use_case.data.sql_debugging | 14.8% |
| Executive brief from metrics | use_case.data.exec_brief_from_metrics | 14.3% |
| Disinformation and manipulation resistance (eval) | use_case.security.disinformation_resistance_eval | 13.7% |
| Insight mining from text corpora | use_case.data.insight_mining | 13.2% |
| Vulnerability-oriented code review | use_case.cyber.vulnerability_review | 13.1% |

Deployment Fit Calculator

Model: openai/gpt-4o-mini-2024-07-18

ID: external/openai/gpt-4o-mini-2024-07-18

Quantization range: 2-bit to 8-bit

Insufficient

Unknown parameter count. Cannot estimate deployment fit.

Required VRAM: ~0.0 GB

Est. Throughput: 0.00 tok/s

Deployment Fit Matrix

| GPU | 4-bit | 6-bit | 8-bit |
| --- | --- | --- | --- |
| RTX 3060 12GB | Insufficient | Insufficient | Insufficient |
| RTX 3090 24GB | Insufficient | Insufficient | Insufficient |
| RTX 4090 24GB | Insufficient | Insufficient | Insufficient |
| Mac Studio M2 Ultra 192GB | Insufficient | Insufficient | Insufficient |
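
The page does not state how Required VRAM is computed. A common rule of thumb is weight size (parameters × bits ÷ 8) plus some runtime overhead. The sketch below follows that assumption and shows why an unknown parameter count yields ~0.0 GB and "Insufficient" in every cell. The 1.2 overhead factor, the function names, and the "Fits" / "Too large" labels are all hypothetical and not taken from the site.

```python
from typing import Optional

# GPU memory sizes (GB) as named in the matrix above.
GPUS = {
    "RTX 3060 12GB": 12,
    "RTX 3090 24GB": 24,
    "RTX 4090 24GB": 24,
    "Mac Studio M2 Ultra 192GB": 192,
}

# Assumed multiplier for KV cache, activations, and runtime buffers.
OVERHEAD = 1.2


def required_vram_gb(params_billions: Optional[float], bits: int) -> float:
    """Rough VRAM estimate in GB; 0.0 when the parameter count is unknown."""
    if params_billions is None:
        return 0.0
    # 1e9 params * (bits / 8) bytes each is params_billions * bits / 8 GB.
    return params_billions * bits / 8 * OVERHEAD


def fit_label(params_billions: Optional[float], bits: int, gpu_gb: float) -> str:
    """Label one matrix cell; no estimate is possible without a parameter count."""
    if params_billions is None:
        return "Insufficient"
    return "Fits" if required_vram_gb(params_billions, bits) <= gpu_gb else "Too large"


# This model's catalog entry lists parameters as unknown.
for gpu, vram in GPUS.items():
    row = [fit_label(None, bits, vram) for bits in (4, 6, 8)]
    print(gpu, row)
```

With a known size the same functions give a usable estimate; for example, a hypothetical 7B-parameter model at 4-bit would need roughly 4.2 GB under these assumptions.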