Model Profile
Llama-3.1-8B-Instruct
Use this page to decide where this model is a strong fit. The rankings below are grouped by use case and backed by benchmark results, and each one reports an explicit confidence value and contributor metrics.
Identity
ID: meta-llama/Llama-3.1-8B-Instruct
Author: meta-llama
Origin: huggingface_catalog
Arch: unknown
Benchmark Coverage
Scored use cases: 2
Avg confidence: 11.0%
Evidence points: 9
Raw rows: 43
Weighted rows: 8
Catalog Metadata
Parameters: unknown
Context window: 4096
Downloads: 5,867,664
Intelligence Profile
Dimension Breakdown
No eq benchmarks found
No creativity benchmarks found
No based benchmarks found
* Low confidence — limited benchmark evidence for this dimension
2/5 dimensions scored · Last updated Apr 2, 2026
Benchmark Signals
| Benchmark | Metric | Normalized value | Confidence | Strongest impact in | Source key | Date |
|---|---|---|---|---|---|---|
| JSONSchemaBench Leaderboard | medium_schema_compliance_pct | 75.9% | 100.0% | Metric definition workshop | jsonschemabench_leaderboard.medium_schema_compliance_pct | Mar 31, 2026 |
| JSONSchemaBench Leaderboard | hard_schema_compliance_pct | 44.7% | 100.0% | Metric definition workshop | jsonschemabench_leaderboard.hard_schema_compliance_pct | Mar 31, 2026 |
| BRIDGE Medical Leaderboard | average_performance_pct | 62.5% | 100.0% | Metric definition workshop | bridge_medical_leaderboard.average_performance_pct | Apr 1, 2026 |
| Aider Code Editing Leaderboard | percent_correct_pct | 27.1% | 100.0% | Metric definition workshop | aider_code_editing.percent_correct_pct | Apr 1, 2026 |
| Multilingual MMLU Benchmark | mmmlu | 0.1% | 100.0% | Historical document summarization | multilingual_mmlu_leaderboard.mmmlu | Apr 1, 2026 |
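
The signals listed above can be held as simple records for inspection. The following is a minimal sketch, not the catalog's own code: the `Signal` class and the helpers `group_by_use_case` and `distinct_sources` are hypothetical names introduced for illustration, and splitting the source key on its first dot is an assumption about the key format. It only groups and counts the rows shown on this page; it does not reproduce how the page computes use-case scores.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    key: str           # "<source>.<metric>" exactly as printed on the page
    normalized: float  # normalized value, in percent
    use_case: str      # use case where the signal has its strongest impact

    @property
    def source(self) -> str:
        # Assumption: the source is everything before the first dot.
        return self.key.split(".", 1)[0]


# The five benchmark signals reported on this page.
SIGNALS = [
    Signal("jsonschemabench_leaderboard.medium_schema_compliance_pct", 75.9,
           "Metric definition workshop"),
    Signal("jsonschemabench_leaderboard.hard_schema_compliance_pct", 44.7,
           "Metric definition workshop"),
    Signal("bridge_medical_leaderboard.average_performance_pct", 62.5,
           "Metric definition workshop"),
    Signal("aider_code_editing.percent_correct_pct", 27.1,
           "Metric definition workshop"),
    Signal("multilingual_mmlu_leaderboard.mmmlu", 0.1,
           "Historical document summarization"),
]


def group_by_use_case(signals):
    """Map each use case to the signals whose strongest impact is there."""
    groups = defaultdict(list)
    for signal in signals:
        groups[signal.use_case].append(signal)
    return dict(groups)


def distinct_sources(signals):
    """Return the sorted, de-duplicated source names behind the signals."""
    return sorted({signal.source for signal in signals})


if __name__ == "__main__":
    for use_case, items in group_by_use_case(SIGNALS).items():
        print(f"{use_case}: {len(items)} signal(s)")
    print("Distinct sources:", len(distinct_sources(SIGNALS)))
```

Run as a script, this shows four signals feeding Metric definition workshop, one feeding Historical document summarization, and four distinct sources overall.
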
Some fit rows have limited benchmark evidence.
2 of 2 scored use cases have low confidence or thin contributor coverage.
Coverage Diagnostics
Actively scored
Use-Case Scores: 2
Total Measurements: 43
Weighted Measurements: 8
Weighted Sources: 4
[Charts: Raw Source Coverage; Weighted Source Coverage]
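
To put the counts above in proportion, the arithmetic can be sketched as below. This is an illustration only: the helper `coverage_share` is a hypothetical name, and the page does not state that it reports or uses these ratios. The inputs are the figures shown on this page (43 total measurements, 8 weighted measurements, 2 of 5 dimensions scored).

```python
def coverage_share(part: int, total: int) -> float:
    """Return part/total as a fraction, or 0.0 when total is zero."""
    if total == 0:
        return 0.0
    return part / total


if __name__ == "__main__":
    # Weighted measurements as a share of all measurements.
    print(f"Weighted share of measurements: {coverage_share(8, 43):.1%}")
    # Scored dimensions as a share of all dimensions.
    print(f"Dimensions scored: {coverage_share(2, 5):.1%}")
```

With these inputs, roughly 18.6% of the measurements carry weight and 40.0% of the dimensions are scored, which is consistent with the low-confidence notes elsewhere on the page.
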
Best Use Cases for This Model
| Use Case | Use Case ID | Score |
|---|---|---|
| Metric definition workshop | use_case.data.metric_definition_workshop | 7.0% |
| Historical document summarization | use_case.history.historical_doc_summarization | 2.3% |