Model Profile

Phi-3-small-128k-instruct

Rating: 1.0 (49 reviews)
Author: microsoft

4,096 ctx · Open weights

Use this page to decide where this model is a strong fit. The rankings below are backed by benchmarks for each use case and report explicit confidence and contributor metrics.

Identity

ID: microsoft/Phi-3-small-128k-instruct

Author: microsoft

Origin: huggingface_catalog

Arch: unknown

Benchmark Coverage

Scored use cases: 10

Avg confidence: 13.8%

Evidence points: 49

Raw rows: 82

Weighted rows: 5

Catalog Metadata

Parameters: unknown

Context window: 4096

Downloads: 772

Intelligence Profile

Dimension Breakdown

IQ: 48.0%* (6 benchmarks)

EQ: insufficient data (0 benchmarks)

Accuracy: 70.8%* (2 benchmarks)

Creativity: insufficient data (0 benchmarks)

Based: insufficient data (0 benchmarks)

* Low confidence — limited benchmark evidence for this dimension

2/5 dimensions scored · Last updated Apr 2, 2026

Benchmark Signals

RepoQA Official Results

overall_average_pass_at_1_pct

2.5%

Normalized value 53.4% · confidence 100.0%

Strongest impact in Debugging assistant

repoqa_leaderboard.overall_average_pass_at_1_pct · Apr 1, 2026

BigCodeBench Official

bigcodebench_complete_pct

1.3%

Normalized value 57.8% · confidence 100.0%

Strongest impact in IDE code completion

bigcodebench_official.bigcodebench_complete_pct · Apr 1, 2026

BigCodeBench Official

bigcodebench_instruct_pct

1.1%

Normalized value 54.6% · confidence 100.0%

Strongest impact in IDE code completion

bigcodebench_official.bigcodebench_instruct_pct · Apr 1, 2026

RepoQA Official Results

all_average_pass_at_1_pct

0.9%

Normalized value 53.4% · confidence 100.0%

Strongest impact in Unit test generation

repoqa_leaderboard.all_average_pass_at_1_pct · Apr 1, 2026

BigCodeBench Official

bigcodebench_hard_complete_pct

0.5%

Normalized value 32.7% · confidence 100.0%

Strongest impact in IDE code completion

bigcodebench_official.bigcodebench_hard_complete_pct · Apr 1, 2026

Some fit rows have limited benchmark evidence.

10 of 10 scored use cases have low confidence or thin contributor coverage.

Coverage Diagnostics

actively scored

Use-Case Scores

Total Measurements

Weighted Measurements

Weighted Sources

Raw Source Coverage

repoqa_leaderboard: 74 · bigcodebench_official: 8

Weighted Source Coverage

bigcodebench_official: 3 · repoqa_leaderboard: 2
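
The per-source counts above appear to add up to the "Raw rows" and "Weighted rows" figures reported under Benchmark Coverage. The page does not say how those totals are derived, so the following is only a minimal sketch that assumes they are plain sums. The variable names are hypothetical, and the weighted share is an illustrative ratio that the page itself does not report.

```python
# Per-source row counts copied from the coverage lines above.
raw_rows = {"repoqa_leaderboard": 74, "bigcodebench_official": 8}
weighted_rows = {"bigcodebench_official": 3, "repoqa_leaderboard": 2}

# Assumption: the page-level totals are simple sums over sources.
raw_total = sum(raw_rows.values())
weighted_total = sum(weighted_rows.values())

# Illustrative only: fraction of raw rows that carry weight in the scores.
weighted_share = weighted_total / raw_total

print(f"raw rows: {raw_total}, weighted rows: {weighted_total}")
print(f"weighted share: {weighted_share:.1%}")
```

Under that assumption the sums match the reported 82 raw rows and 5 weighted rows, which means only a small fraction of the collected rows contributes to the use-case scores.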

Best Use Cases for This Model

Use Case	ID	Vertical	Score	Confidence	Evidence	Top Contributor
Debugging assistant	use_case.dev.debugging	developer_tools	9.5%	18.2%	5	RepoQA Official Results: overall_average_pass_at_1_pct
Unit test generation	use_case.dev.test_generation	developer_tools	8.8%	16.8%	5	RepoQA Official Results: overall_average_pass_at_1_pct
Code Review Assistant	use_case.dev.code_review_assistant	developer_tools	8.4%	16.3%	5	RepoQA Official Results: overall_average_pass_at_1_pct
Integration test generation	use_case.dev.integration_tests	developer_tools	7.9%	15.4%	5	RepoQA Official Results: overall_average_pass_at_1_pct
Refactoring assistant	use_case.dev.refactoring	developer_tools	7.4%	14.1%	5	RepoQA Official Results: overall_average_pass_at_1_pct
Verilog/VHDL generation	use_case.eda.verilog_generation	engineering	7.3%	14.3%	4	BigCodeBench Official: bigcodebench_complete_pct
Documentation from code	use_case.dev.docstrings_and_docs	developer_tools	6.3%	12.0%	5	RepoQA Official Results: overall_average_pass_at_1_pct
Code generation	use_case.dev.code_generation	developer_tools	5.4%	10.5%	5	RepoQA Official Results: overall_average_pass_at_1_pct
IDE code completion	use_case.dev.ide_completion	developer_tools	5.3%	10.4%	5	BigCodeBench Official: bigcodebench_complete_pct
Codebase onboarding brief	use_case.dev.codebase_onboarding	developer_tools	5.3%	10.1%	5	RepoQA Official Results: overall_average_pass_at_1_pct
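
The table's Confidence and Evidence columns can be cross-checked against the "Avg confidence" and "Evidence points" figures under Benchmark Coverage. The page does not document how those aggregates are computed, so this sketch assumes an unweighted mean of the Confidence column and a plain sum of the Evidence column. The `rows` list and variable names are hypothetical.

```python
from statistics import mean

# (use case ID, confidence %, evidence count) copied from the table above.
rows = [
    ("use_case.dev.debugging", 18.2, 5),
    ("use_case.dev.test_generation", 16.8, 5),
    ("use_case.dev.code_review_assistant", 16.3, 5),
    ("use_case.dev.integration_tests", 15.4, 5),
    ("use_case.dev.refactoring", 14.1, 5),
    ("use_case.eda.verilog_generation", 14.3, 4),
    ("use_case.dev.docstrings_and_docs", 12.0, 5),
    ("use_case.dev.code_generation", 10.5, 5),
    ("use_case.dev.ide_completion", 10.4, 5),
    ("use_case.dev.codebase_onboarding", 10.1, 5),
]

# Assumption: the page-level average is an unweighted mean over use cases.
avg_confidence = round(mean(conf for _, conf, _ in rows), 1)

# Assumption: evidence points are summed across use cases.
evidence_total = sum(evidence for _, _, evidence in rows)

print(f"scored use cases: {len(rows)}")
print(f"avg confidence: {avg_confidence}%")
print(f"evidence points: {evidence_total}")
```

Under these assumptions the results agree with the reported 10 scored use cases, 13.8% average confidence, and 49 evidence points. The highest confidence in any single row is 18.2%, so every ranking here rests on thin evidence.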