Model Profile

Phi-3.5-mini-instruct

Name: Phi-3.5-mini-instruct
Rating: 1.3 (70 reviews)
Author: microsoft

4,096 ctx · Open weights

Use this page to decide where this model is a strong fit. The rankings below are backed by benchmarks and broken down by use case, each with an explicit confidence value and contributor metrics.

Identity

ID: microsoft/Phi-3.5-mini-instruct

Author: microsoft

Origin: huggingface_catalog

Arch: unknown

Benchmark Coverage

Scored use cases: 12

Avg confidence: 24.0%

Evidence points: 70

Raw rows: 22

Weighted rows: 8

Catalog Metadata

Parameters: unknown

Context window: 4096

Downloads: 293,196

Intelligence Profile

Dimension Breakdown

IQ (6 benchmarks): 45.0%*

EQ (5 benchmarks): 58.9%*

Accuracy (2 benchmarks): 64.2%*

Creativity (4 benchmarks): 47.4%*

Based (2 benchmarks): 40.7%*

* Low confidence — limited benchmark evidence for this dimension

5/5 dimensions scored · Last updated May 1, 2026

Benchmark Signals


Open LLM Leaderboard MMLU-Pro

mmlu_pro_accuracy_pct

3.8%

Normalized value 47.0% · confidence 100.0%

Strongest impact in Social post generation

openllm_mmlu_pro_official.mmlu_pro_accuracy_pct · May 1, 2026

Open LLM Leaderboard GPQA

gpqa

3.3%

Normalized value 40.7% · confidence 100.0%

Strongest impact in Social post generation

openllm_gpqa_official.gpqa · May 1, 2026

Open LLM Leaderboard IFEval

ifeval

2.7%

Normalized value 64.2% · confidence 100.0%

Strongest impact in Job description drafting

openllm_ifeval_official.ifeval · May 1, 2026

EQ-Bench Leaderboard

eq_bench_score

2.2%

Normalized value 61.0% · confidence 100.0%

Strongest impact in Social post generation

eq_bench.eq_bench_score · May 1, 2026

Open LLM Leaderboard BBH

bbh

1.3%

Normalized value 47.7% · confidence 100.0%

Strongest impact in Brand voice localization

openllm_bbh_official.bbh · May 1, 2026

JSONSchemaBench Leaderboard

medium_schema_compliance_pct

1.0%

Normalized value 53.0% · confidence 100.0%

Strongest impact in Claims summary

jsonschemabench_leaderboard.medium_schema_compliance_pct · May 1, 2026

Some fit rows have limited benchmark evidence.

7 of 12 scored use cases have low confidence or thin contributor coverage.

Coverage Diagnostics

actively scored

Use-Case Scores

122

Total Measurements

Weighted Measurements

Weighted Sources

Raw Source Coverage

jsonschemabench_leaderboard 12 · open_llm_leaderboard_results 5 · eq_bench 1 · openllm_bbh_official 1 · openllm_gpqa_official 1 · openllm_ifeval_official 1

Weighted Source Coverage

jsonschemabench_leaderboard 2 · eq_bench 1 · open_llm_leaderboard_results 1 · openllm_bbh_official 1 · openllm_gpqa_official 1 · openllm_ifeval_official 1

Best Use Cases for This Model

Use Case	Use Case ID	Vertical	Score	Confidence	Evidence	Top Contributor
Social post generation	use_case.mkt.social_post_generation	marketing_sales	12.9%	27.3%	5	Open LLM Leaderboard MMLU-Pro: mmlu_pro_accuracy_pct
Product positioning and messaging	use_case.mkt.product_positioning	marketing_sales	12.9%	27.3%	5	Open LLM Leaderboard MMLU-Pro: mmlu_pro_accuracy_pct
Campaign brief	use_case.mkt.campaign_brief	marketing_sales	12.9%	27.3%	5	Open LLM Leaderboard MMLU-Pro: mmlu_pro_accuracy_pct
Personalized sales outreach	use_case.mkt.sales_outreach_personalized	marketing_sales	12.4%	26.2%	5	Open LLM Leaderboard MMLU-Pro: mmlu_pro_accuracy_pct
Ad copy variants	use_case.mkt.ad_copy_variants	marketing_sales	12.4%	26.2%	5	Open LLM Leaderboard MMLU-Pro: mmlu_pro_accuracy_pct
Job description drafting	use_case.hr.job_description_drafting	hr_recruiting	11.1%	22.2%	6	Open LLM Leaderboard IFEval: ifeval
Brand voice localization	use_case.mkt.brand_voice_localization	marketing_sales	11.0%	22.3%	6	Open LLM Leaderboard MMLU-Pro: mmlu_pro_accuracy_pct
Candidate summary memo	use_case.hr.candidate_summary	hr_recruiting	10.5%	22.2%	8	Open LLM Leaderboard MMLU-Pro: mmlu_pro_accuracy_pct
Tail spend categorization	use_case.proc.tail_spend_categorization	supply_chain	10.3%	21.6%	8	Open LLM Leaderboard MMLU-Pro: mmlu_pro_accuracy_pct
Claims summary	use_case.ins.claims_summary	insurance	10.0%	22.2%	7	Open LLM Leaderboard MMLU-Pro: mmlu_pro_accuracy_pct
Kubernetes manifest generation	use_case.sre.iac_k8s	devops_sre	9.8%	21.8%	5	Open LLM Leaderboard MMLU-Pro: mmlu_pro_accuracy_pct
Screenplay scene writing	use_case.creative.screenplay_scene	creative	9.8%	21.8%	5	Open LLM Leaderboard MMLU-Pro: mmlu_pro_accuracy_pct
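
The summary figures under Benchmark Coverage can be cross-checked against the table above. The sketch below is a minimal check, not the site's actual scoring code: it assumes that "Avg confidence" is a plain unweighted mean of the Confidence column and that "Evidence points" is the sum of the Evidence column. The `rows` list is copied by hand from the table, and the variable names are illustrative.

```python
# (use case, confidence %, evidence points), copied from the table above.
rows = [
    ("Social post generation", 27.3, 5),
    ("Product positioning and messaging", 27.3, 5),
    ("Campaign brief", 27.3, 5),
    ("Personalized sales outreach", 26.2, 5),
    ("Ad copy variants", 26.2, 5),
    ("Job description drafting", 22.2, 6),
    ("Brand voice localization", 22.3, 6),
    ("Candidate summary memo", 22.2, 8),
    ("Tail spend categorization", 21.6, 8),
    ("Claims summary", 22.2, 7),
    ("Kubernetes manifest generation", 21.8, 5),
    ("Screenplay scene writing", 21.8, 5),
]

# Assumption: the page reports an unweighted mean of per-use-case confidence.
avg_confidence = sum(conf for _, conf, _ in rows) / len(rows)

# Assumption: "Evidence points" is the column total.
total_evidence = sum(evidence for _, _, evidence in rows)

print(f"Scored use cases: {len(rows)}")
print(f"Avg confidence: {avg_confidence:.1f}%")
print(f"Evidence points: {total_evidence}")
```

Under these assumptions the totals agree with the reported 12 scored use cases, 24.0% average confidence, and 70 evidence points, which suggests the summary block is derived directly from this table.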