Model Profile

Qwen2.5-3B-Instruct

Rating: 0.8 (88 reviews)
Author: Qwen

4,096 ctx · Open weights

Use this page to decide where this model is a strong fit. The rankings below are grouped by use case and backed by benchmarks, and each row reports its confidence, evidence count, and top contributing benchmark.

Identity

ID: Qwen/Qwen2.5-3B-Instruct

Origin: huggingface_catalog

Arch: unknown

Benchmark Coverage

Scored use cases: 12

Avg confidence: 23.9%

Evidence points: 88

Raw rows: 31

Weighted rows: 10

Catalog Metadata

Parameters: unknown

Context window: 4096

Downloads: 5,245,261

Intelligence Profile

Dimension Breakdown

IQ (6 benchmarks): 26.1%*

EQ (5 benchmarks): 55.7%*

Accuracy (2 benchmarks): 72.0%*

Creativity (4 benchmarks): 34.5%*

Based (2 benchmarks): 10.3%*

* Low confidence — limited benchmark evidence for this dimension

5/5 dimensions scored · Last updated May 1, 2026

Benchmark Signals

Open LLM Leaderboard IFEval

ifeval

3.0%

Normalized value 72.0% · confidence 100.0%

Strongest impact in Job description drafting

openllm_ifeval_official.ifeval · May 1, 2026

Open LLM Leaderboard MMLU-Pro

mmlu_pro_accuracy_pct

2.9%

Normalized value 35.8% · confidence 100.0%

Strongest impact in Social post generation

openllm_mmlu_pro_official.mmlu_pro_accuracy_pct · May 1, 2026

JSONSchemaBench Leaderboard

medium_schema_compliance_pct

2.6%

Normalized value 60.1% · confidence 100.0%

Strongest impact in Metric definition workshop

jsonschemabench_leaderboard.medium_schema_compliance_pct · May 1, 2026

EQ-Bench Leaderboard

eq_bench_score

2.0%

Normalized value 55.5% · confidence 100.0%

Strongest impact in Social post generation

eq_bench.eq_bench_score · May 1, 2026

JSONSchemaBench Leaderboard

hard_schema_compliance_pct

1.2%

Normalized value 40.4% · confidence 100.0%

Strongest impact in Metric definition workshop

jsonschemabench_leaderboard.hard_schema_compliance_pct · May 1, 2026

Open LLM Leaderboard BBH

bbh

1.1%

Normalized value 33.4% · confidence 100.0%

Strongest impact in Grading and feedback assistant

openllm_bbh_official.bbh · May 1, 2026

Some fit rows have limited benchmark evidence.

7 of 12 scored use cases have low confidence or thin contributor coverage.

Coverage Diagnostics

actively scored

Use-Case Scores

128

Total Measurements

Weighted Measurements

Weighted Sources

Raw Source Coverage

jsonschemabench_leaderboard 12 · bridge_medical_leaderboard 9 · open_llm_leaderboard_results 5 · eq_bench 1 · openllm_bbh_official 1 · openllm_gpqa_official 1

Weighted Source Coverage

bridge_medical_leaderboard 2 · jsonschemabench_leaderboard 2 · eq_bench 1 · open_llm_leaderboard_results 1 · openllm_bbh_official 1 · openllm_gpqa_official 1
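
The per-source counts under Raw Source Coverage and Weighted Source Coverage can be compared with a short script. This is a minimal sketch: the counts are copied from the two lists above, while the variable names and the "retention" ratio (weighted rows divided by raw rows per source) are illustrative assumptions, not metrics the page defines. The listed counts sum to 29 raw and 8 weighted, slightly below the 31 raw rows and 10 weighted rows reported under Benchmark Coverage, so the lists may be truncated.

```python
# Per-source row counts copied from the coverage lists on this page.
raw = {
    "jsonschemabench_leaderboard": 12,
    "bridge_medical_leaderboard": 9,
    "open_llm_leaderboard_results": 5,
    "eq_bench": 1,
    "openllm_bbh_official": 1,
    "openllm_gpqa_official": 1,
}
weighted = {
    "bridge_medical_leaderboard": 2,
    "jsonschemabench_leaderboard": 2,
    "eq_bench": 1,
    "open_llm_leaderboard_results": 1,
    "openllm_bbh_official": 1,
    "openllm_gpqa_official": 1,
}

# Hypothetical "retention" ratio: share of a source's raw rows that carry weight.
retention = {source: weighted.get(source, 0) / count for source, count in raw.items()}

raw_total = sum(raw.values())
weighted_total = sum(weighted.values())

for source, ratio in sorted(retention.items(), key=lambda item: item[1]):
    print(f"{source}: {weighted.get(source, 0)}/{raw[source]} rows weighted ({ratio:.0%})")
print(f"listed totals: {raw_total} raw, {weighted_total} weighted")
```

Read this way, most of the raw evidence comes from jsonschemabench_leaderboard and bridge_medical_leaderboard, but only two rows from each of those sources are weighted.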

Best Use Cases for This Model

Use Case	Vertical	Score	Confidence	Evidence	Top Contributor
Campaign brief use_case.mkt.campaign_brief	marketing_sales	8.5%	27.9%	6	Open LLM Leaderboard MMLU-Pro: mmlu_pro_accuracy_pct
Product positioning and messaging use_case.mkt.product_positioning	marketing_sales	8.5%	27.9%	6	Open LLM Leaderboard MMLU-Pro: mmlu_pro_accuracy_pct
Social post generation use_case.mkt.social_post_generation	marketing_sales	8.5%	27.9%	6	Open LLM Leaderboard MMLU-Pro: mmlu_pro_accuracy_pct
Job description drafting use_case.hr.job_description_drafting	hr_recruiting	8.4%	22.6%	7	Open LLM Leaderboard IFEval: ifeval
Personalized sales outreach use_case.mkt.sales_outreach_personalized	marketing_sales	8.1%	26.8%	6	Open LLM Leaderboard MMLU-Pro: mmlu_pro_accuracy_pct
Ad copy variants use_case.mkt.ad_copy_variants	marketing_sales	8.1%	26.8%	6	Open LLM Leaderboard MMLU-Pro: mmlu_pro_accuracy_pct
Brand voice localization use_case.mkt.brand_voice_localization	marketing_sales	8.1%	22.7%	7	Open LLM Leaderboard MMLU-Pro: mmlu_pro_accuracy_pct
Candidate summary memo use_case.hr.candidate_summary	hr_recruiting	8.1%	22.7%	9	Open LLM Leaderboard MMLU-Pro: mmlu_pro_accuracy_pct
Tail spend categorization use_case.proc.tail_spend_categorization	supply_chain	7.9%	22.0%	9	Open LLM Leaderboard MMLU-Pro: mmlu_pro_accuracy_pct
Grading and feedback assistant use_case.edu.grading_feedback_assist	education	7.8%	20.0%	9	Open LLM Leaderboard IFEval: ifeval
Text tagging and routing use_case.business.text_tagging	business_productivity	7.7%	19.5%	9	Open LLM Leaderboard IFEval: ifeval
Metric definition workshop use_case.data.metric_definition_workshop	data_analytics	7.7%	20.7%	8	JSONSchemaBench Leaderboard: medium_schema_compliance_pct
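
The Confidence column can be checked against the summary figures reported earlier on this page. The sketch below copies the twelve confidence values from the table and computes their mean, which comes out close to the reported average confidence of 23.9%. The 25% cutoff is a hypothetical value chosen for illustration; the page does not state its rule for "low confidence or thin contributor coverage", although this cutoff happens to reproduce the reported count of 7 of 12.

```python
from statistics import mean

# Confidence values (%) copied from the Confidence column, in table order.
confidences = [27.9, 27.9, 27.9, 22.6, 26.8, 26.8, 22.7, 22.7, 22.0, 20.0, 19.5, 20.7]

# Assumption: the page does not publish its low-confidence threshold.
LOW_CONFIDENCE_CUTOFF = 25.0

avg_confidence = mean(confidences)
low_confidence_count = sum(1 for c in confidences if c < LOW_CONFIDENCE_CUTOFF)

print(f"average confidence: {avg_confidence:.2f}%")
print(f"use cases below cutoff: {low_confidence_count} of {len(confidences)}")
```

The small gap between the computed mean and the reported 23.9% is likely down to rounding of the displayed values.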