GPT-4o Mini

Performance by Benchmark

Benchmark      Accuracy   Ranking
CorpFin        55.0%      16 / 31
CaseLaw        70.8%      38 / 50
ContractLaw    72.4%       7 / 57
TaxEval        64.9%      25 / 37
MortgageTax    69.2%      12 / 18
Math500        72.6%      23 / 33
AIME           11.5%      22 / 29
MGSM           86.2%      24 / 31
LegalBench     76.2%      25 / 55
MedQA          72.4%      23 / 35
GPQA           44.2%      23 / 30
MMLU Pro       62.7%      27 / 30
MMMU           58.2%      15 / 17
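Because each benchmark was run against a different number of models, raw ranks are hard to compare directly. The sketch below normalizes a rank into the fraction of other models outranked. It assumes a ranking such as "7 / 57" means 7th place out of 57 evaluated models, with 1 being best; the helper name `fraction_outranked` is illustrative and not part of any published tooling.

```python
def fraction_outranked(rank: int, total: int) -> float:
    """Fraction of the other evaluated models that a given rank places ahead of.

    Assumes rank 1 is best and `total` counts every model on the leaderboard.
    """
    if total < 2 or not 1 <= rank <= total:
        raise ValueError("rank must be within 1..total and total must be at least 2")
    return (total - rank) / (total - 1)


# A few (rank, total) pairs taken from the results listed above.
rankings = {
    "ContractLaw": (7, 57),
    "LegalBench": (25, 55),
    "MMLU Pro": (27, 30),
}

for name, (rank, total) in rankings.items():
    print(f"{name}: outranks {fraction_outranked(rank, total):.0%} of other models")
```

Under this reading, ContractLaw is the model's strongest relative placement and MMLU Pro is among its weakest.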


Overview

GPT-4o Mini is OpenAI’s cheaper, lighter-weight counterpart to GPT-4o. It balances performance and cost, which makes it well suited to production deployments where both quality and economics matter.

Key Specifications

Context Window: 128,000 tokens
Output Limit: 16,384 tokens
Training Cutoff: October 2023
Pricing:
- Input: $0.15 per million tokens
- Output: $0.60 per million tokens
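As a rough illustration of what these prices mean for a workload, the following sketch estimates spend from token counts. The prices and limits are the ones listed above; the workload figures are hypothetical, and the limit check assumes that prompt and completion tokens share the 128,000-token context window.

```python
# Figures from the Key Specifications above.
INPUT_PRICE_PER_M = 0.15    # USD per million input tokens
OUTPUT_PRICE_PER_M = 0.60   # USD per million output tokens
CONTEXT_WINDOW = 128_000    # tokens
OUTPUT_LIMIT = 16_384       # tokens


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for the given token counts."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


def within_limits(input_tokens: int, output_tokens: int) -> bool:
    """Check a single request against the listed limits.

    Assumption: input and output tokens both count toward the context window.
    """
    return (output_tokens <= OUTPUT_LIMIT
            and input_tokens + output_tokens <= CONTEXT_WINDOW)


# Hypothetical workload: 10,000 requests of 2,000 input and 500 output tokens each.
requests = 10_000
total = estimate_cost_usd(requests * 2_000, requests * 500)
print(f"Estimated cost: ${total:.2f}")
print("Single request fits limits:", within_limits(2_000, 500))
```

Output tokens cost four times as much as input tokens, so workloads with long completions are affected more by the output price than the input price.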

Performance Highlights

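Cost Efficiency: Significantly cheaper than GPT-4, at $0.15 per million input tokens and $0.60 per million output tokens
Legal Tasks: Strongest relative result on ContractLaw (72.4%, ranked 7 of 57), with 76.2% on LegalBench (ranked 25 of 55)
Consistency: Accuracy falls between 55.0% and 86.2% on every benchmark except AIME (11.5%) and GPQA (44.2%)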

Benchmark Results

The model's accuracy and ranking on each benchmark are listed under Performance by Benchmark above. Its placements range from 7 of 57 on ContractLaw to 27 of 30 on MMLU Pro.

Use Case Recommendations

Best suited for:

High-volume production deployments
Cost-sensitive applications
Tasks requiring a balance of performance and efficiency
Legal document analysis at scale

Limitations

Lower performance ceiling compared to GPT-4o
May struggle with highly complex legal and financial tasks

Comparison with Other Models

More capable than GPT-3.5 Turbo
More cost-effective than GPT-4
Competitive with Claude 3.5 Haiku in terms of performance/cost ratio
