Overview
o1 Preview represents OpenAI’s latest breakthrough in language model capabilities. It demonstrates unprecedented performance across our benchmarks, particularly excelling in complex reasoning tasks and mathematical computations. While it comes at a premium price point, it sets new standards for what’s possible in language model performance.
Key Specifications
- Context Window: 128,000 tokens
- Output Limit: 32,768 tokens
- Training Cutoff: October 2023
- Pricing:
- Input: $15.00 per million tokens
- Cached Input: $7.50 per million tokens
- Output: $60.00 per million tokens
Performance Highlights
- Mathematical Reasoning: Exceptional performance in numerical tasks
- Legal Analysis: Top performer across all legal benchmarks
- Complex Logic: Superior handling of multi-step reasoning
- Consistency: Most reliable outputs among all tested models
Benchmark Results
Leads performance across our benchmarks:
- TaxEval: Highest accuracy in tax computation and reasoning
- LegalBench: Top performance in legal analysis
- ContractLaw: Superior contract interpretation capabilities
- CaseLaw: Best-in-class understanding of legal precedents
Use Case Recommendations
Best suited for:
- High-stakes analysis
- Complex legal reasoning
- Tax computation and analysis
- Tasks requiring highest possible accuracy
- Research and development applications
Limitations
- Highest cost among all models, and may be cost-prohibitive for many applications
- Sometimes produces overly verbose outputs
- Much harder to control - does not support system prompts, temperature, etc.
Comparison with Other Models
- Significantly more expensive than GPT-4o
- Higher performance ceiling than all other models
- Better reasoning capabilities than Claude 3 Opus
- Superior mathematical abilities compared to all competitors