Accuracy (Vals Index): 75.14% ± 0.64
Latency (Vals Index): 1050.68s
Cost/Test (Vals Index): $5.16
Context Window: 1M
Max Output Tokens: 128k
Input Modality
Hyperparameter settings
Default Provider: Anthropic
Some benchmarks may use a different provider or different parameters. Please refer to the benchmark page for more information.
Temperature: 1
Top P: Default
Top K: Default
Max Output Tokens: 128,000
Compute Effort: max
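
The settings listed here can be written down as a small configuration mapping. The following is only a minimal sketch: the key names, the `DEFAULT` sentinel, and the `explicit_overrides` helper are hypothetical and are not taken from any provider's API. It assumes that "Default" means the parameter is left unset so the provider's own default applies.

```python
# Hypothetical representation of the hyperparameter settings above.
# Key names are illustrative only and do not mirror any real API.

DEFAULT = None  # assumption: "Default" means the value is left to the provider

SETTINGS = {
    "provider": "Anthropic",
    "temperature": 1,
    "top_p": DEFAULT,
    "top_k": DEFAULT,
    "max_output_tokens": 128_000,
    "compute_effort": "max",
}


def explicit_overrides(settings):
    """Return only the settings that were set explicitly (not left at DEFAULT)."""
    return {key: value for key, value in settings.items() if value is not DEFAULT}


if __name__ == "__main__":
    for key, value in explicit_overrides(SETTINGS).items():
        print(f"{key}: {value}")
```

Dropping the unset entries keeps a request limited to the values that were actually chosen, so Top P and Top K fall through to whatever the provider uses by default.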
Benchmarks
Proprietary Benchmarks (contact us to get access)
Academic Benchmarks
Read about our methodology.