Independent Evaluation, Unbiased Benchmarks

Testing AI on Real-World Tasks

We benchmark the world's leading AI models on rigorous, domain-specific tasks in finance, law, software, healthcare, and more. We run all of our own evaluations and create many of our benchmarks in-house.

Vals AI Updates

Fresh updates from our testing queue

model
06/09/2026

Anthropic's Claude Fable 5 evaluated across our benchmark suite

Benchmarks

Accuracy          Rankings
75.14% ± 0.64     1 / 25
74.15% ± 0.57     1 / 20
71.83% ± 0.88     1 / 111
56.31% ± 0.84     2 / 25
56.07% ± 2.20     2 / 63
88.52% ± 1.95     1 / 62
68.92% ± 0.91     5 / 79
77.00% ± 4.23     1 / 36
93.18% ± 1.94     2 / 111
72.25% ± 9.36     1 / 55

Contact us
Or send us an email at contact@vals.ai
Proprietary Benchmarks (contact us to get access)
Academic Benchmarks

Read about our methodology.

Industry Leaderboard

Independent benchmarks for industry-specific AI performance.

Industry
Benchmark