Independent Evaluation, Unbiased Benchmarks

Testing AI on Real-World Tasks

We benchmark the world's leading AI models on rigorous, domain-specific tasks in finance, law, software, healthcare, and more. We run all of our own evaluations and create many of our benchmarks in-house.

Vals AI Updates

Fresh updates from our testing queue

benchmark
07/01/2026

Excel Modeling Benchmark Released


System rank   Accuracy
1             69.37% ± 2.61
2             66.32% ± 3.01
3             64.54% ± 2.87
4             63.55% ± 2.75
5             61.53% ± 2.88
6             60.15% ± 3.28
7             57.85% ± 2.74
8             57.04% ± 2.85
9             55.23% ± 3.11
10            52.62% ± 2.97

Showing the top 10 models from the benchmark. Visit the benchmark page to view more.

Industry Leaderboard

Independent benchmarks for industry-specific AI performance.


Model Performance Over Time

Tracking how foundation models improve with each release